MEMSQL and MySQL

I need to start by indicating that I am in no way a database expert. I know how to get around programming applications in several languages ​​that require a database, and am relatively familiar with MySQL, Microsoft SQL Server, and MEMSQL, but again, not an expert in databases, so your input is greatly appreciated.

I worked on developing an application that should cross-reference several different tables. One very simple example of a problem I recently had is this:

  • Upload 600K to 1M records daily to a temporary table.
  • Compare what has changed between this new data stream and the old. Record this information in a separate table.
  • Restore the table with new records.

Startup # 2 is a query similar to:

SELECT * FROM (NEW TABLE) LEFT JOIN (OLD TABLE) ON (JOINED FIELD) WHERE (OLD TABLE.FIELD) IS NULL 

In this case, I compare two tables in a given field and then retrieve information about what has changed.

In MySQL (v5.6.26, x64) my query expires. I run 4 vCPU and 8 GB of RAM, but note that the rest of my configuration is the default configuration (did not change any settings).

In MEMSQL (v5.5.8, x64), my query is launched from about 3 seconds on the first try. I use the same virtual server configuration with 4 vCPU and 8 GB of RAM, also note that the rest of my configuration is the default configuration (did not change any settings).

In addition, in MEMSQL I run one node configuration. The same goes for MySQL.

I like that the use of MEMSQL allowed me to continue developing my project, and I encounter even greater queries and views for calculating the crosstab that I can run, which run fantastically in MEMSQL ... but in a perfect world, I would use MySQL . I have already come across the fact that I need to use a different set of tools to manage my instance (i.e.: MySQL Workbench works relatively well with the MEMSQL server, but I really need to create views and tables using the open source SQL Workbench and mysql java adapter The same thing for using the Visual Studio MySQL connector, but it can be painful at times, for some reason I can add queries, but I can not add table adapters) ... sorry, I will send a separate question for this :)

Given that both virtual machines have the same configuration and SSD support, can anyone give me any recommendations on how to configure my MySQL instance to run large queries like the ones above in MySQL? I understand that I can also create a database in memory, but I read that there might be some persistence problems with this. Not sure.

Thanks!

+5
source share
1 answer

Most likely, this is because you do not have an index on the joined field in one or both tables. According to this article:

https://www.percona.com/blog/2012/04/04/join-optimizations-in-mysql-5-6-and-mariadb-5-5/

Vanilla MySQL only supports nested loop joins that require the index to work well (otherwise they take quadratic time).

Both MemSQL and MariaDB support a so-called hash join, which does not require indexes in tables, but consumes more memory. Since your data set is negligible for modern RAM sizes, the extra memory overhead is not noticed in your case.

So all you have to do to solve the problem is add the indexes in the joined field to both tables.

Also tell us about the problems that you encounter with open source tools when connecting to MemSQL in a separate question or at chat.memsql.com so that we can fix it in the next version (I work for MemSQL and compatibility with MySQL tools is one of priorities for us).

+3
source

Source: https://habr.com/ru/post/1232211/


All Articles