How can I make MySQL as fast as a flat file in this scenario?

  • Suppose a key value table has at least 10 s of millions of rows.
  • Define an operation that takes a large number of identifiers (again, 10 million), finds the corresponding values ​​and sums them up.

Using a database, this operation seems to be approaching (disk seek time) * (number of lookups).

Using a flat file and reading all the contents, this operation will approach (file size)/(drive transfer rate).

Inclusion of some (rough) values ​​(from Wikipedia and / or experiments):
seek time = 0.5ms
transfer rate = 64MByte/s
file size = 800M(for 70 million int / double key / values)
65 million value lookups

DB time = 0.5ms * 65000000= 32500s= 9 hours
Flat file = 800M/(64MB/s)=12s

Experimental results are not so bad for MySQL, but a flat file still wins.


:
id/value InniDB MyISAM.

CREATE TABLE `ivi` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `val` double DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB

32 . :

select sum(val) from ivm where id not in (1,12,121,1121);  //be sure to change the numbers each time or clear the query cache

/ java.

 private static void writeData() throws IOException {
        long t = -System.currentTimeMillis();
        File dat = new File("/home/mark/dat2");
        if (dat.exists()){
            dat.delete();
        }
        FileOutputStream fos = new FileOutputStream(dat);
        ObjectOutputStream os = new ObjectOutputStream(new BufferedOutputStream(fos));
        for (int i=0; i< 32000000; i++){
            os.writeInt(i);
            os.writeDouble(i / 2.0);
        }
        os.flush();
        os.close();        
        t += System.currentTimeMillis();
        System.out.println("time ms = " + t);
    }
    private static void performSummationQuery() throws IOException{
        long t = -System.currentTimeMillis();

        File dat = new File("/home/mark/dat2");
        FileInputStream fin = new FileInputStream(dat);
        ObjectInputStream in = new ObjectInputStream(new BufferedInputStream(fin));
        HashSet<Integer> set = new HashSet<Integer>(Arrays.asList(11, 101, 1001, 10001, 100001));
        int i;
        double d;
        double sum = 0;
        try {
            while (true){
                i = in.readInt();
                d = in.readDouble();
                if (!set.contains(i)){
                    sum += d;
                }
            }
        } catch (EOFException e) {
        }

        System.out.println("sum = " + sum);
        t += System.currentTimeMillis();
        System.out.println("time ms = " + t);
    }

:

InnoDB        8.0-8.1s            
MyISAM        3.1-16.5s
Stored proc   80-90s
FlatFile      1.6-2.4s (even after: echo 3 > /proc/sys/vm/drop_caches) 

, . , "" CRUD . , .

, MySQL , , ?


EDIT:
:
1. , , .
2. , , - , . , , , - ad-hoc. . " ".

.

+3
7

MySQL ( ) .

InnoDB . MyISAM .

ignore index(`PRIMARY`)   

tabl .

EDIT:
, " ", " - " .. :

", MySQL , ?"

, : MyISAM , -, , , .

+3

, MySQL - 100% , . MySQL , , 100%, MySQL - . 10 000 000 , ( "" ).

, 10- MySQL, , . 15 , .

+4

, , -1- - - :

select
 st.tot - v.val 
from
 ivi_sum_total st
join
(
 select sum(val) as val from ivi where id in (1,12,121,1121)
) v;

+---------------------+
| st.tot - v.val      |
+---------------------+
| 1048317638720.78064 |
+---------------------+
1 row in set (0.07 sec)

drop table if exists ivi_sum_total;
create table ivi_sum_total
(
tot decimal(65,5) default 0
) 
engine=innodb;

drop table if exists ivi;
create table ivi 
(
id int unsigned not null auto_increment,
val decimal(65,5) default 0,
primary key (id, val)
) 
engine=innodb;

delimiter #

create trigger ivi_before_ins_trig before insert on ivi
for each row
begin
  update ivi_sum_total set tot = tot + new.val;
end#

create trigger ivi_before_upd_trig before update on ivi
for each row
begin
  update ivi_sum_total set tot = (tot - old.val) + new.val;
end#

-- etc...

select count(*) from ivi;

+----------+
| count(*) |
+----------+
| 32000000 |
+----------+

select
 st.tot - v.val 
from
 ivi_sum_total st
join
(
 select sum(val) as val from ivi where id in (1,12,121,1121)
) v;

+---------------------+
| st.tot - v.val      |
+---------------------+
| 1048317638720.78064 |
+---------------------+
1 row in set (0.07 sec)

select sum(val) from ivi where id not in (1,12,121,1121);

+---------------------+
| sum(val)            |
+---------------------+
| 1048317638720.78064 |
+---------------------+
1 row in set (29.89 sec)

select * from ivi_sum_total;

+---------------------+
| tot                 |
+---------------------+
| 1048317683047.43227 |
+---------------------+
1 row in set (0.03 sec)


select * from ivi where id = 2;

+----+-------------+
| id | val         |
+----+-------------+
|  2 | 11781.30443 |
+----+-------------+
1 row in set (0.01 sec)

start transaction;
update ivi set val = 0 where id = 2;
commit;

Query OK, 1 row affected (0.01 sec)
Rows matched: 1  Changed: 1  Warnings: 0

select * from ivi where id = 2;

+----+---------+
| id | val     |
+----+---------+
|  2 | 0.00000 |
+----+---------+
1 row in set (0.00 sec)


select * from ivi_sum_total;

+---------------------+
| tot                 |
+---------------------+
| 1048317671266.12784 |
+---------------------+
1 row in set (0.00 sec)


select
 st.tot - v.val 
from
 ivi_sum_total st
join
(
 select sum(val) as val from ivi where id in (1,12,121,1121)
) v;

+---------------------+
| st.tot - v.val      |
+---------------------+
| 1048317626939.47621 |
+---------------------+
1 row in set (0.01 sec)

select sum(val) from ivi where id not in (1,12,121,1121);

+---------------------+
| sum(val)            |
+---------------------+
| 1048317626939.47621 |
+---------------------+
1 row in set (31.07 sec)
+2

, . MySQL ( ) , I/O. . , . / . ( io) , , . z5h , , , .

, , bg job , UNION - "" . mysql , . , hdfs / , hadoop.

P.S: 100% , .

+1

, ...

JAVA- HashSet, ?

MyISAM BTREE.
MEMORY HASH.

MyISAM HASH BTREE

CREATE TABLE `ivi`
(
    `id` int (11) NOT NULL AUTO_INCREMENT,
    `val` double DEFAULT NULL,
    PRIMARY KEY (`id`) HASH
) ENGINE = MyISAM;

, litte. , . , MyISAM.

  • AUTO_INCREMENT

CREATE TABLE `ivi`
(
    `id` int (11) NOT NULL,
    `val` double DEFAULT NULL,
    KEY id (`id`) HASH
) ENGINE = MyISAM;

- :

ALTER TABLE ivi DISABLE KEYS;
...
... ( )
...
ALTER TABLE ivi ENABLE KEYS;

key_buffer_size /etc/my.cnf MyISAM.

, !!!

+1

?

Flat file performance will degrade significantly when using multiple users. From the database, he must "plan" the disk reading to satisfy queries executed in parallel.

+1
source

Maybe you should take a look at NDBAPI . I think that these people were able to achieve speeds close to working with a file, but still save the data stored in InnoDB.

0
source

Source: https://habr.com/ru/post/1796345/


All Articles