Database for analytics

I am creating a large database that will generate statistical reports from incoming data.
The system will work for the most part as follows:

  • About 400,000 to 500,000 rows, each with about 30 columns (mostly varchar(5-30) and datetime), will be loaded every morning. That is approximately 60 MB as a flat file, but it grows sharply in the database once suitable indexes are added.
  • Various statistics will be generated from the current day's data.
  • Reports from these statistics will be generated and saved.
  • The current dataset will be copied to the partitioned history table.
  • During the day, the current dataset (which was copied, not moved) will be queried by end users for information that is unlikely to involve constants, but rather relationships between fields.
  • Users can request specialized queries from the history table, but the queries will be created by the database administrator.
  • Before the next day's load, the current data table is truncated.
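
As a rough illustration of the daily cycle listed above, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for the real server. The table names (current_data, history, daily_stats), the three-column layout, and the function run_daily_cycle are all hypothetical. The real tables have about 30 columns, and the history table would be partitioned, which SQLite does not support.

```python
import sqlite3

def run_daily_cycle(conn, rows):
    """Truncate, load today's rows, compute statistics, then copy to history."""
    cur = conn.cursor()
    # Hypothetical, heavily simplified schema (the real one has ~30 columns).
    cur.execute("CREATE TABLE IF NOT EXISTS current_data "
                "(id INTEGER PRIMARY KEY, category TEXT, loaded_at TEXT)")
    cur.execute("CREATE TABLE IF NOT EXISTS history "
                "(id INTEGER, category TEXT, loaded_at TEXT)")
    cur.execute("CREATE TABLE IF NOT EXISTS daily_stats "
                "(loaded_at TEXT, category TEXT, row_count INTEGER)")
    # SQLite has no TRUNCATE; DELETE without WHERE empties the table.
    cur.execute("DELETE FROM current_data")
    # Morning load of the new day's rows.
    cur.executemany(
        "INSERT INTO current_data (id, category, loaded_at) VALUES (?, ?, ?)",
        rows,
    )
    # Example statistic: row count per category for the current day.
    cur.execute(
        "INSERT INTO daily_stats "
        "SELECT loaded_at, category, COUNT(*) FROM current_data "
        "GROUP BY loaded_at, category"
    )
    # Copy (not move) the current dataset into the history table.
    cur.execute(
        "INSERT INTO history SELECT id, category, loaded_at FROM current_data"
    )
    conn.commit()
```

Calling run_daily_cycle once per morning keeps current_data limited to one day while history accumulates every day's rows.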

This will be essentially version 2 of our existing system.

MySQL 5.0 MyISAM (Innodb ) # 6 # 4. # 4 , 5.0 . ( ), , history_queue, , . , , , , . , .

MySQL 5.1 ( MySQL) , PostgreSQL. , , , - - , . -. - MySQL, , PostgreSQL .

, . PostgreSQL " , " - , , MySQL 5.1 PostgreSQL 8.3 ?

(Oracle MS SQL) - , Oracle .

MyISAM vs. Innodb : Innodb, , 3-4 . , MySQL, , , , db Innodb.

- , , , .. MyISAM .

5.1: , 5.1. , ( 12 ) . , 5.1, , , .

PostgreSQL gotchas: COUNT (*) - - . , . COPY FROM , LOAD DATA INFILE, . - INSERT IGNORE. , , GROUP BY , . , , .

+3
9

ERP-. , 60 , ~ 21 , 16 . ~ 15 , , - . , "" PostgreSQL , , .

16- , , , - , . , .

PostgreSQL, , , , ( ). , , , , , .

( , !) . , . - , "" , , .

, , .

+2

, postgresql 7.x/8.0 8.1 ( 2x-3 ), 8.1 8.2 , . 8.2 8.3, , , .

, , .

postgresql, . , , , pre 8.2 pg .

, , , pg .

, , , postgresql?

( firebird, , , mysql postgresql)

+2

Inodb , pg . Myisam, , , Innodb , , , / .

varchar, char (n)?

? , , , .

ON EDIT:

, , , ?

: , mysql myism . ( 0,5 1 , ( )), , , .

:

CREATE TABLE new_table SELECT * FROM old_table;

, .

. , . .

: : , MyIsam . , , , . , - , , . , .;)

( , , InnoDb , , , . .)

, a. *, b.value foo join... , a.foo = b.value... join, , .

+1

, . , 500K , , .

, ( , ), .

, . , , , , , , ( ).

PosgreSQL, , , , , ( ) .

+1

PostgreSQL. , , , Postgres 2005 - MySQL . 5.1. MyISAM , - "" MyISAM " .

Postgres , # 6. , Postgres . gotchas.

+1

Infobright, , :

http://www.infobright.org/

- psj

+1
0

. IO? ? ? .

, .

? INSERT:

INSERT INTO blah VALUES (?, ?, ?, ?)

500K , . , . MySQL :

INSERT INTO blah VALUES
  (?, ?, ?, ?),
  (?, ?, ?, ?),
  (?, ?, ?, ?)
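
The multi-row form above can be built from application code. Below is a minimal sketch, again using Python's sqlite3 module as a stand-in for MySQL. The helper insert_in_batches and its batch_size parameter are hypothetical names, and the table name is assumed to come from trusted code, since it is interpolated into the SQL text.

```python
import sqlite3

def insert_in_batches(conn, table, rows, batch_size=100):
    """Insert rows with one multi-row INSERT per batch.

    Returns the number of INSERT statements that were executed.
    """
    statements = 0
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        # One "(?, ?, ...)" group per row, sized to the row width.
        row_placeholder = "(" + ", ".join("?" * len(batch[0])) + ")"
        sql = (f"INSERT INTO {table} VALUES "
               + ", ".join([row_placeholder] * len(batch)))
        # Flatten the batch so the values line up with the placeholders.
        params = [value for row in batch for value in row]
        conn.execute(sql, params)
        statements += 1
    conn.commit()
    return statements
```

With 500K rows and a batch size of 100, this issues 5,000 statements instead of 500,000. The best batch size depends on the server's packet and parameter limits, so treat 100 as a placeholder.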

-, crontab. , . , - .

LOAD DATA INFILE CSV . . http://dev.mysql.com/doc/refman/5.1/en/load-data.html

, , - SQL- - SQL-. , Pig Hive ?

500K, - . , .

0

myisam_key_buffer? .

, , id .., , :

INSERT INTO archive SELECT .. FROM current ORDER BY id (or date)

, , . , , ORDER BY... , .

PostgreSQL.

.

, PostgreSQL .

.

, . PostgreSQL " , " - , , MySQL 5.1 PostgreSQL 8.3 ?

, . ,

  • , , .
  • , .

, mysql postgres, , , postgres, .

postgres (Core 2 Duo, SATA-) , , 4000 - , . , , , (InnoDB - concurrency). "MyISAM " - , postgres " " 50-100 .

;)

, Big Aggregates Big Joins, postgres - MySQL, / .

- INSERT IGNORE. , , GROUP BY , . , , .

GROUP BY, , , :

INSERT INTO target SELECT .. FROM source LEFT JOIN target ON (...) WHERE target.id IS NULL
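
The statement above appears to be offered as a replacement for INSERT IGNORE: only rows that have no match in target get inserted. Here is a minimal runnable sketch of that anti-join pattern, using Python's sqlite3 module as a stand-in database. The two-column schema (id, value) and the function insert_missing are hypothetical.

```python
import sqlite3

def insert_missing(conn):
    """Copy rows from source into target, skipping ids target already has.

    Returns the number of rows that were inserted.
    """
    cur = conn.execute(
        "INSERT INTO target (id, value) "
        "SELECT source.id, source.value FROM source "
        "LEFT JOIN target ON target.id = source.id "
        # Unmatched source rows have NULL in the joined target columns.
        "WHERE target.id IS NULL"
    )
    conn.commit()
    return cur.rowcount
```

Note that this only skips ids already present in target. Duplicate ids inside source itself would still collide, and concurrent writers could insert the same id between the check and the insert.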

concurrency, .

0

Source: https://habr.com/ru/post/1705844/

