Real-time data warehouse for web access logs

We are thinking of setting up a data warehouse system to load the web access logs generated by our web servers. The idea is to load the data in real time.

We want to present the user with a line graph of the data and let the user drill down by dimension.

The question is how to balance and design the system so that:

(1) data can be obtained and presented to the user in real time (<2 seconds),

(2) data can be aggregated per hour and per day, and

(3) a large amount of data can still be stored in the warehouse.

Our current load is roughly 10 requests per second, which gives us about 800 thousand rows per day. My simple tests with MySQL and a simple star schema show that my queries begin to take more than 2 seconds once the table holds more than 8 million rows.
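To put these numbers in perspective, here is a rough back-of-the-envelope estimate. It assumes a steady 10 requests per second and one fact row per request, both simplifications; the variable names are only illustrative.

```python
# Rough growth estimate. Assumes a constant request rate and
# exactly one fact row per request (both are simplifications).
REQUESTS_PER_SECOND = 10
SECONDS_PER_DAY = 24 * 60 * 60
SLOW_QUERY_ROWS = 8_000_000  # row count at which the test queries exceeded 2 s

rows_per_day = REQUESTS_PER_SECOND * SECONDS_PER_DAY
days_until_slow = SLOW_QUERY_ROWS / rows_per_day
rows_per_year = rows_per_day * 365

print(f"rows per day:        {rows_per_day:,}")
print(f"days until 8M rows:  {days_until_slow:.1f}")
print(f"rows per year:       {rows_per_year:,}")
```

In other words, under these assumptions the raw fact table would cross the 8 million row mark after a little more than nine days of traffic.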

Is it possible to get real-time query performance from a "simple" data store like this and still store a lot of data? (It would be nice never to have to delete any data.)

Are there ways to aggregate the data into higher-level summary tables?
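To make the idea concrete, this is the kind of rollup I have in mind. It is only a minimal sketch: it uses Python's built-in sqlite3 as a stand-in for MySQL, and the table and column names (access_fact, access_hourly, bytes_sent) are hypothetical, not our actual schema.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE access_fact (          -- raw rows, one per request (hypothetical schema)
    ts         INTEGER NOT NULL,    -- Unix timestamp in seconds
    url        TEXT    NOT NULL,
    bytes_sent INTEGER NOT NULL
);
CREATE TABLE access_hourly (        -- one row per (hour, url)
    hour_start INTEGER NOT NULL,    -- timestamp truncated to the hour
    url        TEXT    NOT NULL,
    requests   INTEGER NOT NULL,
    bytes_sent INTEGER NOT NULL,
    PRIMARY KEY (hour_start, url)
);
""")

def rollup_hour(conn, hour_start):
    """Recompute the summary rows for one hour from the raw fact table."""
    with conn:  # commit on success, roll back on error
        conn.execute("DELETE FROM access_hourly WHERE hour_start = ?", (hour_start,))
        conn.execute(
            """
            INSERT INTO access_hourly (hour_start, url, requests, bytes_sent)
            SELECT ?, url, COUNT(*), SUM(bytes_sent)
            FROM access_fact
            WHERE ts >= ? AND ts < ?
            GROUP BY url
            """,
            (hour_start, hour_start, hour_start + 3600),
        )

# A few sample rows spanning two hours.
conn.executemany(
    "INSERT INTO access_fact VALUES (?, ?, ?)",
    [(0, "/a", 100), (10, "/a", 200), (20, "/b", 50), (3600, "/a", 10)],
)
rollup_hour(conn, 0)
rollup_hour(conn, 3600)

hourly = conn.execute(
    "SELECT hour_start, url, requests, bytes_sent "
    "FROM access_hourly ORDER BY hour_start, url"
).fetchall()

# Daily figures can be derived from the hourly table instead of the raw rows.
daily = conn.execute(
    "SELECT hour_start / 86400 AS day, url, SUM(requests), SUM(bytes_sent) "
    "FROM access_hourly GROUP BY day, url ORDER BY day, url"
).fetchall()

print(hourly)
print(daily)
```

The graph would then read from access_hourly (or a daily table built the same way), so the number of rows scanned depends on the number of hours and distinct URLs rather than on the number of requests.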

I have the feeling that this is not really a new question (although I have googled a lot). Maybe someone can point me to data warehouse solutions? One that comes to mind is Splunk.

Perhaps I am asking for too much.

UPDATE

:

  • :

    • (ip-)
    • URL
  • ;

    • ( )
+3
4

, . MySQL .

MyISAM - -. ( , InnoDB - InnoDB ). merge - , .

- , , , RAID .

: , , . 200 , 50 , . , , , . ( ) .

, , , , . - , . ( , . , , 3-4 ).

+1

Seth - , , , .

Mozilla -. , DB Vertica. , , .

, , MongoDB. , , . , ( mongodb )

, , .. http://blog.mongodb.org/post/171353301/using-mongodb-for-real-time-analytics

+2

, , ; , , ~ 5.5M.

, - , , . , , HOUR (myTimestamp) DATE (myTimestamp). , .

, , .

+1

. , 20-100 0,1 ( ), -. .

, , . - , 64- .

, , , .. , , ETL, , . , .

- - . . , , . .

- , Postgresql, MySQL. : postgresql , , , . MySQL, , , , .. , , ​​ db2, oracle, sql-. , parallelism, , ..

0

Source: https://habr.com/ru/post/1726938/

