Considerations for very large SQL tables?

I am basically building an ad server. This is a personal project that I am trying to impress my boss with, and I would love any feedback on my design. I have already implemented most of what I describe below, but it's never too late to refactor :)

It is a service that serves banner ads (http://myserver.com/banner.jpg, which links to http://myserver.com/clicked) and provides reporting on subsets of the data.

For each ad served and each click, I need to record a row that has (id, value), where value is the dollar amount of that transaction: for example, $.001 for each banner impression at $1 per thousand impressions, or +$.25 for a click. My output is all based on earnings per impression [EPC for short]: (SUM(value)/COUNT(impressions)), but on subsets of the data, such as "earnings per impression where browser == 'Firefox'". The goal is to output something like "Your overall EPC is $.50, but where browser == 'Firefox', your EPC is $1.00", so that the end user can quickly see the significant factors in their data.
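To make the EPC formula concrete, here is a minimal sketch in Python, using the standard-library sqlite3 module as a stand-in for MySQL. The flattened `hits` table, its `kind` and `browser` columns, and the sample values are all assumptions made up for illustration; the values are chosen only so the result matches the $.50 / $1.00 example above.

```python
import sqlite3

# Hypothetical, flattened schema: one row per hit, browser stored inline.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE hits (id INTEGER PRIMARY KEY, kind TEXT, browser TEXT, value REAL)"
)
conn.executemany(
    "INSERT INTO hits (kind, browser, value) VALUES (?, ?, ?)",
    [
        # Made-up sample values, not the real per-hit prices.
        ("impression", "Firefox", 0.0),
        ("click", "Firefox", 1.0),
        ("impression", "Chrome", 0.0),
        ("impression", "Chrome", 0.0),
        ("impression", "Chrome", 0.0),
        ("click", "Chrome", 1.0),
    ],
)


def epc(conn, browser=None):
    """Return SUM(value) / COUNT(impressions), optionally for one browser."""
    where, params = "", ()
    if browser is not None:
        where, params = "WHERE browser = ?", (browser,)
    total, impressions = conn.execute(
        "SELECT COALESCE(SUM(value), 0), "
        "COALESCE(SUM(CASE WHEN kind = 'impression' THEN 1 ELSE 0 END), 0) "
        "FROM hits " + where,
        params,
    ).fetchone()
    # No impressions in the subset means EPC is undefined.
    return total / impressions if impressions else None


print("overall:", epc(conn))
print("Firefox:", epc(conn, "Firefox"))
```

Every subset is the same query with a different WHERE clause, which is why the number of subsets, rather than the formula itself, is what drives the cost.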

Since there is a very large number of these subsets (tens of thousands), and the report output only needs to include summary data, I precompute the EPC per subset with a background cron task and store these summary values in the database. Once every 2-3 hits, a Hit needs to query the Hits table for other recent Hits by the same Visitor (for example, "find the REFERER of the last Hit"), but usually each Hit performs only an INSERT, so to keep response times down, I have split the application across 3 servers [bgprocess, mysql, hitserver].
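The precompute step might look roughly like the sketch below, again in Python with sqlite3 standing in for MySQL. The `epc_summary` table, its columns, and the function names are hypothetical, and only the browser dimension is shown. In MySQL the upsert would be written with `REPLACE INTO` or `INSERT ... ON DUPLICATE KEY UPDATE` instead of SQLite's `INSERT OR REPLACE`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE hits (id INTEGER PRIMARY KEY, kind TEXT, browser TEXT, value REAL);
-- Hypothetical summary table: one row per (dimension, value) subset.
CREATE TABLE epc_summary (
    dimension TEXT NOT NULL,
    dim_value TEXT NOT NULL,
    epc REAL,
    PRIMARY KEY (dimension, dim_value)
);
""")


def record_hit(conn, kind, browser, value):
    """Hot path: a single INSERT per hit."""
    conn.execute(
        "INSERT INTO hits (kind, browser, value) VALUES (?, ?, ?)",
        (kind, browser, value),
    )


def refresh_browser_summaries(conn):
    """Background job: recompute EPC per browser and overwrite the summary rows."""
    with conn:  # commit on success, roll back on error
        conn.execute("""
            INSERT OR REPLACE INTO epc_summary (dimension, dim_value, epc)
            SELECT 'browser', browser,
                   SUM(value) / NULLIF(
                       SUM(CASE WHEN kind = 'impression' THEN 1 ELSE 0 END), 0)
            FROM hits
            GROUP BY browser
        """)


def summary_epc(conn, browser):
    """Report path: read the precomputed value, never the raw hits."""
    row = conn.execute(
        "SELECT epc FROM epc_summary WHERE dimension = 'browser' AND dim_value = ?",
        (browser,),
    ).fetchone()
    return row[0] if row else None


record_hit(conn, "impression", "Firefox", 0.0)  # made-up sample values
record_hit(conn, "click", "Firefox", 1.0)
refresh_browser_summaries(conn)
print(summary_epc(conn, "Firefox"))
```

This version rescans the whole `hits` table on every run, which is the part that gets expensive as the table grows; an incremental variant would only aggregate rows added since the previous run.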

Currently, I have structured all this as 3 normalized tables: Hits, Events, and Visitors. A Visitor row is unique per visitor session. A Hit is recorded every time a Visitor loads a banner or makes a click, and Events map the distinct many-to-many relationships from Visitors to Hits (for example, an Event is "Visitor X at Banner Y", which is unique but may have several Hits). The reason I keep all Hit data in one table is that, although my example above only describes "banner impressions → clicks", we also track "clicks → pixel fires", "pixel fires → second clickthrough", and "second clickthrough → sale page pixel".
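For readers who want something concrete, the three tables could be sketched like this (Python with sqlite3 standing in for MySQL). All column names, the index, the example.com referers, and the timestamps are assumptions for illustration, not the actual schema. The query shows the "REFERER of the last Hit" lookup for one Visitor.

```python
import sqlite3

# Hypothetical columns; only the three table names come from the description.
SCHEMA = """
CREATE TABLE visitors (id INTEGER PRIMARY KEY, browser TEXT);
CREATE TABLE events (
    id INTEGER PRIMARY KEY,
    visitor_id INTEGER NOT NULL REFERENCES visitors(id),
    banner_id INTEGER NOT NULL,
    UNIQUE (visitor_id, banner_id)       -- "Visitor X at Banner Y" is unique
);
CREATE TABLE hits (
    id INTEGER PRIMARY KEY,
    event_id INTEGER NOT NULL REFERENCES events(id),
    kind TEXT NOT NULL,                  -- impression, click, pixel fire, ...
    referer TEXT,
    value REAL NOT NULL,
    at TEXT NOT NULL                     -- timestamp of the hit
);
CREATE INDEX hits_event_at ON hits (event_id, at);
"""


def last_referer(conn, visitor_id):
    """Return the REFERER of the visitor's most recent hit, or None."""
    row = conn.execute(
        "SELECT h.referer FROM hits h "
        "JOIN events e ON e.id = h.event_id "
        "WHERE e.visitor_id = ? "
        "ORDER BY h.at DESC, h.id DESC LIMIT 1",
        (visitor_id,),
    ).fetchone()
    return row[0] if row else None


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute("INSERT INTO visitors (id, browser) VALUES (1, 'Firefox')")
conn.execute("INSERT INTO events (id, visitor_id, banner_id) VALUES (10, 1, 7)")
conn.executemany(
    "INSERT INTO hits (event_id, kind, referer, value, at) VALUES (?, ?, ?, ?, ?)",
    [
        (10, "impression", "http://example.com/a", 0.001, "2024-01-01 10:00:00"),
        (10, "click", "http://example.com/b", 0.25, "2024-01-01 10:00:05"),
    ],
)
print(last_referer(conn, 1))
```

Because one Event can have several Hits, the lookup has to join through `events`, so the index on `(event_id, at)` matters more and more as `hits` grows.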

So, SO-ers, [...]? Should I split the table, e.g. into ProcessedHits ([...]) and UnprocessedHits ([...]), or partition by the Hit.at date?

TIA!

InnoDB or MyISAM?

Source: https://habr.com/ru/post/1727302/

