MySQL: choosing a database structure - big data - duplicate data or bridge tables

We have a 90 GB MySQL database with some very large tables (over 100 M rows). We know that MySQL is not the best DB engine for this, but that is not something we can change at this stage.

We are planning a serious refactoring (performance and standardization) and are considering several approaches to restructuring our tables.

Currently, the data flow / storage is as follows:

  • We have one table called articles, one join table called article_authors, and one authors table

  • One author may have 1..n firstnames, 1..n lastnames, 1..n emails

  • Each author has a unique parent (unique_author), unless that author is itself the parent

Possible data request scenarios:

  • Get the authors' first names, last names and email addresses for a given article
  • Get a unique author.id for an author called John Smith
  • Get all articles by John Smith
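The three scenarios above can be sketched against a simplified version of the current structure. This is only an illustration under stated assumptions: the column names (`unique_author_id`, `first_name`, `last_name`, `email`), the sample rows, and the helper functions are hypothetical, each author row is reduced to a single name and email, and SQLite (via Python's `sqlite3`) stands in for MySQL so the example runs anywhere.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption for runnability).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE authors (
    id INTEGER PRIMARY KEY,
    unique_author_id INTEGER REFERENCES authors(id),  -- NULL when the row is itself the parent
    first_name TEXT,
    last_name TEXT,
    email TEXT
);
CREATE TABLE article_authors (
    article_id INTEGER NOT NULL REFERENCES articles(id),
    author_id INTEGER NOT NULL REFERENCES authors(id),
    PRIMARY KEY (article_id, author_id)
);
-- Indexes that support the name lookup and the author -> article direction.
CREATE INDEX idx_authors_name ON authors (last_name, first_name);
CREATE INDEX idx_article_authors_author ON article_authors (author_id, article_id);

INSERT INTO articles VALUES (1, 'Paper A'), (2, 'Paper B');
INSERT INTO authors VALUES
    (10, NULL, 'John', 'Smith', 'john@example.com'),
    (11, 10,   'J.',   'Smith', 'jsmith@example.com'),
    (12, NULL, 'Ann',  'Lee',   'ann@example.com');
INSERT INTO article_authors VALUES (1, 10), (2, 11), (2, 12);
""")


def authors_of_article(article_id):
    """Scenario 1: first name, last name and email of every author of an article."""
    return conn.execute(
        """SELECT a.first_name, a.last_name, a.email
           FROM article_authors aa
           JOIN authors a ON a.id = aa.author_id
           WHERE aa.article_id = ?
           ORDER BY a.id""",
        (article_id,),
    ).fetchall()


def unique_author_ids(first_name, last_name):
    """Scenario 2: parent (unique) author ids matching a name."""
    rows = conn.execute(
        """SELECT DISTINCT COALESCE(unique_author_id, id)
           FROM authors
           WHERE first_name = ? AND last_name = ?
           ORDER BY 1""",
        (first_name, last_name),
    ).fetchall()
    return [r[0] for r in rows]


def articles_by(first_name, last_name):
    """Scenario 3: all articles written under any variant of the named author."""
    return conn.execute(
        """SELECT DISTINCT ar.id, ar.title
           FROM authors named
           JOIN authors variant
             ON COALESCE(variant.unique_author_id, variant.id)
              = COALESCE(named.unique_author_id, named.id)
           JOIN article_authors aa ON aa.author_id = variant.id
           JOIN articles ar ON ar.id = aa.article_id
           WHERE named.first_name = ? AND named.last_name = ?
           ORDER BY ar.id""",
        (first_name, last_name),
    ).fetchall()
```

In this sketch the third query has to resolve every name variant through the parent id before reaching `article_authors`, which is where the duplicated names in `authors` start to cost extra work.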

The current database schema looks like this: [schema image not available]

EDIT: The main problem with this structure is that we always duplicate the same first names and last names.

Now we are undecided between two different structures:

  • A large number of tables: the data is split up and linked through identifiers (bridge tables). There are no duplicates in the main tables, articles and authors. We are not sure how this will affect performance, since we will need multiple joins to retrieve data, for example:

1
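A minimal sketch of what the first option could look like, assuming one lookup table per attribute plus a bridge table per attribute. All table names, column names and helpers here are hypothetical and not taken from the original schema image; SQLite via Python's `sqlite3` stands in for MySQL, and `INSERT OR IGNORE` is SQLite syntax (MySQL spells it `INSERT IGNORE`).

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE authors (id INTEGER PRIMARY KEY);
-- Each distinct name is stored exactly once.
CREATE TABLE first_names (id INTEGER PRIMARY KEY, value TEXT NOT NULL UNIQUE);
CREATE TABLE last_names  (id INTEGER PRIMARY KEY, value TEXT NOT NULL UNIQUE);
-- Bridge tables: an author may have 1..n first names and 1..n last names.
CREATE TABLE author_first_names (
    author_id INTEGER NOT NULL REFERENCES authors(id),
    first_name_id INTEGER NOT NULL REFERENCES first_names(id),
    PRIMARY KEY (author_id, first_name_id)
);
CREATE TABLE author_last_names (
    author_id INTEGER NOT NULL REFERENCES authors(id),
    last_name_id INTEGER NOT NULL REFERENCES last_names(id),
    PRIMARY KEY (author_id, last_name_id)
);
CREATE TABLE article_authors (
    article_id INTEGER NOT NULL,
    author_id INTEGER NOT NULL REFERENCES authors(id),
    PRIMARY KEY (article_id, author_id)
);
""")


def lookup_id(table, value):
    """Return the id of `value` in a lookup table, inserting it only if missing.

    `table` is interpolated into the SQL, so it must be a trusted, hard-coded name.
    """
    db.execute(f"INSERT OR IGNORE INTO {table} (value) VALUES (?)", (value,))
    return db.execute(f"SELECT id FROM {table} WHERE value = ?", (value,)).fetchone()[0]


def add_author(author_id, first, last):
    """Create an author and link it to shared first-name and last-name rows."""
    db.execute("INSERT INTO authors (id) VALUES (?)", (author_id,))
    db.execute("INSERT INTO author_first_names VALUES (?, ?)",
               (author_id, lookup_id("first_names", first)))
    db.execute("INSERT INTO author_last_names VALUES (?, ?)",
               (author_id, lookup_id("last_names", last)))


def names_for_article(article_id):
    """Four joins are needed just to read back the names for one article."""
    return db.execute(
        """SELECT fn.value, ln.value
           FROM article_authors aa
           JOIN author_first_names afn ON afn.author_id = aa.author_id
           JOIN first_names fn ON fn.id = afn.first_name_id
           JOIN author_last_names aln ON aln.author_id = aa.author_id
           JOIN last_names ln ON ln.id = aln.last_name_id
           WHERE aa.article_id = ?
           ORDER BY aa.author_id""",
        (article_id,),
    ).fetchall()


add_author(1, "John", "Smith")
add_author(2, "John", "Doe")
db.execute("INSERT INTO article_authors VALUES (100, 1)")
db.execute("INSERT INTO article_authors VALUES (100, 2)")
```

Here "John" is stored once even though two authors share it, which removes the duplication, but reading the names for a single article already takes four joins; that is the trade-off the question is asking about.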

  1. article_authors ( , ), . 10 , 10 article_authors:

2


, , . , ? , : http://mysql.rjweb.org/doc.php/index_cookbook_mysql#many_to_many_mapping_table

# 1 " ". .

# 2 . phone_number last_name, phone_numbers (, , , ), . (, , ).

; , , . 1: . ( , - - - . .)

, . " " . . ? .

"id". MEDIUMINT UNSIGNED AUTO_INCREMENT . "J. K. Rowling" "JK Rowling" id.

, id . id ..

, . , . .

, :

  • , , author_id , .
  • , author_id .

( , DATABASEs MySQL.)

, . , .

, -, - ( "JK Rawling" ) . , - .

, , . , - , - , . , ; ALTER TABLE CREATE TABLE .

, (provider_id, full_author_name, author_id), , . , . , . , .

, , . .

... " " . author_id .

, . (SQL .)


, : "jkrowling @" + "gmail.com". , , , ...


Source: https://habr.com/ru/post/1690019/

