How to normalize an SQL database

I was wondering if anyone has suggestions for normalizing a database. I do not mean designing the structure; I mean how to actually move the data from the old structure to a new, normalized one. I know I can write something like a PHP script, but I was wondering whether there is a way to do this in SQL, in particular MySQL.

EDIT: Has anyone tried something like SwisSQL? It is a migration tool, but I'm not sure it will do what I'm asking.

3 answers

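Here is an example of normalizing a table with a script. I would advise doing something like this. Note that the example uses SQL Server syntax; in MySQL, IDENTITY(1,1) becomes AUTO_INCREMENT.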

e.g. Table: tbl_tmpData

    Date        ProductName         ProductCode  ProductType  MarketDescription  Units  Value
    2010-01-01  'Arnotts Biscuits'  '01'         'Biscuit'    'Store 1'          20     20.00
    2010-01-02  'Arnotts Biscuits'  '01'         'Biscuit'    'Store 2'          40     40.00
    2010-01-03  'Arnotts Biscuits'  '01'         'Biscuit'    'Store 3'          40     40.00
    2010-01-01  'Cola'              '02'         'Drink'      'Store 1'          40     80.00
    2010-01-02  'Cola'              '02'         'Drink'      'Store 2'          20     40.00
    2010-01-03  'Cola'              '02'         'Drink'      'Store 2'          60     120.00
    2010-01-01  'Simiri Gum'        '03'         'Gum'        'Store 1'          40     80.00
    2010-01-02  'Simiri Gum'        '03'         'Gum'        'Store 2'          20     40.00
    2010-01-03  'Simiri Gum'        '03'         'Gum'        'Store 3'          60     120.00

First you must create your date table:

    CREATE TABLE tbl_Date (
        DateID    int PRIMARY KEY IDENTITY(1,1),
        DateValue datetime
    );

    -- Load each distinct date from the staging table (tbl_tmpData) once.
    INSERT INTO tbl_Date (DateValue)
    SELECT DISTINCT Date
    FROM tbl_tmpData
    WHERE Date NOT IN (SELECT DateValue FROM tbl_Date);

Next, create the market table:

    CREATE TABLE tbl_Market (
        MarketID   int PRIMARY KEY IDENTITY(1,1),
        MarketName varchar(200)
    );

    -- Compare the staging column against the names already stored.
    INSERT INTO tbl_Market (MarketName)
    SELECT DISTINCT MarketDescription
    FROM tbl_tmpData
    WHERE MarketDescription NOT IN (SELECT MarketName FROM tbl_Market);

Then create the ProductType table:

    CREATE TABLE tbl_ProductType (
        ProductTypeID int PRIMARY KEY IDENTITY(1,1),
        ProductType   varchar(200)
    );

    INSERT INTO tbl_ProductType (ProductType)
    SELECT DISTINCT tmp.ProductType
    FROM tbl_tmpData tmp
    WHERE tmp.ProductType NOT IN (SELECT pt.ProductType FROM tbl_ProductType pt);

Then create the product table:

    CREATE TABLE tbl_Product (
        ProductID          int PRIMARY KEY IDENTITY(1,1),
        ProductCode        varchar(100),
        ProductDescription varchar(300),
        ProductType        int  -- holds tbl_ProductType.ProductTypeID
    );

    -- Store the type's surrogate key, not its text label.
    INSERT INTO tbl_Product (ProductCode, ProductDescription, ProductType)
    SELECT DISTINCT tmp.ProductCode, tmp.ProductName, pt.ProductTypeID
    FROM tbl_tmpData tmp
    INNER JOIN tbl_ProductType pt ON tmp.ProductType = pt.ProductType
    WHERE tmp.ProductCode NOT IN (SELECT ProductCode FROM tbl_Product);

Finally, create the data table:

    CREATE TABLE tbl_Data (
        DataID    int PRIMARY KEY IDENTITY(1,1),
        DateID    int,  -- tbl_Date.DateID
        ProductID int,  -- tbl_Product.ProductID
        MarketID  int,  -- tbl_Market.MarketID
        Units     decimal(10,5),
        Value     decimal(10,5)
    );

    -- Resolve each staging row to its lookup keys and aggregate per key combination.
    INSERT INTO tbl_Data (DateID, ProductID, MarketID, Units, Value)
    SELECT t.DateID, p.ProductID, m.MarketID, SUM(tmp.Units), SUM(tmp.Value)
    FROM tbl_tmpData tmp
    INNER JOIN tbl_Date t ON tmp.Date = t.DateValue
    INNER JOIN tbl_Product p ON tmp.ProductCode = p.ProductCode
    INNER JOIN tbl_Market m ON tmp.MarketDescription = m.MarketName
    GROUP BY t.DateID, p.ProductID, m.MarketID
    ORDER BY t.DateID, p.ProductID, m.MarketID;
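
Since the question is about MySQL, here is a minimal sketch of the market-table step in MySQL syntax. It assumes the staging table tbl_tmpData from the example above. The NOT NULL column, the anti-join used in place of NOT IN, and the final check query are my own additions rather than part of the answer.

```sql
-- MySQL sketch of one lookup-table step (assumes tbl_tmpData exists as above).
CREATE TABLE IF NOT EXISTS tbl_Market (
    MarketID   INT AUTO_INCREMENT PRIMARY KEY,
    MarketName VARCHAR(200) NOT NULL
);

-- Anti-join: insert only names that tbl_Market does not contain yet,
-- so the statement can be re-run without creating duplicates.
INSERT INTO tbl_Market (MarketName)
SELECT DISTINCT tmp.MarketDescription
FROM tbl_tmpData tmp
LEFT JOIN tbl_Market m ON m.MarketName = tmp.MarketDescription
WHERE m.MarketID IS NULL
  AND tmp.MarketDescription IS NOT NULL;

-- Check: count staging rows whose market still has no lookup entry.
-- Rows with a NULL MarketDescription are counted here as well.
SELECT COUNT(*) AS unmatched_rows
FROM tbl_tmpData tmp
LEFT JOIN tbl_Market m ON m.MarketName = tmp.MarketDescription
WHERE m.MarketID IS NULL;
```

The same pattern should carry over to the date, product type and product tables. The check query is worth running before loading the fact table, because the INNER JOINs in the final INSERT silently drop any staging row that fails to match.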

Download MySQL Workbench from the MySQL website and connect it to your MySQL instance.

Once that is done, write a script that transforms your data into the structure you want.


I recently did this and have an idea of how the general procedure goes.

  • Start by modeling your data. When you start from an unnormalized database, you need to create a suitable model to migrate your data into. This includes identifying the atomic entities that must live in their own tables. Identify duplicate data and determine where it should go. Also list all the relationships that exist in your data structure.

    Optional step. The database usually comes with an application or interface that probably needs updating as well. At this stage, look at the project as a whole and decide whether any isolated parts can wait, both in the data structure and in the interface program. How much to include is determined by practical aspects such as time and budget. Some parts may not need modification at all.

    It may also be possible to start completely from scratch, skip backward compatibility, and run the two systems in parallel.

  • Write a script that adds all the new columns and tables needed for the normalized data.

  • Write another script that transfers the unnormalized data to the new normalized structure. This is the hardest part, I would say, and it can get pretty messy, depending on how bad the old data is.

  • Enforce all model constraints on the new normalized data by adding constraints to the new tables and columns. This is also best done in a script. It tells you whether the data migration succeeded: if it did, all the constraints can be added; if it did not, some constraint will fail, and you will have to go back and see what went wrong.

  • Finally, create another script that drops all columns and tables that were removed in the new model. Doing this makes it easy to identify all the places in the interface that need updating: everything that references these columns and tables has to be changed.
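
The scripts from the last four steps could be sketched in MySQL roughly as follows. All names here are hypothetical: a legacy table orders_flat(order_id, customer_name, customer_email, amount) whose repeated customer columns are moved into a new customers table. The sketch also assumes InnoDB tables (required for foreign keys) and strict SQL mode.

```sql
-- Script 1: add the new table and column, without constraints yet.
CREATE TABLE customers (
    customer_id INT AUTO_INCREMENT PRIMARY KEY,
    name        VARCHAR(200),
    email       VARCHAR(200)
);
ALTER TABLE orders_flat ADD COLUMN customer_id INT NULL;

-- Script 2: transfer the data.
-- One customers row per distinct (name, email) pair found in the old table.
INSERT INTO customers (name, email)
SELECT DISTINCT customer_name, customer_email
FROM orders_flat;

-- Point each order at its customer. <=> is MySQL's NULL-safe equality,
-- so pairs containing NULLs are matched too.
UPDATE orders_flat o
JOIN customers c
  ON c.name  <=> o.customer_name
 AND c.email <=> o.customer_email
SET o.customer_id = c.customer_id;

-- Script 3: enforce the model. In strict mode this fails if any order
-- was left without a customer, which signals a gap in script 2.
ALTER TABLE orders_flat
    MODIFY customer_id INT NOT NULL,
    ADD CONSTRAINT fk_orders_customer
        FOREIGN KEY (customer_id) REFERENCES customers (customer_id);

-- Script 4: drop what the new model no longer contains.
ALTER TABLE orders_flat
    DROP COLUMN customer_name,
    DROP COLUMN customer_email;
```

Keeping the four scripts separate means a failure in script 3 can be investigated while the old columns still exist, and script 4 only runs once the constraints hold.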

A general tip: do all development against a (possibly abridged) copy of the database. In MySQL, for instance, you can create an SQL dump with Workbench and test your scripts against it. You will probably need several iterations on the copy before running the real migration. Likewise, perform the actual migration on a copy of the database first, so that nothing breaks in production.
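
An abridged copy can also be built in plain SQL on the same server. This is only a sketch: the schema names shop and shop_dev, the table orders_flat, its order_id column and the row limit are all hypothetical.

```sql
-- Scratch schema for developing and testing the migration scripts.
CREATE DATABASE IF NOT EXISTS shop_dev;

-- Copy the table definition (columns and indexes, but not foreign keys).
CREATE TABLE shop_dev.orders_flat LIKE shop.orders_flat;

-- Copy a slice of the rows; ORDER BY makes the slice repeatable.
INSERT INTO shop_dev.orders_flat
SELECT *
FROM shop.orders_flat
ORDER BY order_id
LIMIT 10000;
```

A slice like this is handy for fast iterations, but a final rehearsal against a full dump is still advisable, since messy rows may not show up in a small sample.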


Source: https://habr.com/ru/post/1341550/

