I have been tasked with creating a process that synchronizes data between a CSV file produced by a third-party provider and more than 300 separate but structurally identical CRM databases. All of the CRM databases live in a single SQL Server instance. Here are the specifics:
The source data will be a CSV file containing a list of all the email addresses for which customers have opted in to marketing communications. The file will be sent in full every night, but it will carry date and time stamps at the record level, which will let me select only those records that have changed since the last processing cycle. The file could run to many hundreds of thousands of lines, although the number of changes expected on any given day will be substantially lower.
I will select the data from the CSV and convert each row into a custom object held in a List<T>.
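As a rough sketch of that step, the following parses the rows and keeps only those changed since the last run. The column layout (`email,optedIn,timestamp`), the `MarketingPreference` type, and the `CsvLoader` helper are all assumptions of mine for illustration; the real file layout has not been specified, and a proper CSV parser would be needed if fields can contain quoted commas.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical row type; the real CSV columns are not known.
public sealed record MarketingPreference(string Email, bool OptedIn, DateTime ModifiedUtc);

public static class CsvLoader
{
    // Parses lines of the assumed form "email,optedIn,timestamp" and keeps
    // only the rows modified after the previous processing cycle.
    public static List<MarketingPreference> LoadChangedSince(
        IEnumerable<string> lines, DateTime lastRunUtc)
    {
        var changed = new List<MarketingPreference>();
        foreach (string line in lines)
        {
            if (string.IsNullOrWhiteSpace(line)) continue;

            // Naive split: assumes no quoted fields or embedded commas.
            string[] parts = line.Split(',');
            if (parts.Length != 3) continue; // skip malformed rows

            // Rows whose flag or timestamp cannot be parsed (e.g. a header) are skipped.
            if (!bool.TryParse(parts[1].Trim(), out bool optedIn)) continue;
            if (!DateTime.TryParse(parts[2].Trim(), CultureInfo.InvariantCulture,
                    DateTimeStyles.AdjustToUniversal | DateTimeStyles.AssumeUniversal,
                    out DateTime modifiedUtc)) continue;

            if (modifiedUtc > lastRunUtc)
                changed.Add(new MarketingPreference(parts[0].Trim(), optedIn, modifiedUtc));
        }
        return changed;
    }
}
```

In practice the lines could come from `File.ReadLines(path)`, which streams the file rather than loading several hundred thousand rows into memory at once.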
Once the CSV has been queried and the data converted, I will need to compare the contents of this List<T> with the CRM databases. This is because any email address contained in the CSV file can:
- Not exist in any of the 300 databases
- Exist in exactly one of the 300 databases
- Exist in several databases
In every case where an email address in the CSV master list matches a record in a CRM database, the corresponding CRM record will be updated with the values contained in the CSV file.
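To make the three cases concrete, here is a small in-memory sketch. The nested dictionaries are only a stand-in for the real CRM databases, and `MatchSketch`, `ApplyUpdate`, and the single opt-in flag are hypothetical names and simplifications of mine, not part of the actual schema.

```csharp
using System;
using System.Collections.Generic;

public static class MatchSketch
{
    // databases: database name -> (email address -> stored opt-in flag).
    // Returns the number of databases in which the address was found and updated.
    public static int ApplyUpdate(
        Dictionary<string, Dictionary<string, bool>> databases,
        string email, bool optedIn)
    {
        int matches = 0;
        foreach (Dictionary<string, bool> db in databases.Values)
        {
            if (db.ContainsKey(email))
            {
                db[email] = optedIn; // overwrite the stored value with the CSV value
                matches++;
            }
        }
        return matches; // 0 = in no database, 1 = in exactly one, >1 = in several
    }
}
```

Note that the dictionaries compare keys case-sensitively by default; if the real databases use a case-insensitive collation, constructing them with `StringComparer.OrdinalIgnoreCase` would mimic that more closely.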
At a very high level, I thought I would need to do something like this:
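    foreach (string dbName in masterDatabaseList)
    {
        foreach (string emailAddress in masterEmailList)
        {
            // Check whether this address exists in the current database
            bool matchFound = EmailExistsInDb(dbName, emailAddress);
            if (matchFound)
            {
                // Update the matching CRM record with the values from the CSV
            }
        }
    }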
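Is this the most efficient approach? I am not keen on hitting 300 databases once for every email address in the CSV. Alternatively, I could generate a SQL statement along these lines: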
"SELECT * FROM EMAIL_TABLE WHERE EMAIL_ADDRESS IN (email1, email2, email3, ...)"
, , /, , SQL .
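A query of that shape could be generated in batches with parameter placeholders rather than inlined addresses, roughly as sketched below. `InQueryBuilder` and the `@p0, @p1, ...` naming are my own placeholders. The default batch size of 1,000 is an arbitrary choice that stays under SQL Server's limit of 2,100 parameters per request.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class InQueryBuilder
{
    // Splits the addresses into batches and yields, for each batch, the SQL text
    // plus the parameter name/value pairs that would be bound to it.
    public static IEnumerable<(string Sql, List<(string Name, string Value)> Parameters)> Build(
        IReadOnlyList<string> emails, int batchSize = 1000)
    {
        if (batchSize <= 0) throw new ArgumentOutOfRangeException(nameof(batchSize));

        for (int start = 0; start < emails.Count; start += batchSize)
        {
            int end = Math.Min(start + batchSize, emails.Count);
            var parameters = new List<(string Name, string Value)>();
            for (int i = start; i < end; i++)
                parameters.Add(("@p" + (i - start), emails[i]));

            // Only placeholder names go into the SQL text; values are bound separately.
            string placeholders = string.Join(", ", parameters.Select(p => p.Name));
            yield return (
                $"SELECT * FROM EMAIL_TABLE WHERE EMAIL_ADDRESS IN ({placeholders})",
                parameters);
        }
    }
}
```

Each yielded pair would then be turned into a command against one database, with every name/value pair added as a command parameter. This keeps the number of round trips per database at roughly the number of changed addresses divided by the batch size, instead of one per address.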
? 300 , , , . , , .