Pattern for updating SQL Server 2008 slave databases from a master while minimizing disruption

We have an ASP.NET web application hosted on a multi-instance web farm backed by SQL Server 2008, in which we aggregate and pre-process data from multiple sources into a format optimized for fast end-user query performance (generating 5-10 million rows in some tables). Aggregation and optimization are performed by a service on a back-end server, and we then want to distribute the results to several read-only copies used by the web application instances, for maximum scalability.

My question is about the best way to get this data from the back-end database out to the read-only copies without killing their performance in the process. The front-end web application instances will be under constant load and should remain responsive at all times.

The back-end database is continually being updated, so I suspect transactional replication is not the best approach: a constant stream of updates to the copies would hurt their performance.

Slightly stale data is not a big problem, so snapshot replication could be the way to go, but that would cause poor performance during the replication windows.

Doing a truncate and bulk insert would leave periods with no data for user queries.

I don't really want to write a complicated clustering approach in which we pull copies out of the cluster during the update - is there something along those lines we can do without too much effort, or is there a better alternative?

+4
3 answers

There is actually a technology built into SQL Server 2005 (and 2008) that addresses these kinds of scenarios: Service Broker (I'll call it SSB). The catch is that it has a very steep learning curve.

I know that MySpace has publicly discussed using SSB to manage its fleet of SQL Servers: MySpace Uses SQL Server Service Broker to Protect Integrity of 1 Petabyte of Data. I know of several other (large) sites that use similar patterns but, unfortunately, they have not gone public, so I cannot name names. I have personally been involved in some projects built on this technology (I am a former member of the SQL Server team).

Now keep in mind that SSB is not a dedicated data-transfer technology like Replication. So you won't find anything like the Publication Wizard or Replication's simple deployment options (check a table and it gets shipped). SSB is a reliable messaging technology, so its primitives stop at the message level; you will need to write the code that captures data changes (e.g. with Change Data Capture), packages them as messages, and unpacks the messages into relational tables at the destination.
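To give a flavor of what that code looks like, here is a minimal sketch of the source side, assuming Change Data Capture is enabled on a table dbo.MyTable; all the message type, contract, queue, and service names are invented for illustration, not prescribed by SSB:

    -- Source-side plumbing (all //DataFeed/* names are hypothetical):
    CREATE MESSAGE TYPE [//DataFeed/ChangeBatch] VALIDATION = WELL_FORMED_XML;
    CREATE CONTRACT [//DataFeed/Contract] ([//DataFeed/ChangeBatch] SENT BY INITIATOR);
    CREATE QUEUE DataFeedSendQueue;
    CREATE SERVICE [//DataFeed/Source] ON QUEUE DataFeedSendQueue ([//DataFeed/Contract]);
    GO
    -- Pack a batch of captured changes into one XML message and send it:
    DECLARE @h UNIQUEIDENTIFIER,
            @from_lsn BINARY(10) = sys.fn_cdc_get_min_lsn('dbo_MyTable'),
            @to_lsn   BINARY(10) = sys.fn_cdc_get_max_lsn(),
            @payload  XML;
    SET @payload = (SELECT *
                    FROM cdc.fn_cdc_get_all_changes_dbo_MyTable(@from_lsn, @to_lsn, N'all') AS chg
                    FOR XML AUTO, TYPE);
    BEGIN DIALOG CONVERSATION @h
        FROM SERVICE [//DataFeed/Source]
        TO SERVICE N'//DataFeed/Destination'
        ON CONTRACT [//DataFeed/Contract]
        WITH ENCRYPTION = OFF;
    SEND ON CONVERSATION @h MESSAGE TYPE [//DataFeed/ChangeBatch] (@payload);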

So why do some companies prefer SSB over Replication for scenarios like the one you describe? Because SSB has a much better track record for reliability and scalability. I know of projects that exchange data between 1500+ sites, far beyond the capabilities of Replication. SSB also abstracts away the physical topology: you can move databases, rename machines, and rebuild servers without changing the application. Because data flows over logical routes, the application can be updated on the fly to new topologies. SSB is also resilient to long periods of disconnection and downtime, capable of resuming data flow after hours, days, and even months of disconnect. High throughput achieved through deep integration with the engine (SSB is part of the SQL engine itself, not a collection of satellite applications and processes the way Replication is) means a backlog of changes can be processed in reasonable time (I know of sites that push half a million transactions per minute). SSB applications typically rely on internal Activation to process incoming data. SSB also has some unique features, such as built-in load balancing (via routes) with sticky-session semantics, support for deadlock-free correlated processing, priority delivery of data, dedicated support for Database Mirroring, certificate-based authentication for cross-domain operations, built-in persisted timers, and many more.
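The internal Activation mentioned above looks roughly like this on the destination side: SQL Server launches the procedure automatically whenever messages arrive, scaling up to MAX_QUEUE_READERS concurrent readers. Again, a sketch with invented names; usp_ApplyChanges stands in for whatever code shreds the XML into your relational tables:

    CREATE QUEUE DataFeedRecvQueue;
    GO
    CREATE PROCEDURE dbo.usp_ProcessChangeBatch
    AS
    BEGIN
        DECLARE @h UNIQUEIDENTIFIER, @msg XML, @type SYSNAME;
        WHILE 1 = 1
        BEGIN
            WAITFOR (RECEIVE TOP (1)
                         @h    = conversation_handle,
                         @msg  = CAST(message_body AS XML),
                         @type = message_type_name
                     FROM DataFeedRecvQueue), TIMEOUT 1000;
            IF @@ROWCOUNT = 0 BREAK;  -- queue drained, let the activated task exit
            IF @type = N'//DataFeed/ChangeBatch'
                EXEC dbo.usp_ApplyChanges @msg;  -- hypothetical: unpack XML into tables
            ELSE IF @type = N'http://schemas.microsoft.com/SQL/ServiceBroker/EndDialog'
                END CONVERSATION @h;
        END
    END;
    GO
    ALTER QUEUE DataFeedRecvQueue
        WITH ACTIVATION (STATUS = ON,
                         PROCEDURE_NAME = dbo.usp_ProcessChangeBatch,
                         MAX_QUEUE_READERS = 4,
                         EXECUTE AS OWNER);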

So this is not a specific answer to "how do I move data from table T on server A to server B". It is rather a general technology for "exchanging data between server A and server B".

+4

I have never had to deal with this scenario myself, but here is a possible solution I came up with. Basically, it would require a change to your core database structure. Instead of storing the data itself, you track the changes to that data. So if a record is added, you save "Table X, inserted a new record with these values: ...". For updates, you just save the table, the field, and the changed value. For deletions, you just save which record was deleted. Every modification is saved with a timestamp.

Your client systems keep their own local copies of the database and regularly request all changes made after a certain date/time. You then apply those changes to the local database and it is up to date again.

And the back end? Well, it just keeps a list of changes, and perhaps a table with the base data. Keeping only the changes also means you are tracking history, which lets you ask the system what the data looked like a year ago.

How well this works depends on the volume of changes in the database. But if you request the changes every 15 minutes, say, it shouldn't be too much data each time.
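As a rough illustration of what this could look like (all table and column names here are invented), a single change-log table plus a polling query per client:

    CREATE TABLE dbo.ChangeLog (
        ChangeId  BIGINT IDENTITY(1,1) PRIMARY KEY,
        TableName SYSNAME       NOT NULL,
        Operation CHAR(1)       NOT NULL,   -- 'I' insert, 'U' update, 'D' delete
        RecordKey INT           NOT NULL,   -- key of the affected record
        FieldName SYSNAME       NULL,       -- which field changed (updates only)
        NewValue  NVARCHAR(MAX) NULL,       -- new value for 'I'/'U'; NULL for 'D'
        ChangedAt DATETIME      NOT NULL DEFAULT GETUTCDATE()
    );

    -- Each client copy remembers the last change it applied and polls:
    DECLARE @LastAppliedChangeId BIGINT = 0;  -- tracked per client
    SELECT ChangeId, TableName, Operation, RecordKey, FieldName, NewValue
    FROM dbo.ChangeLog
    WHERE ChangeId > @LastAppliedChangeId
    ORDER BY ChangeId;

Polling on a monotonically increasing ChangeId rather than on the timestamp sidesteps clock skew and duplicate timestamps; the ChangedAt column still supports the "what did it look like a year ago" queries.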

Then again, I have never had the chance to try this in a real application, so for me it is still a theoretical principle. It looks promising, but it would take quite a bit of work.

+1

Option 1: Write a row-level transactional data-transfer application. This may take longer to build, but it won't disrupt the site that is using the data, because the rows are present both before and after each transfer, just with newer data. The processing can run on a separate server to minimize load.

In SQL Server 2008 you can set READ_COMMITTED_SNAPSHOT to ON so that rows being updated do not block readers.
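Turning it on is a one-liner (the database name here is illustrative), though it needs a moment of exclusive access to the database:

    -- Readers see the last committed version of each row instead of blocking:
    ALTER DATABASE ReportDb
        SET READ_COMMITTED_SNAPSHOT ON
        WITH ROLLBACK IMMEDIATE;  -- rolls back in-flight transactions so the change can apply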

But basically, the whole application just reads new data as it becomes available and copies it from one database to the other.
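A hedged sketch of that transfer loop, assuming a linked server named SourceServer and an ever-increasing Id column as the high-water mark (all names invented):

    -- Copy new rows in small batches so each transaction stays short and
    -- readers on this database are never starved:
    DECLARE @BatchSize INT = 5000;
    WHILE 1 = 1
    BEGIN
        INSERT INTO dbo.ReportData (Id, Payload)
        SELECT TOP (@BatchSize) s.Id, s.Payload
        FROM SourceServer.AggDb.dbo.ReportData AS s
        WHERE s.Id > (SELECT ISNULL(MAX(Id), 0) FROM dbo.ReportData)
        ORDER BY s.Id;
        IF @@ROWCOUNT = 0 BREAK;  -- caught up with the source
    END;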

Option 2: Move the data (tables or an entire database) from the aggregation server to another server. Automate this if possible. Then switch your web application to point at the new database or tables for future queries. This works, but requires some control over the web application, which you may not have.

Option 3: If you are talking about a single table (though this can work with many), you can do a swap. You write your code against a SQL view that points at table A. You load the new data into table B, and when it is ready you alter the view to point at table B. You could even write a function that determines the active table and automates the whole swap, as sketched below.
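A minimal sketch of that swap (view and table names invented):

    -- The web app only ever queries the view:
    CREATE VIEW dbo.ReportData_Current AS
        SELECT * FROM dbo.ReportData_A;
    GO
    -- Rebuild into dbo.ReportData_B offline, then flip the view when ready:
    ALTER VIEW dbo.ReportData_Current AS
        SELECT * FROM dbo.ReportData_B;

ALTER VIEW is a quick metadata-only change (it briefly takes a schema lock), so in-flight queries finish against the old table and new ones immediately see the fresh data.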

Option 4: You might be able to use something like byte-level replication between servers. That sounds scary. It basically copies the server from point A to point B right down to the bytes. It is mostly used in DR scenarios, and while yours isn't exactly a DR situation, it kinda/sorta is.

Option 5: Give up and learn how to sell insurance. :)

+1

Source: https://habr.com/ru/post/1286411/

