The most efficient way to query multiple identical tables in separate databases

Question

I have a server (SQL Server 2005) with a number of archive databases (one per quarter, going back 8 years) that are all structurally identical.

I frequently need to query a specific date range that spans n databases. Usually n is small (1-3), but occasionally I need to query the whole set.

Any thoughts on the most efficient way to do this, in terms of both code cleanliness and performance?

The current solutions are rather ad hoc: there is a collection of views that span either all of the databases or only the most recent one, and other solutions generate dynamic SQL that works out which databases contain the data being sought.

Obviously, table partitioning would be the ideal solution, but I cannot do that because it is a third-party database.

Dave

EDIT: I cannot combine the databases, as they are controlled by a third party. The total data size is about 50 GB, which is not huge; the largest tables contain about 1.5 million rows per quarter.

EDIT2: a data warehouse is certainly the right solution in the long run (it is planned), but I cannot do it today :(

+3

sql database sql-server-2005

David Hayes Nov 04 '09 at 19:06

5 answers

Here is something that should do it!

    Declare
        @Database varchar(8000),
        @Sql varchar(8000)
    BEGIN
        Declare DBName Cursor LOCAL FAST_FORWARD For
            Select name FROM sys.databases where name like 'Your_DB_Names%'

        Open DBName
        WHILE (1 = 1)
        Begin
            Fetch Next From DBName into @Database

            if @@Fetch_status = -1 Break
            if @@Fetch_status = -2 Continue

            -- Note: a USE run through Execute only applies inside that dynamic batch
            Set @Sql = 'use ' + @Database
            Print @Sql
            Execute (@Sql)

            SELECT * FROM TABLE -- your query here
        End
        Close DBName
        Deallocate DBName
    END

+3

Danielle Paquette-Harvey Nov 04 '09 at 19:44

I have done this often, and let me tell you that separate databases are a pain. They force you to write all kinds of logic like this, which rather defeats the encapsulation that a database is supposed to provide in the first place.

What you are looking for is a data warehouse. You should consider consolidating all your databases into one and making it read-only. You then take incremental backups (hourly, for example) of your live data and restore them to your warehouse. That way your warehouse is always up to date, and you run your reports against it instead of against the live data.

This keeps your reports from killing your live databases, and I would guess that 90% or more of business needs do not require numbers that are 100% up to the minute.

Do the hard stuff once: create a warehouse. :-)

EDIT

Something I have done in the past is to create a view over the tables I use, going through linked servers where the databases were on other machines:

    Create view view_table
    as
    select * from activedb.dbo.table
    union
    select * from db1.dbo.table
    union
    select * from db2.dbo.table

Not great for performance, but it solves the problem neatly. You then have only a one-time setup cost (creating a view for each table you query) and a single place to update the list of databases for ongoing maintenance, rather than having to keep N reports up to date.
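As a rough sketch of how such a view might look for quarterly archives: UNION has to remove duplicate rows, which costs a sort or hash over the combined data, so if the quarters do not overlap, UNION ALL avoids that work. The names below (ArchiveDB_2009Q1 and friends, dbo.Orders, OrderDate, vw_Orders_AllQuarters) are hypothetical placeholders, not taken from the question.

```sql
-- Hypothetical names throughout: ArchiveDB_2009Q1..Q3, dbo.Orders, OrderDate.
-- Assumes the quarterly tables are structurally identical and do not overlap.
CREATE VIEW dbo.vw_Orders_AllQuarters
AS
SELECT 'ArchiveDB_2009Q1' AS SourceDB, o.* FROM ArchiveDB_2009Q1.dbo.Orders AS o
UNION ALL
SELECT 'ArchiveDB_2009Q2' AS SourceDB, o.* FROM ArchiveDB_2009Q2.dbo.Orders AS o
UNION ALL
SELECT 'ArchiveDB_2009Q3' AS SourceDB, o.* FROM ArchiveDB_2009Q3.dbo.Orders AS o;
GO

-- Query a date range that spans two quarters through the single view.
SELECT SourceDB, COUNT(*) AS RowsFound
FROM dbo.vw_Orders_AllQuarters
WHERE OrderDate >= '20090301' AND OrderDate < '20090601'
GROUP BY SourceDB;
```

Without CHECK constraints on the date column, the optimizer still has to visit every database in the view; an index on the date column in each archive should at least keep the non-matching branches cheap.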

+2

Chris K Nov 04 '09 at 19:52

Danielle's answer worked for me, with the slight change shown below. We have dozens of development databases on our servers for all our clients, in groups with regular prefixes and suffixes. Using the cursor lets you look at all records in all databases of a particular group. I had to make a change, though, because “execute” did not work for the “use” command, so I just built the whole command with the database name in it.

    Declare
        @Database varchar(8000),
        @Sql varchar(8000)
    BEGIN
        Declare DBName Cursor LOCAL FAST_FORWARD For
            Select name FROM sys.databases where name like 'MyPrefix%MySuffix'

        Open DBName
        WHILE (1 = 1)
        Begin
            Fetch Next From DBName into @Database

            if @@Fetch_status = -1 Break
            if @@Fetch_status = -2 Continue

            -- Qualify the table with the database name (dbo schema assumed)
            set @Sql = 'select * from ' + @Database + '.dbo.MyTable'
            print @sql
            execute (@sql)
        End
        Close DBName
        Deallocate DBName
    End
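If one combined result set is wanted rather than one per database, a variation on the same idea is to build a single UNION ALL statement from sys.databases and execute it once. This is an untested sketch: MyTable and CreatedOn are the placeholder names used elsewhere in this thread, SourceDB is an illustrative column name, the dbo schema is an assumption, and the variable-concatenation idiom is commonly used but not formally guaranteed, so check the printed statement before relying on it.

```sql
-- Sketch only: MyTable, CreatedOn and the dbo schema are assumptions.
DECLARE @Sql varchar(max), @From datetime, @To datetime;
SET @From = '20090301';
SET @To   = '20090601';

-- Append one SELECT per matching database, joined with UNION ALL.
SELECT @Sql = COALESCE(@Sql + ' UNION ALL ', '')
    + 'SELECT ' + QUOTENAME(name, '''') + ' AS SourceDB, t.* FROM '
    + QUOTENAME(name) + '.dbo.MyTable AS t'
    + ' WHERE t.CreatedOn >= ''' + CONVERT(char(8), @From, 112) + ''''
    + ' AND t.CreatedOn < ''' + CONVERT(char(8), @To, 112) + ''''
FROM sys.databases
WHERE name LIKE 'MyPrefix%MySuffix';

PRINT @Sql;

-- @Sql stays NULL when no database matches the pattern.
IF @Sql IS NOT NULL
    EXECUTE (@Sql);
```

If the database names encode the quarter, the WHERE clause on sys.databases could be narrowed so that only the databases covering the requested date range are included.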

+1

mutatron Aug 24 2018-11-22T00:

Depending on the size of the databases, it might be better to combine them into one database and index them correctly.

You can write your own SSIS package and schedule it to consolidate data periodically (daily / hourly / etc.).

0

Raj More Nov 04 '09 at 19:24

Philip Kelley · Accepted Answer · 2009-11-04 19:53

One way to do this: use sp_msForEachDb.

- Round 1 -------

Call this system procedure with a varchar parameter. (It is actually a LOT messier than that; check the code in the master database if you want to know what it really does.) The parameter should be a chunk of dynamic code, for example:

    DECLARE @DemoParameter varchar(1000)
    SET @DemoParameter = 'SELECT MyCol from MyTable where CreatedOn between ''Jan 1, 1980'' and ''Dec 21, 2012'''
    EXECUTE sp_msForEachDb @DemoParameter

This runs the query against every database on the SQL instance and returns one result set per database, except for databases that do not contain the necessary table(s), where it raises an error (in particular, the system databases). Which leads us to...

- Round 2 ---------

Within the dynamic code, as the databases are iterated over, every instance of the question mark (?) is replaced with the name of the database currently being processed. You can use this to filter which databases are to be processed and which are not. Also note that the "current" database is not changed by the routine; you have to do that yourself. This gives us code like:

    SET @DemoParameter = '
    IF ''?'' like ''%Foo%''
    BEGIN
        USE ?
        SELECT MyCol from MyTable where CreatedOn between ''Jan 1, 1980'' and ''Dec 21, 2012''
    END
    '

This will query only those databases whose names contain the characters "foo". You could instead check for the existence of the table in each database; other methods suggest themselves.
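The table-existence check mentioned above might look something like this. It is an untested sketch that assumes the table lives in the dbo schema; MyTable, MyCol and CreatedOn are the placeholder names from the earlier examples. Three-part names are used so that no USE is needed, and the brackets around ? allow for database names containing spaces.

```sql
-- Sketch only: the dbo schema is an assumption.
-- OBJECT_ID returns NULL when the database has no user table of that name.
EXECUTE sp_msForEachDb '
IF OBJECT_ID(''[?].dbo.MyTable'', ''U'') IS NOT NULL
BEGIN
    SELECT ''?'' AS DBName, MyCol
    FROM [?].dbo.MyTable
    WHERE CreatedOn BETWEEN ''Jan 1, 1980'' AND ''Dec 21, 2012''
END
'
```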

This returns one result set per database, which is not much help if you need everything in one neat and orderly data set, and that brings us to...

- Round 3 ------------

In short: create a temporary table and populate it from within the dynamic query. As shown below, you can include the database name and even the server name, which is very useful when your hunt for lost data across dozens of databases spans several servers.

Create (or clear) the temp table:

    IF object_id('tempdb.dbo.##Foo') is null
        CREATE TABLE ##Foo
         (
           ServerName  varchar(100)  not null
          ,DBName      varchar(100)  not null
          --  Add your own columns here
          ,MyCol       int           not null
         )
    ELSE
        --Option: Delete this line to not clear on each run
        TRUNCATE TABLE ##Foo

Run the code (this is my basic template; you can just as easily work with @DemoParameter):

    EXECUTE sp_msForEachDB '
    IF ''?'' like ''%Foo%''
    BEGIN
        USE ?
        INSERT ##Foo
         select @@servername, db_name()
          ,MyCol
         from MyTable
    END
    '

... and this should produce a single temp table containing your data. Do check it over: I wrote this without actually testing the code, and typso will silp in. (#temp tables should work just as well as ##temp tables; I usually do this for ad hoc system support issues.)
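Once the temp table is populated, it can be queried like any other table. A minimal sketch, assuming the ##Foo layout shown above (RowsFound is just an illustrative column alias):

```sql
-- Summarise how many rows each database contributed.
SELECT ServerName, DBName, COUNT(*) AS RowsFound
FROM ##Foo
GROUP BY ServerName, DBName
ORDER BY ServerName, DBName;

-- Global temp tables outlive the batch, so drop it when finished.
DROP TABLE ##Foo;
```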
