Creating test data in a database

I know of some test data generators, but most seem to just populate a database with names and addresses (feel free to correct me).

We have a large, integrated, normalized application. For example, invoices have part numbers linked to part tables, customer numbers linked to customer tables, change logs linked to audit information, and so on, all of which are obviously difficult to fill randomly. We currently obfuscate real-life data to get test data (but it does not work very well).

What tools or methods do you use to create large volumes of data for testing?

+22
database integration-testing
Aug 19 '08 at 14:47
6 answers

Where I work, we use the RedGate Data Generator to generate test data.

Since we work in the banking sector, whenever we have to work with personally identifying data (credit card numbers, personal identifiers, phone numbers), we run it through an application we developed that masks these database fields, so that we can work with them as if they were real data.
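The masking application itself is not described here, so the following is only a minimal Python sketch of one common approach: replace each digit with a digit derived from a keyed hash, keeping length and separators intact. The names `mask_digits` and `SECRET_KEY` are hypothetical, and the output is not guaranteed to pass checksum rules such as Luhn.

```python
import hashlib
import hmac

# Hypothetical key; in practice it would be kept out of source control.
SECRET_KEY = b"test-environment-only"

def mask_digits(value: str, key: bytes = SECRET_KEY) -> str:
    """Replace each ASCII digit with a digit derived from a keyed hash.

    Length and separators are preserved, and the same input always maps
    to the same output, so masked keys still line up across tables.
    """
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).digest()
    masked = []
    used = 0  # number of digest bytes consumed so far
    for ch in value:
        if ch in "0123456789":
            masked.append(str(digest[used % len(digest)] % 10))
            used += 1
        else:
            masked.append(ch)  # keep dashes, spaces, plus signs, etc.
    return "".join(masked)

masked_card = mask_digits("4111-1111-1111-1111")
masked_phone = mask_digits("+1 555 0100")
```

Because the mapping is deterministic, a card number that appears in two tables is masked to the same value in both, which keeps joins working on the masked copy.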

I can say that with RedGate you can get close to what your real data might look like on a production server, since you can customize each field of each table in your DB.

+7
Aug 19 '08 at 14:54

You can create data generation plans using VSTS Database Edition (with the latest 2008 Power Tools).

It includes a data generation wizard that automates the process: you point it at an existing database, and it produces something realistic that nevertheless contains completely different data.

+3
Aug 19 '08 at 19:13

I have rolled my own data generator that produces random data matching regular expressions. The basic idea is to use validation rules twice: first you use them to generate valid random data, and then you use them to validate new input in production. I have started a rewrite of the utility, as it seems like a good learning project. It is available on Google Code.
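The idea of using one rule both to generate and to validate can be sketched in Python as follows. This is not the utility mentioned above: the rule format (a list of allowed characters plus a repeat count) is a simplified, hypothetical stand-in for full regular expressions, and `RULES`, `rule_to_regex`, `generate`, and `is_valid` are names made up for this illustration.

```python
import random
import re
import string

# Hypothetical rule format: each field is a list of (allowed characters, repeat count).
RULES = {
    "part_number": [(string.ascii_uppercase, 2), ("-", 1), (string.digits, 4)],
    "phone": [(string.digits, 3), ("-", 1), (string.digits, 4)],
}

def rule_to_regex(rule):
    """Build a regular expression such as [A-Z...]{2}[\\-]{1}[0-9...]{4} from a rule."""
    return "".join(f"[{re.escape(chars)}]{{{count}}}" for chars, count in rule)

def generate(rule, rng):
    """Use the rule to produce a random value that satisfies it."""
    return "".join(rng.choice(chars) for chars, count in rule for _ in range(count))

def is_valid(rule, value):
    """Use the same rule to validate input, as production code would."""
    return re.fullmatch(rule_to_regex(rule), value) is not None

rng = random.Random(0)  # seeded so that test data is reproducible
samples = {name: [generate(rule, rng) for _ in range(5)] for name, rule in RULES.items()}
```

Every generated value passes `is_valid` for its own rule by construction, so the generator and the validator cannot drift apart.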

+3
Oct 25 '08 at 6:16

I just completed a project creating 3,500,000+ insurance policies. Due to HIPAA and PHI restrictions, using real data is a PITA. For this, I used the Datatect tool ( http://www.datatect.com/ ).

Some of the things that I like about this tool are:

  • Uses ODBC, so you can generate data into any ODBC data source. I have used it for Oracle, SQL, and MS Access databases, flat files, and Excel spreadsheets.
  • Extensible via VBScript. You can write hooks at various points in the data generation workflow to extend the capabilities of the tool. I used this feature to “synchronize” dependent columns in the database and to control the frequency distribution of values to match real-world frequencies.
  • Referentially aware. When populating foreign key columns, it pulls valid keys from the parent table.
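
The last two points (valid foreign keys and controlled value frequencies) do not depend on any particular tool. As a rough illustration only, here is a Python sketch against an in-memory SQLite database; it does not use Datatect or VBScript, and the table names, column names, and weights are all assumptions made up for the example.

```python
import random
import sqlite3

rng = random.Random(42)  # seeded so that the data set is reproducible

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # make SQLite enforce the reference
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE invoice (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(id),
        status TEXT NOT NULL
    );
""")

# Fill the parent table first.
conn.executemany(
    "INSERT INTO customer (name) VALUES (?)",
    [(f"Customer {n}",) for n in range(1, 51)],
)

# Pull the valid keys from the parent table instead of inventing them.
customer_ids = [row[0] for row in conn.execute("SELECT id FROM customer")]

# Assumed frequencies: most invoices paid, some open, a few overdue.
statuses = ["paid", "open", "overdue"]
weights = [70, 25, 5]

invoices = [
    (rng.choice(customer_ids), rng.choices(statuses, weights=weights)[0])
    for _ in range(1000)
]
conn.executemany("INSERT INTO invoice (customer_id, status) VALUES (?, ?)", invoices)
conn.commit()
```

Loading parents before children and sampling child keys from what is actually in the parent table is what keeps the generated rows referentially valid.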

+2
Oct 01 '08 at 14:26

Red Gate is good ... but not perfect.

I found that I did better when I wrote my own data generation tools. I use Red Gate when I want to generate, say, “Customers”, but it is not great if you want to simulate the randomness of what customers actually do, such as creating orders: some with one item, some with several.

Homegrown tools will give you the most “realistic” data, I think.
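As a rough idea of what such a homegrown tool might do, here is a small Python sketch that creates orders with a varying number of line items. The function name `generate_orders`, the item-count weights, and the sample customer and part values are all assumptions for illustration, not taken from any real tool.

```python
import random

def generate_orders(customer_ids, part_numbers, n_orders, rng):
    """Create orders where most have one or two lines and a few have many."""
    sizes = [1, 2, 3, 5, 10]
    size_weights = [50, 25, 15, 7, 3]  # assumed distribution of order sizes
    orders = []
    for order_id in range(1, n_orders + 1):
        n_items = rng.choices(sizes, weights=size_weights)[0]
        n_items = min(n_items, len(part_numbers))
        # sample() picks distinct parts, so a part appears once per order
        parts = rng.sample(part_numbers, n_items)
        orders.append({
            "order_id": order_id,
            "customer_id": rng.choice(customer_ids),
            "lines": [
                {"part_number": part, "quantity": rng.randint(1, 20)}
                for part in parts
            ],
        })
    return orders

rng = random.Random(7)  # seeded so that runs are repeatable
customers = [101, 102, 103]
parts = [f"P-{n:03d}" for n in range(1, 13)]
orders = generate_orders(customers, parts, 200, rng)
```

Adjusting `size_weights` to match counts observed in production is one way to make the generated orders resemble real customer behaviour.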

+1
Aug 19 '08 at 17:06

Joel also mentioned RedGate in podcast #11.

0
Aug 19 '08 at 15:03


