Database Recommendations

Question


I have a dataset that looks like this:

id   name     c1    c2    c3    c4   ...  c50
-----------------------------------------------
1    string1  0.1   0.32  0.54 -1.2  ...  2.3
2    string2  0.12  0.12 -0.34  2.45 ...  1.3
...
(millions of records)

So, I have an identifier column, a name column, and then 50 floating-point columns.

Only one type of query will be executed on this data. As a traditional SQL SELECT statement, it would look like this:

SELECT name FROM table WHERE ((a1-c1)+(a2-c2)+(a3-c3)+...+(a50-c50)) > 1;

where a1, a2, a3, etc. are values that are generated before sending the query (they are not stored in the data table).

My question is: does anyone have any recommendations as to which type of database would handle this type of query the fastest? I have used SQL Server (which is very slow), so I am looking for other opinions.

Is there a way to optimize SQL Server for this type of query? I am also interested in learning about column-store databases such as MonetDB, or perhaps a document-store database such as MongoDB. Does anyone have any suggestions?

Thanks a lot, Brett

+3

sql database sql-server mongodb monetdb

Bret Dec 29 '10 at 20:02


4 answers

SQL Server:

( + - ..), .

, .

, , .

+2

JNK Dec 29 '10 at 20:07

. http://hsqldb.org/

, ...

0

Will Dec 29 '10 at 20:05

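The condition can be rearranged as:

(a1 + a2 + a3 + ... + a50) > 1 + (c1 + c2 + c3 + ... + c50)

With c = 1 + c1 + ... + c50 stored per row and a = a1 + ... + a50 computed before the query, the condition becomes ... WHERE @a > c.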
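As a quick sanity check of the rearrangement (my own sketch, not part of the original answer), the two forms of the condition can be compared on random data. The function names are hypothetical, and exact fractions are used on purpose: with ordinary floats the two forms can disagree for rows that sit right at the boundary, because the additions are performed in a different order.

```python
from fractions import Fraction
import random


def original_predicate(a, c):
    # ((a1-c1) + (a2-c2) + ... + (a50-c50)) > 1
    return sum(ai - ci for ai, ci in zip(a, c)) > 1


def rearranged_predicate(a, c):
    # (a1 + a2 + ... + a50) > 1 + (c1 + c2 + ... + c50)
    return sum(a) > 1 + sum(c)


rng = random.Random(0)


def random_row(n=50):
    # Values in hundredths, held as exact fractions so no rounding occurs.
    return [Fraction(rng.randint(-300, 300), 100) for _ in range(n)]


a = random_row()
rows = [random_row() for _ in range(1000)]

# Both predicates should select exactly the same rows.
agree = all(original_predicate(a, c) == rearranged_predicate(a, c) for c in rows)
print(agree)
```

The point is that the right-hand side depends only on the row and the left-hand side only on the query parameters, so each side can be computed once.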

( SQL Server). , . , , , 100 . . , ... , . " " - .

Even if the values really do have variable precision, so that two digits are not accurate enough, it might still make sense to create an integer index to reduce the number of rows that need to be checked. The query can test both the approximate value (so that the index is used) and the exact value (to get an accurate result). If you do this, make sure the stored values are rounded in the right direction so that no valid rows are missed.
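A minimal sketch of that approximate-plus-exact idea, using SQLite through Python only because it runs anywhere (the question is about SQL Server). Everything here is an assumption for illustration: the table `t`, the columns `c` and `c_floor`, the scale factor of 100, and the helper `matching_names`. The stored value is rounded down and the query bound is rounded up, so the integer filter can only let through extra rows, never drop valid ones. The exact comparison then removes the extras.

```python
import math
import sqlite3

SCALE = 100  # assumption: index on hundredths

conn = sqlite3.connect(":memory:")
# c holds the exact precomputed value (1 + c1 + ... + c50);
# c_floor is its scaled integer approximation, rounded DOWN.
conn.execute("CREATE TABLE t (name TEXT, c REAL, c_floor INTEGER)")
conn.execute("CREATE INDEX idx_c_floor ON t (c_floor)")

rows = [("r1", 0.994), ("r2", 1.004), ("r3", 1.006), ("r4", 2.5)]
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(name, c, math.floor(c * SCALE)) for name, c in rows],
)


def matching_names(a):
    # Round the bound UP so the integer filter stays conservative.
    bound = math.ceil(a * SCALE)
    cur = conn.execute(
        "SELECT name FROM t WHERE c_floor <= ? AND c < ? ORDER BY name",
        (bound, a),
    )
    return [r[0] for r in cur]


result = matching_names(1.005)
print(result)
```

Here r3 passes the integer filter but is rejected by the exact test, which is the behaviour the answer describes.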

0

Wreach Dec 29 '10 at 20:45


Mark Byers · Accepted Answer · 2010-12-29T20:07:17+0000

You can continue to use SQL Server: add a persisted computed column that calculates the sum of all the values, and index that column.

ALTER TABLE tablename ADD SumOfAllColumns AS (c1 + c2 + ... + c50) PERSISTED

Then you can change your query to:

SELECT name FROM tablename WHERE SumOfAllColumns < a1+a2+a3+...+a50 - 1

This query will be able to use the index on the computed column and quickly find the matching rows.
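To try the idea without a SQL Server instance, here is a small runnable sketch using SQLite from Python as a stand-in. It is an approximation of the approach, not the answer's exact setup: the sum is computed at insert time and stored in an ordinary column rather than a PERSISTED computed column, only three of the fifty value columns are shown, and the index name and the helper `select_names` are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Stand-in schema: three value columns instead of fifty, and the sum is
# stored in a plain column (SQL Server would use a PERSISTED computed column).
conn.execute(
    "CREATE TABLE tablename ("
    "id INTEGER PRIMARY KEY, name TEXT, "
    "c1 REAL, c2 REAL, c3 REAL, SumOfAllColumns REAL)"
)
conn.execute("CREATE INDEX IX_SumOfAllColumns ON tablename (SumOfAllColumns)")

data = [
    ("string1", 0.1, 0.32, 0.54),
    ("string2", 0.12, 0.12, -0.34),
    ("string3", 2.0, 2.0, 2.0),
]
conn.executemany(
    "INSERT INTO tablename (name, c1, c2, c3, SumOfAllColumns) "
    "VALUES (?, ?, ?, ?, ?)",
    [(n, c1, c2, c3, c1 + c2 + c3) for n, c1, c2, c3 in data],
)


def select_names(a_values):
    # The right-hand side is computed once per query, not once per row.
    threshold = sum(a_values) - 1
    cur = conn.execute(
        "SELECT name FROM tablename WHERE SumOfAllColumns < ? ORDER BY name",
        (threshold,),
    )
    return [r[0] for r in cur]


names = select_names([1.0, 1.0, 1.0])  # threshold is 2.0
print(names)
```

The design choice is the same in both systems: the per-row arithmetic is done once when the row is written, so the query reduces to a single range comparison on one indexed column.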
