My master's thesis is about detecting poor database structure by analyzing metadata and stored data. We do this by extracting a metadata model from the DBMS and then running a set of rules against that metadata.
To extend this process with data analysis, we need to let the rules query the database directly, but we must stay DBMS-independent so that the same queries can be applied to PostgreSQL, MSSQL and MySQL.
We have discussed building queries in a functional style, such as:
new Query(new Select(columnID), new From(tableID), new Where(new Equality(columnID1, columnID2)))
A DBMS-specific serializer would then turn this query object into SQL.
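To make the serializer idea concrete, here is a minimal sketch of a query object plus a per-DBMS serializer. All names (Dbms, SelectQuery, SqlSerializer) are hypothetical and not part of the design above; the only DBMS difference it handles is identifier quoting, and it assumes a single-column SELECT with one equality condition.

```csharp
using System;

// Hypothetical names, for illustration only.
public enum Dbms { PostgreSql, MsSql, MySql }

// Minimal query tree: SELECT <Column> FROM <Table> WHERE <WhereLeft> = <WhereRight>.
public sealed record SelectQuery(string Column, string Table, string WhereLeft, string WhereRight);

public static class SqlSerializer
{
    // Each DBMS quotes identifiers differently; the quote character is
    // escaped by doubling it.
    public static string Quote(Dbms dbms, string identifier) => dbms switch
    {
        Dbms.PostgreSql => "\"" + identifier.Replace("\"", "\"\"") + "\"",
        Dbms.MsSql => "[" + identifier.Replace("]", "]]") + "]",
        Dbms.MySql => "`" + identifier.Replace("`", "``") + "`",
        _ => throw new ArgumentOutOfRangeException(nameof(dbms)),
    };

    // The rule builds one SelectQuery; only this method knows about dialects.
    public static string Serialize(Dbms dbms, SelectQuery q) =>
        $"SELECT {Quote(dbms, q.Column)} FROM {Quote(dbms, q.Table)} " +
        $"WHERE {Quote(dbms, q.WhereLeft)} = {Quote(dbms, q.WhereRight)}";
}
```

With this sketch, SqlSerializer.Serialize(Dbms.MsSql, new SelectQuery("Id", "Users", "A", "B")) yields SELECT [Id] FROM [Users] WHERE [A] = [B], while the same query object serialized for MySQL uses backticks. A real serializer would also have to cover functions, variables and types, which differ much more between systems than quoting does.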
Another approach is to let the rules handle all of this on their own:
public Query QueryDatabase(DBMS dbms)
{
    if (dbms == PostgreSQL) { return "select count(1) from Users"; }
    if (dbms == MSSQL) { return ....; }
}
For example, here is a query (MSSQL) that takes a table (@table) and a column (@column):
DECLARE @TotalCount FLOAT;
SELECT @TotalCount = COUNT(1) FROM [@table];
SELECT SUM(pcount * LOG10(@TotalCount / pcount)) / (LOG10(2) * @TotalCount)
FROM (SELECT COUNT([@column]) AS pcount
      FROM [@table]
      GROUP BY [@column]) AS exp1
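Read as a formula, this query appears to compute the Shannon entropy of the column in bits. Writing N for @TotalCount and p_i for the pcount of the i-th distinct value (symbols introduced here, and assuming the column has no NULLs so that the p_i sum to N), the expression can be rewritten as:

```latex
\[
H \;=\; \frac{1}{N \log_{10} 2} \sum_i p_i \log_{10}\frac{N}{p_i}
  \;=\; \sum_i \frac{p_i}{N} \log_2\frac{N}{p_i}
  \;=\; -\sum_i \frac{p_i}{N} \log_2\frac{p_i}{N}
\]
```

The division by LOG10(2) is only a change of base from log10 to log2, which matters for portability because not every DBMS offers a base-2 logarithm under the same name.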
Written with a fluent query builder, this could look like:
new Query()
    .Variable(varname => FLOAT)
    .Set(varname => new Query().Count(1).From(table))
    .Select(new Aggregate().Sum(varname => "pcount * LOG10(varname / pcount)"))
    .From(
        new Query()
            .Select(pcount => new Aggregate().Count(column))
            .From(table)
            .GroupBy(column)
    )
LINQ:
let totalCount = Table.Count
from uv in (from r in Table
            group r by r["attr"] into g
            select g.Count)
select uv * Log2(totalCount / uv)
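For comparison, here is a runnable LINQ-to-objects sketch of the same calculation, done in memory rather than translated to SQL. The class and method names (ColumnEntropy, Compute) are hypothetical; it assumes the column values have already been loaded into a sequence.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ColumnEntropy
{
    // Shannon entropy (in bits) of the values in one column.
    // The total counts every row (like COUNT(1)), while null values are
    // left out of the groups (like COUNT([@column])).
    public static double Compute<T>(IEnumerable<T> values)
    {
        var list = values.ToList();
        double totalCount = list.Count;
        if (totalCount == 0) return 0.0;

        return list
            .Where(v => v != null)
            .GroupBy(v => v)
            .Select(g => (double)g.Count())
            .Sum(pcount => pcount * Math.Log2(totalCount / pcount)) / totalCount;
    }
}
```

For example, ColumnEntropy.Compute(new[] { "a", "b", "a", "b" }) gives 1 bit, and a column with a single repeated value gives 0. This only shows the calculation itself; having a LINQ provider generate equivalent SQL for all three systems is a separate problem.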