I have a financial MySQL database with the following schema:
+-----------+---------------------+------+-----+---------+-------+
| Field     | Type                | Null | Key | Default | Extra |
+-----------+---------------------+------+-----+---------+-------+
| symbol_id | tinyint(3) unsigned | YES  | MUL | NULL    |       |
| timestamp | timestamp(6)        | YES  | MUL | NULL    |       |
| buy_sell  | char(1)             | YES  |     | NULL    |       |
| price     | decimal(10,6)       | YES  | MUL | NULL    |       |
+-----------+---------------------+------+-----+---------+-------+
There are 200 unique symbol_ids. Ultimately, I want to be able to calculate the covariance (by hour) of the price for all pairs of symbols. To start, I would settle for calculating the covariance of a single pair, and then I can iterate.
To calculate the covariance, I need two arrays of the same length (in this case, of price). I am struggling with how to write this as a single query while avoiding having all the records returned to me for a local covariance calculation.
Here is what I am trying to accomplish, as two pseudo-SQL queries:
SELECT (AVG(price1*price2) - AVG(price1)*AVG(price2)) as covar FROM data
and
SELECT price AS price1 WHERE HOUR(timestamp)=1 AND symbol_id=1 LIMIT(MIN(COUNT(price1,price2)))
SELECT price AS price2 WHERE HOUR(timestamp)=1 AND symbol_id=2 LIMIT(MIN(COUNT(price1,price2)))
The first statement takes two arrays of equal length, price1 and price2, and computes their covariance. The second does two things for each symbol: it selects only the trades that occur within hour 1, and it limits the returned values to an equal length.
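To make the intended calculation concrete, here is a minimal Python sketch of what the first pseudo-query computes once the two arrays exist. The function name `covariance` and the sample prices are made up for illustration. It truncates both inputs to the shorter length, which is what the LIMIT clauses are meant to achieve, and returns the population covariance (divide by n, not n - 1).

```python
from statistics import fmean


def covariance(price1, price2):
    """Population covariance via E[XY] - E[X]E[Y], mirroring the AVG() formula."""
    # Truncate both series to the shorter length so they pair up one-to-one.
    n = min(len(price1), len(price2))
    if n == 0:
        raise ValueError("need at least one paired observation")
    x, y = list(price1[:n]), list(price2[:n])
    mean_xy = fmean([a * b for a, b in zip(x, y)])
    return mean_xy - fmean(x) * fmean(y)


# Hypothetical prices, purely for illustration.
p1 = [1.0, 2.0, 3.0, 4.0]
p2 = [2.0, 4.0, 6.0, 8.0, 10.0]  # one extra value, dropped by the truncation
print(covariance(p1, p2))
```

This is exactly the local calculation I would like to avoid, since it requires pulling every price out of the database first.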
With my limited knowledge of SQL, I am having trouble seeing how to combine these queries. Any help is much appreciated. Ultimately, it would be great to run a single query that calculates the pairwise covariance for a specific period of time.
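For reference, here is one possible shape for such a combined query. It is a sketch, not a tested MySQL solution. It uses Python's built-in sqlite3 module as a stand-in so that it is runnable without a server, and it assumes the table is called `data`, as in the pseudo-query. Each symbol's trades in the hour are numbered with ROW_NUMBER() and joined on that number, so the inner join itself trims both series to the shorter length. It needs SQLite 3.25 or later for window functions. In MySQL 8.0 the filter would presumably be written `HOUR(timestamp) = 1` instead of the `strftime` call. The sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE data (symbol_id INTEGER, timestamp TEXT, buy_sell TEXT, price REAL)"
)
# Invented sample trades; the last row falls outside hour 1 and is filtered out.
rows = [
    (1, "2020-01-01 01:00:01", "B", 1.0),
    (1, "2020-01-01 01:10:00", "S", 2.0),
    (1, "2020-01-01 01:20:00", "B", 3.0),
    (1, "2020-01-01 01:30:00", "B", 4.0),
    (2, "2020-01-01 01:05:00", "B", 2.0),
    (2, "2020-01-01 01:15:00", "S", 4.0),
    (2, "2020-01-01 01:25:00", "B", 6.0),
    (2, "2020-01-01 01:35:00", "B", 8.0),
    (2, "2020-01-01 01:45:00", "B", 10.0),
    (1, "2020-01-01 02:00:00", "B", 99.0),
]
conn.executemany("INSERT INTO data VALUES (?, ?, ?, ?)", rows)

QUERY = """
WITH ranked AS (
    -- Number each symbol's trades within the chosen hour, in time order.
    SELECT symbol_id, price,
           ROW_NUMBER() OVER (PARTITION BY symbol_id ORDER BY timestamp) AS rn
    FROM data
    WHERE CAST(strftime('%H', timestamp) AS INTEGER) = :hour
      AND symbol_id IN (:s1, :s2)
)
-- Pair the n-th trade of one symbol with the n-th trade of the other;
-- unmatched trailing rows of the longer series drop out of the join.
SELECT AVG(a.price * b.price) - AVG(a.price) * AVG(b.price) AS covar
FROM ranked AS a
JOIN ranked AS b ON a.rn = b.rn
WHERE a.symbol_id = :s1 AND b.symbol_id = :s2
"""


def hourly_covariance(conn, s1, s2, hour):
    """Return the population covariance of two symbols' prices for one hour."""
    params = {"s1": s1, "s2": s2, "hour": hour}
    return conn.execute(QUERY, params).fetchone()[0]


covar = hourly_covariance(conn, 1, 2, 1)
print(covar)
```

Pairing by row number is only one way to line the two series up, and whether it is the right pairing for trade data is an open question.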