Postgres average

I am running Postgres 9.2 and I have a large table similar to

CREATE TABLE sensor_values
(
  ts timestamp with time zone NOT NULL,
  value double precision NOT NULL DEFAULT 'NaN'::real,
  sensor_id integer NOT NULL
)

Values come into the system constantly, many per minute. I want to maintain the current mean and standard deviation of the last 200 values per sensor, so I can determine whether new values entering the system are within 3 standard deviations of the mean. To do this, I need a standard deviation that is continuously updated over the last 200 values. Since the table can have hundreds of millions of rows, I don't want to fetch the last 200 rows for a sensor ordered by time and then compute avg(value), var_samp(value) for every new value. I am assuming that updating the standard deviation and mean incrementally will be faster.

I started writing a PL/pgSQL function to update the rolling variance and mean for each new value that enters the system for a particular sensor.

I can do this using a formula like

newavg = oldavg + (new_value - old_value)/window_size
new_variance = old_variance + (new_value - old_value)*(new_value - newavg + old_value - oldavg)/(window_size - 1)

This is based on http://jonisalonen.com/2014/efficient-and-accurate-rolling-standard-deviation/
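
As a sanity check of the update formula above, here is a minimal sketch in Python (not PL/pgSQL, purely to verify the arithmetic). The function name `slide`, the sample data, and the window size of 4 are illustrative assumptions, not part of the original setup. It slides a full window one value at a time and compares the incremental result with a full recomputation.

```python
from collections import deque
from statistics import mean, variance


def slide(old_avg, old_var, old_value, new_value, window_size):
    """Replace old_value with new_value in a full window.

    Returns the updated (mean, sample variance) using the rolling formula.
    """
    new_avg = old_avg + (new_value - old_value) / window_size
    new_var = old_var + (new_value - old_value) * (
        new_value - new_avg + old_value - old_avg
    ) / (window_size - 1)
    return new_avg, new_var


# Illustrative data, not real sensor readings.
values = [3.0, 5.0, 4.0, 8.0, 6.0, 7.0, 2.0, 9.0]
window_size = 4

# Slow path, needed once to seed the state for the first full window.
window = deque(values[:window_size])
avg, var = mean(window), variance(window)

for x in values[window_size:]:
    oldest = window.popleft()  # value leaving the window
    window.append(x)
    avg, var = slide(avg, var, oldest, x, window_size)
    # The incremental result should match a full recomputation.
    assert abs(avg - mean(window)) < 1e-9
    assert abs(var - variance(window)) < 1e-9
```

Note that every step needs the value that is leaving the window. The sketch keeps the whole window in a deque for that; in the database, this corresponds to the row that falls out of the last 200 for the sensor.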

Basically, the window size is 200. The old value is the first value of the window. When a new value arrives, the window moves forward by one. After computing the result, I save the following values for the sensor:

The first value of the window.
The mean of the window values.
The variance of the window values.

Thus, I do not need to fetch the last 200 values every time and recompute the sums. I can reuse these stored values when a new sensor value arrives.

My problem is the first run. I do not have previous window data for the sensor, i.e. the three values above, so I have to compute them the slow way.

Something like

WITH s AS
        (SELECT value FROM sensor_values WHERE sensor_values.sensor_id = $1  AND ts >= (NOW() - INTERVAL '2 day')::timestamptz ORDER BY ts DESC LIMIT 200)
    SELECT avg(value), var_samp(value)  INTO last_window_average, last_window_variance FROM s;

How can I also get the first (earliest) value of the window from this select, inside PL/pgSQL?
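
To make the first-run requirement concrete, here is a small Python sketch of the three values that have to be seeded from the slow query. The function name `seed_window_state` and the sample rows are hypothetical. It assumes the rows arrive newest first, as with ORDER BY ts DESC, so the earliest value of the window is simply the last row returned.

```python
from statistics import mean, variance


def seed_window_state(rows_newest_first):
    """Compute the initial per-sensor state from the latest rows.

    rows_newest_first: list of (ts, value) pairs ordered by ts descending.
    Returns (earliest_value, window_mean, window_sample_variance).
    """
    values = [value for _, value in rows_newest_first]
    if len(values) < 2:
        raise ValueError("need at least two values for a sample variance")
    # With descending order, the oldest row of the window comes last.
    earliest_value = values[-1]
    return earliest_value, mean(values), variance(values)


# Hypothetical rows: integer timestamps stand in for real timestamptz values.
rows = [(4, 8.0), (3, 4.0), (2, 5.0), (1, 3.0)]
state = seed_window_state(rows)
```

The same idea applies in SQL: the subquery has to return ts (or preserve the ordering) alongside value, so that the oldest row can be picked out in addition to the aggregates.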


To get the last 200 values for a sensor quickly, you need a suitable index, for example:

CREATE INDEX i_sensor_values ON sensor_values(sensor_id, ts DESC);

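Then select the last 200 rows in a subquery and aggregate over them:

SELECT sum("value") -- add more expressions as required
  FROM (
        SELECT "value"
          FROM sensor_values
         WHERE sensor_id = $1
         ORDER BY ts DESC
         LIMIT 200
       ) AS last_200;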

You can run this query from PL/pgSQL. On 9.3 or later, a LATERAL join is another option.

Depending on the index, the planner may also be able to use an IndexOnlyScan.

Loose index scans may also be worth a look.

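P.S. value is a reserved word in the SQL standard, so it is better to avoid it as a column name or to double-quote it.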


Source: https://habr.com/ru/post/1584212/

