To get started, let's summarize the number of records per hour in your table.
SELECT CAST(DATE_FORMAT(entry_time,'%Y-%m-%d %k:00:00') AS DATETIME) hour,
       COUNT(*) samplecount
  FROM table
 GROUP BY CAST(DATE_FORMAT(entry_time,'%Y-%m-%d %k:00:00') AS DATETIME)
Now, if you record a sample every six minutes (ten times per hour), all of your samplecount values should be ten. The expression CAST(DATE_FORMAT(entry_time,'%Y-%m-%d %k:00:00') AS DATETIME)
looks hairy, but it simply truncates your timestamps to the hour in which they occur by zeroing out the minutes and seconds.
This is reasonably efficient and will get you started. It is very efficient if you can put an index on the entry_time column and restrict your query to, say, yesterday's samples, as shown here.
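To see the truncation on its own, the expression can be run against a literal timestamp. This is just an illustration; the timestamp and the hour_start alias are made up for the example.

```sql
-- Truncate a sample timestamp to the start of its hour.
-- The literal timestamp and the alias are illustrative only.
SELECT CAST(DATE_FORMAT('2024-03-05 10:59:30', '%Y-%m-%d %k:00:00') AS DATETIME) AS hour_start;
-- hour_start: 2024-03-05 10:00:00
```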
SELECT CAST(DATE_FORMAT(entry_time,'%Y-%m-%d %k:00:00') AS DATETIME) hour,
       COUNT(*) samplecount
  FROM table
 WHERE entry_time >= CURRENT_DATE - INTERVAL 1 DAY
   AND entry_time < CURRENT_DATE
 GROUP BY CAST(DATE_FORMAT(entry_time,'%Y-%m-%d %k:00:00') AS DATETIME)
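If you only want the suspicious hours rather than every hour, a HAVING clause can filter the summary. This is a sketch that assumes ten samples per hour is the expected count; the backticked `table` is still a placeholder for your real table name.

```sql
-- List only yesterday's hours whose sample count is not the expected ten.
-- Assumption: one sample every six minutes, so ten per hour.
SELECT CAST(DATE_FORMAT(entry_time, '%Y-%m-%d %k:00:00') AS DATETIME) AS hour_start,
       COUNT(*) AS samplecount
  FROM `table`
 WHERE entry_time >= CURRENT_DATE - INTERVAL 1 DAY
   AND entry_time < CURRENT_DATE
 GROUP BY hour_start
HAVING samplecount <> 10
 ORDER BY hour_start;
```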
But it isn't good at detecting whole hours that go by with no samples at all, because an hour with no rows never shows up in the grouped result. It is also slightly sensitive to jitter in your sampling. That is, if your top-of-the-hour sample is sometimes taken half a minute early (10:59:30) and sometimes half a minute late (11:00:30), your hourly counts will be off. So this hour-summary approach (or a day summary, or a minute summary, etc.) is not bulletproof.
You need a self-join query to get things exactly right; it's a bit more of a hairball and not as efficient.
Let's start by creating a virtual table (a subquery) like this, with numbered samples. (This is a pain in MySQL; some other, pricier DBMSs make it easier. No matter.)
SELECT @sample := @sample + 1 AS entry_num, c.entry_time, c.value
  FROM ( SELECT entry_time, value FROM table ORDER BY entry_time ) c,
       ( SELECT @sample := 0 ) s
This small virtual table gives entry_num, entry_time, value.
Next, we join it to itself.
SELECT one.entry_num, one.entry_time, one.value,
       TIMEDIFF(two.entry_time, one.entry_time) `interval`
  FROM ( /* virtual table */ ) one
  JOIN ( /* virtual table */ ) two ON (two.entry_num - 1 = one.entry_num)
This lines up two copies of the virtual table side by side, offset by one entry, as governed by the ON clause of the JOIN.
Finally, we choose the rows from this result whose interval is greater than your threshold; those are the times of the samples right before the missing ones.
The overall self-join query is this. I told you it was a hairball.
SELECT one.entry_num, one.entry_time, one.value,
       TIMEDIFF(two.entry_time, one.entry_time) `interval`
  FROM (
        SELECT @sample := @sample + 1 AS entry_num, c.entry_time, c.value
          FROM ( SELECT entry_time, value FROM table ORDER BY entry_time ) c,
               ( SELECT @sample := 0 ) s
       ) one
  JOIN (
        SELECT @sample2 := @sample2 + 1 AS entry_num, c.entry_time, c.value
          FROM ( SELECT entry_time, value FROM table ORDER BY entry_time ) c,
               ( SELECT @sample2 := 0 ) s
       ) two ON (two.entry_num - 1 = one.entry_num)
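The self-join above returns every consecutive pair of samples; the threshold filter is left for you to add. Here is one way it might look, assuming six-minute sampling and a hypothetical threshold of 540 seconds (nine minutes). The threshold, the output column names, and the backticked `table` placeholder are all assumptions for the sketch.

```sql
-- Keep only consecutive pairs whose gap exceeds the threshold.
-- Assumption: samples every 6 minutes, so a gap over 540 seconds
-- means at least one sample is missing. Adjust to suit your data.
SELECT one.entry_time AS last_sample_before_gap,
       two.entry_time AS first_sample_after_gap,
       TIMEDIFF(two.entry_time, one.entry_time) AS gap
  FROM (
        SELECT @sample := @sample + 1 AS entry_num, c.entry_time, c.value
          FROM ( SELECT entry_time, value FROM `table` ORDER BY entry_time ) c,
               ( SELECT @sample := 0 ) s
       ) one
  JOIN (
        SELECT @sample2 := @sample2 + 1 AS entry_num, c.entry_time, c.value
          FROM ( SELECT entry_time, value FROM `table` ORDER BY entry_time ) c,
               ( SELECT @sample2 := 0 ) s
       ) two ON two.entry_num - 1 = one.entry_num
 WHERE TIMESTAMPDIFF(SECOND, one.entry_time, two.entry_time) > 540
 ORDER BY one.entry_time;
```

The filter repeats the time arithmetic in the WHERE clause because MySQL does not let WHERE refer to a column alias from the SELECT list.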
If you have to do this in production on a large table, you may want to run it on a subset of your data. For example, you could run it each day on the previous two days' samples. That would be decently efficient, and it would also make sure you don't overlook a gap that spans midnight. To do this, your little row-numbered virtual tables would look like this.
SELECT @sample := @sample + 1 AS entry_num, c.entry_time, c.value
  FROM (
        SELECT entry_time, value
          FROM table
         WHERE entry_time >= CURRENT_DATE - INTERVAL 2 DAY
           AND entry_time < CURRENT_DATE
         ORDER BY entry_time
       ) c,
       ( SELECT @sample := 0 ) s
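As an aside, if you happen to be on MySQL 8.0 or later, the LAG() window function can replace the numbered self-join. This is a different technique from the one described above, not a drop-in from it, and the 540-second threshold and output column names are again assumptions.

```sql
-- Alternative sketch for MySQL 8.0+: pair each sample with the previous
-- one using LAG() instead of row-numbered virtual tables and a self-join.
-- Assumption: a gap over 540 seconds means at least one missing sample.
SELECT s.prev_time  AS last_sample_before_gap,
       s.entry_time AS first_sample_after_gap,
       TIMEDIFF(s.entry_time, s.prev_time) AS gap
  FROM (
        SELECT entry_time,
               LAG(entry_time) OVER (ORDER BY entry_time) AS prev_time
          FROM `table`
         WHERE entry_time >= CURRENT_DATE - INTERVAL 2 DAY
           AND entry_time < CURRENT_DATE
       ) s
 WHERE TIMESTAMPDIFF(SECOND, s.prev_time, s.entry_time) > 540
 ORDER BY s.entry_time;
```

The first row in the window has no predecessor, so its prev_time is NULL and the WHERE clause drops it.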