I suck this one. I would like to do this in pure sql, but at this point any decision will do.
I have tables ta and tb containing lists of events that occurred at about the same time. The goal is to find "orphan" entries from ta to tb . For instance:.
create table ta ( dt date, id varchar(1)); insert into ta values( to_date('20130101 13:01:01', 'yyyymmdd hh24:mi:ss') , '1' ); insert into ta values( to_date('20130101 13:01:02', 'yyyymmdd hh24:mi:ss') , '2' ); insert into ta values( to_date('20130101 13:01:03', 'yyyymmdd hh24:mi:ss') , '3' ); create table tb ( dt date, id varchar(1)); insert into tb values( to_date('20130101 13:01:5', 'yyyymmdd hh24:mi:ss') , 'a' ); insert into tb values( to_date('20130101 13:01:6', 'yyyymmdd hh24:mi:ss') , 'b' );
But let's say I have to use a threshold of + -5 seconds. Thus, the search query will look something like this:
select ta.id ida, tb.id idb from ta, tb where tb.dt between (ta.dt - 5/86400) and (ta.dt + 5/86400) order by 1,2
(script: http://sqlfiddle.com/#!4/b58f7c/5 )
Rules:
- Events are displayed 1 to 1
- The nearest event in
tb for the given in ta will be considered the correct mapping.
However, the resulting query should return something like
IDA | IDB 1 | a 2 | b 3 | null <-- orphan event
Although the sample request I put here shows exactly the problem I have. When time overlaps, it is difficult to systematically select the correct line.
dense_rank() seems to be the answer to choosing the right lines, but what sorting / sorting will put them correctly?
Worth mentioning, I am doing this on Oracle 11gR2.