How to remove consecutive duplicate lines in KDB?

Question

How to remove consecutive duplicate lines in KDB?

For example, given the table below, I want to delete the third row:

Stock   Price
-------------------
GOOG    101
GOOG    102
GOOG    102     <- want to remove this
GOOG    101

Note: although row 4 is a duplicate of row 1, I do not want to delete it, because it is not a consecutive duplicate. That is, it does not duplicate the row directly above it.

I would also like to check for duplicates across multiple fields, not just Price.

+4

kdb q-lang

mchen Apr 11 '14 at 14:19

2 answers

You can also use differ:

q)t:([]stock:4#`GOOG; price:101 102 102 101)
q)differ t
1101b
q)t where differ t
stock price
-----------
GOOG  101
GOOG  102
GOOG  101

Now suppose there is a time column, as you indicate in your comment above:

q)t:update time:til count i from t
q)t
stock price time
----------------
GOOG  101   0
GOOG  102   1
GOOG  102   2
GOOG  101   3
q)t where differ `stock`price#t
stock price time
----------------
GOOG  101   0
GOOG  102   1
GOOG  101   3
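
To cover the multi-field case from the question more generally, the column restriction above could be wrapped in a small function. This is only a sketch: the name dedupBy is made up for illustration (it is not a built-in), and it assumes every listed column exists in the table.

```q
/ rebuild the example table so the snippet stands alone
t:([]stock:4#`GOOG; price:101 102 102 101; time:til 4)

/ dedupBy is a hypothetical helper, not part of q:
/ keep a row only if its values in columns c differ from the previous row.
/ (),c turns a single column name into a list so that # selects columns
dedupBy:{[c;t] t where differ ((),c)#t}

r:dedupBy[`stock`price;t]    / same rows as: t where differ `stock`price#t
```

Columns left out of c (here time) are carried along in the result but play no part in the comparison.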

For comparison, here is the timing and memory use of differ against the approach in @jgleeson's answer on this small table:

q)\ts do[10000;r:t where differ t]
31 1184j
q)\ts do[10000;r2:t where not t~'prev t]
62 1488j
q)r~r2
1b

+3

JPC 24 . '14 12:08

jgleeson · Accepted Answer · 2014-04-11T14:43:28+0000

q)d:([]Stock:4#`GOOG;Price:101 102 102 101)
q)d
Stock Price
-----------
GOOG  101
GOOG  102
GOOG  102
GOOG  101

q)d where not d~'prev d
Stock Price
-----------
GOOG  101
GOOG  102
GOOG  101
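
To see why this works, it may help to look at the intermediate mask. prev d shifts every row down by one, so the first row is compared against a row of nulls, and ~' matches each row against its predecessor. A short sketch using the same d (the names dup, keep and r are only for illustration):

```q
d:([]Stock:4#`GOOG;Price:101 102 102 101)

/ 1b where a row is identical to the row directly above it
dup:d~'prev d          / 0010b for this table

/ invert the mask to keep the first row of every run
keep:not dup           / 1101b

r:d where keep         / rows 0, 1 and 3 survive
```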
