Pandas Random Sample With Removal

I know about DataFrame.sample(), but how can I take a sample and also remove the sampled rows from the dataset? (Note: AFAIK this has nothing to do with sampling with replacement.)

For example, here is the essence of what I want to achieve, although this code does not actually work:

len(df) # 1000

df_subset = df.sample(300)
len(df_subset) # 300

df = df.remove(df_subset)
len(df) # 700
2 answers

If your index is unique, drop the sampled rows by their index labels:

df = df.drop(df_subset.index)

Example

import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(10).reshape(-1, 2))

Take a sample

df_subset = df.sample(2)
df_subset


Then drop the sampled rows

df.drop(df_subset.index)
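
The unique-index condition matters because drop removes every row that carries a given label. If the index has repeated labels, one possible workaround is to sample row positions instead of labels. The following is only a sketch under that assumption; the frame, the seed, and the variable names df_rest, picked and mask are made up for illustration and are not part of the original answer.

```python
import numpy as np
import pandas as pd

# Hypothetical frame whose index labels repeat, so drop() by label
# would remove more rows than were sampled.
df = pd.DataFrame({"value": range(6)}, index=["a", "a", "b", "b", "c", "c"])

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility

# Choose 2 distinct row positions (not labels).
picked = rng.choice(len(df), size=2, replace=False)

# Boolean mask over positions: True for sampled rows.
mask = np.zeros(len(df), dtype=bool)
mask[picked] = True

df_subset = df.iloc[mask]   # the sampled rows
df_rest = df.iloc[~mask]    # everything that was not sampled

print(len(df_subset), len(df_rest))  # 2 4
```

Another option, if the original labels are not needed, is to call df.reset_index(drop=True) first and then use the sample/drop approach unchanged.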


The same idea gives a train/test split with pandas random sampling:

train = df.sample(frac=0.8, random_state=200)
test = df.drop(train.index)
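
The two lines above can be wrapped into a small helper that returns both parts and refuses to run when the index is not unique. This is a minimal sketch, not something from the answers: the function name sample_and_remove and the example frame are hypothetical, while DataFrame.sample, DataFrame.drop and Index.is_unique are standard pandas.

```python
import numpy as np
import pandas as pd


def sample_and_remove(df, n=None, frac=None, random_state=None):
    """Return (sample, remainder). Hypothetical helper; assumes a unique index."""
    if not df.index.is_unique:
        # drop() by label would remove every row sharing a sampled label.
        raise ValueError("index must be unique for drop-by-label to be safe")
    sample = df.sample(n=n, frac=frac, random_state=random_state)
    remainder = df.drop(sample.index)
    return sample, remainder


# Illustrative data: 1000 rows with the default unique RangeIndex.
df = pd.DataFrame({"x": np.arange(1000)})

train, test = sample_and_remove(df, frac=0.8, random_state=200)
print(len(train), len(test))  # 800 200
```

Returning both frames instead of reassigning df in place keeps the original data available, which tends to be convenient when the split has to be repeated with a different seed.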

Source: https://habr.com/ru/post/1656594/