I have a dataframe like this:
import pandas as pd
dic = {'A': [100, 200, 250, 300],
       'B': ['ci', 'ci', 'po', 'pa'],
       'C': ['s', 't', 'p', 'w']}
df = pd.DataFrame(dic)
My goal is to split the rows into 2 data frames:
- df1 = contains all rows whose value in column B is not repeated (unique rows).
- df2 = contains only the rows whose value in column B is repeated.
The result should look like this:
df1 =     A   B  C        df2 =     A   B  C
      0  250  po  p              0  100  ci  s
      1  300  pa  w              1  200  ci  t
Note:
- The data can be very large and have many repeated values in column B, so the answer should be as general as possible.
- If there are no duplicates, df2 should be empty and all rows should be in df1.
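For reference, the split described above could probably be sketched with a boolean mask built from `Series.duplicated(keep=False)`, which marks every occurrence of a repeated value rather than only the later ones. The helper name `split_by_repeats` and its `column` parameter are made up here for illustration, and resetting the index is an assumption based on the desired output shown above.

```python
import pandas as pd


def split_by_repeats(df, column):
    """Hypothetical helper: split df into (unique rows, repeated rows) by `column`."""
    # keep=False flags ALL rows whose value occurs more than once,
    # not just the second and later occurrences.
    is_repeated = df[column].duplicated(keep=False)
    # Assumption: the index is renumbered from 0, as in the desired output.
    unique_rows = df[~is_repeated].reset_index(drop=True)
    repeated_rows = df[is_repeated].reset_index(drop=True)
    return unique_rows, repeated_rows


df = pd.DataFrame({'A': [100, 200, 250, 300],
                   'B': ['ci', 'ci', 'po', 'pa'],
                   'C': ['s', 't', 'p', 'w']})

df1, df2 = split_by_repeats(df, 'B')
print(df1)
print(df2)
```

If column B has no repeated values, the mask is all False, so df2 comes back empty (with the same columns) and df1 holds every row. The mask is computed in a single vectorized pass, so this should also work for large frames.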