python - How to Remove the Data in a Specific Column for the Duplicate IDs?


I have a simple dataframe:

id  name   state
1   john   dc
1   john   va
2   smith  ne
3   janet  ca
3   janet  nc
3   janet  md

I want to delete the state value for every id that appears more than once, so that the result is:

id  name   state
1   john   nan
1   john   nan
2   smith  ne
3   janet  nan
3   janet  nan
3   janet  nan

Any idea how to solve this problem?

Thanks,

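duplicated returns a boolean mask of the rows that are duplicated on the columns defined in subset. keep=False indicates that it shouldn't treat the first or last occurrence of a duplicate as a non-duplicate, so every row with a repeated id is marked. Using loc then lets you assign to the state column on just the rows where duplicates occur.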

df.loc[df.duplicated(subset=['id'], keep=False), 'state'] = None
df
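
For anyone who wants to try this end to end, here is a minimal sketch that rebuilds the dataframe from the question and applies the same mask. The variable name mask is only for illustration, and np.nan is used instead of None as an assumption, to match the nan shown in the desired output; either value is treated as missing by pandas.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "id": [1, 1, 2, 3, 3, 3],
    "name": ["john", "john", "smith", "janet", "janet", "janet"],
    "state": ["dc", "va", "ne", "ca", "nc", "md"],
})

# keep=False marks every row whose id occurs more than once,
# including the first and last occurrence
mask = df.duplicated(subset=["id"], keep=False)

# Blank out the state only on the marked rows
df.loc[mask, "state"] = np.nan

print(df)
```

Only the row for id 2 keeps its state, because it is the only id that appears once. With the default keep='first', the first john row and the first janet row would not be marked and would keep their state values.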
