python - Only use part of a Pandas dataframe


I feel like I'm asking a silly question that has been asked a thousand times, but I cannot seem to find it anywhere. I might be using the wrong terminology.

Anyway, I have a pandas dataframe, df, and I want to use only part of it. More specifically, I'd like to use it in a for loop:

unique_values = df['my_column'].tolist()
unique_values = list(set(unique_values))
for value in unique_values:
    tempdf = df[df['my_column] == value]
    # do stuff with tempdf

But this doesn't seem to work. Is there a way to 'filter' a dataframe by a column's value?

Use df.groupby instead:

for value, tempdf in df.groupby('my_column'):
    # do stuff with tempdf

Your code would work after fixing the missing closing single quote in 'my_column, but it would be slower than using df.groupby.

Evaluating df['my_column'] == value inside a loop forces pandas to run through len(df) comparisons on each iteration of the loop. df.groupby partitions the dataframe into groups with one pass through the dataframe.
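
As a rough sketch of the comparison above, the snippet below builds the sub-frames both ways and checks that they match. The sample data and the column x are made up for illustration; only my_column comes from the question. Note that groupby returns the groups in sorted key order and, by default, skips rows whose key is NaN.

```python
import pandas as pd

# Hypothetical sample data; only the column name 'my_column' is from the question.
df = pd.DataFrame({
    "my_column": ["a", "b", "a", "c", "b"],
    "x": [1, 2, 3, 4, 5],
})

# Boolean-mask approach: one full comparison over the column per unique value.
by_mask = {
    value: df[df["my_column"] == value]
    for value in df["my_column"].unique()
}

# groupby approach: the frame is split into groups once.
by_group = {value: tempdf for value, tempdf in df.groupby("my_column")}

# Both approaches yield the same sub-frames (same rows, index, and columns).
for value in by_mask:
    assert by_mask[value].equals(by_group[value])
```

If only a single value is needed, the plain mask df[df['my_column'] == value] is fine; groupby should pay off when you loop over every value.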

