I've looked through a bunch of questions and answers related to this issue, but I'm still getting the copy of slice warning in places where I don't expect it. It's also cropping up in code that was running fine for me previously, which leads me to wonder whether some sort of update may be the culprit.
For example, this is a set of code where all I'm doing is reading an Excel file into a pandas DataFrame, and then cutting down the set of columns included with the df[[]] syntax.
izmir = pd.read_excel(filepath)
izmir_lim = izmir[['gender', 'age', 'mc_old_m>=60', 'mc_old_f>=60',
                   'mc_old_m>18', 'mc_old_f>18', 'mc_old_18>m>5',
                   'mc_old_18>f>5', 'mc_old_m_child<5', 'mc_old_f_child<5',
                   'mc_old_m>0<=1', 'mc_old_f>0<=1', 'date delivery',
                   'date insert', 'date of entery']]
Now, any further changes I make to this izmir_lim frame raise the copy of slice warning.
izmir_lim['age'] = izmir_lim.age.fillna(0)
izmir_lim['age'] = izmir_lim.age.astype(int)
/users/samlilienfeld/anaconda/lib/python3.5/site-packages/ipykernel/main.py:2: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
I'm confused because I thought that df[[]] column subsetting returned a copy by default. The only way I've found to suppress the warning is by explicitly adding df[[]].copy(). I could have sworn that in the past I did not have to do that, and that it did not raise the copy of slice warning.
Similarly, I have some other code that runs a function on a dataframe to filter it in certain ways:
def lim(df):
    if (geography == "all"):
        df_geo = df
    else:
        df_geo = df[df.center_jo == geography]

    df_date = df_geo[(df_geo.date_survey >= start_date) & (df_geo.date_survey <= end_date)]

    return df_date

df_lim = lim(df)
From this point forward, any changes I make to any of the values of df_lim raise the copy of slice warning. The only way around it that I've found is to change the function call to:
df_lim = lim(df).copy()
This seems wrong to me. What am I missing? It seems like these use cases should return copies by default, and I could have sworn that the last time I ran these scripts I was not running into these warnings.
Do I just need to start adding .copy() all over the place? It seems like there should be a cleaner way to do this. Any insight or help is appreciated.
izmir = pd.read_excel(filepath)
izmir_lim = izmir[['gender', 'age', 'mc_old_m>=60', 'mc_old_f>=60',
                   'mc_old_m>18', 'mc_old_f>18', 'mc_old_18>m>5',
                   'mc_old_18>f>5', 'mc_old_m_child<5', 'mc_old_f_child<5',
                   'mc_old_m>0<=1', 'mc_old_f>0<=1', 'date delivery',
                   'date insert', 'date of entery']]
izmir_lim is a view/copy of izmir. You subsequently attempt to assign to it, and that is what triggers the warning. Use this instead:
izmir_lim = izmir[['gender', 'age', 'mc_old_m>=60', 'mc_old_f>=60',
                   'mc_old_m>18', 'mc_old_f>18', 'mc_old_18>m>5',
                   'mc_old_18>f>5', 'mc_old_m_child<5', 'mc_old_f_child<5',
                   'mc_old_m>0<=1', 'mc_old_f>0<=1', 'date delivery',
                   'date insert', 'date of entery']].copy()
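As a minimal sketch of the same fix on a small stand-in frame (the data and the reduced column list below are made up for illustration, not taken from the Excel file in the question), the explicit copy can be modified freely while the original frame stays untouched:

```python
import pandas as pd

# Hypothetical stand-in for the frame read from Excel in the question.
izmir = pd.DataFrame({
    "gender": ["f", "m", "f"],
    "age": [30.0, None, 45.0],
    "other": [1, 2, 3],
})

# Subset the columns and take an explicit copy, so later assignments
# clearly target a new, independent object.
izmir_lim = izmir[["gender", "age"]].copy()

# Fill missing ages and convert to integers in one step.
izmir_lim["age"] = izmir_lim["age"].fillna(0).astype(int)

print(izmir_lim)
print("missing ages left in the original:", izmir["age"].isna().sum())
```

The original `izmir` still holds its missing value afterwards, which is the point of the copy: the two frames no longer share any relationship that pandas needs to warn about.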
Whenever you 'create' a new dataframe from another in the following fashion:
new_df = old_df[list_of_columns_names]
new_df will have a truthy value in its is_copy attribute. When you then attempt to assign to it, pandas throws the SettingWithCopyWarning.
new_df.iloc[0, 0] = 1  # should raise the SettingWithCopyWarning
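One way to check this on your own installation is to record the warnings raised during the assignment. The sketch below is only illustrative: the helper `warnings_when_assigning` and the toy `old_df` are hypothetical names, and whether the plain subset reports a SettingWithCopyWarning depends on the pandas version and its Copy-on-Write settings, so no particular output is claimed for that case.

```python
import warnings
import pandas as pd

def warnings_when_assigning(make_new_df):
    """Build a frame, assign into it, and return the names of any warnings raised."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # record every warning, even repeated ones
        new_df = make_new_df()
        new_df.iloc[0, 0] = 1
    return [w.category.__name__ for w in caught]

# Hypothetical toy data standing in for old_df.
old_df = pd.DataFrame({"a": [10, 20], "b": [30, 40], "c": [50, 60]})
list_of_columns_names = ["a", "b"]

# Plain column subset: may or may not warn, depending on the pandas version.
names_plain = warnings_when_assigning(lambda: old_df[list_of_columns_names])

# Explicit copy: the assignment targets an independent frame.
names_copy = warnings_when_assigning(lambda: old_df[list_of_columns_names].copy())

print("plain subset:", names_plain)
print("with .copy():", names_copy)
```

Comparing warning categories by name avoids importing the warning class itself, whose location has changed between pandas releases.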
You can overcome this in several ways.
Option #1
new_df = old_df[list_of_columns_names].copy()
Option #2 (as @ayhan suggested in the comments)
new_df = old_df[list_of_columns_names]
new_df.is_copy = None
Option #3
new_df = old_df.loc[:, list_of_columns_names]
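For the filtering function in the question, option #1 can be applied once inside the function instead of at every call site. This is a sketch under assumptions: passing geography, start_date and end_date as parameters (the question reads them from globals) and the sample data, including the center names, are invented for illustration.

```python
import warnings
import pandas as pd

def lim(df, geography, start_date, end_date):
    """Filter by center and survey date, returning an independent copy."""
    df_geo = df if geography == "all" else df[df["center_jo"] == geography]
    in_range = (df_geo["date_survey"] >= start_date) & (df_geo["date_survey"] <= end_date)
    # Copy once here so callers can modify the result without warnings.
    return df_geo[in_range].copy()

# Hypothetical sample data; the column names follow the question.
df = pd.DataFrame({
    "center_jo": ["amman", "irbid", "amman"],
    "date_survey": pd.to_datetime(["2016-01-10", "2016-02-15", "2016-03-20"]),
    "score": [1, 2, 3],
})

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    df_lim = lim(df, "amman", pd.Timestamp("2016-01-01"), pd.Timestamp("2016-02-28"))
    df_lim["score"] = df_lim["score"] * 10  # modify the filtered result

copy_warnings = [w for w in caught if w.category.__name__ == "SettingWithCopyWarning"]
print(df_lim)
print("SettingWithCopyWarning count:", len(copy_warnings))
```

Copying inside the function costs one extra allocation per call, but it makes the function's contract explicit: whatever it returns is safe to modify.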