python - Change pd datetime object to integer


I have a pandas DataFrame with two date columns, and I want the difference in days between them. The resulting difference looks like a string (e.g. '7 days'). Is there a way to change it to an integer number of days?

    y['datepulled'] = pd.to_datetime(y['datepulled'])
    y['dates'] = pd.to_datetime(y['dates'])
    y['datediff'] = y['datepulled'] - y['dates']
    y['datediff']

    0    7 days
    1    6 days
    2    5 days
    3    4 days
    4    3 days
    5    2 days
    6    1 days

You can divide by a one-day timedelta and cast the result to int:

    (y['datediff'] / np.timedelta64(1, 'D')).astype(int)

Or use the .dt.days accessor:

    y['datediff'].dt.days
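The two expressions agree for whole-day, non-negative differences, but they can differ once a time-of-day component is involved. The sketch below uses made-up timedeltas (the series `td` is hypothetical, not from the question) to show that the division yields fractional days, `astype(int)` truncates toward zero, and `.dt.days` returns the days component, which rounds toward negative infinity.

```python
import numpy as np
import pandas as pd

# Hypothetical differences of 36 hours, -12 hours and 7 days.
td = pd.Series(pd.to_timedelta([36, -12, 168], unit='h'))

as_float = td / np.timedelta64(1, 'D')  # fractional days
truncated = as_float.astype(int)         # truncates toward zero
floored = td.dt.days                     # days component, floors negatives

print(as_float.tolist())   # [1.5, -0.5, 7.0]
print(truncated.tolist())  # [1, 0, 7]
print(floored.tolist())    # [1, -1, 7]
```

If both columns hold midnight-normalized dates, as in the question, the two results should be identical.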

Sample:

    import pandas as pd
    import numpy as np

    y = pd.DataFrame({'datepulled': ['2016-01-05', '2016-01-04'],
                      'dates': ['2016-01-01', '2016-01-02']})

    y['datepulled'] = pd.to_datetime(y['datepulled'])
    y['dates'] = pd.to_datetime(y['dates'])
    y['datediff'] = y['datepulled'] - y['dates']
    print(y)

    # the division yields floats, so cast to int
    y['datediff1'] = (y['datediff'] / np.timedelta64(1, 'D')).astype(int)

    y['datediff2'] = y['datediff'].dt.days
    print(y)

           dates datepulled  datediff  datediff1  datediff2
    0 2016-01-01 2016-01-05    4 days          4          4
    1 2016-01-02 2016-01-04    2 days          2          2

On a larger DataFrame, the first method was faster in my test:

    y = pd.concat([y]*1000).reset_index(drop=True)

    In [236]: %timeit (y['datediff'] / np.timedelta64(1, 'D')).astype(int)
    1000 loops, best of 3: 789 µs per loop

    In [237]: %timeit y['datediff'].dt.days
    100 loops, best of 3: 15.3 ms per loop
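These timings come from one machine and one pandas version, so the gap may look different elsewhere. To re-measure outside IPython, a minimal sketch with the standard `timeit` module could look like this; the series `diff` and the helper names `by_division` and `by_accessor` are made up for illustration.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical data: 2000 whole-day differences between 0 and 29 days.
diff = pd.Series(pd.to_timedelta(np.arange(2000) % 30, unit='D'))

def by_division():
    # Divide by one day, then cast the float result to int.
    return (diff / np.timedelta64(1, 'D')).astype(int)

def by_accessor():
    # Read the days component through the .dt accessor.
    return diff.dt.days

# Both approaches should agree on whole-day, non-missing data.
same = (by_division() == by_accessor()).all()
print("results match:", same)

for fn in (by_division, by_accessor):
    seconds = timeit.timeit(fn, number=100)
    print(f"{fn.__name__}: {seconds / 100 * 1e6:.0f} microseconds per call")
```

No expected timings are given here because they depend on the installed versions and hardware.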
