2019年1月23日 星期三

[ Python 常見問題 ] Calculate the time difference between two consecutive rows in pandas

Source From Here 
Question 
I have a pandas dataframe as follows 
  1. Dev_id     Time  
  2. 88345      13:40:31  
  3. 87556      13:20:33  
  4. 88955      13:05:00  
  5. .....      ........  
  6. 85678      12:15:28  
The above dataframe has 83000 rows. I want to take time difference between two consecutive rows and keep it in a separate column. The desired result would be 
  1. Dev_id    Time          Time_diff(in min)  
  2. 88345      13:40:31      20  
  3. 87556      13:20:33      15  
  4. 88955      13:05:00      15  
I have tried df['Time_diff'] = df['Time'].diff(-1) but getting error as shown below 
TypeError: unsupported operand type(s) for -: 'datetime.time' and 'datetime.time'

How to solve this? 

How-To 
Problem is pandas need datetimes or timedeltas for diff function, so first converting by to_timedelta, then get total_seconds and divide by 60: 
  1. >>> import pandas as pd  
  2. >>> from datetime import datetime, timedelta  
  3. >>> df = pd.DataFrame([[1, datetime.now()], [2, datetime.now()- timedelta(hours = 1)]], columns = ['id', 'time'])  
  4. >>> df  
  5.    id                       time  
  6. 0   1 2019-01-24 09:10:19.732798  
  7. 1   2 2019-01-24 08:10:19.732864  
  8. >>> df['time_diff'] = df['time'].diff(-1).dt.total_seconds().div(60)  
  9. >>> df  
  10.    id                       time  time_diff  
  11. 0   1 2019-01-24 09:10:19.732798  59.999999  
  12. 1   2 2019-01-24 08:10:19.732864        NaN  
If want floor or round per minutes: 
  1. >>> df['time'].diff(-1).dt  
  2.  0x7ff17c4bad90>  
  • >>> df['time'].diff(-1).dt.floor('T')  
  • 0   00:59:00  
  • 1        NaT  
  • >>> df['time_diff'] = df['time'].diff(-1).dt.floor('T').dt.total_seconds().div(60)  
  • >>> df  
  •    id                       time  time_diff  
  • 0   1 2019-01-24 09:10:19.732798       59.0  
  • 1   2 2019-01-24 08:10:19.732864        NaN  


  • 沒有留言:

    張貼留言

    [Git 常見問題] error: The following untracked working tree files would be overwritten by merge

      Source From  Here 方案1: // x -----删除忽略文件已经对 git 来说不识别的文件 // d -----删除未被添加到 git 的路径中的文件 // f -----强制运行 #   git clean -d -fx 方案2: 今天在服务器上  gi...