time series - Compare two columns of unequal length in R using a logical operator


I am dealing with a big time series dataset, and I need to compare 2 columns. The first column looks like this:

    timeperiod             timefortreatment
    2014-08-01 00:00:00    102.81818
    2014-08-01 01:00:00     12.34483
    2014-08-01 02:00:00     35.67568
    2014-08-01 03:00:00    125.57692
    2014-08-01 04:00:00     97.56250
    2014-08-01 05:00:00     36.66667

and the second column looks like this:

    arrivaltime
    2014-08-01 00:14:00
    2014-08-01 00:22:00
    2014-08-01 00:47:00
    2014-08-01 01:07:00
    2014-08-01 01:19:00
    2014-08-01 01:53:00

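The two columns are of unequal length, the second being longer than the first. I have to compare the first column with the second so that the final result looks like the table below. The logic of the comparison is: if the arrival time in the second column falls in the period given by an entry in the first column (each period being 1 hour here), it gets the time-for-treatment value of that specific period.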

    arrival                timefortreatment
    2014-08-01 00:14:00    102.81818
    2014-08-01 00:22:00    102.81818
    2014-08-01 00:47:00    102.81818
    2014-08-01 01:07:00     12.34483
    2014-08-01 01:19:00     12.34483
    2014-08-01 01:53:00     12.34483

I have implemented this logic with 2 for loops, but it is taking forever with 50k+ values:

    for (i in 1:nrow(date)) {
      for (j in 1:nrow(period)) {
        if (date[i, 1] >= period[j, ]) {
          z[i, ] = t[j, ]
          j = j + 1
        }
      }
      i = i + 1
    }

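I was wondering if there is any other way in which this can be done. Any help in this regard would be highly appreciated.

Edit: to accommodate cases where the time periods are different (not regular), the first column may look like this: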

    timeperiod             timefortreatment
    2014-08-01 00:14:00     75
    2014-08-01 00:19:00    143
    2014-08-01 00:44:00    126
    2014-08-01 01:04:00    125
    2014-08-01 01:19:00    125
    2014-08-01 01:49:00    122

For this case, the output is shown below, based on the same logic, i.e. (arrival >= time period):

    arrival                timefortreatment
    2014-08-01 00:14:00     75
    2014-08-01 00:22:00    143
    2014-08-01 00:47:00    126
    2014-08-01 01:07:00    125
    2014-08-01 01:19:00    125
    2014-08-01 01:53:00    122

Let me know if more details are needed.

Here is a solution with 1 for loop; a faster solution may exist.

    df1 = data.frame(timeperiod = seq(as.POSIXct("2014-08-01 00:00:00"), as.POSIXct("2014-08-01 05:00:00"), by = "1 hour"),
                     timefortreatment = c(102.81818, 12.34483, 35.67568, 125.57692, 97.56250, 36.66667))
    df2 = data.frame(arrivaltime = c(as.POSIXct("2014-08-01 00:14:00"), as.POSIXct("2014-08-01 00:22:00"),
                                     as.POSIXct("2014-08-01 00:47:00"), as.POSIXct("2014-08-01 01:07:00"),
                                     as.POSIXct("2014-08-01 01:19:00"), as.POSIXct("2014-08-01 01:53:00")))

    library(stringr)
    # Keep "YYYY-MM-DD HH:" and append "00:00" to get the start of the hour
    df2$time_min = as.POSIXct(paste0(str_sub(df2$arrivaltime, 1, 14), "00:00"))

    # Look up the treatment time of the hour each arrival falls in
    for (i in 1:nrow(df2)) {
      df2$timefortreatment[i] = df1$timefortreatment[df1$timeperiod == df2$time_min[i]]
    }
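For 50k+ rows, the row-by-row loop above can probably be replaced by a single vectorised lookup. The following is only a sketch under some assumptions: timeperiod contains exact hour starts, both columns use the same time zone (tz = "UTC" is added here purely to make the example reproducible), and hour_start is an illustrative name that does not appear in the original code.

```r
# Same sample data as above, with an explicit time zone (assumption)
df1 <- data.frame(
  timeperiod = seq(as.POSIXct("2014-08-01 00:00:00", tz = "UTC"),
                   as.POSIXct("2014-08-01 05:00:00", tz = "UTC"),
                   by = "1 hour"),
  timefortreatment = c(102.81818, 12.34483, 35.67568, 125.57692, 97.56250, 36.66667)
)
df2 <- data.frame(
  arrivaltime = as.POSIXct(c("2014-08-01 00:14:00", "2014-08-01 00:22:00",
                             "2014-08-01 00:47:00", "2014-08-01 01:07:00",
                             "2014-08-01 01:19:00", "2014-08-01 01:53:00"),
                           tz = "UTC")
)

# Truncate every arrival to the start of its hour in one call
hour_start <- as.POSIXct(trunc(df2$arrivaltime, units = "hours"))

# Match hour starts against the periods (compared as seconds since the epoch);
# arrivals whose hour is missing from df1 get NA
idx <- match(as.numeric(hour_start), as.numeric(df1$timeperiod))
df2$timefortreatment <- df1$timefortreatment[idx]
```

Because match() and trunc() work on whole vectors, there is no explicit loop over the rows of df2.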

Edit

With no periodicity in timeperiod, you can use the difftime function:

    df1 = data.frame(timeperiod = c(as.POSIXct("2014-08-01 00:14:00"), as.POSIXct("2014-08-01 00:19:00"),
                                    as.POSIXct("2014-08-01 00:44:00"), as.POSIXct("2014-08-01 01:04:00"),
                                    as.POSIXct("2014-08-01 01:19:00"), as.POSIXct("2014-08-01 01:49:00")),
                     timefortreatment = c(75, 143, 126, 125, 125, 122))
    df2 = data.frame(arrivaltime = c(as.POSIXct("2014-08-01 00:14:00"), as.POSIXct("2014-08-01 00:22:00"),
                                     as.POSIXct("2014-08-01 00:47:00"), as.POSIXct("2014-08-01 01:07:00"),
                                     as.POSIXct("2014-08-01 01:19:00"), as.POSIXct("2014-08-01 01:53:00")))

    # For each arrival, take the treatment time of the closest timeperiod
    for (i in 1:nrow(df2)) {
      df2$timefortreatment[i] = df1$timefortreatment[which.min(abs(difftime(df2$arrivaltime[i], df1$timeperiod)))]
    }

    # apply solution
    my_function = function(value) {
      df1$timefortreatment[which.min(abs(difftime(value, df1$timeperiod)))]
    }
    # Pass only the arrivaltime column so each row holds a single timestamp
    df2$timefortreatment = apply(df2["arrivaltime"], 1, my_function)

    > df2
              arrivaltime timefortreatment
    1 2014-08-01 00:14:00               75
    2 2014-08-01 00:22:00              143
    3 2014-08-01 00:47:00              126
    4 2014-08-01 01:07:00              125
    5 2014-08-01 01:19:00              125
    6 2014-08-01 01:53:00              122
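Note that which.min(abs(difftime(...))) picks the nearest timeperiod, which happens to agree with the expected output for this sample but is not the same rule as "arrival >= time period" in general (an arrival at 00:18 would be matched to 00:19 rather than 00:14). If the intended rule is "the latest period that starts at or before the arrival", base R's findInterval() may be a better fit and is vectorised. This is a sketch that assumes df1$timeperiod is sorted in increasing order; tz = "UTC" and the name idx are added for illustration only.

```r
# Same irregular sample data as above, with an explicit time zone (assumption)
df1 <- data.frame(
  timeperiod = as.POSIXct(c("2014-08-01 00:14:00", "2014-08-01 00:19:00",
                            "2014-08-01 00:44:00", "2014-08-01 01:04:00",
                            "2014-08-01 01:19:00", "2014-08-01 01:49:00"),
                          tz = "UTC"),
  timefortreatment = c(75, 143, 126, 125, 125, 122)
)
df2 <- data.frame(
  arrivaltime = as.POSIXct(c("2014-08-01 00:14:00", "2014-08-01 00:22:00",
                             "2014-08-01 00:47:00", "2014-08-01 01:07:00",
                             "2014-08-01 01:19:00", "2014-08-01 01:53:00"),
                           tz = "UTC")
)

# For each arrival, index of the last period with timeperiod <= arrival
# (findInterval requires the second argument to be sorted)
idx <- findInterval(as.numeric(df2$arrivaltime), as.numeric(df1$timeperiod))

# An index of 0 means the arrival precedes every period: mark it as missing
idx[idx == 0] <- NA
df2$timefortreatment <- df1$timefortreatment[idx]
```

The same call should also cover the hourly case from the first example, since hourly period starts are just a special case of sorted breakpoints.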
