python - Pandas equivalent of rbind operation


Basically, I am looping through a bunch of CSV files and, in the end, want to append each DataFrame into one. Actually, all I need is an rbind-type function. So I did some searching and followed the guide. However, it is still not the ideal solution.

Sample code is attached below. For instance, the shape of data1 is (47, 42). The shape of data_out_final becomes (47, 42), (47, 84), and (47, 126) after the first 3 files. Ideally, it should be (141, 42). In addition, I checked the index of data1, which is RangeIndex(start=0, stop=47, step=1). I appreciate any suggestions!

My pandas version is 0.18.1.

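Code

    import pandas as pd

    appended_data = []
    for csv_each in csv_pool:
        data1 = pd.read_csv(csv_each, header=0)
        # do something here (the processing that turns data1 into data2)
        appended_data.append(data2)

    # axis=1 places the frames side by side, so columns grow instead of rows
    data_out_final = pd.concat(appended_data, axis=1)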

If I use data_out_final = pd.concat(appended_data, axis=0) instead, the shape of data_out_final becomes (141, 94).
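To illustrate the two shapes described above, here is a minimal sketch. The frames df_a and df_b and their column names are hypothetical stand-ins for the CSV files, chosen so that the names differ between files; the real files may differ in other ways.

```python
import pandas as pd

# Hypothetical stand-ins for two CSV files: same shape, different column names.
df_a = pd.DataFrame([[1, 4], [2, 5], [3, 6]], columns=["x", "y"])
df_b = pd.DataFrame([[7, 10], [8, 11], [9, 12]], columns=["X", "Y"])

# axis=1 aligns on the index and places the frames next to each other.
side_by_side = pd.concat([df_a, df_b], axis=1)

# axis=0 stacks the rows, but the columns become the union of all names,
# and the cells that have no source value are filled with NaN.
stacked = pd.concat([df_a, df_b], axis=0)

print(side_by_side.shape)          # (3, 4)
print(stacked.shape)               # (6, 4)
print(stacked.isna().sum().sum())  # 12
```

A column count larger than that of any single file after an axis=0 concat is usually a sign that the column names do not match exactly across files.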

PS

I kind of figured it out. Actually, I have to standardize the column names before calling pd.concat.
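One possible way to standardize the column names is sketched below. It assumes every file has the same columns in the same order and simply reuses the names from the first frame; frame_1, frame_2, and reference_columns are hypothetical names used only for illustration.

```python
import pandas as pd

# Hypothetical frames whose column names differ only in case and whitespace.
frame_1 = pd.DataFrame([[1, 3], [2, 4]], columns=["a", "b"])
frame_2 = pd.DataFrame([[5, 7], [6, 8]], columns=["A ", "B"])

frames = [frame_1, frame_2]
reference_columns = list(frames[0].columns)

standardized = []
for frame in frames:
    renamed = frame.copy()
    # Positional rename: only valid if the column order is the same in every file.
    renamed.columns = reference_columns
    standardized.append(renamed)

# With identical column names, axis=0 stacks rows without adding columns.
result = pd.concat(standardized, axis=0)
print(result.shape)  # (4, 2)
```

If the column order can vary between files, a positional rename would silently mix up columns, so an explicit mapping per file would be safer in that case.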

    >>> df1
                        b
    0 -1.417866 -0.828749
    1  0.212349  0.791048
    2 -0.451170  0.628584
    3  0.612671 -0.995330
    4  0.078460 -0.322976
    5  1.244803  1.576373
    6  1.169629 -1.135926
    7 -0.652443  0.506388
    8  0.549604 -0.691054
    9 -0.512829 -0.959398

    >>> df2
                        b
    0 -0.652161  0.940932
    1  2.495067  0.004833
    2 -2.187792  1.692402
    3  1.900738  0.372425
    4  0.245976  1.894527
    5  0.627297  0.029331
    6 -0.828628 -1.600014
    7 -0.991835 -0.061202
    8  0.543389  0.703457
    9 -0.755059  1.239968

    >>> pd.concat([df1, df2])
                        b
    0 -1.417866 -0.828749
    1  0.212349  0.791048
    2 -0.451170  0.628584
    3  0.612671 -0.995330
    4  0.078460 -0.322976
    5  1.244803  1.576373
    6  1.169629 -1.135926
    7 -0.652443  0.506388
    8  0.549604 -0.691054
    9 -0.512829 -0.959398
    0 -0.652161  0.940932
    1  2.495067  0.004833
    2 -2.187792  1.692402
    3  1.900738  0.372425
    4  0.245976  1.894527
    5  0.627297  0.029331
    6 -0.828628 -1.600014
    7 -0.991835 -0.061202
    8  0.543389  0.703457
    9 -0.755059  1.239968

Unless I'm misinterpreting what you need, this is what you need.
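Note that a plain pd.concat keeps each frame's original index, so the labels repeat in the stacked result. If a fresh 0..n-1 index is wanted, closer to what rbind produces in R, passing ignore_index=True is one option. The frames top and bottom here are hypothetical examples.

```python
import pandas as pd

# Hypothetical frames with matching column names.
top = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
bottom = pd.DataFrame({"a": [5, 6], "b": [7, 8]})

kept = pd.concat([top, bottom])                            # index labels repeat
renumbered = pd.concat([top, bottom], ignore_index=True)   # index is rebuilt

print(list(kept.index))        # [0, 1, 0, 1]
print(list(renumbered.index))  # [0, 1, 2, 3]
```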

