Basically, I am looping through a bunch of CSV files and, in the end, appending each dataframe into one. What I actually need is an rbind-type function. So I did a search and followed a guide. However, it is still not the ideal solution.
Sample code is attached below. For instance, the shape of data1 is 47 by 42. The shape of data_out_final becomes (47, 42), (47, 84), (47, 126) after the first 3 files. Ideally, it should be (141, 42). In addition, I checked the index of data1, which is RangeIndex(start=0, stop=47, step=1).
I would appreciate any suggestions!
My pandas version is 0.18.1.
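Code:
    import pandas as pd

    appended_data = []
    for csv_each in csv_pool:  # csv_pool: the list of CSV file paths
        data1 = pd.read_csv(csv_each, header=0)
        # ... processing here (produces data2) ...
        appended_data.append(data2)
    data_out_final = pd.concat(appended_data, axis=1)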
If I use data_out_final = pd.concat(appended_data, axis=0) instead, the shape of data_out_final becomes (141, 94).
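The shapes reported above are what the axis argument of pd.concat would produce. Here is a minimal sketch, assuming three made-up frames of 47 rows and 42 identically named columns in place of the real CSV files (the names cols and frames are hypothetical):

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the CSV files: 47 rows, 42 identically named columns.
cols = ["col%d" % i for i in range(42)]
frames = [pd.DataFrame(np.zeros((47, 42)), columns=cols) for _ in range(3)]

side_by_side = pd.concat(frames, axis=1)  # places the frames next to each other
stacked = pd.concat(frames, axis=0)       # stacks rows, like R's rbind

print(side_by_side.shape)  # (47, 126)
print(stacked.shape)       # (141, 42)
```

With axis=1 the column count grows by 42 per file, which matches (47, 42), (47, 84), (47, 126). With axis=0 the rows stack, but only if the column names match across files.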
P.S. I kind of figured it out. I actually have to standardize the column names before calling pd.concat.
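To illustrate why the column names matter, here is a small sketch. The two frames and their headers are invented for the example, and the cleanup rule (strip whitespace, lower-case) is only one possible way to standardize; the real files may need a different mapping.

```python
import numpy as np
import pandas as pd

# Hypothetical frames whose headers differ only in case and stray whitespace.
df_a = pd.DataFrame(np.ones((3, 2)), columns=["Price", "Volume"])
df_b = pd.DataFrame(np.ones((3, 2)), columns=[" price", "volume "])

# Without cleanup, concat sees four distinct columns and pads the gaps with NaN.
mismatched = pd.concat([df_a, df_b], axis=0)

# Normalize the labels first, then concatenate.
cleaned = []
for df in (df_a, df_b):
    df = df.copy()  # leave the original frames untouched
    df.columns = [str(c).strip().lower() for c in df.columns]
    cleaned.append(df)
aligned = pd.concat(cleaned, axis=0)

print(mismatched.shape)  # (6, 4)
print(aligned.shape)     # (6, 2)
```

The same effect would explain a result such as (141, 94): the rows stack correctly, but every column name that differs between files becomes an extra column.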
>>> df1
          a         b
0 -1.417866 -0.828749
1  0.212349  0.791048
2 -0.451170  0.628584
3  0.612671 -0.995330
4  0.078460 -0.322976
5  1.244803  1.576373
6  1.169629 -1.135926
7 -0.652443  0.506388
8  0.549604 -0.691054
9 -0.512829 -0.959398
>>> df2
          a         b
0 -0.652161  0.940932
1  2.495067  0.004833
2 -2.187792  1.692402
3  1.900738  0.372425
4  0.245976  1.894527
5  0.627297  0.029331
6 -0.828628 -1.600014
7 -0.991835 -0.061202
8  0.543389  0.703457
9 -0.755059  1.239968
>>> pd.concat([df1, df2])
          a         b
0 -1.417866 -0.828749
1  0.212349  0.791048
2 -0.451170  0.628584
3  0.612671 -0.995330
4  0.078460 -0.322976
5  1.244803  1.576373
6  1.169629 -1.135926
7 -0.652443  0.506388
8  0.549604 -0.691054
9 -0.512829 -0.959398
0 -0.652161  0.940932
1  2.495067  0.004833
2 -2.187792  1.692402
3  1.900738  0.372425
4  0.245976  1.894527
5  0.627297  0.029331
6 -0.828628 -1.600014
7 -0.991835 -0.061202
8  0.543389  0.703457
9 -0.755059  1.239968
Unless I'm misinterpreting what you need, this is what you need.
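One thing to note in the output above: plain pd.concat keeps each frame's own row labels, so 0 through 9 appears twice. If a fresh 0..n-1 index is wanted, passing ignore_index=True is one option. A minimal sketch with tiny made-up frames (the values are arbitrary):

```python
import pandas as pd

# Small hypothetical frames; the values are arbitrary.
df1 = pd.DataFrame({"a": [1.0, 2.0], "b": [3.0, 4.0]})
df2 = pd.DataFrame({"a": [5.0, 6.0], "b": [7.0, 8.0]})

kept = pd.concat([df1, df2])                      # row labels: 0, 1, 0, 1
fresh = pd.concat([df1, df2], ignore_index=True)  # row labels: 0, 1, 2, 3

print(list(kept.index))
print(list(fresh.index))
```

Whether the repeated labels matter depends on what is done with the result afterwards; label-based lookups such as .loc[0] would return two rows from the first version.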