I have a simple CSV file:
101,8
102,10
102,6
103,5
104,0
with duplicated row[0] entries on the second and third lines. I want to keep the last duplicate (or the one with the lower row[1] value). I have figured out how to make this work correctly using dict() and sort, but I am having problems writing the CSV file in the correct format. My code:
from operator import itemgetter
from pprint import pprint
import csv

with open('cards1.csv', 'rb') as csvfile:
    reader = csv.reader(csvfile, delimiter=',')
    with open('cards2.csv', 'wb') as csvfile1:
        writer = csv.writer(csvfile1, delimiter=',')
        rows = iter(reader)
        sort_key = itemgetter(0)
        sorted_rows = sorted(rows, key=sort_key)
        unique_rows = dict((row[0], row) for row in sorted_rows)
        pprint(unique_rows)
        writer.writerows(unique_rows)
which prints:
{'101': ['101', '8'], '102': ['102', '6'], '103': ['103', '5'], '104': ['104', '0']}
but writes the file as:
1,0,2
1,0,3
1,0,1
1,0,4
What I want is to remove the duplicate row[0] entries, dropping the one with the larger row[1] value. (BTW, the order of the created CSV is not critical.)
If I understand correctly, the problem is that writer.writerows(unique_rows) iterates over the dict, which yields its keys (strings like '102'), and csv.writer then writes each character of the string as a separate field, giving rows like 1,0,2.

Instead of:
writer.writerows(unique_rows)
you want something like:
for row in unique_rows.values():
    writer.writerow(row)
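For reference, here is a minimal sketch of the whole script with that fix applied. It keeps the question's file names and Python 2 file modes ('rb'/'wb'); the sort key that makes the lowest row[1] value win is my assumption about the "keep the lower value" requirement, not part of the original code:

from pprint import pprint
import csv

# Minimal sketch, assuming Python 2 ('rb'/'wb' as in the question);
# in Python 3, open the files in text mode with newline='' instead.
with open('cards1.csv', 'rb') as csvfile, open('cards2.csv', 'wb') as csvfile1:
    reader = csv.reader(csvfile, delimiter=',')
    writer = csv.writer(csvfile1, delimiter=',')

    # Sort by key, then by value descending, so for each key the row
    # with the lowest row[1] comes last and overwrites the others in the dict.
    sorted_rows = sorted(reader, key=lambda row: (row[0], -int(row[1])))
    unique_rows = dict((row[0], row) for row in sorted_rows)
    pprint(unique_rows)

    # Write the row lists (the dict values), not the dict itself.
    for row in unique_rows.values():
        writer.writerow(row)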