I'd like to create a pipeline with the ruffus package for Python, and I'm struggling with its simplest concepts. Two tasks should be executed one after the other. The second task depends on the output of the first task. In the ruffus documentation, everything is designed around import/export from/to external files. I'd like to handle internal data types such as dictionaries.
The problem is that @follows doesn't take inputs, and @transform doesn't take dicts. Am I missing something?
    def task1():
        # generate dict
        properties = {'status': 'original'}
        return properties

    @follows(task1)
    def task2(properties):
        # update dict
        properties['status'] = 'updated'
        return properties
Eventually, the pipeline should combine a set of functions in a class that update the class object on the go.
You should use the ruffus decorators only when there are input/output files. For example, if task1 generates file1.txt, which is the input for task2, which in turn generates file2.txt, you could write the pipeline as follows:
    from ruffus import originate, transform, follows, suffix

    @originate('file1.txt')
    def task1(output):
        with open(output, 'w') as out_file:
            # write stuff to out_file
            pass

    @follows(task1)
    @transform(task1, suffix('1.txt'), '2.txt')
    def task2(input_, output):
        with open(input_) as in_file, open(output, 'w') as out_file:
            # read stuff from in_file and write stuff to out_file
            pass
If you just want to take a dictionary as input, you don't need ruffus. You can simply order the code appropriately (as it will run sequentially) or call task1 in task2:
    def task1():
        properties = {'status': 'original'}
        return properties

    def task2():
        properties = task1()
        properties['status'] = 'updated'
        return properties
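Since the question mentions eventually combining the functions in a class that updates its own state, here is a minimal sketch of that idea in plain Python, without ruffus. The class name `PropertyPipeline`, the `run` method, and the `steps` tuple are hypothetical names chosen for illustration; they are not part of ruffus or of the original code.

```python
class PropertyPipeline:
    """Hypothetical example: runs its steps in order on shared state."""

    def __init__(self):
        # Internal data shared by all steps (no files involved).
        self.properties = {}

    def task1(self):
        # Generate the dict.
        self.properties['status'] = 'original'

    def task2(self):
        # Update the dict produced by task1.
        self.properties['status'] = 'updated'

    def run(self):
        # The order of this tuple defines the order of execution.
        steps = (self.task1, self.task2)
        for step in steps:
            step()
        return self.properties


pipeline = PropertyPipeline()
result = pipeline.run()
print(result)
```

Here the dependency between the tasks is expressed only by their position in `steps`, which is usually enough when everything runs sequentially in one process and nothing has to be written to disk.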