I wrote a function using the multiprocessing package in Python, trying to boost the speed of my code.
from arch.univariate import ARX, GARCH
from multiprocessing import Process
import multiprocessing
import time

def batch_learning(x, lag_array=None):
    """
    x: a time series array
    lag_array: contains all possible lag numbers
    """
    # init a queue used for triggering different processes
    queue = multiprocessing.JoinableQueue()
    data = multiprocessing.Queue()

    # a worker called arx_fit, triggered by queue.get()
    def arx_fit(queue):
        while True:
            q = queue.get()
            q.volatility = GARCH()
            print "starting fit lags %s" % str(q.lags.size/2)
            try:
                q_res = q.fit(update_freq=500)
            except:
                print "error:...."
            print "finished lags %s" % str(q.lags.size/2)
            queue.task_done()

    # init 4 processes
    for i in range(4):
        process_i = Process(target=arx_fit, name="process_%s" % str(i), args=(queue,))
        process_i.start()

    # put ARX model objects into the queue continuously
    for num in lag_array:
        queue.put(ARX(x, lags=num))

    # sync the processes here
    queue.join()

    return
After calling the function:
batch_learning(a, lag_array=range(1,10))
However, it got stuck in the middle and I got the printout messages below:
starting fit lags 1
starting fit lags 3
starting fit lags 2
starting fit lags 4
finished lags 1
finished lags 2
starting fit lags 5
finished lags 3
starting fit lags 6
starting fit lags 7
finished lags 4
starting fit lags 8
finished lags 6
finished lags 5
starting fit lags 9
Then it runs forever with no further printouts, on Mac OS El Capitan. Using the PyCharm debug mode and Tim Peters' suggestions, I found out that the processes quit unexpectedly. Under debug mode, I can pinpoint the svd function inside numpy.linalg.pinv(), which is used by the arch library, as the cause of the problem. The question is: why? It works with a single-process for-loop but cannot work with 2 processes or above. I don't know how to fix this problem. Is it a numpy bug? Can anyone help me a bit here?
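For reference, here is a minimal sketch (my own cut-down illustration, not the code above) of how I isolate the symptom: call an SVD-based routine such as numpy.linalg.pinv() once in the parent and then again in a forked child process. The matrix size and the 10-second timeout are arbitrary choices for illustration.

import multiprocessing
import numpy as np

def do_pinv():
    # pinv() goes through an SVD internally, the same code path that hangs above
    m = np.random.randn(200, 200)
    res = np.linalg.pinv(m)
    print("child/parent finished pinv, result shape %s" % str(res.shape))

if __name__ == "__main__":
    do_pinv()                                   # runs fine in the parent process
    p = multiprocessing.Process(target=do_pinv)
    p.start()
    p.join(10)                                  # give the child 10 seconds
    if p.is_alive():
        print("child still running after 10s -- likely stuck in the SVD call")
        p.terminate()
    else:
        print("child exit code: %s" % str(p.exitcode))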
I am answering my own question here and providing the solutions. I have solved this issue, thanks to @Tim Peters and @aganders.
The multiprocessing hangs when you use numpy/scipy libraries on Mac OS because of the Accelerate Framework used in Apple OS, which is a replacement for the OpenBLAS that numpy is normally built on. Simply put, in order to solve a similar problem, do as follows:
- Uninstall numpy and scipy (scipy needs to be matched to the proper version of numpy).
- Follow the procedure on this link to rebuild numpy with OpenBLAS.
- Reinstall scipy and test your code to see if it works (a quick verification sketch follows this list).
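To verify that the rebuild took effect, numpy can print the BLAS it was linked against (a minimal check; the exact output layout varies across numpy versions):

import numpy as np

# The build/link configuration is printed to stdout; after a successful rebuild
# it should reference openblas rather than the Accelerate Framework.
np.show_config()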
Some heads-up for testing multiprocessing code on Mac OS: when you run the code, it is better to set an environment variable:
OPENBLAS_NUM_THREADS=1 python import_test.py
The reason for doing this is that OpenBLAS by default creates 2 threads for each core to run, in which case there are 8 threads running (2 for each core) even though I set up only 4 processes. This creates a bit of overhead from thread switching. I tested the OPENBLAS_NUM_THREADS=1 config to limit it to 1 thread per process on each core, and it was indeed faster than the default settings.
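Alternatively, the same limit can be set from inside the script, as long as it happens before the first numpy import (a minimal sketch; the variable only takes effect if OpenBLAS has not been loaded yet):

import os

# Must be set before numpy (and therefore OpenBLAS) is first imported,
# otherwise the library has already picked its thread count.
os.environ["OPENBLAS_NUM_THREADS"] = "1"

import numpy as np   # OpenBLAS now runs single-threaded inside each worker process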