Python multiprocessing with numpy.linalg.pinv causes segfault


I wrote a function using Python's multiprocessing package, trying to boost the speed of my code.

from arch.univariate import ARX, GARCH
from multiprocessing import Process
import multiprocessing
import time

def batch_learning(x, lag_array=None):
    """
    x is a time series array
    lag_array contains the possible lag numbers
    """
    # init a queue used for triggering different processes
    queue = multiprocessing.JoinableQueue()
    data = multiprocessing.Queue()

    # a worker called arx_fit, triggered by queue.get()
    def arx_fit(queue):
        while True:
            q = queue.get()
            q.volatility = GARCH()
            print "starting fit lags %s" % str(q.lags.size / 2)
            try:
                q_res = q.fit(update_freq=500)
            except:
                print "error:...."
            print "finished lags %s" % str(q.lags.size / 2)
            queue.task_done()

    # init 4 processes
    for i in range(4):
        process_i = Process(target=arx_fit, name="process_%s" % str(i), args=(queue,))
        process_i.start()

    # put ARX model objects into the queue continuously
    for num in lag_array:
        queue.put(ARX(x, lags=num))

    # sync the processes here
    queue.join()

    return

After calling the function:

batch_learning(a, lag_array=range(1,10)) 

However, it got stuck in the middle of the run, with the printout messages below:

starting fit lags 1
starting fit lags 3
starting fit lags 2
starting fit lags 4
finished lags 1
finished lags 2
starting fit lags 5
finished lags 3
starting fit lags 6
starting fit lags 7
finished lags 4
starting fit lags 8
finished lags 6
finished lags 5
starting fit lags 9

Then it runs forever with no further printouts, on Mac OS El Capitan. Using PyCharm's debug mode and Tim Peters' suggestions, I found out that the processes quit unexpectedly. Under debug mode, I can pinpoint the svd function inside numpy.linalg.pinv(), used by the arch library, as causing the problem. My question is: why? It works with a single-process for-loop, but cannot work with 2 processes or above. I don't know how to fix this problem. Is it a numpy bug? Can anyone help me a bit here?
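For reference, here is a minimal sketch of the failure mode, reduced from the code above (my own repro, not part of the original post): on an Accelerate-linked numpy build, pinv works in the parent process, but forked children calling it afterwards can hang or die inside the underlying SVD.

import multiprocessing
import numpy as np

def worker(mat):
    # numpy.linalg.pinv computes the pseudo-inverse through an SVD,
    # which is where the forked children die on an Accelerate build
    print(np.linalg.pinv(mat).shape)

if __name__ == "__main__":
    a = np.random.randn(200, 200)
    np.linalg.pinv(a)  # fine in the parent process
    # on an Accelerate-linked numpy, these forked workers can hang or segfault
    procs = [multiprocessing.Process(target=worker, args=(a,)) for _ in range(4)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()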

I have to answer this question myself and provide the solutions. I have solved this issue, with thanks to @Tim Peters and @aganders.

The multiprocessing hangs when you use numpy/scipy libraries on Mac OS because of the Accelerate Framework, which Apple's OS uses as a replacement for the OpenBLAS that numpy is normally built on. Simply put, in order to solve a similar problem, do as follows:

  1. Uninstall numpy and scipy (scipy needs to be matched to the proper version of numpy).
  2. Follow the procedure in the link to rebuild numpy with OpenBLAS.
  3. Reinstall scipy and test your code to see if it works; a quick way to check which backend numpy ends up linked against is shown below.
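To verify the rebuild, numpy ships a built-in config dump; on a stock Apple build you would typically see Accelerate/vecLib entries here, and OpenBLAS after the rebuild:

import numpy as np
# prints the BLAS/LAPACK libraries numpy was compiled against
np.__config__.show()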

A heads-up for testing multiprocessing code on Mac OS: when you run the code, it is better to set an env variable, like so:

OPENBLAS_NUM_THREADS=1 python import_test.py 

The reason for doing this is that OpenBLAS by default creates 2 threads for each core to run on; in that case there are 8 threads running (2 per core) even though I only set up 4 processes. This creates a bit of overhead from thread switching. I tested the OPENBLAS_NUM_THREADS=1 config to limit each process to 1 thread on each core, and it was indeed faster than the default settings.
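If you prefer to set this inside the script rather than on the command line, a minimal sketch (my addition, not from the original workflow) is to export the variable before numpy is first imported, since OpenBLAS reads it once at library load time:

import os
# must be set before numpy (and therefore OpenBLAS) is first imported,
# or it has no effect
os.environ["OPENBLAS_NUM_THREADS"] = "1"

import numpy as np  # imported only after the thread limit is in place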

