python - Put a 2d Array into a Pandas Series


I have a 2D NumPy array that I would like to put in a pandas Series (not a DataFrame):

>>> import pandas as pd
>>> import numpy as np
>>> a = np.zeros((5, 2))
>>> a
array([[ 0.,  0.],
       [ 0.,  0.],
       [ 0.,  0.],
       [ 0.,  0.],
       [ 0.,  0.]])

But this throws an error:

>>> s = pd.Series(a)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/miniconda/envs/pyspark/lib/python3.4/site-packages/pandas/core/series.py", line 227, in __init__
    raise_cast_failure=True)
  File "/miniconda/envs/pyspark/lib/python3.4/site-packages/pandas/core/series.py", line 2920, in _sanitize_array
    raise Exception('Data must be 1-dimensional')
Exception: Data must be 1-dimensional

It is possible to work around this with a hack:

>>> s = pd.Series(map(lambda x: [x], a)).apply(lambda x: x[0])
>>> s
0    [0.0, 0.0]
1    [0.0, 0.0]
2    [0.0, 0.0]
3    [0.0, 0.0]
4    [0.0, 0.0]

Is there a better way?

Well, you can use the numpy.ndarray.tolist function, like so:

>>> a = np.zeros((5, 2))
>>> a
array([[ 0.,  0.],
       [ 0.,  0.],
       [ 0.,  0.],
       [ 0.,  0.],
       [ 0.,  0.]])
>>> a.tolist()
[[0.0, 0.0], [0.0, 0.0], [0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
>>> pd.Series(a.tolist())
0    [0.0, 0.0]
1    [0.0, 0.0]
2    [0.0, 0.0]
3    [0.0, 0.0]
4    [0.0, 0.0]
dtype: object
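
As a minimal sketch of the tolist approach in script form: the names `s` and `restored` are illustrative and not part of the original answer. It checks what each element of the Series actually is, and shows one way to get the 2D array back.

```python
import numpy as np
import pandas as pd

a = np.zeros((5, 2))

# Each row of the array becomes one Python list, stored as a single element.
s = pd.Series(a.tolist())

print(s.dtype)           # object
print(type(s.iloc[0]))   # <class 'list'>

# Rebuild the 2D array from the Series of lists (illustrative round trip).
restored = np.array(s.tolist())
print(restored.shape)    # (5, 2)
```

Because the Series has object dtype, vectorized numeric operations do not apply to it directly; converting back to an array as above is one option when you need them.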

Edit:

A faster way to accomplish a similar result is to simply do pd.Series(list(a)). This makes a Series of NumPy arrays instead of Python lists, so it should be faster than a.tolist, which returns a list of Python lists.
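
A short sketch of the list(a) variant, assuming a small example array chosen so the rows are distinguishable. The names `s_arrays`, `first` and `restored` are hypothetical. No timing is measured here, so any speed difference would need to be checked on your own data.

```python
import numpy as np
import pandas as pd

a = np.arange(10, dtype=float).reshape(5, 2)

# Iterating over a 2D array yields its rows, so list(a) is a list of 1D arrays.
s_arrays = pd.Series(list(a))

# Each element is a NumPy array rather than a Python list.
first = s_arrays.iloc[0]
print(type(first).__name__, first.shape)

# np.stack joins the 1D rows back into a single 2D array.
restored = np.stack(s_arrays.tolist())
print(restored.shape)
```

The practical difference from the tolist version is the element type: here each element keeps NumPy behaviour (for example, `first * 2` is elementwise), whereas a Python list would be repeated.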

