Pandas categorical data conversion

df1 = DataFrame({'Site Name' : pd.Categorical(list('EENEESENNENNENNWNWSSESSESSWSWVRBWWNWWSW'),categories=['E','ENE','ESE','N','NE','NNE','NNW','NW','S','SE','SSE','SSW','SW','VRB','W','WNW','WSW']), 'B' : numpy.arange(20) })

This is my code to convert categorical data. I keep getting this error:

"ValueError: arrays must all be same length"

Please help me out

1 answer

    Best answer
  1. You get the error because 'Site Name' is a categorical with 39 entries, whereas your 'B' column has only 20. If you change 'B' to match that length (np.arange(39)), it works fine:

    In [5]:
    import pandas as pd
    import numpy as np
    directions = list('EENEESENNENNENNWNWSSESSESSWSWVRBWWNWWSW')  # 39 single characters
    categories = ['E','ENE','ESE','N','NE','NNE','NNW','NW','S','SE','SSE','SSW','SW','VRB','W','WNW','WSW']
    # 'B' must have the same length as 'Site Name' (39)
    df1 = pd.DataFrame({'Site Name': pd.Categorical(directions, categories=categories), 'B': np.arange(39)})
    df1
    
    Out[5]:
         B Site Name
    0    0         E
    1    1         E
    2    2         N
    3    3         E
    4    4         E
    5    5         S
    6    6         E
    7    7         N
    8    8         N
    9    9         E
    10  10         N
    11  11         N
    12  12         E
    13  13         N
    14  14         N
    15  15         W
    16  16         N
    17  17         W
    18  18         S
    19  19         S
    20  20         E
    21  21         S
    22  22         S
    23  23         E
    24  24         S
    25  25         S
    26  26         W
    27  27         S
    28  28         W
    29  29       NaN
    30  30       NaN
    31  31       NaN
    32  32         W
    33  33         W
    34  34         N
    35  35         W
    36  36         W
    37  37         S
    38  38         W
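
A side note on the NaN values in rows 29 to 31 above: `list(...)` splits the string into single characters, so 'VRB' arrives as 'V', 'R' and 'B', none of which is a listed category, and multi-letter categories such as 'ENE' can never match. The following is only a minimal sketch, assuming the directions are available as separate labels; the shortened category list and the sample labels are hypothetical and not taken from the question's data.

```python
import pandas as pd

# Hypothetical, shortened category list for illustration only.
categories = ["E", "ENE", "N", "VRB", "W"]

# list() on a string yields single characters, so 'VRB' becomes 'V', 'R', 'B',
# and each of those is replaced by NaN because it is not a category.
chars = pd.Categorical(list("EVRBW"), categories=categories)
n_missing_chars = int(chars.isna().sum())

# Passing whole labels keeps multi-letter categories intact.
labels = pd.Categorical(["E", "VRB", "W", "ENE"], categories=categories)
n_missing_labels = int(labels.isna().sum())

print(n_missing_chars, n_missing_labels)
```

Whether this matters depends on how the original direction data is stored; the string in the question has no separators, so it cannot be split into multi-letter labels unambiguously.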
    

    The DataFrame constructor requires that, when a dict is passed as the data parameter, all array-like values have the same length.
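
One way to avoid the mismatch is to derive the length of the second column from the first instead of hard-coding it. This is a minimal sketch under stated assumptions: the five-character sample string and the variable names are hypothetical stand-ins for the question's data.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in data: five single-character labels.
site = pd.Categorical(list("ENSWE"), categories=["E", "N", "S", "W"])

# Mismatched lengths (5 labels vs. 3 integers) raise ValueError.
try:
    pd.DataFrame({"Site Name": site, "B": np.arange(3)})
    mismatch_error = None
except ValueError as exc:
    mismatch_error = str(exc)

# Deriving the second column's length from the first keeps them in sync.
df = pd.DataFrame({"Site Name": site, "B": np.arange(len(site))})

print(mismatch_error)
print(df)
```

The exact wording of the error message may differ between pandas versions, which is why the sketch only stores it rather than comparing against a fixed string.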