I am trying to calculate regression output using a Python library, but I am unable to get the intercept value when I use this library:
    import statsmodels.api as sm
It prints all of the regression analysis except the intercept.
But when I use:
    from pandas.stats.api import ols
My code for pandas:
    regression = ols(y=sorted_data3['net_realization_rate'], x=sorted_data3[['cohort_2', 'cohort_3']])
    print(regression)
I get the intercept, along with a warning that this library will be deprecated in the future, so I am trying to use statsmodels instead.
The warning I get while using pandas.stats.api:
    Warning (from warnings module):
      File "C:\Python27\lib\idlelib\run.py", line 325
        exec code in self.locals
    FutureWarning: The pandas.stats.ols module is deprecated and will be removed in a future version. We refer to external packages like statsmodels, see some examples here: http://statsmodels.sourceforge.net/stable/regression.html
My code for statsmodels:
    import pandas as pd
    import numpy as np
    from pandas.stats.api import ols
    import statsmodels.api as sm

    # Raw string so the backslashes in the Windows path are not read as escapes
    data1 = pd.read_csv(r'c:\shank\regression.csv')  # importing the CSV
    print(data1)
After running some cleaning code:
    sm_model = sm.OLS(sorted_data3['net_realization_rate'], sorted_data3[['cohort_2', 'cohort_3']])
    results = sm_model.fit()
    print('\n')
    print(results.summary())
I also tried statsmodels.formula.api, as:
    sm_model = sm.OLS(formula="net_realization_rate ~ cohort_2 + cohort_3", data=sorted_data3)
    results = sm_model.fit()
    print('\n')
    print(results.params)
    print('\n')
    print(results.summary())
But I get this error:
    TypeError: __init__() takes at least 2 arguments (1 given)
In the final output, the first result is from pandas and the second is from statsmodels. I want the intercept value in the statsmodels output too, as in the pandas one.
So, statsmodels has an add_constant method that you need to use to explicitly add intercept values. IMHO, this is better than the R alternative, where the intercept is added by default.
In your case, you need to do this:
    import statsmodels.api as sm

    endog = sorted_data3['net_realization_rate']
    exog = sm.add_constant(sorted_data3[['cohort_2', 'cohort_3']])

    # Fit and summarize the OLS model
    mod = sm.OLS(endog, exog)
    results = mod.fit()
    print(results.summary())
Note that you can add the constant before your array, or after it, by passing True (the default) or False to the prepend kwarg in sm.add_constant.
Or, though it is not recommended, you can use NumPy to explicitly add a constant column like so:
    exog = np.concatenate((np.repeat(1, len(sorted_data3))[:, None], sorted_data3[['cohort_2', 'cohort_3']].values), axis=1)
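As for the formula attempt in the question: the `TypeError` looks like what happens when the `OLS` class is called with only `formula` and `data` keywords, though that is a guess from the message alone. The lowercase `ols` function in `statsmodels.formula.api` is the formula entry point, and it includes an intercept by default. A minimal sketch, again on synthetic data (`df` and its values are hypothetical):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic stand-in for sorted_data3 (hypothetical values).
rng = np.random.RandomState(1)
df = pd.DataFrame({
    "cohort_2": rng.normal(size=100),
    "cohort_3": rng.normal(size=100),
})
# Generated with intercept 0.7 and slopes 1.2 and 0.3, plus a little noise.
df["net_realization_rate"] = (
    0.7 + 1.2 * df["cohort_2"] + 0.3 * df["cohort_3"]
    + rng.normal(scale=0.05, size=100)
)

# Lowercase ols: builds the design matrix from the formula,
# adding an "Intercept" term automatically.
model = smf.ols(formula="net_realization_rate ~ cohort_2 + cohort_3", data=df)
results = model.fit()

print(results.params)  # index includes 'Intercept'
```

With the formula interface no call to `add_constant` is needed; the intercept appears as `Intercept` rather than `const`.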