Python 3 statsmodels Logit ValueError: On entry to DLASCL parameter number 5 had an illegal value

I am working through a logistic regression example and ran into trouble at the statsmodels step. In the past I had problems with Python 3 and pandas DataFrames, where df returns an iterator rather than a list. I tried to account for the same thing with "logit", but I still get a ValueError.

import numpy as np
import pandas as pd
import os
import statsmodels.api as sm
import pylab as pl

df = pd.read_csv('admissions.csv')
df.head(n=5)

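# Rename the columns ('rank' becomes 'prestige')
df.columns = ['admit', 'gre', 'gpa', 'prestige']
# One-hot encode prestige and keep prestige_2 onward (prestige_1 is the baseline)
dummy_ranks = pd.get_dummies(df['prestige'], prefix='prestige')
cols_to_keep = ['admit', 'gre', 'gpa']
# .loc replaces the removed .ix indexer for label-based slicing
data = df[cols_to_keep].join(dummy_ranks.loc[:, 'prestige_2':])
# Add the constant term manually
data['intercept'] = 1.0
# Every column except 'admit' is a predictor
train_cols = data.columns[1:]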


logit = sm.Logit(data['admit'], data[train_cols])

result = logit.fit()

ValueError: On entry to DLASCL parameter number 5 had an illegal value

1 answer

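Your "admissions.csv" most likely contains empty (missing) values.

Compare it with the original data at http://www.ats.ucla.edu/stat/data/binary.csv (see also http://blog.yhat.com/posts/logistic-regression-python-rodeo.html).

A complete file looks like this: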

admit  gre  gpa   rank
0      380  3.61  3
1      520  2.93  4

A file with a missing value (gre is empty in the first row) looks like this:

admit  gre  gpa   rank
0           3.61  3
1      520  2.93  4
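
Missing values like this can be detected in pandas before the model is fitted. The following is a minimal sketch, assuming a comma-separated file shaped like the example above; the inline sample data and the variable names are illustrative and are not taken from the asker's actual file. Dropping incomplete rows is only one possible fix.

```python
import io

import pandas as pd

# Hypothetical sample mirroring the broken file: the first row has no gre value.
csv_text = """admit,gre,gpa,rank
0,,3.61,3
1,520,2.93,4
"""

df = pd.read_csv(io.StringIO(csv_text))

# Count missing values per column; a non-zero count points at the bad column.
missing_per_column = df.isnull().sum()
print(missing_per_column)

# Show the offending rows so the file can be repaired at the source.
rows_with_missing = df[df.isnull().any(axis=1)]
print(rows_with_missing)

# One possible fix (an assumption, not the only option): drop incomplete rows.
clean = df.dropna()
print(len(df), "rows before,", len(clean), "rows after dropping missing values")
```

With a real file, replace the io.StringIO(...) argument with the path to "admissions.csv" and run the same checks before building the Logit model.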

Source: https://habr.com/ru/post/1656651/

