Logistic Regression in Python: part-01
- The independent variable, also called the input or predictor, doesn’t depend on other features of interest (or at least you assume so for the purpose of the analysis).
- The dependent variable, also called the output or response, depends on the independent variables.
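To make the two roles concrete, here is a minimal sketch on a toy table. The column names `age` and `balance` are illustrative assumptions, not columns confirmed by this article; only the output column `y` matches the name used in the data below.

```python
import pandas as pd

# Toy data; "age" and "balance" are hypothetical predictor columns.
toy = pd.DataFrame({
    "age": [25, 40, 58, 33],
    "balance": [100, 2500, 900, 0],
    "y": [0, 1, 1, 0],
})

X = toy.drop(columns="y")  # independent variables (inputs / predictors)
y = toy["y"]               # dependent variable (output / response)
```

Keeping `X` and `y` separate like this is the usual starting point before fitting a model, since the model learns to predict `y` from the columns of `X`.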
In [01]: # Create a one-hot encoding of the categorical columns.
data = pd.get_dummies(df, columns=['job', 'marital', 'default', 'housing', 'loan', 'poutcome'])
In [02]: data.head()
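To see what `pd.get_dummies` produces, the following sketch encodes a single categorical column of a small toy frame. The rows are made up for illustration; only the column names `y` and `marital` come from the article.

```python
import pandas as pd

# Toy frame with one categorical column; the values are illustrative.
toy = pd.DataFrame({
    "y": [0, 1, 0, 1],
    "marital": ["married", "single", "unknown", "divorced"],
})

# Each distinct value of "marital" becomes its own indicator column,
# named "<column>_<value>"; untouched columns such as "y" come first.
encoded = pd.get_dummies(toy, columns=["marital"])
print(encoded.columns.tolist())
# ['y', 'marital_divorced', 'marital_married', 'marital_single', 'marital_unknown']
```

Each row has exactly one of the `marital_*` indicators set, and the new columns appear in alphabetical order of the category values. That ordering is what makes positional indexing into `data.columns` possible.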
Dropping the “unknown” columns
In [03]: data.columns[12]
Out[03]: 'job_unknown'
In [04]: data.drop(data.columns[[12, 16, 18, 21, 24]], axis=1, inplace=True)
In [05]: data.columns
Out[16]: Index(['y', 'job_admin.', 'job_bluecollar', 'job_entrepreneur', 'job_housemaid', 'job_management', 'job_retired', 'job_self-employed', 'job_services', 'job_student', 'job_technician', 'job_unemployed', 'marital_divorced', 'marital_married', 'marital_single', 'default_no', 'default_yes', 'housing_no', 'housing_yes', 'loan_no', 'loan_yes', 'poutcome_failure', 'poutcome_nonexistent', 'poutcome_success'], dtype='object')
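Dropping columns by position works, but it depends on the exact column order. As an alternative sketch (not the approach used above), the same columns can be selected by their `_unknown` suffix. The toy rows here are assumptions for illustration.

```python
import pandas as pd

# Toy frame; the rows are illustrative, the column names follow the article.
toy = pd.DataFrame({
    "y": [0, 1, 0, 1],
    "job": ["admin.", "unknown", "student", "retired"],
    "loan": ["no", "yes", "unknown", "no"],
})
encoded = pd.get_dummies(toy, columns=["job", "loan"])

# Collect every indicator column whose name ends in "_unknown" ...
unknown_cols = [c for c in encoded.columns if c.endswith("_unknown")]

# ... and drop them by name, returning a new frame instead of mutating in place.
cleaned = encoded.drop(columns=unknown_cols)
print(cleaned.columns.tolist())
```

Selecting by name keeps working if a column is added or reordered upstream, at the cost of relying on the naming convention that `get_dummies` applies.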