Logistic Regression

The purpose of this assignment is to perform logistic regression, interpret the results, and analyze whether or not the information generated can be used to address a specific business problem.

For this assignment, you will use the "Adult Incomes" data set from the Topic Materials.

The marketing department is interested in creating advertising directed primarily at high-income individuals, and it has come to you seeking very specific customer data. The director of marketing explains that individuals with large amounts of disposable income tend to purchase luxury items. Therefore, understanding what predictors are correlated with high income can be very useful for a marketing department because it can help it tailor messages to the high-earning cohort. For example, individuals that earn capital gains tend to be high-income earners, and advertisements for luxury items can be targeted toward them on realty or investment websites.

As a member of the analytics team, you have been asked to determine a list of predictors and their relative impact on the likelihood of an individual being a high-income earner. Individuals earning more than $50,000 annually are considered high-income earners. In your summary, include a discussion of how the marketing department can use your results to devise a smart advertising strategy.

Question 4: Given that approximately 26% of the individuals in the data have incomes greater than $50,000 annually, rerun the model in Question 3 with a cut-off of 0.26. Show the classification tables and percent correct for each predicted outcome (>50K and <=50K) for the training data and test data. Why is the percent that is correct usually lower when the test data are used? Include the "Training Classification Table" and "Test Classification Table" outputs when submitting the answer.

Question 5: Consider the following individual: Age=30, Sex=Female, Hours_Per_Week=40, Capital_Gain=$0. Based on the logistic model from Question 4, what is the probability of this individual earning more than $50,000 annually? What would be the predicted class for this individual? Explain your answer.