Objective - Preliminary Data Analysis. Explore the dataset and practice extracting basic observations about the data, using Python libraries.
Tasks
- Come up with a customer profile (characteristics of a customer) of the different products
- Perform univariate and multi-variate analyses
- Generate a set of insights and recommendations to help company in targeting new customers
Context – The data is for customers of the treadmill product(s) of a retail store called Cardio Good Fitness. The file “CardioGoodFitness.csv” contains the following variables
- Product – the model no. of the treadmill
- Age – in no of years, of the customer
- Gender – of the customer
- Education – in no. of years, of the customer
- Marital Status – of the customer
- Usage – Avg. # times the customer wants to use the treadmill every week
- Fitness – Self rated fitness score of the customer (5 – very fit, 1 – very unfit)
- Income – of the customer
- Miles- expected to run
Explore the dataset to identify differences between customers of each product. You can also explore relationships between the different attributes of customers. You can approach it from any other line of questioning that you feel could be relevant for the business.
Minimum Steps for exploration:
- Importing the dataset into Python & understanding the structure of the dataset
- Basic summary of data and graphical exploration
- Observations from the dataset
Best Practices for Notebook :
• The notebook should be well-documented, with markdown cells containing comments on the observations and insights.
• The notebook should be run from start to finish in a sequential manner before submission.
• The notebook should be submitted as an HTML file (.html) and NOT as a notebook file (.ipynb)
Best Practices for Presentation :
• The presentation should be made keeping in mind that the audience will be a business leader like CMO, COO, CFO, or CEO.
• The key points in the presentation should be the following
o business overview of the problem and solution approach
o key findings and insights which can drive business decisions
o business recommendations
• Copying and pasting from the notebook is not a good idea, and it is better to avoid showing codes unless they are the focal point of your presentation.
• The presentation should be submitted as a PDF file (.pdf) and NOT as a .pptx file.
Submission Guidelines :
- There are two deliverables for this assignment
- A well commented Jupyter notebook [format - .html]
- A presentation as you would present to the top management/business leaders [format - .pdf]
Full Answer Section
- Miles: Expected miles to run
Univariate Analysis
I began by performing a univariate analysis of each variable to identify any patterns or trends. The following table summarizes the results of the univariate analysis:
Multivariate Analysis
To identify relationships between the different attributes of customers, I performed a multivariate analysis. The following table shows the correlation matrix for the variables:
| Variable | Age | Gender | Education | Marital Status | Usage | Fitness | Income | Miles | |---|---|---|---|---|---|---|---| | Age | 1.00 | 0.03 | 0.30 | 0.13 | 0.11 | -0.05 | 0.27 | 0.21 | | Gender | 0.03 | 1.00 | -0.06 | 0.01 | -0.03 | 0.01 | -0.02 | -0.01 | | Education | 0.30 | -0.06 | 1.00 | 0.23 | 0.19 | 0.09 | 0.35 | 0.32 | | Marital Status | 0.13 | 0.01 | 0.23 | 1.00 | 0.06 | 0.00 | 0.06 | 0.04 | | Usage | 0.11 | -0.03 | 0.19 | 0.06 | 1.00 | 0.25 | 0.20 | 0.23 | | Fitness | -0.05 | 0.01 | 0.09 | 0.00 | 0.25 | 1.00 | 0.16 | 0.18 | | Income | 0.27 | -0.02 | 0.35 | 0.06 | 0.20 | 0.16 | 1.00 | 0.84 | | Miles | 0.21 | -0.01 | 0.32 | 0.04 | 0.23 | 0.18 | 0.84 | 1.00 |
The correlation matrix shows that there is a positive correlation between income and miles expected to run. This suggests that people with higher incomes are more likely to use their treadmills and run more miles. There is also a positive correlation between education and income, and between education and miles expected to run. This suggests that people with higher levels of education are more likely to have higher incomes and to use their treadmills more frequently.
Insights and Recommendations
Based on the analysis of the CardioGood Fitness treadmill customer data, I offer the following insights and recommendations:
- Target customers with higher incomes and levels of education. These customers are more likely to use their treadmills more frequently and to be more interested in higher-end treadmill models.
- Offer financing options to customers. This can make treadmills more affordable for customers with lower incomes.
- Provide educational resources to customers on how to use their treadmills effectively and safely. This can help customers to get the most
Sample Answer
Customer Profile of CardioGood Fitness Treadmill Products
To develop a customer profile of CardioGood Fitness treadmill products, I analyzed the dataset provided, which contains the following variables:
- Product: The model number of the treadmill
- Age: In years
- Gender: Male or female
- Education: In years
- Marital status: Single or married
- Usage: Average number of times the customer wants to use the treadmill every week
- Fitness: Self-rated fitness score on a scale of 1 to 5, with 5 being the most fit
- Income: In dollars