Data Analysis and Decision Model

A mutual fund is a company that pools money from many investors and invests the money in securities such as stocks, bonds, and short-term debt. The combined holdings of the mutual fund are known as its portfolio.
The level of risk in a mutual fund depends on what it invests in. Usually, the higher the potential returns, the higher the risk will be. For example, stocks are generally riskier than bonds, so an equity fund tends to be riskier than a fixed income fund.

Some specialty mutual funds focus on certain kinds of investments, such as emerging markets, to try to earn a higher return. These kinds of funds also tend to have a greater risk of a larger drop in value.
Many people believe that mutual funds are risky and not safe to invest in. That belief is your null hypothesis.
H0 (null hypothesis): Mutual funds are risky and do not give good returns. Average rate of return (µ) <= 2%
Ha (alternative hypothesis, the one you are trying to prove): Mutual funds are safe to invest in and give good returns. Average rate of return (µ) > 2%
(About 2% is the average safe return a person can earn from US debt securities. This percentage varies by country.
For example, if you choose to collect samples from India, the average safe rate of return would be 6% to 8%. In that case H0 would be µ <= 8%.)

Given the statements above, the project has the following expectations:
Part 1: Collect Your Dataset

You can collect your data from any reliable website.
The website I currently use is https://www.moneycontrol.com/mutual-funds/find-fund/ but you can also choose a website of your own.
You can also go to the top mutual fund companies and look up their historical performance.
For example: https://www.troweprice.com/personal-investing/tools/fund-research/historical-performance
Kaggle and GitHub also have datasets for mutual funds.
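Whichever source you pick, the data usually ends up as a CSV file. The following is a minimal sketch of reading one return column with Python's standard csv module. The column names "fund_name" and "return_1y" and the inline sample rows are assumptions made for illustration; your download will have its own layout.

```python
import csv
import io

# Hypothetical CSV layout; the column names "fund_name" and "return_1y" are
# assumptions and will differ depending on where you download your data.
SAMPLE_CSV = """fund_name,return_1y
Fund A,7.2
Fund B,
Fund C,-1.5
"""

def load_returns(file_obj, column="return_1y"):
    """Read one numeric return column, skipping rows with missing values."""
    returns = {}
    for row in csv.DictReader(file_obj):
        value = (row.get(column) or "").strip()
        if value:
            returns[row["fund_name"]] = float(value)
    return returns

# io.StringIO stands in for a real file so the sketch runs on its own.
returns = load_returns(io.StringIO(SAMPLE_CSV))
print(returns)
```

For a real download you would pass an open file instead, for example open("funds.csv", newline=""), where funds.csv is a placeholder file name. Rows with a missing return are skipped here; whether to drop or fill them is a choice you should state in your report.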

Part 2: Exploratory Data Analysis

In this part you are expected to do some exploratory data analysis (EDA) on the dataset. This includes providing a summary analysis and producing various plots and tables that show useful information about your data.
Here is one example of EDA, done on COVID-19 data:
https://onlinelibrary.wiley.com/doi/full/10.1002/jmv.25743.
Make sure to include a proper analysis and explanation of each plot and table, and give each one proper labels and axis titles. Pasting a diagram with no explanation will not earn you any points.
Minimum number of plots and tables required: 5
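As a starting point for the summary analysis, here is a minimal sketch that builds one labeled summary table using only Python's standard library. The fund categories and return values are hypothetical placeholders, not real data; replace them with the values from your own dataset.

```python
import statistics

# Hypothetical 1Y returns (in percent) for illustration only; replace with
# the values from your own dataset.
returns_by_category = {
    "Equity": [12.1, 8.4, -3.2, 15.0, 6.7],
    "Debt": [4.1, 3.8, 5.0, 4.4, 3.9],
}

def summarize(values):
    """Return basic descriptive statistics for a list of returns."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
        "min": min(values),
        "max": max(values),
    }

summary = {name: summarize(vals) for name, vals in returns_by_category.items()}

# Print a small labeled table, one row per fund category.
print(f"{'Category':<10}{'n':>4}{'mean':>8}{'median':>8}{'stdev':>8}{'min':>8}{'max':>8}")
for name, s in summary.items():
    print(f"{name:<10}{s['n']:>4}{s['mean']:>8.2f}{s['median']:>8.2f}"
          f"{s['stdev']:>8.2f}{s['min']:>8.2f}{s['max']:>8.2f}")
```

A table like this counts as one of the five required items only if you also explain what it shows, for example how the spread (stdev) differs between categories.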

Part 3: Hypothesis Testing

(Hint: This is a one-sided test with a '>' (greater than) alternative, because you are trying to prove that mutual funds give a higher return than a normal debt instrument.)
Use the average rate of return or yield (choose any one of 1Y, 5Y, or 10Y) to show that mutual funds are safe to invest in.
Use multiple samples or funds to test your hypothesis.
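One common way to run this test is a one-sample, one-sided t-test of the sample mean against the 2% benchmark. The sketch below is one possible approach, not a required method. It uses only the standard library and computes the p-value by numerically integrating the t density. The sample values, the helper names, and the 0.05 significance level are assumptions for illustration, and the test itself assumes the returns are independent and roughly normally distributed.

```python
import math
import statistics

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_coef) / math.sqrt(df * math.pi)
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=10_000):
    """P(T > t), via Simpson's rule on [0, |t|] and the symmetry of the pdf."""
    a = abs(t)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # P(0 < T < |t|)
    return 0.5 - area if t >= 0 else 0.5 + area

def one_sided_t_test(sample, mu0):
    """Test H0: mu <= mu0 against Ha: mu > mu0. Returns (t statistic, p-value)."""
    n = len(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    t_stat = (statistics.mean(sample) - mu0) / standard_error
    return t_stat, t_upper_tail(t_stat, n - 1)

# Hypothetical 5Y returns (in percent) of eight funds; replace with your data.
sample_returns = [5.2, 7.1, 3.4, 6.8, 4.9, 8.3, 2.7, 6.1]
ALPHA = 0.05  # assumed significance level

t_stat, p_value = one_sided_t_test(sample_returns, mu0=2.0)
print(f"t = {t_stat:.3f}, one-sided p = {p_value:.5f}")
print("Reject H0" if p_value < ALPHA else "Fail to reject H0")
```

If SciPy is available, scipy.stats.ttest_1samp with alternative="greater" performs the same test. Repeat the test for each sample or fund group you collected, and report the t statistic, the degrees of freedom, and the p-value each time.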


Part 4: Bonus

Perform an ANOVA (analysis of variance) on the samples collected.
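A one-way ANOVA compares the means of several groups at once, for example funds grouped by category. Here is a minimal sketch of the F statistic using only the standard library. The group names and values are hypothetical placeholders, and the function name is my own.

```python
import statistics

def one_way_anova(groups):
    """Return (F statistic, df_between, df_within) for a list of samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum(sum((x - statistics.mean(g)) ** 2 for x in g) for g in groups)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical returns (in percent) for three fund categories; replace with
# the samples you collected.
samples = {
    "Equity": [9.1, 7.4, 11.2, 6.8],
    "Hybrid": [6.0, 5.5, 7.1, 6.4],
    "Debt": [4.2, 3.9, 4.8, 4.4],
}

f_stat, df_b, df_w = one_way_anova(list(samples.values()))
print(f"F({df_b}, {df_w}) = {f_stat:.3f}")
```

To decide significance, compare F against the critical value from an F table for these degrees of freedom, or obtain a p-value with scipy.stats.f_oneway if SciPy is available. A large F suggests that at least one group mean differs from the others; it does not say which one.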