
Finance and Risk Analytics
Module 2

Name: Sweta Kumari


PGP-DSBA Online
July '21
Date: 14/05/2022

Proprietary content. ©Great Learning. All Rights Reserved. Unauthorized use or distribution prohibited
Table of Contents

Problem Statement 1: Company Analysis
1. Random Forest on Train Data
2. Random Forest on Test Data
3. LDA Model on Train Data
4. LDA Model on Test Data
5. Comparison between Logistic Regression, LDA and Random Forest
6. Recommendation on Credit Data

Problem Statement 2: Market Risk Analysis
1) Draw Stock Price Graph (Stock Price vs Time) for any 2 given stocks with inference
2) Calculate Returns for all stocks with inference
3) Calculate Stock Means and Standard Deviation for all stocks with inference
4) Draw a plot of Stock Means vs Standard Deviation and state your inference
5) Conclusion and Recommendations

List of Figures
Figure 1: Confusion Matrix of Random Forest training data
Figure 2: ROC graph of Random Forest training data
Figure 3: Confusion Matrix of Random Forest test data
Figure 4: ROC graph of Random Forest test data
Figure 5: Confusion Matrix of LDA Model training data
Figure 6: Area under curve graph for LDA Model training data
Figure 7: Area under curve graph for LDA Model test data
Figure 8: Confusion matrix comparing test and train data of LDA model
Figure 9: Stock Price Graph
Figure 10: Return for all the stocks
Figure 11: Std Deviation of Stocks

Problem Statement 1:

Businesses or companies can fall prey to default if they are not able to keep up with their debt
obligations. A default leads to a lower credit rating for the company, which in turn reduces its
chances of getting credit in the future, and the company may have to pay higher interest on existing
debts as well as on any new obligations. From an investor's point of view, a company is worth
investing in if it is capable of handling its financial obligations, can grow quickly, and is able
to manage the scale of that growth.

A balance sheet is a financial statement of a company that provides a snapshot of what a company
owns, owes, and the amount invested by the shareholders. Thus, it is an important tool that helps
evaluate the performance of a business.

The available data includes information from the financial statements of the companies for the
previous year (2015). Information about the Networth of each company in the following year
(2016) is also provided and can be used to derive the labeled field.

Question 1.8- Build a Random Forest Model on Train Dataset. Also showcase your model building
approach.

Ans- Random Forest is a Supervised Machine Learning Algorithm that is used widely in
Classification and Regression problems. It builds decision trees on different samples and takes their
majority vote for classification and average in case of regression.

 The data is split into train and test sets in a 67:33 ratio, and a Random Forest model is fitted on the train set.
 The hyperparameters are tuned with a grid search over a parameter grid.
 On the train dataset, the tuned Random Forest model reaches 98% accuracy; the confusion matrix
(Fig 1), classification report and ROC curve (Fig 2) are reported below.
 The grid search selected the following model:
 RandomForestClassifier(max_depth=6, max_features=6, min_samples_leaf=14,
min_samples_split=40, n_estimators=201, random_state=1)
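
The model-building steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used for the report: the data is synthetic (`make_classification` stands in for the company financials), and the small `param_grid` is hypothetical, chosen only so that it contains the reported best values.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the company data (assumption: the real features
# and the default label are not reproduced here).
X, y = make_classification(n_samples=600, n_features=12, n_informative=6,
                           weights=[0.85, 0.15], random_state=1)

# 67:33 train/test split, as described in the report.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, random_state=1)

# Hypothetical grid: a few candidate values around the reported optimum.
param_grid = {
    "max_depth": [4, 6],
    "max_features": [4, 6],
    "min_samples_leaf": [14],
    "min_samples_split": [40],
    "n_estimators": [51],
}

# Cross-validated search over every combination in the grid.
grid_search = GridSearchCV(
    estimator=RandomForestClassifier(random_state=1),
    param_grid=param_grid,
    cv=3,
    scoring="accuracy",
)
grid_search.fit(X_train, y_train)

# The best estimator is refitted on the full train set by default.
best_rf = grid_search.best_estimator_
train_accuracy = best_rf.score(X_train, y_train)
print(grid_search.best_params_)
print(f"Train accuracy: {train_accuracy:.3f}")
```

The number of trees is kept small here so the sketch runs quickly; the report's selected model uses `n_estimators=201`.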

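Predicting the class probabilities using the best estimator from the grid search


Fig 1: Confusion matrix of Random Forest training data

Classification report:

Area under curve: 0.9935126397949674


Fig 2: ROC graph of Random Forest training data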

Question 1.9 Validate the Random Forest Model on test Dataset and state the performance
metrics. Also state interpretation from the model

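Ans - RF Model Performance Evaluation on Test data


Confusion Matrix:

Fig 3: Confusion matrix of Random Forest test data

Classification report:

Area under curve: 0.9881682797557322

Fig 4: ROC graph of Random Forest test data

Observation: For the test dataset, the Random Forest model provides an accuracy of 97% and an
AUC of 0.988. Both are close to the training figures (98% accuracy, AUC of 0.994), so the model
does not appear to overfit.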
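
The test-set metrics listed above (confusion matrix, classification report, accuracy and AUC) can be computed along these lines. This is a sketch on synthetic data, so its numbers will not match the report; only the hyperparameters are taken from the reported best estimator.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import (accuracy_score, classification_report,
                             confusion_matrix, roc_auc_score)
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the company data (assumption).
X, y = make_classification(n_samples=600, n_features=12, n_informative=6,
                           random_state=1)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, random_state=1)

# Hyperparameters as reported for the tuned Random Forest.
rf = RandomForestClassifier(max_depth=6, max_features=6, min_samples_leaf=14,
                            min_samples_split=40, n_estimators=201,
                            random_state=1)
rf.fit(X_train, y_train)

# Hard class predictions feed the confusion matrix and the report;
# the probability of class 1 feeds the AUC.
y_pred = rf.predict(X_test)
y_prob = rf.predict_proba(X_test)[:, 1]

cm = confusion_matrix(y_test, y_pred)
test_accuracy = accuracy_score(y_test, y_pred)
test_auc = roc_auc_score(y_test, y_prob)

print(cm)
print(classification_report(y_test, y_pred))
print(f"Accuracy: {test_accuracy:.3f}  AUC: {test_auc:.3f}")
```

Running the same calls on `X_train` and `y_train` gives the training-side figures, which makes the train/test comparison direct.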

Question 1.10 Build an LDA Model on Train Dataset. Also showcase your model building approach

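Ans: The model was built with the parameters below, obtained through grid search. Note that the
reported best estimator is a LogisticRegression with an L1 penalty: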

best_params : {'penalty': 'l1', 'solver': 'liblinear', 'tol': 0.0001}

best_estimator : LogisticRegression(max_iter=100000, n_jobs=2, penalty='l1', solver='liblinear')

Classification report on Training data:


Confusion Matrix on Training data:

Fig 5: Confusion matrix of LDA Model training data

Accuracy (training data): 0.895503746877602

Area under Curve:


Fig 6: Area under curve graph for LDA Model training data

Question 1.11 Validate the LDA Model on test Dataset and state the performance metrics. Also
state interpretation from the model

Ans:
Area under curve for Test data:

Fig 7: Area under curve graph for LDA Model test data

Accuracy (test data): 0.8842905405405406

Observation: The test accuracy (0.884) is slightly lower than the training accuracy (0.896), so the
model performs marginally better on the train data. The small gap suggests it is not overfitting.

Build LDA Model

Number of rows and columns of the training set for the independent variables: (2402, 63)
Number of rows and columns of the training set for the dependent variable: (2402,)
Number of rows and columns of the test set for the independent variables: (1184, 63)
Number of rows and columns of the test set for the dependent variable: (1184,)

Training Data and Test Data Confusion Matrix Comparison

Fig 8: Confusion matrix comparing test and train data of LDA model

Classification Report:

LDA Train Accuracy: 0.9383846794338052
LDA Test Accuracy: 0.9358108108108109

AUC and ROC for the training and test data
AUC for the Training Data: 0.958
AUC for the Test Data: 0.936

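
An LDA fit of the kind reported above can be sketched with scikit-learn's `LinearDiscriminantAnalysis`. The data below is synthetic and the default solver is an assumption, since the report does not state the LDA settings; the point is only to show how the train and test accuracy and AUC are obtained from one fitted model.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the company data (assumption).
X, y = make_classification(n_samples=600, n_features=12, n_informative=6,
                           random_state=1)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, random_state=1)

# Default LDA settings (assumption: the report's settings are not given).
lda = LinearDiscriminantAnalysis()
lda.fit(X_train, y_train)

# Collect accuracy and AUC for both partitions with the same fitted model.
partitions = {"train": (X_train, y_train), "test": (X_test, y_test)}
results = {}
for name, (X_part, y_part) in partitions.items():
    results[name] = {
        "accuracy": accuracy_score(y_part, lda.predict(X_part)),
        "auc": roc_auc_score(y_part, lda.predict_proba(X_part)[:, 1]),
    }

for name, metrics in results.items():
    print(f"{name}: accuracy={metrics['accuracy']:.3f}, auc={metrics['auc']:.3f}")
```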
Question 1.12 Compare the performances of Logistic Regression, Random Forest and LDA models
(include ROC Curve)

Ans :

ROC Curve for the 3 models on the Training data

Fig

ROC Curve for the 3 models on the Test data


Fig

Observation: From the above comparison, Random Forest performs best on this dataset. Its test
accuracy (97%) and test AUC (0.988) are higher than those of the LDA model (93.6% and 0.936).
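
A comparison of this kind can be sketched by computing one ROC curve per model on the same test set. The data is synthetic and the LDA settings are assumptions; the Logistic Regression and Random Forest settings follow the estimators printed in the report (without `n_jobs`). Plotting is left out so the sketch has no display dependency.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import auc, roc_curve
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the company data (assumption).
X, y = make_classification(n_samples=600, n_features=12, n_informative=6,
                           random_state=1)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, random_state=1)

models = {
    "Logistic Regression": LogisticRegression(
        penalty="l1", solver="liblinear", max_iter=100000),
    "LDA": LinearDiscriminantAnalysis(),
    "Random Forest": RandomForestClassifier(
        max_depth=6, max_features=6, min_samples_leaf=14,
        min_samples_split=40, n_estimators=201, random_state=1),
}

# Fit each model and store its test-set ROC curve and AUC.
roc_results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    prob = model.predict_proba(X_test)[:, 1]
    fpr, tpr, _ = roc_curve(y_test, prob)
    roc_results[name] = {"fpr": fpr, "tpr": tpr, "auc": auc(fpr, tpr)}

best_model = max(roc_results, key=lambda name: roc_results[name]["auc"])
for name, res in roc_results.items():
    print(f"{name}: AUC = {res['auc']:.3f}")
print("Highest test AUC:", best_model)
```

Each stored `fpr`/`tpr` pair can be drawn as one line on a shared axis to reproduce a three-model ROC chart.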

Question 1.13 State Recommendations from the above models


Ans :

 According to the model results, the Random Forest model gives the best results; therefore,
RF should be used for the prediction.
 Book value adj. unit curr. and Net worth are the two most important factors for predicting
Net worth for the upcoming year.
 Compared to the Logistic Regression model, the LDA model provides better results.
 The correlation between Net worth and Net worth next year is high, while Value of
output and Cost of production are highly correlated with Gross sales.
 In addition, we observe high correlations between Net sales and PBDT, PBIT, PBT, PAT,
and Adjusted PAT.
 We should consider predicting with the 28 variables out of 66 in the dataset that have a
Variance Inflation Factor below 5, in order to avoid the impact of multicollinearity on the
prediction.
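
The Variance Inflation Factor screening mentioned in the last point can be illustrated as below. This is a sketch, not the report's procedure: the column names and data are hypothetical, and the VIF is computed with the identity that it equals the diagonal of the inverse of the feature correlation matrix (equivalent to 1 / (1 - R²) from regressing each feature on the others).

```python
import numpy as np
import pandas as pd


def variance_inflation_factors(features: pd.DataFrame) -> pd.Series:
    """Return the VIF of each column of `features`.

    VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing column j on
    the remaining columns; this equals the j-th diagonal entry of the
    inverse correlation matrix.
    """
    corr = features.corr().to_numpy()
    vif = np.diag(np.linalg.inv(corr))
    return pd.Series(vif, index=features.columns, name="VIF")


# Hypothetical data: two nearly collinear columns and one unrelated column.
rng = np.random.default_rng(1)
n = 500
net_worth = rng.normal(size=n)
demo = pd.DataFrame({
    "net_worth": net_worth,
    "net_worth_next_year": net_worth + 0.1 * rng.normal(size=n),
    "unrelated_ratio": rng.normal(size=n),
})

vif = variance_inflation_factors(demo)
kept_columns = vif[vif < 5].index.tolist()  # threshold of 5, as in the text
print(vif.round(2))
print("Kept:", kept_columns)
```

Dropping every high-VIF column in one pass is a simplification; a common alternative is to remove the worst column, recompute, and repeat until all remaining values fall below the threshold.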

Problem Statement 2:


The dataset contains 6 years of weekly stock price information for 10 different Indian stocks.
Calculate the mean and standard deviation of the stock returns and share insights.

Question 2.1 Draw Stock Price Graph (Stock Price vs Time) for any 2 given stocks with inference.
Ans: Our dataset contains stock prices for Infosys, Axis Bank, SAIL, Shree Cement, Sun Pharma,
Jindal Steel, Indian_Hotel, Mahindra & Mahindra, Idea_Vodafone, and Jet Airways for the period
Mar 2014 to Mar 2020.

Infosys Stock Price :

Mahindra & Mahindra Stock Price:

Shree Cement stock price :


Jet Airways Stock Price :

Fig 9
Observation:
 Infosys shares followed an increasing trend, dropped in 2017, rose again, and then dropped
in 2020.
 Shree Cement has seen an increasing trend since 2014, was stable in 2017, and rose again
in 2018.
 In 2018, the share prices of M&M and Jet Airways plummeted sharply and have been
fluctuating since then.

Question 2.2 Calculate Returns for all stocks with inference


Ans: Only the top 5 returns are shown here because it is hard to display all the data.
Fig 10
 We computed the returns on each stock and observed that Shree Cement provides the highest
returns when compared to the other stocks, followed by Infosys and Axis Bank.
 Idea_Vodafone, Jet Airways, Jindal Steel and SAIL provide the lowest returns.
 We also observe that Jet Airways' returns fluctuate a lot.
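
The return calculation can be sketched as below. The report does not say which definition of return was used, so weekly log returns, ln(P_t) - ln(P_{t-1}), are an assumption here, and the two price series are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical weekly closing prices for two placeholder stocks.
prices = pd.DataFrame(
    {
        "Stock_A": [100.0, 110.0, 121.0, 108.9],
        "Stock_B": [50.0, 50.0, 45.0, 49.5],
    },
    index=pd.date_range("2014-03-31", periods=4, freq="7D"),
)

# Weekly log return: difference of log prices between consecutive weeks.
# The first row is NaN because it has no previous week.
returns = np.log(prices).diff()
print(returns)
```

Log returns add up across weeks, which makes their mean and standard deviation convenient summary statistics; simple percentage returns (`prices.pct_change()`) are a common alternative.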

Question 2.3 Calculate Stock Means and Standard Deviation for all stocks with inference
Ans:

Stock Mean for all the stocks

Stock Std Deviation for all the stocks :


Observation:
 The mean stock price is highest for Shree_Cement, followed by Sun Pharma and Axis Bank,
which shows that these stocks trade at higher prices.
 The mean of stock returns is highest for Shree Cement, followed by Infosys, Axis_Bank and
Indian_Hotel, indicating that returns are higher for these stocks.
 The standard deviation of stock prices is higher for Shree_Cement, Jet_Airways, Sun_Pharma
and Infosys, showing higher risk in these stocks.
 The standard deviation of stock returns is highest for Idea_Vodafone, Jet_Airways,
Jindal_Steel and SAIL, showing highly volatile returns for these stocks.
 The lowest volatility in returns is seen for Infosys and Shree_Cement.
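
The mean and standard deviation of returns can be tabulated per stock as in this sketch. The prices are simulated and the column names `Average` and `Volatility` are hypothetical labels; log returns are again an assumption.

```python
import numpy as np
import pandas as pd

# Simulated weekly prices: Stock_A drifts up with low volatility,
# Stock_B drifts down with high volatility (hypothetical data).
rng = np.random.default_rng(0)
weeks = pd.date_range("2014-03-31", periods=52, freq="7D")
prices = pd.DataFrame(
    {
        "Stock_A": 100 * np.exp(np.cumsum(rng.normal(0.004, 0.02, size=52))),
        "Stock_B": 100 * np.exp(np.cumsum(rng.normal(-0.002, 0.06, size=52))),
    },
    index=weeks,
)

# Weekly log returns, dropping the first row that has no previous week.
returns = np.log(prices).diff().dropna()

# One row per stock: mean return and sample standard deviation (ddof=1).
summary = pd.DataFrame(
    {"Average": returns.mean(), "Volatility": returns.std()}
).sort_values("Average", ascending=False)
print(summary)
```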

Question 2.4 Draw a plot of Stock Means vs Standard Deviation and state your inference.

Ans :
Fig 11

Observation: Stocks with a lower mean and a higher standard deviation do not play a role in a
portfolio that has competing stocks with more return and less risk. Thus, for the data we have
here, we are left with only a few stocks. Stocks with a higher return for comparable or lower risk
are considered better.
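
The screening rule in this observation can be written as a dominance check: a stock is set aside when some other stock offers at least as much mean return for no more risk, and is strictly better on one of the two. The figures below are hypothetical and the function name is illustrative.

```python
import pandas as pd

# Hypothetical mean return and volatility per stock.
summary = pd.DataFrame(
    {
        "Average": [0.004, 0.003, -0.002, 0.001],
        "Volatility": [0.04, 0.03, 0.09, 0.05],
    },
    index=["Stock_A", "Stock_B", "Stock_C", "Stock_D"],
)


def is_dominated(name: str, table: pd.DataFrame) -> bool:
    """True if another stock has return >= and risk <=, one of them strictly."""
    row = table.loc[name]
    others = table.drop(index=name)
    no_worse = (others["Average"] >= row["Average"]) & (
        others["Volatility"] <= row["Volatility"])
    strictly_better = (others["Average"] > row["Average"]) | (
        others["Volatility"] < row["Volatility"])
    return bool((no_worse & strictly_better).any())


dominated = [name for name in summary.index if is_dominated(name, summary)]
candidates = [name for name in summary.index if name not in dominated]
print("Dominated:", dominated)
print("Candidates:", candidates)
```

In this toy table, Stock_C and Stock_D are dominated by Stock_A, while Stock_A and Stock_B remain because neither beats the other on both return and risk.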
Question 2.5 Conclusion and Recommendations
Ans :

Conclusion:
 Stocks with a lower mean and a higher standard deviation do not play a role in a portfolio
that has competing stocks with more return and less risk. Thus, for the data we have here,
we are left with only a few stocks:
1) ones with the highest return and low risk
2) ones with the lowest risk and high return
 Stocks like Shree Cement, Infosys, and Axis Bank offer low risk and high returns, and make
good investments.
 Sun Pharma, Mahindra & Mahindra, and SAIL are less risky but generate lower returns.
 Idea_Vodafone, Jet Airways, and Jindal Steel are poor investments because they have higher
risk and lower returns.

Recommendations:
 We recommend using the stock means vs standard deviation plot to assess the risk-to-reward
ratio. A more volatile stock might give short-term gains but might not be a good investment
in the long term, whereas a less volatile stock might not be a good investment in the short
term but might give a good return in the long term.
 Stocks like Shree Cement, Infosys, and Axis Bank offer low risk and high returns, and make
good investments. They are highly recommended for long-term investment.
 Sun Pharma, Mahindra & Mahindra, and SAIL are less risky but generate lower returns.
Investors who are new to the stock market can consider these stocks, as the low risk gives
them a chance to learn more about the market.
 Idea_Vodafone, Jet Airways, and Jindal Steel are poor investments because they have higher
risk and lower returns, so investors should be very cautious before investing in these
stocks for the long run.
 Considering the above insights, investors should choose the stock that matches their
preferences from the options above.
