REPORT ON
BACHELOR OF ENGINEERING
IN
INFORMATION TECHNOLOGY
BY
CERTIFICATE
This is to certify that the preliminary project report entitled
is a record of bonafide work carried out by him under the supervision and guidance of Dr.
A. M. Bagade in partial fulfillment of the requirement of Savitribai Phule Pune University
for the award of the Degree of Bachelor of Engineering (Information Technology).
This project report has not been earlier submitted to any other Institute or University for
the award of any degree or diploma.
ACKNOWLEDGEMENT
The creation of this report required a lot of guidance from many people, and it is
always a pleasure to remember them. We are grateful to our institute for making available
all the resources necessary for research on our seminar topic. We would also like to thank
all the people who have been instrumental in the successful completion of this report.
We would like to thank our Internal Guide, Dr. A. M. Bagade, for guiding us at
each and every step of the project since its commencement. We would also like to thank
our External Guides, Mr. S. S. Pande and Mr. T. A. Rane, for analyzing our work and
providing us with their valuable inputs throughout the project.
We would like to extend our gratitude towards all the staff members of the IT
department for their constant support and encouragement. We also express our deepest gratitude
towards Dr. A. M. Bagade, Head of Information Technology Department, PICT, and
Dr. P. T. Kulkarni, Principal, PICT, for providing us with this wonderful opportunity to
extend our knowledge and explore new horizons.
Finally, we appreciate all our fellow colleagues and group members who have
worked hard to the best of their abilities to do research for this seminar. Being with them
and working with them has truly shown us the importance of teamwork, and they have
always motivated us to carry on.
LIST OF FIGURES
LIST OF ABBREVIATIONS
MA Moving Average
FE Feature Extraction
CONTENTS
CERTIFICATE
ACKNOWLEDGEMENT
LIST OF FIGURES
LIST OF ABBREVIATIONS
CONTENTS
ABSTRACT
INTRODUCTION
1.1 BACKGROUND
1.2 RELEVANCE
1.3 PROJECT UNDERTAKEN
1.4 ORGANIZATION OF PROJECT REPORT
LITERATURE REVIEW
2.1 EXISTING METHODOLOGIES
2.2 PROPOSED METHODOLOGIES
IMPLEMENTATION
5.1 STAGES OF IMPLEMENTATION
5.1.1 TECHNICAL ANALYSIS
5.1.2 SENTIMENTAL ANALYSIS
5.2 IMPLEMENTATION TECHNIQUES
EVALUATION METHODS
6.1 EVALUATION METHODS
6.1.1 EXAMPLE SET OF FEATURES
6.1.2 EXAMPLE SET OF MODELS
6.2 EXAMPLE
CONCLUSIONS
7.1 CONCLUSION
7.2 LIMITATIONS
7.3 FUTURE SCOPE
REFERENCES
APPENDICES
A] BASE PAPER(S)
B] PLAGIARISM REPORT
ABSTRACT
CHAPTER 1
INTRODUCTION
1.1 BACKGROUND
Sentiment analysis is the contextual mining of text to identify and extract subjective information from
source material. It helps a business understand the social sentiment around its brand, product or
service while monitoring online conversations. The concept gained prominence with the rise
of social networking websites such as Twitter, Facebook and Tinder, where people began posting
honest reviews of brands and products on the Internet. Stock prediction is one
field that has grown out of this. Sentiment analysis draws on several domains, such as
natural language processing, text analysis, computational linguistics, and biometrics. Currently,
many traders and brokers use indicators based on mathematical formulae, often without
knowing how they work. Technical indicators are useful when the market or a company's stock repeats a pattern,
but often it does not. Through this project we have attempted to overcome most of the above
flaws and to make the system cheaper and easier to use and understand.
1.2 RELEVANCE
Stock market prediction is a fascinating topic that divides researchers and academics into two
groups: those who believe we can devise mechanisms to predict the market, and those who believe that
the market is efficient, so that whenever new information comes up the market absorbs it by correcting
itself, leaving no room for prediction.
Stock price prediction can be used to gain insight into market behavior over time and to spot trends that
would otherwise go unnoticed. With the increasing computational power of computers,
machine learning is a promising method for this problem. However, public stock datasets
are too limited for a machine learning algorithm to work with on their own, while acquiring more
features may cost thousands of dollars every day.
In our project we will perform sentiment analysis of Twitter data (news or comments) to gain insight
into customer behavior. It will also help us analyze public sentiment. Researchers have reported a
strong correlation between stock returns and individuals' reactions. Valuable data in the domain
of the stock market includes several features, such as time, targeted audience, and brand, but the most
important features for decision makers who are looking to invest in the stock market are time and
brand.
The output of sentiment analysis cannot be the only deciding factor in the prediction process. Technical
indicators are essential to check the actual movement of the market. Market movement often cannot
be predicted because of artificial (pseudo) forces, and the price cannot move beyond the upper and lower circuit limits. Investors
should therefore be aware that the market may retrace at any time.
1.3 PROJECT UNDERTAKEN
In the proposed model of our project, “Stock Prediction using Sentimental and Technical Analysis”,
we extract features not only from the time series data source (price and volume) but also from the
sentiment analysis results mined from Twitter. These features are aggregated, and a multiple kernel learning
framework is used to learn and predict stock movement.
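The aggregation step described here can be sketched as follows. This is a minimal illustration and not the project's implementation: the function names, the per-day dictionary layout, and the fixed kernel weight are all assumptions. A full multiple kernel learning framework would learn the kernel weights from data; the sketch only shows a fixed-weight sum of two RBF kernels, one over the price and volume features and one over the sentiment feature.

```python
import math

def aggregate_features(price_features, sentiment_scores):
    """Join per-day price/volume features with per-day sentiment scores.

    Both arguments are dicts keyed by a date string (assumed layout).
    Only dates present in both sources are kept.
    """
    rows = {}
    for day in sorted(price_features.keys() & sentiment_scores.keys()):
        rows[day] = list(price_features[day]) + [sentiment_scores[day]]
    return rows

def rbf_kernel(x, y, gamma=1.0):
    """Gaussian (RBF) kernel between two equal-length feature vectors."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)

def combined_kernel(x, y, n_price, weight=0.5):
    """Fixed-weight sum of a price kernel and a sentiment kernel.

    The first n_price entries of each vector are price/volume features,
    the rest are sentiment features. The weight is a hypothetical
    constant here; multiple kernel learning would learn it.
    """
    k_price = rbf_kernel(x[:n_price], y[:n_price])
    k_sentiment = rbf_kernel(x[n_price:], y[n_price:])
    return weight * k_price + (1 - weight) * k_sentiment

# Illustrative values only, not real market data.
prices = {"2020-01-01": (0.01, 1.2), "2020-01-02": (-0.02, 0.8)}
sentiment = {"2020-01-02": 0.4, "2020-01-03": -0.1}
aggregated = aggregate_features(prices, sentiment)
```

The resulting kernel values could be passed to any kernel-based classifier that accepts a precomputed kernel matrix.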
1.4 ORGANIZATION OF PROJECT REPORT
In this chapter we have attempted to give a brief overview of the project. Chapter 2 presents
the literature review in detail. Chapter 3 deals with requirement specification and analysis.
Chapter 4 delves into the design aspect of the project. Chapter 5 describes the
implementation of the project. Chapter 6 covers the evaluation methods, and Chapter 7 presents
the conclusions, limitations and future scope.
CHAPTER 2
LITERATURE REVIEW
CHAPTER 3
We aim to create a system for stock prediction based on a set of technical trading
rules and sentiment analysis of data present on social media. The aim of the research is
to check whether it is possible to obtain a set of trading patterns that could be used to take
decisions while trading, such as Buy, Sell, Exit, Stop Loss, etc.
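As an illustration of how a technical trading rule can be turned into Buy and Sell decisions, the sketch below uses a moving average (MA) crossover. The rule, the window lengths, and the function names are assumptions chosen for the example; the report does not specify which rules the system uses.

```python
def moving_average(prices, window):
    """Simple moving average; None until enough observations exist."""
    averages = []
    for i in range(len(prices)):
        if i + 1 < window:
            averages.append(None)
        else:
            averages.append(sum(prices[i + 1 - window:i + 1]) / window)
    return averages

def crossover_signals(prices, short=3, long=5):
    """Emit BUY when the short MA crosses above the long MA,
    SELL when it crosses below, and HOLD otherwise."""
    short_ma = moving_average(prices, short)
    long_ma = moving_average(prices, long)
    signals = []
    for i in range(len(prices)):
        # No decision until both averages exist for today and yesterday.
        if i == 0 or long_ma[i] is None or long_ma[i - 1] is None:
            signals.append("HOLD")
            continue
        previous_gap = short_ma[i - 1] - long_ma[i - 1]
        current_gap = short_ma[i] - long_ma[i]
        if previous_gap <= 0 < current_gap:
            signals.append("BUY")
        elif previous_gap >= 0 > current_gap:
            signals.append("SELL")
        else:
            signals.append("HOLD")
    return signals

# Illustrative closing prices only, not real market data.
closes = [10, 9, 8, 7, 6, 7, 9, 12, 11, 8, 5]
signals = crossover_signals(closes)
```

With these sample prices the rule issues a BUY on the eighth day, when the short average rises above the long one, and a SELL on the last day, when it falls back below.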
3.2 CONCEPT
The sentiments of various investors, investment firms and traders are important
for estimating where the market may go, but this data cannot be trusted completely, as
financially strong operators can set traps to manipulate the
market and create confusion among common retail investors. Hence
technical analysis of stocks is also important for predicting the future performance of a
stock or a company.
3.3 SCOPE
3.4 OBJECTIVES
3.5.1 DATASETS
1. User defined bag of words text file (positive and negative files).
2. Historical Stock Price from Yahoo Finance.
3. Tweets from project members via Twitter.
1. To predict sentiment from the fetched tweets, all words
that are commonly used in the stock market should be added to the
user-defined bag of words.
2. The price of the stock will be predicted only
through technical analysis; sentiment analysis will only give an
idea of the market's sentiment.
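A minimal sketch of scoring a tweet against the user-defined bag of words is given below. The file layout (one word per line), the function names, and the scoring formula are assumptions made for illustration; the sample words are not taken from the project's actual word lists.

```python
import re

def load_words(path):
    """Read a word list file, assuming one word per line."""
    with open(path, encoding="utf-8") as handle:
        return {line.strip().lower() for line in handle if line.strip()}

def tweet_sentiment(tweet, positive_words, negative_words):
    """Score a tweet in [-1, 1] from counts of positive and negative words.

    Returns 0.0 when the tweet contains no word from either list.
    """
    tokens = re.findall(r"[a-z']+", tweet.lower())
    positive_hits = sum(1 for token in tokens if token in positive_words)
    negative_hits = sum(1 for token in tokens if token in negative_words)
    total = positive_hits + negative_hits
    if total == 0:
        return 0.0
    return (positive_hits - negative_hits) / total

# Hypothetical word lists; in practice they would come from load_words().
positive = {"bullish", "rally", "buy", "profit"}
negative = {"bearish", "crash", "sell", "loss"}
score = tweet_sentiment("Bullish rally ahead, time to buy!", positive, negative)
```

Because the score depends entirely on the word lists, stock market terms missing from the bag of words contribute nothing, which is why the lists need to be kept up to date.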
1. Python Libraries.
CHAPTER 4
4.1 ARCHITECTURE
4.2 DFD1
4. AUTOREGRESSION
Autoregression is a time series model that uses
observations from previous time steps as input to a
regression equation to predict the value at the next time step. It is a
simple idea that can produce accurate forecasts on a range of
time series problems, which makes it a useful way to predict the future price of a
company's stock.
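The simplest case is a first-order model, y[t] = c + phi * y[t-1], fitted by ordinary least squares. The sketch below is an illustration under that assumption; the model order and the function names are not specified in the report, and higher orders would use more lagged values as inputs.

```python
def fit_ar1(series):
    """Least-squares fit of y[t] = c + phi * y[t-1].

    Requires at least three observations with some variation;
    a constant series would make the denominator zero.
    """
    lagged = series[:-1]   # y[t-1]
    target = series[1:]    # y[t]
    n = len(lagged)
    mean_lagged = sum(lagged) / n
    mean_target = sum(target) / n
    variance = sum((x - mean_lagged) ** 2 for x in lagged)
    covariance = sum(
        (x - mean_lagged) * (y - mean_target) for x, y in zip(lagged, target)
    )
    phi = covariance / variance
    c = mean_target - phi * mean_lagged
    return c, phi

def predict_next(series, c, phi):
    """One-step-ahead forecast from the last observed value."""
    return c + phi * series[-1]

# Illustrative prices only, not real market data.
closes = [100.0, 102.0, 104.0, 106.0, 108.0]
c, phi = fit_ar1(closes)
forecast = predict_next(closes, c, phi)
```

For this perfectly linear sample series the fit gives c = 2 and phi = 1, so the forecast for the next step is 110.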
CHAPTER 5
IMPLEMENTATION
5. Correlation:
If the previous day's stock price is higher than the current day's stock price, the current
day is marked with a numeric value of 0; otherwise it is marked with a numeric value of 1. This
turns the correlation analysis into a classification problem, which can be solved using
machine learning classification algorithms. The accuracy of the model is expected to increase as it is
trained on more data.
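The labelling rule above can be sketched as follows. The `agreement` helper is a hypothetical addition for illustration: it assumes one sentiment score per labelled day and measures how often positive sentiment coincides with label 1, which is not necessarily how the project computes the correlation.

```python
def label_days(closes):
    """Label each day from the second onward: 0 if the previous close
    is higher than the current close, otherwise 1."""
    labels = []
    for previous, current in zip(closes, closes[1:]):
        labels.append(0 if previous > current else 1)
    return labels

def agreement(sentiment_scores, labels):
    """Fraction of days on which positive sentiment (score > 0)
    coincides with label 1 and non-positive sentiment with label 0."""
    hits = sum(
        1 for score, label in zip(sentiment_scores, labels)
        if (score > 0) == (label == 1)
    )
    return hits / len(labels)

# Illustrative values only, not real market data.
closes = [100, 101, 99, 99, 103]
labels = label_days(closes)
daily_sentiment = [0.5, -0.2, -0.1, 0.7]
match_rate = agreement(daily_sentiment, labels)
```

Note that a day with an unchanged price receives label 1 under this rule, since the previous price is not higher than the current one. The labels can then serve as the target for any binary classifier.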
CHAPTER 6
EVALUATION METHODS
6.2 EXAMPLE
Fig. 12. Example
The stock value of Yes Bank went down when rumors spread in the market.
The bank took some days to file a complaint, after which the stock began to regain its position.
Considering this scenario, we can say that sentiment analysis can yield
excellent profit, but we cannot rely on it alone. Along with sentiment analysis we must
apply technical analysis to minimize the risk factor in trading. A retracement of the market
can be anticipated using technical indicators based on volume, price,
percentage change in buyers and sellers, etc.
CHAPTER 7
CONCLUSIONS
7.1 CONCLUSION
The proposed system will serve beginner traders as a decision support tool and help
them take decisions accordingly. Collective analysis of market news has been
made easier using machine learning algorithms.
7.2 LIMITATIONS
1. Sentiment analysis will work only on English-language text and not on other regional
languages.
2. The input dataset has been taken only from Twitter.
7.3 FUTURE SCOPE
1. We have considered only Twitter data for analyzing people's sentiments, which may
be biased because not all people who trade share their opinions on Twitter.
The study can be extended by incorporating data from various platforms such as
moneycontrol.com, StockTwits, Yahoo Finance, etc.
2. This project is in the initial phase of development, in which the algorithms are being selected
and developed. As we proceed further, the algorithms will be optimized
for more technical and more complex parameters, to produce output values
closer to real-world behavior. We can also extend this project to other markets
such as foreign stock markets, commodities, Forex trading, etc.
APPENDICES
A] BASE PAPER(S)
B] PLAGIARISM REPORT