
Report of the Summer Internship Project

On

DATA ANALYTICS CONSULTING VIRTUAL INTERNSHIP


At
Company Name: Forage
Location: Hyderabad

Duration:

BY
Mr. K BASU NAYAK (2451-20-733-082)

Department of Computer Science and Engineering


Maturi Venkata Subba Rao (MVSR) Engineering College
(An Autonomous Institution)
(Affiliated to Osmania University & Recognized by AICTE)
Nadergul, Saroor Nagar Mandal, Hyderabad – 501 510
2023-24.
Department of Computer Science and Engineering
Maturi Venkata Subba Rao (MVSR) Engineering College (Autonomous)
(Affiliated to Osmania University & Recognized by AICTE)
Nadergul, Saroor Nagar Mandal, Hyderabad-501510
2023-2024

Certificate
This is to certify that the Summer Internship work entitled “DATA ANALYTICS
CONSULTING VIRTUAL INTERNSHIP” is a bonafide work carried out by Mr. K BASU
NAYAK (2451-20-733-082) in partial fulfillment of the requirements for the award of degree
of Bachelor of Engineering in Computer Science and Engineering from Maturi Venkata
Subba Rao (MVSR) Engineering College, affiliated to OSMANIA UNIVERSITY, Hyderabad
during the Academic Year 2022-23 under our guidance and supervision.

INTERNAL EXAMINER EXTERNAL EXAMINER

DECLARATION
This is to certify that the work reported in the present summer internship entitled “DATA
ANALYTICS CONSULTING VIRTUAL INTERNSHIP” is a record of bonafide work done
by us as part of the internship at KPMG Forage. The report is based on the project work
done entirely by us and not copied from any other source.

K Basu Nayak
(2451-20-733-082)

ACKNOWLEDGEMENTS
We would like to express our sincere gratitude and indebtedness to our summer internship
guide, Mr. V Sathish, Asst. Professor, for his valuable suggestions and interest throughout the
course of this summer internship.

We are also thankful to our principal Dr. G Kanaka Durga and Mr. J Prasanna Kumar,
Professor and Head, Department of Computer Science and Engineering, Maturi Venkata
Subba Rao (MVSR) Engineering College, Hyderabad for providing excellent infrastructure
for completing this summer internship successfully as a part of our B.E. Degree (CSE). We
would like to thank our summer internship coordinator Ms. Sirisha Daggubati, Asst.
Professor for their constant monitoring, guidance, and support.

We convey our heartfelt thanks to the lab staff for allowing us to use the required
equipment whenever needed.

Finally, we would like to take this opportunity to thank our families for their support
throughout the work. We sincerely acknowledge and thank all those who directly or
indirectly gave their support in the completion of this work.

K Basu Nayak
(2451-20-733-082)

MISSION

VISION
The vision is to empower students with visual insights into the journey of their data, fostering
a sense of digital literacy and security consciousness. It envisions a visually engaging
representation that not only educates but also sparks interest in networking concepts.


COURSE OBJECTIVES
To prepare the students:

• To give the students experience in solving real-life practical problems with all their
constraints.
• To give an opportunity to integrate different aspects of learning with reference to
real-life problems.
• To enhance the confidence of the students while communicating with industry
engineers, give an opportunity for useful interaction with them, and familiarize
the students with the work culture and ethics of the industry.

COURSE OUTCOMES
On successful completion of this course, the student will be able to:

• Formulate a problem to map the requirements of a real-world scenario.
• Design/develop a small and suitable product in hardware or software.
• Exhibit the skills to use contemporary technologies used by the industry.
• Evaluate the solution against pre-existing alternatives with reference to pre-specified
criteria.
• Demonstrate an understanding of the work culture and ethics of the industry.
• Display effective technical communication skills, both orally and in writing, in the form
of a report.

ABSTRACT
The objective of the internship is to facilitate reflection on experiences obtained in the
internship and to enhance understanding of academic material by application in the internship
setting. Internships will provide students the opportunity to test their interest in a particular
career before permanent commitments are made.

Internship students will develop skills and techniques directly applicable to their careers.
Internship programs will enhance the advancement possibilities of graduates.

The internship also develops skills in analyzing data sets, applying traditional techniques and
processing methods, and using a variety of algorithms to process data quickly and efficiently.


TABLE OF CONTENTS

Chapter 1: Introduction
    1.1: Big Data
    1.2: Data Analytics
    1.3: Data Science
Chapter 2: Problem Statement
Chapter 3: Motivation
Chapter 4: Methodological Details
    4.1: Task 1
    4.2: Task 2
    4.3: Task 3
Chapter 5: Result
    5.1: Result and analysis
    5.2: Output
Chapter 6: Conclusion
    6.1: Conclusion
Acknowledgement
Worksheet
References
CHAPTER 1: INTRODUCTION
1.1: BIG DATA
1.2: DATA ANALYTICS

CHAPTER 2: PROBLEM STATEMENT

CHAPTER 3: METHODOLOGICAL DETAILS
3.1: TASK 1
3.2: TASK 2
3.3: TASK 3

CHAPTER 4: RESULT
4.1: RESULT AND ANALYSIS
4.2: OUTPUT

CHAPTER 5: CONCLUSION

REFERENCES

LIST OF FIGURES

Fig. No.    Figure Name

Fig. 4.2    Before Analysis

Fig. 4.2    After Analysis



CHAPTER I

1. INTRODUCTION
1.1 Big Data
What is Data?

Data is the quantities, characters, or symbols on which operations are performed by a computer,
which may be stored and transmitted in the form of electrical signals and recorded on
magnetic, optical, or mechanical recording media.

What is Big Data?

Big Data is a collection of data that is huge in volume and keeps growing exponentially with time.
Its size and complexity are so large that none of the traditional data management tools
can store or process it efficiently. In short, Big Data is still data, but at a very large scale.

What is an Example of Big Data?

The New York Stock Exchange is one example of Big Data: it generates about one terabyte
of new trade data per day.

Types Of Big Data

The types of Big Data are:

1. Structured

2. Unstructured

3. Semi-structured
1.2 Data Analytics
Data analytics is the process of analysing raw data to find trends and answer questions, a
definition that captures the broad scope of the field. It includes many techniques
with many different goals. The data analytics process has some components that can help
a variety of initiatives. By combining these components, a successful data analytics
initiative will provide a clear picture of where you are, where you have been and where
you should go.

Types of Data Analytics

Descriptive analytics helps answer questions about what happened. These techniques
summarize large datasets to describe outcomes to stakeholders. By developing key
performance indicators (KPIs), these strategies can help track successes or failures.
Metrics such as return on investment (ROI) are used in many industries. Specialized
metrics are developed to track performance in specific industries. This process requires
the collection of relevant data, processing of the data, data analysis and data visualization.
This process provides essential insight into past performance.
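
As an illustration of a descriptive KPI, the sketch below computes return on investment (ROI) for a few campaigns and summarizes it. The campaign names and figures are hypothetical and are not taken from the internship data; the formula used is the common definition ROI = (revenue - cost) / cost.

```python
from statistics import mean

# Hypothetical campaign records; names and figures are illustrative only.
campaigns = [
    {"name": "A", "cost": 1000.0, "revenue": 1500.0},
    {"name": "B", "cost": 2000.0, "revenue": 1800.0},
    {"name": "C", "cost": 500.0, "revenue": 900.0},
]


def roi(cost, revenue):
    """Return on investment as a fraction: (revenue - cost) / cost."""
    return (revenue - cost) / cost


# Descriptive summary: ROI per campaign, the average ROI, and the best campaign.
roi_by_campaign = {c["name"]: roi(c["cost"], c["revenue"]) for c in campaigns}
average_roi = mean(roi_by_campaign.values())
best_campaign = max(roi_by_campaign, key=roi_by_campaign.get)

for name, value in roi_by_campaign.items():
    print(f"Campaign {name}: ROI = {value:.0%}")
print(f"Average ROI = {average_roi:.0%}, best campaign = {best_campaign}")
```

A summary like this only describes what happened; explaining why campaign B lost money would be the job of diagnostic analytics.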

Diagnostic analytics helps answer questions about why things happened. These
techniques supplement more basic descriptive analytics. They take the findings from
descriptive analytics and dig deeper to find the cause. The performance indicators are
further investigated to discover why they got better or worse. This generally occurs in
three steps:

• Identify anomalies in the data. These may be unexpected changes in a metric or a
particular market.
• Collect data that is related to these anomalies.
• Use statistical techniques to find relationships and trends that explain these
anomalies.

Predictive analytics helps answer questions about what will happen in the future.
These techniques use historical data to identify trends and determine if they are likely
to recur. Predictive analytical tools provide valuable insight into what may happen in the
future, and they draw on a variety of statistical and machine learning techniques,
such as neural networks, decision trees, and regression.

Prescriptive analytics helps answer questions about what should be done. By using
insights from predictive analytics, data-driven decisions can be made. This allows
businesses to make informed decisions in the face of uncertainty. Prescriptive analytics
techniques rely on machine learning strategies that can find patterns in large datasets.
By analysing past decisions and events, the likelihood of different outcomes can be estimated.
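
The first diagnostic step described in this section, identifying anomalies in a metric, can be sketched with a simple z-score rule. This is only one possible technique, chosen here for illustration; the threshold of 2 standard deviations and the monthly sales figures are assumptions, not values from the internship datasets.

```python
from statistics import mean, stdev


def find_anomalies(values, threshold=2.0):
    """Return the indices of values more than `threshold` sample standard
    deviations away from the mean (a basic z-score rule)."""
    mu = mean(values)
    sigma = stdev(values)
    if sigma == 0:
        return []  # all values identical, nothing stands out
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]


# Hypothetical monthly sales metric with one unusual month.
monthly_sales = [100, 102, 98, 101, 99, 100, 160, 97]
anomalies = find_anomalies(monthly_sales)
print("Anomalous month indices:", anomalies)
```

Once an anomalous period is flagged, the remaining two steps (collecting related data and explaining the change statistically) would focus on that period.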

CHAPTER II

Problem Statement:
The client provided KPMG with 3 datasets:

Customer Demographic

Customer Addresses

Transactional data in the past 3 months

The goal is to identify and correct data quality issues in these datasets, such as inaccurate
or incomplete records, duplicate values, and null values.

CHAPTER III
Task 1: Data Quality Assessment

As per voicemail, please find the 3 datasets attached from Sprocket Central Pty Ltd:

Customer Demographic

Customer Addresses

Transaction data in the past three months


Can you please review the data quality to ensure that it is ready for our analysis in phase two?
Remember to take note of any assumptions or issues we need to go back to the client on, as
well as recommendations going forward to mitigate current data quality concerns.

I’ve also attached a data quality framework as a guideline. Let me know if you have any
questions.

Draft an email to the client identifying the data quality issues and strategies to mitigate
these issues. Refer to ‘Data Quality Framework Table’ and resources below for
criteria and dimensions which have been considered.

Start with programs like Excel, Google Sheets, Tableau, or Power BI. Feel free to use
Python, R, MATLAB, and other data analytics tools that you
know.
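
A minimal sketch of such a data quality review in Python is shown below. It checks four of the dimensions named in this report (completeness, duplicate values, accuracy, and consistency) on a handful of made-up customer records. The field names, the allowed state codes, and the 1900 cut-off for birth years are assumptions for illustration, not the schema of the client datasets.

```python
from collections import Counter

# Hypothetical customer records; field names and values are illustrative only.
customers = [
    {"customer_id": 1, "first_name": "Ana", "dob": "1985-04-12", "state": "NSW"},
    {"customer_id": 2, "first_name": "Ben", "dob": None, "state": "New South Wales"},
    {"customer_id": 2, "first_name": "Ben", "dob": None, "state": "New South Wales"},
    {"customer_id": 3, "first_name": "", "dob": "1843-12-21", "state": "VIC"},
]


def missing_counts(rows):
    """Completeness: count None or empty-string values per field."""
    counts = Counter()
    for row in rows:
        for field, value in row.items():
            if value is None or value == "":
                counts[field] += 1
    return dict(counts)


def duplicate_ids(rows, key="customer_id"):
    """Uniqueness: return key values that appear more than once."""
    seen = Counter(row[key] for row in rows)
    return sorted(k for k, n in seen.items() if n > 1)


def implausible_birth_years(rows, earliest=1900):
    """Accuracy: return ids whose birth year is before `earliest`."""
    return [
        row["customer_id"]
        for row in rows
        if row.get("dob") and int(row["dob"][:4]) < earliest
    ]


def nonstandard_values(rows, field, allowed):
    """Consistency: return values of `field` outside the allowed set."""
    return sorted({row[field] for row in rows if row[field] not in allowed})


print("Missing values:", missing_counts(customers))
print("Duplicate ids:", duplicate_ids(customers))
print("Implausible birth years:", implausible_birth_years(customers))
print("Nonstandard states:", nonstandard_values(customers, "state", {"NSW", "VIC", "QLD"}))
```

Each finding from checks like these would be listed in the email to the client together with a suggested mitigation, for example standardizing state names to abbreviations.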

Task 2:

Sprocket Central Pty Ltd has given us a new list of 1000 potential customers with
their demographics and attributes. However, these customers do not have prior
transaction history with the organization.

The marketing team at Sprocket Central Pty Ltd is sure that, if correctly analysed, the
data would reveal useful customer insights which could help optimize resource
allocation for targeted marketing and hence improve performance by focusing on
high-value customers.

For context, Sprocket Central Pty Ltd is a long-standing KPMG client
that specializes in high-quality bikes and accessible cycling accessories for riders.
Their marketing team is looking to boost business by analysing their existing
customer dataset to determine customer trends and behaviour.

Using the existing 3 datasets (customer demographic, customer address and
transactions) as a labelled dataset, please recommend which of these 1000 new
customers should be targeted to drive the most value for the organization.
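
One simple way to use the existing customers as a labelled dataset is sketched below: compute the average profit per customer segment from past transactions, then rank the new customers by the average of the segment they belong to. This is a baseline under stated assumptions and not necessarily the model built during the internship; the `wealth_segment` and `profit` fields and all figures are hypothetical.

```python
from collections import defaultdict

# Hypothetical labelled data: existing customers with profit from past transactions.
existing = [
    {"wealth_segment": "Mass Customer", "profit": 400.0},
    {"wealth_segment": "Mass Customer", "profit": 600.0},
    {"wealth_segment": "High Net Worth", "profit": 900.0},
    {"wealth_segment": "High Net Worth", "profit": 1100.0},
    {"wealth_segment": "Affluent Customer", "profit": 700.0},
]

# Hypothetical new customers with no transaction history.
new_customers = [
    {"name": "N1", "wealth_segment": "Mass Customer"},
    {"name": "N2", "wealth_segment": "High Net Worth"},
    {"name": "N3", "wealth_segment": "Affluent Customer"},
]


def average_profit_by_segment(rows, segment_field="wealth_segment"):
    """Average profit of existing customers within each segment."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        totals[row[segment_field]] += row["profit"]
        counts[row[segment_field]] += 1
    return {segment: totals[segment] / counts[segment] for segment in totals}


def rank_new_customers(new_rows, segment_avg, segment_field="wealth_segment"):
    """Score each new customer by the segment average and sort from highest
    to lowest; unseen segments fall back to the mean of the segment averages."""
    fallback = sum(segment_avg.values()) / len(segment_avg)
    scored = [
        (row["name"], segment_avg.get(row[segment_field], fallback))
        for row in new_rows
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


segment_avg = average_profit_by_segment(existing)
ranking = rank_new_customers(new_customers, segment_avg)
for name, expected_profit in ranking:
    print(f"{name}: expected profit {expected_profit:.2f}")
```

In practice more attributes (age, industry, location, past purchase counts) would be combined, but the idea stays the same: learn what high-value customers look like from the labelled data and apply it to the new list.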

Task 3:

The client is happy with the analysis plan and would like us to proceed. After
building the model we need to present our results back to the client.

Visualizations such as interactive dashboards often help us highlight key findings and
convey our ideas in a more succinct manner. A list of customers or an algorithm won’t cut
it with the client; we need to support our results with the use of visualizations.

Please develop a dashboard that we can present to the client at our next meeting.
Display your data summary and results of the analysis in a dashboard.


It is important to keep in mind the business context when presenting your findings:
What are the trends in the underlying data?
Which customer segment has the highest customer value?
What do you propose should be Sprocket Central Pty Ltd’s marketing and growth strategy?
What additional external datasets may be useful to obtain greater insights into customer
preferences and propensity to purchase the products?
Specifically, your presentation should specify who Sprocket Central Pty Ltd’s marketing team
should be targeting out of the new 1000 customer list as well as the broader market segment
to reach out to.

CHAPTER IV
Result and analysis:
We have successfully analysed the datasets given by Sprocket Central Pty
Ltd. The final step was interpreting the results of the data analysis. This part is essential
because it is how a business gains actual value from the preceding steps. Interpreting
data analysis results should validate why the analysis was conducted, even if it is not
100 percent conclusive.

OUTPUT:
Before Analysis

After Analysis
CONCLUSION:

Hence, we have successfully completed the assessment of data quality
and completeness in preparation for analysis, then targeted high-value
customers based on customer demographics and attributes, and used visualizations to
present insights.
