on
DATA ANALYTICS VIRTUAL INTERNSHIP
Submitted in partial fulfillment of the requirements for the award of the degree of
BACHELOR OF TECHNOLOGY
By
1|Page
KKR & KSR INSTITUTE OF TECHNOLOGY AND
SCIENCES
Department of Electronics and Communication Engineering
BONAFIDE CERTIFICATE
EXTERNAL EXAMINER
KKR & KSR INSTITUTE OF TECHNOLOGY AND SCIENCES
STUDENT DECLARATION
ABSTRACT
Data Analytics is a package containing many tools that are all integral to
performing analytics on data sets. This includes importing data, cleaning and
preparing data, and tools and applications for statistics and data mining.
Data Analytics is a new driver of world economic and societal change. The
world's data collection is reaching a tipping point for major technological changes
that can bring new ways of making decisions and of managing our health, cities,
finance, and education. While data complexity is increasing, including data's
volume, variety, velocity, and veracity, the real impact hinges on our ability to
uncover the value in the data through Data Analytics technologies. Data Analytics
poses a grand challenge in the design of highly scalable algorithms and systems
that integrate the data and uncover large hidden values from datasets that are
diverse, complex, and of massive scale. Potential breakthroughs include new
algorithms, methodologies, systems, and applications in Data Analytics that
discover useful and hidden knowledge from data efficiently and effectively.
CERTIFICATE OF INTERN
ACKNOWLEDGEMENT
I take this opportunity to express my deepest gratitude and appreciation
to all those who made this internship work easier with words of
encouragement, motivation, discipline, and faith, who offered different places
to look to expand my ideas, and who helped me towards the successful
completion of this internship work.
APARNA OSIPILLI
Table of Contents

Each entry shows the module content with its stipulated date and completion date.

Module 1 (01/05/2023 to 09/05/2023): Introduction
    Big Data, Big Data Pipeline, Big Data Tools, Big Data Collection,
    Big Data Storage, Big Data Ingestion, Big Data Processing and
    Analysis, Big Data Visualization
Module 2 (10/05/2023 to 19/05/2023): Lab 1 Introduction
    Store Data in Amazon S3
Module 3 (20/05/2023 to 29/05/2023): Lab 2 Introduction
    Query Data in Amazon Athena
Module 4 (30/05/2023 to 08/06/2023): Lab 3 Introduction
    Query Data in Amazon S3 with Amazon Athena and AWS Glue
Module 5 (09/06/2023 to 15/06/2023): Lab 4 Introduction
    Analyze Data with Amazon Redshift
Module 6 (16/06/2023 to 22/06/2023): Lab 5 Introduction
    Analyze Data with Amazon SageMaker, Jupyter Notebooks, and Bokeh
Module 7 (23/06/2023 to 30/06/2023): Lab 6 Introduction
    Automate Loading Data with the AWS Data Pipeline
Module 8 (01/07/2023 to 07/07/2023): Lab 7 Introduction
    Analyze Streaming Data with Amazon Kinesis Firehose, Amazon
    Elasticsearch, and Kibana
Module 9 (08/07/2023 to 14/07/2023): Lab 8 Introduction
    Analyze IoT Data with AWS IoT Analytics
About AICTE
History
The beginning of formal technical education in India can be dated back to the mid-
19th century. Major policy initiatives in the pre-independence period included the
appointment of the Indian Universities Commission in 1902, issue of the Indian
Education Policy Resolution in 1904, and the Governor General’s policy statement of
1913 stressing the importance of technical education, the establishment of IISc in
Bangalore, Institute for Sugar, Textile & Leather Technology in Kanpur, N.C.E. in
Bengal in 1905, and industrial schools in several provinces.
Initial Set-up
All India Council for Technical Education (AICTE) was set up in November
1945 as a national-level apex advisory body to conduct a survey on the facilities
available for technical education and to promote development in the country in a
coordinated and integrated manner. To ensure this, as stipulated in the
National Policy on Education (1986), AICTE was vested with:
Statutory authority for planning, formulation, and maintenance of norms
& standards
Quality assurance through accreditation
Funding in priority areas, monitoring, and evaluation
Maintaining parity of certification & awards
The management of technical education in the country
AICTE was subsequently given statutory status in the context of the
proliferation of technical institutions, maintenance of standards, and
other related matters.
Organizations are realizing that work these days is more than just a way to
earn one's bread. It is a commitment, an awareness of others' expectations,
and a sense of ownership. To learn how a candidate might perform in various
circumstances, they recruit interns and offer PPOs (Pre-Placement Offers) to
the chosen few who have fulfilled all of their requirements.
To find a quicker and easier way through such situations, many
companies and students have found AICTE to be of great help. Through its
internship portal, AICTE has provided them with the perfect opportunity to
emerge as winners in these trying times. The website provides the perfect
platform for students to put forth their skills and desires and for companies
to place their intern demand. It takes just 15 seconds to create an opportunity,
which is auto-matched and auto-posted to Google, Bing, Glassdoor, LinkedIn,
and similar platforms. The selected interns' profiles and availability are
validated by their respective colleges before they join or acknowledge the
offer. Shortlisting the right resume with respect to skills, experience, and
location takes just seconds. Only authentic and verified companies can appear
on the portal.
Additionally, there are multiple modes of communication to connect with interns.
Both parties report being satisfied in terms of time management, quality,
security against fraud, and genuineness.
All you need to do is register at the portal (https://internship.aicte-india.org/),
fill in all the details, send in your application or demand, and then sit back
and watch your vision take off.
About EduSkills
We want to completely disrupt the teaching methodologies and ICT-based
education system in India. We work closely with all the important stakeholders
in the ecosystem, namely students, faculty, education institutions, and
Central/State Governments, by bringing them together through our skilling
interventions. Our three-pronged engine targets social and business impact by
working holistically on Education, Employment, and Entrepreneurship.
Plan of Internship program
My internship ran from May to July. Each week I worked on a different
module. In the first month, I learned about the introduction to big data,
storing data in Amazon S3, and querying data in Amazon Athena. In the second
month, I worked on creating an AWS Glue crawler, analyzing data with Amazon
Redshift, and analyzing data with Amazon SageMaker, Jupyter Notebooks, and
Bokeh. In the third month, I automated loading data with the AWS Data Pipeline,
analyzed streaming data with Amazon Kinesis Firehose, Amazon Elasticsearch,
and Kibana, and analyzed IoT data with AWS IoT Analytics. Finally, in July,
I completed my internship.
Our professors also assisted us in completing the labs, which made data
analytics much easier. Because of the faculty's supervision, we were able to
finish the second portion of the internship with ease, and it was done by the
month of July.
Training Program
Amazon S3 overview:
Amazon S3 storage classes:
Amazon S3 offers a range of object-level storage classes that are designed for
different use cases:
Amazon S3 Standard
Amazon S3 Intelligent-Tiering
Amazon S3 Standard-Infrequent Access (Amazon S3 Standard-IA)
Amazon S3 One Zone-Infrequent Access (Amazon S3 One Zone-IA)
Amazon S3 Glacier
Amazon S3 Glacier Deep Archive
Amazon S3 is used throughout the course, so you must know how to create
Amazon S3 buckets and load data for the subsequent labs. In this lab, you
use the AWS Management Console to create an Amazon S3 bucket, add an IAM
user to a group that has full access to the Amazon S3 service, upload files
to Amazon S3, and run simple queries on the data in Amazon S3.
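The console steps above can also be sketched programmatically. The following is a minimal Python sketch using boto3 (the AWS SDK for Python); the bucket and file names are hypothetical, and the AWS calls assume configured credentials:

```python
def s3_uri(bucket, key):
    """Build the s3:// URI that later labs use to reference an object."""
    return f"s3://{bucket}/{key}"

def create_bucket_and_upload(bucket, local_path, key):
    """Create an S3 bucket and upload one local file into it."""
    import boto3  # imported lazily; requires configured AWS credentials
    s3 = boto3.client("s3")
    s3.create_bucket(Bucket=bucket)  # bucket names must be globally unique
    s3.upload_file(local_path, bucket, key)
    return s3_uri(bucket, key)
```

For example, create_bucket_and_upload("my-analytics-lab-bucket", "data.csv", "raw/data.csv") would return the URI s3://my-analytics-lab-bucket/raw/data.csv for use in later labs.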
Big Data Collection:
There are various ways of collecting big data, and AWS has its own services
for collecting it. Some of them are:
Amazon EC2: a computing service that is good for hosting web applications.
Agents can be installed on Amazon EC2 instances to send clickstream data, web
server access logs, error logs, and so on.
AWS IoT: a suite of IoT services that provide device software, control, and
data services. These services enable you to connect securely to IoT devices
and transfer data at any scale.
Big Data Storage:
There are many storage options available in AWS.
Big Data Ingestion:
Data can be collected into AWS services in various ways.
Big Data Processing and Analysis:
There are various managed and scalable services available to make the
analysis of data easy.
Data Visualization:
Visualization is a crucial part of big data analysis. AWS provides several
tools for data visualization.
Lab1: Store data in Amazon S3
o Creating Buckets in S3
Task – 3: Uploading an object into S3 bucket
o Uploading Compressed files into S3 bucket
Amazon Athena is an interactive query service that makes it easy to
analyze data in Amazon S3 using standard SQL. Athena is serverless, so there
is no infrastructure to manage, and you pay only for the queries that you run.
Athena is easy to use. Simply point to your data in Amazon S3, define the
schema, and start querying using standard SQL. Most results are delivered
within seconds. With Athena, there’s no need for complex ETL jobs to prepare
your data for analysis. This makes it easy for anyone with SQL skills to
quickly analyze large-scale datasets.
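As a rough illustration of "point, define, query", the sketch below submits a standard-SQL query with boto3; the database, table, and output location are hypothetical, and the Athena call assumes configured AWS credentials:

```python
def simple_select(database, table, limit=10):
    """Compose the kind of standard-SQL query this lab runs in Athena."""
    return f"SELECT * FROM {database}.{table} LIMIT {limit};"

def run_athena_query(query, output_s3):
    """Submit the query; Athena writes the results to the given S3 location."""
    import boto3  # imported lazily; requires configured AWS credentials
    athena = boto3.client("athena")
    resp = athena.start_query_execution(
        QueryString=query,
        ResultConfiguration={"OutputLocation": output_s3},
    )
    return resp["QueryExecutionId"]
```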
o Amazon Athena Query Editor
The stored data can be optimized by using views or by partitioning the data.
By partitioning the data, we can decrease both the query time and the amount
of data that must be scanned.
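One common way to partition data in Amazon S3 is a Hive-style key layout, which Athena can use to prune partitions. A small sketch (the prefix and file names are hypothetical):

```python
from datetime import date

def partitioned_key(prefix, day, filename):
    """Lay out an object key as year=/month=/day= partitions."""
    return (f"{prefix}/year={day.year}/month={day.month:02d}/"
            f"day={day.day:02d}/{filename}")

# A query that filters on the partition columns, such as
#   SELECT * FROM logs WHERE year = '2023' AND month = '05'
# then scans only the objects under the matching prefixes.
```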
o Creating the views
Lab 3: Creating an AWS Glue crawler:
Lab 3 introduces you to AWS Glue and builds on the previous lab by showing
how to use AWS Glue to infer the schema from the data. This lab includes:
AWS Glue interface
o Creating a Crawler
o Selecting a Query result location
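Creating a crawler through the console, as in the screenshots above, corresponds roughly to the boto3 calls below; the crawler name, role ARN, database, and S3 path are hypothetical, and the AWS calls assume configured credentials:

```python
def crawler_config(name, role_arn, database, s3_path):
    """Assemble the arguments for creating a crawler over one S3 location."""
    return {
        "Name": name,
        "Role": role_arn,
        "DatabaseName": database,
        "Targets": {"S3Targets": [{"Path": s3_path}]},
    }

def create_and_run_crawler(cfg):
    """Create the crawler, then run it once to infer the schema."""
    import boto3  # imported lazily; requires configured AWS credentials
    glue = boto3.client("glue")
    glue.create_crawler(**cfg)
    glue.start_crawler(Name=cfg["Name"])
```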
Lab 4: Analyze Data with Amazon Redshift
Creating and Configuring the Amazon redshift cluster
o Cluster Permissions
o Cluster Dashboard
o Cluster successfully created
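A typical step after the cluster is created is loading data from Amazon S3 with a COPY statement. The helper below only composes the SQL; the table, path, and IAM role are hypothetical, and running it against the cluster would require a database connection or the Redshift Data API:

```python
def copy_from_s3(table, s3_path, iam_role):
    """Compose a Redshift COPY statement for CSV data stored in S3."""
    return (f"COPY {table} FROM '{s3_path}' "
            f"IAM_ROLE '{iam_role}' CSV IGNOREHEADER 1;")
```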
Lab 5: Analyze Data with Amazon SageMaker, Jupyter Notebooks, and Bokeh
Lab 5 introduces you to Amazon SageMaker, Jupyter notebooks, and
the Bokeh Python package. Amazon SageMaker is a fully managed machine
learning service. Though machine learning is not a part of this course, this
lab uses Amazon SageMaker as a way of hosting a Jupyter notebook for the
learners to work with. The main purpose of this lab is to provide you with an
opportunity to visualize data and practice using visualizations to support a
business decision.
This lab shows you how to:
Visualize data with the open-source Bokeh Python package.
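As one possible shape for such a visualization, the sketch below reshapes rows into columns and renders a bar chart; the data and file name are hypothetical, and the plotting function assumes the bokeh package is installed:

```python
def to_columns(rows):
    """Reshape (label, value) rows into the parallel lists Bokeh plots from."""
    labels = [label for label, _ in rows]
    values = [value for _, value in rows]
    return labels, values

def bar_chart(rows, filename="chart.html"):
    """Render a categorical bar chart to a standalone HTML file."""
    from bokeh.plotting import figure, output_file, save  # requires bokeh
    labels, values = to_columns(rows)
    output_file(filename)
    p = figure(x_range=labels, title="Sample visualization")
    p.vbar(x=labels, top=values, width=0.8)
    save(p)
```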
o Obtain the AWS Identity and Access Management (IAM) role for
accessing SageMaker
o Open Jupyter
o Creating data from database
Lab 6 introduces you to the AWS Data Pipeline. The AWS Data
Pipeline is a web service you can use to migrate and transform data. The main
purpose of this lab is to provide learners with an opportunity to automate
moving data and to understand how this service fits into the larger context of
data analysis.
Export data from Amazon Redshift to a Jupyter notebook.
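An AWS Data Pipeline definition is a list of objects, each carrying key/value fields. The fragment below is an illustrative sketch of that shape only; the ids, names, and field values are hypothetical and do not form a complete working definition:

```python
def copy_pipeline_objects(input_node, output_node):
    """Sketch of pipeline objects: a default schedule plus one copy activity."""
    return [
        {
            "id": "Default",
            "name": "Default",
            "fields": [{"key": "scheduleType", "stringValue": "ondemand"}],
        },
        {
            "id": "CopyData",
            "name": "CopyData",
            "fields": [
                {"key": "type", "stringValue": "CopyActivity"},
                {"key": "input", "refValue": input_node},
                {"key": "output", "refValue": output_node},
            ],
        },
    ]
```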
Lab 7: Analyze Streaming Data with Amazon Kinesis Firehose, Amazon
Elasticsearch, and Kibana
Access Amazon Kinesis Data Firehose and Amazon Elasticsearch Service
(Amazon ES) in the AWS Management Console.
Create a Kinesis Data Firehose delivery stream.
Integrate a Kinesis Data Firehose delivery stream with Amazon ES.
Build visualizations with Kibana.
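The delivery-stream step can be sketched as follows; events are serialized as newline-delimited JSON so downstream consumers can split the records. The stream name and event shape are hypothetical, and the put call assumes configured AWS credentials:

```python
import json

def encode_record(event):
    """Firehose delivers raw bytes; newline-delimited JSON keeps records separable."""
    return (json.dumps(event) + "\n").encode("utf-8")

def send_event(stream_name, event):
    """Put one record onto a Kinesis Data Firehose delivery stream."""
    import boto3  # imported lazily; requires configured AWS credentials
    firehose = boto3.client("firehose")
    firehose.put_record(
        DeliveryStreamName=stream_name,
        Record={"Data": encode_record(event)},
    )
```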
o Kibana User Interface
Sample Webpage created by AWS
o Visualizing insights of the website using pie chart
Lab 8: Analyze IoT Data with AWS IoT Analytics
Lab 8 introduces you to AWS IoT Analytics and AWS IoT Core. AWS
IoT Analytics automates the steps required for analyzing IoT data. You can
filter, transform, and enrich the data before storing it in a time-series data
store. AWS IoT Core provides connectivity between IoT devices and AWS
services. IoT Core is fully integrated with IoT Analytics.
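The lab's simulated device traffic can be sketched along these lines; the topic, device id, and payload fields are hypothetical, and the publish call assumes configured AWS credentials and an IoT endpoint:

```python
import json
import random
import time

def sensor_payload(device_id):
    """Build one simulated sensor reading of the kind sent to AWS IoT Core."""
    return {
        "device": device_id,
        "temperature": round(random.uniform(18.0, 30.0), 1),
        "timestamp": int(time.time()),
    }

def publish_reading(topic, payload):
    """Publish the reading over the AWS IoT data plane."""
    import boto3  # imported lazily; requires configured AWS credentials
    iot = boto3.client("iot-data")
    iot.publish(topic=topic, qos=0, payload=json.dumps(payload))
```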
o Creating an AWS IoT Analytics channel
o Creating an AWS Core IoT Rule
o Creating a Dataset
o Querying the Dataset
Work Samples
Capstone Project:
Attribution Percentage of the Staff
RESULT
Critical Analysis
Conclusion
As a result, I would like to conclude that the internship played a critical
part in expanding not only my theoretical but also my practical knowledge.
Here, we set up the components of an AWS IoT Analytics implementation and then
used a Python script to simulate loading data into AWS IoT Core. After the
data has been loaded into AWS IoT Analytics, you perform queries to analyze it.
You can create a crawler by starting in the Athena console and then using the
AWS Glue console in an integrated way. When you create the crawler, you
specify a data location in Amazon S3 to crawl.
By pursuing this internship, I was able to get a good understanding of how
data analytics works and how important it is in expanding and improving a
business with the help of AWS services, which focus mainly on problems rather
than on infrastructure or other hardware components. With the Big Data
foundations, I was able to gain a basic knowledge of data analytics. As data
analytics is a popular technology, it is both beneficial and promising for the
future. Because of it, we are predicting many trends and patterns for further
estimation, and this platform is user-friendly and simple to use.
In conclusion, it was a challenging experience, and I sense that the internship
was valuable in developing my Big Data/Data Science skills. I am positive that
the knowledge and experience that I gained from AWS in Data Analytics and my
internship will help me in establishing a successful career ahead.