
BITS & BYTES

rahulsharma.rs891@gmail.com

JULY 2023 EDITION


AZSD4G257I

This file is meant for personal use by rahulsharma.rs891@gmail.com only. 1


Sharing or publishing the contents in part or full is liable for legal action.
WHAT’S

INSIDE?
Leadership Speaks 03

Great Learning Journey 05

Discover 07

That’s A Good Question! 09

What’s New? 12

Industry Trends 14

Data Science at Work 16

AI at Work 18

Mentor Speaks 19

Crossword Solution 20


LEADERSHIP SPEAKS

MAYANK BAJPAI
Associate General Manager, Sales, Great Learning

Q1. What advice would you give someone going into a leadership position for the first time?

When an individual contributor moves to a leadership position, the reasons are mostly obvious: they are good at their work and have shown initiative and collaboration in the past. However, the role of a leader varies largely based on the organization, the nature of the work and the people that are a part of the team. The advice that I would like to give such an individual is to follow 3 important leadership principles:

Lead by example: A new leader needs to be accepted by the “tribe”, and in order to earn the trust and confidence of your team, you will have to get your hands dirty and maneuver tough tasks/steep deadlines with grace, so your team accepts you for the value you add. This does not mean you should do that in isolation to “prove” yourself to the team; rather, collaborate with them and add insights/provide support to the team wherever needed.

Know your team: Just names, education, past experience and hobbies are not enough. Knowing your team means understanding the personality types and how people act in a certain situation, what makes a person productive and what the short and long term career goals of your team members are. This will ensure that you are more aligned with your team. A team that is being led to a goal that aligns with their personal and organizational goals is the most productive one.

Cast a vision: A clear and concise vision statement is always a driver for high-performance teams. The role of a leader here is to break the “bigger future” into smaller and achievable milestones. What is important is to celebrate the milestones and keep heading toward the vision.

Q2. What’s the best book/movie/series you’ve read/watched this year? Do share your takeaways?

This would be the movie “12 Mighty Orphans”, which is based on the book “Twelve Mighty Orphans: The Inspiring True Story of the Mighty Mites Who Ruled Texas Football”. The movie is about the football team of a Fort Worth orphanage who, during the Great Depression, went from playing without shoes, or even a football, to playing for the Texas state championships. The coach Rusty Russell, who was also an orphan, was the architect of their success and took them from having no skills and not believing in their worth to inspiring the entire nation with their winning streak.


The takeaways from this movie are:

(i) A coach/leader who can relate to the team and understand their insecurities and strengths well is always going to produce results.
(ii) As a leader, you will not always get the best people from a skill perspective, but teams can make mammoth tasks achievable, provided there is good coaching and discipline.

Q3. How do you help a new employee understand the culture of your organization?

The culture of an organization is how you and the person sitting next to you behave in situations and carry yourselves with others in the organization. In my opinion, it is definitely good to give new employees a detailed understanding of organizational values, mission and vision. However, what is even better is to ensure that all employees in the team who have been with the organization for long enough uphold these values. We humans are wired to understand what is socially acceptable in a certain group by “reading the room”, and if all current employees uphold the culture well, all new ones will imbibe the same.

Q4. How do you respond to criticism?

Criticism, when given in private and with rationale, is the single most important thing that can drive you and your career forward. I am open to criticism but I do not accept it right away. I need a day to think critically, understand the core reason for the problem and build a solution/action plan. If I do not agree with the criticism, I prefer a detailed dialogue with the critic so as to either agree with it or explain how the situation/problem is a consequence of a misunderstanding. This helps me improve my capabilities through iterations.


GREAT LEARNING JOURNEY

SUSHREE
PGP-DSBA ALUMNUS

I've been actively involved in the IT industry for the past 11 years, specializing in software development for projects involving implementation and operations. My area of experience includes technical ability in SQL development on a variety of systems, including Teradata, MSSQL, and Oracle.

Prior to enrolling in this program, my role involved serving as a Technology Lead at Infosys. The primary hurdle I encountered in my professional journey before joining this program was related to my job role itself. The projects I was involved in at that time did not provide the desired enhancement to my learning or skills. It was during this phase that I stumbled upon the PGP-DSBA Program offered by Great Learning which presented an opportunity for transformative learning.

At first, I had apprehensions about learning online. However, Data Science and Business Analytics are fascinating fields that I was curious about. The online medium seemed to be a more feasible option, where I could join the mentoring sessions without any hassle of traveling.

Navigating through video lectures, study materials and assignment preparations could pose challenges for any individual immersed in their professional responsibilities. However, these sessions at Great Learning were an excellent medium to understand the concept, clarify any doubts, and have an idea of the working methodologies and solutions.

Above all, the mentors extended their support at every point in time. They showed me the path to move ahead in the program, guiding me at every stage, whether it was clarifying any concepts, or assignment formats, or sharing industry experiences. Thanks to their support, I have learned new skill sets, and now I have a better idea of how Data Science works in the real world.

The assignments and capstone project gave me a glimpse of how work progresses in a Data Science project.

Lastly, my advice to people who are just starting to learn analytics would be that analytics is an interesting field to start a career in. Embrace the vastness of the course and continuously explore and learn, both in and outside of work. Analytics is an exciting field with a wealth of opportunities for growth and development.


GREAT LEARNING JOURNEY

LAKSHMAN PRASAD
PGP-AIML ALUMNUS

I'm an enthusiastic automotive engineer with six years of experience working in the research and development department of a well-known automaker in Chennai. I have experience with benchmarking, managing vehicle attributes, and optimizing vehicle architecture.

Dealing with several datasets that may be used for benchmarking, target setting, etc. has been a constant part of my working life. Data mining, statistical analysis, data visualization, data presentation, and other specialized data analytics technologies were all necessary for this. Although I have a degree in mechanical engineering, I felt that I needed some improvement in the aforementioned areas. I learned about Artificial Intelligence and the field of Machine Learning at that time, and I thought that knowledge would be beneficial to my job. I looked for a decent program that would enable me to increase my knowledge of AIML as part of my quest to learn more about it.

Through a Google search, I learned about the Great Learning PGP-AIML Online Course when looking for a solid blended course on AIML. I thought the platform was special because it used a methodical approach to teaching AIML topics. I enrolled in the course because I was pleased by its structure and low cost. I was initially wary of learning through an online medium because I thought there wouldn't be as much face-to-face interaction and collaboration between the students. However, the Great Learning team stressed the value of group projects and mentoring sessions inside each module, which would enhance student collaboration. All my questions were answered before I enrolled.

As previously said, mentored learning sessions are crucial because they foster a stronger bond between the learners and their mentors. Additionally, guided learning sessions helped me comprehend how the principles are applicable in real-world situations. The knowledge that mentors shared with me was really useful, and my capstone project benefited from their expertise in the field. They were gracious enough to provide me with any course or project related help.

The quality of mentored learning sessions is excellent. The sessions were highly useful in clearing all our doubts. The mentors were kind enough to hold additional sessions, if required, to give the learners a better understanding of the concepts. Even after the course is over, the mentors' documentation and links continue to be helpful. Their assistance was essential to completing the capstone project flawlessly.

My advice to those who are just getting started with mastering analytics is to keep your foundation strong. If you are unable to grasp the concepts right away, do numerous assignments that will aid your understanding. Learning after working hours initially seems challenging. Try to tackle a task that is a regular part of your office work and begin applying the new strategies. Learning will be simpler as a result.


DISCOVER

Data Analyst vs Data Scientist

Introduction

In the fast-paced realm of data-driven decision-making, the roles of Data Analysts and Data Scientists have gained significant prominence. While the terms “Data Analyst” and “Data Scientist” are often used interchangeably, there are key distinctions that set them apart. In this blog, we will delve into the definitions, backgrounds, educational requirements, job functions, skills, differences, and similarities between Data Analysts and Data Scientists. Additionally, we will explore their respective roles and responsibilities, as well as the average salaries they command in India, the US, the UK and Canada.

Definition

Data Analyst: A Data Analyst is a professional who gathers, organizes, and interprets complex sets of data to uncover meaningful insights, trends, and patterns. They primarily focus on transforming raw data into actionable information, allowing businesses to make informed decisions.


Label Encoding in Python – 2023

Label encoding is a technique used in Machine Learning and Data Analysis to convert categorical variables into numerical format. It is particularly useful when working with algorithms that require numerical input, as most Machine Learning models can only operate on numerical data. In this explanation, we’ll explore how label encoding works and how to implement it in Python.

Let’s consider a simple example of a dataset containing information about different types of fruits, where the “Fruit” column has categorical values such as “Apple,” “Orange,” and “Banana.” Label encoding assigns a unique numerical label to each distinct category, transforming the categorical data into a numerical representation.

To perform label encoding in Python, we can use the Scikit-Learn library, which provides a range of preprocessing utilities, including the LabelEncoder class.
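The fruit example can be sketched with Scikit-Learn's LabelEncoder roughly as follows. The list of fruit values is made up for illustration and is not data from the article.

```python
from sklearn.preprocessing import LabelEncoder

# Hypothetical values for the "Fruit" column described in the text.
fruits = ["Apple", "Orange", "Banana", "Apple", "Banana"]

encoder = LabelEncoder()
# fit_transform learns the distinct categories and maps each value to an integer.
encoded = encoder.fit_transform(fruits)

# LabelEncoder sorts the categories, so labels follow alphabetical order.
print(encoder.classes_.tolist())  # ['Apple', 'Banana', 'Orange']
print(encoded.tolist())           # [0, 2, 1, 0, 1]

# inverse_transform maps the integer labels back to the original strings.
decoded = encoder.inverse_transform(encoded)
print(decoded.tolist())           # ['Apple', 'Orange', 'Banana', 'Apple', 'Banana']
```

Note that the integer labels imply an ordering (Apple < Banana < Orange) that the categories do not really have. Scikit-Learn documents LabelEncoder as a tool for target labels; for input features, OneHotEncoder or OrdinalEncoder is often the safer choice.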


THAT’S A GOOD QUESTION!

How do regression models embark on a journey to find the perfect fit?

Gaurav Das, a Great Learning mentor says,

Regression models embark on a journey to find the perfect fit by searching for the best-fitting relationship between the independent (predictor) variables and the dependent (target) variable. The goal is to create a model that accurately represents the underlying data-generating process. This journey typically involves several steps:

1. Data Collection and Preparation: The first step is to collect and organize the dataset, ensuring that it is relevant and clean.

2. Choosing the Right Model: Select an appropriate regression model based on the nature of the data and the problem. Common choices include linear regression, polynomial regression, logistic regression (for classification tasks), or more complex models like decision trees, support vector machines, or neural networks.

3. Feature Selection: Identify and select the most relevant independent variables (features) that are likely to have a significant impact on the dependent variable.

4. Splitting the Data: Divide the dataset into training, validation, and test sets. The training set is used to train the model, the validation set helps tune hyperparameters, and the test set assesses the model's generalization performance.

5. Training the Model: Use the training data to fit the regression model. The model adjusts its parameters to minimize the difference between its predictions and the actual target values.


6. Model Evaluation: We can assess the model's performance on the validation set using appropriate evaluation metrics such as Mean Squared Error (MSE), Mean Absolute Error (MAE), R-squared, or cross-validated scores.

7. Hyperparameter Tuning: Fine-tune the model by adjusting hyperparameters (e.g., learning rate, regularization strength, tree depth) to optimize its performance.

8. Model Validation: After hyperparameter tuning, evaluate the model's performance on the test set to ensure it generalizes well to unseen data.

9. Interpretation and Visualization: We can analyze the model's coefficients or feature importances (can be used for feature selection) to understand the relationships between variables.

10. Deployment and Monitoring (for Production Models): If the model meets performance criteria, it can be deployed for predictions in a real-world environment.

The "perfect fit" for a regression model does not necessarily mean achieving a perfect prediction of the target variable, as this is often impossible due to noise and inherent variability in data. Instead, the goal is to build a model that provides the best possible representation of the underlying relationships within the data, balancing bias and variance to achieve good predictive performance on unseen data.
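Steps 4 to 8 of this workflow can be sketched as below. This is only an illustration under stated assumptions: the data is synthetic (generated with make_regression), Ridge regression stands in for "the model", and the 60/20/20 split and the list of candidate regularization strengths are arbitrary choices, not recommendations from the mentor.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a collected and cleaned dataset (assumption).
X, y = make_regression(n_samples=500, n_features=8, noise=10.0, random_state=0)

# Step 4: split into training (60%), validation (20%) and test (20%) sets.
X_train, X_tmp, y_train, y_tmp = train_test_split(X, y, test_size=0.4, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(X_tmp, y_tmp, test_size=0.5, random_state=0)

# Steps 5 to 7: fit on the training set and tune the regularization
# strength (alpha) by comparing MSE on the validation set.
best_alpha, best_val_mse = None, float("inf")
for alpha in [0.01, 0.1, 1.0, 10.0, 100.0]:
    candidate = Ridge(alpha=alpha).fit(X_train, y_train)
    val_mse = mean_squared_error(y_val, candidate.predict(X_val))
    if val_mse < best_val_mse:
        best_alpha, best_val_mse = alpha, val_mse

# Step 8: evaluate the chosen configuration once on the untouched test set.
final_model = Ridge(alpha=best_alpha).fit(X_train, y_train)
y_pred = final_model.predict(X_test)
test_mse = mean_squared_error(y_test, y_pred)
test_mae = mean_absolute_error(y_test, y_pred)
test_r2 = r2_score(y_test, y_pred)

print(f"best alpha={best_alpha}, test MSE={test_mse:.2f}, "
      f"test MAE={test_mae:.2f}, test R-squared={test_r2:.3f}")
```

The test set is used only once, after tuning, so that the reported metrics are not biased by the hyperparameter search.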


Deepak Gupta, a Great Learning mentor also shares his insights

What Is Regression?

In the realm of statistical modeling and Machine Learning, Linear Regression stands as one of the fundamental techniques that has stood the test of time. With its simplicity and interpretability, it serves as an excellent starting point for both beginners and seasoned data scientists to understand the principles of predictive modeling.

Understanding Linear Regression

At its core, Linear Regression is a statistical method used to model the relationship between a dependent variable and one or more independent variables by fitting a linear equation to observed data. The aim is to find the best-fitting line that minimizes the difference between predicted and actual values, thus enabling predictions and insights into the relationship between variables. In a simple linear regression with one independent variable, the relationship between the variables can be expressed as:

y = mx + b

where:
• y is the dependent variable
• x is the independent variable
• m is the slope of the line
• b is the y-intercept

Building the Best Regression Model:

Creating a robust Linear Regression model involves several key steps, each of which contributes to the model's accuracy and effectiveness.

1. Data Collection and Pre-processing: Collecting relevant and high-quality data is the foundation of any successful modeling effort. Ensure that your data is clean, complete, and representative of the problem you're addressing.

2. Feature Selection: Selecting the right set of independent variables (features) for your model is crucial. Irrelevant or highly correlated features can lead to overfitting or poor generalization.

3. Train-Test Split: Divide your dataset into two subsets, a training set used for model training and a testing set used for model evaluation. This ensures that you can assess your model's performance on unseen data, giving you an estimate of how well it will generalize.


4. Model Training: Using the training data, the model calculates the optimal slope (m) and y-intercept (b) values to fit the data. This is often done through mathematical optimization techniques like the Least Squares method.

5. Model Evaluation: Once the model is trained, it's essential to evaluate its performance on the testing data. Common evaluation metrics for regression models include Mean Squared Error (MSE), Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), and R² score. These metrics measure the differences between predicted and actual values.

6. Model Improvement: Based on the evaluation results, you might need to iterate and improve your model. This could involve adjusting model hyperparameters, trying different algorithms, or exploring more advanced techniques if necessary.

7. Interpretation: One of the strengths of Linear Regression is its interpretability. The coefficients (m values) associated with each feature can provide insights into the relationships between variables. A positive coefficient indicates that the predicted value rises as the feature increases (with the other features held fixed), while a negative coefficient indicates that it falls.
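Steps 3 to 5 and 7 can be illustrated for the one-variable case y = mx + b with a short sketch. The data here is synthetic, generated with an assumed true slope of 3 and intercept of 2, and the closed-form least squares formulas are checked against Scikit-Learn's LinearRegression; none of these numbers come from the article.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic data (assumption): y = 3x + 2 plus Gaussian noise.
rng = np.random.default_rng(42)
x = rng.uniform(0, 10, size=200)
y = 3.0 * x + 2.0 + rng.normal(0, 1.0, size=200)

# Step 3: hold out 25% of the data for testing.
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.25, random_state=42)

# Step 4: closed-form least squares for simple linear regression.
#   m = sum((x - x_mean) * (y - y_mean)) / sum((x - x_mean)^2)
#   b = y_mean - m * x_mean
x_mean, y_mean = x_train.mean(), y_train.mean()
m = np.sum((x_train - x_mean) * (y_train - y_mean)) / np.sum((x_train - x_mean) ** 2)
b = y_mean - m * x_mean

# Cross-check against Scikit-Learn, which solves the same least squares problem.
reference = LinearRegression().fit(x_train.reshape(-1, 1), y_train)

# Step 5: evaluate on the held-out test set.
y_pred = m * x_test + b
mse = mean_squared_error(y_test, y_pred)
rmse = np.sqrt(mse)
mae = mean_absolute_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)

# Step 7: the sign and size of m describe how y changes per unit of x.
print(f"m={m:.3f}, b={b:.3f}, MSE={mse:.3f}, RMSE={rmse:.3f}, MAE={mae:.3f}, R2={r2:.3f}")
```

With several features, the same idea generalizes to one coefficient per feature, which is what LinearRegression exposes through its coef_ attribute.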

Conclusion:
Linear Regression remains a cornerstone of
Statistical Modeling and Machine Learning
due to its simplicity, interpretability, and
effectiveness. By following the steps
outlined above, you can build a robust Linear
Regression model that accurately predicts
outcomes and provides valuable insights
into the relationships between variables.
Remember that while Linear Regression is
a powerful tool, it's important to explore
more complex techniques for non-linear
relationships or when dealing with high-
dimensional data.


WHAT’S NEW?

Google's RT-2 Model: A Leap Towards Human-like AI and Robotic Learning

While neural networks draw inspiration from human brain function, they fall short of true emulation. However, Google's newly unveiled RT-2 Model, crafted by DeepMind, could herald a significant stride toward human-like AI. The model's innovation lies in its potential to foster seamless communication between humans and robots, leveraging a blend of web and robotics data to distill generalized instructions for robotic control.

At its core, the RT-2 Model is a Vision-Language-Action (VLA) model, crafted using transformer-based techniques and trained on a hybrid of text and image data gleaned from the web. This model's distinctive feature is its capacity to instruct robots. It gleans web data, conceptual insights, and general knowledge to imbue robots with informed behavior.

The groundbreaking aspect of RT-2's training process revolves around its capacity to bridge the gap between abstract concepts and robot actions. The model is equipped to understand complex commands such as "pick up the bag about to fall off the table" or "move the banana to the sum of two plus one." Notably, these commands require the translation of web-based knowledge into robotic actions that involve previously unseen scenarios.

RT-2 effectively teaches robots to comprehend and communicate in the language of human operations, addressing a long-standing challenge in robotics. The intricacies of physical variables have posed obstacles to complex tasks in the realm of robotics, setting them apart from their more chatbot-like counterparts.

This innovation has the potential to supplant conventional robotic training methods, which demanded extensive data points related to the immediate environment. The resource-intensive and time-consuming nature of traditional training methods could be revolutionized by RT-2's capability to transfer knowledge and concepts directly to robotic devices.

As RT-2 facilitates a better grasp of the environment, it offers a glimpse into the future of adaptable robotic technology. Advancements in visual modeling further augment this progress, underscoring the profound impact of AI on the evolution of robotic capabilities.

Revolutionizing Intelligence: Australian Team Awarded Grant to Merge AI with Human Brain Cells

In a groundbreaking development, an Australian team hailing from Monash University and Cortical Labs has secured a substantial grant worth $600,000 from Australia's Office of National Intelligence. The primary objective of this endeavor is to integrate Artificial Intelligence (AI) with human brain cells, marking an intersection of advanced technology and biological sciences. Notably, this team was behind the creation of DishBrain, brain cells capable of playing the classic game Pong.

Associate Professor Adeel Razi, affiliated with the university's Turner Institute for Brain and Mental Health, elucidated that the grant was awarded to address the pressing need for a new breed of machine intelligence - one that “learns throughout its lifetime.”


INDUSTRY TRENDS

Jyant Mahara, a Data Science expert, has shared some insights on Cracking Data Science Interviews Successfully: Unveiling Proficiency Through Live Projects and Impactful Insights

"Securing success in Data Science interviews often requires more than theoretical knowledge. Live projects, showcasing hands-on skills and impactful insights, can be your strongest ally. In this article, we'll explore the role of live projects in elevating your Data Science interview performance, complete with easy-to-understand examples.

The Power of Live Projects:
Live projects serve as a bridge between theory and application. They demonstrate your ability to tackle real-world problems using data-driven approaches. Here's how they can elevate your interview performance:

1. Concrete Demonstration of Skills: Imagine you're applying for a Data Analyst role. Instead of just discussing SQL querying techniques, imagine presenting a project where you extracted actionable insights from a sales database. The tangible project not only validates your skills but also gives interviewers a glimpse of your potential impact.

2. Storytelling with Data: Let's say you're aiming for a Machine Learning engineer position. Instead of merely explaining algorithms, picture sharing a project where you built a model to predict customer churn. Presenting the problem, your solution, and how it affected the business creates a compelling narrative that interviewers remember.

3. Hands-on Experience Matters: A candidate discussing clustering algorithms is one thing, but someone showcasing a clustering project that categorized customer segments for targeted marketing leaves a lasting impression. Hands-on experience signals that you're not just theoretical, but also capable of applying concepts.

Examples in Action:
Predictive Analytics for E-commerce:
Imagine applying for a Data Scientist role.
Present a project where you used historical
transaction data to predict customer
preferences. Discuss how this could aid in
inventory management and personalizing user
experiences.

Healthcare Optimization:
For a healthcare Data Analyst position, talk
about a project involving patient data. Show
how you used statistical analysis to identify
trends, aiding in resource allocation and
efficient patient care.

Smart Home Energy Management:
Seeking an IoT-focused role? Present a project
where you employed sensors to monitor energy
usage in a smart home. Discuss how AI-driven
insights could lead to energy-efficient living.

Conclusion:
Live projects are your backstage pass to
showcase your skills, problem-solving abilities,
and practical insights. From predicting stock
market trends to optimizing healthcare
operations, they elevate your interview game
by making Data Science concepts tangible.
Embrace the power of live projects, and watch
your interview success soar.”


DATA SCIENCE AT WORK

Dr. D B MOHANTY
PGP-DSBA ALUMNUS

Dibyendu works as a psychiatrist and is currently employed in a government hospital in Pune as a Senior Resident. He is involved in clinical and academic responsibilities. In his clinical role, he has to see patients in the Outpatient Department (OPD) and the Inpatient Department (IPD). His duty concerning academic activities involves teaching Junior Residents, assisting them with their thesis work, and participating in regular academic-oriented sessions.

He feels there is a plethora of large data in the medical field which has accumulated over the years, and systematically analyzing it can give important insights into the causes and treatment of various ailments. Such analyses are already in use in the medical field, but spurious analysis can produce questionable results which might downgrade the scientific literature. So, an in-depth understanding of Data Science can prevent such mistakes. Also, understanding and critiquing medical journals is important to keep oneself informed and protected from bad practices. As he is planning to start his own practice in future, Business Analytics seemed like a way ahead in forging the path and setting the foundation stone.

It was challenging to understand the technical terms in medical journals and interpret the data. Also, creating a meaningful study design and analyzing data with respect to the thesis of the Junior Residents always seemed like a daunting task. The studies being conducted were of poor quality and the analysis of the data was not up to the mark, as the data collected was analyzed by statisticians who had no domain knowledge of the study. Hence, publishing such studies in good journals was nearly impossible.

The statistical methods and the logic behind them helped him choose the tools as per the data and the study to get relevant and meaningful insights. For example, one of his Junior Residents had collected data for her thesis but she and the statistician were having problems analyzing the data. So, Dibyendu applied the knowledge gained during the course and conducted data pre-processing and sequentially did the univariate, bivariate and multivariate analysis. He was able to apply ANOVA and linear regression.


In one analysis, there was one dependent continuous variable and more than two independent categorical variables. Hence, ANOVA was applied. In the other analysis, there was one dependent continuous variable and multiple independent continuous variables with multicollinearity among them. Hence, linear regression seemed a valid choice. He used SPSS, Python and Tableau for analysis. The study suggested that the severity of tobacco dependence score was significantly affected by childhood adverse events by more than 15%. He recommended that the faculty include statistics as an important part of the curriculum and that it be taught in a more structured fashion, just like the course. He was able to change their reluctant behavior and bleak view of the statistical world and was able to encourage them to pursue it with confidence, thus improving the statistical efficiency by 7% and reducing the man hours by 5%.

Now he has more confidence in himself and he feels he can contribute more to society and his field than he used to feel before doing this course.
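For readers unfamiliar with ANOVA, the kind of test mentioned above can be sketched with SciPy. This is a simplified, hypothetical illustration: it uses a single categorical factor with three made-up groups and randomly generated scores, not the study's data or its actual design.

```python
import numpy as np
from scipy import stats

# Synthetic scores for three hypothetical groups (assumption, not study data).
rng = np.random.default_rng(0)
group_low = rng.normal(loc=4.0, scale=1.0, size=40)
group_medium = rng.normal(loc=5.0, scale=1.0, size=40)
group_high = rng.normal(loc=6.5, scale=1.0, size=40)

# One-way ANOVA: tests whether the group means of the continuous
# outcome differ more than chance alone would explain.
f_stat, p_value = stats.f_oneway(group_low, group_medium, group_high)

print(f"F = {f_stat:.2f}, p = {p_value:.4g}")
if p_value < 0.05:
    print("At least one group mean differs at the 5% significance level.")
```

A significant result says only that the means are not all equal; a post-hoc comparison would be needed to tell which groups differ.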


AI AT WORK

POORNIMAA NAGESSWARE
PGP-AIML ALUMNUS

Poornimaa is based out of Chennai. She lives
with her spouse and two daughters. She
started her career as an application developer,
worked as a Data Engineer, and has now
moved into a Data Scientist role in Analytics.
She has worked in multiple domains,
including retail, airline, investment, and
wholesale banking, and has a total of 13.5
years of experience.

One of the in-house products at her place of
employment received claim paperwork in image
format from several clients in a variety of
templates, which raised concerns. Finding and
classifying the templates, performing OCR,
extracting attributes, and automating the
entire pipeline were the challenges. The main
goal was to fully automate the workflow and
raise the accuracy to 90% or above.

Her immediate goal was to assign each page to
the correct document class after receiving
the client's image documents, which were
jumbled up and split into independent
pages. Other difficulties included the fact
that the image documents she got in real
time each day were from various clients and
had different layouts for the classes, as well
as skewed photographs, watermarked pages,
grayscale backgrounds, and overlapping fonts.
OCR accuracy, data integrity, unbalanced
classes, incorrectly labeled classes, and data
extraction all posed significant issues in this
case.

For OCR, she used Tesseract; for image
enhancement, OpenCV; for classification,
LinearSVC; and for cross-validation, stratified
k-fold. Exploratory data analysis improved
her understanding of the data. She personally
investigated the labels that were
significantly misclassified in her initial
attempt and was able to grasp the
categorization criteria.

First, in order to tag orphan pages to their
parent, the dispersed images were combined
into a single document. By accurately
classifying every orphan document, she raised
the accuracy from 20% to 81%.

Second, she validated the model's output to
check for mistakes. Using test data from
every fold, she reduced the number of
misclassified classes. The remaining errors
were caused by noise, and one of the classes
had inappropriate annotations. During
pre-processing, she corrected the annotations
and eliminated the noise. The precision
increased to 90%.

The progression of her POC to the following
step, data identification and extraction, had
been hampered by a shift in accuracy. Instead
of buying a vendor-based tool for data
identification and capture, an effort was made
to offer an AIML solution. This allowed the
company to cut costs and gave the division
greater prospects in the AIML market.
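
Stratified k-fold is named in this story as the cross-validation scheme, and unbalanced classes as one of the difficulties. As a rough sketch of why the two go together, the pure-Python function below deals each class's samples evenly across the folds so that every fold keeps roughly the same class mix. The labels and the name `stratified_folds` are hypothetical, and no shuffling is done; in practice a library class such as scikit-learn's `StratifiedKFold` would normally be used instead.

```python
from collections import Counter, defaultdict


def stratified_folds(labels, k):
    """Assign each sample index to one of k folds, keeping class ratios similar."""
    # Group sample indices by their class label
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        # Deal this class's samples round-robin across the folds
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


# Hypothetical, unbalanced page labels (not the real project data)
labels = ["invoice"] * 6 + ["claim_form"] * 3
folds = stratified_folds(labels, k=3)

# Count how many pages of each class landed in each fold
fold_counts = [Counter(labels[i] for i in fold) for fold in folds]
for fold_number, counts in enumerate(fold_counts):
    print(fold_number, dict(counts))
```

With these toy labels, each of the three folds receives two "invoice" pages and one "claim_form" page, so the rare class is represented in every test fold.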


MENTOR SPEAKS

In this edition, we will hear about our
mentor Jainesh Garg’s journey of
becoming a Data Science industry expert.

Q1. Describe your current role.

In my current position, I translate data into
analytical insights, concepts into programs,
and outcomes into managerial actions to
give important business decisions a
competitive edge. I have completed
end-to-end projects in the fields of risk and
strategy management, and have worked
with a variety of clients while installing
cutting-edge tools and cognitive intelligence
technologies that allow industries to make
data-driven decisions. I possess cross-
industry expertise in the following areas:
life sciences, health care, product lifecycle
management (PLM), supply chain, consumer
goods, ETL, data warehousing tools/
technologies, implementation of end-to-end
analytical solutions, and visual storytelling.

Q2. How did you decide you want to be a
Data Scientist?

I became interested in this role as I always
wanted to explore both the business and
technology sides of the world. I wanted
to learn more about it and gain more
knowledge.

Q3. What preparations did you do to
achieve your goal?

I researched and understood the job role
and responsibilities. Brushing up on basics,
fundamentals, scenario-based questions, and
programming languages, along with multiple
self-evaluations, was the key preparation
technique.

Q4. How did you get your first job? Describe
your journey (the difficulties that you
faced and how you overcame them).

Getting the first job involved conducting
extensive research into current industry
challenges and potential technological
solutions. Ultimately, roles that blend
technology and business acumen require
adeptly discussing how technology can
empower various business aspects.

Q5. Advice to budding Data Scientists and
Business Analysts?

My advice would be that learners should
focus on the fundamentals of Statistics and
Data Manipulation and Analysis, with
expertise in the areas of Machine Learning
and Model Deployment. For business-facing
roles, learners should focus on structured
thinking and overall storytelling skills.
Curiosity towards learning and exploring new
things will keep things rolling.


CROSSWORD SOLUTION
[Crossword grid showing the filled-in answers]

ACROSS

3. A method used to allocate customers into
different segments based on their
similarities - CLUSTER ANALYSIS

6. A visual representation of data using bars of
varying heights or lengths - BARCHART

8. A Machine Learning algorithm used for both
classification and regression tasks, known for its
simplicity and interpretability - DECISIONTREE

9. A graphical representation of the distribution
of a dataset - HISTOGRAM

DOWN

1. A statistical measure that describes the
spread of a dataset - VARIANCE

2. A method used to fill in missing values in a
dataset - IMPUTATION

4. A widely used framework for big data
processing and analytics - APACHESPARK

5. A graphical representation of data using
points or dots - SCATTERPLOT

7. A statistical technique used to find
relationships between variables - CORRELATION
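
Two of the terms in this solution, imputation and variance, can be tried out in a few lines. The sketch below assumes mean imputation (only one of several ways to fill missing values) and uses made-up toy numbers.

```python
from statistics import mean, pvariance

# Toy dataset with two missing values (illustration only)
raw = [4.0, None, 6.0, 8.0, None]

# Imputation: fill each missing value with the mean of the observed values
observed = [x for x in raw if x is not None]
fill_value = mean(observed)
imputed = [fill_value if x is None else x for x in raw]

# Variance: the average squared distance from the mean (population form)
spread = pvariance(imputed)

print(imputed)
print(spread)
```

Filling gaps with the mean pulls values toward the centre, so the variance after imputation is smaller than the variance of the observed values alone.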


LEARNING BIRD CHIRPS


Those people who develop the ability to
continuously acquire new and better forms of
knowledge that they can apply to their work and
to their lives will be the movers and shakers in
our society for the indefinite future.

- Brian Tracy

THE EDITORIAL TEAM



Mugdha Deepala, Anamika Singhal, Shikha S

CREATIVE TEAM

Vijaya Patel
(Design)