
NOV 2019 | ISSUE 23

DATA
STREAK
A monthly digest on all things Data

2nd Anniversary Edition


WHAT’S
INSIDE
How To Be Transition-Ready? 2

Article Barn 6

Mergers And Acquisitions 9

In A Nutshell 10

6 Machine Learning Applications Enhancing The Healthcare Sector 2019 12

Did You Know? 16

New Innovations 18

Kaggle Problem Statement 19

Why Our Learners Love Data Streak 21

Top Data Scientists To Follow 22

DataShots 23

Data Ticklers 25

The Great Hack Review – Save Your Data Before It Backfires! 26

Career Transitions 28

Guesstimate Solutions 29
01 How To Be Transition-Ready?
If there’s one thing I’ve learned as a Student Mentor for the Data
Science batch, it is this: getting feedback on your data science
job application or interview is virtually impossible. We, at upGrad,
have been constantly taking reviews from recruiters and have
also been modifying our career content accordingly.

Sagar Patil
Student Mentor
Data Science Vertical
upGrad - Student Success Team

There are several good reasons why companies are careful about
giving feedback. There’s the fact that many people don’t respond
well to negative feedback, and some get downright combative.

Imagine the time it would take a recruiter to send a thoughtful
feedback email to you, and to the dozens (or hundreds) of other
applicants they must also consider. And there’s the fact that, at
the end of the day, recruiters get absolutely nothing out of
providing feedback, no matter how helpful it may be to the
candidate; it is simply time consuming for them.
The tragic end result of all this is a huge number of confused,
directionless aspiring data scientists.

But here’s the good news!

There aren’t actually that many reasons why applicants get turned
down from data science roles, and there’s a lot that you can do to
cover those bases. The technical and non-technical skills that
companies want, but most applicants don’t have, are what this
article is all about.

From the recent collated feedback we’ve received from some of the
recruiters we worked with, many candidates have been rejected right
after the tech round. We noticed that a lot of learners are lacking
SQL knowledge.

Reason 1: Python For Data Science

A vast majority of data science roles are Python-based. A few tools
distinguish learners from job-ready pros when it comes to Python for
DS. They’re great differentiators if you want to build outstanding
projects that get noticed by employers. To force yourself to improve
your data science theory and implementation game, use these in a few
projects, if you haven’t already:

Data Exploration:

You should have pandas functions like .corr(), scatter_matrix(),
.hist() and .plot.bar() at the tip of your tongue. You should always
be looking for opportunities to visualize your data using PCA or
t-SNE, using sklearn's PCA and TSNE classes.
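As a rough illustration of the exploration steps above, here is a minimal sketch on synthetic data. The column names and the data itself are made up for the example; with a real dataset you would load your own DataFrame instead.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a real dataset (hypothetical column names).
rng = np.random.default_rng(0)
height = rng.normal(170, 10, size=200)
df = pd.DataFrame({
    "height_cm": height,
    "weight_kg": 0.9 * height - 85 + rng.normal(0, 5, size=200),
    "noise": rng.normal(0, 1, size=200),
})

# Pairwise correlations: a quick look at which features move together.
corr = df.corr()
print(corr.round(2))

# Standardize, then project onto two principal components so the
# data can be drawn as a 2-D scatter plot.
scaled = StandardScaler().fit_transform(df)
pca = PCA(n_components=2)
components = pca.fit_transform(scaled)
print(components.shape)
print(pca.explained_variance_ratio_)
```

If matplotlib is installed, df.hist() and pandas.plotting.scatter_matrix(df) give the matching plots, and sklearn.manifold.TSNE can be swapped in for PCA using the same fit_transform call.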

Feature Selection:

90% of the time, your dataset will have way more features than you
need (which leads to excessive training time, and a heightened risk
of overfitting). Get familiar with basic filter methods (look up
scikit-learn’s VarianceThreshold and SelectKBest classes), and more
sophisticated model-based feature selection methods (look up
SelectFromModel).
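The three selectors named above can be tried side by side. This is only a sketch on a synthetic classification problem; the choice of k=5 and of a random forest as the model are assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import (
    SelectFromModel,
    SelectKBest,
    VarianceThreshold,
    f_classif,
)

# Synthetic data: 20 features, only a handful of them informative.
X, y = make_classification(
    n_samples=300, n_features=20, n_informative=5, n_redundant=2,
    random_state=0,
)

# Filter 1: drop zero-variance (constant) features.
X_var = VarianceThreshold().fit_transform(X)

# Filter 2: keep the 5 features with the highest ANOVA F-score.
X_kbest = SelectKBest(score_func=f_classif, k=5).fit_transform(X_var, y)

# Model-based: keep features whose random-forest importance is at
# least the mean importance (the default threshold for this estimator).
selector = SelectFromModel(
    RandomForestClassifier(n_estimators=100, random_state=0)
)
X_model = selector.fit_transform(X, y)

print(X.shape, X_var.shape, X_kbest.shape, X_model.shape)
```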

Hyperparameter Search For Model Optimization:

You should know what GridSearchCV does and how it works. Likewise
for RandomizedSearchCV. To really stand out, try experimenting with
skopt's BayesSearchCV to learn how you can apply Bayesian
optimization to your hyperparameter search.

Guesstimate 1
A birthday cake has to be equally divided into 8 pieces in exactly
3 cuts. Determine the way to make this division possible.
(Walmart Labs)

Use sklearn's Pipeline class to wrap your preprocessing, feature
selection and modeling steps together. Discomfort with pipelines is
a huge tell that a data scientist needs to get more familiar with
their modeling toolkit.
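To show how a pipeline and a hyperparameter search fit together, here is a minimal sketch. The dataset is synthetic, and the step names ("scale", "select", "model") and grid values are arbitrary choices for the example.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(
    n_samples=300, n_features=20, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Preprocessing, feature selection and the model live in one object,
# so every cross-validation fold refits all three on its own data.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("select", SelectKBest(score_func=f_classif)),
    ("model", LogisticRegression(max_iter=1000)),
])

# Parameters are addressed as <step name>__<parameter name>.
param_grid = {"select__k": [5, 10, 20], "model__C": [0.1, 1.0, 10.0]}

search = GridSearchCV(pipe, param_grid, cv=5)
search.fit(X_train, y_train)

print(search.best_params_)
test_accuracy = search.score(X_test, y_test)
print(test_accuracy)
```

RandomizedSearchCV can be dropped in the same way; it takes param_distributions and n_iter instead of an exhaustive grid.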

Reason 2: SQL Knowledge
Having Python knowledge is great but having SQL knowledge will
help you crack the interview. Most companies would hire a
candidate with multiple skill sets, considering they know you can
work on both tools. So get familiar with SQL as well.

Reason 3: Career Content
What is the Maximum Likelihood Estimator (MLE)? – Do you know
the answer to this question? If not, you probably haven’t gone
through our Career Content. We have added ‘Additional
Resources/Interview Question’ in every course to provide you with
answers to the interview questions that are likely to be asked by
recruiters. 77% of the learners who successfully made a career
transition said that this content has helped them transition during
the program. Completing the weekly deadlines should be your
priority; but you may have time during your commute to go
through our career content.

Reason 4: Probability And Statistics
Probability and statistics don’t always come up explicitly
on-the-job, but they’re foundational to all data science work. As a
result, it’s easy to bomb an interview if you haven’t read up on:

Bayes’s Theorem:
It’s a foundational pillar of probability theory, and it comes up
all the time in interviews. You should practice doing some basic
Bayes Theorem whiteboarding problems.

Basic Probability
You should be able to answer questions like these.

Model Evaluation
When it comes to classification problems, for example, most people
default to using model accuracy as their metric, which is usually a
terrible choice. Get comfortable with sklearn's precision_score,
recall_score, f1_score, and roc_auc_score functions, and the theory
behind them. For regression tasks, understanding why you would use
mean_squared_error rather than mean_absolute_error (and vice-versa)
is also crucial.
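On the model evaluation point, a small worked example may help show why accuracy alone can mislead. The labels and scores below are invented for illustration: a "lazy" classifier that always predicts the majority class still reaches 80% accuracy on imbalanced data while finding no positives at all.

```python
from sklearn.metrics import (
    accuracy_score,
    f1_score,
    mean_absolute_error,
    mean_squared_error,
    precision_score,
    recall_score,
    roc_auc_score,
)

# Imbalanced toy labels: 8 negatives, 2 positives (made-up data).
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_lazy = [0] * 10                        # always predicts "negative"
y_pred = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]  # one hit, one miss, one false alarm
y_score = [0.1, 0.2, 0.1, 0.3, 0.2, 0.1, 0.4, 0.6, 0.9, 0.45]

print(accuracy_score(y_true, y_lazy))   # 0.8, yet no positive is found
print(recall_score(y_true, y_lazy))     # 0.0

print(precision_score(y_true, y_pred))  # 0.5
print(recall_score(y_true, y_pred))     # 0.5
print(f1_score(y_true, y_pred))         # 0.5
print(roc_auc_score(y_true, y_score))   # 0.9375

# Regression: a single large error dominates MSE far more than MAE.
r_true = [10, 12, 11, 13]
r_pred = [10, 12, 11, 23]
print(mean_absolute_error(r_true, r_pred))  # 2.5
print(mean_squared_error(r_true, r_pred))   # 25.0
```

The regression pair illustrates the usual rule of thumb: mean squared error punishes large misses heavily, while mean absolute error treats every unit of error the same.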

Reason 5: Technical Knowledge
Data scientists are increasingly required to take on software
engineering work. Many employers insist that applicants should
understand how to manage their code and keep clean notebooks and
scripts, in particular:

Version Control
You should know how to use Git, and Google indexing for your
repository. If you don’t, start with this tutorial.

Web Development
Some companies like their data scientists to be comfortable
accessing data that’s stored on their web app, or via an API.
Getting comfortable with the basics of web development is
important, and the best way to do that is to learn a bit of Flask.

Web Scraping
This is slightly related to web development. Sometimes, you’ll need
to automate data collection by scraping data from live websites.
Two great tools to consider for this are BeautifulSoup and Scrapy.

Clean Code
Learn how to use docstrings. Don’t overuse inline comments. Break
your functions up into smaller functions. Way smaller. There
shouldn’t be functions in your code longer than 10 lines. Give your
functions good, descriptive names (function_1 is not a good name).
Follow pythonic convention and name your variables with underscores
like_this and not LikeThis or likeThis. Don’t write python modules
(.py files) with more than 400 lines of code.

Reason 6: Employability Test
Imagine you have an interview with a recruiter and they ask you to
take a pre-hiring test to gauge your knowledge. You probably won’t
have enough time to prepare for it and may end up failing in the
first round. Despite having relevant experience, you might have had
to face rejection. During your course with upGrad, every learner
has to take an Employability Test. The reason we provide you with a
series of employability tests is to get you ready for the
unexpected job interview test. Having said that, this will also
give you an insight into your overall knowledge of the topics
covered in the test. Did you know, these tests are based on the
recruiter’s POV rather than academic knowledge?
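The Clean Code guidelines under Reason 5 are easier to judge with a small example in front of you. The sketch below is one possible reading of them; the function names and the price-parsing task are invented for illustration.

```python
def parse_prices(raw_prices):
    """Convert price strings such as '$1,200' into floats."""
    return [
        float(price.replace("$", "").replace(",", ""))
        for price in raw_prices
    ]


def mean_price(prices):
    """Return the arithmetic mean of a non-empty list of prices."""
    return sum(prices) / len(prices)


def summarize_prices(raw_prices):
    """Parse raw price strings and return their mean, rounded to 2 places."""
    return round(mean_price(parse_prices(raw_prices)), 2)


average_price = summarize_prices(["$1,200", "$800", "$1,000"])
print(average_price)  # 1000.0
```

Each function does one thing, has a docstring, stays well under 10 lines and uses snake_case names, so the top-level call reads almost like a sentence.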

Reason 7: Business Instinct
Many people believe that getting hired is about showing that you’re the most technically skilled
applicant for a role. It’s not. In reality, companies want to hire people who can help them make more
money, faster. In general, that means moving beyond just technical ability, and building a number of
additional skills:

Making Something People Want:


When most people are in ‘data science learning mode’, they follow a very predictable series of steps:
import data, explore data, clean data, visualize data, model data and evaluate model. It’s fine when
you’re focused on learning a new library or technique, but going on autopilot is a really bad habit in
a business environment, where everything you do costs the company time (money). You’ll want to get
good at thinking like a business, and making good guesses as to how you can best leverage your
time to make meaningful contributions to your team and company.

Asking The Right Questions


Companies want to hire people who are able to keep the big picture in mind while they tune their
models, and ask themselves questions like, “am I building this because it’s going to legitimately help
my team and company, or because it’s a cool use case for an algorithm I really like?” and “what key
business metric am I trying to optimize, and is there a better way to do that?”.

Explaining Your Results


Management needs you to tell them what products are selling well, or which users are leaving for a
competitor and why, but they have no idea (and don’t care about) what a precision/recall curve is, or
how hard it was for you to avoid overfitting your model. Try building a project and explaining it to a
friend who hasn’t taken math since high school. (Hint: your explanation shouldn’t involve any
algorithm names, or refer to hyperparameter tuning. Simple words are better words).

Keep in mind that other, less well-defined things like personality fit can often be a factor, too. If
you didn’t get along with your interviewer, or if the conversation felt strained or awkward, it’s always
possible that your technical qualifications are solid, but that you didn’t check the culture fit box.
Companies regularly turn down applicants who would have been amazing technical performers for
exactly this reason, so don’t take a rejection or two too much to heart! All the best!!

02 Article Barn

The Art In Data Science: From Visualisations To Storytelling

Data science is a high-ranking profession that allows the curious to make game-changing
discoveries in the field of Big Data. A report from Indeed, one of the top job sites, has shown a
29% increase in demand for data scientists year over year. Moreover, since 2013, the demand
has increased by a whopping 344%. What’s the reason for such a demand?

Learn More

Machine Learning Improves Medical Eye Imaging Resolution

Researchers at Duke University’s Pratt School of Engineering have developed a machine
learning algorithm that can increase the resolution of optical coherence tomography (OCT), an
imaging technology similar to ultrasound that uses light instead of sound waves.

Learn More

The Data Science Life Cycle Consists Of 7 Phases

In this post, we will go through each of them briefly. Check how the infographic depicts different
phases in the data science life cycle. Data science, being a mixture of various tools, algorithms
and machine learning principles, aims to identify hidden patterns or insights in our data
that help us make better decisions.

Learn More

There Are Roughly 144,527 Data Science Jobs On LinkedIn Alone

Let us understand what skillsets you require to be employable and the current trends in the
market. As a data scientist, you are in high demand. So, how can you increase your
marketability even more? Check out these current trends in skills most desired by employers in 2019.

Learn More

Machine Learning Changes Our Life, Whether We Like It Or Not

Therefore, it is important for us to know about regulation and ethics in data science and machine
learning. Let’s understand why regulation is essential and which three requirements every
statistical algorithm should satisfy in order to be a better risk assessment tool based on
responsible machine learning principles.

Learn More

03 Mergers And Acquisitions

This merger aims to define the future of aerospace and defense!
Tell Me More

Blackberry is making its largest acquisition ever!
Tell Me More

04 In A Nutshell

Travel money can now be reduced - Thanks to Artificial Intelligence

Anyone running a company, or working for one that requires business travel, is familiar
with the challenges of travel expenses and of figuring out how much a business trip will
cost. The new startup TravelBank is now working towards reducing both the struggle and
the cost of arranging business trips, by applying machine learning to help employees
document spending and file expense reports. The app is not only designed to help
employees keep track of expenses, but also focuses on helping employees change their
behaviour, to ensure they spend less money. In return, companies that use TravelBank can
“reward” employees who spend less than the predicted budget with half of what they
saved.

How AI is solving one of music’s most expensive problems

Making music is one of the most human things we do, but in recent years, AI has stepped
in to lend a helping hand. Now, AI is reaching the mastering process, raising hard
questions about the need for human experts in the most specialized areas of music
production. Mastering is the final step in audio post production, and balances out all of a
song’s elements so it will sound consistent no matter how you’re listening to it - on Spotify,
in iTunes, or on a CD. The goal of mastering is to make the listening experience balanced
and cohesive from song to song. As mastering engineer Ian Cooper says, mastering is “a
bit like photography - you can make the sky bluer, the greens greener.”

SomaDetect - A unique application of Machine Learning
in the dairy industry

“Can I meet your cows?”- that is always one of the questions Bethany asks the dairy
farmers she meets these days. After studying math and environmental studies, Bethany
Deshpande earned a PhD in biology, researching Thermokarst lakes - shallow lakes
caused by the thawing and collapse of ice-rich permafrost - and modeling their oxygen,
and microbial composition. The education didn’t teach her about the dairy industry, but
she gained a lot of knowledge about sensors, developed a deep appreciation for data,
and learned that tough problems can be solved with a healthy mix of curiosity and
determination. Today, she is an accidental AgTech (agricultural technology) entrepreneur,
leading a company that uses data science to improve the sustainability and efficiency of
dairy farming. In a dairy farm, the milk is tested every few days by the farmer and upon
delivery. The farmers’ compensation is directly tied to the quality of the milk, a measure of
concentration of different components such as fat, protein, and their ratio. Further, the
milk gets tested for presence of antibiotics and somatic cell counts (SCC). Even small
traces of antibiotics or SCC levels above a certain threshold may lead to rejection by the
processor per regulatory standards. SomaDetect marries a century-old technology with
the latest data science algorithms to address the deficiencies of the current system in fast
& accurate analysis of the quality of the milk and the health of a cow.

AI can make art now, but artists aren’t afraid

Artists are supposed to be among the least likely to lose their jobs to automation, but what
happens when AI-enabled features start painting, editing, and doing other parts of their
jobs for them? AI tools are already starting to automate what used to be time-consuming
manual processes - but the results may be good for artists’ creativity, rather than potential
job killers. Companies that make industry-standard creative tools like Adobe and Celsys
have been adding AI features to their digital art software in recent years in the hopes that
it’ll speed up workflows by eliminating drudge work, and give artists more time to
experiment. From machine learning tools that help find specific video frames faster, to
features that color in entire works of line art with just a button, AI is being incorporated in
subtle, but surprisingly impactful ways.

05 6 Machine Learning Applications Enhancing The Healthcare Sector 2019
The ever increasing population of the world has put tremendous
pressure on the healthcare sector to provide quality treatment and
healthcare services. Now, more than ever, people are demanding
smart healthcare services, applications, and wearables that will help
them to lead better lives and prolong their lifespan.

Prashant Kathuria
Data Scientist
upGrad

By 2025, Artificial Intelligence in the healthcare sector is projected
to increase from $2.1 billion (as of December 2018) to $36.1 billion at
a CAGR of 50.2%.
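The growth projection above can be sanity-checked with the compound growth formula, end = start × (1 + rate)^years. The sketch below assumes seven compounding years (2018 to 2025), which the article does not state explicitly.

```python
start_value = 2.1   # USD billions, as of December 2018 (from the article)
end_value = 36.1    # USD billions, projected for 2025 (from the article)
cagr = 0.502        # 50.2% compound annual growth rate (from the article)
years = 7           # assumption: 2018 -> 2025

# Forward check: grow the starting value at the quoted rate.
projected = start_value * (1 + cagr) ** years

# Backward check: the growth rate implied by the two endpoints.
implied_cagr = (end_value / start_value) ** (1 / years) - 1

print(round(projected, 1))
print(round(implied_cagr * 100, 1))
```

Under that assumption the quoted figures are consistent with each other to within rounding.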
The healthcare sector has always been one of the greatest
proponents of innovative technology, and Artificial Intelligence and
Machine Learning are no exceptions. Just as AI and ML permeated
rapidly into the business and e-commerce sectors, they also found
numerous use cases within the healthcare industry. In fact, Machine
Learning (a subset of AI) has come to play a pivotal role in the realm
of healthcare – from improving the delivery system of healthcare
services, cutting down costs, and handling patient data to the
development of new treatment procedures and drugs, remote
monitoring and so much more.

This need for a ‘better’ healthcare service is increasingly creating
the scope for artificial intelligence (AI) and machine learning (ML)
applications to enter the healthcare and pharma world. With no
dearth of data in the healthcare sector, the time is ripe to harness the
potential of this data with AI and ML applications. Today, AI, ML, and
deep learning are affecting every imaginable domain, and
healthcare, too, doesn’t remain untouched.

Also, the fact that the healthcare sector’s data burden is increasing
by the minute (owing to the ever-growing population and higher
incidence of diseases) is making it all the more essential to
incorporate Machine Learning into its canvas. With Machine Learning,
there are endless possibilities. Through its cutting-edge applications,
ML is helping transform the healthcare industry for the better.

Now that you are familiar with the core components of ML
systems, it’s time to take a look at the different ways they “learn.”

Pattern Imaging Analytics


Today, healthcare organisations around the world are particularly
interested in enhancing imaging analytics and pathology with the
help of machine learning tools and algorithms. Machine learning
applications can help radiologists identify subtle changes in
scans, thereby helping them detect and diagnose health issues at
an early stage. One such pathbreaking advancement is Google’s ML
algorithm to identify cancerous tumours in mammograms. Also, very
recently, at Indiana University-Purdue University Indianapolis,
researchers have made a significant breakthrough by developing a
machine learning algorithm to predict (with 90% accuracy) the
relapse rate for acute myelogenous leukaemia (AML). Other than
these breakthroughs, researchers at Stanford have also developed a
deep learning algorithm to identify and diagnose skin cancer.

Personalised Treatment
And Behavioral Modification
Between 2012 and 2017, the penetration rate of Electronic Health
Records in healthcare rose from 40% to 67%. This naturally means
more access to individual patient health data. By compiling this
personal medical data of individual patients with ML applications
and algorithms, health care providers (HCPs) can detect and
assess health issues better. Based on supervised learning,
medical professionals can predict the risks and threats to a
patient’s health according to the symptoms and genetic information
in his medical history. This is precisely what IBM Watson
Oncology is doing. Using patients’ medical information and medical
history, it is helping physicians to design better treatment plans
based on an optimized selection of treatment choices. Behavioral
modification is a crucial aspect of preventive medicine. ML

Guesstimate 2
There are 2 jugs with 4 litres and 5 litres of water respectively.
The objective is to pour exactly 7 litres of water in a bucket.
How can it be accomplished?

Drug Discovery And Manufacturing
Machine learning applications have found their way into the field
of drug discovery, especially in the preliminary stage, right from
initial screening of a drug’s compounds to its estimated success
rate based on biological factors. This is primarily based on
next-generation sequencing. Machine Learning is being used by
pharma companies in the drug discovery and manufacturing
process. However, at present, this is limited to using unsupervised
ML that can identify patterns in raw data. The focus here is to
develop precision medicine powered by unsupervised learning,
which allows physicians to identify mechanisms for “multifactorial”
diseases. The MIT Clinical Machine Learning Group is one of the
leading players in the game. Its precision medicine research aims
to develop algorithms that can help to understand disease
processes better and accordingly chalk out effective treatment for
health issues like Type 2 diabetes. Apart from this, R&D
technologies, including next-generation sequencing and precision
medicine, are also being used to find alternative paths for the
treatment of multifactorial diseases. Microsoft’s Project Hanover
uses ML-based technologies for developing precision medicine. Even
Google has joined the drug discovery bandwagon. Pharmaceutical
manufacturers can harness the data from the manufacturing
processes to reduce the overall time required to develop drugs,
thereby also reducing the cost of manufacturing.

Identifying Diseases and Diagnosis
Machine Learning, along with Deep Learning, has helped make a
remarkable breakthrough in the diagnosis process. Thanks to these
advanced technologies, today, doctors can diagnose even diseases
that were previously beyond diagnosis, from tumours or cancers in
their initial stages to genetic diseases. For instance, IBM Watson
Genomics integrates cognitive computing with genome-based tumour
sequencing to further the diagnosis process so that treatment can
be started head-on. Then there’s Microsoft’s InnerEye initiative,
launched in 2010, that aims to develop breakthrough diagnostic
tools for better image analysis.

Personalised Treatment
By leveraging patient medical history, ML technologies can help
develop customised treatments and medicines that can target
specific diseases in individual patients. This, when combined with
predictive analytics, reaps further benefits. So, instead of
choosing from a given set of diagnoses or estimating the risk to
the patient based on his/her symptomatic history, doctors can rely
on the predictive abilities of ML to diagnose their patients. IBM
Watson Oncology is a prime example of delivering personalised
treatment to cancer patients based on their medical history.

Robotic Surgery
Today, doctors can successfully operate even in the most
complicated situations, with precision, thanks to robotic surgery.
Case in point - the Da Vinci robot. This robot allows surgeons to
control and manipulate robotic limbs to perform surgeries with
precision and fewer tremors in tight spaces of the human body.
Robotic surgery is also widely used in hair transplantation
procedures as it involves fine detailing and delineation. Today,
robotics is spearheading advances in the field of surgery. Robotics
powered by AI and ML algorithms enhances the precision of surgical
tools by incorporating real-time surgery metrics, data from
successful surgical experiences, and data from pre-op medical
records within the surgical procedure. According to Accenture,
robotics has reduced the length of stay in surgery by almost 21%.
Mazor Robotics uses AI to enhance customization and keep
invasiveness at a minimum in surgical procedures involving body
parts with complex anatomies, such as the spine.

Understanding the importance of people in the healthcare sector,
Kevin Pho states:

“Technology is great. But people and process improve care. The best
predictions are merely suggestions until they’re put into action.
In healthcare, that’s the hard part. Success requires talking to
people and spending time learning context and workflows - no matter
how badly vendors or investors would like to believe otherwise.”

06 Did You Know?

AI Can Now
Cheer You Up!

AI Is In Line
To Be The Next
Picasso!

Machines Are
Now Chefs!

AI Is Now A
Producer And
A Director!

07 New Innovations
01. Story
There was a recent buzz about an AI system in Japan writing a
novel, “The Day a Computer Writes a Novel”, that was supposed to
win a literary prize. The Research and Development team started to
write their own novel and then deconstructed it into several parts.
After this process, the AI was commissioned to sequentially arrange
the parts it was assigned and create “another story similar to the
sample novel,” constructing it from the “different words, phrases,
characters, and plot outlines that had been fed to it.”

Source: Wakefield, J. (2015). Can a machine become an artist?. [online] BBC
News. Available at: https://www.bbc.com/news/technology-33677271
[Accessed 16 Jun. 2019].

02. Music
The famous rock star David Bowie was the co-writer behind a program
whose function was to generate “lyric ideas.”

The ideas were the inspiration behind most of his songs. The
program created random sentences based on a technique called “the
cut-up technique”. The algorithm helped with a lyric-writing
process that David had been doing manually. The process used by the
algorithm was a unique one: it selected random sentences from
different locations, divided them into bits and combined them in an
entirely new way. The algorithm could create weird “combinations of
ideas”, and David would select the ones that were fit to be shared
with the public. This helped him to create new and original songs
around the ideas.
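The cut-up process described above (pick sentences, split them into bits, recombine the bits at random) can be sketched in a few lines. This is only an illustration of the general idea, not the actual program; the function name, fragment size and sample sentences are all invented.

```python
import random


def cut_up(sentences, fragment_size=3, n_fragments=4, seed=None):
    """Recombine random fragments of the given sentences into one line."""
    rng = random.Random(seed)
    fragments = []
    for sentence in sentences:
        words = sentence.split()
        # Slice each sentence into consecutive chunks of a few words.
        for start in range(0, len(words), fragment_size):
            fragments.append(" ".join(words[start:start + fragment_size]))
    # Draw distinct fragments in random order and join them.
    chosen = rng.sample(fragments, k=min(n_fragments, len(fragments)))
    return " ".join(chosen)


# Made-up source sentences, standing in for text from different places.
sources = [
    "the rain keeps falling on the empty station",
    "a stranger sells maps of cities that never existed",
    "tomorrow the radio will play our favourite silence",
]

lyric_idea = cut_up(sources, seed=1)
print(lyric_idea)
```

A human then plays the editor's role, as described above, by keeping only the combinations worth building a song around.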

08 Kaggle Problem Statement
New York City Airbnb Open Data

Since 2008, guests and hosts have used Airbnb to expand on traveling possibilities and
present a more unique, personalised way of experiencing the world. This dataset
describes the listing activity and metrics in NYC, NY for 2019.

Problem Link:
https://www.kaggle.com/dgomonov/new-york-city-airbnb-open-data/activity

India - Trade Data

India is one of the fastest developing nations of the world and trade between nations is
the major component of any developing nation. This dataset includes the trade data for
India for commodities in the HS2 basket.

Problem Link:
https://www.kaggle.com/lakshyaag/india-trade-data/kernels

Pizza Restaurants And The Pizza They Sell

This is a list of over 3,500 pizzas from multiple restaurants provided by Datafiniti's Business
Database. The dataset includes the category, name, address, city, state, menu information,
price range, and more for each pizza restaurant.

Problem Link:
https://www.kaggle.com/datafiniti/pizza-restaurants-and-the-pizza-they-sell/kernels

Trending YouTube Video Statistics

YouTube (the world-famous video sharing website) maintains a list of the top trending
videos on the platform. According to Variety magazine, “To determine the year’s
top-trending videos, YouTube uses a combination of factors including measuring user
interactions (number of views, shares, comments and likes). Note that they’re not the
most-viewed videos overall for the calendar year”. Top performers on the YouTube trending
list are music videos (such as the famously viral “Gangnam Style”), celebrity and/or
reality TV performances, and the random dude-with-a-camera viral videos that YouTube is
well-known for.

Problem Link:
https://www.kaggle.com/datasnaek/youtube-new

Used Cars Dataset

Craigslist is the world's largest collection of used vehicles for sale, yet it's very difficult to
collect all of them in the same place. A student built a scraper for a school project and
expanded upon it later to create this dataset which includes every used vehicle entry
within the United States on Craigslist.

Problem Link:
https://www.kaggle.com/austinreese/craigslist-carstrucks-data/kernels

09 Why Our Learners Love Data Streak
I read the Data Streak magazine and I am delighted to see the nice curated content on Data
Science and AI and the best part was the news about companies utilising AI solutions for
different uses, like the one news of AI Game based interviews in Unilever.
Also the top Data Scientist to follow is important. Thanks for starting this initiative.
Kapil Manchanda - PGDML & AI

The games at work are very nice and new to me. Also, how the industry is changing from
traditional way of hiring is very informative. I am so much interested in Student Article & Career
transitions & latest trends. I will always eagerly look into this and actively look into In a Nutshell to
know the industry, current developments, case studies, new ideas. And yes, guesstimates are
very nice too. Data ticklers - finally makes us relax.
Balamurugan Gurusamy - PGDML& AI

Love those problem solving approaches. I love SME sections and what we learn on latest
trends. I’m certainly looking forward to more technical content or articles.
Bhavin Panchal - PGDML & AI

I went through the Data Streak magazine and I found it very useful to know about the latest
happenings in AI and ML. I like the dataShots section. It has brief info about how AI and ML is being
applied in real life.
Pradeep Kumar Reddy Kondreddy - PGC

What I generally like is the top personalities to follow.
I usually lookup their LinkedIn bio and start following.
This helps me because I'm an active LinkedIn professional and posts, comments by these
people helps me keep updated and informed.
Vendhan Psd - Masters in ML & AI

The data streak for the July month is amazing. The content is really a treat to read.
Great initiative by upGrad and I am waiting for the Data streak for next month.
Tavish Aggarwal PGDML & AI

10 Top Data Scientists To Follow

RONALD VAN LOON

His contribution in the field of digital transformation has been
recognised by organisations such as Onalytica, Dataconomy, and
Klout. He is also an author for a number of leading big data
websites, including The Guardian, The Datafloq, and Data Science
Central, and he regularly speaks at high-profile events and
conferences. You must follow him if you are an ardent enthusiast of
data science, big data, the IoT (Internet of Things), predictive
analytics, and business intelligence.

VIN VASHISHTA

He is one of the biggest data science, machine learning, AI and
deep learning stalwarts within the HR niche. He has over 20 years
of experience and has built the most trusted brand in data science
and machine learning. He is the founder and chief data scientist at
V-Squared Data Strategy. His LinkedIn article “How to Become a Data
Scientist, No Matter Where Your Career Is At Now” is a great read.

11 Data Shots

3 must-watch documentaries on Statistics and Machine Learning:

1. Humans Need Not Apply

2. Let My Dataset Change Your Mindset

3. The Joy Of Statistics

Make your life smarter with these 5 AI gadgets:

1. Tapia AI Robot Companion

2. Vi - The First True Artificial Intelligence Personal Trainer

3. Mycroft - Open Source AI Tool to Control Your Home

4. Bonjour Smart AI Alarm Clock

5. Chris Digital CO-DRIVER -AI Gadgets

Top Linkedin Profiles Of The Month

Juhi Srivastava
Masters in Data Science
June 2019 Batch

About:
Believe in Innovation and Growth, Hard work and Knowledge.

Industrial Knowledge on:
Artificial Intelligence, Machine Learning, Deep Learning, Neural
Networks, Natural Language Processing, Conversational AI, Computer
Vision and Predictive Analytics.
Cherish, Adore and Challenge.

Experience:
2.8+ years of experience as a Data Scientist in Genpact, Hyderabad.

Abhinav Sharma
PGD in Data Science
June 2019 Batch

About:
Result oriented and dedicated professional skilled with 2.5 years
of experience in Oil & Gas Wellsite Operations and Data
Visualization & Analysis and Coding Skills like SQL, C++ and
Python. Known for displaying high ethical standards, integrity and
confidentiality while always exceeding expectations independently
or as a part of a team.

Experience:
2.5 years of experience as a Senior Process Data Engineer-Wells in
Royal Dutch Shell, Chennai.

12 Data Ticklers

One of the Monty Python team has invented an unmanned aircraft that does sky-writing that’s spelled the same backwards as forwards?
Palin drone

What do you call a 3.14 m long python?
A snake

13 The Great Hack: Review
Save Your Data Before It Backfires!

A hack, as it is commonly understood, is when someone stealthily gains access to a computer system by exploiting vulnerabilities in the code or by tricking a gullible user into revealing their credentials. By that measure, asking a user of a computer or social network to click on an “I agree” button and then harvesting their data in order to influence them is a hack of a different kind.

Karan Mehta
Student Mentor
Data Science Vertical
upGrad - Student Success Team

The Great Hack is a documentary cum movie that examines the effects when private companies harvest online information about us.

It is a documentary that tells us how our personal data has become a commodity that is collected, analyzed and then spit back at us in the form of targeted messaging, in pursuit of changing our behavior, as one of the movie’s subjects puts it.

Like most people, you had probably never heard the name “Cambridge Analytica,” or even been aware of the company’s
existence before March of 2018, when the New York Times and
the Guardian began reporting on the firm’s harvesting of private
Facebook user information. “The Great Hack” concerns itself
with the United States Presidential election of 2016 and, to a
lesser extent, the Brexit vote and other international political
campaigns. The common factor in all these events is a
now-defunct firm named Cambridge Analytica, represented
throughout the film by several former employees. At the height of
its powers, the company held up to 5,000 data points about each
of the people contained in its databases. This information was
used for a variety of purposes meant to manipulate a certain
cross-section of people.
The master manipulators didn’t go after people whose minds had
been made up; they went after people referred to as “the
persuadables.”

Using the collected data, Cambridge Analytica set out to create fear and apathy to achieve the results sought by the political parties that hired them. The lawsuit filed by Carroll, the main character, is an attempt to retrieve the data collected on him. To say that we live in a world of illusions would be true, but we don’t admit it. Why? Simply because we're addicted to it.

To conclude, The Great Hack will alarm you, infuriate you, and - hopefully - activate you. We’re told that tech companies are the richest businesses in the world, and since data is the hottest commodity on the market, they’ll do anything to get it. That we don’t know what’s being done with our data is the scariest aspect of all this. “The Great Hack” hammers that point home quite successfully.

So, to all my wonderful readers: be careful from today, as the photos you click, the messages you exchange and the things you buy reflect your personality, and all this data is easily available to the apps that live on your electronic devices. The most recent example of this is FaceApp, which became an internet trend within hours while no one bothered to read its privacy agreement.

My rating for this movie is 3.5/5, as it paints people like Mark Zuckerberg as the villains of this modern world where data is apparently “more valuable than oil”, yet it offers nothing that we don’t already know.

14 Career Transitions

AAKASH DUSANE
From: Software Engineer, Virel.ai Technologies
To: Data Scientist, Quantzig

JAY PRAKASH KUMAR
From: Data Lead, TCS
To: Data Scientist, TCS

PUSHPENDRA SHARMA
From: Data Scientist, Learnogether
To: Data Scientist, TCS

DHAVAL KAMANI
From: Executive Vice President, YES Bank
To: Assistant Vice President (India) - Cyber Security, Mashreq Bank

BOBY JOHNSON
From: Intern - Machine Learning, Cognicor
To: Data Scientist, Vizury

15 GUESSTIMATE SOLUTIONS

1. A birthday cake has to be divided into 8 equal pieces in exactly 3 cuts. How can this division be made?

This puzzle is not difficult to solve if you put your mind to it.
The approach entails slicing the cake horizontally down the centre, followed
by making another division vertically through the centre.

The two divisions made across horizontal and vertical directions will give
you 4 equal pieces of the cake.

In the final step, simply stack the 4 pieces one above the other, and then
make the third division, splitting the stack in half.

This gives you the 8 equal pieces of cake, along with the answer to your puzzle.
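
The counting behind this answer can be checked with a few lines of Python. This is only a rough sketch under an idealised assumption: every cut passes through the middle of every piece it meets (whether the pieces sit side by side or stacked), so each cut splits every piece into two equal halves. The function name `cut_cake` is made up for illustration.

```python
from fractions import Fraction

def cut_cake(cuts=3):
    """Return the piece sizes (as fractions of the whole cake) after `cuts` cuts.

    Assumption: each cut goes through the centre of every existing piece,
    so every piece is replaced by two halves.
    """
    pieces = [Fraction(1)]  # start with one whole cake
    for _ in range(cuts):
        pieces = [size / 2 for size in pieces for _ in range(2)]
    return pieces

pieces = cut_cake(3)
print(len(pieces), "pieces, each", pieces[0], "of the cake")
```

Each cut doubles the number of pieces (1, 2, 4, 8), which is why stacking before the third cut matters: without it, a single straight cut could not reach all 4 quarters through their centres.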

2. There are 2 jugs with capacities of 4 litres and 5 litres respectively. The objective is to pour exactly 7 litres of water into a bucket. How can it be accomplished?

This question is of medium difficulty and ideally shouldn’t take more than 2 minutes to answer.

The approach here is to first fill the 5L jug with water and pour from it into
the empty 4L jug until the 4L jug is full. The 5L jug will be left with 1L of
water, which is poured into the bucket. Meanwhile, empty the 4L jug.

The above step is repeated, so that the bucket is filled with 2L of water.

Finally, fill the 5L jug with water and empty the same into the bucket.

The bucket will now have 7L of water, as you add 5L directly to the
previously collected 2L of water in the bucket.
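
If you would like to verify the steps rather than take them on trust, here is a minimal Python sketch that simply replays the pours described above and tracks how much water is in each container. The names `pour_seven_litres`, `SMALL_CAPACITY` and `LARGE_CAPACITY` are our own, chosen for illustration, and the jugs are assumed to start empty.

```python
SMALL_CAPACITY = 4  # litres, the 4L jug
LARGE_CAPACITY = 5  # litres, the 5L jug

def pour_seven_litres():
    """Replay the solution's steps; return the final bucket volume and a log
    of the bucket volume after each pour into it."""
    small = large = bucket = 0
    bucket_log = []

    # Steps 1 and 2: collect 1L in the bucket, twice.
    for _ in range(2):
        large = LARGE_CAPACITY                        # fill the 5L jug
        poured = min(large, SMALL_CAPACITY - small)   # top up the 4L jug
        small += poured
        large -= poured                               # 1L remains in the 5L jug
        bucket += large                               # pour that 1L into the bucket
        large = 0
        small = 0                                     # empty the 4L jug
        bucket_log.append(bucket)

    # Final step: fill the 5L jug and pour it straight into the bucket.
    large = LARGE_CAPACITY
    bucket += large
    large = 0
    bucket_log.append(bucket)

    return bucket, bucket_log

bucket, bucket_log = pour_seven_litres()
print("Bucket after each pour:", bucket_log)
print("Final volume:", bucket, "litres")
```

The arithmetic is just 1 + 1 + 5 = 7: the 4L jug is only ever used as a measure for removing 4 litres from a full 5L jug.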

To share your stories/articles/blogs,
write to us at
pgdds@upgrad.com | pgdml@upgrad.com

FIND US HERE:

upGrad Education Private Limited, Nishuvi, 75, Dr. Annie Besant Road, Worli, Mumbai – 400018
