ISSUE 5 · NOVEMBER 2021

HULT DATA GLOBAL


"The world is a big data problem" - Andrew McAfee

Dataiku Recap

By Felipe Dominguez

On November 2nd we hosted our third event of the year, with Dataiku. It was an incredible workshop focused on how to use the platform, which supports data preprocessing, data preparation, data visualization, and machine learning modeling. Over two hours, Darien Mitchel-Tontar, a Data Scientist at Dataiku, led us through a classification problem on credit fraud, developing a machine learning model to predict whether or not a case is fraud. This was our first in-person event, and we counted more than 25 members of our club, from the Boston and San Francisco campuses.

We hope you enjoyed the event and are ready for the upcoming ones!

GitHub

By Isabella Oñate

In an increasingly collaborative environment where releases are continuously integrated into production, working in teams is an essential skill within software development. GitHub is a repository hosting service built on git, the version control system, where you can manage release versions of your code, showcase your work, and collaborate with other developers to create and release software together.

In our upcoming practical workshop, we will teach you the best industry practices for using git and GitHub in collaboration with your peers. Thrive in the distributed team era by understanding and using git as another hard skill in your professional toolkit.

In this issue:

- The Events

- Beyond the scene of R

- Code in English

- Tips & Tricks

- Increase your curiosity



Beyond the scene of R with Professor Kurnicki

By Nikhita Arora

The R programming language has become the single most valuable tool for data science around the world. Since its official release in 2000, it has been used in domains ranging from biology to marketing to make data-based decisions. Preferred by statisticians, R has gained tremendous popularity over the last few years. It currently ranks #14 in the TIOBE Programming Community index.

We caught up with Thomas Kurnicki, whose passion for programming languages, especially R, converted him from a student at Hult International Business School to a part-time Data Science professor. A strong promoter of R, Thomas believes R is extremely easy if you have fun and code smart. He likes to refer to his students as "Data Pirates" and teaches tips and tricks to "hack" coding. Here are some of the secrets you should know about R:

You believe that R is the ultimate data science environment. Tell us why.

First and foremost, R was designed by statisticians for statisticians. You can't make sense of any data without understanding statistics, and the R environment is very intuitive for anyone looking to analyze and/or manipulate data. Secondly, it allows you to interpret, interact with, and visualize data, making it a complete package for data scientists.

What are some of the most common applications of R? How have you used R in your work?

R continues to be one of the most heavily utilized tools in academia and research, given its statistical prowess. Scholars working on research prefer to work in R not just to analyze data but also to create models. We have seen a fair use of R, especially in building climate and financial models. Personally, I have used R to build a forecasting model to predict yield curves. I could build not only the model but also an interactive app using Shiny - all under 300 lines of code!

Another popular use of R is in ongoing business reporting. R allows us to combine multiple resources, thus offering flexibility and automation. R's ability to calculate different statistics and produce graphical plots makes it a go-to language for building daily or weekly reports in many business functions and industries.

What industries leverage R to make data-driven decisions?

Academia, especially mathematics and statistics departments, heavily uses R. Financial companies use R to optimize portfolios and build specific risk profiles. I have also seen a lot of Biotech companies, especially here in Silicon Valley, use R in their R&D teams. There is a use case for R in any and every industry where the focus is research.

What is your favourite feature of R?

I love the Shiny package that makes it easy to build interactive apps straight from the R code. You don't need to have a ton of frontend and backend. It is easy to code because you always have the same syntax, and most importantly - it makes the user addicted to your work. A non-technical person gets to see the code in the form of a user interface that is super easy to use and comprehend. What's not to like about this optimized efficiency!

For non-programmers looking to learn R, where should they begin their journey? Any tips and tricks?

It is best to take a course at school for discipline and depth. But if you cannot go to school, there are tons of online resources available out there - DataCamp, Coursera, and Udemy offer a plethora of self-paced, self-taught modules. They don't give you the business insight, but they are a very good place to start. Once you know a little, you can use Kaggle to find datasets and free business cases to practice your skills. You can even reach out to data science teams at companies and answer their business challenges using R - coding in R is considered super sexy in the real world!

If you had to describe R in only three words, what would they be?

I would say PDP - Practice Data Piracy. Hack coding in R because remember: "R is easy. R is fast. And R is free."

"If you can do it in Excel, don't do it"
- Thomas Kurnicki

Find out more with Thomas Kurnicki's book here!

Code in English
By Nicola Bini

Learning a new programming language is hard and requires time. Some compare it with the process of learning a new spoken language.

What if we could create algorithms by coding in natural English instead? OpenAI, an AI research firm, has developed an AI-powered system called Codex that translates natural language to code. Codex is the model that powers GitHub Copilot. Check the QR code to see how easy it is to write custom applications without writing a single line of code.

Tips & Tricks

By Erica Moulet Vargas

As Professor Kurnicki says, "R is easy", and it can become easier if you have the right tools to improve your skills. That is why HULT Data Global decided to come up with this cheatsheet, which is sure to be very useful as a start. Inside, you will find an Ali Baba's cave of R cheatsheets, from the basic use of R to the ggplot package. So be curious and check the Rrrrrrchives!

Social Sustainability
We congratulate Nabanita Talukdar, a faculty member and research fellow of Hult, and Blodwen Tarter, marketing professor at Golden Gate University, for presenting their case, Agaati: Using Data Analysis to Define Marketing Mix for Sustainable Luxury Apparel, alongside Golden Gate University at the 26th Annual Jacobs & Clevenger Case Writers’ Workshop.

This case was about social sustainability within a luxury fashion context and using statistical analysis to make informed marketing decisions. You can follow them on LinkedIn!

Go Follow:

@hultdata

in/hultdataglobal

http://hultglobal.tk/

@hbdaclub
