
Jellyfish Training



What I learned studying for the GCP Machine Learning Exam
30 OCT, 2020 READ TIME: 12 MINUTES
PUBLISHED BY

Francisco Quintana
Data Scientist
In August 2020, Google launched a beta version of a new addition to their GCP certification
library: Professional Machine Learning Engineer. This certification spans a broad range of
machine learning topics including design, implementation, deep-learning frameworks and
documentation.

I always like to challenge myself with Google Cloud, so this BETA certification was too good
an opportunity to pass up. Plus I want to be considered an expert in Machine Learning with
Google Cloud Platform, so it was time to put that into practice.

Being one of the first to pass the exam in October, along with my colleague Di Wu, I wanted
to share my thoughts on this certification and how you should prepare. So here are a few
takeaways from my experience.

Firstly, for the uninitiated, what does a Machine Learning Engineer do?
We build and optimize machine learning models, primarily with the purpose of taking a
product from first trial to end user. The initial model comes from a Data Scientist, who
is more of a jack of all trades and will build the model and ensure it actually works.

Although similar, there’s a key difference between these two roles. Data Scientists don’t look at
productization or building a complete solution; their role stops with the model. The Machine
Learning Engineer takes extra steps to improve the model, refining and updating it and
taking into account all product and performance recommendations so it’s up to scratch.

What do you need to do to pass?


If you’re interested in gaining expertise within Google Cloud technology, this certification is
for you - but be warned, Machine Learning is by far the toughest Google exam I’ve taken so
far.

To pass you need a good grasp of software principles, a strong knowledge of the machine
learning capabilities of the GCP stack, and Machine Learning & GCP theory - which other
Google certifications haven’t really included so far. The Machine Learning certification is
valid for 2 years and definitely worth having to stand out in the field.

What do I need to study in depth?


There is a broad range of topics within the Machine Learning exam - but I’d say the key
things to brush up on are:
Cloud technologies and software
CI/CD & Pipelines, BigQuery, Datalab and other tools in the GCP stack
Software design principles
Continuous delivery vs. continuous deployment
How to test and optimise model performance (the models vary so it’s good to understand
classification, linear regression, etc.)
Monitoring performance and how to understand performance metrics (e.g. precision vs.
recall vs. F1 score)
TensorFlow and Kubeflow development frameworks
Solution architecture (i.e. when to use a custom solution vs. a package solution)
Understanding the ethical uses of AI (user privacy is key)
Finally, you also want to have a deep understanding of business use cases; applying the
business side to machine learning and fair valuation, too.

You can view the full list of topics I encountered at the end of this article.

How much experience do you need to pass?


I’d say you need at least 2 years’ general GCP experience to take the exam. As for hands-on
machine learning experience? It’s tough to say; if you know the topics, then you’ll be good to
go. Some have passed with just 6 months’ experience. Having said that, at least a year of
professional experience developing solutions in the market, not just looking at theory, would
be helpful. A data engineering certification would also be valuable to have in your arsenal,
but not essential.

Preparing for the Google Machine Learning exam


You need to know how to monitor tests, and how you evaluate model performance is
important. The math behind machine learning (what the different model parameters do,
how algorithms are optimized, etc.) is helpful for this exam. I’d also recommend studying
TensorFlow data structures and TensorFlow’s Dataset API. Finally, look at how to optimise
solutions from a technical perspective as well as a business perspective.
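
As a refresher on the precision, recall and F1 metrics mentioned earlier, here is a minimal sketch in plain Python. The function name `precision_recall_f1` and the label lists are made up for illustration; in practice you would more likely reach for a library implementation.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Return (precision, recall, f1) for a single positive class."""
    pairs = list(zip(y_true, y_pred))
    # Count the three outcomes the metrics are built from.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    # Precision: of everything predicted positive, how much was right?
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of everything actually positive, how much did we find?
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of the two, 0.0 when both are zero.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical labels: 3 true positives, 2 false positives, 1 false negative.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 1, 1, 0]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

With these labels the model over-predicts the positive class, so recall comes out higher than precision, and F1 sits between the two.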

Before taking Machine Learning, consider the other certifications


While you can complete this exam in isolation, I’d recommend completing other certifications,
starting with the Associate Cloud Engineer. Similar topics are covered, so it’d help with
general awareness and knowledge.

I’d also recommend the Professional Data Engineer exam, although it’s not absolutely
necessary. If you’ve also done the Cloud Engineer certification you’d need to study less as
you already know about data storage, etc. If you can, get training on the Machine Learning
certification and the GCP platform.

What are the toughest parts of the exam?


Definitely the math side; the test is online-proctored due to COVID-19, so you can’t use a
pen or paper. That’s difficult as it’s all off the top of your head. It’s worth noting, though,
that as a Beta exam, the math elements may not be included in the final version.

You also need to be up to date on documentation - the function or concept may not have
changed, but the documentation may have been updated and the names of concepts changed
with it.

Lastly, the software development side can be difficult - in particular, good design principles
and developing continuous pipelines. So brush up on those bits!

If I were sitting this exam again, I’d have studied the documentation more
They’ll expect you to be up to date with the latest Google technology and terminology. Ditto
the TensorFlow data structures and concepts. I’d also be more cautious of software design
principles.

I only had two weeks to study so more time building solutions would’ve been helpful - I’ll say
it again, if you can get some GCP Training that’ll help for sure.

My final tip? Get online-proctor ready


It’s a remote exam - for now - so have the room clear, no notes, books or noise in the
background so your results aren’t invalidated. Make sure you have a strong internet
connection so there are no distractions or disruptions. If you have time, try to build some of
the concepts out to make life easier for yourself. Finally, use deductive reasoning to
eliminate any wrong answers and read the questions very carefully to help you with any
multiple choice answers.

Good luck!

Appendix 1: Exam basics


https://cloud.google.com/certification/machine-learning-engineer

Length: Two hours

Registration fee: $200 (plus tax where applicable)

Languages: English

Exam format: Multiple choice and multiple select

Exam Delivery Method:

Take the online-proctored exam from a remote location, review the online testing
requirements here.
Take the onsite-proctored exam at a testing center, locate a test center near you.
Prerequisites: None

Recommended experience: 3+ years of industry experience including 1+ years designing
and managing solutions using GCP.

Appendix 2: Topics I encountered in the machine learning exam
The questions in this exam covered a wide range of topics including some very simple
programming questions:

How to improve the performance of a model
How to improve the performance of hardware
Some data engineering questions
TFX & Kubeflow Questions
Cloud Build
BQML, TensorFlow & scikit-learn
TensorFlow topics:

TFRecords
Tensors
TFX (TensorFlow Extended)
TPUs
When to use them
How to improve them
What are estimators
How to deal with common errors
Data API
Kubeflow topics:

Why use Kubeflow
Kubeflow components
Benefits of using Kubeflow
How to automate Kubeflow
What is Kubeflow hybrid
Machine Learning:

When to use supervised learning vs unsupervised
Decision Trees, TF models, Transfer learning
When would you use TF
Evaluating models
Recall
Precision
F1 score
How to solve overfitting and underfitting
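
One common remedy for overfitting is early stopping: keep the model from the epoch where validation loss was lowest and stop once it has failed to improve for a while. The sketch below is plain Python with made-up loss values; `early_stopping_epoch` and `patience` are illustrative names, not part of any particular framework.

```python
def early_stopping_epoch(val_losses, patience=2):
    """Return the index of the epoch whose weights should be kept.

    Training is considered stopped once validation loss has not
    improved for `patience` consecutive epochs.
    """
    best_epoch = 0
    best_loss = float("inf")
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            # New best validation loss: remember this epoch.
            best_loss = loss
            best_epoch = epoch
        elif epoch - best_epoch >= patience:
            # No improvement for `patience` epochs: stop training.
            break
    return best_epoch


# Hypothetical validation losses: they fall, then rise as the model overfits.
val_losses = [0.90, 0.70, 0.55, 0.50, 0.52, 0.56, 0.61]
best = early_stopping_epoch(val_losses)
print(f"keep the weights from epoch {best}")
```

Underfitting shows up differently: both training and validation loss stay high, which usually calls for a larger model, better features or longer training rather than stopping early.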
AI platform:

How it works
How to store models
Uploading data to it
Notebooks
Monitoring
How you can use it with R
How it can be used to work locally and on the cloud
Improve performance
Training on cluster

CI/CD & Pipelines:

What is continuous integration and how you can use GCP with it
What is continuous delivery and how you can use GCP with it
How to use Cloud Monitoring to ensure pipelines work
How to use TensorFlow to ensure pipelines work
How to use Kubeflow to ensure pipelines work
How to use other GCP monitoring tools (Stackdriver, Logging, etc.)
When you want to use a pipeline a certain way
How to deal with low latency
How to deal with slow performing pipelines
Building real-time and batch systems
Training & Testing:

How to avoid training-serving skew
How to use preprocessing functions in TF to help with training and export that to testing and
live predictions
How TFX can help
How to do this across different tools like AI platform, BQML etc
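
The idea behind the preprocessing point above is that the exact same transformation, with the exact same fitted statistics, runs at training time and at prediction time. Here is a minimal plain-Python sketch of that idea; it is a stand-in for illustration, not the TensorFlow or TFX API, and `fit_scaler`, `preprocess` and the age values are hypothetical.

```python
import statistics


def fit_scaler(values):
    """Compute normalisation statistics from the training data only."""
    return {
        "mean": statistics.fmean(values),
        "stdev": statistics.pstdev(values),
    }


def preprocess(value, stats):
    """Scale one raw value using previously fitted statistics.

    Assumes stats["stdev"] is non-zero (the feature is not constant).
    """
    return (value - stats["mean"]) / stats["stdev"]


# Training time: fit the statistics once and save them with the model.
train_ages = [20.0, 30.0, 40.0, 50.0]
stats = fit_scaler(train_ages)
train_features = [preprocess(v, stats) for v in train_ages]

# Serving time: reuse the saved statistics instead of recomputing them
# from live traffic, so the same raw input maps to the same feature.
serving_feature = preprocess(40.0, stats)
print(serving_feature == train_features[2])
```

Skew creeps in when the serving path reimplements this logic separately or recomputes the statistics on different data, so the model sees differently scaled inputs from the ones it was trained on.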
BQML:

What can you do with it
What algorithms exist
When can you use it
Hardware:

What are TPUs and how they work
What are CPUs and how they work
What are GPUs and how they work
When to use each one
How to use TPU v3
How to use NVIDIA GPUs
How to use CPUs
How to deal with lack of memory issues
Business and ML:

How can you use ML to help solve this problem
Typical questions: You run a delivery app company; how can you empower your drivers
using ML? You have data from this source and this source, and drivers typically complain
about X.
You want to increase revenue by suggesting related products; what approach do you suggest?
Evaluating the model performance against business metrics:
How do you devise a testing plan to see model performance
Pricing, Permissions and Privacy/governance:

Responsible AI practices
Costs/benefits of using different GCP products
How to provide permissions to the model
Evaluation:

How to build a process using different GCP tools to ensure your model performance is
always good
How to use AI explanations to help understand model performance
How to use TFX to evaluate model performance
How to use Kubeflow
Hybrid Cloud Models:

How to work with hybrid cloud system
What to do on private cloud vs public
How Kubeflow can help
