Francisco Quintana
Data Scientist
In August 2020, Google launched a BETA version of a new addition to their GCP certification
library: Professional Machine Learning Engineer. This Certification spans a broad range of
machine learning topics including design, implementation, deep-learning frameworks and
documentation.
I always like to challenge myself with Google Cloud, so this BETA certification was too good
an opportunity to pass up. Plus I want to be considered an expert in Machine Learning with
Google Cloud Platform, so it was time to put that into practice.
Being one of the first to pass the exam in October, along with my colleague Di Wu, I wanted
to share my thoughts on this certification and how you should prepare. So here are a few
takeaways from my experience.
Firstly, for the uninitiated, what does a Machine Learning Engineer do?
We build and optimize machine learning models, primarily with the purpose of taking a
product from first trial size to end user. The initial model comes from a Data Scientist, who
is more of a jack of all trades and will build the model and ensure it actually works.
Although the roles are similar, there’s a key difference between them. Data Scientists don’t look at
productization or building a complete solution; their role stops with the model. The Machine
Learning Engineer takes extra steps to improve the model, refining and updating it and
taking into account all product and performance recommendations so it’s up to scratch.
To pass, you need sound software engineering principles, a strong knowledge of the machine
learning capabilities of the GCP stack, and a grasp of machine learning and GCP theory, which other
Google certifications haven’t really included so far. The Machine Learning certification is
valid for 2 years and definitely worth having to stand out in the field.
You can view the full list of topics I encountered at the end of this article.
I’d also recommend the Professional Data Engineer exam, although it’s not absolutely
necessary. If you’ve also done the Cloud Engineer certification, you’ll need to study less as
you already know about data storage, etc. If you can, get training on the Machine Learning
certification and on GCP.
Lastly, the software development side can be difficult - in particular, good design principles
and developing continuous pipelines. So brush up on those bits!
If I were sitting this exam again, I’d study the documentation more.
They’ll expect you to be up to date with the latest Google technology and terminology. Ditto
the TensorFlow data structures and concepts. I’d also pay closer attention to software design
principles.
I only had two weeks to study, so more time building solutions would’ve been helpful. I’ll say
it again: if you can get some GCP training, that’ll help for sure.
Good luck!
Languages: English
Take the online-proctored exam from a remote location (review the online testing
requirements here).
Take the onsite-proctored exam at a testing center (locate a test center near you).
Prerequisites: None
TensorFlow Records
Tensors
TFX (TensorFlow Extended)
TPUs
When to use them
How to improve them
What are estimators
How to deal with common errors
Data API
Kubeflow topics:
How it works
How to store models
Uploading data to it
Notebooks
Monitoring
How you can use it with R
How it can be used to work locally and on the cloud
Improve performance
Training on cluster
What continuous ingestion is and how you can use GCP with it
What continuous delivery is and how you can use GCP with it
How to use Cloud Monitoring to ensure pipelines work
How to use TensorFlow to ensure pipelines work
How to use Kubeflow to ensure pipelines work
How to use other GCP monitoring tools (Stackdriver, Logging, etc.)
When you want to use a pipeline a certain way
How to deal with low latency
How to deal with slow performing pipelines
Building real-time and batch systems
Training & Testing:
Responsible AI practices
Costs/benefits of using different GCP products
How to provide permissions to the model
Evaluation:
How to build a process using different GCP tools to ensure your model performance is
always good
How to use AI explanations to help understand model performance
How to use TFX to evaluate model performance
How to use Kubeflow
Hybrid Cloud Models: