Professional Documents
Culture Documents
Tecton - The State of Applied ML 2023
Tecton - The State of Applied ML 2023
Stateof
of Applied
Applied
Machine Learning
Machine Learning 2023
2023
AN
AN REPORT
REPORTFROM
FROM
Table of contents
Introduction 3
Key Findings 5
Methodology 11
A Deeper Dive Into Survey Results 12
Respondent demographics 13
Analysis 15
Conclusion 28
Additional Resources 29
2
INTRODUCTION
3
INTRODUCTION
Even though most organizations are just getting started on their journey to applied ML, we Real-time ML models can be powered by batch data only, but can also use streaming
already have an idea of its transformative potential through existing use cases like getting or real-time data to increase the accuracy of predictions by incorporating the freshest
ETA and pricing predictions when calling a rideshare, and obtaining quotes for car and information available. (Read more about real-time ML.)
home insurance.
4
Key Findings
1 2 3
Applied AI/ML 10
models
is a top priority 6
Roughly 50% of survey participants indicated their
organizations have 6 or more models in production,
for organizations models and companies are planning on deploying more models quickly.
Based on survey responses, the projected median for the number of
models in production in the next 12 months will increase from 6 to 10.
How many models in production do you have today? How many do you expect to have in 12 months?
Today In 12 months
Companies are leveraging applied ML for use cases Search and ranking 32.6%
that directly impact revenue: The top 3 use cases are
Risk analytics 33.2%
recommender systems (45.1% of respondents), customer
analytics (43.3%), and personalization (36.4%). Fraud detection 30.2%
37.6%
Deployment Challenges Building production data pipelines
Companies are
Lack of data science resources 19.1%
investing in their
Serving models with enterprise-grade SLAs 17.3%
address challenges How long does it take to Does your company aim to reduce model
90.9% deploy a new model from deployment time within the next 12 months?
idea to production? If so, by how much?
Time to Deployment
Deploying a new model to production
is a long process (>1 month for 65.0% of
respondents and >3 months for 31.7%).
65.0%
54.5 %
of respondents
aim to reduce deployment time
However, 54.5% of respondents shared by at least
25
that their organizations aim to reduce
deployment time by at least 25% over
%
the next 12 months.
31.7% over the next
11.8%
12
months
9.1% 7
< 1 week > 1 week > 1 month > 3 months > 6 months
KEY FINDINGS 2
The MLOps Tech Stack
Only 9.5% of respondents say their organization has a full MLOps stack
today (5 out of 5 components).
59.1%
Companies are
components will see the largest increases (~43 percentage points
increase for both) in adoption in the next 12 months.
MLOps stack to
currently have in your MLOps tech stack?
20.6% 20.3%
How many do you plan to have in the
next 12 months?
13.8%
address challenges
A. Model serving
B. Model registry and versioning 9.8% 10.1% 9.5%
7.8% 8.6% 8.8%
C. Feature store / feature platform
5.6%
D. Model monitoring and
observability
E. Data monitoring 0 1 2 3 4 All 5
59.1% MLOps stack MLOps stack MLOps stack MLOps stack MLOps stack MLOps stack
components components components components components components
Today In 12 Months
Today In 12
Months
81.9% 83.8%
80.5% 80.4%
74.2%
20.6% 20.3%
Today In 12 Months
Model serving Model registry Feature store / Model monitoring Data monitoring
and versioning feature platform and observability
13.8% 8
9.8% 10.1% 9.5%
7.8% 8.6% 8.8%
5.6%
KEY FINDINGS 3
81.2% How many of your production models are making REAL-TIME (i.e., sub-100
milliseconds) predictions TODAY? How many do you anticipate in 12 months?
68.3%
Number of models making Today In 12 Months
real-time predictions
57.0%
68.3% of ML respondents surveyed indicated
their teams already have at least one real-
time (i.e., sub-100 milliseconds) ML model in 44.9%
Real-time machine
production, while 14.7% have more than 10.
36.1%
learning is gaining
27.3%
26.0%
20.0%
20.6% 20.3%
14.7%
real traction
11.2%
8.1%
13.8%
0 1 2 3 4 Al
MLOps stack MLOps stack MLOps stack MLOps stack MLOps stack MLOp
components components components components components comp
34.1%
Which types of data sources does your company use to generate ML features today or plan
Only to adopt
Batch Data In 12 Months
within 12 months? (“Only batch data” results shown)
9
Recommendations
1 2 3
The majority of survey respondents (68.3%) are already Survey participants shared that the biggest roadblocks Business ROI is named as a top 3 roadblock by many survey
using real-time ML, and that number will likely increase to are generating training data and building production participants who are early in their applied ML journey. The
81.2% in 12 months. Evaluate your current use cases and the data pipelines. MLOps solutions have matured a lot over survey found that companies are prioritizing recommender
potential ROI of moving to real-time ML—competitors with the past few years and the right ones will help overcome systems, customer analytics, and personalization use
similar use cases are probably already doing so. these operational challenges. cases.
10
METHODOLOGY
Over 1,700 people in various industries completed the survey. Respondents are mainly
from Europe and North America, representing companies of all sizes, from small
startups to enterprises with more than 100,000 employees. Note that due to the
characteristics and structure of this survey, the results may not accurately represent
ML teams across all industries. Both selection bias and observational bias could be
present in the data. Despite the limitations of the survey, this report is unique in its
focus on teams involved in building, validating, deploying, and monitoring ML models,
and aims to serve as a guide on the adoption of ML across industries.
11
A Deeper Dive Into Survey Results
RESPONDENT DEMOGRAPHICS
Industry Role
Software & Internet 26.2%
Data Science
Financial Services 13.5%
30.8%
Education 8.3%
Gaming 1.5% Note: The first 4 listed industries had 50,001 - 100,000 3.6%
>100 respondents each, and as a result, employees
we provide specific numbers for these Other
Other 1.2% industries throughout the remainder
of the report. Other industries will be 100,000+ 7.2%
2.2%
Hospitality 0.9% employees
labeled as “all others.”
13
RESPONDENT DEMOGRAPHICS
Individual contributor
Individual contributor (Staff level or above) Manager/Director/VP 1-9 51.1%
(Associate to Senior level)
25.9% 24.3% employees
44.0%
C-Suite
5.6%
10-49 32.7%
employees
50-199 10.5%
employees
200+ 5.8%
employees
14
A N A LYS I S Applied ML is a top priority for organizations
Highlights
• Over of 60% survey respondents identified applied ML
as a top 3 company initiative. Among these, 23.8% of Low
respondents identified applied ML as the top company priority
initiative. 11.7% Top priority How important is applied ML to your organization?
23.8%
• The Software & Internet industry places the highest
Medium
emphasis on applied ML, with 29.2% of respondents priority
Split by industry
identifying it as the top priority and 67.0% naming it as a (Top 10 Initiative) Software & Internet
top 3 priority. 28.2%
Financial Services
High priority
(Top 3 Initiative) Education
36.3%
Healthcare
37.8% 37.9%
35.9%
35.3%
31.6%
29.2%
28.1% 27.6% 28.7%
25.9%
24.1%
23.3%
18.8% 18.4%
14.3%
11.3%
9.7% 9.7%
8.6%
Yes No
1-250 employees 45.3% 54.7%
16
A N A LYS I S Applied ML is a top priority for organizations
Highlights: ML Models in Production Median number of models
in production
How many ML models do you have in production today?
0 5.0%
• As expected, company size correlates with the number of
1-2 8.4%
models in production today— i.e., the largest companies
3-5 14.1%
have the most models in production (median: 12 models),
6-10 26.1%
while the smallest ones have the fewest number of models
11-20 20.0%
in production (median: 4 models)
Today In 12 months 21+ 26.4%
production for the smallest teams (<10 team members) Split by industry Split by company size Split by data team size
Today In 12
25 Today In 12
25
Months Months
21 21
16 16 16
12 12 12 12 12 12
10 1010 10 10
9
8 8 8 8 8
6 6 6 6 6
4 4 4 4 4
• The top 3 use cases are recommender systems (45.1% Recommender systems 45.1%
of respondents), customer analytics (43.3%), and
personalization (36.4%) Customer analytics 43.3%
Personalization 36.4%
Other 16.1%
18
A N A LYS I S Real-Time ML
81.2%
Today In 12 Months
27.3%
Real-time ML models can be powered by batch data only,
but can also use streaming or real-time data to increase 20.0%
14.7%
the accuracy of predictions by incorporating the freshest 11.2%
8.1%
information available. (Read more about real-time ML.)
1+ 3+ 6+ 11+ 21+
Highlights
26.0%
Models Models Models Models Models
20.6%68.3% of
• Real-time ML already has significant adoption: 20.3%
How many of your models in production are making real-time (sub-100 milliseconds) predictions today? Split by industry
30%
20%
10%
1
This number is one of the biggest surprises uncovered in this survey, and it’s
a big step forward compared to where the industry was even just a year
ago. Keep in mind, however, that this number reflects mainly early real-time 0 1+ 3+ 6+ 11+ 21+
experiments on the path to full ML adoption rather than a full-blown real-time Models Models Models Models Models Models
ML deployment. Even so, 68.3% is a very promising number, and even more so
when looking at 12-month out prospects. 19
A N A LYS I S Real-Time ML
Do you have a central ML platform team responsible for building and maintaining your ML infrastructure?
Highlights
• The existence of a central ML platform team is highly Those with at least one real-time ML model in production today
Today 34.1%
In 12 months 11.5%
Split by industry
32.3%
Software & Internet
10.9%
34.1%
Financial Services
10.6%
44.7%
Education
13.2%
42.1%
Healthcare
11.2%
31.5%
All Others
11.4%
20
A N A LYS I S Challenges on the road to applied ML
Highlights What are the 3 biggest challenges encountered when deploying new models to production?
26.1% 25.7%
• Demonstrating business ROI (34.3%) 23.5%
19.1% 18.0% 17.3% 17.2%
2.3%
Generating accurate Demonstrating Controlling Lack of data science Serving models with Other
training data business ROI infrastructure costs resources enterprise-grade SLAs
Building production Collaboration between Integrating multiple Governance, security, Lack of engineering
data pipelines data science and tools into a coherent and auditability resources
engineering teams stack
21
A N A LYS I S Challenges on the road to applied ML
Highlights What are the 3 biggest challenges encountered when deploying new models to production? Split by industry
0%
Generating accurate Demonstrating Controlling Lack of data science Serving models with
training data business ROI infrastructure costs resources enterprise-grade SLAs
Collaboration between
Building production data science and Integrating multiple tools Governance, security, Lack of engineering
data pipelines engineering teams into a coherent stack and auditability resources
40%
30%
20%
10%
0%
Split by number of models in production 0 Models 1-2 Models 3-5 Models 6+ Models
22
A N A LYS I S Challenges on the road to applied ML
• Respondents who shared that their companies have What are the 3 biggest challenges encountered when deploying new models to production? Split by batch vs. real-time
respectively) 30%
month for 65.0% of respondents and >3 months for 31.7%) Generating accurate Demonstrating Controlling Lack of data science Serving models with
training data business ROI infrastructure costs resources enterprise-grade SLAs
• Companies in the Software & Internet industry deploy Building production Collaboration between Integrating multiple Governance, security, Lack of engineering
data pipelines data science and tools into a coherent and auditability resources
models to production faster than average. engineering teams stack
How long does it take to deploy a new model from idea to production? How long does it take to deploy a new model from idea to production? Split by industry
90%
Software & Internet
Financial Services
90.9%
Education
70%
Healthcare
All Others
65.0% 50%
30%
31.7%
10%
9.1% 11.9%
< 1 week > 1 week > 1 month > 3 months > 6 months < 1 week > 1 week > 1 month > 3 months > 6 months 23
A N A LYS I S Challenges on the road to applied ML
Highlights Does your company aim to reduce model deployment time within the next 12 months? If so, by how much?
Yes by 25%
or more
54.6%
Yes by 50%
or more
29.9%
Yes by 75%
or more
8.9%
Yes by 90%
or more
3.6%
Does your company aim to reduce model deployment time within the next 12 months? If so, by how much? Split by industry
Software & Internet
Financial Services
Education
No Healthcare
All Others
Yes by 10%
or more
Yes by 25%
or more
Yes by 50%
or more
Yes by 10%
0%
or more 10% 20% 30% 40% 50% 60% 70% 80%
24
Yes by 25%
or more
A N A LYS I S MLOps Stack
Using today
• Companies are investing heavily in machine learning, and MLOps stack components
A. Model serving
a number of them are planning to shore up their MLOps
B. Model registry and versioning
stack to solve for these challenges. 26.0%
C. Feature store / feature platform
20.3%
0 1 5.6% 20.6%2 3 4 All 5 D. Model monitoring and observability
- 3
9.6% of survey participants shared that their companies MLOps stack 13.8% stack
MLOps MLOps stack MLOps stack MLOps stack MLOps
7.8% 8.6% 8.8% 9.8% 10.1% 9.5% stack E. Data monitoring
components components components components components components
have adopted 3 or more components out of the 5 major
0 1 2 3 4 All 5
ones—and this number is expected to almost double (to MLOps stack MLOps stack MLOps stack MLOps stack MLOps stack MLOps stack Using today
Companies with only
78.0%) by the end of 2023 components components components components components components batch models in production
59.1% (at least 1) Will be using
in 12 Months
- Only 9.5% of participants shared that their companies Companies with real-time
models in production
have adopted all 5 major components at the time of the How many components are in your MLOps stack? Split by batch/real-time
0 1 2 3 4 All 5 (at least 1)
survey, but 59.1% of participants expect to adopt all 5 in MLOps stack MLOps stack MLOps stack MLOps stack MLOps stack MLOps stack
components components components components components components
the next 12 months 26.0%
20.6% 20.3% Companies with only
• Companies with real-time ML models in production are 5.6% batch models in production
13.8%
7.8% 8.6% 8.8% 9.8% 10.1% 9.5% (at least 1)
more advanced in terms of MLOps adoption
0 1 2 3 4 All 5 Companies with real-time
models in production
MLOps stack MLOps stack MLOps stack MLOps stack MLOps stack MLOps stack
(at least 1)
components components components components components components
0 1 2 3 4 All 5
MLOps stack MLOps stack MLOps stack MLOps stack MLOps stack MLOps stack
components components components components components components
69.7%
How many components of the MLOps stack will you be using in the next 12 months? Split by batch/real-time
0 1 2 3 4 All 5
MLOps stack MLOps stack MLOps stack MLOps stack MLOps stack MLOps stack
0
components 1
components 2
components 3
components 4
components 43.6% All 5
components
MLOps stack MLOps stack MLOps stack MLOps stack MLOps stack MLOps stack
Companies with only
components components components components components componentsbatch models in production
(at least 1)
Companies with only
Companies with real-time
batch models in production
models in production
(at least 1)
16.1% (at least 1)
3.6% 12.5% 11.0% Companies with real-time
9.0% 2.4% 10.2%
7.8% 6.3% 7.9% models in production
(at least 1)
0 1 2 3 4 All 5
MLOps stack MLOps stack MLOps stack MLOps stack MLOps stack MLOps stack
components components components components components components 25
A N A LYS I S MLOps Stack
Highlights
Which components of the MLOps stack does your company use today or plan to adopt within 12 months?
26.0%
Model serving Model registry Feature store / Model monitoring Data monitoring
and versioning feature platform and observability
20.6% 20.3%
13.8%
26
A N A LYS I S MLOps Stack
Which enterprise data platforms does your company use? What is your company using for model serving?
Other 1.9%
None 17.2%
27
CONCLUSION
As companies continue to invest in machine learning and work to overcome the challenges, the
future of applied ML is shaping up to benefit both the builders behind the scenes and the end users
of those applications. Organizations are expected to increase the number of models that make
batch-driven and real-time-driven predictions to make better decisions and provide improved
customer experiences. As a result, companies who move faster toward real-time ML use cases
stand to gain a competitive advantage in the market.
By sharing the insights from this survey, we hope to inform and guide data teams on the evolving
landscape of applied machine learning and contribute to its growth and success.
28
ADDITIONAL RESOURCES
Whitepaper: The Buyer’s Guide to Evaluating Feature Stores & Feature Platforms
Video: Designing & Scaling FanDuel’s ML Platform: Best Practices & Lessons Learned
Video: How to Make the Jump From Batch to Real-Time Machine Learning
29
Accelerate your journey to applied machine learning. Request a demo today.
tecton.ai
Tecton is a feature platform for production machine learning (ML) that was founded by the creators of Uber’s Michelangelo ML platform. The
platform is fully managed and helps data teams accelerate the iteration and deployment of real-time ML models while maximizing their
accuracy and reliability. Tecton simplifies feature management and optimizes the cost of running models, enabling data teams to avoid
expensive infrastructure costs. Tecton’s platform streamlines workflows and allows teams to focus on developing better ML models, resulting in
improved business outcomes. Tecton’s customers include Fortune 500 companies and innovative firms like Convoy, HelloFresh, Plaid, and Tide.