Building A Python Web Service With Ray

Philipp Moritz
September 30, 2020
What this talk is about
Show design patterns for building Python web services with Ray using Ray tasks and actors
Show how we are building Anyscale as a Ray application
Show how to address practical challenges like type checking, testing, tracing, monitoring and deployment
Requirements for a web service
Needs to be available 24/7
Needs to be scalable according to user demand
Needs to integrate external Python libraries and frameworks, e.g. web serving frameworks or machine learning libraries
Traditional Python web service architecture
[Diagram: three layers. Web logic: Flask server, aiohttp server, fastAPI server. Business logic: Celery, Redis Queue, Multiprocessing (Service 1, Service 2, Service 3). Data: Database, Redis, Blob store.]
Challenges: Programming, scaling, monitoring, tracing, fault tolerance
Ray web service architecture
[Diagram: the same three layers. Web logic: Flask server, aiohttp server, fastAPI server. Business logic: Ray tasks and actors. Data: Database, Blob store.]
Advantages of Ray:
● Unified programming model
● Automatic scheduling, resource management
● Autoscaling
● Built-in facilities for monitoring
● Great support for ML

Reminder: The Anyscale Platform
1. Laptop experience with the power of a cluster

2. Serverless experience without serverless limitations
3. Real-time collaboration
Architecture of Anyscale
[Diagram: fastAPI server (web logic); Service 1 and Service 2, made up of Ray tasks and actors (business logic); Database (data).]

Scaling up with Ray tasks
[Diagram: fastAPI server (web logic); Sessions and Websockets (business logic); Database (data). Endpoints shown: /api/v2/session/start and /api/v2/session/1/execute, with Session 1 and Session 2. Each further execute request adds a Ray task.]
Managing state with Ray actors
[Diagram: fastAPI server, a Sessions actor and a Notifications actor. Labels shown: /api/v2/session/start, Task, Update, Session.]
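The idea of keeping session state in one place can be sketched without Ray. This is a minimal asyncio stand-in, not Ray code: `SessionActor`, its methods and the status strings are hypothetical names chosen for illustration. The lock plays the role that a regular (non-async) Ray actor gets for free by processing one method call at a time.

```python
import asyncio
from typing import Dict


class SessionActor:
    # Hypothetical stand-in for a Ray actor: one owner for the session state.
    def __init__(self) -> None:
        self._status: Dict[int, str] = {}
        self._lock = asyncio.Lock()

    async def update(self, session_id: int, status: str) -> None:
        # The lock serializes updates so callers never see partial state.
        async with self._lock:
            self._status[session_id] = status

    async def get(self, session_id: int) -> str:
        async with self._lock:
            return self._status.get(session_id, "unknown")


async def main() -> str:
    actor = SessionActor()
    # Two concurrent updates, then a follow-up update to session 1.
    await asyncio.gather(actor.update(1, "starting"), actor.update(2, "starting"))
    await actor.update(1, "running")
    return await actor.get(1)


final_status = asyncio.run(main())
```

With a real Ray actor the class would be decorated with `@ray.remote` and called through `.remote()`, as the later slides show.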
Writing an API server with fastAPI
● Makes it easy to define a REST API

● Schema validation
● Typing
@router.get("/{command_id}/execution_logs")
async def get_execution_logs(
    command_id: int, ...) -> Response[LogOutput]:
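The return annotation `Response[LogOutput]` suggests a generic response envelope. The talk does not show its definition, so the following is only a sketch using the standard library: the `result` field, the `lines` field and the handler body are assumptions, and fastAPI itself is left out.

```python
from dataclasses import dataclass
from typing import Generic, List, TypeVar

T = TypeVar("T")


@dataclass
class LogOutput:
    # Hypothetical fields; the talk does not show the real schema.
    lines: List[str]


@dataclass
class Response(Generic[T]):
    # Generic envelope so each endpoint can declare its payload type.
    result: T


def get_execution_logs(command_id: int) -> Response[LogOutput]:
    # Stand-in body: a real handler would look the logs up by command_id.
    return Response(result=LogOutput(lines=[f"log for command {command_id}"]))
```

A type checker can then verify that callers treat `get_execution_logs(7).result` as a `LogOutput`.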
Ray asyncio support
Ray object references are awaitable!
async def get_execution_logs(session_record, session_command_id, log_params):
    # .remote() returns an object reference; awaiting it yields the result
    log = await session_tasks_service.get_execution_log.remote(
        session_record["id"],
        session_command_id,
        log_params
    )
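The same await-a-future pattern can be tried without a Ray cluster. The sketch below is not Ray code: it uses asyncio with a thread pool as a stand-in, and `get_execution_log` and `fetch_logs` are hypothetical helpers. Each submission returns an awaitable right away, which is the property the slide highlights for object references.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor
from typing import List


def get_execution_log(session_id: int, command_id: int) -> str:
    # Hypothetical blocking work standing in for a remote call.
    return f"session {session_id}, command {command_id}: done"


async def fetch_logs(session_id: int, command_ids: List[int]) -> List[str]:
    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor(max_workers=4) as pool:
        # Each submission returns an awaitable future immediately.
        futures = [
            loop.run_in_executor(pool, get_execution_log, session_id, cid)
            for cid in command_ids
        ]
        # Awaiting suspends this coroutine but leaves the event loop free.
        return list(await asyncio.gather(*futures))


logs = asyncio.run(fetch_logs(1, [10, 11, 12]))
```

Because the handler only awaits, the web server can keep serving other requests while the work runs elsewhere.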
from typing import Any, Dict

import ray
import socketio

@ray.remote
class WebSocketActor:
    def __init__(self) -> None:
        self.sio = socketio.AsyncServer()

    async def emit(self, message_name: str,
                   data: Dict[str, Any]) -> None:
        await self.sio.emit(message_name, data)
Ray actors can also be
Typing
[Diagram: fastAPI server (web logic); Service 1 and Service 2, made up of Ray tasks and actors (business logic).]
Frontend (TypeScript):
executeCommand({
  sessionId,
  options: {
    command: input
  }
})
Web logic (Python):
async def execute_command(
    session_id: int, options: Options):
    command_record = db.create_command(
        session_id, options)
    execute_command.remote(
        command_record)
Business logic (Python):
@ray.remote
def execute_command(
    command: CommandRecord):
    runner = AnyscaleSessionRunner()
    runner.execute_command(command)

Testing
Unit testing: Use Ray's local mode:
    ray.init(local_mode=True)
Everything runs in a single process, so interfaces can be mocked out
Integration testing: Use a Ray instance running on the laptop/CI server, testing web logic, business logic and database
End-to-end testing: Test full functionality in staging environment
Stress testing: Test scalability limits of the system
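The point about mocking in a single process can be illustrated with the standard library alone. This sketch does not start Ray; it only shows the mocking half of a unit test. `SessionRunner` is a hypothetical stand-in for a class like `AnyscaleSessionRunner`, and the `"ok: "` prefix is invented for the example.

```python
from unittest import mock


class SessionRunner:
    # Hypothetical stand-in for a class like AnyscaleSessionRunner.
    def execute_command(self, command: str) -> str:
        raise RuntimeError("would contact a real cluster")


def execute_command(command: str, runner: SessionRunner) -> str:
    # Business logic under test: delegate to the runner and tag the result.
    return "ok: " + runner.execute_command(command)


def test_execute_command() -> str:
    # Replace the runner with a mock so no cluster is needed.
    fake_runner = mock.create_autospec(SessionRunner, instance=True)
    fake_runner.execute_command.return_value = "hello"
    result = execute_command("echo hello", fake_runner)
    fake_runner.execute_command.assert_called_once_with("echo hello")
    return result


outcome = test_execute_command()
```

Mocks like this only work when the code under test runs in the same process as the test, which is what local mode provides.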
Metrics and Monitoring
Use Ray’s built-in metrics API:

from ray.experimental import metrics
self.create_cluster_stats = metrics.Histogram(
    "Anyscale_create_cluster", "Num of seconds took to create cluster",
    "second",
    [float(i) for i in range(10, 300, 10)],
    ["step"],
)
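To see what the list comprehension in the Histogram call produces, the sketch below evaluates it and, assuming the values act as bucket boundaries in seconds, shows where a sample would land. The `bucket_index` helper and the choice of `bisect_left` are assumptions for illustration; how Ray treats a value that equals a boundary is not stated in the talk.

```python
from bisect import bisect_left

# Same expression as in the Histogram call above.
boundaries = [float(i) for i in range(10, 300, 10)]


def bucket_index(seconds: float) -> int:
    # Index of the first boundary >= seconds; len(boundaries) means overflow.
    return bisect_left(boundaries, seconds)


first, last, count = boundaries[0], boundaries[-1], len(boundaries)
```

The boundaries run from 10.0 to 290.0 in steps of 10, so cluster creations slower than 290 seconds all share the final overflow bucket.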
Tracing
We use OpenTelemetry for tracing
It can generate detailed traces automatically for a number of Python libraries, including database clients and web frameworks.
Full automatic tracing for Ray tasks and actors
requirements.txt:
    opentelemetry-api
    opentelemetry-sdk
    opentelemetry-ext-asgi
    opentelemetry-ext-asyncpg
    opentelemetry-ext-botocore
    opentelemetry-instrumentation-starlette
Can also add custom traces:
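The slide's own custom-trace example did not survive extraction. As a rough illustration of what a custom span records (a name, its parent and a duration), here is a hand-rolled sketch using only the standard library. It is not the OpenTelemetry API: `span`, `finished_spans` and the span names are hypothetical.

```python
import time
from contextlib import contextmanager
from typing import Dict, Iterator, List

# Hypothetical in-memory sink; a real tracer would export spans elsewhere.
finished_spans: List[Dict[str, object]] = []
_stack: List[str] = []


@contextmanager
def span(name: str) -> Iterator[None]:
    # Record the enclosing span as parent, then time the wrapped block.
    parent = _stack[-1] if _stack else None
    _stack.append(name)
    start = time.perf_counter()
    try:
        yield
    finally:
        _stack.pop()
        finished_spans.append(
            {"name": name, "parent": parent,
             "seconds": time.perf_counter() - start}
        )


with span("create_cluster"):
    with span("start_nodes"):
        pass
```

Inner spans finish first, so `start_nodes` is recorded before `create_cluster` and carries it as its parent.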
Deployment
The cloud environment for our web service is set up with Terraform, to make the setup easily reproducible for
● Development
● Staging
● Production
The web service is deployed on Docker and Kubernetes, which integrate well with Ray
Summary
We showed how the Python web serving ecosystem integrates with Ray
We showed how Ray makes it easy to scale up your web services and manage their state
We showed how to type, test, monitor and deploy your web service with Ray
Thanks to the Team @ Anyscale
