MySQL in eBay’s Personalization Platform

Presented by, MySQL & O’Reilly Media, Inc.

Chris Kasten
eBay Kernel Framework Group
April 16, 2008

• Background
• General Vision
• General Requirements
• Why MySQL Memory Engine?
• System Overview
• Results

Fun Facts About eBay
• 110 million items for sale on the site
• $59 billion in gross merchandise value (GMV) per year
• Approx $2,039 worth of goods traded on the site every second
• 276 million registered users
• 2 billion URL requests per day
• 6,000 application servers with 12,000 Java processes
• 40 billion database requests per day
• 300 different databases (over 700 instances)
• 9 PB of data storage
• 13 million lines of source code (in 2008 it will surpass the 16 million lines of the Windows NT 4.0 O/S)

• Further distinguish the eBay shopping experience
• Provide a more relevant and even better user experience
• Provide users with a richer experience with greater continuity
• Provide users with the best selection tailored to their interests/profile
• Provide a better user experience through a real-time personalization data feedback loop that is immediately available
• Provide users with tailored alternatives
• Further distinguish the eBay business value proposition
• Advertising shown to more relevant buyers
• More effective merchandising and marketing of items
• Increase conversion rates through a better buyer experience and greater relevancy of the items presented to the buyer

• eBay needed to expand its real-time personalization capabilities
• eBay needed to be able to associate more data with sessions
• Both personalization and session data were constrained by technology
• Cookie limitations
    - Client-side cookie limit of 4 KB of data
    - Long-term scalability issue of sending all cookie data, whether needed or not
• High cost of traditional server-side solutions using an OLTP database
    - eBay's very large scale quickly multiplies costs into a very large number
    - Throughput of OLTP databases decreases with a high write ratio of approximately 50%
    - The large number of licenses/servers needed for throughput was cost prohibitive
• High cost of other commercial alternatives at eBay's very large scale

• These constraints were limiting business decisions and had to be solved

General Vision
Every Application Server Can Access Data For Every URL Request (All 2 Billion of them!)
• Session Data
• Personalization Data

General Requirements
• Handle 4 billion reads/writes per day
• Support connections and requests from 12,000 Java processes
• High throughput on low-cost hardware
• Scale both horizontally and vertically for 10x future growth
• Scale without operational interruption
• High availability and robustness against operational failures
• Low-latency response times
• Low licensing, support, and total cost of ownership
• Enterprise-class support agreement
• Enterprise-class management and monitoring tools
• Driver for Java
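To put the first two requirements in per-second terms, the short calculation below converts the stated daily volume into average rates. It is only a back-of-the-envelope sketch: it assumes load is spread evenly over 24 hours, which real traffic is not, so peak rates would be higher. The variable names are illustrative.

```python
# Convert the stated daily volume into average per-second rates.
# Assumption: traffic is uniform over the day (real peaks will be higher).

REQUESTS_PER_DAY = 4_000_000_000   # "4 billion reads/writes per day"
JAVA_PROCESSES = 12_000            # "12,000 Java processes"
SECONDS_PER_DAY = 24 * 60 * 60

avg_requests_per_second = REQUESTS_PER_DAY / SECONDS_PER_DAY
avg_per_process = avg_requests_per_second / JAVA_PROCESSES

print(f"average rate: {avg_requests_per_second:,.0f} requests/s")   # about 46,296
print(f"per Java process: {avg_per_process:.2f} requests/s")        # about 3.86
```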

Why MySQL Memory Engine?
• MySQL Memory Engine had the best performance
• Very impressive POC results for MySQL Memory Engine
• Approx 2x more throughput than the nearest competitor (Java driver)
• eBay test case of 50/50 reads/writes from a network client showed approx 13,000 TPS at 50% CPU on a Sun 4100 running Solaris 10 x86 (2 CPU, dual-core Opteron, 16 GB RAM)
• Handled 20,000 concurrent connections with less than 1% degradation in throughput compared with the baseline case (eBay-developed patch)
• Production performance has been consistent with POC results

Why MySQL Memory Engine?
• MySQL Enterprise had a very attractive cost structure
• MySQL's ability to offer enterprise-class support
• MySQL's combined throughput and cost structure provided a low-cost system for the scale of eBay
• Power and flexibility of using SQL for different needs
• A company with a significant track record

Why MySQL Memory Engine?
• The power of open source
• eBay has developed and contributed two enhancements to MySQL
    - Support for an event port based threading and connection handling model for scalable connection handling
    - Support for true variable size columns in MySQL Memory Engine
• Option to be able to apply our talent and create the enhancements we need quickly
• Receive the benefits of innovations of others via open source

Why MySQL Memory Engine?
• The power of an open source company behind the product
• Ability to collaborate with MySQL on enhancements to the product
• Option to request enhancements from a company behind the product
• Out-of-the-box monitoring and administration tools
• Eliminate tying up high-end eBay talent in owning it ourselves
• An enterprise-class open source product
• Enterprise-class support offerings for use in critical systems

eBay Personalization System Overview
[Diagram tiers: Browser, Application Servers, MySQL Memory Engine Cache Tier, Persistent Database]

eBay Personalization System Overview
[Diagram tiers: Application Servers, MySQL Memory Engine Cache Tier, Persistent Database. Arrow labels: Read/Write, Replication, Cache Miss Read, 5 min Batched Write Back.]
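The read path in the diagram can be sketched as a read-through cache: reads and writes go to the cache tier, and only a cache miss touches the persistent database. This is a toy model under stated assumptions, not eBay's implementation. Plain dictionaries stand in for both tiers, and the class name, key format, and the choice to perform the miss lookup on the client side are all hypothetical.

```python
class CacheTierClient:
    """Toy client: serves reads from the cache and falls back on a miss."""

    def __init__(self, cache, persistent_db):
        self.cache = cache                  # stand-in for the Memory Engine tier
        self.persistent_db = persistent_db  # stand-in for the persistent database
        self.misses = 0

    def read(self, key):
        if key in self.cache:
            return self.cache[key]
        # Cache miss: load the row from the persistent database and keep it.
        self.misses += 1
        value = self.persistent_db.get(key)
        if value is not None:
            self.cache[key] = value
        return value

    def write(self, key, value):
        # Writes land in the cache tier only; the database is updated later
        # by the batched write back, which is not modeled here.
        self.cache[key] = value


persistent_db = {"user:42": {"last_search": "camera"}}
client = CacheTierClient(cache={}, persistent_db=persistent_db)

first = client.read("user:42")    # miss: fetched from the persistent database
second = client.read("user:42")   # hit: served from the cache tier
client.write("user:42", {"last_search": "lens"})
```

After these calls only one miss has been counted, and the persistent database still holds the old value until a write back runs.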

eBay Personalization System Overview
• Replication is optional, based on the criticality of losing the past 5 min of data
• Trade-off between data criticality and double the memory cost
• Some personalization data may not be critical enough for the additional hardware cost
• Single-threaded MySQL replication is generally problematic
• Once replication falls behind, it stays behind under continued traffic
• Replication can be achieved via dual writes from the application server, performed transparently by the framework
• Second write to the replica can be asynchronous
• Automatic redistribution of data on node failure or when draining a node
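The dual-write idea above can be illustrated with a small sketch: the caller waits only for the write to the primary node, while the write to the replica is queued and applied by a background thread. Everything here is an assumption made for illustration. The `DualWriter` class, the dictionaries standing in for cache nodes, and the use of a queue are not taken from the eBay framework.

```python
import queue
import threading


class DualWriter:
    """Writes to a primary synchronously and to a replica asynchronously."""

    def __init__(self, primary, replica):
        self.primary = primary
        self.replica = replica
        self._pending = queue.Queue()
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def write(self, key, value):
        self.primary[key] = value          # the caller waits only for this write
        self._pending.put((key, value))    # the replica write happens in the background

    def _drain(self):
        # Background loop: apply queued writes to the replica in order.
        while True:
            key, value = self._pending.get()
            self.replica[key] = value
            self._pending.task_done()

    def flush(self):
        """Block until every queued replica write has been applied."""
        self._pending.join()


primary, replica = {}, {}
writer = DualWriter(primary, replica)
writer.write("session:7", {"cart_items": 3})
writer.flush()   # only needed here so the example can inspect the replica
```

Because each application server issues its own second write, the replica is not fed by a single replication thread, which is the bottleneck the slide describes.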

eBay Personalization System Overview
• Write back to the persistent database is performed by a batch process
• Evictions are performed by a batch process based on target free memory
• Buffering space is set aside in case the persistent database is unavailable
• Special techniques are used to minimize table lock duration during write back and eviction operations
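A minimal sketch of the two batch jobs described above, under loudly stated assumptions: the slides do not say which rows are evicted first, so this example evicts the oldest written rows and never evicts a row that has not yet been written back. The `CacheNode` class, the byte sizes, and the dictionary standing in for the persistent database are all hypothetical.

```python
from collections import OrderedDict


class CacheNode:
    """Toy cache node with batched write back and free-memory-based eviction."""

    def __init__(self, capacity_bytes, target_free_bytes):
        self.capacity_bytes = capacity_bytes
        self.target_free_bytes = target_free_bytes
        self.rows = OrderedDict()   # key -> (value, size_bytes), oldest write first
        self.dirty = set()          # keys changed since the last write back
        self.used_bytes = 0

    def put(self, key, value, size_bytes):
        if key in self.rows:
            self.used_bytes -= self.rows.pop(key)[1]
        self.rows[key] = (value, size_bytes)
        self.used_bytes += size_bytes
        self.dirty.add(key)

    def write_back(self, persistent_db):
        """Copy every dirty row to the database in one batch."""
        for key in self.dirty:
            persistent_db[key] = self.rows[key][0]
        written = len(self.dirty)
        self.dirty.clear()
        return written

    def evict(self):
        """Drop the oldest clean rows until the free-memory target is met."""
        evicted = []
        for key in list(self.rows):
            if self.capacity_bytes - self.used_bytes >= self.target_free_bytes:
                break
            if key in self.dirty:
                continue   # assumption: unwritten data is never evicted
            self.used_bytes -= self.rows.pop(key)[1]
            evicted.append(key)
        return evicted


node = CacheNode(capacity_bytes=100, target_free_bytes=50)
db = {}
for i in range(4):
    node.put(f"user:{i}", {"n": i}, size_bytes=20)

evicted_before = node.evict()      # nothing is clean yet, so nothing is evicted
written = node.write_back(db)      # the batch job flushes all four rows
evicted_after = node.evict()       # the oldest rows go until 50 bytes are free
```

Keeping dirty rows in memory until they are flushed is one way to read the "buffering space" bullet: if the database is unavailable, rows accumulate in the cache instead of being lost.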

• A business-critical system running on MySQL Enterprise for one of the largest-scale websites in the world
• Highly scalable and low-cost system that handles all of eBay's personalization and session data needs
• Ability to handle 4 billion requests per day of 50/50 read/write operations for approximately 40 KB of data per user/session
• Approx 25 Sun 4100s running 100% of eBay's personalization and session data service (2 CPU, dual-core Opteron, 16 GB RAM, Solaris 10 x86)
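The figures above can be combined into a rough per-server picture. This is arithmetic on the numbers in the slides, not a measured result. It assumes an even spread of traffic across servers and across the day, and the memory bound ignores row overhead, replication copies, and the free-memory target, so the real number of resident records would be lower.

```python
# Rough per-server load and memory bound derived from the stated figures.
REQUESTS_PER_DAY = 4_000_000_000   # "4 billion requests per day"
SERVERS = 25                       # "approx 25 Sun 4100s"
POC_TPS_AT_HALF_CPU = 13_000       # POC result: ~13,000 TPS at 50% CPU per server
RAM_PER_SERVER_GB = 16
DATA_PER_USER_KB = 40              # "approximately 40 KB of data per user/session"

# Average requests per second each server would see under an even spread.
avg_tps_per_server = REQUESTS_PER_DAY / 86_400 / SERVERS
share_of_poc_rate = avg_tps_per_server / POC_TPS_AT_HALF_CPU

# Upper bound on 40 KB records that fit if all RAM held nothing but data.
total_ram_kb = SERVERS * RAM_PER_SERVER_GB * 1024 * 1024
max_resident_records = total_ram_kb // DATA_PER_USER_KB

print(f"average load per server: {avg_tps_per_server:,.0f} requests/s")
print(f"share of the POC rate: {share_of_poc_rate:.1%}")
print(f"upper bound on resident records: {max_resident_records:,}")
```

Under these assumptions the average load per server is well below the POC rate, which leaves headroom for peaks and for the 10x growth requirement.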

• Highly manageable system for the entire operational life cycle
• Leveraging MySQL Dashboard as a critical tool in providing insight into system performance, trending, and identifying issues
• Adding new applications to the domain that previously would have been in a different domain because of cookie constraints
• Creating several new business opportunities that would not have been possible without this new low-cost personalization platform
• Leveraging MySQL Memory Engine for other types of caching tiers that are enabling new business opportunities

Q&A
• Thank you for coming!
• Questions?