Running A Megasite On Microsoft Technologies
Casey Jacobs, Director of Engineering
Chris St.Amand, Sr. System Engineer
Aber Whitcomb, CTO
Jim Benedetto, VP of Technology

Introduction
– Quick Facts
– Growing Up
Upcoming Technology Enablers
Open Panel Discussion


Brief History of Microsoft.com
Microsoft launches
Information & support publishing; hosting
Microsoft combines Web platform, ops, and content teams
Standardization effort begins, consolidation hosted systems
Focus on MSCOM Network
Programming and campaign-to-Web integration
Single MSCOM group formed
Brand, content, site std’s, Privacy, brand compliance
Enable an innovative customer experience online & in-product
Product Info, Support, Dev / ITPro Experience, Customer Intelligence, Profile Mgmt & Enterprise Downloads

30k users / day

4M UUsers / day

6.5M UUsers / day

17.1M UUsers / day




Quick Facts
Infrastructure and Application Footprint
5 Internet Data Centers & 3 CDN Partnerships
110 Web Sites, 1000s of Apps and 2138 Databases
80+ Gigabit/sec Bandwidth
Solutions at High Scale
13M UUsers/Day & 70M Page Views/Day
10K Req/Sec, 300K Concurrent Connections on 80 Servers
350 Vroots, 190 IIS Web Apps & 12 App Pools

Microsoft Update
250M UScans/Day, 12K ASP.NET Req/Sec, 1.1M Concurrent
28.2 Billion Downloads for CY 2005
Egress – MS, Akamai & Savvis (30-80+ Gbit/Sec)

MySpace Company Overview
Launched Sept. 2003
Latest as of February 2006
64+ MM Registered Users
38 MM UUsers & 2.3M Concurrent
260K New Registered Users/Day
23 Billion Page* Views/Month

Internet Rank   Site      Pageviews in ‘000s
#1              Yahoo     29,508
#2              MySpace   23,566
#3              MSN       14,695
#4              Ebay       9,632
#5              Google     7,329
#6              Hotmail    6,812

Media Metrix February 2006 Audience Rankings

50.2% Female / 49.8% Male
Primary Age Demo: 14-34

Site Trends
260K New Users/Day
430M Total Images
Millions of Songs Streamed/Day
1000s of New MP3s/Day
20 Million Comments Posted
Source: comScore Media Metrix, February 2006
Quick Facts
Infrastructure and Application Footprint
3 Internet Data Centers
Server Breakdown
2682 Web and 650 Database Servers
90 Cache Servers, 16 GB RAM
650 Dart servers
60 DB Servers
150 Media servers

3000 disks in SAN architecture
Egress Management
17,000 Mb/s bandwidth
15,000 Mb/s on CDN
Growing up in the Internet World

0 users
The beginning
Two-tiered architecture
Single database
Load-balanced web servers

Great for rapid development
Less complexity means faster time to market and lower operational costs
Works for small to medium-sized websites, not big ones

500k Users
A single database is not enough
Max out a single database
Split reads and writes across separate databases
Use transactional replication so multiple databases can service reads
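
The read/write split described above can be sketched roughly as follows. This is only an illustration, not the routing code used in production: the class name, the server names, and the rule that only SELECT statements count as reads are all assumptions made for the example.

```python
import itertools


class ReadWriteRouter:
    """Send writes to one primary and spread reads over replicated copies."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # Fall back to the primary when no read copies are configured.
        self._read_cycle = itertools.cycle(replicas or [primary])

    def target_for(self, sql):
        # Simplifying assumption: only plain SELECT statements are reads.
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._read_cycle) if is_read else self.primary


# Hypothetical server names, for illustration only.
router = ReadWriteRouter("db-primary", ["db-read-1", "db-read-2"])
print(router.target_for("SELECT * FROM users WHERE id = 1"))
print(router.target_for("UPDATE users SET name = 'x' WHERE id = 1"))
```

Reads rotate across the replicated copies while every write lands on the single primary, which is why the write server eventually becomes the next bottleneck.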


1 Million
Vertical partitioning
Transactional replication doesn’t work for all workloads and data types
Use a combination of vertical partitioning and replication
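
As a rough sketch of combining vertical partitioning with replication: each feature is mapped to its own database server, and read-heavy features can additionally use replicated read copies. The feature names, server names, and the modulo choice of replica are hypothetical and chosen only for illustration.

```python
# Hypothetical feature-to-server map; names are illustrative only.
FEATURE_DATABASES = {
    "profiles": "sql-profiles",
    "mail": "sql-mail",
    "blogs": "sql-blogs",
}

# Features whose read traffic is also served by replicated copies.
READ_REPLICAS = {
    "profiles": ["sql-profiles-r1", "sql-profiles-r2"],
}


def database_for(feature, user_id, read=False):
    """Pick the server for a feature; reads may go to a replica."""
    if read and feature in READ_REPLICAS:
        replicas = READ_REPLICAS[feature]
        return replicas[user_id % len(replicas)]
    return FEATURE_DATABASES[feature]


print(database_for("mail", 7))                  # sql-mail
print(database_for("profiles", 3, read=True))   # sql-profiles-r2
```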


2 Million
Start to reconsider SCSI arrays for the long term
SCSI arrays have good performance but reliability issues
SANs provide better performance, uptime, and redundancy
Move to a CLARiiON and enjoy these benefits

3 Million
Horizontal partitioning
Vertical partitions see performance problems
Decide we need to re-architect the database
Horizontal partitioning is the answer but is difficult to do while in production

Horizontal Partitioning
All features reside on a single database server
Data is partitioned by user ID
Some data cannot be partitioned, especially on a social networking site
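
A minimal sketch of partitioning by user ID, assuming each database server owns a contiguous range of IDs. The range boundaries and server names below are hypothetical; the real range sizes are not given in the slides.

```python
import bisect

# Hypothetical layout: RANGE_STARTS[i] is the first user ID on SERVERS[i].
RANGE_STARTS = [1, 1_000_001, 2_000_001]
SERVERS = ["sql-users-01", "sql-users-02", "sql-users-03"]


def server_for_user(user_id):
    """Return the server that holds every feature's data for this user."""
    if user_id < RANGE_STARTS[0]:
        raise ValueError("user_id is below the first partition range")
    # bisect_right finds the first range start greater than user_id;
    # the range that owns the user is the one just before it.
    index = bisect.bisect_right(RANGE_STARTS, user_id) - 1
    return SERVERS[index]


print(server_for_user(42))         # sql-users-01
print(server_for_user(1_000_001))  # sql-users-02
```

Adding capacity under this scheme means appending a new range start and server, which leaves existing users where they are.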


5 Million
Network bottlenecks
Various areas of the network become saturated
Gig uplinks are maxed out
Switch to an autonomous network and BGP
Get multiple gig links and 10G links

Load balancer is maxed out
“Must load balance the load balancers”
Use DNS
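
One common way to use DNS for this is round-robin: the name resolves to several load balancer addresses, and the order of the answers is rotated on each lookup. The slides do not say which DNS technique was used, so the sketch below is an assumption; the class name is hypothetical and the addresses come from the reserved documentation range 192.0.2.0/24.

```python
from collections import deque


class RoundRobinRecordSet:
    """Rotate the order of addresses returned for each lookup."""

    def __init__(self, addresses):
        self._addresses = deque(addresses)

    def answer(self):
        # Return the current ordering, then rotate so the next lookup
        # sees a different address first.
        result = list(self._addresses)
        self._addresses.rotate(-1)
        return result


# Hypothetical load balancer virtual IPs.
records = RoundRobinRecordSet(["192.0.2.10", "192.0.2.11", "192.0.2.12"])
for _ in range(3):
    print(records.answer()[0])
```

Clients that take the first address in each answer end up spread roughly evenly across the load balancers.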

7 Million
Site dependencies
Separating features on the front end isolates potential bottlenecks
Using subdomains is the easiest way


10 Million
Scalable storage
Trying to partition storage on the backend is time-consuming and inefficient
Maxing out SANs is very costly
We realize scalable storage is key

15 Million

DB’s versus Caching
Databases still having perf issues
Databases are expensive
Have a lot of transactional overhead

Caching tier
High-speed cache is perfect for reads
LRU algorithm is self-managing
Drastically reduces database load
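
A minimal sketch of a read-through cache with LRU eviction, to show why it is self-managing and how it takes reads off the database. The class and the `load_from_db` callback are hypothetical stand-ins; the slides do not describe the actual cache tier's implementation.

```python
from collections import OrderedDict


class LRUCache:
    """Read-through cache that evicts the least recently used entry."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key, load_from_db):
        if key in self._items:
            self._items.move_to_end(key)   # mark as most recently used
            return self._items[key]
        value = load_from_db(key)          # cache miss: one database read
        self._items[key] = value
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict least recently used
        return value


db_reads = []


def load_profile(user_id):
    # Stand-in for a real database query.
    db_reads.append(user_id)
    return {"id": user_id}


cache = LRUCache(capacity=2)
for user_id in [1, 1, 1, 2, 1]:
    cache.get(user_id, load_profile)
print(len(db_reads))  # 2 database reads for 5 requests
```

No explicit expiry or cleanup job is needed: entries that stop being requested drift to the front of the ordering and are evicted when space runs out.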

Where we are today

Upcoming Technology Enablers
What’s Next for Microsoft.com and MySpace

Scaling SQL Server
V1: Single Instance – < 1 Million Users
Single SQL Server Instance Supports All Users and Features

V2: Single Instance Replicating to Read-Only Full Copies - < 2 Million Users
Single server handles all write transactions, read transactions spread across multiple transactional replication copies

V3: Vertical Partitioning - < 4 Million Users
Each Feature/Page of the site on its own SQL Server

V4: Horizontal Partitioning - < 8 Million Users
All features/pages brought back to single database schema
Standard schema across all databases
User ranges partitioned across databases

V5: Horizontally Partitioned Core with Replicated Content, Vertically Partitioned Features Databases, “Shared Content” Databases - > 8 Million Users
Primary MySpace schema exists across large farm of servers
Small amounts of content replicated to all horizontally partitioned servers to allow for features spanning all user ranges

V6: Migration to SQL Server 2005 - >26 Million Users

SQL Server 2005
64 bit
Memory pressure under the 4 GB 32-bit limit
Servers loaded with 32 GB of RAM
<4 GB addressable to the memory pools we were stressing

Connection timeouts
Servers going “dark”, requiring restart
Rejected connections

Problem eliminated on 64-bit architecture
Connection/sort memory pools now able to address all 32 GB of RAM
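
The 4 GB figure follows directly from pointer width. The short calculation below is only arithmetic on the numbers in the slide (32 GB installed, 32-bit versus 64-bit addressing); the function name is made up for the example.

```python
GIB = 2 ** 30  # bytes in one gibibyte


def addressable_bytes(pointer_bits):
    """Bytes reachable with a flat pointer of the given width."""
    return 2 ** pointer_bits


installed_ram = 32 * GIB
limit_32 = addressable_bytes(32)

print(limit_32 // GIB)                          # 4
print(installed_ram // limit_32)                # 8
print(addressable_bytes(64) >= installed_ram)   # True
```

A 32-bit address space covers one eighth of the installed RAM, so pools confined to it run out of room long before the server does.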

Virtualizing Storage
What is it?
Software layer between your disks & hosts

Provisioning is very simple, makes capacity planning more predictable
Much better performance
Can easily add more capacity to a LUN

What do we use?
3PAR
14-week bake-off

IIS 7.0 Failed Request Tracing

Geo-Targeting Solutions
Demographic management
US Users (NYC, LA, DC)
Taiwanese Users
Polish Users
All other users

Broadband Users

Narrowband Users

Akamai Edgesuite
Policy: Suppress WGA Release

Policy: Release WGA at 8% per day

Policy: Release WGA at 2% per day

Policy: Release WGA at 5% per day

Easy to reach – regulate as needed

Hard to reach – NEVER regulate

Objective – Enable Targeted Release of Apps and Content
Avoid demographic support spikes and further align to marketing campaigns

Sensitivity to time/frequency of customer online experiences
Improve ability to reach last 30% of client population
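
A per-region release policy of the kind listed above (suppress, or release at N% per day) could be evaluated roughly as follows. This is a sketch under stated assumptions, not how Akamai EdgeSuite or the WGA release worked: the slides do not say which region received which rate, so the mapping below is hypothetical, as are the function names and the hash-bucket approach.

```python
import hashlib

# Hypothetical mapping of regions to daily release rates (percent of
# clients newly offered the release per day). None means suppressed.
DAILY_RATE_PERCENT = {"US": 8, "TW": 2, "PL": 5, "OTHER": None}


def bucket(client_id):
    """Map a client to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(client_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % 100


def is_offered(client_id, region, days_since_start):
    """True if this client is inside the released share for its region."""
    rate = DAILY_RATE_PERCENT.get(region, DAILY_RATE_PERCENT["OTHER"])
    if rate is None:
        return False
    released_percent = min(100, rate * days_since_start)
    return bucket(client_id) < released_percent


print(is_offered("client-123", "US", 13))     # True: 8% x 13 days covers 100%
print(is_offered("client-123", "OTHER", 13))  # False: release suppressed
```

Because each client's bucket is stable, a client that has been offered the release stays offered on later days, and the released share grows steadily instead of spiking.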

Microsoft Confidential. © 2006 Microsoft Corporation. All rights reserved. This presentation is for internal Microsoft use only.

Open Panel Discussion

© 2006 Microsoft Corporation. All rights reserved.
This presentation is for informational purposes only. Microsoft makes no warranties, express or implied, in this summary.