Migrating from PostgreSQL to MySQL at Cocolog

Naoto Yokoyama, NIFTY Corporation
Garth Webb, Six Apart
Lisa Phillips, Six Apart
Credits: Kenji Hirohama, Sumisho Computer Systems Corp.

Agenda

   

1. What is Cocolog
2. History of Cocolog
3. DBP: Database Partitioning
4. Migration From PostgreSQL to MySQL

1. What is Cocolog

What is Cocolog
• NIFTY Corporation
    • Established in 1986
    • A Fujitsu Group Company
    • NIFTY-Serve (licensed and interconnected with CompuServe)
    • One of the largest ISPs in Japan
• Cocolog
    • First blog community at a Japanese ISP
    • Based on TypePad technology by Six Apart
    • Several hundred million PV/month
• History
    • Dec/02/2003: Cocolog for ISP users launch
    • Nov/24/2005: Cocolog Free for free launch
    • April/05/2007: Cocolog for Mobile Phone launch

Cocolog (Screenshot of home page)

2008/04 700 Thousand Users

Cocolog (Screenshot of home page)

Cocolog

TypePad

Cocolog template sets

Cocolog Growth (User)
[Chart: Cocolog and Cocolog Free, divided into phase 1 through phase 4]

Cocolog Growth (Entry)
[Chart: Cocolog and Cocolog Free, divided into phase 1 through phase 4]

Technology at Cocolog
• Core System
    • Linux 2.4/2.6
    • Apache 1.3/2.0/2.2 & mod_perl
    • Perl 5.8 + CPAN
    • PostgreSQL 8.1
    • MySQL 5.0
    • memcached / TheSchwartz / cfengine
• Eco System
    • LAMP, LAPP, Ruby + ActiveRecord, Capistrano, etc.

Monitoring
• Management Tool
    • Proprietary in-house development with PostgreSQL, PHP, and Perl
    • response time of each post
    • number of spam comments/trackbacks
    • number of comments/trackbacks
    • source IP address of spam
    • number of entries
    • number of comments via mobile devices
    • page views via mobile devices
    • time of batch completion
    • amount of API usage
    • bandwidth usage
    • Disk I/O
    • Memory and CPU usage
    • time of VACUUM analyze
    • number of active processes
    • CPU usage
    • Memory usage

Monitoring points (order of priority)
[Diagram labels: Service, APL, DB, Hard; DB; APP]

Tips for migration
• Troubles with PostgreSQL 7.4-8.1 & Linux 2.4/2.6
    • VACUUM
    • Data size
    • Character set
    • Cleaning data
• Troubles with MySQL
    • convert_tz function
    • sort order

2. History of Cocolog

Phase1 2003/12 ~ (Entry: 0.04 million)
Before DBP: 10 servers
[Diagram labels: PostgreSQL; Register; TypePad; Static contents Published; NAS; WEB]

Phase2 2004/12 ~ (Entry: 7 million)
Before DBP: 50 servers
[Diagram labels: Publish Book; Rich template; Tel Operator Support; PostgreSQL; 2005/5 ~ Podcast, Portal, Profile, etc.; Register; 2004/12 ~ TypePad; Static contents Published; NAS; WEB]

Phase2 - Problems
• The system is tightly coupled.
• The database server receives connections from multiple points.
• It is difficult to change the system design and database schema.

Phase3 2006/3 ~ (Entry: 12 million)
Before DBP: 200 servers
[Diagram labels: Publish Book; Rich template; Web-API; memcached; Tel Operator Support; PostgreSQL; TypePad; Register; Podcast, Portal, Profile, etc.; Static contents Published; NAS; WEB]

Phase4 2007/4 ~ (Entry: 16 million)
Before DBP: 300 servers
[Diagram labels: Publish Book; Rich template; Web-API; memcached; TypePad; Tel Operator Support; Static contents Published; PostgreSQL; Register; Atom; Mobile WEB; NAS; WEB]

Now 2008/4 ~
After DBP: 150 servers
[Diagram labels: Publish Book; Rich template; Web-API; memcached; Tel Operator Support; TypePad; Static contents Published; Multi MySQL; Register; Atom; Mobile WEB; NAS; WEB]

3. TypePad Database Partitioning

Steps for Transitioning
• Server Preparation: Hardware and software setup
• Global Write: Write user information to the global DB
• Global Read: Read/write user information on the global DB
• Move Sequence: Table sequences served by global DB
• User Data Move: Move user data to user partitions
• New User Partition: All new users saved directly to user partition 1
• New User Strategy: Decide on a strategy for the new user partition
• Non User Data Move: Move all non-user owned data

TypePad Overview (PreDBP)
[Architecture diagram]
• Internet clients: Blog Owners, Blog Readers, Mobile Blog Readers
• External protocols: https(443), http(80), smtp(25) / pop(110)
• Web Server
• Application Server
• TypeCast Server
• Mail Server
• Internal protocols: smtp(25) / pop(110), memcached(11211), nfs(2049), postgres(5432), http(80): atom api
• Storage: static content (HTML, images, etc.)
• Database (Postgres)
• MEMCACHED: data caching servers to reduce DB load
• ATOM Server: dedicated server for TypeCast (via ATOM)
• ADMIN(CRON) Server: cron server for periodic asynchronous tasks

Why Partition?
• Current setup: all inquiries (access) go to one DB (Postgres)
• After DBP: inquiries (access) are divided among several DBs (MySQL)
[Diagram labels: TypePad; Global Role; User Role (User 0), User Role (User 1), User Role (User 2), User Role (User 3); Non-User Role]

Server Preparation
[Diagram: current setup and new expanded setup]
• Current setup
    • TypePad
    • DB (PostgreSQL): User Role (User 0), Non-User Role
• New expanded setup
    • Global Role: maintains user mapping and primary key generation
    • User Role (User 1), User Role (User 2), User Role (User 3): DB (MySQL) for partitioned data; user information is partitioned
    • Non-User Role: information that does not need to be partitioned (such as session information)
    • Schwartz DB: stores job details
    • Job Server + TypePad + Schwartz: asynchronous job server; server for executing jobs

※Grey areas are not used in current steps

Global Write
Creating the user map

Explanation
  ①: For new registrations only, uniquely identifying user data is written to the global DB
  ②: This same data continues to be written to the existing DB

[Diagram labels: TypePad; Global Role (maintains user mapping and primary key generation); User Role (User 1), User Role (User 2), User Role (User 3): DB (MySQL) for partitioned data; Non-User Role; Schwartz DB; DB (PostgreSQL): User Role (User 0), Non-User Role; Job Server + TypePad + Schwartz: asynchronous job server]

※Grey areas are not used in current steps
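
The dual write in this step could be sketched as follows. This is only an illustration: the dictionaries `global_db` and `legacy_db` are hypothetical in-memory stand-ins for the global DB and the existing DB, and the function name `register_user` is not taken from TypePad.

```python
# Hypothetical stand-ins for the real databases (assumptions for illustration).
global_db = {}   # global DB: user_id -> uniquely identifying user data
legacy_db = {}   # existing DB: user_id -> full user record


def register_user(user_id, record):
    """Write a new registration to both the global DB and the existing DB."""
    # Step 1: uniquely identifying data goes to the global DB.
    global_db[user_id] = {"user_id": user_id, "name": record["name"]}
    # Step 2: the same user continues to be written to the existing DB.
    legacy_db[user_id] = dict(record)


register_user(1001, {"name": "alice", "plan": "free"})
```

Existing users are untouched at this stage; only new registrations reach the global DB.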

Global Read
Use the user map to find the user partition

Explanation
  ①: Migrate existing user data to the global DB
  ②: At the start of the request, the application queries the global DB for the location of user data
  ③: The application then talks to this DB for all queries about this user. At this stage the global DB points to the user0 partition in all cases.

[Diagram labels: TypePad; Global Role (maintains user mapping and primary key generation); Non-User Role; Schwartz DB; User Role (User 1), User Role (User 2), User Role (User 3): DB (MySQL) for partitioned data; DB (PostgreSQL): User Role (User 0), Non-User Role; Job Server + TypePad + Schwartz: asynchronous job server; arrow: migrate existing user data]

※Grey areas are not used in current steps
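
A minimal sketch of the lookup in steps ② and ③, assuming a hypothetical in-memory user map and placeholder connection strings; none of these names come from TypePad itself.

```python
# Hypothetical user map held by the global DB: user_id -> partition name.
USER_MAP = {1001: "user0", 1002: "user0"}

# Hypothetical connection handles, keyed by partition name.
PARTITIONS = {
    "user0": "postgres://legacy",
    "user1": "mysql://user1",
}


def partition_for(user_id):
    """Step 2: ask the global DB which partition holds this user's data."""
    return USER_MAP[user_id]


def connection_for(user_id):
    """Step 3: return the handle used for all queries about this user."""
    return PARTITIONS[partition_for(user_id)]
```

Because every entry still maps to `user0`, the application behaves as before while exercising the new lookup path.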

Move Sequence
Migrating primary key generation

Explanation
  ①: Postgres sequences (for generating unique primary keys) are migrated to tables on the global DB that act as “pseudo-sequences”.
  ②: The application requests new primary keys from the global DB rather than the user partition.

[Diagram labels: TypePad; Global Role (maintains user mapping and primary key generation); Non-User Role; Schwartz DB; User Role (User 1), User Role (User 2), User Role (User 3): DB (MySQL) for partitioned data; DB (PostgreSQL): User Role (User 0), Non-User Role; Job Server + TypePad + Schwartz: asynchronous job server; arrow: migrate sequence management]

※Grey areas are not used in current steps
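
One way to picture a “pseudo-sequence” table is a counter row that is incremented inside a transaction. The sketch below uses Python's built-in `sqlite3` purely as a stand-in for the global DB; the table name `pseudo_sequence`, the sequence name `entry_id`, and the seed value are assumptions, and the SQL used on MySQL in production may well differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pseudo_sequence (name TEXT PRIMARY KEY, value INTEGER NOT NULL)"
)
# Seed with the last value handed out by the old sequence (hypothetical number).
conn.execute("INSERT INTO pseudo_sequence VALUES ('entry_id', 41)")
conn.commit()


def next_id(conn, name):
    """Increment the named counter and return the new value."""
    with conn:  # one transaction: commit on success, roll back on error
        conn.execute(
            "UPDATE pseudo_sequence SET value = value + 1 WHERE name = ?", (name,)
        )
        row = conn.execute(
            "SELECT value FROM pseudo_sequence WHERE name = ?", (name,)
        ).fetchone()
    return row[0]
```

Serving all keys from one place keeps primary keys unique across partitions, at the cost of one extra round trip to the global DB per new row.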

User Data Move
Moving user data to the new user-role partitions

Explanation
  ①: Existing users that should be migrated by the Job Server are submitted as new Schwartz jobs. User data is then migrated asynchronously.
  ②: If a comment arrives while the user is being migrated, it is saved in the Schwartz DB to be published later.
  ③: After being migrated, all user data will exist on the user-role DB partitions.
  ④: Once all user data is migrated, only non-user data is on Postgres.

[Diagram labels: TypePad; Global Role (maintains user mapping and primary key generation); Non-User Role; Schwartz DB (stores job details); User Role (User 1), User Role (User 2), User Role (User 3): DB (MySQL) for partitioned data, user information is partitioned; DB (PostgreSQL): User Role (User 0), Non-User Role; Job Server + TypePad + Schwartz: server for executing jobs; arrow: migrating each user's data]

※Grey areas are not used in current steps
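
The deferral of comments during a move (step ②) might look like the sketch below. It is a simplified, single-process illustration: the in-memory `deferred_comments` queue stands in for the Schwartz DB, and all function and variable names are hypothetical.

```python
from collections import deque

user_map = {1: "user0", 2: "user0"}   # global DB user map (hypothetical)
migrating = set()                      # users whose move is in progress
deferred_comments = deque()            # stands in for the Schwartz DB
published = []                         # comments that have been published


def post_comment(user_id, text):
    """Publish immediately, or defer if the blog owner is being migrated."""
    if user_id in migrating:
        deferred_comments.append((user_id, text))
    else:
        published.append((user_id, text))


def start_migration(user_id):
    """Mark the user as in flight so incoming comments are deferred."""
    migrating.add(user_id)


def finish_migration(user_id, target):
    """Repoint the user map, then replay this user's deferred comments."""
    user_map[user_id] = target
    migrating.discard(user_id)
    for _ in range(len(deferred_comments)):
        uid, text = deferred_comments.popleft()
        if uid == user_id:
            published.append((uid, text))
        else:
            deferred_comments.append((uid, text))  # keep other users queued
```

Deferring writes this way means a user's data is not modified while it is being copied, without rejecting the comment.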

New User Partition
New registrations are created on one user role partition

Explanation
  ①: When new users register, user data is written to a user role partition.
  ②: Non-user data continues to be served off Postgres.

[Diagram labels: TypePad; Global Role (maintains user mapping and primary key generation); User Role (User 1), User Role (User 2), User Role (User 3): DB (MySQL) for partitioned data, user information is partitioned; Non-User Role; Schwartz DB; DB (PostgreSQL): User Role (User 0), Non-User Role; Job Server + TypePad + Schwartz: asynchronous job server]

※Grey areas are not used in current steps

New User Strategy
Pick a scheme for distributing new users

Explanation
  ①: When new users register, user data is written to one of the user role partitions, depending on a set distribution method (round robin, random, etc.)
  ②: Non-user data continues to be served off Postgres.

[Diagram labels: TypePad; Global Role (maintains user mapping and primary key generation); User Role (User 1), User Role (User 2), User Role (User 3): DB (MySQL) for partitioned data, user information is partitioned; Non-User Role; Schwartz DB; DB (PostgreSQL): User Role (User 0), Non-User Role; Job Server + TypePad + Schwartz: asynchronous job server]

※Grey areas are not used in current steps
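
The two distribution methods named above, round robin and random, can be sketched in a few lines. The partition names are illustrative, and which method Cocolog actually chose is not stated here.

```python
import itertools
import random

PARTITIONS = ["user1", "user2", "user3"]  # partition names are illustrative

_round_robin = itertools.cycle(PARTITIONS)


def pick_round_robin():
    """Return partitions in a fixed rotation: user1, user2, user3, user1, ..."""
    return next(_round_robin)


def pick_random(rng=random):
    """Return a uniformly random partition."""
    return rng.choice(PARTITIONS)
```

Round robin spreads registrations evenly by count, while random selection needs no shared counter; neither accounts for how much data each user later produces.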

Non User Data Move
Migrate data that cannot be partitioned by user

Explanation
  ①: Migrate non-user role data left on PostgreSQL to the MySQL side.

[Diagram labels: TypePad; Global Role (maintains user mapping and primary key generation); User Role (User 1), User Role (User 2), User Role (User 3): DB (MySQL) for partitioned data, user information is partitioned; Non-User Role: information that does not need to be partitioned (such as session information); Schwartz DB; DB (PostgreSQL): User Role (User 0), Non-User Role; Job Server + TypePad + Schwartz: asynchronous job server; arrow: migrate non-user data]

※Grey areas are not used in current steps

Data migration done

Explanation
  ①: All data access is now done through MySQL
  ②: Continue to use TheSchwartz for asynchronous jobs

[Diagram labels: TypePad; Global Role (maintains user mapping and primary key generation); User Role (User 1), User Role (User 2), User Role (User 3): DB (MySQL) for partitioned data, user information is partitioned; Non-User Role: information that does not need to be partitioned (such as session information); Schwartz DB (stores job details); DB (Postgres): User Role (User 0), Non-User Role; Job Server + TypePad + Schwartz: asynchronous job server, server for executing jobs]

※Grey areas are not used in current steps

The New TypePad configuration
[Architecture diagram]
• Internet clients: Blog Readers, Blog Owners (management interface), Mobile Blog Readers
• External protocols: http(80), https(443), smtp(25) / pop(110)
• Web Server
• Application Server
• TypeCast Server (http(80): atom api)
• Mail Server
• Internal protocols: smtp(25) / pop(110), memcached(11211), nfs(2049), MySQL(3306)
• Storage: static content (HTML, images, etc.)
• Database (MySQL)
• MEMCACHED: data caching servers to reduce DB load
• ATOM Server: dedicated server for TypeCast (via ATOM)
• ADMIN(CRON) Server: cron server for periodic asynchronous tasks
• Job Server: TheSchwartz server for running ad-hoc jobs asynchronously

4. Migration from PostgreSQL to MySQL

History of scale up PostgreSQL server, Before DBP

• DB Node Spec History

Time      OS (RedHat)     CPU                           MEM    DiskArray
2003/12   7.4 (2.4.9)     Xeon 1.8GHz/512k×1            1GB    No
          ES2.1 (2.4.9)   3.2GHz/1M×2                   4GB    No
          ES2.1 (2.4.9)   3.2GHz/1M×2                   4GB    Yes
          AS2.1 (2.4.9)   3.2GHz/1M×4                   12GB   Yes
          AS4 (2.6.9)     3.2GHz/1M×4                   12GB   Yes
2007/11   AS4 (2.6.9)     MP3.3GHz/1M×4 〔2Core×4〕     16GB   Yes

History of scale up PostgreSQL server, Before DBP

• DB DiskArray Spec
    • [FUJITSU ETERNUS8000]
      http://www.computers.us.fujitsu.com/www/products_storage.shtml?products/storage/fujitsu/e8000/e8000
    • Best I/O transaction performance in the world
    • 146GB (15 krpm) × 32 disks with RAID-10
    • MultiPath FibreChannel 4Gbps
    • QuickOPC (One Point Copy)
        • OPC copy functions let you create a duplicate copy of any data from the original at any chosen time.

Scale out MySQL servers, After DBP

• A role configuration
    • Each role is configured as an HA cluster
    • HA Software: NEC ClusterPro
    • Shared Storage

Scale out MySQL servers, After DBP
[Diagram labels: TypePad Application; PostgreSQL; MySQL Role1; MySQL Role2; MySQL Role3; heartbeat; FibreChannel SAN; DiskArray]

Scale out MySQL servers, After DBP
• Backup
    • Replication w/ Hot backup

Scale out MySQL servers, After DBP
[Diagram labels: TypePad Application; PostgreSQL; MySQL Role1 (mysqld); MySQL Role2 (mysqld); MySQL Role3 (mysqld); heartbeat; FibreChannel SAN; DiskArray; rep ×3; MySQL BackupRole (mysqld ×3); opc]

Troubles with PostgreSQL 7.4 – 8.1
• Data size
    • over 100 GB
    • 40% is index
• Severe Data Fragmentation
• VACUUM
    • “VACUUM analyze” causes performance problems
    • It takes too long to VACUUM large amounts of data
    • dump/restore is the only solution for defragmentation
• Auto VACUUM
    • We don’t use Auto VACUUM since we are worried about latent response time

Troubles with PostgreSQL 7.4 – 8.1
• Character set
    • PostgreSQL accepts out-of-range UTF-8 data, such as Japanese extended character sets and other multi-byte character sets, that should normally be rejected with an error.

“Cleaning” data
• Remove characters that fall outside the valid UTF-8 character set
• Steps
    1. PostgreSQL dumpall
    2. Split for piconv
    3. UTF8 -> UCS2 -> UTF8 & Merge
    4. PostgreSQL restore

[Diagram: dump -> Split -> UTF8->UCS2->UTF8 -> Merge -> restore]
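
The effect of the UTF8 -> UCS2 -> UTF8 round trip can be approximated in Python as below. This is a sketch, not the piconv pipeline itself: it assumes that invalid byte sequences and characters UCS-2 cannot represent are simply dropped, whereas piconv's real handling of such input depends on how it is invoked. The function name `clean_utf8` is hypothetical.

```python
def clean_utf8(raw: bytes) -> bytes:
    """Drop invalid UTF-8 sequences and characters outside the UCS-2 range."""
    # Invalid byte sequences are discarded while decoding.
    text = raw.decode("utf-8", errors="ignore")
    # UCS-2 covers only U+0000..U+FFFF, so anything above that is dropped
    # (assumption: dropped rather than substituted).
    kept = "".join(ch for ch in text if ord(ch) <= 0xFFFF)
    return kept.encode("utf-8")
```

Splitting the dump first, as in the steps above, lets each chunk be cleaned independently before the pieces are merged and restored.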

Migration from PostgreSQL to MySQL using TypePad script

• Steps
    • PostgreSQL -> Perl object & tmp publish -> MySQL -> Perl object & last publish
    • diff tmp & last object (data check)
    • diff tmp & last publish (file check)

[Diagram labels: PostgreSQL; TypePad; Object; data check; Document tmp; File check; Document last]
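
The two checks above (data check and file check) amount to comparing a snapshot taken before the move with one taken after. The real script works on Perl objects; the Python sketch below only illustrates the idea, with objects modelled as dictionaries with string keys and published documents as strings. Both function names are hypothetical.

```python
import difflib

_MISSING = object()  # sentinel so a missing key differs from a None value


def diff_objects(before: dict, after: dict) -> list:
    """Data check: return the keys whose values differ between two snapshots."""
    keys = sorted(set(before) | set(after))
    return [k for k in keys if before.get(k, _MISSING) != after.get(k, _MISSING)]


def diff_published(tmp_text: str, last_text: str) -> list:
    """File check: return unified-diff lines between two published documents."""
    return list(
        difflib.unified_diff(
            tmp_text.splitlines(),
            last_text.splitlines(),
            fromfile="tmp",
            tofile="last",
            lineterm="",
        )
    )
```

An empty result from both functions indicates that the migrated user reads back and publishes identically from MySQL.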

Troubles with MySQL

• convert_tz function
    • does not support input values outside the range of Unix time
• sort order
    • rows come back in a different order when the query has no “order by” clause
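
One possible workaround for the convert_tz limitation is to do the conversion in the application instead of in SQL. The sketch below is an assumption about how that could look, not what Cocolog did: it uses a fixed UTC+9 offset for JST and therefore ignores any historical offset changes in Japan. The function name `utc_to_jst` is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Fixed offset for Japan Standard Time (assumption: no DST handling needed).
JST = timezone(timedelta(hours=9), "JST")


def utc_to_jst(naive_utc: datetime) -> datetime:
    """Convert a naive UTC datetime to naive JST, with no Unix-time range limit."""
    aware = naive_utc.replace(tzinfo=timezone.utc)
    return aware.astimezone(JST).replace(tzinfo=None)
```

For the sort order issue, the usual remedy is to add an explicit ORDER BY to every query whose result order matters, since SQL does not guarantee row order without one.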

Cocolog Future Plans

• Dynamic
• Job queue

Consulting by

• Sumisho Computer Systems Corp.
    • System Integrator
    • first and best partner of MySQL in Japan since 2003
    • provide MySQL consulting, support, training service

• HA
• Maintenance
    • online backup
• Japanese character support

Questions