Tune your Oracle Automatic Storage Management

Tips & techniques for implementing and optimizing your database storage using Oracle ASM in a SAN environment

by Thiru Vadivelu

Implementing Oracle's Automatic Storage Management in a Storage Area Network environment can be an interesting and involved task. The purpose of this document is to provide some generic guidelines and things to consider and configure when setting up, or migrating to, Oracle Automatic Storage Management (ASM) in a SAN environment, based on our past experiences. This article is primarily intended for DBAs, database managers, system administrators and storage administrators.

Why ASM?
• ASM is an Oracle-provided database/cluster filesystem and storage volume manager, optimized for Oracle for any kind of workload (be it OLTP, DSS or batch, sequential or random access), built in with the database at no additional cost.
• It follows the SAME (Stripe and Mirror Everything) strategy, eliminating database-level IO hotspots: the database files are striped equally across all the available disks in the allocated disk groups and are dynamically rebalanced online as disks are added or removed.
• ASM provides for better storage utilization and storage consolidation (ie it allows multiple databases on the same server, or across multiple servers through Oracle Clusterware, to share storage).
• It allows leveraging existing storage RAID technologies (Mirroring, RAID 5 etc) to provide additional striping and redundancy, and it can leverage IO multipathing technologies (like EMC Powerpath, Hitachi HDLM etc) to provide better IO balancing and fault-tolerance.
• It makes use of Oracle's 'Kernel Service Direct File (ksfd)' interface to perform KAIO (kernel asynchronous IO) against raw devices, by-passing the OS page cache and providing the best IO performance with no overhead from the operating system or conventional logical volume managers.
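To make the striping and online rebalancing described above concrete, here is a minimal sketch (the diskgroup name DB_DATA and the /dev/rasm* disk paths are illustrative only, not prescribed by the article); when the extra lun is added, ASM redistributes the existing file extents across all disks while the database stays online:

    -- ASM instance: create a diskgroup striped across four SAN luns,
    -- letting the SAN provide the mirroring (external redundancy)
    CREATE DISKGROUP DB_DATA EXTERNAL REDUNDANCY
      DISK '/dev/rasm1', '/dev/rasm2', '/dev/rasm3', '/dev/rasm4';

    -- later, grow the diskgroup; extents are rebalanced onto the new lun online
    ALTER DISKGROUP DB_DATA ADD DISK '/dev/rasm5' REBALANCE POWER 8;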


What ASM is not?
• Although it's a 'database filesystem', it does not provide storage for Oracle binaries, certain configuration files, trace files, logs etc.
• It cannot resolve database data block contention at the segment level.

• It cannot automatically improve IO performance when the SAN disks are experiencing high 'disk service times', possibly contributing to high IO wait times at the Oracle layer.
• It does not provide any memory caching/buffering by itself (like the OS or SAN frames do). Only the ASM metadata is cached in the ASM instance.
• It does not provide IO multipathing capabilities of its own.

Database Storage Planning/Strategies:
In general, the following are the primary factors when it comes to planning database storage (a simple query for sampling the wait-time factors is sketched after this list):
• IO throughput (and IOs per second)
• IO service/wait times
• Volume of storage
• I/O workload (ie the nature of access, such as sequential r/w or random r/w)
• Availability

To achieve the desired level of the above parameters, which are based on application requirements, the following things need to be strategized and tuned with respect to the Storage/ASM setup:
• Storage Tier level
• Raid group/configuration
• Size/Number of Luns (Logical Unit Numbers)
• Storage Frame Caching/Sharing
• SAN Stripe size
• IO Multipathing
• Bandwidth of IO path
• Volume/Filesystem configuration
• OS Page/Buffer Cache
• ASM diskgroups configuration
• ASM Instance configuration
• ASM allocation unit size
• ASM fine grained stripe size
• ASM Max IO size
• Oracle Block size
• Oracle Multiblock read count
• Oracle Caching (SGA/PGA)

It is not the purpose of this document to go into the details of all the above requirements and technologies (which are purely driven by application needs and technology availability), but merely to share my experiences and provide some guidelines.
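As a starting point for the IO service/wait time factor listed above, the database's own wait statistics give average single-block and multiblock read latencies; the following is only a rough sketch (the counters are cumulative since instance startup, so deltas between two samples are more meaningful):

    -- average wait per IO (ms) for the common physical read events
    SELECT event,
           total_waits,
           ROUND(time_waited_micro / NULLIF(total_waits, 0) / 1000, 2) AS avg_wait_ms
    FROM   v$system_event
    WHERE  event IN ('db file sequential read', 'db file scattered read', 'direct path read');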

Storage Tier level:
• Greatly dependent on budget, availability, the storage team's standards and the application's desired IO response time.
• For production applications, consider either Tier-1 or Tier-2 storage; for development, Tier-3.
• For single block reads (common in indexed reads), Tier-1 storage should provide an average IO wait time in the range of 1-5ms (with Tier-2 in the range of 5-10ms). For multiblock reads (common in full table scans and index fast full scans), Tier-1 storage should provide an average IO wait time in the range of 10-15ms (with Tier-2 in the range of 20-30ms), sufficient to achieve an overall IO response time requirement in this range.
• It should not be taken that Tier-1 will not perform significantly better than Tier-2 (after all, it provides for better striping across more luns), but the benefit is dependent on other factors such as the attainable IO throughput, SAN caching, Raid configuration etc.
• We did not realize a significant response time difference between Tier-1 and Tier-2 when i) most of the IO was being satisfied by the SAN frame cache and did not stress the underlying SAN storage enough (in this case Oracle was not able to make sufficient IO requests to force high IO throughput with the Tier-1 storage), or ii) the DB server's cpu utilization was very high because of a heavy amount of logical IO. For example, in one of our benchmarks (conducted in a Datawarehouse environment dominated by large sequential reads, with 95% of the IO satisfied by the SAN cache), the average IO wait was almost the same between Tier-1 and Tier-2.

[Chart: Average IO Wait(ms) over the Tier-2 to Tier-1 migration (full table scans) — the Avg IO Wait(ms) remained almost the same between the two (from about 17.8ms on Tier-2 to about 17.4ms on Tier-1); the improvement was very minor.]
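The latency bands discussed above (1-5ms, 5-10ms and so on) can be checked against the wait-time distribution Oracle keeps per event; a minimal sketch, run on the database instance:

    -- distribution of single-block read waits by latency bucket (ms)
    SELECT wait_time_milli, wait_count
    FROM   v$event_histogram
    WHERE  event = 'db file sequential read'
    ORDER  BY wait_time_milli;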

[Chart: Avg IO Wait(ms) over the Tier-2 to Tier-1 migration (indexed reads) — the Avg IO Wait(ms) remained almost the same between the two (about 4.12ms with Tier-2 and 4.08ms with Tier-1); the improvement was very minor.]

Also, the graphs plotting 'SQL Response Time' and 'DBTime/sec' did not show a significant difference between Tier-1 and Tier-2.

Raid group/Configuration:
• RAID 1+0 (or 0+1) provides the highest performance and availability because of striping and mirroring, but it is also the most expensive configuration. It's a viable option for performance intensive applications which require high availability.
• RAID 1+0 is marginally better than RAID 0+1 for availability (for multiple disk failures), since the individual disks are mirrored first and then striped, as opposed to 0+1, where all the disks in the volume are striped first and then mirrored as a whole.
• Raid 1 (Mirroring), when used with ASM, behaves functionally equivalent to Raid 0+1, since ASM performs file extent striping first and then the Luns are mirrored at the SAN layer.
• RAID 5 is a very viable option for Datawarehouse applications, which do predominantly 'large sequential reads' and 'bulk writes', as opposed to OLTP applications, which are dominated by random reads/writes. The performance penalty experienced in this configuration, due to the need to perform 4 IOs at the Raid level for every database IO (to cater to the parity read/recalculate/write operations), is greatly minimized by the presence of a huge SAN frame cache: the database IO is asynchronous (ie it doesn't wait for acknowledgement from the disks) and the reads/writes are buffered in the SAN cache.
• In one of the benchmarks we did in a Datawarehouse environment, RAID 5 (Tier-1) outperformed RAID 1 (Mirroring, Tier-1) marginally (5%), and this can be directly attributed to 'double-striping' (ASM + Raid 5 striping). But keep in mind the availability benefits of Mirroring.

[Chart: Raid 1 vs Raid 5 — benchmark done on a Datawarehouse database, comparing Avg IO Wait(ms) for indexed reads on Raid 5 and Raid 1.]

70 3.72 3. which we were able to resolve by allocating the disks from One Raid-group to One LUN.75 3.66 3.80 3.70 3.avg_wait(ms) 3.64 3. we determined that sharing the same raid-groups across multiple LUNS results in IO contention.90 3.74 3. The following benchmark results compares the average IO response time of busiest luns between the 2 configurations i) 16 Luns in 2 RAID groups ii) 16 Luns in 8 Raid groups Case i) 300 Case ii) 300 250 250 200 200 150 150 100 100 50 50 0 11 20 11 40 11 50 12 00 12 10 12 20 12 30 12 40 11 00 11 10 11 30 12 50 0 00 10 20 30 40 50 00 10 20 30 40 23 22 22 22 22 22 22 23 23 23 23 23 50 .62 Raid 1 (Avg IO Wait (ms) for Indexed Reads) avg_wait (ms) 3.68 3.65 • Also in these benchmarks.85 3.95 3.76 3.

Size/Number of Luns:
• Although Oracle's recommendation is to minimize the number of LUNs per diskgroup by having fewer, larger Luns (to minimize the LUN management overhead), our experience shows that, for the same Tier configuration, IO performance is more dependent on the number of underlying disks the LUNs are spread across. Our benchmarks show that spreading the diskgroup across double the luns (double the number of disks) improved performance significantly. It's also worth testing fewer/larger luns spread across as many physical disks as the configuration allows, based on the expected IO throughput and service times.
• Since the ASM file extent distribution strategy is capacity based, it's strongly recommended to size all the luns in a diskgroup the same (and give them the same characteristics). Otherwise Lun contention will result, as the larger luns will have more ASM file extents stored on them.

I/O Multipathing:
• ASM can make use of various multipathing technologies (such as EMC Powerpath, Hitachi HDLM, Sun Powerpath etc), and multipathing is highly recommended to provide IO path redundancy and IO load balancing.
• For example, ASM can directly access the multi-pathed raw lun devices (/dev/rhdiskpower*).

OS Buffer/Page Cache:
• Since ASM uses the raw devices directly and performs direct KAIO, the OS page cache is by-passed and hence needs to be minimized (ie resized down) when migrating from a buffered filesystem to ASM. For example, on AIX, the page cache kernel parameters 'minperm' and 'maxperm' default to 20 and 80 respectively, which can allocate quite an amount of OS memory; this is redundant and actually detrimental to performance, since that memory is not available for the Oracle SGA/PGA. It is recommended to resize these parameters to, say, 2 and 20 respectively to allocate a minimal page cache and use that memory instead for Oracle's SGA/PGA.

SAN Stripe size:
• Ideally, the RAID stripe size at the SAN layer should match the ASM stripe size (1MB by default).
• If that is not possible, since the SAN frame is commonly shared across multiple applications, then a stripe size of 256K or 512K should be OK.

ASM diskgroups:
• To achieve maximum storage utilization and consolidation, not more than 2 diskgroups are generally required for all the database storage, ie one for Data/Index/Undo/Redo/Temp and one for FLASHBACK (flashback logs, 1 copy of the controlfile, 1 copy of the online redolog, and archivelogs).
• To achieve maximum storage utilization and consolidation, it's also recommended to share the same ASM diskgroups across multiple databases residing on the same server (although it could result in a single point-of-failure with respect to the diskgroup). This is a very viable option when it comes to development and test databases, and it should be carefully considered for Production databases.
• It is also possible to share the same ASM diskgroups across different databases running on multiple servers by clustering the ASM instances (through Clusterware/RAC). However, this would require a RAC license for clustering the ASM instances, and the cost and complexity of this configuration may not do enough justice to warrant it.
• When the underlying SAN storage is providing the redundancy, configure ASM with 'External Redundancy' and 1 failure group. Normal Redundancy (the default: 2 failure groups) and High Redundancy (3 failure groups) ASM configurations are options when configuring the same diskgroup across multiple storage frames or across datacenters for redundancy.
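Here is a hedged sketch of pointing a database at the two-diskgroup layout described above through the standard file-destination parameters (the diskgroup names reuse the DB_DATA/DB_FLASH examples from later in the article; the recovery area size is purely illustrative):

    -- database instance: send datafiles to the data diskgroup and
    -- flashback logs/archivelogs/controlfile copy to the flash diskgroup
    ALTER SYSTEM SET db_create_file_dest = '+DB_DATA' SCOPE=BOTH;
    ALTER SYSTEM SET db_recovery_file_dest_size = 500G SCOPE=BOTH;
    ALTER SYSTEM SET db_recovery_file_dest = '+DB_FLASH' SCOPE=BOTH;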

However. Oracle_Sid with the CSS service. This can only be done at the time of diskgroup creation and cannot be altered for an existing diskgroup. ASM Clustering: • Oracle’s Clusterware (primarily consisting of CSS and CRS daemons) gets installed by default when ASM is installed. also called as stripe depth) and defaults to 1M. This is a very viable option when it comes to development and test databases and should be carefully considered for Production databases. it’s recommended to share the same ASM diskgroups across multiple databases residing on the same server (although it could result in single point-of-failures. • To achieve maximum storage utilization and consolidation. Oracle_Home. Oracle’s Clusterware is offered at no additional license cost. This allows for the ASM instance to register its diskgroups.This is sufficient for most installations. used by the database instance to communicate with the ASM instance. configure ASM with ‘External Redundancy’ with 1 Failure group.• • When the underlying SAN storage is providing the redundancy.. . this would require RAC license for clustering ASM instances. • However for VLDBs(multi-terabyte and greater). ASM allocation Unit: • Determines the unit of ASM storage (ie size of file extent . The cost and complexity of this configuration may not do enough justice to warrant this configuration. to minimize the ASM overhead in opening the files and caching the metadata. its recommended to increase the size of the allocation unit(undocumented parameter “_asm_ausize”) from 1M to say 8 or 16M. although you will find only the CSS process running in single instance configurations. with respect to diskgroup). Normal Redundancy (default: 2 failure groups) and High Redundancy (3 failure groups) ASM configurations are options. when configuring the same diskgroup across multiple storage frames or across datacenters for redundancy. • It is also possible to share the same ASM diskgroups across different databases running on multiple servers by clustering ASM instances (through Clusterware/RAC).

ASM Stripe size:
• Oracle provides COARSE striping for big IO involving datafiles and FINE striping for small IO involving online redologfiles/controlfiles.
• The coarse stripe size is 1M and the fine stripe size is 128K. These are optimal sizes for most installations.
• The FINE stripe size is controlled by an undocumented parameter "_asm_stripesize", which is adequate for most installations; increasing it to higher values (say 1M) should only be considered for VLDBs. As outlined earlier, increasing the allocation_unit (coarse stripe depth) size from 1M to higher values is beneficial only for VLDBs and should be set at the instance level when creating the diskgroups, following benchmarks; benchmarks from HP show a 16% performance improvement when the allocation unit size was increased from 1M to 8M.

Oracle Db_FileMultiblockReadcount:
• Should be set such that DB_BLOCK_SIZE * DB_FILE_MULTIBLOCK_READ_COUNT = ASM coarse stripe size = MAX_IO size = 1M (the default); a worked example follows below.

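A quick worked example of the sizing rule above, assuming a common 8K database block size (your block size may differ, in which case the read count changes accordingly):

    -- 8192 bytes x 128 blocks = 1,048,576 bytes = 1M = coarse stripe size = max IO size
    ALTER SYSTEM SET db_file_multiblock_read_count = 128 SCOPE=BOTH;
    SHOW PARAMETER db_file_multiblock_read_count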
ASM Max IO:
• The maximum size of an individual IO request is determined by an undocumented parameter "_asm_maxio", which defaults to 1M. This is sufficient for most installations.
• My attempts to increase this to 2M, in spite of increasing the db_file_multiblock_read_count to 500, did not result in increasing the maximum IO size, since it is most probably limited by the Oracle internal hard-coded parameter 'SSTIOMAX', defaulted by Oracle to 1M on most platforms.

ASM Installation:
• The ASM software (and the Clusterware) is included in the 10g installation media and provided at no additional cost.
• Clusterware gets automatically installed when installing ASM, and the CSS (Cluster Synchronization Services) daemon starts up on system startup.
• It is strongly recommended to create only one ASM Oracle_Home/Instance on a given server and let all the database instances (both single instance and RAC instances) share the same ASM instance.
• In a RAC installation, each node will have an ASM instance (clustered), and they will all mount the same database diskgroups.

ASM Instance configuration:
• The ASM instance is used to cache ASM metadata (diskgroup/disk/file extent information etc) only and doesn't typically require more than 100MB.
• Set Instance_Type='asm' to indicate that this is an ASM instance and not a regular 'RDBMS' instance.
• Set Instance_name='+ASM' (the default) for single instance configurations and, for RAC, *.instance_name=+ASMn, where n=1,2,3...n, on each node.
• The Asm_diskgroups parameter identifies the diskgroups to be automatically mounted on instance startup, eg) *.asm_diskgroups='DB_DATA','DB_FLASH'. It is strongly recommended to use an spfile instead of a client parameter pfile; this allows Oracle to automatically update the configuration file when new diskgroups are created/mounted.
• The Asm_diskstring parameter identifies the OS search path ASM uses to search for disks when discovering disks/configuring diskgroups, eg) *.asm_diskstring='/dev/rasm*'. The ASM Oracle_Home owner should have read/write access on the paths identified by this parameter.
• The Asm_Powerlimit parameter identifies the number of ASM rebalancing slaves employed to perform diskgroup rebalance operations when reconfiguring disk groups. Set it to 11, at installation time, for the fastest (but most IO intensive) rebalancing operation; it can be overridden at the statement level.

ASM Instance connectivity:
• Since the ASM instance does not mount a database of its own or have a data dictionary, access to the ASM instance is normally provided by OS authentication ('/ as sysdba').
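Collecting the instance parameters above into one place, a minimal ASM init parameter sketch (the diskgroup names and disk string reuse the article's own examples; the buffer cache value is an assumption consistent with the ~100MB guidance, not a stated recommendation):

    *.instance_type='asm'
    *.asm_diskgroups='DB_DATA','DB_FLASH'   # mounted automatically at startup
    *.asm_diskstring='/dev/rasm*'           # discovery path; the ASM OS owner needs read/write
    *.asm_power_limit=11                    # fastest (most IO intensive) rebalance
    *.db_cache_size=64M                     # metadata cache; ASM rarely needs more than ~100MB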

• It is recommended to configure Password/SYSDBA authentication via the ASM listener for remote logins.

ASM is one of the best things that has happened to Oracle in a long time, and it is quickly maturing into the de facto standard for Oracle database storage. It is the most cost-effective automated database storage solution for Oracle and Grid computing: it eliminates the guesswork when implementing and tuning database storage, eliminates the need for costly LVMs, eliminates the need to perform manual IO rebalancing, eliminates unnecessary downtime with storage reorganization and vastly improves the DBA's productivity. It allows the company's storage to be most effectively used and consolidated, leveraging Oracle's cluster technology. Oracle continues to enhance ASM, introducing new features and functionality with every new Oracle release. Take a look at the new ASM features introduced in 11g, for instance http://www.oracle.com/technology/products/database/oracle11g/pdf/automatic-storagemanagement-11g-new-features-overview.pdf. It is the present and future.

Thiru Vadivelu (tmgn12@gmail.com) manages the 'Performance & Capacity Strategies' group in the Corporate Technology Risk & Business Services division of JP Morgan Chase, Delaware, USA. He is an Oracle Certified Master (http://www.oracle.com/technology/ocm/tvadivelu.html) and specializes in Performance tuning and High availability solutions involving Oracle technologies.
