Categorized under: Storage basics

When are more spindles better than more cache?

There are many tricks to speed up data access in enterprise storage platforms. Two of them: increase the number of spindles so data can be fetched from many disks instead of a few, or increase the size of the cache so more content can be pre-fetched into memory. After all, memory is fast and disk is slow.

With outrageous price tags that rival the GDP of some small island countries, vendors can afford to throw a lot of high-speed RAM at their tier 1 offerings (to name a few: EMC Symmetrix DMX, IBM DS8000, Hitachi USP). Additionally, every vendor claims its pre-caching algorithm (aka the ‘secret sauce’) can outsmart the others’, achieving a better cache-hit ratio than its competitors.

Here’s a diagram that shows how EMC implements its Tier1 platform – Symmetrix DMX:

The “global memory” is their term for the cache memory. In the latest DMX-4 it can range from 8GB usable to 256GB usable, depending on the configuration.

However, algorithms are only as intelligent as the programmers implementing them. Since the cache is a finite resource, no matter how smart the programmers are, there are cases where using tier 1 storage does not make sense.

Example 1. Highly random access with a huge amount of data
A maximum of 256GB of usable cache sure sounds like a lot. But what if the application is a mail server (e.g. MS Exchange Server) with 15TB of mail? Suppose the read access is totally random: 30,000 active users start hitting their email in the morning, and the average user has 30 emails averaging 300KB each. That translates to about 256GB of data that has to be delivered, roughly the entire cache of a fully configured DMX-4. How many organizations can dedicate an entire Symmetrix to just an Exchange server?
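The arithmetic behind that figure can be checked with a short sketch. The function name is made up for illustration, and it assumes binary units (1GB = 1024 x 1024 KB), which is what brings the total close to 256GB.

```python
def morning_read_volume_gb(users, emails_per_user, avg_email_kb):
    """Total data requested, in binary GB (1 GB = 1024 * 1024 KB)."""
    total_kb = users * emails_per_user * avg_email_kb
    return total_kb / (1024 ** 2)

# Figures taken from the example above.
volume_gb = morning_read_volume_gb(users=30_000, emails_per_user=30, avg_email_kb=300)
print(f"Morning read volume: {volume_gb:.1f} GB")
```

The result lands just above the 256GB maximum usable cache, and that is before any other application gets a share of it.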

There’s no way for the cache algorithm to figure out where the next read will land, so a cache-miss is all but guaranteed at this point. Once a cache-miss occurs, the SYMM has to read the data from the back-end disk director, which operates at ‘disk speed’.

If the cache-misses continue, the SYMM can only deliver data at the speed of the underlying RAID groups instead of the fast ‘memory cached’ speed. With this type of access pattern, the Symmetrix may slow down to the speed of a lower-end device, while the massive cache just sits there looking pretty.
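To put a rough number on it, here is a back-of-the-envelope sketch. It assumes perfectly uniform random reads, so the chance of a hit is simply cache size divided by data size. The two latency figures are illustrative assumptions, not measured Symmetrix numbers.

```python
def uniform_hit_ratio(cache_gb, dataset_gb):
    """Hit ratio when every block is equally likely to be read next."""
    return min(1.0, cache_gb / dataset_gb)

def effective_latency_ms(hit_ratio, cache_ms, disk_ms):
    """Average service time as a weighted mix of cache hits and disk reads."""
    return hit_ratio * cache_ms + (1.0 - hit_ratio) * disk_ms

CACHE_GB = 256            # max usable global memory quoted above
DATASET_GB = 15 * 1024    # the 15TB mail store from the example
CACHE_MS = 0.5            # assumed latency of a cache hit
DISK_MS = 8.0             # assumed latency of a back-end disk read

hit = uniform_hit_ratio(CACHE_GB, DATASET_GB)
latency = effective_latency_ms(hit, CACHE_MS, DISK_MS)
print(f"hit ratio {hit:.1%}, average latency {latency:.3f} ms")
```

With under 2% of reads served from cache, the average latency is practically the disk latency, no matter how large the cache looked on the spec sheet.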

This applies across the board to all Tier-1 devices that sport a huge cache.
The only way to speed up access is to increase the number of spindles.
A disk can only sustain so many incoming I/Os. A LUN on 5 disks has 5 disks doing I/O, but if it’s spread across 100 disks, you can potentially have 100 disks doing the I/O.
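A minimal sketch of that scaling, assuming each disk sustains a fixed number of random I/Os per second. The 150 IOPS figure is an illustrative assumption, and the model ignores RAID overhead and controller limits.

```python
PER_DISK_IOPS = 150  # assumed random IOPS a single spindle can sustain

def aggregate_random_iops(num_disks, per_disk_iops=PER_DISK_IOPS):
    """Upper bound on random IOPS when I/O is spread evenly over all disks."""
    return num_disks * per_disk_iops

for disks in (5, 100):
    print(f"{disks:>3} disks -> up to {aggregate_random_iops(disks)} random IOPS")
```

The ceiling grows linearly with the spindle count, which is why the wide LUN has 20 times the random-read headroom of the narrow one here.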

Example 2. Intelligent application/file system with its own striping and caching algorithms

Similarly, highly parallel filesystems such as IBM GPFS, which stripes data in small chunks across many LUNs (or JBODs, if you trust IBM’s marketeers enough to let GPFS take care of all the data protection tasks), may run into similar problems. For example, a file that looks contiguous to the application may have its first block on LUN A, the next block on LUN B, the next on LUN D, and the last back on LUN A, all disjointed from one another. There’s no way for the storage system’s pre-fetch algorithm to figure out which chunk of data to pre-fetch next, since it does not understand GPFS, nor does it have access to the filesystem metadata that tracks file block locations across the LUNs. Sometimes the blocks even span storage devices, since GPFS lets you group a number of storage systems into different pools to form a filesystem.
With this type of application, a sequential read from the user’s or application’s perspective suddenly looks like random access to the storage system. And as previously discussed, random access means the speed of the underlying RAID groups.
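The effect can be illustrated with a toy model. This is not GPFS’s real allocator: the round-robin placement and the scattered offsets inside each LUN are assumptions made purely for illustration, as are all the names below.

```python
import random

def place_file(num_blocks, num_luns, lun_size_blocks, seed=0):
    """Toy allocator: round-robin across LUNs, scattered offsets inside each LUN."""
    rng = random.Random(seed)
    per_lun = -(-num_blocks // num_luns)  # ceiling division
    offsets = [rng.sample(range(lun_size_blocks), per_lun) for _ in range(num_luns)]
    return [(i % num_luns, offsets[i % num_luns][i // num_luns])
            for i in range(num_blocks)]

def sequential_fraction(addresses):
    """Share of requests whose address directly follows the previous one."""
    if len(addresses) < 2:
        return 0.0
    hits = sum(1 for a, b in zip(addresses, addresses[1:]) if b == a + 1)
    return hits / (len(addresses) - 1)

NUM_BLOCKS, NUM_LUNS = 1000, 4
layout = place_file(NUM_BLOCKS, NUM_LUNS, lun_size_blocks=100_000)

# The application reads file blocks 0, 1, 2, ... in order.
app_view = sequential_fraction(list(range(NUM_BLOCKS)))
# Each LUN only sees the block addresses that happen to land on it.
lun_views = [sequential_fraction([lba for lun, lba in layout if lun == n])
             for n in range(NUM_LUNS)]

print(f"application view: {app_view:.0%} sequential")
print("per-LUN view:", ", ".join(f"{v:.1%}" for v in lun_views))
```

The application’s read is fully sequential, yet under these assumptions each LUN sees almost no back-to-back addresses, so a pre-fetcher watching a single LUN has nothing to latch onto.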

These are the types of situations that make the design of an odd-ball vendor such as 3Par very appealing.

Instead of throwing in a lot of cache memory, it simply stripes LUNs across a huge number of spindles, providing a level of raw random-read I/O that traditional Tier-1 vendors can only dream about.

One size does not fit all, no matter what the vendor tells you.

Categorized under: Uncategorized

Hello World !

Hurrah!

The Scalable Storage Blog is now up and running

More interesting things to come.

T.