There are a lot of differing opinions floating around these days about “Storage Tiering” and the different methods of achieving it. Before we get into some of the popular approaches you’ll read about, let’s talk about why tiering is important.
The “need” for storage tiering stems from the fact that different types of drives have different performance metrics associated with them. When we talk about random IOPS, the following table indicates some rough sizing guidelines that architects can use to ballpark a solution.
While there are other factors like read/write ratio, RAID types, block sizes and random/sequential workloads, this post isn’t about how to size a storage solution; it’s about why storage tiering is relevant to you.
When you compare these tables, one thing is obvious – SSD drives perform the best but are cost-prohibitive. This is why we don’t use SSD for everything. In the past, the only way to balance cost and performance was to have a “fast” pool of storage made up of 10,000rpm or 15,000rpm drives, and a “slow” pool of storage made up of 7,200rpm drives, which were far less expensive per terabyte.
That meant whenever we deployed a new application or service, we’d need to manually determine if that application would reside on the “fast” or the “slow” pool of storage.
You may notice I keep putting fast and slow in quotes – this is because even though a 7,200rpm drive may provide fewer IOPS per spindle than a 15,000rpm drive, a pool built from them is not necessarily slower. If we have one pool of twenty 7,200rpm drives and another pool of six 15,000rpm drives, which is faster?
The 20-drive 7,200rpm pool would have a ballpark rating of 1600 IOPS, whereas the 6-drive 15,000rpm “fast” pool would only have 1080 IOPS.
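The arithmetic behind that comparison is simply drive count multiplied by per-drive IOPS. Here is a minimal sketch in Python, assuming the per-drive figures implied by the totals above (roughly 80 IOPS for a 7,200rpm drive and 180 IOPS for a 15,000rpm drive). The names `IOPS_PER_DRIVE` and `pool_iops` are made up for illustration, and the sketch deliberately ignores RAID type, read/write ratio and the other sizing factors mentioned earlier.

```python
# Rough per-drive random IOPS, back-calculated from the pool totals in the text
# (1600 / 20 and 1080 / 6). Ballpark assumptions, not vendor specifications.
IOPS_PER_DRIVE = {"7.2k": 80, "15k": 180}


def pool_iops(drive_type, drive_count):
    """Ballpark random IOPS for a pool: per-drive IOPS times number of drives."""
    return IOPS_PER_DRIVE[drive_type] * drive_count


slow_pool = pool_iops("7.2k", 20)  # the "slow" pool: twenty 7,200rpm drives
fast_pool = pool_iops("15k", 6)    # the "fast" pool: six 15,000rpm drives
print(slow_pool, fast_pool)  # 1600 1080
```

Spindle count matters as much as spindle speed: the “slow” pool comes out ahead simply because it has more drives.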
These tiering technologies allow us to take different drive types and blend them into a single pool of storage – and then have the array manage which types of data reside on which type of disk. This removes the need to worry about “fast” and “slow” disk types, since they’re all just different buckets of IOPS that we have combined into a single pool. This approach drastically reduces our storage management overhead since we no longer need to decide if an application should go on “fast” or “slow” disk.
Storage tiering is great from a technical perspective. It’s also less management for better performance. So how does it save you money? If money were no object, we’d simply build arrays completely out of SSD and tiering technology wouldn’t be relevant at all. Unfortunately, we all live in the real world where building all-SSD arrays for general workloads isn’t often the most fiscally responsible thing to do.
What tiering allows us to do is look at your data center requirements and build to what you need. If we were to define our requirements as “30TB at 10,000 IOPS” – we could then use ALL of the different drive types to meet those requirements at the lowest possible cost. We’re no longer stuck deciding what drive type we want to use at the outset of the architecture.
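To make the “30TB at 10,000 IOPS” idea concrete, here is a rough brute-force sketch that searches for the cheapest blend of drive types meeting both targets. Every capacity, IOPS and cost figure in `DRIVES` is a made-up placeholder (only the 80 and 180 IOPS values echo the ballpark numbers used earlier), and `cheapest_mix` is a hypothetical helper, not how any array or vendor sizing tool actually works.

```python
from itertools import product
from math import ceil

# Hypothetical drive catalog: (capacity in GB, random IOPS, relative cost per drive).
# Every number here is a placeholder for illustration, not a real price or spec.
DRIVES = {
    "ssd": (400, 5000, 10),
    "15k": (600, 180, 2),
    "7.2k": (2000, 80, 1),
}


def cheapest_mix(capacity_gb, iops):
    """Brute-force the lowest-cost drive counts that meet both requirements."""
    names = list(DRIVES)
    # No drive type ever needs more drives than it takes to meet both targets alone.
    limits = [
        max(ceil(capacity_gb / DRIVES[n][0]), ceil(iops / DRIVES[n][1]))
        for n in names
    ]
    best_cost, best_counts = None, None
    for counts in product(*(range(limit + 1) for limit in limits)):
        total_gb = sum(c * DRIVES[n][0] for c, n in zip(counts, names))
        total_iops = sum(c * DRIVES[n][1] for c, n in zip(counts, names))
        if total_gb < capacity_gb or total_iops < iops:
            continue  # this blend misses the capacity or the performance target
        cost = sum(c * DRIVES[n][2] for c, n in zip(counts, names))
        if best_cost is None or cost < best_cost:
            best_cost, best_counts = cost, dict(zip(names, counts))
    return best_cost, best_counts


# 30TB (treated as 30,000 GB) at 10,000 IOPS
cost, mix = cheapest_mix(30_000, 10_000)
print(cost, mix)
```

With placeholder numbers like these, the search tends to land on a handful of SSDs for the IOPS plus a shelf of 7,200rpm drives for the capacity, which is exactly the kind of blend a single-drive-type design can’t give you.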
Expansion is the other great use case for storage tiering. Once upon a time, if we ran out of capacity in our “fast” pool, we’d need to add more “fast” disk even though the performance was adequate. With storage tiering, we can add expensive high-performance SSD drives when we’re out of performance, or cost-effective 7,200rpm drives when we’re out of capacity. Being able to balance cost and performance as we scale is huge!
While trying to predict what your storage performance requirements will be years in the future is impossible – maintaining the flexibility to grow in the most cost-effective manner results in huge savings.
In my next post, I’ll discuss the different types of tiering available in the market.