Return of the Jedi…, ehm, FlashCache

For the past couple of years I haven’t designed a single NetApp solution that included FlashCache, so I was quite surprised when NetApp introduced its latest hardware refresh along with ONTAP 9.1. All new controller models now come with NVMe M.2 flash chips, which in essence act as FlashCache. Does that mean that FlashCache is now a better option than FlashPool? The typical pre-sales answer applies: it depends 🙂

FlashCache vs FlashPool with Older generation FAS

Why haven’t I recommended FlashCache and opted to go with FlashPool instead? There are a few reasons:

Form Factor

In the past FlashCache was a PCI expansion card.

So no FlashCache on entry-level FAS2500 series, since there were no PCI slots available

The entry-level FAS8000 series model, i.e. the FAS8020, had only two PCI slots, which were in short supply and typically better used for additional back-end (SAS/FC) or front-end (10GbE/FC) connectivity

Limited number of PCI slots even with FAS8040/8060

Costs

FlashCache PCI cards were quite costly, and the same cache capacity could typically be achieved more cheaply with SSD drives and FlashPool

Expandability

Since there was a limited number of PCI slots available, you could get more cache capacity with FlashPool. Furthermore, FlashPool has a lighter metadata footprint than FlashCache: for the same amount of RAM you can support more FlashPool cache than FlashCache.

Functionality

FlashPool has a few functional advantages over FlashCache

No cache warm-up or performance degradation in case of an HA event. With FlashCache, the component providing the performance (the FlashCache PCI card) sat inside the storage controller, so when the controller went down, so did the FlashCache card. The HA partner would have a FlashCache card of its own, but none of the cached data, so after a failover there was a dip in performance while the FlashCache “warmed up” with new data. In a failover or HA event the cache capacity of the whole system was also halved, since 50% of the cache was down or unavailable.

With FlashPool, the component providing the performance (SSD drives in a SAS shelf) is failed over to the surviving controller together with the cached data, so there is no need to “rewarm” the cache.

FlashPool is also capable of accelerating writes. This is also the most misunderstood capability of FlashPool. Typically only a small portion of writes is accelerated by FlashPool, and IMHO this feature offers limited benefit.

Why is that? By design ONTAP / FAS is quite write friendly and does not benefit much from caching or accelerating writes. All writes are coalesced and acknowledged from NVRAM, a much faster medium than SSD or spinning disk. Typical latency for a single write operation is around 1-2 ms, even if there is no FlashPool and your back-end storage is just spinning rust (disks). So for a single write operation there is no benefit in using SSD media: the write is acknowledged well before it hits the cache or the back-end storage layer. Faster back-end media or caching only becomes beneficial when the back-end storage cannot keep up with incoming writes (and cannot flush NVRAM fast enough).

So what kind of writes does FlashPool accelerate? Only a small portion of writes.

  • The write has to be random. There is no sense in caching sequential writes; in most cases your cache would fill up before the data is referenced again.
  • The write has to be an overwrite. It doesn’t make sense to cache data that is only referenced once.

Let’s take a typical example of 70% reads / 30% writes. 70% of IO is reads, so for that part there is no difference between FlashPool and FlashCache performance, as both are capable of caching reads.

What about the remaining 30%, i.e. the writes? Not all writes are random and not all writes are overwrites. If 10% of your writes are random overwrites, then only 3% of total IO will benefit from FlashPool write acceleration. In most cases the write acceleration feature of FlashPool is a fairly small benefit.
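The arithmetic above can be sketched in a few lines of Python. The function name is made up for illustration, and the 10% share of random overwrites is just the example figure from the text; a real workload would have to be measured.

```python
def write_accelerated_share(write_fraction, random_overwrite_fraction):
    """Share of total IO that FlashPool write caching can help.

    write_fraction: writes as a fraction of all IO (0.30 for a 70/30 mix).
    random_overwrite_fraction: fraction of those writes that are random
    overwrites (an assumed, workload-specific number).
    """
    return write_fraction * random_overwrite_fraction


share = write_accelerated_share(0.30, 0.10)
print(f"{share:.0%} of total IO benefits from write acceleration")
```

Plugging in a more write-heavy or overwrite-heavy profile shows quickly whether write acceleration is worth paying for.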

FlashCache / FlashPool with new FAS platforms

So what would you recommend now with the new FAS platforms? Let’s look at the new platforms using the same criteria as before:

Form Factor & Communication Protocol

FlashCache now comes in an NVMe M.2 form factor. What does that mean? NVMe is a “protocol” designed for non-volatile memory, such as SSD or flash memory, and as such it is better suited to flash than the SCSI protocol, which was designed mainly to serve spinning media.

Comparison between NVMe and SCSI: NVMe for Absolute Beginners

M.2 is the physical connection. In the case of NetApp FAS, the M.2 connector is attached directly to a PCIe bridge. This is faster than having flash in a SAS-attached disk shelf.

So what does this mean in practice? With NVMe and M.2, flash is “closer” to the CPU and RAM, and you get lower latency and higher throughput than with SCSI/SATA/SAS-attached SSD drives.

Clear advantage for NVMe-based flash, i.e. FlashCache.

Costs

Some NVMe-based flash now comes as standard with the controllers, so there is no additional cost for basic FlashCache functionality.

With FlashPool you will have to pay for:

  • SSD media, i.e. SSD drives
  • Capacity license
  • Increased support costs

Clear advantage for FlashCache.

Functionality

The same functional differences between FlashCache and FlashPool still apply. At the moment FlashCache and FlashPool are separate entities: if you are using FlashPool, no data is inserted into FlashCache.

Maybe in future releases of ONTAP FlashPool will be able to take advantage of onboard NVMe flash, but at least for now it can’t.

Clear advantage for FlashPool.

Expandability

With the FAS2600 and FAS8200 series, flash is internal to the controller and cannot be expanded easily.

With the FAS2600 series the capacity is 512 GiB / controller or 1 TiB / HA-pair. The spare part listing for the FAS2600 also mentions a 1 TB NVMe module, so in the future there might be an option to order a controller with a larger FlashCache module.

With the FAS8200 series the internal flash capacity can be up to 2 TiB / controller or 4 TiB / HA-pair.

With the FAS9000 series there are separate M.2 slots (two per controller) for flash modules in the FAS9000 chassis, so it is easier to expand flash. 1 TiB and 4 TiB flash units are available.

But in any case the options for cache size are limited with FlashCache; there is more room to expand with FlashPool.

Clear advantage for FlashPool.

I think that this is the most critical criterion in most cases. If your working set is small enough to be served from FlashCache, then it makes sense to use the faster, no-cost option. If your working set is unknown, or so large that it cannot be accommodated in onboard FlashCache, then go with FlashPool (or All Flash FAS).
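As a rough illustration, that rule of thumb could be written down like this. It is only a sketch: the function and dictionary names are hypothetical, the per-controller figures are the ones quoted in this post, and the FAS9000 value is my own assumption of two M.2 slots filled with the largest 4 TiB module.

```python
GIB_PER_TIB = 1024

# Onboard FlashCache per controller, in GiB, as quoted in this post.
# FAS9000 is an assumption: two M.2 slots x the largest (4 TiB) module.
MAX_FLASHCACHE_GIB = {
    "FAS2600": 512,
    "FAS8200": 2 * GIB_PER_TIB,
    "FAS9000": 2 * 4 * GIB_PER_TIB,
}


def suggest_cache(platform, working_set_gib_per_controller=None):
    """Return 'FlashCache' or 'FlashPool' for one controller's working set.

    Passing None means the working set is unknown.
    """
    if working_set_gib_per_controller is None:
        return "FlashPool"  # unknown working set: keep room to expand
    if working_set_gib_per_controller <= MAX_FLASHCACHE_GIB[platform]:
        return "FlashCache"  # fits in the faster, no-extra-cost onboard flash
    return "FlashPool"  # too large for onboard flash (or consider All Flash FAS)


print(suggest_cache("FAS2600", 256))
print(suggest_cache("FAS2600", 2048))
print(suggest_cache("FAS8200"))
```

It deliberately ignores cost, failover behaviour and write acceleration, which are covered by the other criteria.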

Example sizing exercise

Let’s take a typical small NetApp use case which could be served with a FAS2600 series platform.

  • around 10 TiB usable capacity required
  • around 5% active working set

Solution with FlashPool

The FAS2650 has 24 SFF (2.5″) drive slots. The smallest SSD drive supported on this platform is 960 GB, with a right-sized capacity of 894 GiB. Ready-made bundles are typically cheaper to buy than à la carte configurations, so in this case we would use a bundle with 4 x 960 GB SSD + 20 x 900 GB SAS.

Since FlashPool also caches writes, its drives have to be RAID protected, whereas FlashCache only caches reads, so there is no need to use RAID to protect its data.

The most conservative approach would be to use RAID-DP with a spare for the SSD layer, which results in 1 spare, 2 parity and 1 data drive, i.e. 894 GiB / 2 = 447 GiB of cache per controller. This is the layout that Synergy recommends, so let’s use that. Alternatively, with such a small number of SSD drives, one could use RAID4 protection for the FlashPool cache: 1 spare, 1 parity and 2 data drives. With ADP the spare and parity drives are shared. Net result: the capacity of one data drive per controller, or 894 GiB of cache per controller. The same result could be achieved by using RAID-DP protection and no spare.
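The three layouts can be compared with a small calculation. This is a simplified model rather than how ONTAP really carves up partitions: it assumes that spare and parity are shared across the HA pair and that the remaining data capacity is split evenly between the two controllers. The function name is hypothetical.

```python
RIGHT_SIZED_GIB = 894  # right-sized capacity of one 960 GB SSD, from the text


def cache_per_controller_gib(ssd_count, parity, spares, controllers=2):
    """Simplified model: spare and parity are shared, data is split evenly."""
    data_drives = ssd_count - parity - spares
    if data_drives < 1:
        raise ValueError("no SSD left for data")
    return data_drives * RIGHT_SIZED_GIB / controllers


layouts = {
    "RAID-DP + spare": cache_per_controller_gib(4, parity=2, spares=1),
    "RAID4 + spare": cache_per_controller_gib(4, parity=1, spares=1),
    "RAID-DP, no spare": cache_per_controller_gib(4, parity=2, spares=0),
}
for name, gib in layouts.items():
    print(f"{name}: {gib:.0f} GiB cache per controller")
```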

According to Synergy, the remaining 20 x 900 GB drives would give about 4.77 TiB usable / controller, or about 9.5 TiB usable for the HA-pair.

  • End result:
    • (2 × 4.77 TiB) = 9.5 TiB or 9.5 TiB x 1024 ~ 9728 GiB usable capacity
    • (2 × 447 GiB) = 894 GiB cache capacity
    • 894 GiB x 100 / 9728 GiB ~ 9.19% cache capacity of usable capacity
  • Pros:
    • FlashPool functionality
      • No performance degradation in case of failover
      • Some write acceleration (but not much in most cases)
    • Cache Expansion
      • Easy to expand cache capacity by adding SSD drives
  • Cons:
    • Costs
      • Extra cost for SSD drives
      • Extra cost for OS Capacity license
      • Higher support costs
      • Higher overall cost
    • Lower usable capacity
      • Only 20 spinning disks
    • Onboard NVMe flash is not used (even though it comes with the system for “free”; there are no free lunches)

Solution with FlashCache

Take 24 x 900 GB SAS drives. Use the onboard NVMe flash as FlashCache.

  • End result:
    • (2 × 6.2 TiB) = 12.4 TiB or 12.4 TiB x 1024 ~ 12697 GiB usable capacity
    • (2 × 512 GiB) = 1024 GiB cache capacity
    • 1024 GiB x 100 / 12697 GiB ~ 8.06% cache capacity of usable capacity
  • Pros:
    • Costs
      • No extra cost for SSD media
      • No OS capacity license for SSD media
      • Lower support costs
      • Lower overall costs
    • Higher usable capacity
  • Cons:
    • No FlashPool functionality
      • Performance degradation in case of failover or HA-event
      • No FlashPool write acceleration (limited value)
    • Lower cache limits
      • Cannot expand FlashCache with entry-level systems
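To put the two options side by side against the stated requirements, here is a small sketch. The capacity and cache figures are the ones from this post; the variable names and the simple "cache is halved on failover" model for FlashCache are my own simplifications.

```python
GIB_PER_TIB = 1024

required_usable_tib = 10   # "around 10 TiB usable" from the requirements
working_set_percent = 5    # "around 5% active working set"
working_set_gib = required_usable_tib * GIB_PER_TIB * working_set_percent / 100

# name: (usable TiB, cache GiB for the HA pair, cache survives a failover?)
options = {
    "FlashPool": (9.5, 894, True),
    "FlashCache": (12.4, 1024, False),
}

results = {}
for name, (usable_tib, cache_gib, survives) in options.items():
    ratio = cache_gib / (usable_tib * GIB_PER_TIB)
    # FlashCache loses the failed controller's half; FlashPool SSDs move over.
    cache_after_failover = cache_gib if survives else cache_gib / 2
    results[name] = {
        "cache_ratio": ratio,
        "fits": working_set_gib <= cache_gib,
        "fits_after_failover": working_set_gib <= cache_after_failover,
    }
    print(name, f"cache/usable {ratio:.2%}", results[name])
```

With these inputs the working set comes to 512 GiB, which is exactly the size of one controller's FlashCache module, so in this simple model the FlashCache option has no headroom left after a failover.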

Which one would you choose then? It depends on your working set size, how volatile it is and how critical your performance requirements are. If you can live with performance degradation in a failover situation or under temporary heavy load, I would choose the cheaper solution, i.e. onboard FlashCache and only spinning disks. If consistent performance is the main concern, then FlashPool has some advantages and more room to expand if your performance requirements change.

Unfortunately, “working set” size is a hard number to find. If you are a current NetApp customer, there are tools to find this number; with other vendors’ products it might be difficult to dig out this information.

NetApp System Performance Modeller (SPM) is showing confusing results for the examples above. I was expecting SPM to show fairly similar numbers for workloads where the active working set is small enough to fit in the cache, maybe slightly better numbers for the FlashCache option, where there is a little more cache and a few more spinning disks. However, SPM showed that the maximum throughput (in 4k IOPS) of the FlashCache system was a few times higher than that of the FlashPool system, with significantly lower latency. Maybe I made a mistake in my test runs or there is a bug in SPM. I have to investigate more; maybe a topic for a future post.

Thanks for reading, comments are welcome 🙂
