[LINK] Flash Memory Implementation

stephen at melbpc.org.au
Wed Jun 27 02:38:32 AEST 2012


The popularity of flash memory has soared over the last year because 
flash has definite advantages over conventional media. It often isn't 
clear, however, what distinguishes one flash offering from another. 

Here is a review of four common flash design implementations, each of 
which has strengths and weaknesses.


(1) Let's start with the use of PCIe flash memory cards in servers, 
coupled with software that treats flash as an extension of system memory. 
Applications that depend on high-performance database access, where low 
latency is critical, can benefit from these cards.

Given the need for very high performance, data is generally moved as 
blocks closer to the application. Compared to traditional disk I/O, 
latency is far lower and the cost per IOPS is low. Because NFS is not the 
primary protocol used for data access, the customers that prefer this 
option are primarily SAN-minded folk who are very sensitive to latency.
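
The cost-per-IOPS point is easy to see with some back-of-the-envelope 
arithmetic; the prices and performance figures below are hypothetical, 
purely for illustration:

    # Hypothetical figures: a 15K RPM disk vs. a PCIe flash card.
    disk_cost, disk_iops   = 300.0, 200        # ~$300, ~200 IOPS
    flash_cost, flash_iops = 3000.0, 100_000   # ~$3,000, ~100K IOPS

    print(f"disk:  ${disk_cost / disk_iops:.2f} per IOPS")    # $1.50
    print(f"flash: ${flash_cost / flash_iops:.2f} per IOPS")  # $0.03

The card costs ten times as much yet delivers each IOPS at a fraction of 
the unit cost, which is why the economics favor flash for latency-
sensitive workloads despite the higher absolute price.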

This approach has several cons. First, it's not a shared storage model; 
every server that is to benefit must be furnished with its own flash 
cards. Second, it consumes inordinate amounts of CPU, because the wear 
leveling and grooming algorithms require a great many processor cycles. 
Third, for some customers, consuming PCIe slots is a concern. All of 
these considerations must be weighed when provisioning servers, to ensure 
adequate processor and PCIe slot capacity.
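
To give a sense of where those processor cycles go, here is a minimal 
sketch of dynamic wear leveling in Python; real controllers layer static 
wear leveling, garbage collection and error handling on top of this, 
which is what drives the CPU cost:

    # Hypothetical pool of 1,024 erase blocks, all initially free.
    erase_counts = {block: 0 for block in range(1024)}
    free_blocks = set(erase_counts)

    def allocate_block() -> int:
        """Write to the free block erased the fewest times, spreading wear."""
        block = min(free_blocks, key=erase_counts.__getitem__)
        free_blocks.remove(block)
        return block

    def reclaim_block(block: int) -> None:
        """Erase a block and return it to the free pool."""
        erase_counts[block] += 1
        free_blocks.add(block)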

(2) The second design approach is to build storage arrays purely from 
flash memory. These constitute shared storage targets that often sit on a 
SAN. You wouldn't purchase these systems to accelerate or displace NAS, 
but support for NFS caching can be included so long as the flash memory 
array sits alongside an NFS gateway server. The added latency of such a 
gateway makes it less than ideal in performance-sensitive environments. 
The pure SAN model has gained significant traction, displacing 
conventional storage from incumbent suppliers in latency-sensitive 
environments such as the financial markets.

Despite the raw performance, the storage management tools tend to lag. 
One of the major disadvantages of these systems is processor utilization 
in the storage array, which will likely be the bottleneck that limits 
scalability. Once the processors hit 100%, it doesn't matter how much 
more flash memory is installed; the system will be incapable of 
generating incremental I/O. A better approach might be to apply flash to 
the data that needs it and use less expensive media for the data that 
doesn't: aged or less important data doesn't require the same IOPS as hot 
data.
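
A tiering policy of that kind can be sketched in a few lines; the IOPS 
threshold and tier names here are hypothetical stand-ins for whatever a 
real array would use:

    HOT_IOPS_THRESHOLD = 500  # hypothetical cut-over point

    def choose_tier(recent_iops: float) -> str:
        """Place a dataset on flash only if its access rate warrants it."""
        return "flash" if recent_iops >= HOT_IOPS_THRESHOLD else "nearline_sata"

    print(choose_tier(12_000))  # hot database index  -> flash
    print(choose_tier(3))       # aged archive volume -> nearline_sata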

(3) The third design approach has taken on chameleon-like qualities. It 
can function either as a write-through caching appliance that offloads 
NAS or file servers, or simply as a file server. As a file server, it is 
positioned as an edge NAS that delivers performance to users. There is 
still a back-end NAS behind this device where everything is stored. 
Active data isn't moved to the edge NAS; it's copied to it, and the 
appliance uses faster media to increase performance for users.

The products come in the form of nodes that can make up a cluster. Nodes 
are configured with DRAM, NVRAM and either higher-performance SAS hard 
drives or flash memory. Together, the nodes form a high-performance 
cluster that can be managed under a single global namespace.

Data can be pushed to edge NAS nodes in different clusters in an effort 
to reduce WAN latency. Written data can either be flushed immediately to 
the back-end filers (hence the write-through caching model) or held on 
the cluster and flushed back, as written, at set intervals. No WAN 
optimization, such as de-duplication or compression, takes place in this 
model.
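
The two write policies can be sketched as follows; the EdgeNode class 
and its Filer back end are hypothetical simplifications of the behavior 
described above:

    class Filer:
        """Stub for the back-end NAS where everything is stored."""
        def write(self, block: int, data: bytes) -> None:
            print(f"filer received block {block}")

    class EdgeNode:
        def __init__(self, filer: Filer, write_through: bool):
            self.filer = filer
            self.write_through = write_through
            self.dirty = {}  # blocks held on the cluster between flushes

        def write(self, block: int, data: bytes) -> None:
            if self.write_through:
                self.filer.write(block, data)  # filer sees the write at once
            else:
                self.dirty[block] = data       # held until the next flush

        def flush(self) -> None:
            """Run at set intervals in the write-back mode."""
            for block, data in self.dirty.items():
                self.filer.write(block, data)
            self.dirty.clear()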

Among the pros of this design: it can deliver incremental performance to 
users when the back-end filers can't generate the required IOPS, and it 
could develop over time into a strong, full-featured scale-out NAS 
offering. In that scenario, though, the back-end NAS becomes a temporary 
fixture of the model.

The major downsides of this design are that it's not optimized to be 
purely a caching solution, and that when used as a file server it's 
intrusive to the existing NAS: if it holds on to writes in this mode, the 
back-end NAS can't receive them and execute snapshots or replication. For 
those looking for a cost-effective, clustered scale-out, this might well 
be an insurance policy, assuming these products do evolve into full-
fledged NAS appliances that don't rely on other back-end storage.

(4) This takes us to the fourth and final design, an NFS acceleration 
appliance. Here, flash is used as cache, not as storage. The appliance 
consists of DRAM and flash memory acting as a least-recently-used (LRU) 
cache for active data, combined with data acceleration technology made up 
of 10GbE acceleration hardware in silicon and custom software.
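
A minimal LRU read cache of the kind described can be sketched in 
Python; the capacity and the fetch callback (a stand-in for a read 
against the back-end filer) are hypothetical:

    from collections import OrderedDict

    class LRUCache:
        def __init__(self, capacity: int, fetch):
            self.capacity = capacity
            self.fetch = fetch            # called on a miss (hits the filer)
            self.entries = OrderedDict()

        def get(self, key):
            if key in self.entries:
                self.entries.move_to_end(key)  # mark as most recently used
                return self.entries[key]       # hit: filer is never touched
            value = self.fetch(key)            # miss: read through to the NAS
            self.entries[key] = value
            if len(self.entries) > self.capacity:
                self.entries.popitem(last=False)  # evict least recently used
            return value

    cache = LRUCache(capacity=2, fetch=lambda k: f"data-for-{k}")
    cache.get("a"); cache.get("b"); cache.get("a")
    cache.get("c")  # evicts "b", the least recently used entry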

This acceleration technology is optimized for processing NFS requests 
and getting data on and off the wire with minimal latency and CPU 
utilization. The appliance simply sits alongside the NAS already in place 
and acts as a read and NFS metadata cache and accelerator. Given typical 
NFS mixes, the appliance absorbs approximately 90% of the NFS traffic so 
the filer doesn't have to.

In this model, the appliance can support a number of filers, not just 
one, so it's both sharable and scalable. The intent is to give cycles 
back to the NAS so it can do its job of delivering performance to 
applications, extending its useful life.

The pros are simplicity and cost-effective performance. The cons include 
the inability to cache CIFS traffic today and the dependency on the back-
end NAS to handle non-cacheable operations. If there are bottlenecks on 
the filer, such as too few drives with limited IOPS to handle a burst of 
un-cached reads, there could be a temporary spike in latency. But in 
environments where many clients access much of the same data, this stops 
being an issue once the cache warms.


In summary, there are many flash products available in the market today. 
They are diverse enough that they shouldn't all be dumped into one bucket 
labeled "flash storage." Hopefully this review of the architectural 
differences, benefits and shortcomings of flash solutions helps readers 
make more informed choices.

(Note: This vendor-written tech primer has been edited by Network World 
to eliminate product promotion, but readers should note it will likely 
favor the submitter's approach.)

http://www.arnnet.com.au/article/428748/can_flash_live_up_hype_/

Cheers,
Stephen


