An Overview of Storage Area Networks
A storage area network (SAN) is a network that provides access to consolidated, block-level data storage. SANs are primarily used to make storage devices, such as disk arrays, tape libraries, and optical jukeboxes, accessible to servers so that the devices appear to the operating system as locally attached devices. A SAN typically has its own network of storage devices that are generally not accessible through the local area network (LAN) by other devices. The cost and complexity of SANs dropped in the early 2000s to levels allowing wider adoption across both enterprise and small to medium-sized business environments.
A SAN does not provide file abstraction, only block-level operations. However, file systems built on top of SANs do provide file-level access, and are known as shared-disk file systems.
Historically, data centers first created “islands” of SCSI disk arrays as direct-attached storage (DAS), each dedicated to an application, often visible as a number of “virtual hard drives” addressed as Logical Unit Numbers (LUNs). Essentially, a SAN consolidates these storage islands together using a high-speed network.
Operating systems maintain their own file systems on their own dedicated, non-shared LUNs, as though the LUNs were local disks. If multiple systems attempted to share a LUN, they would interfere with each other and quickly corrupt the data. Any planned sharing of data between different computers within a LUN requires additional software, such as a SAN file system or clustered computing.
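The corruption risk described above can be illustrated with a toy model. The sketch below is not how any real file system works; it assumes a single block of "directory metadata" and two hypothetical hosts (`Host`, `mount`, `create_file`, and `flush` are illustrative names) that each cache the block and write it back without coordinating.

```python
# Toy LUN: block 0 stands in for on-disk directory metadata (illustrative only).
lun = {0: ["report.txt"]}


class Host:
    """A host that caches metadata locally and is unaware of other hosts."""

    def __init__(self, name):
        self.name = name
        self.cache = None

    def mount(self, lun):
        # Take a private copy of the metadata block.
        self.cache = list(lun[0])

    def create_file(self, filename):
        # The change is made in the local cache only.
        self.cache.append(filename)

    def flush(self, lun):
        # Overwrite the whole block with the cached copy.
        lun[0] = list(self.cache)


host_a, host_b = Host("a"), Host("b")
host_a.mount(lun)
host_b.mount(lun)
host_a.create_file("a.log")
host_b.create_file("b.log")
host_a.flush(lun)
host_b.flush(lun)  # silently discards host A's update
```

After both flushes, the block lists `report.txt` and `b.log` only; `a.log` has been lost. Shared-disk file systems and cluster software exist to coordinate exactly this kind of concurrent access.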
Despite such issues, SANs help to increase storage capacity utilization, since multiple servers consolidate their private storage space onto the disk arrays. Common uses of a SAN include provisioning storage for transactionally accessed data that requires high-speed block-level access, such as the data behind email servers, databases, and high-usage file servers.
SAN compared to NAS
Network-attached storage (NAS) was designed independently of SAN systems. In both a NAS and SAN, the various computers in a network, such as individual users’ desktop computers and dedicated servers running applications (“application servers”), can share a more centralized collection of storage devices via a network connection like a local area network (LAN).
Concentrating the storage on one or more NAS servers or in a SAN instead of placing storage devices on each application server allows application server configurations to be optimized for running their applications and moves the storage management task to the NAS or SAN system. Both NAS and SAN have the potential to reduce the amount of excess storage that must be purchased and provisioned as spare space.
In a DAS-only architecture, each computer must be provisioned with enough excess storage to ensure that the computer does not run out of space at an untimely moment. In a DAS architecture, the spare storage on one computer cannot be utilized by another. With a NAS or SAN architecture, where storage is shared across the needs of multiple computers, one normally provisions a pool of shared spare storage that will serve the peak needs of the connected computers. This pool is typically smaller than the total amount of spare storage that would be needed if individual storage devices were dedicated to each computer.
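The provisioning argument above can be checked with a small worked example. The figures below are made up for illustration, and `das_capacity` and `shared_capacity` are hypothetical helper names; the sketch only assumes that the servers' peaks do not all occur at the same time.

```python
# Hypothetical per-server storage demand (in GB) sampled at four points in time.
demand = {
    "mail": [400, 900, 500, 450],
    "db": [700, 650, 1200, 700],
    "web": [300, 350, 300, 800],
}


def das_capacity(demand):
    """With DAS, each server must cover its own peak; spare space is not shared."""
    return sum(max(samples) for samples in demand.values())


def shared_capacity(demand):
    """With a shared pool, capacity must cover the peak of the combined demand."""
    combined = [sum(samples) for samples in zip(*demand.values())]
    return max(combined)


das_total = das_capacity(demand)      # 900 + 1200 + 800
pool_total = shared_capacity(demand)  # largest combined sample
```

With these numbers, dedicated storage needs 2900 GB in total, while a shared pool needs 2000 GB, because the three peaks fall at different times. If every server peaked at the same moment, the two totals would be equal.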
In a NAS, the storage devices are directly connected to a file server that makes the storage available at the file level to the other computers. In a SAN, the storage is made available at a lower “block-level,” leaving file system concerns to the “client” side. SAN protocols include Fibre Channel, iSCSI, ATA over Ethernet (AoE), and HyperSCSI. One way to loosely conceptualize the difference between a NAS and a SAN is that a NAS appears to the client operating system (OS) as a file server (the client can map network drives to shares on that server), whereas a disk available through a SAN still appears to the client OS as a disk, visible in disk and volume management utilities (along with the client’s local disks) and available to be formatted with a file system and mounted.
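To make the block-level side of this distinction concrete, here is a minimal sketch. It assumes a 512-byte block size and uses an in-memory buffer as a stand-in for a LUN; `BLOCK_SIZE`, `read_block`, and `write_block` are illustrative names, not part of any SAN protocol or API.

```python
import io

BLOCK_SIZE = 512  # assumed block size for illustration; real devices vary


def read_block(device, lba):
    """Return the block stored at logical block address `lba`."""
    device.seek(lba * BLOCK_SIZE)
    return device.read(BLOCK_SIZE)


def write_block(device, lba, data):
    """Overwrite one whole block at logical block address `lba`."""
    if len(data) != BLOCK_SIZE:
        raise ValueError("block-level writes operate on whole blocks")
    device.seek(lba * BLOCK_SIZE)
    device.write(data)


# An in-memory stand-in for a 4-block LUN, initially zero-filled.
lun = io.BytesIO(bytes(4 * BLOCK_SIZE))

# Block-level access: the client addresses blocks by number. Any notion of
# files, names, or directories has to be built by a file system on the client.
write_block(lun, 2, b"A" * BLOCK_SIZE)
block = read_block(lun, 2)
```

A NAS client, by contrast, would ask the file server for a named file and never see block addresses at all.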
The Downsides to NAS and SAN Architecture
One drawback to both the NAS and SAN architectures is that the connections between the various CPUs and the storage units are no longer dedicated high-speed buses tailored to the needs of storage access. Instead, the CPUs communicate with storage over a network, potentially creating bandwidth and performance bottlenecks.
Additional data security considerations are also required for NAS and SAN setups, as information is being transmitted via a network that potentially includes design flaws, security exploits, and other vulnerabilities that may not exist in a DAS setup.
While it is possible to use the NAS or SAN approach to eliminate all storage at user or application computers, those computers typically still have some local direct-attached storage for the operating system, various program files, and related temporary files used for a variety of purposes, including caching content locally.
Comparison of SAN, DAS, and NAS Architecture
To understand the differences, a comparison of SAN, DAS, and NAS architectures may be helpful.
Benefits of SAN Architecture
Sharing storage usually simplifies storage administration and adds flexibility since cables and storage devices do not have to be physically moved to shift storage from one server to another.
Other benefits include the ability to allow servers to boot from the SAN itself. This allows for a quick and easy replacement of faulty servers since the SAN can be reconfigured so that a replacement server can use the LUN of the faulty server.
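The server-replacement scenario above amounts to updating a mapping from LUNs to the servers allowed to use them. The following is a minimal sketch under that assumption; the table layout, server names, and `reassign_luns` are hypothetical and do not correspond to any vendor's management interface.

```python
# Hypothetical LUN-mapping table: LUN id -> identifier of the server using it.
lun_map = {
    0: "server-a",
    1: "server-b",
    2: "server-a",
}


def reassign_luns(lun_map, failed, replacement):
    """Return a new mapping in which every LUN of `failed` belongs to `replacement`."""
    return {
        lun: (replacement if owner == failed else owner)
        for lun, owner in lun_map.items()
    }


# server-a has failed; hand its LUNs (including its boot LUN) to a spare.
new_map = reassign_luns(lun_map, "server-a", "server-a-spare")
```

No cables or disks move in this model: the replacement server boots from the same LUN once the mapping changes.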
SANs also tend to enable more effective disaster recovery processes. A SAN could span a distant location containing a secondary storage array. This enables storage replication implemented by disk array controllers, by server software, or by specialized SAN devices. Since IP WANs are often the least costly method of long-distance transport, the Fibre Channel over IP (FCIP) and iSCSI protocols have been developed to allow SAN extension over IP networks. The traditional physical SCSI layer could support only a few meters of distance, which is not nearly enough to ensure business continuance in a disaster.
The economic consolidation of disk arrays has accelerated the advancement of several features including I/O caching, snapshotting, and volume cloning (Business Continuance Volumes or BCVs).
Most storage networks use the SCSI protocol for communication between servers and disk drive devices. A mapping layer to other protocols is used to form a network:
- ATA over Ethernet (AoE), mapping of ATA over Ethernet
- Fibre Channel Protocol (FCP), the most prominent one, is a mapping of SCSI over Fibre Channel
- Fibre Channel over Ethernet (FCoE)
- ESCON over Fibre Channel (FICON), used by mainframe computers
- HyperSCSI, mapping of SCSI over Ethernet
- iFCP or SANoIP, mapping of FCP over IP
- iSCSI, mapping of SCSI over TCP/IP
- iSCSI Extensions for RDMA (iSER), mapping of iSCSI over InfiniBand
Storage networks may also be built using SAS and SATA technologies. SAS evolved from SCSI direct-attached storage. SATA evolved from IDE direct-attached storage. SAS and SATA devices can be networked using SAS Expanders.
Figure: QLogic SAN switch with optical Fibre Channel connectors installed.
SANs often use a Fibre Channel fabric topology, an infrastructure specially designed to handle storage communications. It provides faster and more reliable access than higher-level protocols used in NAS. A fabric is similar in concept to a network segment in a local area network. A typical Fibre Channel SAN fabric is made up of a number of Fibre Channel switches.
Many SAN equipment vendors also offer some form of Fibre Channel routing, and these can allow data to cross between different fabrics without merging them. These offerings use proprietary protocol elements, and the top-level architectures being promoted are radically different. For example, they might map Fibre Channel traffic over IP or over SONET/SDH.
In media and entertainment
Video editing systems require very high data transfer rates and very low latency. SANs in media and entertainment are often referred to as serverless because the configuration places the video workflow (ingest, editing, playout) desktop clients directly on the SAN rather than attaching them to servers. Control of data flow is managed by a distributed file system such as StorNext by Quantum.
Per-node bandwidth usage control, sometimes referred to as quality of service (QoS), is especially important in video editing as it ensures fair and prioritized bandwidth usage across the network.
Storage virtualization
Storage virtualization is the process of abstracting logical storage from physical storage. The physical storage resources are aggregated into storage pools, from which the logical storage is created. It presents to the user a logical space for data storage and transparently handles the process of mapping it to the physical location, a concept called location transparency. This is implemented in modern disk arrays, often using vendor-proprietary technology. The broader goal of storage virtualization, however, is to group multiple disk arrays from different vendors, scattered over a network, into a single storage device that can then be managed uniformly.
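The logical-to-physical mapping at the heart of storage virtualization can be sketched as follows. This is an illustrative model only: it assumes fixed-size extents of 1024 blocks, and `Extent`, `LogicalVolume`, `locate`, and the array names are hypothetical rather than taken from any product.

```python
from dataclasses import dataclass

EXTENT_BLOCKS = 1024  # assumed number of blocks per extent


@dataclass(frozen=True)
class Extent:
    array: str  # physical disk array holding the extent (hypothetical name)
    start: int  # first physical block of the extent on that array


class LogicalVolume:
    """A logical volume stitched together from extents in a storage pool."""

    def __init__(self, extents):
        self.extents = list(extents)

    @property
    def size_blocks(self):
        return len(self.extents) * EXTENT_BLOCKS

    def locate(self, logical_block):
        """Translate a logical block number into (array, physical block)."""
        if not 0 <= logical_block < self.size_blocks:
            raise IndexError("logical block out of range")
        index, offset = divmod(logical_block, EXTENT_BLOCKS)
        extent = self.extents[index]
        return extent.array, extent.start + offset


# Extents drawn from two arrays appear to the user as one contiguous volume.
pool = [
    Extent("array-vendor-x", 0),
    Extent("array-vendor-y", 4096),
    Extent("array-vendor-x", 1024),
]
volume = LogicalVolume(pool)
```

The user of `volume` sees blocks 0 through 3071 and never learns which array serves a given block, which is the location transparency described above.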
Quality of service
QoS can be impacted in a SAN storage system by an unexpected increase in data traffic (a usage spike) from one network user, which can cause performance to decrease for other users on the same network. This is known as the “noisy neighbor effect.” When QoS services are enabled in a SAN storage system, the “noisy neighbor effect” can be prevented and network storage performance can be accurately predicted.
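One way a QoS policy could contain a noisy neighbor is max-min fair sharing of bandwidth. The sketch below illustrates that general technique, not the mechanism of any particular SAN product; the client names, the capacity, and the bandwidth units are assumptions chosen for the example.

```python
def max_min_fair_share(capacity, demands):
    """Split `capacity` among clients using max-min fairness.

    Clients asking for less than an equal share get what they ask for;
    the leftover is divided among the remaining clients.
    """
    allocation = {}
    remaining = float(capacity)
    # Serve the smallest demands first so leftovers flow to larger ones.
    pending = sorted(demands.items(), key=lambda item: item[1])
    while pending:
        share = remaining / len(pending)
        client, demand = pending.pop(0)
        granted = min(demand, share)
        allocation[client] = granted
        remaining -= granted
    return allocation


# Hypothetical demands in MB/s on a link that can carry 1000 MB/s in total.
demands = {"editor-1": 200, "editor-2": 300, "backup-job": 2000}
allocation = max_min_fair_share(1000, demands)
```

Here the two editing clients receive their full 200 and 300 MB/s, while the backup job is held to the remaining 500 MB/s instead of crowding the others out.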
Using SAN storage QoS is in contrast to using disk over-provisioning in a SAN environment. Over-provisioning can be used to provide additional capacity to compensate for peak network traffic loads. However, where network loads are not predictable, over-provisioning can eventually cause all bandwidth to be fully consumed and latency to increase significantly, resulting in SAN performance degradation.
Beth Zange-Sellers, PEI