This weekend I got a call from a local business that produces highlight reels from video game footage. Admittedly, I am not a big gamer (my PS3 was purchased at launch and has played games for less than four hours over the years), but I did think that this would be an interesting opportunity to do some troubleshooting over a rainy weekend.
Soon after arriving I saw the setup: rather beefy PCs using KillerNICs for gaming traffic, plus various Realtek 8111-series NICs and a few setups with dedicated Intel NICs running to an HP ProCurve gigabit switch. Separating the networks like this is a best practice when two very different types of traffic are in play: the gaming network was optimized for low latency, while the storage network was optimized for sequential transfers. The first thing I did was to look at the recording setup:
The main recording setup used a program called FRAPS to capture the gaming video. The “X” drive was a network share mapped to a lower-end four-drive NAS that uses a proprietary RAID-like feature to store data (I am deliberately not naming the manufacturer.) Inside the NAS were four 7,200rpm Seagate drives, so it initially seemed like there was plenty of drive throughput to saturate a gigabit Ethernet link. To remove variables, I brought along a small SSD-based NAS that I use to troubleshoot these types of problems.
When troubleshooting network connections, I tend to use a program called DU Meter, which has a few handy features for showing when and how much network traffic is being generated in Windows. I had the team load a quick 30-second run (it ended up taking almost 45 seconds including loading times) and used the DU Meter stopwatch feature.
Since I knew the network and the SSD NAS I was using were both easily capable of sustaining 125MB/s, I was able to determine that the maximum transfer speed the FRAPS application generated was in the area of 44MB/s, a figure that most custom-built NAS systems can handle very easily. When I tried using the pre-built NAS, transfers were less than 35MB/s.
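The stopwatch-style measurement above is just bytes moved divided by elapsed wall-clock time. As a rough sketch of what DU Meter's stopwatch reports (this is my own illustration, not DU Meter's actual implementation), the same figure can be computed by timing a file copy to the mapped share:

```python
import time

def measure_copy_throughput(src_path: str, dst_path: str, chunk_size: int = 1 << 20) -> float:
    """Copy src to dst and return the observed throughput in MB/s."""
    start = time.monotonic()
    copied = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            dst.write(chunk)
            copied += len(chunk)
    elapsed = time.monotonic() - start
    # 1 MB/s = 1,000,000 bytes per second
    return copied / elapsed / 1_000_000
```

Pointing `dst_path` at the “X” drive measures the NAS path; pointing it at a local SSD gives the baseline, which is how the ~44MB/s application-side ceiling was separated from the ~35MB/s NAS-side ceiling.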
Armed with these figures, I started to investigate the pre-built NAS in question. One thing I saw was that only about 600GB remained free on a device with approximately 6TB of raw capacity. The company did have extra, clean drives on hand, and after a quick reload of the OS and a rebuild of the storage pools, sequential transfers were in the 90-100MB/s range over a single gigabit Ethernet link. After re-installing the previous set of storage drives, which were 90% full, transfer speeds were back to around 35MB/s.
From what I can tell, the business was being penalized, performance-wise, on two fronts. First, the data was being written to the inner parts of the drive platters. On modern disks, it is not uncommon to see a drive go from 130MB/s or more at the outer edge to half of that at the disk's inner edge. The disks in the production system were filling the last portions of the platters, so per-drive performance was probably closer to 70MB/s. Second, the proprietary RAID-like feature is known to carry some overhead. That overhead was clearly present, because even with four new disks, sequential transfers were well below those of the SSD-based NAS I used, and a quick search confirmed that the vendor's RAID-like solution does have significant overhead. Between the fairly full disks and the performance penalty from the software, the NAS in question was unable to sustain 45MB/s.
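The first penalty can be put into rough numbers. As a back-of-the-envelope model (my own simplification, not a vendor specification), hard drives fill from the fast outer tracks inward, so sequential throughput can be treated as falling roughly linearly from the outer-edge rate to about half of it at the inner edge:

```python
def estimated_rate(fill_fraction: float, outer_mbps: float = 130.0) -> float:
    """Estimate sequential MB/s at a given fill level (0.0 = empty, 1.0 = full).

    Simple linear model: throughput drops from outer_mbps at the outer edge
    to half of that at the inner edge, as noted in the article.
    """
    inner_mbps = outer_mbps / 2
    return outer_mbps - fill_fraction * (outer_mbps - inner_mbps)

# At 90% full, the model predicts roughly 71.5 MB/s per drive,
# in line with the ~70 MB/s estimate above.
print(estimated_rate(0.9))
```

Real drives are zoned rather than perfectly linear, but the model is close enough to show why a 90%-full drive writes new data noticeably slower than a fresh one.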
Conclusion
One major thing people should consider when looking at benchmarks of NAS units online is the difference between new and well-used systems. As systems fill up, performance on spinning disks degrades simply due to the physical location of the writes on the platter. In addition, many SMB (and consumer) friendly NAS units are optimized for ease of use rather than performance. As a result, the performance of the business' storage server declined over time to the point where it became unacceptable even in a single-user environment. My advice is to research a given NAS product both when it is new (when most people benchmark) and after it has been deployed for some time. Doing so will help avoid the issue that the business I helped over the weekend experienced.
Besides writing to the inner (i.e. slower) tracks of the disks, when a hard drive becomes full I believe there is also a significant probability that the drives have become severely fragmented through usage, so newer files will be written all over the disk, with a lot of head movement that usually lowers transfer speeds a lot. An example: my Win 7 OS SSD has been written over about 20 times in about a year of usage and shows 38% fragmentation in Windows Defrag. One of the mail files on it, at 622 MB, has a little over 4,000 fragments. Reading it from the SSD probably takes a couple of seconds; from a disk, it would take a minute or more. That's why it is necessary to defrag hard drives, and I have never read anything saying that a NAS is capable of defragging its hard drives automatically…
@Mauro: These boxes often run ext3. Fragmentation doesn't degrade performance there the way it does on most Windows filesystems, though ext3 does tend to slow down (as most filesystems do) around the 90% full watermark.