Duncan wrote a great blog summarizing why HBAs are a better choice than RAID controllers. Looking back, we’ve seen a shift with some of our OEMs to go so far as to have their ReadyNodes always configured with HBA controllers, due to their simplicity, lower cost, and strong performance.
One question that has come up recently is, “What is the HBA 330+?” Dell customers may have noticed that the HBA 330 became the default option on their 13th-generation ReadyNodes some time ago. On Dell 14th-generation quotes, the card shows up with a “+” appended to its name, causing some concern that this device may not be the same one that was certified. After consulting with the vSAN ReadyLabs, it turns out this card has the exact same PCI ID and is, in fact, the exact same HBA; only minor cabling changes were made, and those in no way impact its recommended driver, firmware, or certification status. This is currently the ONLY certified option for Dell 14G ReadyNode servers, and I expect it will likely stay that way until NVMe replaces SCSI for customers.
Going forward, I expect NVMe to increasingly replace SAS/SATA, and in that case we will see a mixture of direct PCI Express connections and connections through a PCIe crossbar. All NVMe ReadyNodes I’ve seen tested show that taking the HBA out of the data path leads to lower latency, less CPU overhead, and more consistent outcomes.
I’ve also been getting some questions lately about vSAN’s deduplication and compression, so here are a few quick thoughts on getting the most out of this feature.
If you do not see any deduplication or compression savings at all:
- See if the object space reservation policy has been set above zero, as this reservation will effectively disable the benefits of deduplication for that virtual machine.
- Do not forget that the swap object is provisioned with a 100% object space reservation by default, though this can be changed.
- If a legacy client or provisioning command specifies “thick” or “Eager Zero Thick,” this will override the policy with a 100% object space reservation. To fix this, reapply the storage policy. William Lam has a great blog post with scripts on how to identify and resolve this.
- Make sure data is actually being written to the capacity tier. If you just provisioned 3-4 VMs, their data may still be sitting in the write buffer; vSAN does not waste CPU or add latency deduplicating or compressing data that may not have a long lifespan, so space savings only appear once data destages. If you only provisioned 10 VMs of 8GB each, it’s quite possible they have not destaged yet. If you are testing, clone a lot of VMs (I tend to create 200 or more) so you can force the destage to happen.
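The arithmetic behind that last point can be sketched in a few lines of Python. This is purely illustrative: the 600GB write-buffer figure is an assumption standing in for a typical all-flash disk group cap, and `vms_needed_to_force_destage` is a hypothetical helper, not a vSAN API.

```python
# Back-of-the-envelope check: will a test workload actually spill out of
# the write buffer and destage to the capacity tier?
# ASSUMPTION for illustration only: a 600GB usable write buffer per disk
# group. Check your own configuration for the real number.
WRITE_BUFFER_GB = 600

def vms_needed_to_force_destage(vm_size_gb, buffer_gb=WRITE_BUFFER_GB):
    """Smallest number of VMs whose combined data exceeds the buffer."""
    return buffer_gb // vm_size_gb + 1

# 10 VMs x 8GB = 80GB: nowhere near enough, data may never destage
# during a short test window.
print(10 * 8 < WRITE_BUFFER_GB)        # True -> likely still in cache
# 200 clones of an 8GB VM = 1600GB: comfortably forces destaging.
print(vms_needed_to_force_destage(8))  # 76
```

The point is simply that a handful of small test VMs never forces data through the full write path, so the space-efficiency counters stay at zero.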
Performance anomalies (and why they happen) when testing vSAN’s deduplication and compression.
I’ve always felt that it’s incredibly hard to performance test deduplication and compression features, as real-world data has a mix of compressibility and duplicate blocks. Here are some anomalies I’ve seen in testing. Note: these anomalies often occur on other storage systems with these features as well, and they highlight the difficulty of testing them.
- Testing with 100% duplicate data tends to make reads and writes look better than a baseline with the feature off, as you avoid any bottleneck in the destage-from-cache process, and the tiny amount of unique data will end up in a DRAM cache.
- Testing data that compresses poorly on vSAN will show little impact on read performance, as vSAN writes such data fully hydrated to avoid any CPU or latency overhead on decompression (not that LZ4 is a slow algorithm to begin with).
- Write throughput and IOPS for bursts that do not begin to fill the cache show little overhead. This is expected, as the data is initially written non-compacted to reduce latency.
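The first two bullets above come down to synthetic data sitting at the extremes of compressibility. A short Python sketch makes this visible; note that vSAN uses LZ4, while zlib’s DEFLATE is used here only as a readily available stand-in, and the 4KB block size is an assumption for illustration.

```python
import os
import zlib

BLOCK = 4096  # assumed compression granularity, for illustration

duplicate = b"\x00" * BLOCK     # 100% duplicate, trivially compressible
random_data = os.urandom(BLOCK) # effectively incompressible

def ratio(data):
    """Compressed size as a fraction of the original size."""
    return len(zlib.compress(data)) / len(data)

# The all-zero block shrinks to almost nothing, so a 100%-duplicate test
# barely touches the capacity tier; the random block yields no savings
# at all, so an incompressible test measures the fully hydrated path.
print(f"all-zero block: {ratio(duplicate):.3f}")
print(f"random block:   {ratio(random_data):.3f}")
```

Real workloads land somewhere between these two extremes, which is exactly why benchmarks built from either one alone say little about production behavior.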
These quirks stick out in synthetic testing, which is why I recommend reading the space efficiencies guide for guidance on using this and other features.