
Using SD cards for embedded ESXi and vSAN?

*Updated to include a corruption detection script and a better KB on endurance and size requirements for boot devices*

I get a lot of questions about embedded installations of VMware vSAN.

Cormac has written some great advice on this already.

This KB explains how to increase the crash dump partition size for customers with over 512GB of RAM.

vSAN trace file placement is discussed by Cormac here.

Given that vSAN does not support running VMFS on the same RAID controller used for pass-through, customers often look at embedded ESXi installs. Today a lot of deployments are done using embedded SD cards because they support a basic RAID 1 mirror.

The issue

While not directly a vSAN issue, this can impact vSAN customers; we have also identified it on non-vSAN hosts.

GSS has seen challenges with lower-quality SD cards exhibiting significantly higher failure rates, as bad batches in the supply chain have caused cascading failures in clusters. VMware has researched the issue and found that read amplification is making the substandard parts fail more quickly. Note that the devices will not outright fail, but the problem can be detected by running a hash of the first 20MB repeatedly and getting different results; it is commonly discovered on a reboot. As a result, 6.0 U3 includes a method of redirecting VMware Tools to a RAM disk, as this was found to be the largest source of reads to the embedded install. The process for setting this is as follows.

Prevention

Log into each host using an SSH connection and set the ToolsRamdisk option to “1”:

1. esxcli system settings advanced set -o /UserVars/ToolsRamdisk -i 1
2. Reboot the ESXi host
3. Repeat for remaining hosts in the cluster.
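As a quick sanity check (a minimal sketch; run from the same SSH session, and note that the exact ramdisk name can vary by build), you can confirm the option took effect:

# Confirm the advanced option now reports a value of 1.
esxcli system settings advanced list -o /UserVars/ToolsRamdisk
# After the reboot, list the ramdisks and look for a tools-related entry.
esxcli system visorfs ramdisk list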

Thanks to GSS/Engineering for hunting this issue down and getting this workaround out. More information can be found in the KB here. As a proactive measure I would recommend this flag for all embedded SD card and USB device deployments, as well as any environment that seeks faster VMware Tools performance.

Detection

Knowing is half the battle!

This host will likely not survive a reboot!

What if you do not know whether you are impacted by this issue? William Lam has written a great script that checks the MD5 hash of the first 20MB in three passes to detect it. (Thanks to Dan Barr for testing.)
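If you just want a quick manual spot check in the same spirit (a rough sketch only, not William’s script), something like the following can be run from an SSH session; it assumes md5sum is available in the ESXi shell, and the device path is an example that must be replaced with your host’s actual boot device (look under /dev/disks/):

# Hash the first 20MB of the boot device three times and compare the results.
DEV=/dev/disks/mpx.vmhba32:C0:T0:L0   # example path; substitute your boot device
for i in 1 2 3; do
  dd if=$DEV bs=1048576 count=20 2>/dev/null | md5sum
done
# If the three hashes differ, the media is returning inconsistent reads.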

Going forward I expect to see more deployments with high-endurance SATADOM devices, and in future server designs embedded M.2 slots for boot devices should become more common, with SD cards retired as the default option. While these devices may lack redundancy, I would expect a higher MTBF from one of them than from a pair of low-quality, low-cost SD cards. The lack of end-to-end nexus checking on embedded devices versus a full drive also contributes to this. Host profiles and configuration backups can mitigate a lot of the challenges of rebuilding a boot device in the event of a failure.

Mitigation

Check out this KB for how to back up your ESXi configuration (somewhere other than the local device).

Evacuate the host, swap in the new device with a fresh install, and restore the configuration.
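For reference, a minimal sketch of the CLI side of that backup (verify the exact commands against the KB for your ESXi version) looks like this from an SSH session:

# Flush any in-memory configuration changes to the boot device first.
vim-cmd hostsvc/firmware/sync_config
# Generate the backup bundle; this prints a URL to a configBundle .tgz file.
vim-cmd hostsvc/firmware/backup_config
# Download that bundle and store it somewhere other than the boot device itself.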

Looking for a new Boot Device?

Although a 1GB USB or SD device suffices for a minimal installation, you should use a 4GB or larger device. The extra space will be used for an expanded coredump partition on the USB/SD device. Use a high quality USB flash drive of 16GB or larger so that the extra flash cells can prolong the life of the boot media, but high quality drives of 4GB or larger are sufficient to hold the extended coredump partition. See Knowledge Base article http://kb.vmware.com/kb/2004784.

Looking for guidance on the endurance and size you need for an embedded boot device (as well as vSAN advice)? Check out KB 2145210, which breaks out what different use cases need.

 

VMware vSAN, Cisco UCS and Cisco ACI information

I’ve had a few questions regarding VMware vSAN with Cisco ACI.

While the guidance for ACI is mostly the same, there are a few vendor-specific considerations. In internal testing we found some recommended configuration advice and specific concerns around the multicast querier. For more information, see this new Storage Hub section of the networking guide.

If you’re looking for general vSAN networking advice, be sure to read the networking guide.

If you’re looking for Cisco’s documentation regarding UCS servers and VMware vSAN, it can be found here.

If you’re looking for guidance on configuring Cisco controllers and HBAs, Peter Keilty has some great blogs on this topic. As a reminder, while I would strongly prefer the Cisco HBA over the RAID controller, if you use the RAID controller you will need the cache module to get proper queue depths.

 

Virtual SAN Performance Service: What is it? (And what about these other things?)

One of the most exciting new features of Virtual SAN 6.2 is the performance service. This is an ESXi-native performance monitoring system with API as well as UI access.

One misconception I want to clear up is that it does not require vCenter Operations Manager or the vCenter database. Instead, the Virtual SAN performance service uses the Virtual SAN object store to store its data in a distributed and protected fashion. For smaller environments that do not want the overhead of VSOM, this is a great solution and will complement the existing tools.

Now why would you want to deploy VSOM if this turnkey, simple, low-overhead performance system is native? Quite a few reasons:

  • VSOM offers longer-term granular performance tracking. The native Virtual SAN performance service uses the same roll-up schedule as vCenter’s normal performance graphs.
  • VSOM allows for forecasting and capacity planning as it analyzes trends.
  • VSOM allows overlaying performance from multiple areas and systems (including things like switching and application KPIs) to do root-cause and anomaly analysis and correlation.
  • VSOM offers powerful integration with Log Insight, allowing event correlation with performance graphs.
  • VSOM allows for rolling up performance information across hundreds (or thousands) of sites into larger dashboards.
  • In heterogeneous environments using traditional storage, VSOM allows collecting fabric and array performance information.

vCenter Server vDisk Advanced metrics

So if I don’t enable this service (or deploy VSOM), what do I get? You still get basic latency, IOPS, and throughput information from the normal vCenter performance graphs by looking at the vDisk layer. You miss out on back-end component views (things like internal SSD queues and latency) as well as datastore and cluster-wide metrics, but you can still troubleshoot basic issues with the built-in performance graphs.

What about VSAN Observer? For those of you who remember, this information was previously only available by using the Ruby vSphere Console (RVC). VSAN Observer provides powerful visibility, but it had a number of limits:

  • It was originally designed for internal troubleshooting and lacked consistency with the vCenter UI.
  • It ran on its own web service separately and was not integrated into the existing vCenter graphs.
  • It was manually enabled from the RVC CLI.
  • It could not be accessed by API.
  • It was not recommended to run it continuously, or to deploy a separate virtual machine/container to run it from.

All of these limitations have been addressed with the Virtual SAN performance service.

I expect the performance service will largely replace VSAN Observer use. VSAN Observer will still be useful for customers who have not upgraded to VSAN 6.2 or who do not have capacity available for the performance database.

There are an extensive number of metrics that can be reviewed. It offers “top down” visibility of cluster-wide performance, and virtual machine IOPS and latency.

Individual device metrics

Virtual SAN Performance Service also offers “bottom up” visibility into device latency and queues on individual capacity and cache devices. For quick troubleshooting of issues or verification of performance, it is a great and simple tool that can be turned on with a single checkbox.

Requirements:

vSphere 6.0u2

vCenter 6.0u2 (For UI)

Up to 255GB of capacity on the Virtual SAN datastore (You can choose the storage policy it uses).

To enable it, simply follow these instructions.

 

Dispelling myths about VSAN and flash.

I’ve been having the same concerning conversation with several customers lately.

Myth #1 “VSAN must use flash devices from a small certified list”

Reality: There are over 600 different flash devices that have been certified (and this list is growing).

Myth #2 “The VSAN certified flash devices are expensive!”

Reality: Capacity-tier flash devices can be found in the 50-60 cents per GB range from multiple manufacturers. Caching-tier devices can be found for under $1 per GB. These prices have fallen from $2.50 per GB when VSAN was released in 2014. I expect this downward price trend to continue.

Myth #3 “I could save money with another vendor who will support using cheaper consumer grade flash. They said it would be safe”.

Reality: Consumer-grade drives lack capacitors to protect both upper and lower pages. In order to protect lower-cost NAND, these drives use volatile DRAM buffers to hold and coalesce writes. Low-end consumer-grade drives will ignore flush-after-write commands coming from the operating system, and on power loss can simply lose the data in this buffer. Other things that can happen include metadata corruption (loss of the lookup table, resulting in large portions of the drive becoming unavailable), shorn writes (where writes do not align properly with their boundary, losing data as well as improperly returning it on read), and non-serialized writes that could potentially corrupt file system or application-level recovery journals. Ohio State and HP Labs put together a great paper on all the things that can (and will) go wrong here. SSDs have improved since this paper, and others have done similar tests of drives with and without proper power-loss protection. The findings point to enterprise-class drives with power-loss protection being valuable.

Myth #4 “Those consumer grade drives are just as fast!”

Reality: IO latency consistency is less reliable on writes, and garbage collection takes significantly more time as there is less spare capacity to manage it. Flash is great when it’s fast, but when it’s not consistent, applications can miss SLAs. If using consumer-grade flash in a VSAN home lab, make sure you disable the high-latency drive detection. In our labs, under heavy sustained load we’ve seen some fairly terrible performance out of consumer flash devices.

In conclusion, there are times and places for cheap, low-end, consumer-grade flash (like in my notebook or home lab), but for production use where persistent data matters it should be avoided.

How to handle isolation with scale out storage

I would like to say that this post was inspired by Chad’s guide to storage architectures. When talking to customers over the years, a recurring problem surfaced. Storage in smaller enterprises historically tended toward going “all in” on one big array. The idea was that by consolidating the purchasing of all of the different application groups and teams, they could get the most “bang for buck”. The upsides are obvious (fewer silos and consolidation of resources and platforms mean lower capex/opex costs). The performance downsides were annoying but could be mitigated (normally noisy-neighbor issues). That said, the real downsides to having one (or a few) big arrays are often found hidden on the operational side.

  1. Many customers trying to stretch their budget often ended up putting Test/Dev/QA and production on the same array (I’ve seen Fortune 100 companies do this with business-critical workloads). This leads to one team demanding 2-year-old firmware for stability while the teams needing agility try to get upgrades. The battle between stability and agility gets fought regularly in change control committee meetings, wasting even more people’s time.
  2. Audit/regime change/regulatory/customer demands require an air gap be established for a new or existing workload. Array partitioning features are nice, but the demands often extend beyond this.
  3. In some cases, organizations that had previously shared resources would part ways. (divestment, operational restructuring, budgetary firewalls).
Feed me RAM!

“Not so stealthy database”

Some storage workloads just need more performance than everyone else, and often the cost of the upgrade is increased by the other workloads on the array that will gain no material benefit. Database administrators often point to a lack of dedicated resources when performance problems arise. Providing isolation for these workloads historically involved buying an exotic non-x86 processor and a “black box” appliance that required expensive specialty skills on top of significant capex cost. I like to call these boxes “cloaking devices” as they are often completely hidden from the normal infrastructure monitoring teams.

A benefit of using a scale-out (Type III) approach is that the storage can be scaled down (or even divided). VMware VSAN can evacuate data from a host and allow you to shift its resources to another cluster. Hybrid nodes can push up to 40K IOPS (and all-flash over 100K), allowing even smaller clusters to hold their own on disk performance. It is worth noting that the reverse action is also possible: when a legacy application is retired, the cluster that served it can be upgraded and merged into other clusters. In this way the isolation is really just a resource silo (the least threatening of all IT silos). You can still use the same software stack and leverage the same skill set while keeping change control, auditors, and developers happy. Even the database administrators will be happy to learn that they can push millions of orders per minute with a simple 4-node cluster.
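As a rough sketch of what that evacuation looks like from the ESXi shell (the vSAN data-migration flag and its accepted values are from memory and should be confirmed against your version with --help), putting a host into maintenance mode with a full data evacuation runs along these lines:

# Enter maintenance mode and evacuate all vSAN components from this host.
# (Flag names assumed; confirm with: esxcli system maintenanceMode set --help)
esxcli system maintenanceMode set --enable true --vsanmode evacuateAllData
# Once the data has drained, the host can be moved to another cluster and
# taken back out of maintenance mode there.
esxcli system maintenanceMode set --enable false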

In principle I still like to avoid silos. If they must exist, I would suggest trying to ensure that the hardware that makes them up is highly portable and reusable, and VSAN and vSphere can help with that quite a bit.

 

Upcoming Live/Web events…

Spiceworks, Dec 1st @ 1PM Central – “Is blade architecture dead?” A panel discussion on why HCI is replacing legacy blade designs and a talk about use cases for VMware VSAN.

Micron, Dec 3rd @ 2PM Central – “Go All Flash or go home.” We will discuss what is new with all-flash VSAN, what fast new things Micron’s performance lab is up to, and have an amazing discussion/Q&A with Micron’s team. Specifically, this should be a great discussion about why 10K and 15K RPM drives are no longer going to make sense going forward.

Intel Dec 16th @ 12PM Central – This is looking to be a great discussion around why Intel architecture (Network, Storage, Compute) is powerful for getting the most out of VMware Virtual SAN.

VSAN is now up to 30% cheaper!

OK, I’ll admit this is an incredibly misleading clickbait title. I wanted to demonstrate how the economics of cheaper flash make VMware Virtual SAN (and really any SDS product that is not licensed by capacity) cheaper over time. I also wanted to share a story of how older, slower flash became more expensive.

Let’s talk about a tale of two cities that had storage problems and faced radically different cost economics. One was a large city with lots of purchasing power and size, and the other was a small bedroom community. Who do you think got the better deal on flash?

Just a small town data center….

A 100-user pilot VDI project was kicking off. They knew they wanted great storage performance, but they could not invest in a big storage array with a lot of flash up front. They did not want to have to pay more tomorrow for flash, and they wanted great management and integration. VSAN and Horizon View were quickly chosen. They used the per-concurrent-user licensing for VSAN so their costs would scale cleanly and predictably. Modern, fast enterprise flash was chosen that cost ~$2.50 per GB and had great performance. This summer they went to expand the wildly successful project and discovered that the new version of the drives they had purchased last year now cost $1.40 per GB, and that other new drives on the HCL from the same vendor were available for ~$1 per GB. Looking at other vendors, they found even lower-cost options available. They upgraded to the latest version of VSAN and found improved snapshot performance, write performance, and management. Procurement could be done cost-effectively at small scale, and small projects could be added without much risk. They could even adopt the newest generation (NVMe) without having to forklift controllers or pay anyone but the hardware vendor.

Meanwhile in the big city…..

The second city was quite a bit larger. After a year-long procurement process and dozens of meetings, they chose a traditional storage array/blade system from a Tier 1 vendor. They spent millions and bought years’ worth of capacity to leverage the deepest purchasing discounts they could. A year after deployment, they experienced performance issues and wanted to add flash. Upon discussing it with the vendor, the only option was older, slower, small SLC drives. They had bought their array at the end of its sale window and were stuck with two-generations-old technology. It was also discovered that the array would only support a very small number of them (the controllers and code were not designed to handle flash). The vendor politely explained that since this was not part of the original purchase, the 75% discount off list from the original purchase would not apply, and they would need to pay $30 per GB. Somehow older, slower flash had become 4x more expensive in the span of a year. They were told they should have “locked in savings” and bought the flash up front. In reality, though, they would have been locking in a high price for a commodity that they did not yet need. The final problem they faced was an order to move out of the data center into 2-3 smaller facilities and split up the hardware accordingly. That big storage array could not easily be cut into parts.

There are a few lessons to take away from these environments.

  1. Storage should become cheaper to purchase as time goes on. Discounts should be consistent, and pricing should not feel like a game show. Software licensing should not be directly tied to capacity or physical hardware, and should “live” through a refresh.
  2. Adding new generations of flash and compute should not require disruption and “throwing away” your existing investment.
  3. Storage products that scale down and up without compromise lead to fewer meetings, lower costs, and better outcomes. Large purchases often lead to the trap of spending a lot of time and money on avoiding failure, rather than focusing on delivering excellence.

Veeam On (Part 2)

It has been a great week. VeeamOn has been a great conference, and it’s clear Rick and the team really wanted to put a different spin on the IT conference. While there are some impressive grand gestures, speakers, and sessions (I seriously feel like I’m in a spaceship right now), it’s the little details that stand out too.

My favorite little things so far.

  • General Session does not start at 8AM.
  • Breakfast runs till 9AM, not 7:45AM like certain other popular conferences. (Late-night networking and no food going into a general session is not a good idea!)
  • Day 2 keynotes are by the partners, giving me a mini-taste of VMware’s, HP’s, Cisco’s, NetApp’s, and Microsoft’s “vision” for the new world of IT.
  • A really interesting mix of attendees. I’ll go from having a conversation with someone running 10TB to someone else with 7PB, and some of the challenges we are discussing will be the same.
  • LabWarz is easy to wander into and tests your skills in a far more “fun” and meaningful way than trying to dive into a certification test between sessions.
  • The vendor expo isn’t an endless maze of companies; it features companies (and specifically only the products within them) that are relevant to Veeam users.

Veeam On! (Part 1)


I’m in Vegas rounding out the conference tour (VMworld, SpiceWorld, VMworld, DellWorld) for what looks to be a strong finish. This is my first time at VeeamOn, and I’m looking forward to briefings across the full Veeam portfolio. I’m looking forward to being shamed by the experts in LabWarz and getting my hands dirty with v9.

More importantly, I’m looking forward to some great conversations. The reason I value going to conferences goes beyond great sessions and discussions with vendors at the solutions expo. The conversations with end users (small, large, and giant) help you learn where the limits are (and how to push past them) in the tools you rely on. I’ve had short conversations over breakfast that saved me six months of expensive trial and error that others had been through. A good conference will attract both small and massive-scale customers and bring together great conversations that will help everyone change their perspective and get things done.

All good things…

I started my IT career as a customer. It was great having complete ownership of the environment, but eventually I wanted more. I moved to the partner side, and the past five years have been amazing. I have worked with more environments than I can count. It exposed me to diverse technical and operational challenges. It gave me the opportunity to see firsthand, past the marketing, what worked and what did not. I would like to thank everyone (customers, co-workers) and all of the people I was able to work with directly who helped me reach this point in my career. I also want to thank the people who freely share with the greater community. Their blogs, their words of caution, their advice, and their presentations at conferences all contributed to helping me succeed. I will miss the amazing team at Synchronet, but it was time for a change.

Starting today, I will be in a new role at VMware in Technical Marketing for VMware VSAN. I am excited for this change, and look forward to the challenges ahead. In this position I hope to learn and give back to the greater community that has helped me reach this point. I will still blog various musings here, but look for VSAN and storage content at Virtual Blocks.

I look forward to the road ahead!