Posts from the ‘Virtualization’ Category

Peanut Butter is Not Supported with vSphere/Storage Networking/vSAN/VCF

 From time to time I get oddball questions where someone asks about how to do something that is not supported or a bad idea. I’ll often fire back a simple “No” and then we get into a discussion about why VMware does not have a KB for this specific corner case or situation. There are a host of reasons why this may or may not be documented but here is my monthly list of “No/That is a bad idea (TM)!”.

How do I use VMware Cloud Foundation (VCF) with a VSA/Virtual Machine that can not be vMotion’d to another host?

This one has come up quite a lot recently, with some partners and storage vendors who use VSAs (virtual machines that locally consume storage in order to replicate it) incorrectly claiming this is supported. The issue is that SDDC Manager automates upgrade and patch management. In order to patch a host, all running virtual machines must be removed. This process is triggered when a host is placed into maintenance mode and DRS carefully vMotions VMs off of the host. If there is a virtual machine on the host that can not be powered off or moved, this will cause lifecycle to fail.
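The failure mode can be sketched in a few lines of Python. This is a toy model, not anything SDDC Manager or DRS actually runs: the `VM` class, its two flags, and the VM names are all hypothetical, and the only rule encoded is the one above (a VM that can be neither moved nor powered off blocks maintenance mode).

```python
from dataclasses import dataclass


@dataclass
class VM:
    name: str
    can_vmotion: bool    # False for a VSA pinned to a host's local storage
    can_power_off: bool  # False if powering it off would take storage offline


def maintenance_mode_blockers(vms):
    """Return the names of VMs that can be neither moved nor powered off."""
    return [vm.name for vm in vms if not (vm.can_vmotion or vm.can_power_off)]


# Hypothetical host: one ordinary VM and one pinned VSA.
host_vms = [
    VM("app-01", can_vmotion=True, can_power_off=True),
    VM("vsa-01", can_vmotion=False, can_power_off=False),
]
blockers = maintenance_mode_blockers(host_vms)
print(blockers)
```

Any non-empty result means the host can not be evacuated, so an automated patch run would stall on that host.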

What about if I use the VSA’s external lifecycle management to patch ESXi?

The issue is that running multiple host patching systems is a “very bad idea” (TM). You’ll have issues with SDDC Manager not understanding the state of the hosts, and coordination of non-ESXi elements (NSX perhaps using a VIB) would also be problematic. The only exceptions to using SDDC Manager with external lifecycle tooling are select vendor LCM solutions that have done customization and interop work (examples include VxRAIL Manager, the Redfish to HPE Synergy integration, and packaged VCF appliance solutions like UCP-RS and VxRACK SDDC). Note these solutions all use vSAN, avoid the VSA problem, and have done the engineering work to make things play nice.

JAM also not supported!

Should I use a Nexus 2000K (or other low performing network switch) with vSAN?

While vSAN does not currently have a switch HCL (Watch this space!) I have written some guidance specifically about FEXs on this personal blog. The reality is there are politics to getting a KB written saying “not to use something”, and it would require cooperation from the switch vendors. If anyone at Cisco wants to work with me on a joint KB saying “don’t use a FEX for vSAN/HCI in 2019” please reach out to me! Before anyone accuses me of not liking Cisco, I’ll say I’m a big fan of the C36180YC-R (ultra deep buffers RAWR!), and have seen some amazing performance out of this switch recently when paired with Intel Optane.

Beyond the FEX, I’ve written some neutral switch guidance on buffers on our official blog. I do plan to merge this into the vSAN Networking Guide this quarter. 

I’d like to use RSPAN against the vDS and mirror all vSAN traffic, I’d like to run all vSAN traffic through a ASA Firewall or Palo Alto or IDS, Cisco ISR, I’d like to route vSAN traffic through a F5 or similar requests…

There’s a trend of security people wanting to inspect “all the things!”.  There are a lot of misconceptions about vSAN routing or flowing or going places.

Good Ideas! – There are some false assumptions that you can’t do the following. While they may add complexity or not be supported on VCF or VxRAIL in certain configurations, they certainly are just fine with vSAN from a feasibility standpoint.

  1. Routing storage traffic is just fine. Modern enterprise switches can route OSPF/Static routes etc at wire-speed just fine all in the ASIC offloads. vSAN is supported over layer 3 (may need to configure static routes!) and this is a “Good idea” on stretched clusters so spanning tree issues don’t crash both datacenters!
  2. vSAN over VxLAN/VTEP in hardware is supported.
  3. VSAN over VLAN backed port groups on NSX-T is supported.
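For the layer 3 case in item 1, the static route is usually added per host with `esxcli network ip route ipv4 add`. Below is a minimal sketch that only builds the command string (it does not run anything). The helper name and the example subnets are hypothetical, and the sanity check is an assumption about what a sensible route looks like.

```python
import ipaddress


def vsan_static_route_cmd(remote_network: str, gateway: str) -> str:
    """Build the esxcli command that routes a remote vSAN subnet via a local gateway."""
    net = ipaddress.IPv4Network(remote_network, strict=True)
    gw = ipaddress.IPv4Address(gateway)
    if gw in net:
        # The next hop has to be reachable locally, so it can not sit in the remote subnet.
        raise ValueError("gateway must be on the local network, not the remote one")
    return f"esxcli network ip route ipv4 add --network {net} --gateway {gw}"


# Hypothetical stretched cluster: the other site's vSAN network is 192.168.20.0/24.
print(vsan_static_route_cmd("192.168.20.0/24", "192.168.10.1"))
```

Building the string separately from running it makes it easy to review the routes for every host before anything is changed.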

Bad Ideas!

Frank Escaros-Buechsel with VMware support once told someone “While we do not document that as not supported, it’s a bit like putting peanut butter in a server. Some things we assume are such bad ideas no one would try them, and there is only so much time to document all bad ideas.”

  1. Trying to mirror high throughput flows of storage or vMotion from a VDS is likely to cause performance problems. While I’m not sure of a specific support statement, I’m going to kindly ask you not to do this. If you want to know how much traffic is flowing and where, consider turning on SFLOW/JFLOW/NetFlow on the physical switches and monitoring from that point. vRNI can help quite a bit here!
  2. Sending iSCSI/NFS/FCoE/vSAN storage traffic to an IDS/Firewall/Load balancer. These devices do not know how to inspect this traffic (trust me, they are not designed to look at SCSI or NVMe packets!) so you’ll get zero security value out of this process. If you are looking for virus binaries, you’re better off using NSX guest introspection and regular antivirus software. Because of the volume, you will hit the wire-speed limits of these devices. Beyond the added path latency, you will quickly introduce drops and re-transmits and murder storage traffic performance. Outside of some old niche inline FC encryption blades (that I think Netapp used to make), inline storage security devices are a bad idea. While there are some carrier-grade routers that can push 40+ Gbps of encryption (MLXe’s I vaguely remember did this) the costs are going to be enormous, and you’ll likely be better off just encrypting at the vSCSI layer using the VM Encryption VAIO filter. You’ll get better security than IPSEC/MACSEC without massive costs.
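The wire-speed point in item 2 is easy to check with back-of-the-envelope arithmetic. A minimal sketch, with every number below being a made-up assumption (host count, NIC speed, average utilization, and the inline device's rated throughput):

```python
def inline_device_oversubscription(hosts, nic_gbps, utilization, device_gbps):
    """Ratio of offered storage traffic to what an inline device can forward."""
    offered_gbps = hosts * nic_gbps * utilization
    return offered_gbps / device_gbps


# Hypothetical: 8 hosts with 10 Gbps storage NICs at 50% load, behind a 10 Gbps firewall.
ratio = inline_device_oversubscription(hosts=8, nic_gbps=10, utilization=0.5, device_gbps=10)
print(f"offered load is {ratio:.1f}x the device's rated throughput")
```

Anything above 1.0 means the device has to queue or drop, which is where the re-transmits come from.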

Did I get something wrong?

Is there an Exception?

Feel free to reach out and let’s talk about why your environment is a snowflake from these general rules of things “not to do!”

Where did my host go….

UPDATE: https://kb.vmware.com/s/article/53749
VMware and Intel have a KB for workarounds on this issue.

I was reading Bob Plankers’ colorful complaints about his Intel X710/XL710/X722/XXV710 family of NICs and figured I’d do some digging and ask around among people I know who have them, as well as summarize some things I learned from using them as a customer.

A few observations:

  • These problems are not specific to vSphere. People running Linux and Windows on bare metal ran into these issues
  • While a lot has been focused on the LRO/TSO issue, there is another separate issue tied to LLDP and duplicate mac addresses being created.

First Issue LRO/TSO

This KB sums up the issue quite well by pointing out that these features can cause PSODs. Checking with some friends who used to be able to reproduce this at the drop of a hat, the newest driver/firmware is a lot more stable in this regard, but it can still happen. Some people are leaving these disabled to stay safe, while others are hungry for the small CPU gains these features deliver. How do I remediate it? Beyond manually setting it on the hosts, Jase McCarty has a great script that will do this in bulk for a cluster.
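For the manual route, the change is a handful of advanced settings per host. The sketch below only prints the `esxcli` commands and is not the script mentioned above. The three option names are assumed to be the ones that control hardware TSO (IPv4 and IPv6) and VMkernel LRO, so verify them against the KB before running anything.

```python
# Assumed advanced option names; confirm against the KB for your ESXi release.
OFFLOAD_OPTIONS = ("/Net/UseHwTSO", "/Net/UseHwTSO6", "/Net/TcpipDefLROEnabled")


def disable_offload_cmds(options=OFFLOAD_OPTIONS):
    """Return one esxcli command per option, each setting the integer value to 0."""
    return [f"esxcli system settings advanced set -o {opt} -i 0" for opt in options]


for cmd in disable_offload_cmds():
    print(cmd)
```

Setting the value back to 1 would re-enable the feature, which is handy if you want to test whether a newer driver/firmware combination behaves.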

Next up: The case of the disappearing host!

The common symptom is that management on a host will cease to function (Pings will drop) and the host will disappear from vCenter. Sometimes something more catastrophic happens (HA triggers, host isolation is triggered, storage or vMotion fails). If you pay attention closely to LogInsight, you will see your switches are reporting Mac Address Flapping (You are sending your switches syslog into LogInsight, RIGHT?!?)

So what’s going on here?

How is VMK0 special?

This goes back to the special behavior for VMK0 where it steals the mac address from a physical port. This is handy for new cluster setups where people know the MAC addresses from the OEM providing them before delivery being able to put this in their DHCP reservations and get started without needing to physically touch the hosts to know which one is which etc.

Why is this card special?

This card is unique in that there is a special LLDP agent that runs on the card and intercepts LLDP packets.  Previously I associated LLDP with simply sending information on what’s plugged in where (which is why you should turn it on for send/receive with your VDS). In this case, though the LLDP agent will also update where a MAC is located.

Why does this happen when you put the two together?

The challenge comes when VMK0 moves to a different physical switch port and tries to move the MAC address with it. You get a fun ARP battle between the LLDP agent of the physical port and the VMK0 that is behind a different physical port. A good old fashioned duplicate MAC entry ARP battle ensues, and this is going to manifest itself as a host going offline completely, or flipping back and forth based on the update hold-down interval on the switch. (Side note, any real networking people feel free to correct me on my terminology here I dropped out of my CCNA class in 2008).
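To make the flapping concrete, here is a toy model of a switch MAC table. It is a simplification (real switches add aging timers and hold-down logic), and the MAC address and port names are hypothetical. Every time the same source MAC shows up on a different port, the table entry moves, and that move is what the switch logs as a flap.

```python
def count_mac_moves(frames):
    """frames: iterable of (source_mac, ingress_port). Returns {mac: number of port moves}."""
    table = {}  # mac -> port where it was last learned
    moves = {}
    for mac, port in frames:
        if mac in table and table[mac] != port:
            moves[mac] = moves.get(mac, 0) + 1
        table[mac] = port
    return moves


MAC = "aa:bb:cc:00:00:01"  # hypothetical MAC shared by the NIC's LLDP agent and VMK0
# The LLDP agent keeps sourcing frames on Eth1/1 while VMK0 now sends from Eth1/2.
frames = [(MAC, "Eth1/1"), (MAC, "Eth1/2")] * 3
flaps = count_mac_moves(frames)
print(flaps)
```

With VMK0 and the LLDP agent behind the same port, the same function returns an empty dict, which is the healthy case.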

Why did I lose more than management (or what am I doing wrong!)?

Given most people use VMK0 for management by vCenter (and for non-VSAN clusters HA heartbeats happen here) this can cause a lot of interesting behaviors, like loss of management or the host isolation response being triggered. This is another great reminder of why you should use datastore heartbeating, or VSAN, which will not depend on VMK0 for heartbeats.

Also if you are running EVERYTHING on VMK0 (storage, vMotion), which is NOT a recommended practice (isolate storage and vMotion networks!), you could see all of the virtual machines crash and other fun things.

Workarounds?

So there are a few ways to possibly work around this.

  1. You could simply avoid using VMK0 with this card, by disconnecting it and using a new VMkernel port (VMK4 and so forth) for whatever it was being used for. This is simple, it’s easy (outside of disconnecting and reconnecting hosts) and doesn’t require you to touch the network beyond having one extra IP on the management network to make it easier.
  2. You could change the MAC address manually to something in the random VMware MAC address space (Need to clarify if this is supported, but it’s simple enough and avoids this issue). Note that the MAC would be set back if you ever remove and recreate VMK0.
  3. If you trust your networking team, you could try asking them to hardcode the MAC address to specific ports in the CAM tables of the switch. I would look at this only as a last resort if operationally you can’t physically change anything on the hosts but need an extreme workaround.
  4. *EDIT* It looks like running LACP across the original physical port and another port will work around the issue. The switch isn’t going to care where the frame comes from, and so this should reduce or eliminate the chance of an ARP fight. Balancing for VMK0 across physical ports will not be great, but as long as it is management only you will likely not care too much. (Thanks to Simon for this discussion).
  5. *EDIT* Try putting VMK0 on a tagged NON-Native VLAN. It can’t get in a fight with the LLDP agent for the MAC address if it’s on a completely different broadcast domain (Thanks to Broc Yanda for this idea).

What else is going on that I don’t know about vSphere Networking?

This week I also learned about shadow vmnics.

Looking for VMware Storage Content?

Looking for Demo’s, Videos, Design and sizing guides, VVOLs, SRM, VSAN?

Go check out storagehub.vmware.com

Did you get a fake ReadyNode?

We’ve all been there…

Maybe it’s the streets of NYC, or a corner stall in a mall in Bangkok, or even Harwin St here in Houston. Someone tried to sell you a cut rate watch or sunglasses. Maybe the lettering was off, or the gold looked a bit flakey, but you passed on that possibly non-genuine watch or sunglasses. It might have even been made in the same factory, but it is clear the QC might have issues. You would not expect the same outcome as getting the real thing. The same thing can happen in ReadyNodes.

Real ReadyNodes for VMware vSAN have a couple key points.

They are tested. All of the components have been tested together and certified. Beware anyone in software-defined storage who doesn’t have some type of certification program as this opens the doors to lower quality components, or hardware/driver/firmware compatibility issues. VMware has validated satisfactory performance with the ReadyNode configurations. A Real ReadyNode looks beyond “will these components physically connect” and if they will actually deliver.

vSAN ReadyNodes offer choice. ReadyNodes are available from over a dozen different server OEMs. The VMware vSAN Compatibility Guide also offers over a thousand verified hardware components to supplement these ReadyNodes for further customization. ReadyNodes are not limited to a single server or component vendor.

They are 100% supported by VMware. Real VMware ReadyNodes don’t require virtual machines to mount, present or consume storage, or non-VMware supported VIBs be installed.

They are Mature. They run a 7th release, battle-tested, mature hypervisor integrated storage stack.

So what do you do if you’ve ended up with a fake ReadyNode? Unlike the fake watch I had to throw away, you can check with the vSAN compatibility list and see if you can, with minimal controller or storage device changes, convert your system in place over to vSAN. Remember if you’re running ESXi 5.5 Update 1 or newer, you already have vSAN software installed. You just need to license and enable it!

How to handle isolation with scale out storage

I would like to say that this post was inspired by Chad’s guide to storage architectures. When talking to customers over the years a recurring problem surfaced. Storage historically in the smaller enterprises tended towards people going “all in” on one big array. The idea was that by consolidating the purchasing of all of the different application groups and teams they could get the most “bang for buck”. The upsides are obvious (fewer silos and consolidation of resources and platforms means lower capex/opex costs). The performance downsides (normally noisy neighbor issues) were annoying but could be mitigated. That said, the real downsides to having one (or a few) big arrays are often found hidden on the operational side.

  1. Many customers trying to stretch their budget often ended up putting Test/Dev/QA and production on the same array (I’ve seen Fortune 100 companies do this with business critical workloads). This leads to one team demanding 2 year old firmware for stability, and the teams needing agility trying to get upgrades. The battle between stability and agility gets fought regularly in the change control committee meetings further wasting more people’s time.
  2. Audit/regime change/regulatory/customer demands require an air gap be established for a new or existing workload. Array partitioning features are nice, but the demands often extend beyond this.
  3. In some cases, organizations that had previously shared resources would part ways. (divestment, operational restructuring, budgetary firewalls).

Feed me RAM!

“Not so stealthy database”

Some storage workloads just need more performance than everyone else, and often the cost of the upgrade is increased by the other workloads on the array that will gain no material benefit. Database Administrators often point to a lack of dedicated resources when performance problems arise. Providing isolation for these workloads historically involved buying an exotic non-x86 processor, and a “black box” appliance that required expensive specialty skills on top of significant Capex cost. I like to call these boxes “cloaking devices” as they are often completely hidden from the normal infrastructure monitoring teams.

A benefit to using a scale out (Type III) approach is that the storage can be scaled down (or even divided). VMware VSAN can evacuate data from a host, and allow you to shift its resources to another cluster. Hybrid nodes can push up to 40K IOPS (and all flash over 100K), allowing even smaller clusters to hold their own on disk performance. It is worth noting that the reverse action is also possible. When a legacy application is retired, the cluster that served it can be upgraded and merged into other clusters. In this way the isolation is really just a resource silo (the least threatening of all IT silos). You can still use the same software stack, and leverage the same skill set while keeping change control, auditors and developers happy. Even the Database administrators will be happy to learn that they can push millions of orders per minute with a simple 4 node cluster.
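As a rough sizing sketch using the per-node figures above: the linear scaling and the 3 node minimum are assumptions made for illustration, and real numbers depend heavily on workload, block size, and policy.

```python
import math

HYBRID_IOPS_PER_NODE = 40_000       # "up to" figure quoted above
ALL_FLASH_IOPS_PER_NODE = 100_000   # quoted as "over 100K"; used here as a floor


def nodes_needed(target_iops, per_node_iops, min_nodes=3):
    """Nodes required for a target IOPS, assuming performance scales linearly per node."""
    return max(min_nodes, math.ceil(target_iops / per_node_iops))


# Hypothetical isolated database cluster that needs 150K IOPS.
print(nodes_needed(150_000, HYBRID_IOPS_PER_NODE))
print(nodes_needed(150_000, ALL_FLASH_IOPS_PER_NODE))
```

The same function works in reverse when a workload is retired: recompute the target and see how many nodes can be evacuated and moved to another cluster.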

In principle I still like to avoid silos. If they must exist, I would suggest trying to find a way that the hardware that makes them up is highly portable and re-usable, and VSAN and vSphere can help with that quite a bit.

 

Upcoming Live/Web events…

Spiceworks  Dec 1st @ 1PM Central- “Is blade architecture dead” a panel discussion on why HCI is replacing legacy blade designs, and talk about use cases for VMware VSAN.

Micron Dec 3rd @ 2PM Central – “Go All Flash or go home”   We will discuss what is new with all flash VSAN, what fast new things Micron’s performance lab is up to, and an amazing discussion/QA with Micron’s team. Specifically this should be a great discussion about why 10K and 15K RPM drives are no longer going to make sense going forward.

Intel Dec 16th @ 12PM Central – This is looking to be a great discussion around why Intel architecture (Network, Storage, Compute) is powerful for getting the most out of VMware Virtual SAN.

VSAN is now up to 30% cheaper!

Ok, I’ll admit this is an incredibly misleading click bait title. I wanted to demonstrate how the economics of cheaper flash make VMware Virtual SAN (and really any SDS product that is not licensed by capacity) cheaper over time. I also wanted to share a story of how older slower flash became more expensive.

Let’s talk about a tale of two cities who had storage problems and faced radically different cost economics. One was a large city with lots of purchasing power and size, and the other was a small little bedroom community. Who do you think got the better deal on flash?

Just a small town data center….

A 100 user pilot VDI project was kicking off. They knew they wanted great storage performance, but they could not invest in a big storage array with a lot of flash up front. They did not want to have to pay more tomorrow for flash, and wanted great management and integration. VSAN and Horizon View were quickly chosen. They used the per concurrent user licensing for VSAN so their costs would cleanly and predictably scale. Modern fast enterprise  flash was chosen that cost ~$2.50 per GB and had great performance. This summer they went to expand the wildly successful project, and discovered that the new version of the drives they had purchased last year now cost $1.40 per GB, and that other new drives on the HCL from their same vendor were available for ~$1 per GB. Looking at other vendors they found even lower cost options available.  They upgraded to the latest version of VSAN and found improved snapshot performance, write performance and management. Procurement could be done cost effectively at small scale, and small projects could be added without much risk. They could even adopt the newest generation (NVMe) without having to forklift controllers or pay anyone but the hardware vendor.

Meanwhile in the big city…..

The second city was quite a bit larger. After a year long procurement process and dozens of meetings they chose a traditional storage array/blade system from a Tier 1 vendor. They spent millions and bought years worth of capacity to leverage the deepest purchasing discounts they could. A year after deployment, they experienced performance issues and wanted to add flash. Upon discussing with the vendor the only option was older, slower, small SLC drives. They had bought their array at the end of sale window and were stuck with 2 generations old technology. It was also discovered the array would only support a very small number of them (the controllers and code were not designed to handle flash). The vendor politely explained that since this was not a part of the original purchase the 75% discount off list that had been on the original purchase would not apply and they would need to pay $30 per GB. Somehow older, slower flash had become 4x more expensive in the span of a year. They were told they should have “locked in savings” and bought the flash up front. In reality though, they would have been locking in a high price for a commodity that they did not yet need. The final problem they faced was an order to move out of the data center into 2-3 smaller facilities and split up the hardware accordingly. That big storage array could not easily be cut into parts.
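The arithmetic behind the two stories is worth spelling out. A small sketch using only the prices quoted above, with one assumption labeled in the code: the $30 per GB is treated as the undiscounted list price that the original 75% discount would have applied to.

```python
def pct_change(old, new):
    """Percentage change from old to new (negative means cheaper)."""
    return (new - old) / old * 100


# Small town: same drive family a year later, then the cheaper drives on the HCL.
same_drive = pct_change(2.50, 1.40)
cheaper_drive = pct_change(2.50, 1.00)

# Big city: ASSUMPTION that $30/GB is list price and 75% off list no longer applies.
list_price = 30.0
discounted = list_price * (1 - 0.75)
markup = list_price / discounted

print(f"small town: {same_drive:.0f}% and {cheaper_drive:.0f}% per GB")
print(f"big city: ${discounted:.2f}/GB with discount, {markup:.0f}x without it")
```

Under that assumption the 4x figure falls straight out of losing the discount, before even accounting for the drives being older and slower.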

There are a few lessons to take away from these environments.

  1. Storage should become cheaper to purchase as time goes on. Discounts should be consistent and pricing should not feel like a game show. Software licensing should not be directly tied to capacity or physical hardware and should “live” through a refresh.
  2. Adding new generations of flash and compute should not require disruption and “throwing away” your existing investment.
  3. Storage products that scale down and up without compromise lead to fewer meetings, lower costs, and better outcomes. Large purchases often lead to the trap of spending a lot of time and money on avoiding failure, rather than focusing on delivering excellence.

Veeam On! (Part 1)

I’m in Vegas rounding out the conference tour (VMworld, SpiceWorld, VMworld, DellWorld) for what looks to be a strong finish. This is my first time at VeeamOn and I’m looking forward to briefings across the full Veeam portfolio. I’m looking forward to being shamed by the experts in Lab Warz and getting my hands dirty with the v9.

More importantly I’m looking forward to some great conversations. The reason why I value going to conferences goes beyond great sessions and discussions with vendors at the solutions expo. The conversations with end users (small, large and giant) help you learn where the limits are (and how to push past them) in the tools you rely on. I’ve had short conversations over breakfast that saved me six months of expensive trial and error that others had been through. A good conference will attract both small and massive scale customers and bring together great conversations that will help everyone change their perspective and get things done.

All good things…

I started my IT career as a customer.  It was great having complete ownership of the environment but eventually I wanted more.  I moved to the partner side and the past five years have been amazing. I have worked with more environments than I can count.  It exposed me to diverse technical and operational challenges. It gave me the opportunity to see first hand past the marketing what worked and what did not work. I would like to thank everyone (customers, co-workers) and all of the people who I was able to directly work with who helped me reach this point in my career. I also want to thank people who freely share to the greater community. Their blogs, their words of caution, their advice, their presentations at conferences all contributed in helping me succeed. I will miss the amazing team at Synchronet but it was time for change.

Starting today, I will be in a new role at VMware in Technical Marketing for VMware VSAN. I am excited for this change, and look forward to the challenges ahead. In this position I hope to learn and give back to the greater community that has helped me reach this point. I will still blog various musings here, but look for VSAN and storage content at Virtual Blocks.

I look forward to the road ahead!

 

Why you shouldn’t run BCA off a Synology (or QNAP, or other cheap Linux NAS)…

In my life as a VMware consultant I run into the following Mad Lib when trying to solve storage problems for Business critical Applications.

A customer discovers they have run out of (IOPS/Capacity/Throughput/HCL) with their existing (EMC/Dell/HP/Netapp) array. They sized only for Capacity without understanding that (RAID 6 with NL-SAS is slow, 2GB of Cache doesn’t deliver 250K IOPS). They have spent all their (Budget/rackspace/Power/Political Power/Moxie). There is also an awkward quiet moment where it’s realized that (Thick provisioning on Thick provisioning is wasteful, I can’t conjure IOPS out of a hat, Dedupe is only 6%, Snapshots are wasting 1/2 of their array and are still not real backups, They can’t use COW so SRM can’t test failover). Searching for solutions they hear from a junior tech that there is this new (home-made/SOHO appliance) that can meet their (Capacity/IOPS) needs at a cheap price point. And if they buy it, it probably will work… For a while.
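The “RAID 6 with NL-SAS is slow” entry can be quantified with the usual write penalty rule of thumb. This is a sketch with assumed inputs: the ~75 IOPS per NL-SAS drive and the 50/50 read/write mix are illustrative guesses, and cache effects are ignored entirely.

```python
# Common rule-of-thumb back-end I/Os per front-end write for each RAID level.
RAID_WRITE_PENALTY = {"RAID0": 1, "RAID10": 2, "RAID5": 4, "RAID6": 6}


def functional_iops(drives, iops_per_drive, write_fraction, raid_level):
    """Front-end IOPS a RAID group can sustain once the write penalty is applied."""
    raw = drives * iops_per_drive
    penalty = RAID_WRITE_PENALTY[raid_level]
    return raw / ((1 - write_fraction) + write_fraction * penalty)


# Hypothetical shelf: 12 NL-SAS drives at ~75 IOPS each, 50% writes.
raid6 = functional_iops(12, 75, 0.5, "RAID6")
raid10 = functional_iops(12, 75, 0.5, "RAID10")
print(f"RAID 6: ~{raid6:.0f} IOPS, RAID 10: ~{raid10:.0f} IOPS")
```

Sizing on capacity alone skips this step, which is how an array with plenty of free terabytes ends up out of IOPS.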

Here’s what’s missing from the discussion.

1. The business needs more than 3-5 days for parts replacement, or tickets being responded to. (Real experiences with these devices).

2. The business needs something not based on desktop class non-ECC RAM motherboards.

3. The Business needs REAL HCL’s that are verified and not tested on customers. (QNAP was saying Green drives that lacked proper TLER, and are not designed for RAID would be fine to use for quite a while).

4. The Business needs systems that are actually secured

Now I’ve heard the other argument “but John I’ll have 2 of them and just replicate!”

This is fine (once you realize that RSYNC and VMDKs don’t play nice) until you get bitten by a code bug that hits both platforms. While technically on the VMware HCL, these guys are using open source targets (iSCSI and NFS) and are so incredibly removed from the upstream developers that they can’t get anything fixed or verified quickly. 2 systems that have a nasty iSCSI MPIO bug, or have an NFS timeout problem, are worse than 1 system that “just works”. Also, as these boxes are black boxes, they often miss out on the benefits of open source (you patch and update on their schedules, which is why my QNAP had a version of OpenSSL at one point that was 4 years old despite being on the newest release). If both systems have hardware problems because of a power surge, or thermal problems, or user error or a bad batch, you’re still stuck waiting days to get a fix. If it’s software you may be holding your breath for quite a while. With a normal server OEM or Tier 1 storage provider you have parts in 4 hours, and reliability and freedom that these boxes can’t match.

Now at this point you’re probably saying “but John, I need 40K IOPS and I don’t have 70K to shovel into an array.”

And that’s where Software Defined Storage bridges the gap. Now with SuperMicro you can get solid off the shelf servers with 4 hour support agreements without breaking the bank (this new parts support program is global BTW). For storage software you can use VMware VSAN, a platform that reduces costs and complexity, and delivers great performance. You massively reduce your support footprint (one company for hardware, one for software), reducing operational costs and capital costs.

Nothing against the Synology, QNAP, Drobo of the world, but let’s stick to the right tool for the right job!