
Posts by John Nicholson

We need to talk about talking…


Have a seat over by the router…

I spend a lot of time on Zoom, and I’ve noticed a trend. Some people sound good (and a few even great) on a consistent basis. I have weekly calls with people who never seem to have issues. I also have calls with others where I hear the same few phrases over and over.

“Zoom is having audio issues”

“My ISP is flaking out again”

“Let me try the WIFI in another room”

“I think the kids are streaming again”

“Let me try dialing in instead or using my cell phone to join the call”

“Can anyone hear George, I think he’s cutting out” (Followed by everyone in unison saying George’s audio is just fine).

The common thread in all of these statements is the assumption that this is normal and that nothing can be done to fix it. The reality is that most of these situations can be fixed (some easily, some with difficulty). Note that this post is not about taking your audio from good to great (that will be another post covering audio gear and recording environments).

“Let me try the WIFI in another room”

This one can have a number of root causes, but the solutions are all fairly simple.

You are using the Wi-Fi access point that came baked into your ISP’s gateway device.

For $10 a month, many ISPs will lease you a modem/gateway with a “free” Wi-Fi access point thrown in! This is problematic for a few reasons.

  1. Your gateway generally isn’t in a central location of the house, so it can’t provide even coverage. People often want to hide this ugly box, so it gets shoved behind other dense items.
  2. The Wi-Fi radios in these devices are generally sub-par.
  3. In some extreme cases they might only support 2.4GHz, which is saturated to the point of being largely useless in dense urban areas.

How do I know if it’s the Wi-Fi or my internet connection?

  1. Open a console (on Windows, press Win+R, type “cmd”, and hit Enter; on macOS, open Terminal.app).
  2. Ping something on the internet (“ping -t 1.1.1.1” on Windows, or “ping -A 1.1.1.1” on Mac; the capital -A flag on Mac makes it beep on every dropped packet, and -t on Windows keeps the ping running).
  3. Ping something local (generally your edge router). For most people, the edge router will be either 192.168.0.1 or 192.168.1.1. Do this to both targets for 5-10 minutes and then use Ctrl+C / Command+C to stop the command.
  4. You will get a summary. Does the Wi-Fi one show large spikes in latency (above 10ms)? Do you see a consistent number of timeouts (dropped packets)? If the Wi-Fi is bad, the internet will only look as bad or worse. If the Wi-Fi shows 100% packet delivery and low latency while the internet is all over the place, then the problem is with your internet connection and you can skip this section.
AT&T has packets falling out of the tubes again

Below is what a good connection to your local router should look like.

In this case my Ethernet connection shows the expected ultra-low latency (sub-1ms) and 0.0% packet loss.
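
If you would rather not eyeball two terminal windows, a small script can run both pings and compare the results for you. This is a rough sketch, not a polished tool: the gateway address below is a placeholder for your own router, it shells out to the operating system’s ping command, and the output parsing assumes English-language ping output.

```python
# Compare local (Wi-Fi/router) latency and loss against an internet target.
# Placeholder gateway IP; adjust the targets and count for a longer test.
import platform
import re
import subprocess

def ping_stats(target, count=20):
    """Run the OS ping command and return (loss_percent, avg_latency_ms)."""
    count_flag = "-n" if platform.system() == "Windows" else "-c"
    out = subprocess.run(["ping", count_flag, str(count), target],
                         capture_output=True, text=True).stdout
    loss = float(re.search(r"(\d+(?:\.\d+)?)% (?:packet )?loss", out).group(1))
    avg = re.search(r"Average = (\d+)ms|= [\d.]+/([\d.]+)/", out)
    avg_ms = float(avg.group(1) or avg.group(2)) if avg else None
    return loss, avg_ms

for name, target in [("local router", "192.168.1.1"), ("internet", "1.1.1.1")]:
    loss, avg_ms = ping_stats(target)
    print(f"{name:12}  loss={loss:.1f}%  avg={avg_ms} ms")
```

If the local router line already shows loss or big latency swings, fixing the Wi-Fi (or running a cable) comes before blaming the ISP.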

What can you do about it?

  • The simplest solution is to not use Wi-Fi at all. Run an Ethernet cable all the way to your desk, and use a USB-C dock for your laptop. For under $100 you can basically ignore Wi-Fi as a problem entirely.
You can also repurpose your phone cables in the house so you don’t need to run this down the hall
  • If you can move your gateway to a more central location in the home, this might help. For fiber this is generally difficult, but for cable it may be as simple as moving the device to a more central drop.
The Arris TG1682G (Comcast XB3) gateway: this ugly thing is slow and expensive!
  • If you are a cable customer you can stop wasting $10-15 a month on equipment rental and just buy your own cable modem. The Wirecutter has a great guide on shopping for cable modems. While the cable companies will try to hard-sell the rental, rented modems are rarely upgraded, and you often end up with an older, more questionable device over time.

They sadly don’t make them like the Linksys WRT54G anymore.
  • Once you run your own cable modem you will need a router/firewall and an access point. There are combination devices (Wi-Fi routers). A Wi-Fi router offers simplicity (a single device), but these devices tend to have more security issues, still limit you to a single location (commonly next to your ISP gateway), and tend to have a low tolerance for power quality issues (they tend to die a lot easier). Note that AT&T Fiber customers will need to put their modem in “Gateway mode” and should consider disabling the Wi-Fi that is included (or just ignore it).
  • The next step up, and probably the best option for people in larger multi-walled houses, is a mesh system. This can cover a larger home effectively and benefits from not having to run cable (or repurpose cable to remote access points). The Wirecutter has some reviews here, but Eero seems to be a pretty popular high-end option that “just works”.
  • Lastly, if a mesh system is not a good fit (an old house with chicken wire in the walls holding up the plaster, an extremely noisy local RF environment, you already have Ethernet runs throughout the house, or you need to extend Wi-Fi to an external shed or garage apartment), a modular solution where you deploy standalone access points with dedicated Ethernet runs is an option. Ubiquiti UniFi is what I use. Note that I didn’t have to run cable; instead, I repurposed the phone wires in my house, as they were already Cat5e Ethernet, and I deployed a PoE switch in the central closet to power the remote access points. This solution is a bit more complex (central controller, dedicated firewall, etc.). Starting with the UniFi Dream Machine can make for a simple start, adding APs as needed.

“I think the kids are streaming again”

This one is a bit more challenging as it’s a function of bandwidth availability and priority. There are still a few solutions to it.

Throw bandwidth at the problem. Run Speedtest.net first to make sure you are getting what you pay for, and then call your ISP and move up a package. Some things to note:

  1. Downstream bandwidth may not be your issue. Upload (what is used to send your voice or video) may be the problem. I recently left Comcast for AT&T because while Comcast could sell me 1Gbps down, they couldn’t go beyond 35Mbps up. AT&T’s gigabit product offered 1Gbps up and down.
  2. Do the math. Figure roughly 3Mbps for SD quality (potato 480p), 5Mbps for HD (720p), and 25Mbps for 4K. Downgrading the streaming quality can help, but this math gets ugly in reverse when you have multiple people trying to stream video of themselves (see the rough calculation after this list).
  3. It’s worth noting that Zoom by default maxes out at 720P outside of webinars, and requires enterprise accounts for HD. For HD settings and info https://support.zoom.us/hc/en-us/articles/207347086-Group-HD
  4. Get usage in shape. Some of the fancier firewalls can try to shape or block traffic by traffic type, but this is increasingly hard to do in a world of TLS encryption hiding traffic and traffic routing through CDNs. It may be easier to simply prioritize your laptop above all other clients. I’m not a fan of traffic shaping on cheaper firewalls, as it requires per-packet inspection that adds overhead and just slows everything down; many firewalls cannot run at 1Gbps line rate while doing this.
  5. Abandon the (local) network. While expensive, having a separate data plan (Use your phone as a hotspot) that you use for your conference will take you off the shared local network if it truly is a lost cause. This is my emergency plan. Note data caps may apply, but if the alternative is sounding terrible on a 500 person conference call, you have to do what you have to do. I would recommend tethering by a cable rather than wifi or Bluetooth.
  6. Clean up your ISP – If your local ISP is having packet loss issues (when you plug your laptop directly into the modem), call them. Do troubleshooting. Get a tech out. Check the weather and see if it always happens when it rains (a common issue for DSL or coax is that exposed cable gets wet and causes intermittent issues). Physically inspect the connections outside the house. Upgrade to a business-class plan; while expensive, this allows enforcement of SLAs and gets you priority on a lineman to fix your issues. The squeaky wheel gets the cheese, so just keep calling. If this is a chronic issue that is not being fixed, start filing regular complaints with your state public utility commission. Fines at this level can get ugly on business-class connections.
  7. Change ISPs – Ask neighbors who they use, and see if there are other options. In rural areas WISPs often offer an alternative. Check with the wireless providers for 5G service. In my neighborhood, Verizon is already running test gear, and Sprint/Tmobile is deploying backhaul ahead of new towers.
  8. Move – This sounds drastic, but it is the year 2020. Even at my ranch by the Devil’s Sinkhole, hours from real civilization, I can get 50Mbps down and 10Mbps up. When moving, look for at LEAST 2 high-quality ISPs that can offer 100Mbps upload as well as 200Mbps download. ISPs know when they have competition and take it seriously, with price drops, better service, and free speed upgrades to compete. Being in a market served only by DSL and DOCSIS 2 cable means a slow death. Beware apartment buildings with contracts to a single small ISP, as speeds will largely remain frozen. Another dangerous option is communities with political actors who are fighting the 5G rollout and ban “ugly telephone poles”. Fiber is 10x more expensive to deliver by digging than by air, and you will be tying your internet hopes to coax last updated in the 1990s. Expecting a magic internet fairy to fix this is the definition of insanity. Houston’s lack of zoning and willingness to let the telcos make the poles look like a combination of Bangkok and Manila does well for my work-from-home needs. Think about it this way: you wouldn’t live 300 miles from the office, and by moving somewhere with bad connectivity that is effectively what you are doing with your connection to the internet.
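
To make the “do the math” point above concrete, here is a back-of-the-envelope sketch using the rough per-stream numbers from this post. The 3Mbps-per-sender upstream figure for a video call is my own assumption for illustration; check what your conferencing app actually uses.

```python
# Rough household bandwidth math: downstream streaming vs. upstream video calls.
STREAM_MBPS = {"sd_480p": 3, "hd_720p": 5, "uhd_4k": 25}  # per downstream stream
CALL_UP_MBPS = 3  # assumed upstream per person sending video on a call

def household_demand(streams, people_on_video_calls):
    """streams: dict like {"uhd_4k": 1, "hd_720p": 2} of concurrent downloads."""
    down = sum(STREAM_MBPS[kind] * count for kind, count in streams.items())
    up = CALL_UP_MBPS * people_on_video_calls
    return down, up

down, up = household_demand({"uhd_4k": 1, "hd_720p": 2}, people_on_video_calls=3)
print(f"~{down} Mbps down, ~{up} Mbps up")
# One 4K stream plus two HD streams is only ~35 Mbps down, which almost any modern
# plan handles. Three people sending video is ~9 Mbps up, which already fills most
# of a 10 Mbps upload tier before overhead and anything else on the network runs.
```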

Rebuilding the vSAN Lab (Hands-on part 3).

So as I complete this series, I wanted to include some screenshots, examples of the order of operations I used, and a discussion of some options or ways to make this faster.

Disclaimer: I didn’t script this end to end. I didn’t even leverage host profiles. This is an incredibly low-automation rebuild (partly on purpose, as we were training someone relatively new to working with bare-metal servers as part of this project, so they would understand the things that we often automate). If you are using a standard switch (in which case, SHAME, use the vDS!), document the port groups and the VLANs assigned to them. Also, check the hosts for advanced settings.

First step: Documentation

So before you rebuild hosts, make sure you document what the state was beforehand. RVTools is a pretty handy tool for capturing some of this if the previous regime in charge didn’t believe in documentation. Go swing by Robware.net, grab a copy, and export an XLS of your vCenter so you can find what that vmk2 IP address was before and what port group it went on (note it doesn’t seem to capture opaque port groups).
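
If you would rather script this capture, here is a minimal pyVmomi sketch that dumps each host’s VMkernel adapters, IPs, and port groups. It assumes pyVmomi is installed (“pip install pyvmomi”); the vCenter hostname and credentials are placeholders, and certificate checking is disabled for lab use only.

```python
# Dump VMkernel adapter details (device, IP, port group) for every host in vCenter.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

ctx = ssl._create_unverified_context()  # lab only: skip certificate validation
si = SmartConnect(host="vcenter.lab.local", user="administrator@vsphere.local",
                  pwd="changeme", sslContext=ctx)

view = si.content.viewManager.CreateContainerView(
    si.content.rootFolder, [vim.HostSystem], True)
for host in view.view:
    if host.runtime.connectionState != "connected":
        continue  # skip disconnected hosts
    for vnic in host.config.network.vnic:  # vmk0, vmk1, vmk2, ...
        portgroup = vnic.portgroup or "(distributed/opaque port group)"
        print(f"{host.name}  {vnic.device}  {vnic.spec.ip.ipAddress}  {portgroup}")

Disconnect(si)
```

Saving that output (or the RVTools export) somewhere off the cluster means the rebuild doesn’t depend on the thing you are rebuilding.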

Put the host into maintenance mode

Now, this sounds simple, but before we do it, let’s understand what’s going on in the cluster; the UI gives us quite a few clues!

The banner at the top warns me that another host is already in maintenance mode. Checking on Slack, I can see that Teodora and Myles, who are several time zones ahead of me, have patched a few hosts already. This warning is handy for operational awareness when multiple people manage a cluster!

Next up, I’m going to use the Go To Pre-Check option. I want to see if taking an additional host offline is going to have a significant negative impact. (This is a fairly large cluster; this would not be advised on a 4-host cluster, where 2 hosts offline would mean 50% of capacity gone and an inability to re-protect to the full FTT level.)

I’m using Ensure Accessibility (the handful of critical VMs are all running at FTT=2), and I can see I’m not going to be touching a high-water mark by putting this host into maintenance mode. If I had more time and aggressively strict SLAs, I could simulate a full evacuation. (Again, this is a lab.) Here’s an image of what this looks like when you are a bit closer to 70%.

Notice the “After” capacity bar is shorter, representing the missing host.

Now, after pressing Enter Maintenance Mode, I’m going to watch as DRS (which is enabled) automatically evacuates the virtual machines from this host. While it is a rather quick process, I’m going to stop and daydream about what 100Gbps Ethernet would be like here, and think back to the stone ages of 1Gbps Ethernet, where vMotions of large VMs were like watching paint dry…

Once the host is in Maintenance Mode, I’m going to remove this host from the NSX-T Manager. If you’re not familiar with NSX-T, this can be found under System → Fabric → Nodes; then use the drop-down to select your vCenter (it defaults to standalone ESXi hosts).

If you have any issues uninstalling NSX-T, rerun the removal with only “Force Delete” selected.

Once the host is no longer in NSX-T, and out of maintenance mode you can go ahead and reboot the host.

Working out of band

As I don’t feel like flying to Washington State to do this rebuild, I’m going to be using my out-of-band tooling for this project. For Dell hosts this means iDRAC; for HPE hosts this means iLO. When buying servers, always make sure these products are licensed to a level that allows full remote KVM, and for Dell and HPE hosts that you have the additional licensing required for vLCM (OMIVV and HPE iLO Amplify). One odd quirk I’ve noticed: while I personally hate Java Web Start (JWS) as a technology, the JWS console has some nice functionality that the HTML5 one does not. Being able to select what the next boot option should be means I don’t have to pay quite as much attention; however, the JWS console is missing an on-screen keyboard option, so I did need to open one from the OS level to pass through some F11 commands.

Soooooo Nice to have this

While I’m at it, I’ll go ahead and mount the Virtual Media and attach my ESXi ISO to the virtual CD-ROM drive.

Firmware and BIOS patching

Now, if you are not using vLCM, it might be worth doing some firmware/BIOS updates at this time. For Dell hosts, this can be done directly from the iDRAC by pointing it at the downloads.dell.com mirror or loading an ISO.

I’m always a bit terrified when I see the almost 2-hour task limit, then realize “ohh, yeah, someone put a lot of padding in”.

BIOS and boot configuration

For whatever reason, my lab seems to have a high number of memory errors. To help protect against this, I’m switching hosts to run in Advanced ECC mode by default. For 2 hosts that have had chronic DIMM issues I haven’t had time to troubleshoot, I’m taking a more aggressive stance and enabling Fault Resilient Mode. This forces the hypervisor and some core kernel processes to be mirrored between DIMMs, so ESXi itself can survive the complete and total failure of a memory DIMM (vs. Advanced ECC, which will tolerate the loss of sub-units within a DIMM). For more information on memory than you ever wanted to know, check out this blog.

Next up, I noticed our servers were still set to legacy boot. I’m not going to write an essay on why UEFI is superior for security and booting, but in my case I needed to change it if I was going to use our newer iPXE infrastructure.

Huh, I wonder if ESX2 would deploy to my R630…

Note upon fixing this I was greeted by some slightly more modern versions.

Time to put in a FR to get 7.0 on here!

Now, if you don’t have a fancy iPXE setup you can always mount the ISO to the virtual CDROM drive.

Note: after changing this you will need to go through a full boot cycle before you can reset the default boot devices within the BIOS boot manager.

I didn’t take a screenshot of this on my Dell, but here’s what it looks like to change the boot order on one of the HPE hosts. The key thing is to make sure the new boot device is first in this list (as we will be using one-time boot selection to start the installer).

In this case I’m booting from an M.2 device, as this fancy Gen10 supports them.

Coffee Break

This is a great time for some extra coffee. Some things to check on:

  1. Make sure your ISO is attached (or, if using PXE/iPXE, that the TFTP directory has the image you want!).
  2. Make sure the next boot is set to your installer method (virtual CD-ROM or PXE).
  3. Go back into NSX-T Manager and make sure it doesn’t think this host is still provisioned or showing errors. If it’s still there, unselect “Uninstall” and select “Force Delete”. This tends to work.
  4. Collect your notes for the rest of the installation: NTP servers, DNS servers, DNS suffix/search domains, and the IP and hostname for each host. (If your management network uses DHCP, consider setting reservations for the MAC address of vmk0, which always steals the MAC from the first physical NIC, so it will be consistent, unlike the other VMkernel ports that generate addresses from the VMware MAC range.) A quick DNS sanity check is shown after this list.
  5. Go into vCenter and click “Remove from Inventory” on the disconnected host. We can’t add a host back in with the same hostname while the stale entry exists (this will make vCenter angry).
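
As a quick sanity check on those notes, a few lines of Python can confirm that each hostname resolves to the IP you expect (and resolves back) before you start typing them into installers. The hostnames and addresses below are placeholders.

```python
# Verify forward and reverse DNS for each host you are about to reinstall.
import socket

hosts = {
    "esx01.lab.local": "192.168.1.11",
    "esx02.lab.local": "192.168.1.12",
}

for name, expected_ip in hosts.items():
    try:
        resolved = socket.gethostbyname(name)
        reverse = socket.gethostbyaddr(resolved)[0]
        status = "OK" if resolved == expected_ip else f"MISMATCH (DNS says {resolved})"
        print(f"{name}: {status}, reverse lookup -> {reverse}")
    except OSError as err:
        print(f"{name}: lookup failed ({err})")
```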

What to look at and do (or not) when recovering from a cluster failure (Part 2)

In part one of this series, I highlighted a scenario where we lost quite a few hosts in a lab vSAN cluster caused by 3 failed boot devices and a power event that forced a reboot of the hosts. Before I get back into the step by step of the recovery I wanted to talk a bit about what we didn’t do.

What should you do?

  1. If this is production please call GSS. They have unusually calm voices and can help validate decisions quickly and safely before you make them. They also have access to recovery tooling, and escalation engineers you do not have.
  2. Try to get core services online first (DNS/NTP/vCenter). This makes restoring other services easier. In our case, we were lucky and had only a partial service interruption here (1 of 2 DNS servers was impacted).

Cluster Health Checks

While I much prefer to work in vCenter, it is worth noting that vSAN health checks can still be run without it in the event of a vCenter outage:

  1. Run them at the CLI.
  2. Run them from the native HTML5 client on each ESXi host. Cluster health is a distributed service that is independent of vCenter for its core checks.
Solving the chicken egg monitoring problem since 2017!
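
For the CLI option, here is a hedged sketch of pulling the health summary over SSH when vCenter is down. The host and credentials are placeholders, it assumes the paramiko library is installed, and the esxcli vsan health namespace shown is available on recent vSAN releases; verify the exact command on your build.

```python
# Run the distributed vSAN health summary from a single host over SSH.
import paramiko

client = paramiko.SSHClient()
client.set_missing_host_key_policy(paramiko.AutoAddPolicy())  # lab convenience only
client.connect("esx01.lab.local", username="root", password="changeme")

_, stdout, stderr = client.exec_command("esxcli vsan health cluster list")
print(stdout.read().decode())
print(stderr.read().decode())
client.close()
```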

When reviewing the impact on the vSAN cluster look at the Cluster Health Checks:

  1. How many objects are re-syncing, and what is the progress.
Note, in this case I just captured a re-balance operation

  2. How many components are healthy vs. unhealthy.

This had a bit of red before…

  3. Drive status – How many drives and disk groups are offline? Note that within the disk group monitoring you can see which virtual machine components were on the impacted disk groups.

  4. Service check – See how many hosts are reporting issues with vSAN-related services. In my case this was the hint that one of my hosts had managed to partially boot, but something was wrong. Conversely, you may see a host that is showing as disconnected from vCenter but is still contributing storage. It is worth noting that vSAN can continue to run and process storage IO as long as the vSAN services start and the vSAN network is functional. It’s partly for this reason that when you enable vSAN, the HA heartbeats move to the vSAN network, as it’s important to keep your HA fencing in line with storage.

  5. Time synchronization across the cluster – For security reasons, hosts will become isolated if clocks drift too far (similar to Active Directory replication breaking, Kerberos authentication not working, etc.). Thankfully there is a handy health check for this.

Host 16 used a stratum-16 $10 Rolex I bought for cheap while traveling.

What Not to do?

Don’t panic!

Ok, so we had to do a bit more than that…

Also, while you are at it, don’t reboot random hosts.

This advice isn’t even specifically vSAN advice, but unlike your training with Microsoft desktop operating systems, the solution to problems with ESXi is not always to “tactically reboot” a host by mashing reset from the iDRAC. You might end up rebooting a perfectly healthy host that was in the middle of a resync or HA operation. Rebooting more healthy hosts does a few things:

  1. It causes more HA events. HA events trigger boot storms: large bursts of disk IO as operating systems reboot, databases force log rechecks, in-memory databases rebuild their caches, and other processes that are normally staggered all run at once.
  2. It interrupts object rebuilds. In our case (3 host failures and FTT=1) we had some VMs that lost quorum, but many more that only lost 1 of their 3 pieces. Making sure all objects that could be repaired were repaired quickly was the first order of battle.
  3. Rebooting a host can lose logs or crash dumps that were not written to persistent disk. GSS may want to scrape some data out of even a half-dead host if possible.

Assemble the brain trust

Remember, always have an Irish guy named Myles ready to help fix things.

A few other decisions came up as Myles, Teodora and I spoke about what we needed to do to recover the cluster. We also ruled out a few recovery methods and decided on a course of action to get the cluster stable, and then begin the process of proactively preventing this from impacting us with other hosts.

  1. Salvage a boot device from a capacity device – We briefly discussed grabbing one of the capacity devices out of the dead hosts and using it as a boot device. Technically this would not be a supported configuration (our controller is not supported to act as both a boot device and a host for vSAN capacity devices). The challenge here is that we wanted to get back 100% of our data, and it would have been tedious to identify which disk group was safe to sacrifice in a host for this purpose. If we were completely unable to get remote hands to install boot devices, or were only interested in the recovery of a single critical VM at all costs, this might have made sense to investigate.
  2. Drive switcheroo – Another option for recovery was to have our remote hands pull the entire disk groups out of the dead servers and shove them into free drive bays on existing healthy servers. Pete Koehler mentioned this is something GSS has had success with, and something I’d like to dedicate its own blog post to at some point. Why does this work? Again, vSAN does not store metadata or file system structures on the boot devices, purposely to increase survivability in cases where the entire server must be replaced. This historically was not common behavior in enterprise storage arrays, which would often put this data on OS/vault drives (that might not even be movable, or were embedded). Given we had adequate drive bays free to split the 6 impacted disk groups (2 per host) across the remaining 13 hosts in the cluster, this was an option. In our case, we decided we didn’t want to deal with moving them back after this was done. My remote hands teams were busy enough with vSphere 7 launch tasks, and COVID-related precautions were reducing staffing levels.
  3. Fancy boot devices – We decided to avoid SD cards going forward as our primary boot option (even mirrored). Once these impacted hosts were online and the cluster was healthy, we had ops plug in all of our new boot devices so we could proactively, one host at a time, perform a fresh install. In a perfect world we would have had M.2 boot devices, but adding a PCIe riser for this purpose on 4-year-old lab hosts was a bit more than we wanted to spend.

What did we do?

In our case, we called our data center ops team, had them plug in some “random USB drives we have laying around”, and began fresh installs to get the hosts online and restore access to all virtual machines. I ordered some high-endurance SanDisk USB devices and, as a backup, some high-endurance SD cards (designed for 4K dashcam video usage). Once these came in, we reinstalled ESXi onto the USB devices, allowing our ops teams to recover their USB drives. The fresh high-quality SD cards will be useful for staging ISOs inside the out-of-band management, as well as serving as emergency boot devices in the event a USB device fails.

Next up in the series: a walk-through of installing ESXi on bare metal, some changes we made to the hosts, and the answer to the question of “what’s up with the snake hiding in our R&D datacenter”.

How to rebuild a VCF/vSAN cluster with multiple corrupt boot devices

Note: this is the first part of a series.

In my lab, I recently had an issue where a large number of hosts needed to be rebuilt. Why did they need to be rebuilt? If you’ve followed this blog for a while, you’ve seen the issues I’ve run into with SD cards being less than reliable boot devices.

Why didn’t I move to M.2-based boot devices? Unfortunately, these are rather old hosts, and unlike modern hosts there is no option for something nice like a BOSS device. This is also an internal lab cluster used by the technical marketing group, so while important, it isn’t necessarily “mission critical” by any means.

As a result of this and a power hiccup, I ended up with 3 hosts offline that could not restart. Given that many of my VMs were set to only FTT=1, this means complete and total data loss, right?

Wrong!

First off, the data was still safe on the disk groups of the 3 offline hosts. Once I get the hosts back online, the missing components will be detected and the objects will become healthy again (yay, no data loss!). vSAN does not keep the metadata or data structures for the internal file systems and object layout on the boot devices. We do not use the boot device as a “vault” (if you’re familiar with the old storage array term). If needed, all of the drives in a dead host can be moved to a physically new host, and recovery would be similar to the method I used of reinstalling the hypervisor on each host.

What’s the damage look like?

Hopping into my out of band management (My datacenter is thousands of miles away) I discovered that 2 of the hosts could not detect their boot devices, and the 3rd failed to fully reboot after multiple attempts. I initially tried reinstalling ESXi on the existing devices to lifeboat them but this failed. As I noted in a previous blog, SD cards don’t always fully fail.

Live view of the SD cards that will soon be thrown into a Volcano

If vSAN was only configured to tolerate a single failure, wouldn’t all of the data at least be inaccessible with 3 hosts offline? It turns out this isn’t the case for a few reasons.

  1. vSAN does not by default stripe data wide across every single capacity device in the cluster. Instead, it chunks data out into fresh components every 255GB. (Note you are welcome to set the stripe width higher and force more sub-components to be split out of objects if you need to.)
  2. Our cluster was large. 16 hosts and 104 physical Disks (8 disks in 2 disk groups per host).
  3. Most VMs are relatively small, so out of the 104 physical disks in the cluster, having 24 of them offline (8 per host in my case) still means that the odds of those 24 drives hosting 2 of the 3 components needed for quorum are actually quite low.
  4. A few of the more critical VMs had been moved to FTT=2 (vCenter, DNS/NTP servers), making their odds even better.

Even for the few VMs that were impacted (a domain controller, some front-end web servers), we were further lucky in that these were already redundant virtual machines. Given that both of the VMs providing a given service didn’t fail together, it became clear that with the compounding odds in our favor, a service going offline was closer to the odds of rolling boxcars twice than a 100% guarantee. A rough sketch of that math is below.
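
For the curious, here is a back-of-the-envelope version of those odds. It makes the simplifying assumptions that an FTT=1 object places its 3 components (2 replicas plus a witness) on 3 hosts chosen uniformly at random, and that the placements of two redundant VMs are independent; real vSAN placement is more nuanced, so treat the numbers as illustrative.

```python
# Odds that a single FTT=1 object loses quorum when 3 of 16 hosts go down,
# and the odds that both members of a redundant VM pair are hit.
from math import comb

hosts, failed, placed = 16, 3, 3  # cluster size, failed hosts, hosts per object

def p_overlap(k):
    """P(exactly k of the object's hosts are among the failed hosts)."""
    return comb(placed, k) * comb(hosts - placed, failed - k) / comb(hosts, failed)

p_object_down = p_overlap(2) + p_overlap(3)  # quorum lost once 2+ components are gone
print(f"Single FTT=1 object unavailable: {p_object_down:.1%}")         # roughly 7%
print(f"Both VMs of a redundant pair down: {p_object_down ** 2:.2%}")  # well under 1%
```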

This is actually something I blogged about quite a while ago. It’s worth noting that this was just an availability issue. In most cases of a genuine device failure, there would normally be enough time between losses to allow for repair (rather than losing 3 hosts at once), making my lab example quite extreme.

Lessons Learned and other takeaways:

  1. Raise a few small but important VMs to a higher FTT level if you have enough hosts, especially core management VMs.
  2. vSAN clusters can become MORE resilient to loss of availability the larger they are, even keeping the same FTT level.
  3. Use higher quality boot devices. M.2 32GB and above with “real endurance” are vastly superior to smaller SD cards and USB based boot devices.
  4. Consider splitting HA service VMs across clusters (i.e., one domain controller in one of our smaller secondary clusters).
  5. For mission-critical deployments, using a management workload domain with VMware Cloud Foundation can help ensure management is fully isolated from production workloads. Look at stretched clustering and fault domains to take availability up to 11.
  6. Patch and reboot your hosts often. Silently corrupt embedded boot devices may be lurking in your USB/SD powered hosts. You might not know it until someone trips a breaker and suddenly you need to power back on 10 hosts with dead SD devices. Regular patching will catch this one host at a time.
  7. While vSAN is incredibly resilient always have BC/DR plans. Admins make mistakes and delete the wrong VMs. Datacenters are taken down by “Fire/Flood/Blood” all the time.

I’d like to thank Myles Grey and Teodora Todorova Hristov for helping me make sense of what happened and getting the action plan to put this back together and grinding through it.

Keeping track of VCF and vSAN cluster driver/firmware

Are you building out a new VMware Cloud Foundation cluster, and trying to make sure you stay up to date with your vSAN ReadyNodes driver/firmware updates? Good news, there are a few options for tracking new driver/firmware patches.

The first method is simple: try out the new vLCM functionality. This allows for seamless updates of firmware/drivers for drives and controllers, as well as system BIOS and other devices. It also has integration to verify key driver/firmware levels for the vSAN VCG sub-components. For those of you looking into this, go check the VCG for compatible hardware and check out this blog post.

What about clusters where you cannot use vLCM yet, perhaps because your servers are not yet supported?

The vSAN VCG notification service can help fill the gap. It allows you to subscribe to changes. Subscribing will set you up for email alerts showing changes to driver and firmware versions, as well as when updates and major releases ship. You can sign up for individual components, as well as for an entire ReadyNode specification.

Changes are reflected in a clear color-coded view showing what has been removed and what has been added to replace the entry.

The ReadyLabs team continues to make it easier to keep your VMware Cloud Foundation environment up to date. If you have any more questions about the service, be sure to check out the FAQ. If you have any questions on this or the vSAN VCG, reach out by email to [email protected].

Understanding File System Architectures.

File System Taxonomy

I’ve noticed that clustered file systems, global file systems, parallel file systems, and distributed file systems are commonly confused and conflated. To explain the VMware vSAN™ Virtual Distributed File System™ (VDFS), I wanted to highlight some things that it is not. I’ll be largely pulling my definitions from Wikipedia, but I look forward to hearing your disagreements on Twitter. It is worth noting that some file systems can have elements that cross the taxonomy of file system layers for various reasons. In some cases, some of these definitions are subcategories of others. In other cases, some file systems (GPFS as an example) can operate in different modes (providing RAID and data protection, or simply inheriting it from a backing disk array).

Clustered File System

A clustered file system is a file system that is shared by being simultaneously mounted on multiple servers. Note, there are other methods of clustering applications and data that do not involve using a clustered file system.

Parallel file systems

Parallel file systems are a type of clustered file system that spread data across multiple storage nodes, usually for redundancy or performance. While the vSAN layer mirrors some characteristics (Distributed RAID and striping) it does not 100% match with being a parallel file system.

Examples would include OneFS and GlusterFS.

Shared-disk file systems

Shared-disk file systems are clustered file systems but are not parallel file systems. VMFS is a shared-disk file system. This is the most common form of clustered file system, and it leverages a storage area network (SAN) for shared access to the underlying LBAs. Clients are forced to handle the translation of file calls and access control, as the underlying shared disk array has no awareness of the actual file system itself. Concurrency control prevents corruption. Ever mounted NTFS on 2 different Windows boxes and wondered why it corrupted the file system? NTFS is not a shared-disk file system, and the different operating system instances do not, by default, know how to cleanly share the partition when they both try to mount it. In the case of VMFS, each host can mount a given volume as read/write, while cleanly making sure that access to the specific subgroups of LBAs used for different VMDKs (or even shared VMDKs) is properly handled with no data corruption. This is commonly done over a storage area network (SAN) presenting LUNs (SCSI) or namespaces (NVMe over Fabrics). The protocol used to share this is block-based and can range from Fibre Channel, iSCSI, FCoE, FCoTR, and SAS to InfiniBand.

Example of 2 hosts mounting a group of LUNs and using VMFS to host VMs

Examples would include: GFS2, VMFS, and Apple Xsan (StorNext).

Distributed file systems

Distributed file systems do not share block-level access to the same storage, but use a network protocol to redirect access to the backing file server exposing the share within the namespace used. In this way, the client does not need to know the specific IP address of the backing file server: it will request it when it makes the initial request and, within the protocol (NFSv4 or SMB), be redirected. This is not exactly a new thing (DFS in Windows is a common example, but similar systems were layered on top of Novell-based filers, proprietary filers, etc.). These redirects are important, as they prevent the need to proxy IO through a single namespace server and allow the data path to flow directly from the client to the protocol endpoint that has active access to the file share. This is a bit “same same but different” to how iSCSI redirects allow connection to a target that was not specified in the client pathing, or how ALUA pathing handles non-optimized paths in the block storage world. For how vSAN exposes this externally using NFS, check out this blog, or take a look at this video:

The benefits of a distributed file system?

  1. Access transparency. This allows back-end physical data migrations/rebuilds to happen without the client needing to be aware or re-point at the new physical location. Clients are unaware that files are distributed and can access them in the same way local files are accessed.
  2. Transparent Scalability. Previously you would be limited to the networking throughput and resources of a single physical file server or host that hosted a file server virtual machine. With a distributed file system each new share can be distributed out onto a different physical server and cleanly allow you to scale throughput for front end access. In the case of VDFS, this scaling is done with containers that the shares are distributed across.
  3. Capacity and IO path efficiency – Layering a scale-out storage system on top of an existing scale-out storage system can create unwanted copies of data. VDFS uses vSAN SPBM policies on each share and integrates with vSAN to have it handle data placement and resiliency. In addition, layering a scale-out parallel file system on top of a scale-out storage system leads to unnecessary network hops in the IO path.
  4. Concurrency transparency: all clients have the same view of the state of the file system. This means that if one process is modifying a file, any other processes on the same system or remote systems that are accessing the files will see the modifications in a coherent manner. This is distinctly different from how some global file systems operate.

It is worth noting that VDFS is a distributed file system that exists below the protocol containers. A VDFS volume is mounted and presented to the container host using a secure, direct hypervisor interface that bypasses TCP/IP and the vSCSI/VMDK IO paths you would traditionally use to mount a file system to a virtual machine or container. I will explore this more in the future. For now, Duncan explains it a bit on this blog.

Examples include: VDFS, Microsoft DFS, and BlueArc Global Namespace.

Global File System

Global file systems are a form of distributed file system where a distributed namespace provides transparent access to systems that are potentially highly distributed (i.e., in completely different parts of the world). This is often accomplished using a blend of caching and weak affinity. There are trade-offs in this approach: if the application layer is not understood by the client accessing the data, you have to deal with manually resolving conflicting save attempts on the same file, or force one site to be “authoritative”, slowing down non-primary site access. While various products in this space have existed, they tend to be an intermediate step toward an application-aware distributed collaboration platform (or centralizing data access using something like VDI). While async replication can be part of a global file system, file replication systems like DFS-R would not technically qualify. Solutions like Dropbox/OneDrive have reduced the demand for this kind of solution.

Examples include: Hitachi HDI

Where do various VMware storage technologies fall within this?

VMFS – A clustered file system that specifically falls within the shared-disk category. While powerful and one of the most deployed file systems in the enterprise datacenter, it was designed for use with larger files that are (with some exceptions) only accessed by a single host at a time. While support for higher numbers of files and smaller files has improved significantly over the years, general-purpose file shares are currently not a core design requirement for it.

vVols – Not a clustered file system. An abstraction layer for SAN volumes, or NFS shares. For block volumes (SAN) it leverages SUB-LUN units and directly mounts them to the hosts that need them.

VMFS-L – A non-clustered variant used by vSAN prior to the 6.0 release, and also used for the ESXi install volume. The file system format is optimized for DAS. Optimizations include aggressive caching for the DAS use case, a stripped-down lock manager, and faster formats. You commonly see this used on boot devices today.

VDFS – vSAN Virtual Distributed File System. A distributed file system that sits inside the hypervisor, directly on top of vSAN objects providing the block back end. As a result, it can easily consume SPBM policies on a per-share basis. For anyone paying attention to the back end, you will notice that objects are automatically added and concatenated onto volumes when the maximum object size is reached (256GB). The components behind these objects can be striped or, for various reasons, be automatically spanned and created across the cluster. It is currently exposed through protocol containers that export NFSv3 or NFSv4.1 as part of vSAN File Services. While VDFS does offer a namespace for NFSv4.1 connections and handles redirection of share access, it does not currently redirect globally between disparate clusters, so it would not be considered a global file system.

How to succeed as a professional tech podcast

Pete Flecha and I co-host the Virtually Speaking Podcast. We both get emails, calls, and texts from various people wanting to start a tech podcast. I figured I’d sort some of the most common advice into a single blog, with maybe some follow-up blogs for more in-depth gear reviews, etc. Note: there are some strong opinions here that I hold loosely, and I look forward to the Twitter banter that will follow.

Who

Because AI machine learning hasn’t figured out how to podcast yet, you are going to need to sort out who will come on the podcast.

Will you have a single host, a pair of hosts, a panel?

First, how many hosts you have is going to be partly a function of how long the podcast runs. It’s hard to have a 6-minute podcast with a 5-person panel. Conversely, the more people involved in a podcast, the more hell you face trying to align the Gantt chart that is everyone’s schedules and time zone requirements.

The vSpeaking Method for hosts? We generally stick to 2 hosts, with a guest host filling in when one of us is traveling or in a conference session, and 2 guests. Another benefit of the occasional guest host is that they can help interview on a topic the regular hosts are still getting up to speed on. An example of this is when we tagged in Myles to help discuss containers and other DevOps hipster things with Chad Sakac.

What makes a good host?

This is going to be a bit of a philosophical discussion, as a lot of things can make for a good host, but here are some general thoughts on traits that help with success:

  1. Someone needs to push the show forward, keep it on task and time and be a good showrunner.
  2. Someone needs to know enough about the topic to discuss it, or ask the guests about it. Someone who’s willing to do the research on the product and field more than 5 minutes beforehand.
  3. Someone skilled and willing to do the editing.
  4. Someone who can handle publicizing the podcast. (The vSpeaking Podcast doesn’t pay for promoted tweets or google AdWords).
  5. People (if plural hosts) who are good at reading a conversation and knowing when to jump in and out.
  6. Flexible schedules. I’ve recorded at all kinds of weird hours to support overseas guests, and at weird hours while traveling to get US guests. Remember, the world does not revolve around the Pacific time zone.

Conversely, there are some behaviors or environmental variables for hosts that may inhibit success.

Some traits can be problematic in hosts:

  1. People who lack commitment or “air cover from management” to stay committed beyond a few episodes. If your manager is going to be asking “what is the value in this” it may be problematic to maintain this as a work hours activity.
  2. A host who likes to hear the sound of their own voice. Remember it’s a podcast, not a monologue.
  3. A host who thinks they know more about the subject than they do. This will manifest itself in awkward questions, disagreements, and the host talking too damn much.
  4. Hosts with zero understanding of the technology. If it’s just a product marketing intern reading a FAQ, no one will listen to it (if you are lucky!)

Will there be guests on every episode?

Having a constantly rotating cast of guests provides a lot of benefits as it helps open up the topics and information you bring in.

There is a class of podcasts that tends to diverge from this: the “news of the week” format. A solid example of this would be something like the Six Five. I’ll caution that there are a LOT of podcasts that fall into this camp, and even executed well it has some challenges. It may sound easier in that you do not have to do any scheduling of guests, but done properly it is a more time-consuming format, as it requires you to be the subject matter expert on everything. It is best executed in niche fields (rather than the entire tech industry), and best executed by seasoned veterans who can provide more than surface commentary and repeating what The Register wrote. Expect to do twice as much research when you have to be able to understand both the question AND the answer.

What

What topics will you cover?

This can be a difficult one. Creating 100 episodes on JUST Microsoft Exchange would be difficult, exhausting, and likely result in most people (even full-time Exchange admins) losing interest. There are ways to accomplish your goal of promoting a given product without making it the front-and-center focus of EVERY single episode. Find common adjacent products, products that integrate, and fundamental technologies that every admin using said product needs to know.

The vSpeaking Podcast method: While I wouldn’t mix in an episode on woodworking, we’ve found that some back-to-basics episodes, topical episodes, and partner episodes help keep things fresh. A major commitment we made from the beginning was to have the majority of our content be “evergreen”. The goal is that if someone discovers our podcast in 2020, they would find interest in listening to a lot of the back catalog. While there are some episodes that become dated quickly (event- or launch-specific episodes), even these might have elements or stories within them that resonate years later.

What questions do we ask?

Note: to execute this well you need to do research on the guest (stalk them on Twitter, blogs, recent press releases, LinkedIn) and try to get some questions that people in their field would want answered. DO NOT ASK THE GUEST TO WRITE THE QUESTIONS. It comes off as disrespectful (you don’t care enough to learn enough to ask an educated question). It is ok to ask if there are any recent announcements or themes they might want to cover, but do not make the guest do all the work here.

Critical elements of a good question are not just what you ask, but what you do not ask.

Avoid leading questions

“Chad, it looks like customers really like VCF because it provides enterprise reliability, can you explain why this is the most important reason to buy it?” This is a terrrrrible question. If the question involves a run-on sentence, assumes you as the host know more about the topic than the guest, and narrows the response to “Yes, Mr. Host, you sure are right and smart”, you are “doing it wrong!” Instead, ask a less controlled question like “What are some common customer conversations you are having about VCF?” Sometimes you need to channel your inner Larry King and ask your questions like you know nothing. Starting with a level-set question, then asking follow-up questions off of that and going deeper, is a better way to let the guest explain things and bring your audience with you rather than jumping to a conclusion. A podcast series that interviews the greatest interviewers of all time is worth a listen.

How

The Gear

Gear is partly a function of the environment you record in. A $100 Shure SM58 sounds great in a professional recording studio, but in a small office it may pick up sound bouncing off the hard walls. I’ve also had the “fortune” of living near firehouses. Oddly enough, “podcasting booths” or rooms in some office buildings or conferences often have some of the worst acoustics ever created (small rooms with hard glass surfaces are terrible for bouncing audio). Look for a future blog series on “good/better/best” gear for audio.

For now I’ll start with what not to use as it’s likely easier.

USB condenser microphones – The most popular examples of these microphones are the Blue Yeti/Snowball, etc. Condensers are popular in studios (they can pull a lot of sound in). The challenge with using these for office or home recording is that you tend to end up “recording the room” (bounce off walls), and these microphones are aggressive at picking up background noise (air conditioning, fans, etc. are easily picked up). You can do a lot worse than these (and we’ll get to that in a minute), but for serious audio recording in a less-than-functional environment, be prepared to put some sound dampening on the walls, carpet the room, and turn off the air conditioning and fan. A downside of some of these USB mics is they often will not work with regular accessories (amplifiers, mixers, arms, highly custom-sized foam), so you end up with a proprietary ecosystem around them.

Anything with a phone number – The public phone system tends to drop the quality of everything down to a common codec, G.711. This codec from 1972 is a “narrowband” codec and is (along with a host of other things) part of the reason why business travel exists. People don’t listen to podcasts by dialing into a voicemail, and you should want your podcast to sound better than their voicemail.

What I use?

I’m a fan of XLR-based microphones. They make any investment in the ecosystem reusable later.

Microphone – Heil PR40. It’s dynamic and center-fire (meaning it only records in a narrow angle), and that’s actually good for me. I don’t have a professional studio, the nursery is one room over, and fire trucks and barking dogs come past my office all the time. Note: this is an XLR microphone, so I’ll need something to convert its output to a digital signal. Pete also uses the same microphone, and that helps us get a similar sound after we match volume levels at the start of the podcast.

Input/Digitizer – For now I primarily use a Blue Icicle that is directly connected (no XLR cable used). I had some issues with the XLR cable built into my cheap arm mount, and found that this avoided the need to get anything to amplify the signal, as it immediately goes from the microphone to being a digital signal. I’m still figuring out cable management for the USB cable. I also own a Shure digitizer that costs twice as much, but it was way too easy to hit the gain knob. The Blue requires some torque to turn, which means once you get it set you can largely ignore it.

Other things to go with the mic – I have the Heil foam windscreen to cut popping noises, and the Heil-branded shock mount (it prevents noise from when I hit the table while recording). I have an arm that is screwed into my desk (something cheap). If you are going to have a long XLR run, or a quiet mic that needs a pre-amp, something to amplify the signal (a Cloudlifter) might not be a bad idea. There’s no need to buy a mixer/board unless you are going to be blending multiple inputs (I’m not a DJ!).

In Person

For the road – When traveling to conferences we use tabletop stands and a Zoom H6 recorder. While it can act as a stand-alone recorder, we normally feed it into a Mac over USB into Audio Hijack and run some low-pass filters and other effects. It offers support for 4 XLR inputs as well as individual gain control, and it can act as a handheld condenser with an attached microphone. Other software like Loopback can be handy for sending sound from a soundboard into another output.

Remote recording

Over the years we’ve tried a couple different bits of software. We started with:

Skype – which worked for a bit, but quality problems have gotten worse as Microsoft has slowly ruined it.

Skype for business – (an unmitigated disaster given our implementation uses narrowband codecs for remote employees when you have more than 2 people on a call).

Zoom.us – We settled on Zoom. It has a few interesting recording capabilities, like the ability to record every channel independently and to allow users to record locally. If you have network quality concerns this can help offset them, allowing Pete, when editing the podcast, to delete parts where I was speaking over someone, or to assemble local audio from a guest who was cutting in and out. This shouldn’t be needed often, but it’s buried in the Zoom web settings.

Editing –

We are doing it live! – Bill R

While it may seem easier to just record an hour and a half of audio and dump it to SoundCloud, this is not what most people want to listen to. Part of the benefit of a podcast not being a live call is that it allows you to leave some (or a lot) of an episode on the cutting room floor. Things that you can cut out:

  1. Mid program housekeeping discussions (Clarifying that we can introduce or avoid a given topic with the guest) or where you discuss the next segment
  2. Deleting things where you accidentally leaked NDA content.
  3. Letting someone try to respond to something again (however, if you obsess on perfection and require 10 takes to get something right Podcasts may be the wrong medium).
  4. Guests’ off-color humor that you’d rather your VP not listen to while making pancakes with their kids.
  5. Long awkward pauses. We like to stress to guests if they want to sit and think for 10 seconds before responding “that’s fine”. Allowing awkward pauses gives people time to provide awesome content. It doesn’t sound great though on the podcast so we can cut them out.
  6. When John (me) rambles off topic or speaks over someone by accident.
  7. When you might have a good side conversation that might be useful in another montage episode on a topic.

The vSpeaking method? We REALLY try to keep podcasts in a 25-35 minute runtime when possible. Why this length? This is about the average commute time for a lot of our listeners (or time driving between customers for partners and customers). We might split a conversation up. We tend to block 1 hour for recording, use 5-10 minutes for housekeeping and setup, and record 40-50 minutes of content that is then edited down.

At conferences like VMworld and other events, we will often grab shorter 5-15 minute interviews. We will then stitch a collection of these into a longer episode using a windowing effect (we record an intro and outro for each segment). These “vignettes” might even be recorded as a small part of a larger episode. Episode 152 is an example of this format, where we combined an interview with Pat along with pieces of an interview with Brian Madden that will make up a future episode. This started out as a way for Pete and me to meet our goal of an episode every two weeks, by adapting the “clip show” method from television. These days this method is more about building an episode around a topic and providing more opinions and voices.

When

It’s worth noting that consistency is king in podcasting and publishing in general. If a podcast happens yearly and is 5 minutes long, or is daily and 3 hours long, it will likely fail to grab a consistent listener base for a number of reasons. At least twice a month seems to be the minimum level of effort required for the length we run (25-45 minutes). Shorter podcasts (8-10 minutes) tend to require a much more frequent cadence to maintain listeners.

Length

A longer podcast allows for some introductory banter, a longer intro song, and a little more of the speakers’ personalities to come out. A short podcast (5-10 minutes) might work well for fitting into someone’s morning shower/shave/toothbrushing routine, but you’ll need to cut it down to more “just the facts”.

The vSpeaking Method: While there is some variety in show length, 25-35 minutes tends to be the target run time. This aligns well with the average one-way commute time of 26.1 minutes, and a 1-hour workout can fit two episodes.

Cadence

If you can’t commit to twice a month (24 episodes in a year), it may not be worth the investment; it’s hard to stay top of mind for listeners. A consistent cadence also helps. If you publish weekly, then skip 2 quarters, then come back weekly, it’s hard to remain part of the listeners’ “usual routine” where they listen to your podcast and make time for it. Assuming quality doesn’t suffer, the more frequent you are, the better your subscriber numbers will look.

The vSpeaking Method: We started bi-monthly, then shifted to bi-weekly, and recently we have shifted to a “mostly weekly” cadence.

Where

Where do you post it? Ideally, you want your podcast on every major platform: the Apple Podcasts app, the Google Play store, and Spotify. You will want a web player that allows people to play it from a browser, and you will want a website to host show notes, speaker notes, and other information.

Internal only podcast, or password protected – Call me cynical, but I don’t have faith in internal podcasts.

  1. There’s too much friction compared to the existing podcast apps people already use on their devices.
  2. It’s predicated on the myth that anything you post internally doesn’t leak to competitors. Let’s be honest: if your secret competitive strategy is any good, it will be published on The Register years before it ships, or be in your competitors’ hands before the ink dries. Internal distribution might work for short-form content whose embargo will lift quickly anyway.

The vSpeaking method: We host vSpeakingpodcast.com on Zencast.fm, which provides hosting, and we post episode blogs on Virtual Blocks. We briefly flirted with an internal-only podcast, using Socialcast (sadly killed) or Podbean as the distribution method, but for the quality and time commitment we make, we couldn’t justify it.

Conclusion

You need a few things to maintain a tech podcast.

  1. The drive to keep doing it. It has to be something you actually enjoy; otherwise, blocking the time and doing the pre-work to line up guests will fall apart a month in.
  2. The right skills/talents. Pete is a fantastic showrunner, a host who keeps things moving, and an editor.
  3. A genuine technical curiosity for the topics you will cover.

Help/SOS – Podcast Emergency you say?

If you’ve reached this section and you are trying to start a quality podcast, please stop reading. If you are here because your boss came to you and said “we need you to start a podcast,” keep reading. If you just discovered that your MBO/KPI for your quarterly bonus is tied to starting a podcast, this section of the guide is exclusively for you. You don’t have any experience with any of this, and reading the above sections has you convinced there is no way to be successful and that it is too late to change the objective. It’s true, there isn’t really a market for unedited, poorly recorded, eight-episode podcasts run by product marketing on “why you should buy our cloud product!” But that isn’t going to stop you from getting paid. Don’t stress; you will still be able to get your bonus if you follow the guide below.

Guests – Don’t try to get busy, highly in-demand guests. This might draw attention to the episode and highlight that this was just something slapped together to get a bonus.

KPI/MBO – Make sure the MBO/KPI doesn’t include download statistics. Choose a platform that doesn’t provide them as part of its “free hosting” so you can blame that. In the event you do have a minimum number of downloads required, just hire Mechanical Turk workers to send you screenshots of their phones saying “subscribed and downloaded.”

Quality content? That will take too long. Just write out the “top 10 reasons you should buy our product.” Find the sales/marketing PowerPoint and feel free to just read the speaker notes (or the slides, since as we all know speaker notes are for losers and slides should have 100 words each). This is a great opportunity to reuse other stale content. For bonus points, reuse content from the person who asked you to create the podcast, so you can blame them if they don’t think the content is good!

Gear? Use the official corporate standard for interoffice phone calls, especially if its quality is terrible (Skype4Business, Webex dial-in bridges). This will keep whoever reviews your bonus from listening long enough to realize there is no real content.

Editing? Leave 20-30 minutes of dead air at the end of each episode to make it look longer. This is especially useful if you are hiding the low-quality effort after the first episode, as it will prevent the next episode from auto-playing.

Platform and marketing? Consider posting the episodes as MP3s on an internal-only sales education portal. Avoiding hosting in the outside world will help avoid scrutiny. Make sure the metadata tags are poorly maintained, and that the only link to it is buried at the bottom of a weekly newsletter.

What if I want my product marketed by a podcast and don’t want to bother with all this?

This is a bit easier than the above steps. Simply reach out to some podcasts that have existing followings in your space and see if you can place guests who will represent your product to that community. Sometimes this will cost money; sometimes it will not. Note: the vSpeaking podcast does not do pay-for-play, but we don’t judge others in the industry who do, as long as it is responsibly and legally disclosed.

There are also experienced people in the industry that you can just outright hire, JMT being a good one right now.

Alternatively, if you want to produce a short video series, video (a YouTube playlist) is honestly a more popular format and likely more conducive to what you are trying to accomplish.

vSAN ReadyNodes Additional Feature: vLCM Support

When picking out some new nodes for a VMware Cloud Foundation build-out, I noticed a new feature I could search for.

Here’s a quick explanation of what this new capability is, as well as some existing features:

vLCM Capable ReadyNode: The server OEM supports this node being patched by VMware vSphere Lifecycle Manager (vLCM). This capability allows you to update NICs, HBAs, and drives with new firmware and drivers, as well as update the BIOS. Currently, this includes HPE Gen10 servers as well as select Dell 13th and 14th generation servers. For a quick demo of how vLCM can patch a host, check out this video.

SAS Expander: A typical SAS physical connector carries 4 SAS lanes. Most internal HBAs and RAID controllers only have 2 physical connectors, so a directly connected configuration supports only 8 drives (2 connectors x 4 lanes). A SAS expander acts like a switch on the connection, allowing up to 254 devices per connection. The expander must work tightly with the RAID controller (both are often made by the same manufacturer), and the firmware and driver versions for both must be kept in sync to prevent issues (a quick driver check from the ESXi shell is shown after these definitions). SAS expanders also support the SATA Tunneling Protocol (STP), which allows a SATA drive to be presented as a SCSI device. For additional information on SAS expanders, see the vSAN design and sizing guide.

SSD/HDD Hotplug: Hotplug is the ability to add a device to a system while it is running. Useful for replacing failed devices, as well as expanding a ReadyNode without having to power off the host.

Intel VMD: Intel Volume Management Device (VMD) gives NVMe drives several modes for the drive’s amber LED, such as on, off, and flashing, to identify a specific NVMe drive. This allows devices to be located for serviceability. VMD also enables hot-swap replacement without shutting down the system: the VMD device intercepts PCIe hot-plug events and allows for safe, clean drive removal and re-insertion. With Intel VMD, servicing drives can be done online, minimizing service interruptions.
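As referenced in the SAS expander note above, here is a rough way to see which storage adapters and drivers a host is actually running. It shows the driver bound to each HBA; firmware versions still come from the OEM tools or vLCM itself:

# List the host's storage adapters and the drivers bound to them (run from the ESXi shell)
esxcli storage core adapter list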

CIFS for VCF and vSAN?

The year was 2019, and at VMworld Barcelona someone asked me, “When will vSAN support CIFS?” This is a question I get from time to time, and I responded the same as always:

“vSAN will NEVER support CIFS”

VMware Cloud Foundation 4 and vSAN 7 now offer native file services, starting with NFSv3 and NFSv4.1 as the first file protocols. Why was NFS chosen first? Why not CIFS?

A historical detour into what CIFS is…

CIFS (Common Internet File System) was effectively Microsoft’s extension of SMBv1 (a protocol originally created at IBM), dating from roughly the era when AppleTalk was rising and falling in usage. It had some issues:

  • Despite “internet” being in the name, it is tied with NetBIOS for last place among things you want exposed to the public internet.
  • There were lots of weird proprietary extensions and unstable client implementations.
  • Security is baaaaad (and not getting fixed); US-CERT says to stop using it.
  • After Microsoft deprecated it, usage plummeted to ancient legacy devices: the 15-year-old copier/scanner you want to bash with a hammer, and the forgotten Windows XP machine that controls the HVAC system.

Due to the opportunity for downgrade attacks from SMB2, Microsoft pushed out a service to disable it automatically. This effectively ended its era, and new versions of Windows lack the binaries to use it (only in-place upgrades still carry it around).

Yes, that’s a service that exists to automatically remove a service. There’s got to be a better name for this?

“But John, xxxx vendor still calls Windows file shares CIFS”

I actually asked that vendor why they call it a CIFS gateway and was told: “we have a few large customers who haven’t updated their RFP templates since New Coke was still a thing…”

“John, will you stop pedantically correcting everyone who says CIFS, surely they mean SMB?”

Ned Pyle at Microsoft, the owner of the protocol, actually gets even more annoyed than I do when people call it CIFS.

What about SMB 3.x?

While SMB 3.x is still around and holds lots of departmental shares, roaming profiles, and various “junk drawers” of forgotten files, it is not a high-growth area right now. Sync-and-share products (OneDrive/Dropbox) are, for many shops, absorbing a lot of this unstructured data that needs to be accessed by Windows clients. It is worth noting that even the best third-party implementations of SMB 3.x tend to cut corners relative to the full Microsoft server implementation, and many features associated with a Windows file server (FSRM reporting and file screens, quotas, NTFS ACLs) are not actually part of SMB; they have to be implemented in the backing file system or emulated. Don’t worry, VMware is still looking at SMB 3.x support, but first it’s time to address why NFS…

The better question: Why start with NFS?

When picking which protocols vSAN would support first, it was critical to look at what is driving new file share use cases in the data center, and specifically the file needs of Kubernetes developers. The goal of vSphere 7 with Kubernetes is to make VMware Cloud Foundation the premier platform for cloud-native application development and deployment. The existing ReadWriteOnce support delivered in vSAN 6.7 U3 automates block volumes for containers using the CSI driver, but for applications that need ReadWriteMany volumes, a non-block shared file system option was needed.
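To make the ReadWriteMany point concrete, below is a minimal sketch of the claim a developer would submit; the StorageClass name “vsan-file-sc” is purely illustrative and will differ in your environment:

# Request a shared (ReadWriteMany) volume; "vsan-file-sc" is a hypothetical StorageClass name
cat <<'EOF' | kubectl apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-data
spec:
  accessModes:
    - ReadWriteMany   # multiple pods can mount this volume read/write at the same time
  resources:
    requests:
      storage: 10Gi
  storageClassName: vsan-file-sc
EOF

Any pod that references the claim then sees the same shared file system, which is exactly the pattern block-backed ReadWriteOnce volumes can’t provide.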

NFSv3 strengths

In addition to the Kubernetes use case, there are a number of infrastructure-related use cases for NFS, ranging from a vCenter backup target to a content catalog, archive target, or repository share. NFSv3 does especially well here: it’s simple, and the protocol has seen few interop issues in the 20-plus years since it was ratified as an RFC. In general, it has aged like a fine wine (as opposed to CIFS, which has aged like milk sitting in the sun).

I’m honestly not a cheese guy, but this is what I assume CIFS would look like as cheese
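As a rough illustration of that simplicity, consuming an NFSv3 export from a Linux host is a one-liner; the server name and export path below are placeholders:

# Mount an NFSv3 export for use as a backup or repository target (names are placeholders)
sudo mount -t nfs -o vers=3 fileserver.example.com:/exports/repo /mnt/repo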

NFSv4.1: Back to the Future

One of the considerations with file servers as an extension of VMware Cloud Foundation-based HCI is making sure that:

  1. Performance scales linearly with the nodes
  2. Consumption is cloud-like and can be easily automated

A critical feature that NFSv4.1 includes, and that v3 does not, is the ability to present a virtual namespace across multiple file servers and seamlessly redirect connections to the right one every time, without making the NFS consumer look anything up. I go into what this looks like a bit in this blog, as well as in the following video.
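For a rough sense of what this means on the client side, an NFSv4.1 mount only needs to point at the namespace root; the client then follows referrals to whichever server actually owns a given share, with no extra configuration (the server name and mount point below are placeholders):

# Mount the NFSv4.1 namespace root; referrals to other file servers are followed transparently
sudo mount -t nfs -o vers=4.1 vsan-fs.example.com:/ /mnt/vsanfs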

So what’s the future of vSAN File Services?

While vSAN File Services delivers a great experience for cloud-native services and infrastructure shares today, it will continue to evolve to meet the needs of more applications and users over time. The auto-scaling container structure can support adding additional containers that speak different protocols. Lastly, the hypervisor-integrated IO path opens up some interesting future possibilities to extend VMware Cloud Foundation’s lead as an application platform.

vSAN File Services – How to find the connection URL

vSAN File Services adds a critical capability to VMware Cloud Foundation, layering a distributed protocol access layer on top of vSAN’s existing shared-nothing distributed object store that can serve the NFS needs of Kubernetes as well as traditional services.

New shares set up in vSAN File Services are balanced across the cluster. For NFSv3, the interface shows the IP address of the container you should connect to for a given share.

Note that NFSv4 is different. An NFSv4 referral enables a multi-server namespace and seamlessly redirects the client to the server that hosts a given directory or share. While it may appear as one namespace, IO does not have to hairpin through the container owning the primary IP. Similar to iSCSI login redirects, this simplifies setup and avoids having the client attempt to connect to every node in the cluster.
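If you prefer the command line to the UI, a rough way to check this from a Linux client is below, assuming the service answers the standard MOUNT protocol export query; the IP address and share path are placeholders:

# List the NFSv3 exports presented at a given file service IP (placeholder address)
showmount -e 192.168.1.50
# Mount a specific NFSv3 share using the IP the UI reports for it (placeholder path)
sudo mount -t nfs -o vers=3 192.168.1.50:/share01 /mnt/share01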

What does this look like in the interface? This short 1 minute video may help:

 

VMware Cloud Foundation 4 is a powerful virtual machine and container platform. vSAN file services is critical to meeting the needs of modern applications and container workloads.

If you’re looking for more information on NFS redirection, the following may be useful:

The RFC – https://tools.ietf.org/html/rfc7530