Category: VMware

How to shrink a VMDK: Shrinking a virtual disk in VMware ESXi

First, open Disk Management (under Computer Management) in your guest Windows environment.

Right-click the volume on the disk you want to shrink and select Shrink Volume.

Windows will tell you the maximum amount it can shrink the volume by. Choose the amount you actually want to shrink it by and click Shrink.


Windows will start the shrinking process. It might take some time and appear to hang, because Windows is actually defragmenting the disk to consolidate the free space towards the end of the disk before resizing the volume.

Once it is done and you are satisfied that the volume on the disk is the size you want, shut down the VM.

SSH into the host and copy the VMDK file to make a backup of it (just the descriptor file, not the flat file):

cp vmname.vmdk vmname-original.vmdk

Open the VMDK file in a text editor and find the line that describes the size of the flat file, similar to the following:

# Extent description

RW 209715200 VMFS "vmname-flat.vmdk"

The number is the size of the virtual disk in terms of disk sectors, where each sector is 512 bytes. So a 100GB virtual disk is 209715200 sectors.

You will need to change this number to correspond to the new disk size, where x is the new size in GB:

vmdk_size = (x * 1024 * 1024 * 1024) / 512
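
If you want to let the shell do the arithmetic for you, here is a quick sanity check (any shell with 64-bit arithmetic will do; 60 here is just an example size in GB):

# (60 * 1024^3) bytes / 512 bytes per sector
echo $((60 * 1024 * 1024 * 1024 / 512))

This prints 125829120, the figure used below.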

I have chosen to shrink my disk to 60GB, so my new Extent description now reads as follows:

# Extent description

RW 125829120 VMFS "vmname-flat.vmdk"

You now need to clone the disk to get it to the new size:

vmkfstools -i vmname.vmdk vmname-new.vmdk

The bit we are interested in is the newly created vmname-new-flat.vmdk file.

Rename the old flat file from vmname-flat.vmdk to vmname-flat-old.vmdk, then rename vmname-new-flat.vmdk to vmname-flat.vmdk.
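
Using the names above, that is:

mv vmname-flat.vmdk vmname-flat-old.vmdk
mv vmname-new-flat.vmdk vmname-flat.vmdk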

Start the VM and it should show the new, smaller disk. When you are satisfied that everything is working, you can delete the old, unneeded files from your datastore.
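
For this example the cleanup would be something along these lines; vmname-new.vmdk is the now-redundant descriptor left over from the clone, and vmname-original.vmdk is the backup taken earlier. Only run this once you are happy the VM is healthy:

rm vmname-flat-old.vmdk vmname-new.vmdk vmname-original.vmdk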



VMwoes: Purple Screen of Death E1000PollRxRing

VMwoes Purple Screen of Death: VMware ESXi 5.5 host experiences a purple diagnostic screen mentioning E1000PollRxRing and E1000DevRx

VMware Purple Screen of Death

VMware ESXi 5.5.0 [Releasebuild-1331020 x86_64] #PF Exception 14 in world 264638:vmm1:AGB-Dub IP 0x418039010c57 addr 0x0
PTEs:0x0;
cr0=0x80050031 cr2=0x0 cr3=0xa5a4f3000 cr4=0x42668
frame=0x4123a6f9cf30 ip=0x418039010c57 err=9 rflags=0x10206
rax=0x0 rbx=0x51 rcx=0x18
rdx=0x2 rbp=0x4123a6f9d3d0 rsi=0x1
rdi=0x4108a8348d40 r8=0x1 r9=0x1
r10=0x41122413a080 r11=0x4 r12=0x41001651cef4
r13=0x1 r14=0x4123a6f9d2e0 r15=0x4123a6f9d334
*PCPU24:264638/vmm1:AGB-DubalLive
PCPU 0: UVVVVUVVVVVVVVVVVVVVVVVVVVVVVVS
Code start: 0x418038e00000 VMK uptime: 60:02:35:05.115
0x4123a6f9d3d0:[0x418039010c57]E1000PollRxRing@vmkernel#nover+0xb73 stack: 0x8
0x4123a6f9d440:[0x418039013bb5]E1000DevRx@vmkernel#nover+0x3a9 stack: 0x4123a6f9d658
0x4123a6f9d4e0:[0x418038f92164]IOChain_Resume@vmkernel#nover+0x174 stack: 0x0
0x4123a6f9d530:[0x418038f79e22]PortOutput@vmkernel#nover+0x136 stack: 0x4108ff01f780
0x4123a6f9d590:[0x41803952ff58]EtherswitchForwardLeafPortsQuick@#+0x4c stack: 0x183c21
0x4123a6f9d7b0:[0x418039530f51]EtherswitchPortDispatch@#+0xe25 stack: 0x418000000015
0x4123a6f9d820:[0x418038f7a7d2]Port_InputResume@vmkernel#nover+0x192 stack: 0x412fc57f4a80
0x4123a6f9d870:[0x418038f7ba39]Port_Input_Committed@vmkernel#nover+0x25 stack: 0x0
0x4123a6f9d8e0:[0x41803901763a]E1000DevAsyncTx@vmkernel#nover+0x112 stack: 0x4123a6f9da60
0x4123a6f9d950:[0x418030fadd70]MatWorldletPerVMC0@vmkernel#nover+0x218 stack: 0x410800000000
0x4123a6f9dab8:[0x418038eeae77]WorldletProcessQueue@vmkernel#nover+0xcf stack: 0x0
0x4123a6f9daf0:[0x418038eeb93c]WorldletBHHandler@vmkernel#nover+0x54 stack: 0x0
0x4123a6f9db80:[0x418038e2e94f]BH_DrainAndDisableInterrupts@vmkernel#nover+0xf3 stack: 0x2ff889001
0x4123a6f9dbc0:[0x418038e63e03]IDT_IntrHandler@vmkernel#nover+0x1af stack: 0x4123a6f9dce8
0x4123a6f9dbd0:[0x418038ef1064]gate_entry@vmkernel#nover+0x64 stack: 0x0
0x4123a6f9dce8:[0x4180391a32d3]Power_HaltPCPU@vmkernel#nover+0x237 stack: 0x418086e64100
0x4123a6f9dd58:[0x41803904e859]CpuSchedIdleLoopInt@vmkernel#nover+0x4bd stack: 0x4123a6f9dec8
0x4123a6f9deb8:[0x418039054938]CpuSchedDispatch@vmkernel#nover+0x1630 stack: 0x4123a6f9df20
0x4123a6f9df28:[0x418039055c65]CpuSchedHalt@vmkernel#nover+0x245 stack: 0xffffffff00000001
0x4123a6f9df98:[0x4180390561cb]CpuSched_VcpuHalt@vmkernel#nover+0x197 stack: 0x410000008000
0x4123a6f9dfe8:[0x418038ecde30]VMMVMKCall_Call@vmkernel#nover+0x48c stack: 0x0
0x418038ecd484:[0xfffffffffc223baa]vmk_symbol_NFSVolume_GetLocalPath@com.vmware.nfsmod#1.0.0.0+0
base fs=0x0 gs=0x418046000000 Kgs=0x0
Coredump to disk. Slot 1 of 1.
VASpace (00/12) DiskDump: Partial Dump: Out of space o=0x63ff800 l=0x1000
Finalized dump header (12/12) FileDump: Successful.
Debugger waiting(world 264638) -- no port for remote debugger. "Escape" for local debugger.

Apparently it is a known issue with the particular release of the VMware ESXi 5.5 hypervisor that we use on just one of our host servers. It has since been patched, but we went with the workaround, as there wasn't a huge number of virtual machines to modify.

The workaround is to replace the E1000 network adapters with VMXNET3 adapters.
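
If you need to find which VMs are still using E1000 adapters before swapping them over, one rough approach is to grep the .vmx files from the host shell. This is just a sketch: it assumes the standard /vmfs/volumes datastore layout, each datastore may show up twice (once by name and once by UUID), and a VM whose .vmx has no ethernetN.virtualDev line gets the default adapter for its guest OS type, which may well be E1000.

# Show the configured NIC type for each VM's adapters
grep virtualDev /vmfs/volumes/*/*/*.vmx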

There is further information regarding this bug on Running-system.com: Purple Screen of Death caused by E1000 adapters and RSS (Receive Side Scaling).



ESXi 5: Suppressing the local/remote shell warning

Using the SSH shell is a pretty efficient way to get things done on ESXi 5.x, but annoyingly it is disabled by default. Enabling the ESXi shell is simple enough to do.

But having enabled it means vSphere will show the warning message "ESXi shell for the host has been enabled", and in the host view the host is shown with a yellow warning exclamation mark. If you're like me, you'll want to enable the shell but not have the warning always showing.

Suppressing the warning is pretty straightforward. In the vSphere client, select the affected host and click the Configuration tab. Open Advanced Settings, click UserVars in the menu tree, and scroll all the way down to the UserVars.SuppressShellWarning setting. Change the value from 0 to 1.
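
If you prefer to do it from the shell rather than the vSphere client, the same setting can be changed with esxcli on ESXi 5.x (check the option path on your build):

esxcli system settings advanced set -o /UserVars/SuppressShellWarning -i 1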




Find large vmware.log files

Since upgrading to ESXi 5.1 some time ago, I've seen the log files for some of our virtual machines grow truly massive, as in over a gigabyte in size.

Removing the logs isn't too difficult: simply either vMotion the VM, or shut it down entirely and then power it up again. Both methods result in a new log being created, allowing the old log to be deleted.

The difficulty is in finding which VMs have generated huge log files, especially when you have well over a hundred virtual machines.

The following is a simple one-liner to show the 10 biggest log files; it can be amended to show a greater number.

cd /vmfs/volumes/; ls -lhdS [A-Z]*/*/vmware.log | head -10

To prevent datastores from being shown twice, once by name and once by ID, the command is limited to datastores starting with a capital letter. All our datastores start with an upper-case letter; you may have to adjust the command to fit your particular environment.
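
If your datastore names don't follow that convention, an alternative sketch is to let find walk every volume and sort on the size column instead; this assumes the host's busybox find and sort support these options:

find /vmfs/volumes/ -name 'vmware*.log' -exec ls -l '{}' \; | sort -rn -k5 | head -10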



The redo log is corrupted. If the problem persists, discard the redo log.


Yesterday, about an hour before the end of my work day, one of our critical servers fell over and displayed the following message in the vSphere client.

The redo log of VisualSVNServer_1-000001.vmdk is corrupted. If the problem persists, discard the redo log.

The error message refers to a redo log, but this is legacy VMware terminology. From ESXi 3.1 VMware started using the term snapshot to mean the same thing, but for some reason the error messages still use the old term.

The server was named Subversion and was a VisualSVN Server.

There was a snapshot dated 15th December 2013 in the Snapshot Manager for the Subversion VM, so reverting to this snapshot would have meant returning to a point several weeks earlier and then trying to import the backup of the repository made on the night of 29th January.

The underlying cause of the corruption cannot be definitively determined, but I think it was due to the amount of disk activity on the physical disk that constitutes datastore 3_2 on the host server S003-ESXi. This caused the system to fail to write to the log and to update the delta disks, which contain all the changes made to the disks since the point of the snapshot.

I believe that if there had not been a snapshot, the data corruption probably wouldn't have happened. I have since educated staff that taking snapshots in vSphere is really not the same as backing up the server, and that they shouldn't be doing it on the Subversion server at all.

I resolved the issue with Subversion by carrying out the following steps.

I clicked OK to the error message in the slim hope that the VM could overcome the glitch itself upon a simple reboot.

This didn't work, so I started the process of backing up the VM by forcing a shutdown of the machine (virtually cutting off the power) and then making a copy of the virtual machine folder on the datastore.

Whilst the copy process was running I checked the virtual machine logs: vmware-3.log was completely corrupt and vmware.log was showing some corruption.

The copy process took over an hour, as it was 150GB in total, mostly due to the two virtual disks: VisualSVNServer.vmdk, which constitutes the C: drive of the server, is 40GB, and VisualSVNServer_1.vmdk, which is the E: drive, is 100GB.

Having made a copy of everything, I attempted to fix the snapshots. I made sure that there was sufficient space on the datastore and then, using Snapshot Manager in vSphere, created a new snapshot of the Subversion VM.

This operation was successful, so I then tried to commit the changes and consolidate the disks. This worked for VisualSVNServer.vmdk, merging all the changes, but not entirely for VisualSVNServer_1.vmdk; however, it did reduce the size of the delta disks significantly, meaning that only minimal data was likely to be lost.

Nothing more could be done through the vSphere client, so I then started a process of trying to manually consolidate the following disks into a single disk:
VisualSVNServer_1.vmdk
VisualSVNServer_1-000001.vmdk
VisualSVNServer_1-000002.vmdk

First I enabled SSH on the host server s003-esxi.

Using PuTTY, I logged into the command line of the host and changed to the directory containing the virtual machine files for Subversion: /vmfs/volumes/Datastore3_2/VisualSVNServer

Then I ran the command ls -lrt *.vmdk to display all the virtual disk components.

Then, starting with the highest-numbered snapshot, I ran the following command to clone the disk in a way that would merge the delta disks into a copy of the main disk.

vmkfstools -i VisualSVNServer_1-000002.vmdk VisualSVNServer-Recovered_1.vmdk

This process took another hour or so as it was trying to create a 100GB file.

This failed with the following error message displayed:

Failed to clone disk: Bad File descriptor (589833)

Then, taking the next-highest-numbered snapshot, I ran the command again to clone the disk without the most recent changes.

vmkfstools -i VisualSVNServer_1-000001.vmdk VisualSVNServer-Recovered_1.vmdk

This process again took about an hour, as again it was trying to create a 100GB file.

Again this failed with the following error message displayed:
Failed to clone disk: Bad File descriptor (589833)

Abandoning the idea of merging the disks, I removed the VM from the inventory in vSphere and then moved all but the following files into a separate folder:
VisualSVNServer.nvram
VisualSVNServer.vmx
VisualSVNServer.vmdk
VisualSVNServer_1.vmdk

I could then recreate the VM from these files. I downloaded VisualSVNServer.vmx, the virtual machine's configuration file, which stores the settings for the virtual devices that make up the virtual machine. I edited the file to change all references to VisualSVNServer_1-000002.vmdk to VisualSVNServer_1.vmdk so that the machine could be booted up ignoring the delta disks and any data they might contain.
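
If you would rather make that edit at the host's command line than download the file, a sed one-liner along these lines should do it (take a copy of the .vmx first, and check that your ESXi build's busybox sed supports -i for in-place editing):

# Point the VM at the base disk instead of the second delta disk
sed -i 's/VisualSVNServer_1-000002.vmdk/VisualSVNServer_1.vmdk/g' VisualSVNServer.vmx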

I added the VM back into the inventory and booted up the machine. It booted up fine; I checked the E: drive and there appeared to be data written to the disk right up to the time the server fell over, so it seemed that there was minimal, if any, data lost.

Thanks to XtraVirt for the necessary steps.




Free Microsoft Virtualization Training for VMware IT Professionals

Microsoft are really pushing the idea that system administrators with VMware experience should become bilingual in server virtualisation and get up to speed on Hyper-V too. So, following on from the Microsoft Virtualization for VMware Professionals Jump Start a year or so ago, comes Free Microsoft Virtualization Training for VMware IT Professionals, on December 11th from 9am to 12.30pm PST (5pm to 8.30pm GMT).

Get the edge in your technical career! Attend the online Virtualization IT Camp for VMware IT professionals and expand your virtualization skills. Seasoned experts will demonstrate key scenarios and cover equivalent technologies from Microsoft and VMware. Here’s your chance to upgrade your Microsoft Virtualization skills for FREE.

I consider myself already fairly bilingual as I have a Windows Server 2012 Hyper-V host at work with a couple of production servers on it now to go with the 108 virtual machines on our VMware infrastructure. I passed the Windows Server 2008 R2, Server Virtualization exam a couple of years ago when Microsoft was giving away free exam vouchers for it.

Plus, I attended the Server Virtualization w/ Windows Server Hyper-V & System Center Jump Start online last month. I just need to schedule the 74-409 exam whilst my free exam voucher is still valid. The vouchers can still be obtained here (limited availability).



The New VDI Reality

I read the excellent book The VDI Delusion by Brian Madden at about this time last year.

In hindsight, Brian doesn't think they picked the best title, as it may have scared people off.

The second edition is now available to download for free in either .pdf or .mobi formats and it has had a change of title to The New VDI Reality.

I haven't had much of a chance to read it yet, but Brian says that there have been substantial rewrites of significant portions of the book, in line with the great improvements in the underlying technology for VDI.



VMDK reconstruction

The office move meant shutting down the VMware hosts and one of the side effects of this was that a couple of the virtual machines that had been happily running didn’t come back up.

One didn't matter at all, so I could safely ignore it, but the other was a demo server used by one of the company directors to show off our software to prospective clients, and therefore needed to be working ASAP.

I was faced with quite a mystery, as it seemed to have disappeared completely: it wasn't listed in the inventory of the host in vSphere, and the datastores connected to the host didn't contain the expected files either.

However, I wasn't completely at a loss. There was a message in the host's inventory indicating that something had gone wrong and that a virtual machine it had been hosting was missing, and I knew that some of our VMs had been renamed, so the corresponding files in the datastores do not always have matching names.

So I went looking for some orphaned files. I found the files in question, but only the flat VMDK files were there; the VMDK descriptor files were missing.

So, in order to rebuild the virtual machine, I needed to reconstruct the VMDK descriptor files so that I had usable virtual hard disks to attach a new virtual machine to. Given that we are using the free vSphere ESXi 4.0 version, this meant using the unsupported Tech Support Mode, following the instructions from http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1002511

To use Tech Support Mode:

1. Log in to your ESXi host at the console.
2. Press Alt+F1 to switch to the console window.
3. Enter unsupported to start the Tech Support Mode login process. Note that no text will appear on the console window.
4. Enter the password for the root user. Tech Support Mode is now active.
5. Complete tasks in Tech Support Mode.
6. Enter the command clear to clear the screen of any residual data from step 5. This may be required by your local security policies.
7. Enter the command exit to exit Tech Support Mode.
8. Press Alt+F2 to return the server to DCUI mode.

So I logged into the terminal of the ESXi host.

Then, to recreate the virtual machine disks, I navigated to the directory containing the virtual machine disks with the missing descriptor files, using the following command (having previously found the relevant volume by browsing Storage in the vSphere client on my PC):

cd "/vmfs/volumes/4bfd0ee1-48e6535e-7d30-0026b97ee7d2/CRJ test/"

The instructions then asked me to identify the type of SCSI controller the virtual disk was using by examining the virtual machine configuration file (.vmx). But I didn't have the .vmx file, so I took a look at similar VMs; they used the SCSI controller type lsilogic.

I identified and recorded the exact size of the -flat file using the command:

# ls -l VS030-Srv08Tmpl-flat.vmdk
-rw------- 1 root root 32212254720 May 29 12:30 VS030-Srv08Tmpl-flat.vmdk

Then I used the vmkfstools command to create a new virtual disk:

# vmkfstools -c 32212254720 -a lsilogic -d thin temp.vmdk

This command uses these flags:
-c (This is the size of the virtual disk).
-a (Whether the virtual disk was configured to work with BusLogic or LSILogic).
-d thin (This creates the disk in a thin-provisioned format).

Note: To save disk space, the disk was created in a thin-provisioned format using the type thin. The resulting flat file then consumes minimal amounts of space (1MB) instead of immediately assuming the capacity specified with the -c switch. The only consequence, however, is the descriptor file contains an extra line that must be removed manually in a later step.

The files temp.vmdk and temp-flat.vmdk were created as a result.

I deleted the unneeded temp-flat.vmdk using the command:

# rm temp-flat.vmdk

And renamed temp.vmdk to pair with the orphaned -flat file (VS030-Srv08Tmpl.vmdk, in my case), taking care not to overwrite the -flat file itself:

# mv temp.vmdk VS030-Srv08Tmpl.vmdk

Then the descriptor file needed to be edited to match the .flat file:

Under the Extent Description section, change the name of the .flat file to match the orphaned .flat file you have.

Find and remove the line ddb.thinProvisioned = "1" if the original .vmdk was not a thin disk. If it was, retain this line.
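
For illustration, after these edits the Extent description in my reconstructed descriptor looked something like this; 62914560 is simply the 32212254720-byte size reported by ls earlier divided by 512 bytes per sector, and your size and file name will differ:

# Extent description
RW 62914560 VMFS "VS030-Srv08Tmpl-flat.vmdk"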

This completed the reconstruction of the first virtual hard disk. I did the same for the second disk, and then I was in a position to rebuild the machine.

Building the VM was straightforward and not worth describing in great detail, as it is simply a case of using the wizard in the vSphere client and then selecting the "Use Existing" option instead of "Create New" when it gets to the section about virtual hard disks.