VMwoes Purple Screen of Death: VMware ESXi 5.5 host experiences a purple diagnostic screen mentioning E1000PollRxRing and E1000DevRx

VMware Purple Screen of Death

VMware ESXi 5.5.0 [Releasebuild-1331020 x86_64]
#PF Exception 14 in world 264638:vmm1:AGB-Dub IP 0x418039010c57 addr 0x0
PTEs:0x0;
cr0=0x80050031 cr2=0x0 cr3=0xa5a4f3000 cr4=0x42668
frame=0x4123a6f9cf30 ip=0x418039010c57 err=9 rflags=0x10206
rax=0x0 rbx=0x51 rcx=0x18
rdx=0x2 rbp=0x4123a6f9d3d0 rsi=0x1
rdi=0x4108a8348d40 r8=0x1 r9=0x1
r10=0x41122413a080 r11=0x4 r12=0x41001651cef4
r13=0x1 r14=0x4123a6f9d2e0 r15=0x4123a6f9d334
*PCPU24:264638/vmm1:AGB-DubalLive
PCPU 0: UVVVVUVVVVVVVVVVVVVVVVVVVVVVVVS
Code start: 0x418038e00000 VMK uptime: 60:02:35:05.115
0x4123a6f9d3d0:[0x418039010c57]E1000PollRxRing@vmkernel#nover+0xb73 stack: 0x8
0x4123a6f9d440:[0x418039013bb5]E1000DevRx@vmkernel#nover+0x3a9 stack: 0x4123a6f9d658
0x4123a6f9d4e0:[0x418038f92164]IOChain_Resume@vmkernel#nover+0x174 stack: 0x0
0x4123a6f9d530:[0x418038f79e22]PortOutput@vmkernel#nover+0x136 stack: 0x4108ff01f780
0x4123a6f9d590:[0x41803952ff58]EtherswitchForwardLeafPortsQuick@#+0x4c stack: 0x183c21
0x4123a6f9d7b0:[0x418039530f51]EtherswitchPortDispatch@#+0xe25 stack: 0x418000000015
0x4123a6f9d820:[0x418038f7a7d2]Port_InputResume@vmkernel#nover+0x192 stack: 0x412fc57f4a80
0x4123a6f9d870:[0x418038f7ba39]Port_Input_Committed@vmkernel#nover+0x25 stack: 0x0
0x4123a6f9d8e0:[0x41803901763a]E1000DevAsyncTx@vmkernel#nover+0x112 stack: 0x4123a6f9da60
0x4123a6f9d950:[0x418038fadd70]NetWorldletPerVMCB@vmkernel#nover+0x218 stack: 0x410800000000
0x4123a6f9dab8:[0x418038eeae77]WorldletProcessQueue@vmkernel#nover+0xcf stack: 0x0
0x4123a6f9daf0:[0x418038eeb93c]WorldletBHHandler@vmkernel#nover+0x54 stack: 0x0
0x4123a6f9db80:[0x418038e2e94f]BH_DrainAndDisableInterrupts@vmkernel#nover+0xf3 stack: 0x2ff889001
0x4123a6f9dbc0:[0x418038e63e03]IDT_IntrHandler@vmkernel#nover+0x1af stack: 0x4123a6f9dce8
0x4123a6f9dbd0:[0x418038ef1064]gate_entry@vmkernel#nover+0x64 stack: 0x0
0x4123a6f9dce8:[0x4180391a32d3]Power_HaltPCPU@vmkernel#nover+0x237 stack: 0x418086e64100
0x4123a6f9dd58:[0x41803904e859]CpuSchedIdleLoopInt@vmkernel#nover+0x4bd stack: 0x4123a6f9dec8
0x4123a6f9deb8:[0x418039054938]CpuSchedDispatch@vmkernel#nover+0x1630 stack: 0x4123a6f9df20
0x4123a6f9df28:[0x418039055c65]CpuSchedHalt@vmkernel#nover+0x245 stack: 0xffffffff00000001
0x4123a6f9df98:[0x4180390561cb]CpuSched_VcpuHalt@vmkernel#nover+0x197 stack: 0x410000008000
0x4123a6f9dfe8:[0x418038ecde30]VMMVMKCall_Call@vmkernel#nover+0x48c stack: 0x0
0x418038ecd484:[0xfffffffffic223baa] vmk_symbol_NFSVolume_GetLocalPath@com.vmware.nfsmod#1.0.0.0+0
base fs=0x0 gs=0x418046000000 Kgs=0x0
Coredump to disk. Slot 1 of 1.
VASpace (00/12) DiskDump: Partial Dump: Out of space o=0x63ff800 l=0x1000
Finalized dump header (12/12) FileDump: Successful.
Debugger waiting(world 264638) -- no port for remote debugger. "Escape" for local debugger.

Apparently it is a known issue with the particular release of the VMware ESXi 5.5 hypervisor we use on just one of our host servers. It has since been patched, but we went with the workaround as there wasn’t a huge number of virtual machines to modify.

The workaround is to replace the E1000 network adapters with the VMXNET3 adapters.

There is further information about this bug on Running-system.com in the post “Purple Screen of Death caused by E1000 adapters and RSS (Receive Side Scaling)”.
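To work out which virtual machines need the workaround, one option is to scan each VM's .vmx file for adapters whose `virtualDev` is set to `e1000`. The following is only a minimal sketch: the function name and the sample text are made up for illustration, and an adapter with no `virtualDev` line at all (which falls back to a default) is not reported.

```python
import re

# Matches lines such as: ethernet0.virtualDev = "e1000"
ADAPTER_LINE = re.compile(
    r'^\s*(ethernet\d+)\.virtualDev\s*=\s*"([^"]*)"\s*$',
    re.IGNORECASE | re.MULTILINE,
)


def find_e1000_adapters(vmx_text):
    """Return the adapter names (e.g. 'ethernet0') configured as e1000."""
    return [
        match.group(1)
        for match in ADAPTER_LINE.finditer(vmx_text)
        if match.group(2).lower() == "e1000"
    ]


# Hypothetical .vmx fragment used purely as an example.
SAMPLE_VMX = """
ethernet0.present = "TRUE"
ethernet0.virtualDev = "e1000"
ethernet1.present = "TRUE"
ethernet1.virtualDev = "vmxnet3"
"""

if __name__ == "__main__":
    print(find_e1000_adapters(SAMPLE_VMX))
```

Any adapter the function returns is a candidate for swapping to VMXNET3.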

Moving TempDB to a new location

We had a process running on a particular SQL Server virtual machine which caused the TempDB files to grow rapidly until the C: drive ran out of space. In this case the best solution was to move TempDB from its default location to the very large second virtual drive.

The process is pretty straightforward.

Firstly locate the current file path of TempDB.
SELECT name, physical_name AS CurrentLocation
FROM sys.master_files
WHERE database_id = DB_ID(N'tempdb');
GO

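Secondly perform the actual move with the following code. Modify it to choose new locations appropriate to your system, and make sure the target folder already exists. The new paths only take effect the next time the SQL Server service is restarted, after which the old files can be deleted.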
USE master;
GO
ALTER DATABASE tempdb
MODIFY FILE (NAME = tempdev, FILENAME = 'E:\TempDB\tempdb.mdf');
GO
ALTER DATABASE tempdb
MODIFY FILE (NAME = templog, FILENAME = 'E:\TempDB\templog.ldf');
GO
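If TempDB has more than the two default files, typing out one ALTER DATABASE statement per file gets tedious. The sketch below generates the same T-SQL as above from a list of logical names and file names. The helper name is hypothetical, and the output is meant to be reviewed and pasted into a query window, not run blindly.

```python
import ntpath


def tempdb_move_statements(target_dir, files):
    """Build T-SQL that points each tempdb file at target_dir.

    files is a list of (logical_name, file_name) pairs, as returned by
    the sys.master_files query shown earlier.
    """
    statements = ["USE master;", "GO"]
    for logical_name, file_name in files:
        # ntpath joins with backslashes regardless of the OS running this.
        new_path = ntpath.join(target_dir, file_name)
        escaped = new_path.replace("'", "''")  # escape quotes for T-SQL
        statements.append("ALTER DATABASE tempdb")
        statements.append(
            f"MODIFY FILE (NAME = {logical_name}, FILENAME = '{escaped}');"
        )
        statements.append("GO")
    return "\n".join(statements)


if __name__ == "__main__":
    print(tempdb_move_statements(
        r"E:\TempDB",
        [("tempdev", "tempdb.mdf"), ("templog", "templog.ldf")],
    ))
```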

Cannot RDP to a Windows Server 2008 R2 virtual machine

A quite mystifying issue with one of our Citrix test machines was escalated to me this morning. The member of staff whose role it is to configure new test environments on the Citrix servers Skyped me to say that he couldn’t RDP to the machine but could access it via the vSphere client, and asked whether I could take a look and work out what was going on.

It was in a hell of a state, and I suspect that he’d had a good go at fixing things himself but had made matters much worse. The Remote Desktop Services role had been uninstalled for a start! Not that that would have actually made much of a difference, as Remote Desktop for Administration is still available without that role installed.

From the command line I ran the following two commands.

netstat -a -o | findstr 3389
and
qwinsta

The first displays all the active TCP and UDP ports on which the computer is listening and filters the output for the string 3389, the default RDP port number. The second displays information about Remote Desktop sessions on a server. Neither returned any result.

I then restarted the Remote Desktop Services service.

I checked Remote Desktop Session Host and at that point realised that RDS was no longer there, so I reinstalled RDS and configured it to point at the license server again. A redundant step in terms of resolving the issue, but an important one in restoring the server to full functionality.

Disabled the Windows Firewall completely.

From elevated command prompt I ran the following two commands.
sfc /scannow
regsvr32 remotepg.dll

I thought about checking Group Policy to ensure that nothing silly had been configured that would have denied RDP connections.

To do so would involve opening up the Group Policy Editor locally and then expanding the following.
Computer Configuration – Administrative Templates – Windows Components – Remote Desktop Services – Remote Desktop Session Host – Connections.
Allow users to connect remotely using Remote Desktop Services (enable or disable)

But the issue was more fundamental than that as I could see that the port itself wasn’t open.

I then decided to check whether the correct port number was assigned to Remote Desktop Services. Using information from this knowledge base article, http://support.microsoft.com/kb/2477176, I checked the port number associated with RDP in the registry.

  • Ran regedit and opened the following registry subkey:
    HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Terminal Server\WinStations\RDP-Tcp
  • Located the PortNumber registry entry.
  • Saw that the port number 3390 had been assigned.
  • Changed the port back from 3390 to 3389.
  • Saved the change, and then closed Registry Editor.

Tested RDP from my laptop and it worked.
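In hindsight, a quick probe from another machine would have shown which port was actually answering. The following is a minimal sketch of such a check using a plain TCP connect. The function names are my own, and 3390 is only in the default list because that is what this server had been changed to.

```python
import socket


def is_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def probe_rdp_ports(host, ports=(3389, 3390)):
    """Check the default RDP port plus any suspected alternatives."""
    return {port: is_port_open(host, port) for port in ports}


if __name__ == "__main__":
    # Replace with the name or address of the server being tested.
    print(probe_rdp_ports("127.0.0.1"))
```

A successful connect only proves something is listening on the port, not that it is RDP, but it narrows things down quickly.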

Job done.

This strikes me as being a deliberate change. There is security advice out there that suggests changing the default port to something else, but I don’t believe that it offers a great deal of security, and in this case it was a massive pain. Also, I can’t think who would have made this change.

ESXi 5: Suppressing the local/remote shell warning

Using the SSH shell is a pretty efficient way to get things done on ESXi 5.x, but annoyingly it is disabled by default. Enabling the ESXi shell is simple enough to do.

But having enabled it, vSphere will show the warning message “ESXi shell for the host has been enabled”, and in the host view the host is shown with a yellow warning exclamation mark. If you’re like me you’ll want to enable the shell but not have the warning always showing.

Suppressing the warning is pretty straightforward. In the vSphere client select the affected host and then click the configuration tab. Open up Advanced Settings and click UserVars from the menu tree and scroll all the way down to the UserVars.SuppressShellWarning setting. Change the value from 0 to 1.
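The same setting can also be changed from the shell with esxcli, which is handy when there are several hosts to do. The sketch below just builds and runs that command from Python. It assumes it is run somewhere esxcli is on the PATH (such as the host itself), and the function names are hypothetical.

```python
import subprocess


def suppress_shell_warning_command(value=1):
    """Build the esxcli argument list for UserVars.SuppressShellWarning."""
    if value not in (0, 1):
        raise ValueError("value must be 0 (show warning) or 1 (suppress)")
    return [
        "esxcli", "system", "settings", "advanced", "set",
        "-o", "/UserVars/SuppressShellWarning",
        "-i", str(value),
    ]


def apply_setting(value=1):
    """Run the command; raises CalledProcessError if esxcli fails."""
    subprocess.run(suppress_shell_warning_command(value), check=True)


if __name__ == "__main__":
    # Print rather than run, so the command can be checked first.
    print(" ".join(suppress_shell_warning_command(1)))
```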


Find large vmware.log files

Since upgrading to ESXi 5.1 some time ago I’ve seen the logfiles for some of our virtual machines grow truly massive, like over a gigabyte in size massive.

Removing the logs isn’t too difficult: either vMotion the VM or shut it down entirely and then power it up again. Both methods result in a new log being created, allowing the old log to then be deleted.

The difficulty is in finding which VMs have generated huge log files, especially when you have well over a hundred virtual machines.

The following one-liner shows the 10 biggest logfiles; change the number passed to head to show more.

cd /vmfs/volumes/; ls -lhdS [A-Z]*/*/vmware.log | head -10

To prevent datastores being shown twice, once by name and once by ID, the command only matches datastores starting with a capital letter. All our datastores start with an upper case letter; you may have to adjust the command to fit your particular environment.
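If your datastore names don't follow a convention like that, another way to avoid the duplicates is to resolve each entry to its real path and skip any that have already been seen. The following is a rough sketch of that idea, not a tested replacement for the one-liner; the function name is made up, and it assumes the same datastore/VM folder/vmware.log layout the one-liner relies on.

```python
import os


def largest_vm_logs(root, limit=10, log_name="vmware.log"):
    """Return up to `limit` (size_in_bytes, path) pairs, biggest first."""
    seen = set()
    found = []
    for datastore in sorted(os.listdir(root)):
        # Name symlinks and ID directories resolve to the same real path.
        ds_path = os.path.realpath(os.path.join(root, datastore))
        if ds_path in seen or not os.path.isdir(ds_path):
            continue
        seen.add(ds_path)
        for vm_dir in os.listdir(ds_path):
            log_path = os.path.join(ds_path, vm_dir, log_name)
            if os.path.isfile(log_path):
                found.append((os.path.getsize(log_path), log_path))
    found.sort(reverse=True)
    return found[:limit]


if __name__ == "__main__":
    for size, path in largest_vm_logs("/vmfs/volumes"):
        print(f"{size / (1024 * 1024):10.1f} MB  {path}")
```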


POODLE Attack – Disabling SSLv3 in Internet Explorer via Group Policy

The POODLE attack (which stands for “Padding Oracle On Downgraded Legacy Encryption”) is a man-in-the-middle exploit which takes advantage of Internet and security software clients’ fallback to SSL 3.0. Further details on the nature of the attack can be found here.

SSL 3.0 will be disabled in the next releases of all the major web browsers, but until then the following steps can be taken to protect clients in your company through disabling SSL 3.0 and enabling TLS 1.0, TLS 1.1, and TLS 1.2 for Internet Explorer in Group Policy.

You can disable support for the SSL 3.0 protocol in Internet Explorer via Group Policy by modifying the Turn off encryption support Group Policy setting.

  • Open Group Policy Management.
  • Select the group policy object to modify, right click and select Edit.
  • In the Group Policy Management Editor, browse to the following setting:
    Computer Configuration -> Administrative Templates -> Windows Components -> Internet Explorer -> Internet Control Panel -> Advanced Page -> Turn off encryption support
  • Double-click the Turn off Encryption Support setting to edit the setting.
  • Click Enabled.
  • In the Options window, change the Secure Protocol combinations setting to “Use TLS 1.0, TLS 1.1, and TLS 1.2”.
  • Click OK.

Note: Administrators should make sure this group policy is applied appropriately by linking the GPO to the appropriate OU in their environment.
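To verify that the policy has landed on a client, it helps to know what value to expect. As far as I know, Internet Explorer stores the enabled protocols as a bitmask in a DWORD named SecureProtocols. The flag values below are the commonly documented ones, but treat them as an assumption and check them against Microsoft's documentation. The sketch only does the arithmetic; it does not read the registry.

```python
# Assumed flag values for the SecureProtocols DWORD (verify before relying on them).
PROTOCOL_FLAGS = {
    "SSL 2.0": 0x0008,
    "SSL 3.0": 0x0020,
    "TLS 1.0": 0x0080,
    "TLS 1.1": 0x0200,
    "TLS 1.2": 0x0800,
}


def encode_protocols(names):
    """Combine protocol names into a single bitmask value."""
    value = 0
    for name in names:
        value |= PROTOCOL_FLAGS[name]
    return value


def decode_protocols(value):
    """List the protocol names whose flag is set in value."""
    return [name for name, flag in PROTOCOL_FLAGS.items() if value & flag]


if __name__ == "__main__":
    wanted = encode_protocols(["TLS 1.0", "TLS 1.1", "TLS 1.2"])
    print(hex(wanted), wanted)
    print(decode_protocols(wanted))
```

Under those assumptions, “Use TLS 1.0, TLS 1.1, and TLS 1.2” works out to 0xA80 (2688), and any value with the 0x20 bit set still has SSL 3.0 enabled.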

The same cannot be done centrally via Group Policy for Mozilla Firefox, but it can be done on an individual basis by installing the SSL Version Control add-on.

Debugging and fixing the BUGCODE_USB_DRIVER (fe) Blue Screen of Death

A remote laptop user has been suffering from occasional blue screens of death with the error BUGCODE_USB_DRIVER. I asked him to email me the mini-dump that had been generated by the last BSOD so that I could analyse it with WinDbg.

Output from analysis of dump file using WinDbg

*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************
BUGCODE_USB_DRIVER (fe)
USB Driver bugcheck, first parameter is USB bugcheck code.
Arguments:
Arg1: 0000000000000008, USBBUGCODE_RESERVED_USBHUB
Arg2: 0000000000000006, USBHUB_TRAP_FATAL_TIMEOUT
Arg3: 0000000000000005, TimeoutCode: Timeout_PCE_Suspend_Action3 - PortData->PortSuspendEvent
Arg4: fffffa8007dfcc80, TimeoutContext - PortData

Debugging Details:
------------------
CUSTOMER_CRASH_COUNT: 1

DEFAULT_BUCKET_ID: WIN7_DRIVER_FAULT

BUGCHECK_STR: 0xFE

PROCESS_NAME: System

CURRENT_IRQL: 0

LAST_CONTROL_TRANSFER: from fffff88006830a5c to fffff80002ed8bc0

STACK_TEXT:
fffff880`035d9ad8 fffff880`06830a5c : 00000000`000000fe 00000000`00000008 00000000`00000006 00000000`00000005 : nt!KeBugCheckEx
fffff880`035d9ae0 fffff800`031cfc93 : fffffa80`07df6050 00000000`00000001 ffffffff`dc3a58a0 fffff800`0307e2d8 : usbhub!UsbhHubProcessChangeWorker+0xec
fffff880`035d9b40 fffff800`02ee2261 : fffff800`0307e200 fffff800`031cfc01 fffffa80`036d9000 fffffa80`036d9040 : nt!IopProcessWorkItem+0x23
fffff880`035d9b70 fffff800`0317473a : 24d524c5`24c524c5 fffffa80`036d9040 00000000`00000080 fffffa80`036669e0 : nt!ExpWorkerThread+0x111
fffff880`035d9c00 fffff800`02ec98e6 : fffff880`033d7180 fffffa80`036d9040 fffff880`033e1fc0 54d3e93c`92e2655f : nt!PspSystemThreadStartup+0x5a
fffff880`035d9c40 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KxStartSystemThread+0x16

STACK_COMMAND: kb

FOLLOWUP_IP:
usbhub!UsbhHubProcessChangeWorker+ec
fffff880`06830a5c cc int 3

SYMBOL_STACK_INDEX: 1

SYMBOL_NAME: usbhub!UsbhHubProcessChangeWorker+ec

FOLLOWUP_NAME: MachineOwner

MODULE_NAME: usbhub

IMAGE_NAME: usbhub.sys

DEBUG_FLR_IMAGE_TIMESTAMP: 52954e12

FAILURE_BUCKET_ID: X64_0xFE_usbhub!UsbhHubProcessChangeWorker+ec

BUCKET_ID: X64_0xFE_usbhub!UsbhHubProcessChangeWorker+ec

The important bits here are the arguments following the Bug Check 0xFE: BUGCODE_USB_DRIVER line. These parameters give the exact underlying error behind the general BUGCODE_USB_DRIVER error. Here they indicate the failure was due to ‘Timed out waiting for a suspend-port request to complete.’
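When several dumps need comparing, the argument lines can be pulled out of the saved WinDbg output instead of being read by eye. The following is a small sketch that parses the `ArgN:` lines into numbers and descriptions. The function name is hypothetical, and the sample text is copied from the analysis above.

```python
import re

# Matches lines such as: Arg2: 0000000000000006, USBHUB_TRAP_FATAL_TIMEOUT
ARG_LINE = re.compile(r"^Arg(\d):\s*([0-9a-fA-F]+),\s*(.*)$", re.MULTILINE)


def parse_bugcheck_args(analysis_text):
    """Map argument number -> (integer value, description text)."""
    return {
        int(match.group(1)): (int(match.group(2), 16), match.group(3).strip())
        for match in ARG_LINE.finditer(analysis_text)
    }


SAMPLE = """\
Arguments:
Arg1: 0000000000000008, USBBUGCODE_RESERVED_USBHUB
Arg2: 0000000000000006, USBHUB_TRAP_FATAL_TIMEOUT
Arg3: 0000000000000005, TimeoutCode: Timeout_PCE_Suspend_Action3 - PortData->PortSuspendEvent
Arg4: fffffa8007dfcc80, TimeoutContext - PortData
"""

if __name__ == "__main__":
    for number, (value, text) in sorted(parse_bugcheck_args(SAMPLE).items()):
        print(f"Arg{number} = {value:#x}  {text}")
```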

A widely recommended solution to this issue is to disable USB selective suspend in the power settings.

Open the Power Settings from the Control Panel and then click Edit Plan Settings for your current plan.


Click on Change advanced power settings.


In the Advanced settings window for Power Options scroll down to the USB settings and expand them to display the USB selective suspend setting. It should be enabled by default. To disable it just click on Enabled and then in the drop down menu that appears change the option to Disabled and then Apply the change and close the window.

Resolving Microsoft Office 2010 issues with Office 365

We have a mixture of licenses for Microsoft Office 2010 and 2013 here and there is the occasional need to reinstall Microsoft Office such as when rebuilding a PC or migrating a user to a new computer.

Problem
New installs of Microsoft Office 2010 will have issues when trying to work with Office 365, i.e. Exchange Online and SharePoint Online.

The issue will be most apparent when trying to connect Outlook with the Exchange mail server as it will continue to request login details on an infinite loop which is difficult to cancel without killing off the Outlook process in Task Manager.

Solution
Microsoft Office 2010 needs to be updated to the most recent version by installing Service Pack 2 (and possibly subsequent Windows Updates that relate to Office 2010).

You can determine what exact version of Office you have by doing the following. On the File tab, click Help. You will see the version information in the About Microsoft section.


• The version number of Office 2010 SP2 is greater than or equal to 14.0.7015.1000.
• The version number of Office 2010 SP1 is greater than or equal to 14.0.6029.1000 but less than 14.0.7015.1000.
• The version number of the original RTM release of Office 2010 (that is, with no service pack) is greater than or equal to 14.0.4763.1000 but less than 14.0.6029.1000.
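The version ranges above are easy to get wrong when comparing by eye, since the comparison has to be done part by part, not as plain text. The following is a minimal sketch that classifies a version string using exactly those thresholds; the function names are my own.

```python
# Minimum build numbers for each Office 2010 level, as listed above.
SP2_MIN = (14, 0, 7015, 1000)
SP1_MIN = (14, 0, 6029, 1000)
RTM_MIN = (14, 0, 4763, 1000)


def parse_version(text):
    """Turn '14.0.7015.1000' into the tuple (14, 0, 7015, 1000)."""
    return tuple(int(part) for part in text.strip().split("."))


def office_2010_level(version_text):
    """Return 'SP2', 'SP1', 'RTM', or 'unknown' for an Office 2010 version."""
    version = parse_version(version_text)
    if version[:2] != (14, 0):
        raise ValueError("not an Office 2010 (14.0) version number")
    if version >= SP2_MIN:
        return "SP2"
    if version >= SP1_MIN:
        return "SP1"
    if version >= RTM_MIN:
        return "RTM"
    return "unknown"


if __name__ == "__main__":
    for sample in ("14.0.4763.1000", "14.0.6029.1000", "14.0.7015.1000"):
        print(sample, office_2010_level(sample))
```

Anything that does not come back as SP2 needs updating before it will behave with Office 365.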

How to whitelist a domain in Office 365 Exchange online

We receive automated emails from a domain other than the one we use for staff, and some of these emails were getting misidentified as spam and moved to people’s Junk Email folders in Outlook. So we needed to whitelist the domain so that any emails originating from there would bypass the spam filter.

  1. In the Exchange admin center click on Mail Flow.
  2. Next create a new rule by clicking on the + icon and click Bypass spam filtering…
  3. Under *Apply this rule if…, select The sender… and then domain is.
  4. Add the domain you wish to whitelist plus any additional domains you also wish to whitelist.
  5. Select Stop processing more rules and then click Save.

CCNA, MCITP and MCSE: Server Infrastructure