NMP: nmpDeviceAttemptFailover: Retry world failover device


After restarting a vSphere 4.1 Update 2 host, there were a lot of warnings in the vmkernel log about NMP and failover:

Feb 29 16:03:19 esx vmkernel: 0:00:53:08.502 cpu1:4305)WARNING: NMP: nmpDeviceAttemptFailover: Retry world restore device "mpx.vmhba34:C0:T0:L0" - no more commands to retry
Feb 29 16:03:24 esx vmkernel: 0:00:53:13.498 cpu0:4096)VMNIX: VmkDev: 2860: abort succeeded.
Feb 29 16:03:24 esx vmkernel: 0:00:53:13.498 cpu0:4096)WARNING: NMP: nmp_IssueCommandToDevice: I/O could not be issued to device "mpx.vmhba34:C0:T0:L0" due to Not found
Feb 29 16:03:24 esx vmkernel: 0:00:53:13.498 cpu0:4096)WARNING: NMP: nmp_DeviceRetryCommand: Device "mpx.vmhba34:C0:T0:L0": awaiting fast path state update for failover with I/O blocked. No prior reservation exists on the device.
Feb 29 16:03:24 esx vmkernel: 0:00:53:13.498 cpu0:4096)WARNING: NMP: nmp_DeviceStartLoop: NMP Device "mpx.vmhba34:C0:T0:L0" is blocked. Not starting I/O from device.
Feb 29 16:03:25 esx vmkernel: 0:00:53:14.500 cpu1:4305)WARNING: NMP: nmpDeviceAttemptFailover: Retry world failover device "mpx.vmhba34:C0:T0:L0" - issuing command 0x41027fa7e340
Feb 29 16:03:25 esx vmkernel: 0:00:53:14.500 cpu1:4305)WARNING: NMP: nmpDeviceAttemptFailover: Retry world failover device "mpx.vmhba34:C0:T0:L0" - failed to issue command due to Not found (APD), try again...
Feb 29 16:03:25 esx vmkernel: 0:00:53:14.500 cpu1:4305)WARNING: NMP: nmpDeviceAttemptFailover: Logical device "mpx.vmhba34:C0:T0:L0": awaiting fast path state update...
Feb 29 16:03:34 esx vmkernel: 0:00:53:23.500 cpu0:4096)VMNIX: VmkDev: 2767: a/r=2 cmd=0x1e sn=4054 dsk=vml0:88:0 reqbuf=0000000000000000 (sg=0)
Feb 29 16:03:34 esx vmkernel: 0:00:53:23.500 cpu15:4127)ScsiDeviceIO: 1688: Command 0x1e to device "mpx.vmhba34:C0:T0:L0" failed H:0x5 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.
Feb 29 16:03:34 esx vmkernel: 0:00:53:23.500 cpu15:4127)WARNING: NMP: nmp_DeviceStartLoop: NMP Device "mpx.vmhba34:C0:T0:L0" is blocked. Not starting I/O from device.
Feb 29 16:03:34 esx vmkernel: 0:00:53:23.500 cpu0:4096)VMNIX: VmkDev: 2812: abort sn=4054, vmkret=0.
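With this many lines scrolling by, it can help to tally the NMP warnings per device first. The following is only a rough sketch: the function name count_nmp_warnings is made up for illustration, and it assumes that the device name is the first double-quoted string on each warning line, as it is in the excerpt above.

```shell
# Hypothetical helper: count "WARNING: NMP:" lines per device in a log file.
# Assumes the device name is the first double-quoted string on the line.
count_nmp_warnings() {
    # Split each line on double quotes so that $2 holds the device name,
    # then print "<count> <device>" with the noisiest device first.
    awk -F'"' '/WARNING: NMP:/ && NF >= 3 { n[$2]++ }
               END { for (d in n) print n[d], d }' "$1" | sort -rn
}
```

It would be called as count_nmp_warnings /var/log/vmkernel, where the log path is an assumption based on the syslog-style lines above and may differ on your host.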

There’s one odd thing about these warnings: the vmhba mentioned in the vmkernel log isn’t visible in the vSphere Client.

To find out what kind of device this vmhba was, I logged in to the vSphere host and ran the esxcfg-scsidevs -l command:

[root@esx ~]# esxcfg-scsidevs -l | grep vmhba34
mpx.vmhba34:C0:T0:L0
   Display Name: Local USB Direct-Access (mpx.vmhba34:C0:T0:L0)
   Devfs Path: /vmfs/devices/disks/mpx.vmhba34:C0:T0:L0

This particular vSphere host was a Dell R710 server with the Dell OMSA agent installed. A quick search on http://kb.vmware.com turned up KB article KB1013818, which gives the following cause:

Cause

This issue occurs if you have external USB devices and the iDRAC is set to mount these devices as virtual media.

The fix is quite simple: either detach the iDRAC virtual CD drive, or run the following commands, which rename invcol in the Dell srvadmin directory and restart the srvadmin services, to fix the “issue”:

# cd /opt/dell/srvadmin/sbin
# mv invcol invcol.bak
# srvadmin-services.sh restart

After restarting the services the VMkernel logs are clean again.
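To check that the log really stays clean, you can count the NMP warnings in just the most recent part of the log. This is a minimal sketch, not something taken from the KB article: the function name recent_nmp_warnings and the default of 200 lines are arbitrary choices for illustration.

```shell
# Hypothetical helper: count NMP warnings in the last N lines of a log file.
# $1 = log file, $2 = number of trailing lines to inspect (default 200).
recent_nmp_warnings() {
    # grep -c prints 0 but exits non-zero when nothing matches,
    # so "|| true" keeps the function from reporting a failure.
    tail -n "${2:-200}" "$1" | grep -c 'WARNING: NMP:' || true
}
```

Running recent_nmp_warnings /var/log/vmkernel 500 a few minutes after the restart should then print 0 if the warnings have stopped. The log path is again an assumption.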

Source: KB1013818
