NMP: nmpDeviceAttemptFailover: Retry world failover device

After restarting a vSphere 4.1 Update 2 host, there were a lot of NMP failover warnings in the vmkernel log:

Feb 29 16:03:19 esx vmkernel: 0:00:53:08.502 cpu1:4305)WARNING: NMP: nmpDeviceAttemptFailover: Retry world restore device "mpx.vmhba34:C0:T0:L0" - no more commands to retry
Feb 29 16:03:24 esx vmkernel: 0:00:53:13.498 cpu0:4096)VMNIX: VmkDev: 2860: abort succeeded.
Feb 29 16:03:24 esx vmkernel: 0:00:53:13.498 cpu0:4096)WARNING: NMP: nmp_IssueCommandToDevice: I/O could not be issued to device "mpx.vmhba34:C0:T0:L0" due to Not found
Feb 29 16:03:24 esx vmkernel: 0:00:53:13.498 cpu0:4096)WARNING: NMP: nmp_DeviceRetryCommand: Device "mpx.vmhba34:C0:T0:L0": awaiting fast path state update for failover with I/O blocked. No prior reservation exists on the device.
Feb 29 16:03:24 esx vmkernel: 0:00:53:13.498 cpu0:4096)WARNING: NMP: nmp_DeviceStartLoop: NMP Device "mpx.vmhba34:C0:T0:L0" is blocked. Not starting I/O from device.
Feb 29 16:03:25 esx vmkernel: 0:00:53:14.500 cpu1:4305)WARNING: NMP: nmpDeviceAttemptFailover: Retry world failover device "mpx.vmhba34:C0:T0:L0" - issuing command 0x41027fa7e340
Feb 29 16:03:25 esx vmkernel: 0:00:53:14.500 cpu1:4305)WARNING: NMP: nmpDeviceAttemptFailover: Retry world failover device "mpx.vmhba34:C0:T0:L0" - failed to issue command due to Not found (APD), try again...
Feb 29 16:03:25 esx vmkernel: 0:00:53:14.500 cpu1:4305)WARNING: NMP: nmpDeviceAttemptFailover: Logical device "mpx.vmhba34:C0:T0:L0": awaiting fast path state update...
Feb 29 16:03:34 esx vmkernel: 0:00:53:23.500 cpu0:4096)VMNIX: VmkDev: 2767: a/r=2 cmd=0x1e sn=4054 dsk=vml0:88:0 reqbuf=0000000000000000 (sg=0)
Feb 29 16:03:34 esx vmkernel: 0:00:53:23.500 cpu15:4127)ScsiDeviceIO: 1688: Command 0x1e to device "mpx.vmhba34:C0:T0:L0" failed H:0x5 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.
Feb 29 16:03:34 esx vmkernel: 0:00:53:23.500 cpu15:4127)WARNING: NMP: nmp_DeviceStartLoop: NMP Device "mpx.vmhba34:C0:T0:L0" is blocked. Not starting I/O from device.
Feb 29 16:03:34 esx vmkernel: 0:00:53:23.500 cpu0:4096)VMNIX: VmkDev: 2812: abort sn=4054, vmkret=0.

One thing is odd about these warnings: the vmhba mentioned in the vmkernel log isn’t visible in the vSphere Client.

To find out what kind of device this vmhba was, I logged in to the vSphere host and ran the esxcfg-scsidevs -l command:

[root@esx ~]# esxcfg-scsidevs -l | grep vmhba34
mpx.vmhba34:C0:T0:L0
Display Name: Local USB Direct-Access (mpx.vmhba34:C0:T0:L0)
Devfs Path: /vmfs/devices/disks/mpx.vmhba34:C0:T0:L0
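
If it isn’t obvious which device is causing the noise, the device names can be pulled out of the log itself. The function below is a minimal sketch, not something from the KB article: the name nmp_warning_devices is made up for this example, and it assumes that every NMP warning puts the device name in double quotes, as the lines above do. It prints each device with the number of warnings that mention it.

```shell
# Hypothetical helper: list every device named in NMP warnings,
# with the number of warnings per device (most frequent first).
# Assumes the device name is the double-quoted string on each warning line.
# Usage (log path is an assumption for a classic ESX 4.x host):
#   nmp_warning_devices /var/log/vmkernel
nmp_warning_devices() {
    grep 'WARNING: NMP:' "$1" \
        | grep -o '"[^"]*"' \
        | tr -d '"' \
        | sort | uniq -c | sort -rn
}
```

A device name from this list can then be passed to esxcfg-scsidevs -l | grep as shown above.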

This particular vSphere host was a Dell R710 server with the Dell OMSA agent installed. A quick search on http://kb.vmware.com turned up KB article KB1013818, which describes the following cause:

Cause

This issue occurs if you have external USB devices and the iDRAC is set to mount these devices as virtual media.

The fix is quite simple: detach the iDRAC virtual CD drive, or run the following commands to fix the “issue”:

# cd /opt/dell/srvadmin/sbin
# mv invcol invcol.bak
# srvadmin-services.sh restart

After restarting the services the VMkernel logs are clean again.
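
To check this without reading the log by eye, count the NMP warnings for the device in the most recent log lines. This is again only a sketch: count_nmp_warnings is a name made up for this example, and the log path and the 200-line window in the usage comment are assumptions to adjust for your host.

```shell
# Hypothetical helper: read log lines on stdin and print how many
# NMP warnings mention the given device.
# Usage (path and line count are assumptions):
#   tail -n 200 /var/log/vmkernel | count_nmp_warnings "mpx.vmhba34:C0:T0:L0"
count_nmp_warnings() {
    # grep -c prints 0 but exits non-zero when nothing matches,
    # so "|| true" keeps the function's exit status clean.
    grep 'WARNING: NMP:' | grep -cF "\"$1\"" || true
}
```

The log still contains the older warnings, so look only at the tail of the file. A count of 0 a few minutes after the restart suggests the warnings have stopped.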

Source: KB1013818

Published by afokkema
