Well, I hit an interesting issue with vMotion in VMware vCenter. Funnily enough, it seems eerily similar to a problem reported on Twitter last week by @joefarri. Maybe this will provide some insight into some of the odd problems that can occur.
Basically, I had a swath of VMs that wouldn’t vMotion across my cluster. They seemed to be set up OK, with no removable devices attached, no serial ports, and none of the other things that usually cause problems with vMotion. I was a little stumped until I happened to start looking through the configuration in detail.
What I found was that under the NIC configuration, when I dropped down the list of attached networks, there were multiple networks with identical names. Two of the networks listed there were called “Production Network”. On a whim, I selected the “other” defined network and suddenly my machines would vMotion again. The bizarre part is that once I’d done this, the “ghost” network vanished.
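If you'd rather not click through every NIC dropdown by hand, the same check can be roughed out in a few lines of Python. This is only a sketch, not something I ran against my own vCenter: it assumes you've already exported the network list as (name, identifier) pairs by whatever means you like, and the function name and the `network-...` IDs below are made up for illustration.

```python
from collections import defaultdict


def find_duplicate_network_names(networks):
    """Return {name: [ids, ...]} for every network name listed more than once.

    `networks` is any iterable of (name, identifier) pairs. How you export
    that list from vCenter is up to you; this helper is hypothetical.
    """
    ids_by_name = defaultdict(list)
    for name, net_id in networks:
        ids_by_name[name].append(net_id)
    # Keep only the names that show up under more than one identifier.
    return {name: ids for name, ids in ids_by_name.items() if len(ids) > 1}


# Made-up inventory for illustration; the IDs are not from a real system.
inventory = [
    ("Production Network", "network-101"),
    ("Production Network", "network-245"),
    ("VM Network", "network-12"),
]

duplicates = find_duplicate_network_names(inventory)
for name, ids in duplicates.items():
    print(f"{name!r} is listed {len(ids)} times: {', '.join(ids)}")
```

Any name that comes back from this is a candidate “ghost” network worth checking on the affected VMs.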
What do I think caused this? Well, these were all older machines… some 2003 and early 2008 servers that we had stood up. I think some of them were P2Vs. Anyway, my guess is that somewhere during one of our upgrades these “ghost networks” appeared as a corruption in the VM config or in the vCenter database itself. Once I told a VM to use a “different” network, that corruption was cleared and suddenly stuff worked again.
I suspect another way to resolve this problem would be to unregister and re-register the VM… but I didn’t test that since these are all production systems I didn’t want to interrupt (and yes, changing the network caused no loss of pings).
Hope that helps someone. Posting it here mostly so I can link it from Twitter for the benefit of my fellow VMware geeks.