Hi Community,
This is my first discussion here, so I hope this is the right place for my question.
We have strange issues with the Auto Deploy feature when using a DVS, and I hope someone has an idea...
The setup:
- We're using ESXi 5.5u2b
- vCenter edition: Windows
- AutoDeploy-Server: dedicated Windows machine
- The ESX-Nodes are located in a routed DMZ
- The ESX-Nodes have no local storage (stateless rollout)
- The ESX hardware is UCS B220M3 blades in a UCS mini chassis.
The pre-staged Auto Deploy completed successfully. We moved the new ESX host into our cluster and applied all additional configuration (distributed vSwitch, storage, etc.).
The host profile we created is 100% compliant, and we are able to apply it to the host. Everything worked well up to this point, so we added the cluster and the host profile to our deployment rule.
Now the reboot of the host:
The host comes up successfully with the correct configuration (everything is fine!).
vCenter then performs a "disconnect" of the host and starts a "reconnect". Both actions complete successfully.
Unfortunately, vCenter then starts to apply the host profile again (I don't know why). This task stops at 22% and the ESX host is gone.
What happened on the ESX host?
The network configuration has been modified. There is no vmkernel configuration left to manage the host, so communication with vCenter is interrupted.
We found out that the following changes happened:
- The network configuration has been replaced.
- All dvswitches are configured.
- Only one vmkernel interface has been defined (vmk2). vmk2 is dedicated to the storage connection.
- Because the vmk0 interface was lost, vmk2 has been tagged as the "management" interface.
- The host profile deployment stops here. After that point, only lldp-snmp log messages occurred.
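The broken state in the list above can be expressed as a simple check. This is only a minimal Python sketch: the `VmkInterface` class, the `find_management_problems` function and the sample data are hypothetical, written by hand to mirror what we saw on the host. It does not talk to ESXi or vCenter.

```python
from dataclasses import dataclass, field


@dataclass
class VmkInterface:
    """One vmkernel interface, recorded by hand (hypothetical model)."""
    name: str
    switch_type: str  # "standard" or "distributed"
    tags: set = field(default_factory=set)


def find_management_problems(interfaces, expected_mgmt="vmk0"):
    """Return a list of readable problems with the management setup."""
    problems = []
    names = {iface.name for iface in interfaces}
    # The interface we expect to carry management traffic must exist.
    if expected_mgmt not in names:
        problems.append(f"{expected_mgmt} is missing")
    # No other interface should carry the management tag.
    for iface in interfaces:
        if "management" in iface.tags and iface.name != expected_mgmt:
            problems.append(
                f"{iface.name} is tagged as management instead of {expected_mgmt}"
            )
    return problems


# Hypothetical sample data modelled on the state described above.
broken_host = [VmkInterface("vmk2", "distributed", {"management"})]
healthy_host = [
    VmkInterface("vmk0", "distributed", {"management"}),
    VmkInterface("vmk2", "distributed"),
]

print(find_management_problems(broken_host))
print(find_management_problems(healthy_host))
```

For the broken host the check reports both the missing vmk0 and the management tag on vmk2. For the healthy host it reports nothing.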
The funniest thing:
If you boot the ESX host again with the pre-staging rule and the host is connected to vCenter, you can apply the host profile manually and it works fine.
#### Update ####
Currently, the workaround is to bind vmk0 to a standard vSwitch. We still hope to find a solution that lets us migrate all switches from standard to distributed.
I hope you can give us some input to solve the problem.
Thank you
Falk