I am running the following build versions:
VMware ESXi 5.0 Express Patch 4 (version build 5.0.0 804277)
vCenter Server 5.0 U2 (version build 5.0.0 913577)
vCenter Server 5.0 Update 1 b (this is actually vSphere Client version 5.0.0 build 804277)
SRM 5.0.1.2645
Left Hand SRA 9.5.0.621
Left Hand SAN iQ 9.5.00.1215.0
We recently moved our Windows CA server from 2003 to 2008. We exported all certs and re-imported them onto a server with the same name, which is best practice according to Microsoft. A couple of days later our hosts in vSphere dropped to a disconnected state with the alert "SSL not verified". This is a known issue in 5.0; however, we are on 5.0 U2, which should be fine. We generated new certs from the certificate authority and imported them. Here is the list of steps, more or less, that we have performed on more than one occasion:
1) Shut down all VMs by connecting to the hosts directly
2) Enable SSH and put hosts in maintenance mode
3) Create CSRs (certificate signing requests)
4) Generate the certificates from AD cert services or any enterprise certificate authority
5) Import the new cert to the hosts
6) Power cycle the hosts (they should come back in maintenance mode, not disconnected state)
7) Remove hosts from the cluster in vSphere
8) Then re-add them to the cluster
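To confirm, after step 6, that a host really presents the new certificate and that it chains to the new CA, one option is to attempt a fully verified TLS handshake against the host from another machine. The following is only a minimal sketch under assumptions: the function names `probe_host` and `describe_verify_error` are made up for illustration, and the CA file path and hostname are placeholders for your own values. It uses only Python's standard `ssl` and `socket` modules.

```python
import socket
import ssl


def describe_verify_error(exc):
    """Summarize a certificate verification failure in one line."""
    return f"verify failed (code {exc.verify_code}): {exc.verify_message}"


def probe_host(hostname, ca_file, port=443, timeout=5.0):
    """Attempt a verified TLS handshake and report what happened.

    ca_file is assumed to be a PEM file holding the CA chain that
    issued the host certificates (hypothetical path, supply your own).
    """
    # Trust only the CA file given, and enforce hostname checking.
    context = ssl.create_default_context(cafile=ca_file)
    try:
        with socket.create_connection((hostname, port), timeout=timeout) as raw:
            with context.wrap_socket(raw, server_hostname=hostname) as tls:
                cert = tls.getpeercert()
                return "ok: subjectAltName=" + repr(cert.get("subjectAltName", ()))
    except ssl.SSLCertVerificationError as exc:
        # Raised for both chain problems and hostname mismatches.
        return describe_verify_error(exc)
    except OSError as exc:
        # Refused connections, timeouts, DNS failures, other TLS errors.
        return f"connection problem: {exc}"
```

Calling something like `probe_host("esx01.example.local", "new-ca-chain.pem")` (both values are placeholders) once per host should show whether the failure is a chain problem, a name mismatch, or a plain connection error. Use the same name vCenter uses to reach the host, since the hostname check depends on it.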
The hosts would be good for an hour or so, then drop to a disconnected state. We have cycled through at least 3-4 Tier 3 support engineers, and they all seem to be stumped. Here is the host log that we think best shows what is happening:
2013-10-25T12:40:29.113-07:00 [09292 info 'Default' opID=HB-host-547@340-90f44497] [VpxLRO] -- BEGIN task-internal-11448 -- host-547 -- VpxdInvtHostSyncHostLRO.Synchronize --
2013-10-25T12:40:29.114-07:00 [09292 info 'Default' opID=HB-host-547@340-90f44497] [VpxdHostSync] Synchronizing host: host-547 (redacted)
2013-10-25T12:40:29.118-07:00 [09292 info 'Default' opID=HB-host-547@340-90f44497] [ClientAdapterBase] InvokeOnSoap leaving
2013-10-25T12:40:29.120-07:00 [09292 info 'Default' opID=HB-host-547@340-90f44497] [ClientAdapterBase] InvokeOnSoap leaving
2013-10-25T12:40:29.216-07:00 [09600 error 'Default'] SSLStreamImpl::DoClientHandshake (000000000db55850) SSL_connect failed. Dumping SSL error queue:
2013-10-25T12:40:29.216-07:00 [09600 error 'Default'] [0]error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed
2013-10-25T12:40:29.216-07:00 [09600 error 'HttpConnectionPool'] [ConnectComplete] Connect error SSL Exception: The remote host certificate has these problems:
-->
--> * unable to get local issuer certificate
-->
--> * Host name does not match the subject name(s) in certificate.
2013-10-25T12:40:29.217-07:00 [10080 info 'Default' opID=task-internal-11449-6cc3b3c7] [VpxLRO] -- BEGIN task-internal-11449 -- host-547 -- HostDisconnectLRO.Disconnect --
2013-10-25T12:40:29.218-07:00 [10080 info 'vmomi.soapStub[398]' opID=task-internal-11449-6cc3b3c7] Resetting stub adapter for server TCP:redacted:443: Closed
2013-10-25T12:40:29.220-07:00 [09292 error 'Default' opID=HB-host-547@340-90f44497] [VpxdInvtHostSyncHostLRO] Got method fault: vim.fault.SSLVerifyFault
2013-10-25T12:40:29.220-07:00 [09292 error 'Default' opID=HB-host-547@340-90f44497] Backtrace: backtrace[00] rip 000000018013da0a (no symbol)
--> backtrace[01] rip 00000001801006b8 (no symbol)
--> backtrace[02] rip 0000000180100bbe (no symbol)
--> backtrace[03] rip 0000000180087c2b (no symbol)
--> backtrace[04] rip 00000000009f9a21 (no symbol)
--> backtrace[05] rip 000000013fed05da (no symbol)
--> backtrace[06] rip 00000001401e8cfd (no symbol)
--> backtrace[07] rip 00000001401e9d84 (no symbol)
--> backtrace[08] rip 00000001401ea70a (no symbol)
--> backtrace[09] rip 000000013fec424b (no symbol)
--> backtrace[10] rip 000000013feccf6a (no symbol)
--> backtrace[11] rip 000000018015471d (no symbol)
--> backtrace[12] rip 0000000180155c44 (no symbol)
--> backtrace[13] rip 000000018014dfd5 (no symbol)
--> backtrace[14] rip 0000000074ce2fdf (no symbol)
--> backtrace[15] rip 0000000074ce3080 (no symbol)
--> backtrace[16] rip 000000007739652d (no symbol)
--> backtrace[17] rip 000000007782c521 (no symbol)
-->
2013-10-25T12:40:29.317-07:00 [10080 info 'Default' opID=task-internal-11449-6cc3b3c7] [VpxdMoHost] host connection state changed to [DISCONNECTED] for host-547
2013-10-25T12:40:29.333-07:00 [10080 info 'Default' opID=task-internal-11449-6cc3b3c7] [VpxdInvtHost::SaveFieldsToDb] IPMI info of redacted is not set
2013-10-25T12:40:29.390-07:00 [10080 info 'Default' opID=task-internal-11449-6cc3b3c7] [VpxdMoHost::SetComputeCompatibilityDirty] Marked host-547 as dirty.
2013-10-25T12:40:29.390-07:00 [10080 info 'Default' opID=task-internal-11449-6cc3b3c7] [VpxdMoCluster::SetDasCompatDirty] Marked domain-c26 as dirty.
2013-10-25T12:40:29.464-07:00 [09292 info 'Default' opID=HB-host-547@340-90f44497] [VpxLRO] -- FINISH task-internal-11448 -- host-547 -- VpxdInvtHostSyncHostLRO.Synchronize --
2013-10-25T12:40:29.464-07:00 [10080 info 'Default' opID=task-internal-11449-6cc3b3c7] [VpxdMoHost::SetComputeCompatibilityDirty] Marked host-547 as dirty.
2013-10-25T12:40:29.464-07:00 [10080 info 'Default' opID=task-internal-11449-6cc3b3c7] [VpxdMoCluster::SetDasCompatDirty] Marked domain-c26 as dirty.
2013-10-25T12:40:29.464-07:00 [10080 info 'Default' opID=task-internal-11449-6cc3b3c7] [VpxLRO] -- FINISH task-internal-11449 -- host-547 -- HostDisconnectLRO.Disconnect --
2013-10-25T12:40:29.775-07:00 [09292 error 'HttpConnectionPool'] [ConnectComplete] Connect error No connection could be made because the target machine actively refused it.
2013-10-25T12:40:29.775-07:00 [04556 error 'Default' opID=b02f0c1d] [HttpUtil::ExecuteRequest] Error in sending request - No connection could be made because the target machine actively refused it.
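Since the log is long, it can help to pull out only the certificate problems that follow each "SSL Exception" entry and compare them across hosts. Below is a rough sketch, assuming the continuation-line layout seen in the excerpt above (`--> * problem text`); the function name `certificate_problems` is made up for illustration, and other log versions may be formatted differently.

```python
import re

# Continuation lines in the excerpt look like "--> * some problem text".
PROBLEM_LINE = re.compile(r"^-->\s*\*\s*(?P<problem>.+?)\s*$")


def certificate_problems(log_lines):
    """Collect the problems listed under each 'SSL Exception' entry."""
    problems = []
    in_ssl_exception = False
    for line in log_lines:
        if line.startswith("-->"):
            # Continuation of the previous timestamped entry.
            match = PROBLEM_LINE.match(line)
            if in_ssl_exception and match:
                problems.append(match.group("problem"))
        else:
            # A new entry starts; note whether it reports an SSL exception.
            in_ssl_exception = "SSL Exception" in line
    return problems
```

Run against the excerpt above, this should pick out the two entries that matter here: the missing local issuer certificate (the CA chain is not trusted by the machine making the connection) and the host name not matching the subject name(s) in the certificate.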