Repointing vCenter Server to external PSC on load balanced FQDN fails

I have been ¬†planning a migration project for a customer for a while which involves moving from an embedded SSO instance on vCenter 5.5 to an external Platform Services Controller instance on 6.5. Suffice to say, plenty of ‘how to’ guides exist, alongside the documentation from VMware – however, there is a generally scant outline of what steps to take when ‘repointing your vCenter to the new load balanced PSC virtual IP. The topic of this post is what happens when you follow the available load balancing documentation and your VMware Update Manager service fails to start afterwards.

I’ll include the reference articles up front, in case these are the ones which you might also have referred to:

Reference articles:

Configuring HA PSC load balancing on Citrix NetScaler РVMware KB article

Repoint vCenter Server to Another External Platform Services Controller in the Same Domain РVMware KB article

The repoint command:

At the step where you are reminded to repoint your vCenter instances at the new load balanced VIP address you’ll need to use the command:

cmsso-util repoint --repoint-psc psc-ha-vip.sbcpureconsult.internal

However, if you’ve followed the steps precisely, you’re likely to run into the following output when the repoint script attempts to restart the Update Manager service:

What happens:

Validating Provided Configuration …
Validation Completed Successfully.
Executing repointing steps. This will take few minutes to complete.
Please wait …
Stopping all the services …
All services stopped.
Starting all the services …

[… truncated …]

Stderr = Service-control failed. Error Failed to start vmon services.vmon-cli RC=2, stderr=Failed to start updatemgr services. Error: Service crashed while starting

Failed to start all the services. Error {
“resolution”: null,
“detail”: [
{
“args”: [
“Stderr: Service-control failed. Error Failed to start vmon services.vmon-cli RC=2, stderr=Failed to start updatemgr services. Error: Service crashed while starting\n\n”
],
“id”: “install.ciscommon.command.errinvoke”,
“localized”: “An error occurred while invoking external command : ‘Stderr: Service-control failed. Error Failed to start vmon services.vmon-cli RC=2, stderr=Failed to start updatemgr services. Error: Service crashed while starting\n\n'”,
“translatable”: “An error occurred while invoking external command : ‘%(0)s'”
}
],
“componentKey”: null,
“problemId”: null
}

Following this issue you might reboot or attempt to start all services directly on the vCenter appliance afterwards and receive:

service-control --start --all

Service-control failed. Error Failed to start vmon services.vmon-cli RC=2, stderr=Failed to start updatemgr services. Error: Service crashed while starting

This again is fairly unhelpful output and doesn’t provide any assistance as to the cause of the issue. After much investigation, it turns out that the list of TCP port numbers which the load balancing configuration details are not complete, causing the service startup to fail. Because we’re not running any other applications on the PSC hosts it’s possible to simplify the configuration on NetScaler by using wildcard port services for each server.

NetScaler configuration commands (specific to PSC load balancing):

The following alternative configuration ensures that any PSC service requested by your vCenter Server (or other solutions) will remain persistently connected on a ‘per host’ basis for up to 1440 minutes which is the default lifetime of a vCenter Web Client session. This is different to VMware’s documented approach which load balances each service individually, but obviously misses out some crucial port.

add server hosso01.sbcpureconsult.internal 192.168.0.117
add server hosso02.sbcpureconsult.internal 192.168.0.116

add service hosso01.sbcpureconsult.internal_TCP_ANY hosso01.sbcpureconsult.internal TCP * -gslb NONE -maxClient 0 -maxReq 0 -cip DISABLED -usip NO -useproxyport YES -sp OFF -cltTimeout 9000 -svrTimeout 9000 -CKA NO -TCPB NO -CMP NO

add service hosso02.sbcpureconsult.internal_TCP_ANY hosso02.sbcpureconsult.internal TCP * -gslb NONE -maxClient 0 -maxReq 0 -cip DISABLED -usip NO -useproxyport YES -sp OFF -cltTimeout 9000 -svrTimeout 9000 -CKA NO -TCPB NO -CMP NO

add lb vserver lb_hosso01_02_TCP_ANY TCP 192.168.0.122 * -persistenceType SOURCEIP -timeout 1440 -cltTimeout 9000

bind lb vserver lb_hosso01_02_TCP_ANY hosso01.sbcpureconsult.internal_TCP_ANY

bind lb vserver lb_hosso01_02_TCP_ANY hosso02.sbcpureconsult.internal_TCP_ANY

Once this configuration is put in place you’ll find that the vCenter Update Manager service will start correctly and your repoint will be successful.

Thanks for reading!

If you find this useful drop me a message on Twitter.

Leave a Reply