IaaS service not registering


IaaS refers to the collection of components installed on the Windows servers in a vRealize Automation environment. This article focuses specifically on the troubleshooting methodology to apply when the IaaS service is not registering on the vRA appliances. I intend to write a separate post at a later date on the purpose of each individual IaaS component.

See the vRA Management agents explained article for more detail on the IaaS Management Agent components specifically.



The Problem




The IaaS service registry status can be viewed in the Services tab of the VAMI interface or at the vRA appliance component service registry health check URL.



https://vRA:5480

https://vRA/component-registry/services/status/current



 

Typically, IaaS service registry issues will be noticed here first. Other common symptoms of IaaS service registry issues include problems accessing the Infrastructure tab or general provisioning/data collection failures.



      
Understanding the IaaS service's various dependencies

 

 


The dependencies of the IaaS architecture can be simplified as follows, with each component having a dependency on the underlying piece:



1) IaaS SQL database
      |
      +----> 2) IaaS web
                  |
                  +----> 3) IaaS Manager
                              (At least one DEM Orchestrator & worker running)



First and foremost, the vRA IaaS SQL database needs to be accessible, but I rarely use this as a starting point. The IaaS application logs do a good job of flagging whether database access is to blame, so hold off on engaging the DB admin team for now.

Next, the Repository (the IaaS web component) needs to be initialized; it is hosted by Microsoft IIS.

Finally, the IaaS Manager service needs to be running; it is installed as a Windows service. Generally speaking, from a service registry standpoint we do not care about the DEM and Agent services, with one caveat: at least one DEM Orchestrator and one DEM worker instance need to be running for IaaS to be able to register in vRA.

In short the dependencies can be expressed as follows:

  • The IaaS Web (Repository) needs only the SQL DB to be accessible in order to initialize successfully
  • The IaaS Manager service needs both the SQL DB and the Repository to be available
  • For IaaS to register in vRA, at least one DEM worker and one DEM Orchestrator need to be up
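
As a rough illustration of the dependency chain above, the sketch below walks the layers from the bottom up and reports the lowest one that is unhealthy. The component keys (sql_db, iaas_web and so on) and the function names are hypothetical labels chosen for this example, not anything vRA exposes.

```python
# Layers listed from the bottom of the dependency chain upwards.
# The key names are illustrative labels only, not vRA identifiers.
CHECK_ORDER = ["sql_db", "iaas_web", "iaas_manager", "dem_orchestrator", "dem_worker"]


def lowest_broken_layer(health):
    """Return the first unhealthy component in dependency order, or None.

    `health` maps a component label to True (healthy) or False.
    A component missing from the mapping is treated as unhealthy.
    """
    for component in CHECK_ORDER:
        if not health.get(component, False):
            return component
    return None


def can_register(health):
    """IaaS is only expected to register when every layer is healthy."""
    return lowest_broken_layer(health) is None
```

If the web layer is down, anything reported by the Manager service is most likely a knock-on effect, so fixing the lowest broken layer first is the sensible order.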


The Approach



1. First I like to check the component registry URL. Under certain circumstances this URL will actually provide useful feedback as to why IaaS is not registering:
 

https://vRA/component-registry/services/status/current
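
If you would rather script this check than read the page in a browser, something like the following Python sketch could help. It assumes the URL returns JSON with a "content" list whose entries carry "serviceName" and a "serviceStatus" object containing "serviceInitializationStatus" and "errorMessage". Those field names are an assumption about the response format, so verify them against what your own appliance returns. The function names are made up for this example.

```python
import json
import ssl
import urllib.request


def unregistered_services(status_doc):
    """Return (name, state, error) for each service that is not REGISTERED.

    The JSON field names used here are assumptions about the response shape.
    """
    problems = []
    for entry in status_doc.get("content", []):
        status = entry.get("serviceStatus") or {}
        state = status.get("serviceInitializationStatus")
        if state != "REGISTERED":
            problems.append((entry.get("serviceName"), state, status.get("errorMessage")))
    return problems


def fetch_status(host, verify_tls=True):
    """Download the component registry status document from a vRA appliance."""
    url = f"https://{host}/component-registry/services/status/current"
    context = ssl.create_default_context()
    if not verify_tls:
        # Only sensible for lab appliances that use self-signed certificates.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    request = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(request, context=context, timeout=15) as response:
        return json.load(response)
```

Calling unregistered_services(fetch_status("vRA")) would then list only the services that need attention, along with any error message the registry has recorded for them.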











 

-------------------------------



2. Next, my advice is to isolate the issue by determining its scope and ruling out any load balancer issues: check whether the IaaS services are accessible locally on the machines they are running on.

Check the IaaS web and IaaS manager health check URLs both locally on the relevant machines and at the load balancer URL. RDP to both the IaaS web and manager servers and check the URLs outlined below:



IaaS web health check URLs:



https://IaaSwebnode/WAPI/api/status
https://IaaSWebLBUrl/WAPI/api/status 






IaaS manager health check URLs:


https://IaaSmanagernode/VMPS2
https://IaaSmanagerLBUrl/VMPS2







 
















If the local URLs return a good response and the load-balanced URLs don't, then troubleshoot the load balancer. Engage the networking team and verify that the configuration follows VMware best practice. A temporary workaround in this scenario is to add hosts file entries pointing the faulty VIP's hostname at the IP address of a working node. If the local URLs are not responding as expected, proceed to step 3 below.
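
The local-versus-load-balancer comparison in step 2 could be scripted along these lines. This is only a sketch: probe and classify_scope are hypothetical helper names, TLS verification is switched off by default on the assumption that the nodes may use self-signed certificates, and treating "any HTTP 200" as a good response is a simplification.

```python
import ssl
import urllib.error
import urllib.request


def probe(url, timeout=10, verify_tls=False):
    """Return the HTTP status code for url, or None if no HTTP response arrives."""
    context = ssl.create_default_context()
    if not verify_tls:
        # Assumption: health checks against nodes with self-signed certificates.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    try:
        with urllib.request.urlopen(url, context=context, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # the server answered, but with an error status
    except (urllib.error.URLError, OSError):
        return None  # connection refused, timeout, TLS failure and similar


def classify_scope(local_ok, lb_ok):
    """Map the two health check results onto the next troubleshooting step."""
    if not local_ok:
        return "node: check the component logs on the server itself"
    if not lb_ok:
        return "load balancer: node is healthy, the VIP path is not"
    return "healthy: both local and load-balanced URLs respond"
```

For example, classify_scope(probe("https://IaaSwebnode/WAPI/api/status") == 200, probe("https://IaaSWebLBUrl/WAPI/api/status") == 200) tells you whether to talk to the networking team or carry on to the logs.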


------------------------------


3. Next I would check the IaaS web log files. The Repository.log is located under:

InstallDrive\Program Files (x86)\VMware\vCAC\Server\Model Manager Web\Logs\

 An example of normal healthy logging is included below.



Typical problems here include a corrupt web.config file, trust issues with the vRA appliances, the IaaS SQL DB not being available, a lack of disk space, and a whole host of other weird and wonderful Windows-based issues.

The Microsoft IIS logs are a common cause of full partitions on IaaS web server nodes; deleting the oldest log files will have no impact on vRA. The log files are located under:

InstallDirectory:\inetpub\logs\LogFiles\W3SVC1

There is a Microsoft article which discusses the various options available for managing these log files:

managing-iis-log-file-storage
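
For a quick manual clean-up of old IIS logs, a small script along the following lines would do the job. It is a minimal sketch rather than a recommended retention policy: the 30-day cut-off, the *.log pattern and the function names are all assumptions for this example, and it defaults to a dry run so nothing is deleted until you ask for it.

```python
import time
from pathlib import Path


def old_log_files(log_dir, max_age_days=30, now=None):
    """Return *.log files in log_dir older than max_age_days, oldest first."""
    reference = now if now is not None else time.time()
    cutoff = reference - max_age_days * 86400
    candidates = [
        path for path in Path(log_dir).glob("*.log")
        if path.is_file() and path.stat().st_mtime < cutoff
    ]
    return sorted(candidates, key=lambda path: path.stat().st_mtime)


def delete_files(paths, dry_run=True):
    """Return the total size in bytes of paths, deleting them unless dry_run."""
    freed = 0
    for path in paths:
        freed += path.stat().st_size
        if not dry_run:
            path.unlink()
    return freed
```

Run delete_files(old_log_files(log_dir)) first to see how much space would be reclaimed, then repeat with dry_run=False once you are happy with the list.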

 

If the logs don't look right, issue an IIS reset and check the log files again. To reset IIS, open a command prompt as administrator and run: iisreset






If the Repository error is too generic, you can also check the Windows event logs for further feedback on what may be causing the issue. It's also worth searching VMware's own knowledge base articles. For best results, pick out any unique-looking strings and strip anything generic or environment specific, such as hostnames and timestamps.
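
Stripping the environment-specific parts out of a log line before searching can be done by hand, but a small helper makes it repeatable. The sketch below removes ISO-style timestamps, GUIDs, IPv4 addresses and any hostnames you pass in. The patterns are assumptions about what typically appears in a log line, not a description of the actual vRA log format.

```python
import re

# Patterns for values that are specific to one environment or one moment in time.
TIMESTAMP = re.compile(r"\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}:\d{2}(?:[.,]\d+)?Z?")
GUID = re.compile(r"\b[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}\b")
IPV4 = re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b")


def search_terms(line, hostnames=()):
    """Return line with timestamps, GUIDs, IPs and given hostnames removed."""
    cleaned = TIMESTAMP.sub(" ", line)
    cleaned = GUID.sub(" ", cleaned)
    cleaned = IPV4.sub(" ", cleaned)
    for name in hostnames:
        cleaned = re.sub(re.escape(name), " ", cleaned, flags=re.IGNORECASE)
    # Collapse the gaps left behind by the removed values.
    return " ".join(cleaned.split())
```

What is left is usually the exception type and message, which tends to be the part a knowledge base search can actually match.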


If there are SSL or certificate related errors, it is worth running a Reinitiate trust operation from the vRA appliance VAMI interface; this can be accessed from the vRA Settings -> Certificate tab of the VAMI interface. This will ensure two-way trust between the vRA and IaaS components by pushing down the relevant certificates again via the IaaS Management Agents. The certificate actually in use can be compared to that shown in the VAMI interface. To check the certificate currently in use by the IaaS web component, check the certificate bound to port 443 in IIS: open IIS Manager, navigate to Default Web Site, select Bindings and edit the https binding on port 443.
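
As an alternative to clicking through IIS Manager, the certificate actually being served on port 443 can be fetched remotely and reduced to a thumbprint for comparison. The sketch below uses Python's standard ssl and hashlib modules and computes the SHA-1 hash of the DER-encoded certificate, which is what Windows displays as the thumbprint. The function names are made up for this example, and which fingerprint format the VAMI shows in your version is something to confirm yourself.

```python
import hashlib
import ssl


def sha1_thumbprint(der_bytes):
    """Return the upper-case hex SHA-1 digest of a DER-encoded certificate."""
    return hashlib.sha1(der_bytes).hexdigest().upper()


def normalize(thumbprint):
    """Strip spaces, colons and case so two displayed thumbprints compare equal."""
    return "".join(ch for ch in thumbprint if ch.isalnum()).upper()


def served_thumbprint(host, port=443):
    """Fetch the certificate presented by host:port and return its thumbprint.

    No chain validation is performed; this only reads what is being served.
    """
    pem = ssl.get_server_certificate((host, port))
    return sha1_thumbprint(ssl.PEM_cert_to_DER_cert(pem))
```

Comparing normalize(served_thumbprint("IaaSwebnode")) with the normalized value copied from the other side shows quickly whether both ends are looking at the same certificate.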




---------------------------



4. If the logs look healthy for the IaaS web, I would move on to the IaaS Manager.

The IaaS Manager All.log is located under:

InstallDrive\Program Files (x86)\VMware\vCAC\Server\Logs\

An example of normal healthy manager service logging is included below:








Typical problems here include a corrupt managerservice.exe config file, trust issues with the vRA appliances, and the IaaS SQL DB not being available. If the logging doesn't look healthy, you can restart the IaaS manager service, which is named: VMware vCloud Automation Center Service




If you found this article helpful then you may want to take a look at this proactive guide for admins wishing to improve the stability, reliability or knowledge of their vRA setup: admins-guide-to-keeping-vra-healthy
