Problem:

I hit a problem with a vCenter server (Linux appliance) that I look after: I simply could not log in to the vCenter client. On investigating, I found that the vpxd service was not running, and checking the disk usage with df -h showed that the /storage/db partition was 100% utilized.
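For anyone who would rather script the check than eyeball df -h, here is a minimal Python sketch of the same idea. The function name percent_used is my own. The figure can differ slightly from df's Use% column, because df leaves blocks reserved for root out of its calculation.

```python
import shutil


def percent_used(path):
    """Return how full the filesystem holding `path` is, as a percentage."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (in bytes)
    if usage.total == 0:
        return 0.0  # some pseudo-filesystems report a size of zero
    return 100.0 * usage.used / usage.total
```

On the appliance in this post you would call percent_used("/storage/db") and expect a value at or very near 100.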

This needs fixing before the vpxd service can be restarted. However, the problem may not be all that it seems...

Solution:

Clean the database! You would think that the database itself has become bloated with VMware logs. This may indeed be the case, but I found something which proved to be a far easier fix.

After looking on VMware's excellent knowledge base I found a couple of articles which talked about purging old entries from the database by running a few commands. Naturally I couldn't run these, because the database was so full it wouldn't even start. Getting around that meant creating a new virtual disk, moving the files to a new partition, starting the database from the new location, resizing it, moving it back to the old partition, restarting the database, restarting vpxd and deleting the temporary disk - a lot of work!

Anyway, I did this, and it removed about 5GB of junk, which I was rather disappointed with considering my database was 60GB and had eaten the entire partition. I'd managed to get back less than 10% of the space, not brilliant.

After some more poking around I found that the PostgreSQL database vCenter uses keeps a log, much like you'd expect from any Linux service. Problem was, the log folder was 35GB. Right, I think we've found the issue. I deleted the lot, and voila, all fixed! But why was there 35GB of logs in the first place?
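If you want to find out which folder is eating the partition before deleting anything, something like the following Python sketch would do it. The function names are mine, and it adds up apparent file sizes, so the totals will not match du exactly.

```python
import os


def dir_size_bytes(root):
    """Sum the apparent sizes of all regular files under `root`."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                if not os.path.islink(full):
                    total += os.path.getsize(full)
            except OSError:
                pass  # file vanished or is unreadable; skip it
    return total


def biggest_subdirs(parent, top=5):
    """Return up to `top` (size_bytes, path) pairs for the immediate
    subdirectories of `parent`, largest first."""
    sizes = []
    for entry in os.scandir(parent):
        if entry.is_dir(follow_symlinks=False):
            sizes.append((dir_size_bytes(entry.path), entry.path))
    return sorted(sizes, reverse=True)[:top]
```

Pointing biggest_subdirs at the database directory on a full partition should make an oversized log folder stand out straight away.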

This vCenter server has been running for about 18 months, which is not that long in the grand scheme of things, and it's only a small deployment of fewer than 15 ESXi hosts and a few hundred VMs. It turns out that the database logging level is set to warning by default, and for whatever reason vCenter generates a lot of warnings.

To resolve this you can change the logging level to error. The configuration file for the database is /storage/db/vpostgres/postgresql.conf, and the setting to change is log_min_messages, which should read log_min_messages = error. A restart of the server will apply the change (you can restart the database and then vpxd, but you may as well just reboot the whole server - can't hurt!).
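I made the change by hand in a text editor, but if you have several appliances to fix, the edit could be scripted along these lines. This is only a sketch: the function name set_log_min_messages is made up for illustration, and it assumes the setting sits on a single line, either active or commented out with a leading #. It works on the file's text, so reading the file and writing it back (after taking a backup copy) is left to you.

```python
import re

# Levels PostgreSQL accepts for log_min_messages.
VALID_LEVELS = {
    "debug5", "debug4", "debug3", "debug2", "debug1",
    "info", "notice", "warning", "error", "log", "fatal", "panic",
}

# An active or commented-out log_min_messages line, including any trailing comment.
_SETTING = re.compile(r"^[ \t]*#?[ \t]*log_min_messages[ \t]*=.*$", re.MULTILINE)


def set_log_min_messages(conf_text, level="error"):
    """Return `conf_text` with log_min_messages set to `level`.

    Every existing log_min_messages line is replaced; if there is none,
    the setting is appended at the end of the text.
    """
    if level not in VALID_LEVELS:
        raise ValueError(f"unknown log level: {level!r}")
    new_line = f"log_min_messages = {level}"
    if _SETTING.search(conf_text):
        return _SETTING.sub(lambda _match: new_line, conf_text)
    if conf_text and not conf_text.endswith("\n"):
        conf_text += "\n"
    return conf_text + new_line + "\n"
```

The new value still only takes effect once the database has been restarted as described above.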

Additional Information:

These articles from VMware's knowledge base contain information on how to shrink the database.