It goes without saying that logging is one of the most important items to configure in any infrastructure, and it should be part of your monitoring implementation plan. So let’s see what options we have available within vSphere.

In this post, I will be using the terms Logging/Logs in a rather generic fashion: storing *information* we could use for troubleshooting and proactive monitoring.

vSphere 5.x makes use of three different facilities (for lack of a better term) to record information which we could later use for diagnosing a problem affecting a specific component, a specific VM, or even a system crash. These are the Scratch Partition, the Core-Dump Partition and the System Logs.

So here it is – what (I think) you should really know about each of these.


I. SCRATCH PARTITION

The scratch partition is *not* used for the typical logs which most of us are familiar with. In vSphere the scratch partition is used for storing the vm-support command output (support-bundle); it is also where the logs are stored initially, by default, in those cases when the partition gets automatically created – see below. You would typically provide this information when logging a support call with VMware.

  • Configuration: you can use either the command line (ESXCLI/PowerCLI) or the GUI – see the official documentation (search for Scratch)
  • Size: 4 GB VFAT partition *per host*; exceptionally, it is only 512 MB when created in RAM (see below)
  • Not mandatory – that’s right; you can configure it, but you don’t have to. You will get a warning but it won’t stop the system from functioning.
  • Automatically installed? It depends …
    • PXE installation: no local scratch partition is configured
    • Installable installation: the scratch partition is automatically created during installation
    • Embedded installation: a scratch partition is automatically created on a local disk, if one exists; otherwise it is created in RAM (ramdisk), using 512 MB of the physical RAM installed (not recommended)
  • Reboot Is Required when configuring/changing the location of the scratch partition
  • Best Practices
    • Do configure a scratch partition; make sure it’s on persistent storage – either local-disk or shared storage; USB/SD is not considered local disk! Before rebooting, make sure you put the host in maintenance mode
    • 20 GB of storage should be enough space for a large cluster (recommended by VMware)
    • Make sure you use a folder for each host
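To make the "folder for each host" point concrete, here is a minimal sketch that only builds the command string for pointing a host at its own scratch folder; it does not run anything. The datastore name, the host names and the `.locker-<host>` folder naming are assumptions for illustration, and the folder has to exist on the datastore before the setting is applied.

```python
# Sketch only: builds command strings, nothing is executed on a host.
# "NAS-Datastore" and the host names are hypothetical examples.

def scratch_dir(datastore: str, host: str) -> str:
    """Return a per-host scratch folder on a persistent datastore."""
    return f"/vmfs/volumes/{datastore}/.locker-{host}"


def scratch_command(datastore: str, host: str) -> str:
    """Return the vim-cmd call that sets the configured scratch location.

    The new location only takes effect after the host is rebooted.
    """
    return (
        "vim-cmd hostsvc/advopt/update "
        "ScratchConfig.ConfiguredScratchLocation string "
        + scratch_dir(datastore, host)
    )


if __name__ == "__main__":
    for host in ("pESXi-01", "pESXi-02"):
        print(scratch_command("NAS-Datastore", host))
```

Each host gets its own folder, so two hosts never write into the same scratch location.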

II. CORE-DUMP PARTITION

The Core-dump partition is used for saving the core/system dump in case of a system crash. It can be set as a local partition (also called the diagnostic partition); alternatively, the ESXi hosts could be configured to send the core-dump to a remote core-dump collector server.

  • Configuration: configuring the system to store the core dump locally is fairly straightforward. The process is slightly more elaborate should you need to store the core dumps on a remote server, as you need to install the Coredump Collector first; see the official documentation (search for Managing Core Dumps)
  • Size: 110 MB per host / shared storage
  • Automatically Installed, typically during the installation process
  • Coredump Collector is bundled with VCSA; in this case, it is therefore automatically installed; it can also be manually installed on Windows OS – the package can be found on the Windows vCenter Server installation CD
  • Reboot Required – No
  • Best Practices
    • Cannot be located on an iSCSI LUN accessible via software or dependent iSCSI adapter
    • Set up the core-dump location to point to local storage
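As a rough illustration of the two options above (local partition versus remote collector), the sketch below assembles the ESXCLI command lines for each case as plain strings. The VMkernel interface name and the collector address are hypothetical, and I am assuming 6500 as the collector port; check the official documentation for your version before using any of these.

```python
from typing import List


def local_coredump_commands() -> List[str]:
    """Commands to let the host pick and enable a local diagnostic partition."""
    return [
        "esxcli system coredump partition set --enable true --smart",
        "esxcli system coredump partition get",  # verify the active partition
    ]


def network_coredump_commands(vmk: str, server_ip: str, port: int = 6500) -> List[str]:
    """Commands to send core dumps to a remote Coredump Collector."""
    return [
        "esxcli system coredump network set "
        f"--interface-name {vmk} --server-ipv4 {server_ip} --server-port {port}",
        "esxcli system coredump network set --enable true",
        "esxcli system coredump network get",  # verify the configuration
    ]


if __name__ == "__main__":
    # vmk0 and 192.0.2.20 are placeholders, not values from my environment
    for line in local_coredump_commands() + network_coredump_commands("vmk0", "192.0.2.20"):
        print(line)
```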

III. SYSTEM LOGS

The system logs are what we are used to seeing. We would normally use a remote server running Syslog and forward all logs to that server in real time. Having the logs stored on a Syslog server allows for quicker access, since most implementations provide advanced search capabilities. By default, this connection occurs over TCP or UDP, on port 514.

By default, these logs are *also* stored locally in the folder /var/log/ (linked to the Scratch partition, when one exists). This location can be changed so that all logs are stored on a remote storage location; this could be an NFS share or other shared storage available to the host.
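If you want to check that a Syslog server is actually reachable on port 514 before pointing the hosts at it, a small test message sent from any workstation can help. The sketch below formats a classic BSD-style syslog line and sends it over UDP. The message format, the facility/severity values and the server address are my own assumptions for illustration; this is not what ESXi itself sends.

```python
import socket
from datetime import datetime

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def syslog_packet(message, hostname, facility=1, severity=6, when=None):
    """Build a BSD-style syslog datagram.

    The priority value is facility * 8 + severity
    (facility 1 = user, severity 6 = informational).
    """
    when = when or datetime.now()
    pri = facility * 8 + severity
    # Timestamp layout "Mmm dd hh:mm:ss", with the day padded by a space
    stamp = f"{MONTHS[when.month - 1]} {when.day:2d} {when:%H:%M:%S}"
    return f"<{pri}>{stamp} {hostname} {message}".encode("ascii", "replace")


def send_test_message(server, port=514, hostname="test-client"):
    """Send one test datagram; UDP gives no confirmation of delivery."""
    packet = syslog_packet("syslog path test", hostname)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(packet, (server, port))


if __name__ == "__main__":
    send_test_message("192.0.2.10")  # hypothetical Syslog server address
```

Since UDP is fire-and-forget, the only way to confirm the test is to look for the message on the Syslog server itself.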

  • Configuration: to start off, should you choose to forward the logs to a Syslog server, make sure you allow Syslog on the ESXi host firewall. Also make sure no firewalls are blocking the traffic on the communication path between the ESXi host and the Syslog server. Once this has been done, you will need to edit the following advanced properties: Syslog.global.logDir, Syslog.global.logDirUnique & Syslog.global.logHost.
  • As a Syslog server, you could use either the vSphere Syslog Collector, available on the vCenter installation CD (automatically deployed along with VCSA), or any other Syslog server implementation available.
  • Size: variable
  • Automatically installed: the Syslog Collector is installed automatically when deployed with the vCenter Server Appliance (VCSA). It is also installed automatically with vCenter for Windows, depending on which installation type you’ve chosen.
  • Reboot Required: No
  • Best Practices: 
    • Make sure you do forward your system logs to a Syslog server; really, do not miss that!
    • Make sure whichever process/system you are using, you have enough space to store the logs
    • Ensure you are using an NTP server on your network – it is very frustrating to look at log messages and not to know when they actually occurred
    • Make sure your logs get stored/organized by source
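The configuration steps above can be sketched as a short list of ESXCLI commands. The snippet below only assembles the command strings: `--loghost`, `--logdir` and `--logdir-unique` correspond to the Syslog.global.logHost, logDir and logDirUnique properties, and the last two commands open the Syslog ruleset on the host firewall. The server address and the datastore path are hypothetical.

```python
from typing import List


def loghost_url(server: str, protocol: str = "udp", port: int = 514) -> str:
    """Build the value for Syslog.global.logHost, e.g. udp://192.0.2.10:514."""
    if protocol not in ("udp", "tcp", "ssl"):
        raise ValueError(f"unsupported protocol: {protocol}")
    return f"{protocol}://{server}:{port}"


def syslog_commands(server: str, logdir: str,
                    protocol: str = "udp", port: int = 514) -> List[str]:
    """Return the commands to forward logs and keep a per-host copy in logdir."""
    return [
        "esxcli system syslog config set "
        f"--loghost={loghost_url(server, protocol, port)} "
        f"--logdir={logdir} --logdir-unique=true",
        "esxcli system syslog reload",  # apply the new configuration
        "esxcli network firewall ruleset set --ruleset-id=syslog --enabled=true",
        "esxcli network firewall refresh",
    ]


if __name__ == "__main__":
    for line in syslog_commands("192.0.2.10", "/vmfs/volumes/NAS-Datastore/Logs"):
        print(line)
```

With the unique-directory option enabled, each host writes into its own sub-folder under the log directory, which takes care of the "organized by source" point for the locally stored copy.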

There are quite a few files where logs are stored. For a full list, please see this VMware article.


My Setup …

In my setup, I have created a folder on the NAS; within it I have further created two folders, Logs and Scratch, each with sub-folders matching the host names. As an example, I will find the logs for the pESXi-01 host within the <NFS-Share>/Logs/pESXi-01/ subfolder.
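For what it’s worth, the folder layout described above boils down to something like the sketch below. The share root used in the example is a placeholder, and the function only computes the paths; it does not create anything.

```python
from pathlib import PurePosixPath
from typing import Dict


def host_folders(share_root: str, host: str) -> Dict[str, str]:
    """Return the per-host Logs and Scratch folders under the share root."""
    root = PurePosixPath(share_root)
    return {
        "logs": str(root / "Logs" / host),
        "scratch": str(root / "Scratch" / host),
    }


if __name__ == "__main__":
    # "/vmfs/volumes/NAS" stands in for the actual NFS share mount point
    print(host_folders("/vmfs/volumes/NAS", "pESXi-01"))
```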

I have set up the Core-Dump location to be on one of the local hard drives on the host.

Lastly, the system logs are being forwarded to a Windows 2k8 server running Splunk.


Thank you,
Rafael A Couto Cabral