Operating on a single node, 99% of the time.

My lab setup includes a two-node Proxmox VE 5 cluster. Only one machine is powered on 24x7, and the only real use I get out of the cluster is combined management under the web interface and the ability to observe corosync traffic between the nodes for educational purposes. My second machine is used for intermittent testing of Windows Server and GPU passthrough. This setup works fine for me in a homelab setting, but a properly configured Proxmox cluster would include at least three nodes, each contributing one vote toward quorum.

If you run only one Proxmox node, you can start, stop, create, and destroy VMs and containers as you please as long as you have the correct user permissions. Once another machine is added and the two (or more) machines are clustered, most operations will require a minimum number of votes from cluster members before they can be performed. The first issue I encountered after clustering my two nodes was starting a stopped VM while one node was down. With two nodes at the default of one vote each, quorum requires both votes, so with only one node online I could not start any of my stopped machines.
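The arithmetic behind that lockout can be sketched in a few lines of shell. This is only an illustration, assuming the usual majority rule (more than half of the expected votes must be present); quorum_needed is a hypothetical helper name, not a Proxmox command.

```shell
#!/bin/sh
# Hypothetical helper: votes needed for a majority of the expected votes.
quorum_needed() {
    expected_votes=$1
    # Integer division, then add one: strictly more than half.
    echo $(( expected_votes / 2 + 1 ))
}

# Two nodes with one vote each: both votes are required.
quorum_needed 2   # prints 2

# Three votes in total: two are enough.
quorum_needed 3   # prints 2
```

With two expected votes a single one-vote node can never reach the threshold, which is why the stopped VMs would not start.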

Since most of my important services are run on one node and I do not need the High Availability (HA) features provided by a cluster, I configured my main node to contribute two votes in the quorum voting process so that I could carry on normal operations while the second node remained powered off until I needed it.

Proxmox provides the pvecm (Proxmox VE Cluster Manager) command which, among many other cluster-related settings, allows you to set the expected votes directly from the terminal. When I began revising these cluster settings, I only had one node online. I tried to edit the configuration file /etc/pve/corosync.conf on my main machine but was met with write permission errors. Proxmox is very protective of the files under /etc/pve, and my attempts to save changes were all denied.

After a few attempts I realized that because I did not have quorum in the cluster, I wasn’t allowed to make edits to the corosync configuration file. To remedy this I ran the following on my main node:

	foo@bar:~$ pvecm expected 1
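A quick way to check whether the node now considers itself quorate is to look at the output of pvecm status. The sketch below assumes that output contains a line of the form "Quorate: Yes", which is what I see on my nodes; is_quorate is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: reads `pvecm status` output on stdin and
# succeeds only if it contains a line like "Quorate:   Yes".
is_quorate() {
    grep -Eq '^Quorate:[[:space:]]+Yes'
}

# Only attempt the real check where pvecm is installed (run as root).
if command -v pvecm >/dev/null 2>&1; then
    if pvecm status | is_quorate; then
        echo "cluster is quorate"
    else
        echo "cluster is NOT quorate"
    fi
fi
```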

This change gave me quorum in the cluster and I was granted write access to /etc/pve/corosync.conf. For safety, I created a copy of the corosync configuration, made my changes in that copy (making 100% sure that I increased the config file version number by one), saved the file, then copied the new file over the existing configuration file. With two votes, my main node meets the minimum of two votes for quorum on its own. As far as I can tell, pvecm has no command for changing the votes of a node that is already in a cluster; the --votes option is only accepted when a cluster is created or a node is joined. Had I planned ahead, I could have granted my main node two votes when I first created the cluster (CLUSTERNAME is a placeholder):

	foo@bar:~$ pvecm create CLUSTERNAME --votes 2

After editing the configuration file, I set the expected votes back to two and restarted corosync:

	foo@bar:~$ pvecm expected 2  
	foo@bar:~$ systemctl restart corosync

After the restart, confirm that the changes you've made are present in the configuration file of the node you were working on. The next time you power on the second node, corosync will propagate the changes to bring the two machines into sync. I made these changes to my system back in November 2018 and I have not observed any issues to date.

For reference, the corosync.conf file for a two node system as described above would include a nodelist section that looks like the following:

	nodelist {
	  node {
	    name: node1
	    nodeid: 1
	    quorum_votes: 2
	    ring0_addr: 192.168.0.20
	  }
	  node {
	    name: node2
	    nodeid: 2
	    quorum_votes: 1
	    ring0_addr: 192.168.0.21
	  }
	}

Node1 in this example is the main node that is operational 24x7.
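To double-check that a nodelist like this gives the main node quorum on its own, the votes can be tallied with a little awk. This is a rough sketch: tally_votes is a hypothetical helper, it assumes one "quorum_votes: N" entry per node, and it runs against a scratch copy rather than the live file.

```shell
#!/bin/sh
# Hypothetical helper: sum the quorum_votes entries in a corosync.conf
# and print the total alongside the majority threshold.
tally_votes() {
    awk '
        $1 == "quorum_votes:" { total += $2 }
        END { printf "total=%d quorum=%d\n", total, int(total / 2) + 1 }
    ' "$1"
}

# Scratch copy of the example nodelist.
conf=$(mktemp)
cat > "$conf" <<'EOF'
nodelist {
  node {
    name: node1
    nodeid: 1
    quorum_votes: 2
    ring0_addr: 192.168.0.20
  }
  node {
    name: node2
    nodeid: 2
    quorum_votes: 1
    ring0_addr: 192.168.0.21
  }
}
EOF

tally_votes "$conf"   # prints: total=3 quorum=2
rm -f "$conf"
```

Three votes in total put the threshold at two, so node1 with its two votes stays quorate while node2 is powered off. Node2 alone, with one vote, does not.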

A few notes on this process:

  • When run in a cluster configuration, to provide high availability your Proxmox setup should “have at least three nodes for reliable quorum”. I have seen elsewhere that people are running corosync on a Raspberry Pi to contribute the third vote but for my homelab usage, this method was all that I needed.
  • Proxmox is very protective of files in /etc/pve/ and will recover files you have deleted if you are not quick to replace them. Making a copy of your config files, editing the copy, and copying the new file over the existing file is a good way to avoid conflicts with the recovery mechanism.
  • When editing the configuration files by hand, it is extremely important to increase the version number every time you make any changes, no matter how small. Without the change in version number, corosync will find a conflict and may overwrite your changes with an older version of the configuration files.
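The copy-and-bump step from the notes above can be scripted so the version number is never forgotten. This is a minimal sketch, assuming the version lives on a "config_version: N" line in the totem section; bump_config_version is a hypothetical helper and the demo runs on a scratch file only.

```shell
#!/bin/sh
# Hypothetical helper: write a copy of SRC to DST with config_version
# increased by one. All other lines are copied unchanged.
bump_config_version() {
    src=$1
    dst=$2
    awk '
        $1 == "config_version:" { sub(/[0-9]+/, $2 + 1) }
        { print }
    ' "$src" > "$dst"
}

# Demo on a scratch file (the cluster name is made up).
old=$(mktemp)
new=$(mktemp)
printf 'totem {\n  cluster_name: homelab\n  config_version: 3\n}\n' > "$old"
bump_config_version "$old" "$new"
grep config_version "$new"   # prints:   config_version: 4
rm -f "$old" "$new"

# On a quorate node the same idea would look roughly like this (as root):
#   bump_config_version /etc/pve/corosync.conf /etc/pve/corosync.conf.new
#   ...edit /etc/pve/corosync.conf.new, e.g. change quorum_votes...
#   mv /etc/pve/corosync.conf.new /etc/pve/corosync.conf
```

Working on a copy and moving it into place in one step means the live file is never left half-edited.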

References:

Proxmox Documentation
Proxmox HTML Manual - pvecm Command
Proxmox Wiki - Write config when not quorate