Fault injection testing
You can test the fault tolerance of your cluster by deleting a VM to inject a fault. After the VM is deleted, you can monitor the availability and recovery of the cluster.
Requirements
Ensure you meet the following requirements before using fault injection testing:
- You have connected your BigAnimal cloud account with your Azure subscription. See Setting up your Azure Marketplace account for more information.
- You have permissions in your Azure subscription to view and delete VMs.
- You have PGD CLI installed. See Installing PGD CLI for more information.
- You have created a pgd-cli-config.yml file in your home directory. See Configuring PGD CLI for more information.
Fault injection testing steps
Fault injection testing consists of the following steps:
- Verifying cluster health
- Determining the write leader node for your cluster
- Deleting a write leader node from your cluster
- Monitoring cluster health
Verifying cluster health
Use PGD CLI commands to monitor your cluster health, node information, Raft status, replication lag, and write leaders.
For more information on these commands, use pgd help. Entered with no arguments, it lists the supported commands.
For help with a specific command and its parameters, enter pgd help <command_name>.
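The health checks named in this section could be grouped into one helper, sketched below. The subcommand names (check-health, show-nodes, show-raft, show-replslots, show-groups) are assumptions based on PGD CLI 5.x; other versions may name them differently, so confirm with pgd help first. The function name pgd_health_snapshot is hypothetical.

```shell
# Minimal sketch: run the PGD CLI health-related subcommands in sequence.
# Assumes PGD CLI 5.x subcommand names; verify them with `pgd help`.
pgd_health_snapshot() {
  if ! command -v pgd >/dev/null 2>&1; then
    echo "pgd CLI not found on PATH" >&2
    return 1
  fi
  for sub in check-health show-nodes show-raft show-replslots show-groups; do
    echo "== pgd ${sub} =="
    # Stop at the first subcommand that fails so the error is easy to spot.
    pgd "${sub}" || return 1
  done
}
```

Calling pgd_health_snapshot before injecting the fault gives a baseline to compare against afterward.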
Determining the write leader node for your cluster
In this example, the write leader node is p-x67kjp3fsq-a-1.
Deleting a write leader node from your cluster
To delete a write leader node from the cluster:
Log in to BigAnimal.
In a separate browser window, log in to your Microsoft Azure subscription.
In the left navigation of the BigAnimal portal, choose Clusters.
Choose the cluster to test fault injection with and copy the string value from the URL. The string value is located after the underscore.
In your Azure subscription, paste the string into the search box and prefix it with dp- to search for the data plane.
- From the results, choose the Kubernetes service in the Azure region where your cluster is deployed.
Identify the Kubernetes service for your cluster.
Note
Don't delete the Azure Kubernetes VMSS or its sub-resources directly here.
- Browse to the Data Plane, choose Workloads, and locate the Kubernetes resources for your cluster to delete a chosen node.
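The search string used in the steps above (the part of the cluster URL after the underscore, prefixed with dp-) can be derived with shell parameter expansion. This is only a sketch: the URL below is a made-up placeholder, not a real BigAnimal address, and it assumes the value you need follows the last underscore in the URL.

```shell
# Hypothetical cluster URL copied from the browser; replace with your own.
cluster_url="https://portal.example.com/clusters/ORG_abc123xyz"

# Strip everything up to and including the last underscore.
cluster_id="${cluster_url##*_}"

# Prefix with dp- to form the data plane search term for the Azure portal.
search_term="dp-${cluster_id}"
echo "${search_term}"
```

Paste the printed value into the Azure portal search box.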
Monitoring cluster health
After deleting a cluster node, you can monitor the health of the cluster using the same PGD CLI commands that you used to verify cluster health.
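One way to watch recovery is to poll the cluster at a fixed interval, as in the sketch below. It assumes the PGD CLI 5.x subcommands check-health and show-groups (the latter is assumed to report the current write leader); the function name watch_cluster_recovery and its default interval and attempt count are hypothetical choices, not values from the product documentation.

```shell
# Minimal sketch: poll cluster health a fixed number of times.
# $1 = seconds between polls (default 10), $2 = number of polls (default 30).
watch_cluster_recovery() {
  wcr_interval="${1:-10}"
  wcr_attempts="${2:-30}"
  wcr_i=1
  while [ "$wcr_i" -le "$wcr_attempts" ]; do
    echo "--- poll ${wcr_i}/${wcr_attempts} at $(date -u +%H:%M:%SZ) ---"
    # A command may fail while the cluster is failing over; keep polling.
    pgd check-health || echo "check-health failed; cluster may still be recovering"
    pgd show-groups || echo "show-groups failed; cluster may still be recovering"
    wcr_i=$((wcr_i + 1))
    if [ "$wcr_i" -le "$wcr_attempts" ]; then
      sleep "$wcr_interval"
    fi
  done
}
```

For example, watch_cluster_recovery 5 60 polls every 5 seconds for about 5 minutes, which lets you see when a new write leader is reported and when the health checks pass again.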