All Astrolabe tests are composed of two separate YAML documents:
test-scenario- defines initial conditions and a set of operations that modify the runtime conditions of a MongoDB cluster.workload- defines the operations that the workload executor (i.e. the driver under test) should perform while Astrolabe is modifying the runtime conditions of the MongoDB cluster.
The combination of one Test Scenario and one Workload defines a unique Astrolabe test. The following sections describe the format for each type of document.
Test Scenario
There are two distinct Test Scenario file formats:
atlas-test-scenario- scenarios for testing drivers while Atlas "planned maintenance" operations are running.kubernetes-test-scenario- scenarios for testing drivers while Kubernetes container orchestration operations are running.
Atlas Planned Maintenance Test Scenario
An Atlas Planned Maintenance Test Scenario defines an initial state and a sequence of "planned maintenance" operations that modify the running conditions of a MongoDB cluster running in Atlas. An Atlas Planned Maintenance Test Scenario YAML file has the following keys:
-
initialConfiguration (document): Description of Cluster Configuration Options to be used for initializing the test cluster. This document MUST contain the following keys:
-
clusterConfiguration (document): Document containing initial Basic Configuration Options values. This document MUST, at a minimum, have all fields required by the Create One Cluster endpoint.
-
processArgs (document): Document containing initial Advanced Configuration Option values. This MAY be an empty document if the maintenance plan does not require modifying the Advanced Configuration Options.
-
operations (array): List of operations to be performed, representing the maintenance event. Each operation is a document containing one key which is the name of the operation. The possible operations are:
-
setClusterConfiguration: set the cluster configuration to the specified Cluster Configuration Options as defined in initialConfiguration. The value must be the Cluster Configuration Options which MUST contain the following keys (note that at least one of these fields MUST be a non-empty document):
- clusterConfiguration (document): Document containing final Basic Configuration Option values. This MAY be an empty document if no changes to the Basic Configuration Options are needed by the maintenance plan. If non-empty, this document MUST, at a minimum, have all fields required by the Modify One Cluster endpoint.
- processArgs (document): Document containing final Advanced Configuration Option values. This MAY be an empty document if the maintenance plan does not require modifying the Advanced Configuration Options.
Example:
setClusterConfiguration: clusterConfiguration: providerSettings: providerName: AWS regionName: US_WEST_1 instanceSizeName: M10 processArgs: {} -
testFailover: trigger an election in the cluster using the "test failover" API endpoint. The value MUST be
true.The workload executor MUST ignore the value of this key, so that the value can be changed to a hash in the future to provide options to the operation.
testFailover SHOULD be followed by sleep and waitForIdle operations because it does not update maintenance state synchronously (see PRODTRIAGE-1232).
Example:
testFailover: true -
restartVms: perform a rolling restart of all nodes in the cluster. This operation requires Atlas Global Operator API key to be set when invoking
astrolabe. The value MUST betrue.The workload executor MUST ignore the value of this key, so that the value can be changed to a hash in the future to provide options to the operation.
testFailover SHOULD be followed by sleep and waitForIdle operations because it does not update maintenance state synchronously.
Example:
restartVms: true -
assertPrimaryRegion: assert that the primary in the deployment is in the specified region. The value MUST be a hash with the following keys:
- region (string, required): the region name as defined in Atlas
API, e.g.
US_WEST_1. - timeout (floating-point number, optional): the maximum time, in seconds, to wait for the region to become the expected one. Default is 90 seconds.
This operation is undefined and MUST NOT be used when the deployment is a sharded cluster.
Example:
assertPrimaryRegion: region: US_WEST_1 timeout: 15 - region (string, required): the region name as defined in Atlas
API, e.g.
-
sleep: do nothing for the specified duration. The value MUST be the duration to sleep for, in seconds.
Example:
sleep: 10 -
waitForIdle: wait for cluster maintenance state to become "idle". The value MUST be
true.The workload executor MUST ignore the value of this key, so that the value can be changed to a hash in the future to provide options to the operation.
Example:
waitForIdle: true
For all maintenance operations other than sleep, after the
maintenance operation is performed, astrolabe will wait for cluster
state to become idle. When performing a VM restart in a sharded
cluster, due to the state not being updated for a potentially long
time, the test SHOULD add an explicit sleep operation for at least
30 seconds.
Kubernetes Test Scenario
A Kubernetes Test Scenario defines a sequence of operations that modify the running conditions or configuration of a MongoDB cluster running in Kubernetes. A Kubernetes Test Scenario YAML file has the following keys:
- operations (array): List of operations to be performed. The possible operations are:
-
kubectl: Run a command using the kubectl command line tool. The value MUST be a valid array of arguments for thekubectlcommand line tool. Note that thekubectlexecutable must be in the system PATH.Example:
kubectl: [--namespace, default, delete, pod, mongodb-0] -
sleep: Do nothing for the specified duration. The value MUST be the duration to sleep for, in seconds.Example:
sleep: 10
Workload
A Workload defines a set of operations that the workload executor (i.e. the MongoDB driver under test) will run while connected to the MongoDB cluster. The YAML document must be a single complete test as defined by the Unified Test Format specification.
The workload MUST use the loop unified test format operation to define
the MongoDB operations to execute during maintenance. There MUST be
exactly one loop operation per scenario, and it SHOULD be the last
operation in the scenario.
The scenario MUST use storeErrorsAsEntity, storeSuccessesAsEntity,
and storeIterationsAsEntity operation arguments to allow the workload
executor to retrieve errors, failures, and operation and iteration
counts for the executed workload. The entity names for these options
MUST be as follows:
storeErrorsAsEntity:errorsstoreSuccessesAsEntity:successesstoreIterationsAsEntity:iterations
The scenario MUST NOT use storeFailuresAsEntity to ensure that all
errors and failures are reported under a single errors entity
irrespective of how a test runner might distinguish errors and failures
(if at all). Note that some ValidateWorkloadExecutor tests may still use
storeFailuresAsEntity with the entity name failures to assert
workload executor correctness.
The scenario MUST use storeEventsAsEntities operation argument when
defining MongoClients to record CMAP and command events published during
maintenance. All events MUST be stored in an entity named events. When
this option is used, astrolabe will retrieve the collected events and
store them as an Evergreen build artifact, and will also calculate
statistics for command execution time and connection counts.