Experiment Executor Service

The Experiment Executor is the core component of the MiSArch Experiment Tool. It is responsible for (1) the creation and storage of experiments, (2) the execution of stored experiments by calling the executor components at the correctly scheduled time, (3) the collection and transformation of the Gatling metrics, and (4) the creation of the final Grafana dashboard and the report.

API

The Experiment Executor exposes the following REST API endpoints to manage and execute experiments.

Experiment Management

POST /experiment/generate - Generate a new experiment with a new UUID
GET /experiment/list - List all experiments with their versions
GET /experiment/{testUUID}/versions - List all versions of a specific experiment
POST /experiment/{testUUID}/{testVersion}/newVersion - Create a new version of an existing experiment
DELETE /experiment/{testUUID} - Delete an experiment with all its versions
DELETE /experiment/{testUUID}/{testVersion} - Delete a specific version of an experiment

Configuration

GET /experiment/{testUUID}/{testVersion}/chaosToolkitConfig - Get the Chaos Toolkit configuration of a specific experiment version
PUT /experiment/{testUUID}/{testVersion}/chaosToolkitConfig - Update the Chaos Toolkit configuration of a specific experiment version
GET /experiment/{testUUID}/{testVersion}/misarchExperimentConfig - Get the MiSArch Experiment Config of a specific experiment version
PUT /experiment/{testUUID}/{testVersion}/misarchExperimentConfig - Update the MiSArch Experiment Config of a specific experiment version
GET /experiment/{testUUID}/{testVersion}/gatlingConfig - Get the Gatling configuration of a specific experiment version
PUT /experiment/{testUUID}/{testVersion}/gatlingConfig - Update the Gatling configuration of a specific experiment version
GET /experiment/{testUUID}/{testVersion}/config - Get the global experiment configuration of a specific experiment version
PUT /experiment/{testUUID}/{testVersion}/config - Update the global experiment configuration of a specific experiment version
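
The four configuration resources share one path pattern, so a client can build all of the requests above from a single helper. The following is a minimal sketch using the JDK's `java.net.http` request builder. The base URL, the `ConfigKind` enum, and the `contentType` parameter are assumptions made for illustration; the payload format of each configuration is not specified here.

```kotlin
import java.net.URI
import java.net.http.HttpRequest

// Hypothetical enum: one entry per configuration resource listed above.
enum class ConfigKind(val segment: String) {
    CHAOS_TOOLKIT("chaosToolkitConfig"),
    MISARCH("misarchExperimentConfig"),
    GATLING("gatlingConfig"),
    GLOBAL("config"),
}

// Builds /experiment/{testUUID}/{testVersion}/{segment} below an assumed base URL.
fun configUri(baseUrl: String, testUUID: String, testVersion: String, kind: ConfigKind): URI =
    URI.create("${baseUrl.trimEnd('/')}/experiment/$testUUID/$testVersion/${kind.segment}")

fun getConfigRequest(baseUrl: String, testUUID: String, testVersion: String, kind: ConfigKind): HttpRequest =
    HttpRequest.newBuilder(configUri(baseUrl, testUUID, testVersion, kind)).GET().build()

// contentType is left to the caller because the expected format is an assumption.
fun putConfigRequest(
    baseUrl: String,
    testUUID: String,
    testVersion: String,
    kind: ConfigKind,
    body: String,
    contentType: String,
): HttpRequest =
    HttpRequest.newBuilder(configUri(baseUrl, testUUID, testVersion, kind))
        .header("Content-Type", contentType)
        .PUT(HttpRequest.BodyPublishers.ofString(body))
        .build()
```

The resulting requests can be sent with `java.net.http.HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`.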

Execution

POST /experiment/start
POST /experiment/{testUUID}/{testVersion}/start - Start the execution of a specific experiment version
POST /experiment/{testUUID}/{testVersion}/stop - Stop the execution of a specific experiment version
GET /experiment/{testUUID}/{testVersion}/events - Register for server-sent events to get experiment execution updates
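
A client of the events endpoint receives a server-sent event stream. As a rough sketch, the standard framing of such a stream (`event:` and `data:` fields, events separated by blank lines) can be parsed as shown below. The `SseEvent` type is hypothetical, the event names and payloads emitted by the Experiment Executor are not specified here, and the `id` and `retry` fields are deliberately ignored.

```kotlin
// Hypothetical holder for one parsed event; event is null when no "event:" field was sent.
data class SseEvent(val event: String?, val data: String)

// Parses the lines of a text/event-stream body into events.
fun parseSse(lines: Sequence<String>): Sequence<SseEvent> = sequence {
    var event: String? = null
    val data = mutableListOf<String>()
    for (line in lines) {
        when {
            // A blank line terminates the current event.
            line.isEmpty() -> {
                if (data.isNotEmpty()) yield(SseEvent(event, data.joinToString("\n")))
                event = null
                data.clear()
            }
            // Lines starting with a colon are comments, often used as keep-alives.
            line.startsWith(":") -> Unit
            else -> {
                val field = line.substringBefore(':')
                // Only a single leading space after the colon is stripped.
                val value = line.substringAfter(':', "").removePrefix(" ")
                when (field) {
                    "event" -> event = value
                    "data" -> data += value
                }
            }
        }
    }
}
```

In practice the line sequence would come from the response body of the `events` request, for example via `BufferedReader.lineSequence()`.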

Synchronization & Metrics

POST /trigger/{testUUID}/{testVersion} - Register a component (Gatling Executor, Chaos Toolkit Executor, MiSArch Experiment Config) as ready
GET /trigger/{testUUID}/{testVersion} - Poll if the experiment can start
POST /experiment/{testUUID}/{testVersion}/gatling/metrics/steadyState - Forward steady-state metrics from Gatling Executor

Technology Stack

Language: Kotlin
Framework: Spring Boot
Asynchronous Processing: Spring WebFlux + Kotlin Coroutines

Repository Structure

The repository is structured as follows:

/src/: Source code of the service
- config/: Package that includes several configuration classes
- controller/: Package that includes all REST controllers
  - experiment/: Different controllers for the experiment lifecycle
- service/: Package for all service classes containing the actual business logic
- model/: Package that includes the main data model
  - ExperimentConfig: The global experiment configuration schema
- plugin/: Package that includes all plugin classes for the different technologies
  - export/: Plugins for Grafana export, LLM export, and report generation
  - failure/: Plugins for failure execution with the Chaos Toolkit Executor and MiSArch Experiment Config
  - workload/: Plugin for workload execution with the Gatling Executor
  - metrics/: Plugins for transforming and storing metrics from Prometheus and Gatling

Experiment Execution Process

The following steps describe the workflow that is executed when an experiment is started.

Starting an Experiment

Initiation:
- The experiment is started via an API call or the UI button.
- The Experiment Executor loads the experiment configuration from persistent storage.
- If no execution of the same version is already running, a temporary execution state is created in memory.
Component Preparation:
- Sends HTTP requests:
  - Failure configuration → Chaos Toolkit Executor
  - Workload configuration → Gatling Executor
  - Reset failures → MiSArch Experiment Config
- Waits for all components to be ready.
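
The rule that only one execution per experiment version may run at a time can be sketched with an atomic insert into an in-memory map. All names below (`ExperimentKey`, `ExecutionState`, `ExecutionRegistry`) are hypothetical and only illustrate the idea; the actual state held by the service is not described here.

```kotlin
import java.util.concurrent.ConcurrentHashMap

// Hypothetical key: one entry per experiment version.
data class ExperimentKey(val testUUID: String, val testVersion: String)

// Hypothetical temporary state; the real service may track considerably more.
class ExecutionState(val createdAtMillis: Long = System.currentTimeMillis())

class ExecutionRegistry {
    private val running = ConcurrentHashMap<ExperimentKey, ExecutionState>()

    // Returns true if the state was created, false if this version is already running.
    // putIfAbsent is atomic, so two concurrent start requests cannot both succeed.
    fun tryStart(key: ExperimentKey): Boolean =
        running.putIfAbsent(key, ExecutionState()) == null

    fun isRunning(key: ExperimentKey): Boolean = running.containsKey(key)

    // Called on completion or stop to clear the temporary state.
    fun clear(key: ExperimentKey) {
        running.remove(key)
    }
}
```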

Synchronization & Registration

Endpoints Provided:
1. Register a component as ready (POST /trigger/{testUUID}/{testVersion}).
2. Poll whether the experiment can start (GET /trigger/{testUUID}/{testVersion}).
Readiness:
- All three components (Gatling Executor, Chaos Toolkit Executor, MiSArch Experiment Config) must register.
- Each component polls every 100 ms; the experiment starts once all three are ready, so the components start at most 300 ms apart.
Scheduling:
- Failure scheduling is handled by Experiment Executor.
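
The register-and-poll handshake described above can be sketched as a small readiness tracker plus a polling loop. This is an illustration under assumptions, not the service's implementation: `Component`, `ReadinessTracker`, and `pollUntilReady` are hypothetical names, and the `maxAttempts` limit is added here only so that the loop terminates.

```kotlin
import java.util.EnumSet

// The three components that must register before an experiment can start.
enum class Component { GATLING_EXECUTOR, CHAOS_TOOLKIT_EXECUTOR, MISARCH_EXPERIMENT_CONFIG }

class ReadinessTracker {
    private val ready: EnumSet<Component> = EnumSet.noneOf(Component::class.java)

    // Corresponds to the "register as ready" endpoint; registering twice is harmless.
    @Synchronized
    fun register(component: Component) {
        ready.add(component)
    }

    // Corresponds to the "poll" endpoint: true once all components have registered.
    @Synchronized
    fun canStart(): Boolean = ready.size == Component.values().size
}

// Component-side polling loop with the 100 ms interval mentioned above.
// Returns false if the experiment did not become startable within maxAttempts polls.
fun pollUntilReady(intervalMillis: Long = 100, maxAttempts: Int, canStart: () -> Boolean): Boolean {
    repeat(maxAttempts) {
        if (canStart()) return true
        Thread.sleep(intervalMillis)
    }
    return false
}
```

A component would call `pollUntilReady(maxAttempts = 600) { /* GET /trigger/... */ }` after registering, replacing the lambda with the actual HTTP poll.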

Special Handling

Warm-up / Steady-State Hypothesis:
- The Gatling Executor runs the warm-up and the steady-state hypothesis before registering as ready.
- The resulting steady-state metrics are forwarded to the Experiment Executor, which calculates the thresholds and stores the goals.

Execution Phase

Start:
- The start timestamp is recorded.
- Failure injection is managed by the Experiment Executor.
Component Responsibilities:
- The Chaos Toolkit Executor and the Gatling Executor handle their respective tasks.

Completion & Reporting

Metrics Collection:
- Gatling Executor sends HTML and JS metrics files.
- Experiment Executor marks completion, clears state, and transforms metrics for InfluxDB.
Report Generation:
- Stores execution timestamps, goals, and threshold violations.
- Optionally stores raw metrics and queries Prometheus for additional data.
Dashboard Creation:
- Grafana dashboard is parameterized and forwarded.
- The frontend is notified via server-sent events with the dashboard URL.
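
To illustrate the metrics transformation step, the sketch below formats one data point in InfluxDB line protocol (`measurement,tags fields timestamp`). How the Gatling HTML and JS files are parsed is not shown, and the measurement, tag, and field names in the usage example are made up for illustration.

```kotlin
// Tag keys, tag values, and field keys escape commas, equals signs, and spaces.
fun escapeKey(s: String): String =
    s.replace(",", "\\,").replace("=", "\\=").replace(" ", "\\ ")

// Measurement names escape commas and spaces only.
fun escapeMeasurement(s: String): String =
    s.replace(",", "\\,").replace(" ", "\\ ")

// Formats a single point with float fields and a nanosecond timestamp.
fun toLineProtocol(
    measurement: String,
    tags: Map<String, String>,
    fields: Map<String, Double>,
    timestampNanos: Long,
): String {
    require(fields.isNotEmpty()) { "a point needs at least one field" }
    // Tags are sorted by key, which InfluxDB recommends for write performance.
    val tagPart = tags.toSortedMap().entries
        .joinToString("") { (k, v) -> ",${escapeKey(k)}=${escapeKey(v)}" }
    val fieldPart = fields.entries
        .joinToString(",") { (k, v) -> "${escapeKey(k)}=$v" }
    return "${escapeMeasurement(measurement)}$tagPart $fieldPart $timestampNanos"
}
```

For example, `toLineProtocol("gatling_response_time", mapOf("testUUID" to "abc", "testVersion" to "v1"), mapOf("mean_ms" to 12.5), 1700000000000000000L)` yields `gatling_response_time,testUUID=abc,testVersion=v1 mean_ms=12.5 1700000000000000000`.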

Stopping an Experiment

Stop Endpoint:
- Terminates the MiSArch Experiment Config thread.
- Calls stop endpoints for Gatling Executor and Chaos Toolkit Executor.
- Clears experiment state.
