Experiment Executor Service
The Experiment Executor is the core component of the MiSArch Experiment Tool. It is responsible for (1) the creation and storage of experiments, (2) the execution of stored experiments by calling the executor components at the correctly scheduled time, (3) the collection and transformation of the Gatling metrics, and (4) the creation of the final Grafana dashboard and the report.
API
The Experiment Executor exposes the following REST API endpoints to manage and execute experiments.
Experiment Management
POST /experiment/generate
- Generate a new experiment with a new UUIDGET /experiment/list
- List all experiments with their versionsGET /experiment/{testUUID}/versions
- List all versions of a specific experimentPOST /experiment/{testUUID}/{testVersion}/newVersion
- Create a new version of an existing experimentDELETE /experiment/{testUUID}
- Delete an experiment with all its versionsDELETE /experiment/{testUUID}/{testVersion}
- Delete a specific version of an experiment
Configuration
GET /experiment/{testUUID}/{testVersion}/chaosToolkitConfig
- Get the Chaos Toolkit configuration of a specific experiment versionPUT /experiment/{testUUID}/{testVersion}/chaosToolkitConfig
- Update the Chaos Toolkit configuration of a specific experiment versionGET /experiment/{testUUID}/{testVersion}/misarchExperimentConfig
- Get the MiSArch Experiment Config of a specific experiment versionPUT /experiment/{testUUID}/{testVersion}/misarchExperimentConfig
- Update the MiSArch Experiment Config of a specific experiment versionGET /experiment/{testUUID}/{testVersion}/gatlingConfig
- Get the Gatling configuration of a specific experiment versionPUT /experiment/{testUUID}/{testVersion}/gatlingConfig
- Update the Gatling configuration of a specific experiment versionGET /experiment/{testUUID}/{testVersion}/config
- Get the global experiment configuration of a specific experiment versionPUT /experiment/{testUUID}/{testVersion}/config
- Update the global experiment configuration of a specific experiment version
Execution
POST /experiment/start
POST /experiment/{testUUID}/{testVersion}/start
- Start the execution of a specific experiment versionPOST /experiment/{testUUID}/{testVersion}/stop
- Stop the execution of a specific experiment versionGET /experiment/{testUUID}/{testVersion}/events
- Register for server-sent events to get experiment execution updates
Synchronization & Metrics
POST /trigger/{testUUID}/{testVersion}
- Register a component (Gatling Executor, Chaos Toolkit Executor, MiSArch Experiment Config) as readyGET /trigger/{testUUID}/{testVersion}
- Poll if the experiment can startPOST /experiment/{testUUID}/{testVersion}/gatling/metrics/steadyState
- Forward steady-state metrics from Gatling Executor
Technology Stack
- Language: Kotlin
- Framework: Spring Boot
- Asynchronous Processing: Spring WebFlux + Kotlin Coroutines
Repository Structure
The repository is structured as follows:
/src/
: Source code of the serviceconfig
: Package that includes several configuration classescontroller/
: Package that includes all REST controllersexperiment/
: Different controllers for the experiment lifecycle
service/
: Package for all service classes containing the actual business logicmodel/
: Package that includes the main data modelExperimentCofig
: The global experiment configuration schema
plugin
: Package that includes all plugin classes for the different technologiesexport
: Plugins for Grafana export, LLM export and report generationfailure
: Plugins for failure execution with Chaos Toolkit Executor and MiSArch Experiment Configworkload
: Plugin for Gatling Executor workload executionmetrics
: Plugins for metrics transformation and storage from Prometheus and Gatling
Experiment Execution Process
The following steps describe the workflow that is executed when an experiment is started.
Starting an Experiment
- Initiation:
- Start via API call or UI button.
- Loads experiment configuration from persistent storage.
- If no execution for the same version is running, a temporary state is created in memory.
- Component Preparation:
- Sends HTTP requests:
- Failure configuration → Chaos Toolkit Executor
- Workload configuration → Gatling Executor
- Reset failures → MiSArch Experiment Config
- Waits for all components to be ready.
- Sends HTTP requests:
Synchronization & Registration
- Endpoints Provided:
- Register component as ready.
- Poll if experiment can start.
- Readiness:
- All three components (Gatling Executor, Chaos Toolkit Executor, MiSArch Experiment Configuration) must register.
- Polling every 100 ms; experiment starts when all are ready (max 300 ms diff).
- Scheduling:
- Failure scheduling is handled by Experiment Executor.
Special Handling
- Warm-up / Steady-State Hypothesis:
- Gatling Executor runs these before registering as ready.
- Metrics are forwarded to Experiment Executor for threshold calculation and goal storage.
Execution Phase
- Start:
- Timestamp is marked.
- Failure injection is managed by Experiment Executor.
- Component Responsibilities:
- Chaos Toolkit Executor and Gatling Executor handle their respective tasks.
Completion & Reporting
- Metrics Collection:
- Gatling Executor sends HTML and JS metrics files.
- Experiment Executor marks completion, clears state, and transforms metrics for InfluxDB.
- Report Generation:
- Stores execution timestamps, goals, and threshold violations.
- Optionally stores raw metrics and queries Prometheus for additional data.
- Dashboard Creation:
- Grafana dashboard is parameterized and forwarded.
- Frontend is notified via server-side events with dashboard URL.
Stopping an Experiment
- Stop Endpoint:
- Terminates MiSArch Experiment Configuration thread.
- Calls stop endpoints for Gatling Executor and Chaos Toolkit Executor.
- Clears experiment state.