Introducing Gradual: A Concurrency-First Stress Testing Framework

February 1, 2024
4 min read
Stress Testing, Performance, Python, Concurrency, Open Source

A deep dive into Gradual, a stress testing framework I'm building that focuses on concurrent user simulation rather than traditional RPS-based testing.

Traditional stress testing tools focus on Requests Per Second (RPS) as the primary metric. While RPS is important, it doesn't tell the full story of how your system behaves under concurrent load. This is especially true in high-latency environments like fintech, where concurrent users can create bottlenecks that simple RPS testing might miss.

That's why I'm building Gradual—a concurrency-first stress testing framework designed to simulate real-world user load patterns.

The Problem with RPS-Based Testing

Reduced to its essentials, an RPS-driven test looks like this:

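# Traditional RPS-based approach
import time

for i in range(1000):  # 1000 requests, sent one after another
    send_request()  # placeholder for the actual request call
    time.sleep(1 / 1000)  # 1 ms pause: a target of at most ~1000 RPS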

This approach has several limitations:

  1. Unrealistic Load Patterns: Real users don't make requests at perfectly spaced intervals
  2. Misses Concurrency Issues: Doesn't simulate multiple users hitting the system simultaneously
  3. Poor Latency Simulation: Doesn't account for network latency and user think time
  4. Limited Bottleneck Discovery: May miss issues that only appear under concurrent access
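
One way to see why a fixed request rate and a fixed number of users are not interchangeable is Little's Law, which relates the average number of requests in flight to throughput and latency. The sketch below is illustrative arithmetic only; the function names and the example numbers are my own assumptions and are not part of Gradual.

```python
def in_flight_requests(rps: float, latency_s: float) -> float:
    """Little's Law: average requests in flight = arrival rate * time in system."""
    return rps * latency_s


def closed_loop_rps(users: int, latency_s: float, think_time_s: float) -> float:
    """Throughput produced by a fixed pool of users who each wait for a
    response, pause for think time, and then send the next request."""
    return users / (latency_s + think_time_s)


# At a fixed 1000 RPS, the number of overlapping requests depends on latency.
fast = in_flight_requests(1000, 0.05)  # 50 ms responses
slow = in_flight_requests(1000, 0.5)   # 500 ms responses

# With a fixed pool of 100 users, throughput falls as the system slows down.
healthy = closed_loop_rps(100, 0.5, 4.5)
degraded = closed_loop_rps(100, 5.5, 4.5)
```

With these numbers, the same 1000 RPS target means roughly 50 overlapping requests at 50 ms latency but roughly 500 at 500 ms, while the same 100 users generate about 20 RPS when healthy and about 10 RPS when degraded. The request rate alone does not pin down how much concurrent pressure the system is under.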

Enter Gradual: Concurrency-First Design

Gradual takes a different approach by focusing on concurrent users rather than requests per second:

# gradual.yaml
config:
  name: "User Login Flow"
  duration: "5m"
  ramp_up: "additive"  # or "multiplicative"
  
users:
  - name: "authenticated_user"
    concurrency: 100
    behavior:
      - action: "login"
        think_time: "2-5s"  # Random think time
      - action: "browse_dashboard"
        think_time: "5-15s"
      - action: "view_reports"
        think_time: "3-8s"
      - action: "logout"
        think_time: "1-3s"
    
  - name: "guest_user"
    concurrency: 50
    behavior:
      - action: "view_public_data"
        think_time: "1-3s"
      - action: "register"
        think_time: "10-30s"
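
To make the idea of "concurrent users with think time" concrete, here is a minimal asyncio sketch of the guest_user flow from the config above. It is not Gradual's implementation: `simulated_user`, `run_users`, and the tuple layout for actions are hypothetical names chosen for illustration, the think time is assumed to be uniformly distributed, and no real requests are sent.

```python
import asyncio
import random


async def simulated_user(user_id: int,
                         actions: list[tuple[str, float, float]],
                         log: list[tuple[int, str]]) -> None:
    """Run one user's actions in order, pausing for think time after each."""
    for name, think_min, think_max in actions:
        # A real runner would send the request for `name` here.
        log.append((user_id, name))
        await asyncio.sleep(random.uniform(think_min, think_max))


async def run_users(concurrency: int, actions) -> list[tuple[int, str]]:
    """Start `concurrency` users at once and wait for all of them to finish."""
    log: list[tuple[int, str]] = []
    await asyncio.gather(
        *(simulated_user(i, actions, log) for i in range(concurrency))
    )
    return log


# Think times are scaled down to milliseconds so the sketch finishes quickly.
guest_flow = [("view_public_data", 0.001, 0.003), ("register", 0.010, 0.030)]
events = asyncio.run(run_users(50, guest_flow))
```

Because every user pauses for a different random interval, the requests interleave unevenly instead of arriving at fixed spacing, which is the load shape the YAML config is meant to describe.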

Key Features

1. YAML-Based Configuration

No more writing complex tick functions. Define your test scenarios in simple YAML:

actions:
  login:
    method: "POST"
    url: "/api/auth/login"
    headers:
      Content-Type: "application/json"
    body:
      username: "{{user.username}}"
      password: "{{user.password}}"
    assertions:
      - status_code: 200
      - response_time: "< 500ms"
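
The `{{user.username}}` placeholders imply some form of template substitution before a request is sent. The sketch below shows one simple way that could work, assuming dotted paths are looked up in nested dictionaries; `render` and the context layout are hypothetical and may differ from what Gradual actually does.

```python
import re

# Matches placeholders such as {{user.username}} or {{ user.password }}.
_PLACEHOLDER = re.compile(r"\{\{\s*([\w.]+)\s*\}\}")


def render(template: str, context: dict) -> str:
    """Replace each {{dotted.path}} with the value found in `context`."""
    def lookup(match: re.Match) -> str:
        value = context
        for key in match.group(1).split("."):
            value = value[key]  # raises KeyError for an unknown path
        return str(value)

    return _PLACEHOLDER.sub(lookup, template)


context = {"user": {"username": "alice", "password": "s3cret"}}
body = {
    "username": render("{{user.username}}", context),
    "password": render("{{user.password}}", context),
}
```

Failing loudly on an unknown path is a deliberate choice in this sketch: a silently empty username would produce misleading login failures in the test results.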

2. Flexible Ramp-Up Strategies

Gradual supports both additive and multiplicative concurrency ramping:

# Additive: Add 10 users every 30 seconds
ramp_config = {
    "strategy": "additive",
    "initial": 10,
    "increment": 10,
    "interval": "30s",
    "max_concurrency": 100
}

# Multiplicative: Double users every minute
ramp_config = {
    "strategy": "multiplicative",
    "initial": 5,
    "multiplier": 2,
    "interval": "1m",
    "max_concurrency": 80
}
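
To show what the two strategies produce over time, the following sketch computes the concurrency level at each interval. `ramp_schedule` and its `step` parameter are hypothetical names for illustration, and capping at `max_concurrency` is my assumption about how the limit is applied.

```python
def ramp_schedule(strategy: str, initial: int, step: int,
                  max_concurrency: int) -> list[int]:
    """Return the concurrency level at each interval until the cap is reached.

    `step` is the increment for "additive" and the multiplier for
    "multiplicative".
    """
    if strategy not in ("additive", "multiplicative"):
        raise ValueError(f"unknown strategy: {strategy!r}")
    if initial <= 0 or step <= 0:
        raise ValueError("initial and step must be positive")
    if strategy == "multiplicative" and step <= 1:
        raise ValueError("multiplier must be greater than 1")

    levels = [min(initial, max_concurrency)]
    while levels[-1] < max_concurrency:
        if strategy == "additive":
            nxt = levels[-1] + step
        else:
            nxt = levels[-1] * step
        levels.append(min(nxt, max_concurrency))  # never exceed the cap
    return levels


additive = ramp_schedule("additive", 10, 10, 100)
# → [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
multiplicative = ramp_schedule("multiplicative", 5, 2, 80)
# → [5, 10, 20, 40, 80]
```

With the intervals from the two configs, the additive ramp reaches 100 users after nine 30-second steps (4.5 minutes), while the multiplicative ramp reaches 80 users after four 1-minute steps.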

3. Realistic User Behavior

Simulate real user patterns with think times and conditional logic:

behavior:
  - action: "login"
    think_time: "2-5s"
    success_rate: 0.95  # 95% success rate
    
  - action: "conditional_action"
    condition: "{{response.status == 'success'}}"
    think_time: "5-10s"
    
  - action: "error_handling"
    condition: "{{response.status == 'error'}}"
    think_time: "1-2s"
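
A think-time string such as "2-5s" has to be turned into an actual pause somewhere. Here is a small sketch of how that parsing might look, assuming the format is always `<min>-<max>s` and that the delay is drawn uniformly from the range; `parse_think_time` and `sample_think_time` are hypothetical helpers, not Gradual functions.

```python
import random
import re

# Accepts ranges like "2-5s" or "0.5-1.5s".
_THINK_TIME = re.compile(r"^(\d+(?:\.\d+)?)-(\d+(?:\.\d+)?)s$")


def parse_think_time(spec: str) -> tuple[float, float]:
    """Parse "<min>-<max>s" into a (min_seconds, max_seconds) pair."""
    match = _THINK_TIME.match(spec.strip())
    if match is None:
        raise ValueError(f"invalid think time: {spec!r}")
    low, high = float(match.group(1)), float(match.group(2))
    if low > high:
        raise ValueError(f"think time range is reversed: {spec!r}")
    return low, high


def sample_think_time(spec: str) -> float:
    """Pick a random pause, in seconds, from the range in `spec`."""
    low, high = parse_think_time(spec)
    return random.uniform(low, high)


bounds = parse_think_time("2-5s")   # → (2.0, 5.0)
pause = sample_think_time("2-5s")   # some value between 2.0 and 5.0
```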

4. Pluggable Reporting

Gradual uses an adapter-based reporting system:

from gradual.reporting import BaseReporter

class CustomReporter(BaseReporter):
    def on_request_complete(self, request):
        # Custom logic for each completed request
        self.metrics.record_latency(request.duration)
        self.metrics.record_status(request.status_code)
    
    def on_test_complete(self, summary):
        # Generate custom report
        self.generate_html_report(summary)
        self.send_slack_notification(summary)
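
For a sense of what a reporter might aggregate, here is a self-contained stand-in that collects latencies and status codes and summarizes them with percentiles. It deliberately does not subclass Gradual's `BaseReporter`; the class name, method signatures, and summary keys are all assumptions made for this sketch.

```python
from statistics import quantiles


class InMemoryReporter:
    """Stand-in reporter that keeps every measurement in memory."""

    def __init__(self) -> None:
        self.latencies_ms: list[float] = []
        self.status_counts: dict[int, int] = {}

    def on_request_complete(self, duration_ms: float, status_code: int) -> None:
        """Record one finished request."""
        self.latencies_ms.append(duration_ms)
        self.status_counts[status_code] = self.status_counts.get(status_code, 0) + 1

    def summary(self) -> dict:
        """Summarize recorded requests (needs at least two data points)."""
        # quantiles(n=100) returns the 99 cut points between percentiles.
        cuts = quantiles(self.latencies_ms, n=100, method="inclusive")
        return {
            "requests": len(self.latencies_ms),
            "p50_ms": cuts[49],
            "p95_ms": cuts[94],
            "statuses": dict(self.status_counts),
        }


reporter = InMemoryReporter()
for ms in range(1, 102):  # synthetic latencies: 1 ms .. 101 ms
    reporter.on_request_complete(ms, 200 if ms <= 96 else 500)
report = reporter.summary()
```

Reporting percentiles rather than an average matters under concurrent load, because a handful of slow requests can hide behind a healthy-looking mean.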

Architecture Overview

Gradual is built using clean architecture principles:

gradual/
├── core/           # Domain logic
│   ├── user.py     # User simulation
│   ├── action.py   # Action execution
│   └── metrics.py  # Performance metrics
├── adapters/       # External integrations
│   ├── http.py     # HTTP client
│   ├── reporting/  # Report generators
│   └── storage/    # Data persistence
├── config/         # Configuration parsing
└── cli/            # Command-line interface

Example Usage

# Run a simple test
gradual run --config test.yaml

# Run with custom reporting
gradual run --config test.yaml --reporter custom

# Run with real-time monitoring
gradual run --config test.yaml --monitor

Why This Matters for Fintech

In fintech applications, concurrent access patterns are crucial:

  1. Trading Systems: Multiple traders accessing the same data simultaneously
  2. Risk Management: Concurrent risk calculations across portfolios
  3. Reporting: Multiple users generating reports at the same time
  4. Data Ingestion: Concurrent data feeds being processed

Gradual helps identify bottlenecks in these scenarios that traditional RPS testing might miss.

Roadmap

Gradual is currently in active development. Planned features include:

  • Distributed Testing: Run tests across multiple machines
  • Real-time Monitoring: Live dashboard during test execution
  • Integration Testing: Support for complex multi-service scenarios
  • Performance Baselines: Automatic regression detection
  • Cloud Integration: Native support for AWS, GCP, Azure

Getting Involved

Gradual is open source, and I'm actively looking for contributors interested in:

  • Stress testing and performance engineering
  • Python and async programming
  • Building developer tools
  • Fintech and high-performance systems

Check out the project on GitHub and let's build something amazing together!

Conclusion

Gradual represents a shift in how we think about stress testing—from simple request counting to realistic user simulation. By focusing on concurrency and user behavior, we can discover performance issues that traditional tools might miss.

The framework is still in early development, but the core concepts are solid. I'm excited to see how it evolves and helps teams build more robust, scalable systems.

Stay tuned for more updates on Gradual's development, and feel free to reach out if you'd like to contribute or learn more about stress testing strategies!