Introducing Gradual: A Concurrency-First Stress Testing Framework
Traditional stress testing tools focus on Requests Per Second (RPS) as the primary metric. While RPS is important, it doesn't tell the full story of how your system behaves under concurrent load. This is especially true in high-latency environments like fintech, where concurrent users can create bottlenecks that simple RPS testing might miss.
That's why I'm building Gradual, a concurrency-first stress testing framework designed to simulate real-world user load patterns.
The Problem with RPS-Based Testing
Most stress testing tools work like this:
# Traditional RPS-based approach
import time

for i in range(1000):      # 1000 requests
    send_request()         # placeholder for the actual HTTP call
    time.sleep(1 / 1000)   # fixed 1 ms gap, so at most ~1000 RPS
This approach has several limitations:
- Unrealistic Load Patterns: Real users don't make requests at perfectly spaced intervals
- Misses Concurrency Issues: Doesn't simulate multiple users hitting the system simultaneously
- Poor Latency Simulation: Doesn't account for network latency and user think time
- Limited Bottleneck Discovery: May miss issues that only appear under concurrent access
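By contrast, a concurrency-first loop keeps a fixed number of simulated users active at once, each pausing for a random think time between requests. The following is a minimal sketch of that idea, not Gradual's implementation: `fake_request`, `simulated_user`, and `run_users` are hypothetical names, and the request itself is stubbed with a short sleep.

```python
import asyncio
import random


async def fake_request() -> float:
    """Stand-in for a real HTTP call; returns the simulated latency in seconds."""
    latency = random.uniform(0.001, 0.005)
    await asyncio.sleep(latency)
    return latency


async def simulated_user(iterations: int, think_range: tuple) -> list:
    """One user: send a request, then pause for a random think time."""
    latencies = []
    for _ in range(iterations):
        latencies.append(await fake_request())
        await asyncio.sleep(random.uniform(*think_range))
    return latencies


async def run_users(concurrency: int, iterations: int,
                    think_range: tuple = (0.001, 0.003)) -> list:
    """Run `concurrency` users at the same time and collect every latency."""
    per_user = await asyncio.gather(
        *(simulated_user(iterations, think_range) for _ in range(concurrency))
    )
    return [latency for user in per_user for latency in user]


if __name__ == "__main__":
    results = asyncio.run(run_users(concurrency=20, iterations=3))
    print(f"{len(results)} requests completed by 20 concurrent users")
```

The load here is defined by how many users are in flight, so the request rate falls out of latency and think time instead of being set directly.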
Enter Gradual: Concurrency-First Design
Gradual takes a different approach by focusing on concurrent users rather than requests per second:
# gradual.yaml
config:
  name: "User Login Flow"
  duration: "5m"
  ramp_up: "additive"  # or "multiplicative"
users:
  - name: "authenticated_user"
    concurrency: 100
    behavior:
      - action: "login"
        think_time: "2-5s"  # Random think time
      - action: "browse_dashboard"
        think_time: "5-15s"
      - action: "view_reports"
        think_time: "3-8s"
      - action: "logout"
        think_time: "1-3s"
  - name: "guest_user"
    concurrency: 50
    behavior:
      - action: "view_public_data"
        think_time: "1-3s"
      - action: "register"
        think_time: "10-30s"
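To make the `think_time` ranges concrete, here is one way a string such as "2-5s" could be turned into a random pause. This is a sketch under assumptions: `parse_think_time` and `sample_think_time` are hypothetical helpers, and support for the `ms` and `m` suffixes is a guess based on the other duration strings in this post, not confirmed Gradual behavior.

```python
import random
import re
from typing import Optional, Tuple

# Matches "<low>-<high><unit>", for example "2-5s" or "100-250ms" (assumed syntax).
_RANGE = re.compile(r"^(\d+(?:\.\d+)?)-(\d+(?:\.\d+)?)(ms|s|m)$")
_UNIT_SECONDS = {"ms": 0.001, "s": 1.0, "m": 60.0}


def parse_think_time(spec: str) -> Tuple[float, float]:
    """Convert a range such as "2-5s" into (low, high) in seconds."""
    match = _RANGE.match(spec.strip())
    if match is None:
        raise ValueError(f"unrecognized think_time: {spec!r}")
    low, high, unit = match.groups()
    scale = _UNIT_SECONDS[unit]
    low_s, high_s = float(low) * scale, float(high) * scale
    if low_s > high_s:
        raise ValueError(f"think_time range is reversed: {spec!r}")
    return low_s, high_s


def sample_think_time(spec: str, rng: Optional[random.Random] = None) -> float:
    """Draw one uniformly distributed pause, in seconds, from the range."""
    rng = rng or random.Random()
    low, high = parse_think_time(spec)
    return rng.uniform(low, high)


if __name__ == "__main__":
    print(parse_think_time("2-5s"))
    print(round(sample_think_time("10-30s"), 2))
```

A uniform distribution is the simplest choice; real think times are often skewed, so a different distribution may be a better fit.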
Key Features
1. YAML-Based Configuration
No more writing complex tick functions. Define your test scenarios in simple YAML:
actions:
  login:
    method: "POST"
    url: "/api/auth/login"
    headers:
      Content-Type: "application/json"
    body:
      username: "{{user.username}}"
      password: "{{user.password}}"
    assertions:
      - status_code: 200
      - response_time: "< 500ms"
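Two pieces of this action definition lend themselves to a short illustration: the `{{user.username}}` placeholders and the `response_time` assertion. The sketch below shows one plausible way to handle each. `render` and `check_response_time` are hypothetical names, and the only assertion form handled is the "< Nms" pattern shown above.

```python
import re

# Matches {{ dotted.path }} placeholders such as {{user.username}}.
_PLACEHOLDER = re.compile(r"\{\{\s*([\w.]+)\s*\}\}")


def render(template: str, context: dict) -> str:
    """Replace {{dotted.path}} placeholders with values from nested dicts."""
    def lookup(match):
        value = context
        for key in match.group(1).split("."):
            value = value[key]  # a KeyError here signals a missing variable
        return str(value)
    return _PLACEHOLDER.sub(lookup, template)


def check_response_time(limit: str, elapsed_ms: float) -> bool:
    """Evaluate an assertion such as "< 500ms" against a measured duration."""
    match = re.fullmatch(r"\s*<\s*(\d+(?:\.\d+)?)\s*ms\s*", limit)
    if match is None:
        raise ValueError(f"unsupported assertion: {limit!r}")
    return elapsed_ms < float(match.group(1))


if __name__ == "__main__":
    user = {"user": {"username": "alice", "password": "s3cret"}}
    print(render("{{user.username}}", user))
    print(check_response_time("< 500ms", 120.0))
```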
2. Flexible Ramp-Up Strategies
Gradual supports both additive and multiplicative concurrency ramping:
# Additive: Add 10 users every 30 seconds
ramp_config = {
    "strategy": "additive",
    "initial": 10,
    "increment": 10,
    "interval": "30s",
    "max_concurrency": 100,
}

# Multiplicative: Double users every minute
ramp_config = {
    "strategy": "multiplicative",
    "initial": 5,
    "multiplier": 2,
    "interval": "1m",
    "max_concurrency": 80,
}
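To see what these two strategies produce, the sketch below expands a ramp configuration into the concurrency level at each interval. `ramp_schedule` is a hypothetical helper written for illustration, and clamping the last step to `max_concurrency` is an assumption about how the cap would be applied.

```python
def ramp_schedule(config: dict) -> list:
    """Return the concurrency level at each interval until the cap is reached."""
    level = config["initial"]
    cap = config["max_concurrency"]
    levels = []
    while level < cap:
        levels.append(level)
        if config["strategy"] == "additive":
            next_level = level + config["increment"]
        elif config["strategy"] == "multiplicative":
            next_level = level * config["multiplier"]
        else:
            raise ValueError(f"unknown strategy: {config['strategy']!r}")
        if next_level <= level:
            raise ValueError("ramp must increase at every step")
        level = next_level
    levels.append(cap)  # the final step never exceeds the cap
    return levels


additive = {"strategy": "additive", "initial": 10, "increment": 10,
            "interval": "30s", "max_concurrency": 100}
multiplicative = {"strategy": "multiplicative", "initial": 5, "multiplier": 2,
                  "interval": "1m", "max_concurrency": 80}

print(ramp_schedule(additive))        # [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
print(ramp_schedule(multiplicative))  # [5, 10, 20, 40, 80]
```

With these values the additive ramp reaches its cap after nine 30-second steps (4.5 minutes), while the multiplicative ramp gets there after four 1-minute steps, which puts more pressure on the system early in the run.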
3. Realistic User Behavior
Simulate real user patterns with think times and conditional logic:
behavior:
  - action: "login"
    think_time: "2-5s"
    success_rate: 0.95  # 95% success rate
  - action: "conditional_action"
    condition: "{{response.status == 'success'}}"
    think_time: "5-10s"
  - action: "error_handling"
    condition: "{{response.status == 'error'}}"
    think_time: "1-2s"
4. Pluggable Reporting
Gradual uses an adapter-based reporting system:
from gradual.reporting import BaseReporter


class CustomReporter(BaseReporter):
    def on_request_complete(self, request):
        # Custom logic for each completed request
        self.metrics.record_latency(request.duration)
        self.metrics.record_status(request.status_code)

    def on_test_complete(self, summary):
        # Generate custom report
        self.generate_html_report(summary)
        self.send_slack_notification(summary)
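As a standalone illustration of what a reporter might compute, the sketch below collects latencies and reports percentiles. It deliberately does not import from `gradual`: `Request` and `PercentileReporter` are hypothetical stand-ins, the hook names are borrowed from the example above, and their signatures are simplified.

```python
import statistics
from dataclasses import dataclass


@dataclass
class Request:
    """Hypothetical stand-in for the object passed to on_request_complete."""
    duration: float      # seconds
    status_code: int


class PercentileReporter:
    """Collects latencies and error counts, then summarizes them."""

    def __init__(self) -> None:
        self.durations = []
        self.errors = 0

    def on_request_complete(self, request: Request) -> None:
        self.durations.append(request.duration)
        if request.status_code >= 400:
            self.errors += 1

    def on_test_complete(self) -> dict:
        # quantiles(n=100) returns 99 cut points: index 49 is p50, index 94 is p95.
        # It needs at least two recorded durations.
        cuts = statistics.quantiles(self.durations, n=100)
        return {
            "requests": len(self.durations),
            "error_rate": self.errors / len(self.durations),
            "p50": cuts[49],
            "p95": cuts[94],
        }


if __name__ == "__main__":
    reporter = PercentileReporter()
    for i in range(1, 101):
        reporter.on_request_complete(Request(i * 0.01, 500 if i > 95 else 200))
    print(reporter.on_test_complete())
```

Percentiles are used here instead of an average because a small number of slow requests under contention is exactly what concurrency testing is meant to surface.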
Architecture Overview
Gradual is built using clean architecture principles:
gradual/
├── core/ # Domain logic
│ ├── user.py # User simulation
│ ├── action.py # Action execution
│ └── metrics.py # Performance metrics
├── adapters/ # External integrations
│ ├── http.py # HTTP client
│ ├── reporting/ # Report generators
│ └── storage/ # Data persistence
├── config/ # Configuration parsing
└── cli/ # Command-line interface
Example Usage
# Run a simple test
gradual run --config test.yaml
# Run with custom reporting
gradual run --config test.yaml --reporter custom
# Run with real-time monitoring
gradual run --config test.yaml --monitor
Why This Matters for Fintech
In fintech applications, concurrent access patterns are crucial:
- Trading Systems: Multiple traders accessing the same data simultaneously
- Risk Management: Concurrent risk calculations across portfolios
- Reporting: Multiple users generating reports at the same time
- Data Ingestion: Concurrent data feeds being processed
Gradual helps identify bottlenecks in these scenarios that traditional RPS testing might miss.
Roadmap
Gradual is currently in active development. Planned features include:
- Distributed Testing: Run tests across multiple machines
- Real-time Monitoring: Live dashboard during test execution
- Integration Testing: Support for complex multi-service scenarios
- Performance Baselines: Automatic regression detection
- Cloud Integration: Native support for AWS, GCP, Azure
Getting Involved
Gradual is open source, and I'm actively looking for contributors. If you're interested in any of the following, check out the project on GitHub:
- Stress testing and performance engineering
- Python and async programming
- Building developer tools
- Fintech and high-performance systems
Let's build something amazing together!
Conclusion
Gradual represents a shift in how we think about stress testing: from simple request counting to realistic user simulation. By focusing on concurrency and user behavior, we can discover performance issues that traditional tools might miss.
The framework is still in early development, but the core concepts are solid. I'm excited to see how it evolves and helps teams build more robust, scalable systems.
Stay tuned for more updates on Gradual's development, and feel free to reach out if you'd like to contribute or learn more about stress testing strategies!