
Benchmarks

Flow-Like’s runtime is built in Rust for predictable, high-throughput workflow execution. The results below come from our benchmark suite, run with the mimalloc allocator.

To provide a fair comparison with n8n (which uses 4 vCPUs), we benchmark with 4 worker threads:

| Metric | Value | Description |
| --- | --- | --- |
| Single Execution | ~1.2ms | Time to execute a simple 2-node workflow |
| Peak Throughput (4 threads) | ~124,000 workflows/sec | At 8K concurrent workflows |
| Peak Throughput (16 threads) | ~244,000 workflows/sec | At 65K concurrent workflows |
| Step Latency | ~20-40µs | Per-node execution overhead |

Throughput by Concurrency Level (4 Threads)

These results use 4 worker threads to match typical cloud VM configurations (e.g., n8n’s c5a.large):

| Concurrency | Throughput | Latency |
| --- | --- | --- |
| 128 | ~65,000 exec/s | 2.0ms |
| 512 | ~100,000 exec/s | 5.1ms |
| 1,024 | ~112,000 exec/s | 9.1ms |
| 2,048 | ~121,000 exec/s | 17ms |
| 4,096 | ~123,000 exec/s | 33ms |
| 8,192 | ~124,000 exec/s | 66ms |

Throughput by Concurrency Level (16 Threads)

With full 16-core utilization:

| Concurrency | Throughput | Latency |
| --- | --- | --- |
| 128 | ~60,000 exec/s | 2.1ms |
| 512 | ~140,000 exec/s | 3.6ms |
| 1,024 | ~177,000 exec/s | 5.7ms |
| 4,096 | ~228,000 exec/s | 18ms |
| 8,192 | ~238,000 exec/s | 35ms |
| 32,768 | ~241,000 exec/s | 135ms |
| 65,536 | ~244,000 exec/s | 269ms |
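In a closed-loop benchmark like this one, latency and throughput are tied together by Little's law: latency ≈ concurrency / throughput. A quick sketch that recomputes the 16-thread table from that relation:

```rust
// Sanity check: Little's law predicts latency ≈ concurrency / throughput
// for a closed-loop benchmark. Pairs below are (concurrency, exec/s)
// taken from the 16-thread table above.
fn main() {
    let rows: [(u32, f64); 7] = [
        (128, 60_000.0),
        (512, 140_000.0),
        (1_024, 177_000.0),
        (4_096, 228_000.0),
        (8_192, 238_000.0),
        (32_768, 241_000.0),
        (65_536, 244_000.0),
    ];
    for (concurrency, throughput) in rows {
        // Convert seconds to milliseconds for comparison with the table.
        let latency_ms = concurrency as f64 / throughput * 1_000.0;
        println!("{concurrency:>6} concurrent -> ~{latency_ms:.1} ms");
    }
}
```

Each computed value lands within rounding distance of the measured latency column, which is why latency grows roughly linearly once throughput plateaus.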

Using mimalloc provides significant performance improvements over the system allocator:

| Allocator | Throughput (1K conc.) | Improvement |
| --- | --- | --- |
| mimalloc | ~222,000 exec/s | +24% |
| system | ~179,000 exec/s | baseline |
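Swapping Rust's global allocator for mimalloc is a small wiring change. As a point of reference, the standard setup for the mimalloc crate looks like the sketch below (the dependency version is illustrative):

```rust
// Cargo.toml (version illustrative):
//   [dependencies]
//   mimalloc = "0.1"

use mimalloc::MiMalloc;

// Route every heap allocation in this binary through mimalloc.
#[global_allocator]
static GLOBAL: MiMalloc = MiMalloc;

fn main() {
    // All allocations below now go through mimalloc.
    let buf: Vec<u64> = (0..1_000).collect();
    println!("allocated {} elements", buf.len());
}
```

In Flow-Like's benchmarks this is gated behind a cargo feature, which is why the bench command below passes `--features mimalloc`.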

Run these benchmarks on your own hardware:

```sh
# Test peak throughput with various concurrency levels
FL_CONCURRENCY_LIST="128,512,1024,4096,8192" \
RUST_LOG=off cargo bench --bench throughput_bench --features mimalloc -- peak
```

Customize benchmark behavior:

| Variable | Default | Description |
| --- | --- | --- |
| `FL_WORKER_THREADS` | CPU count | Tokio worker threads |
| `FL_CONCURRENCY_LIST` | Auto | Comma-separated concurrency levels to test |
| `FL_MAX_CONCURRENCY` | CPU × 8 | Max concurrency for auto-sweep |
| `FL_MEASURE_SECS` | 10 | Measurement duration (seconds) per level |
| `RUST_LOG` | - | Set to `off` for accurate benchmarks |
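As a hedged illustration of how env-driven configuration of this shape can be parsed (this is not Flow-Like's actual harness code; only the variable names come from the table above):

```rust
use std::env;

// Read an environment variable, falling back to a default when it is
// unset or fails to parse.
fn env_or<T: std::str::FromStr>(name: &str, default: T) -> T {
    env::var(name)
        .ok()
        .and_then(|v| v.parse().ok())
        .unwrap_or(default)
}

fn main() {
    // Default worker threads to the machine's CPU count.
    let threads: usize = env_or(
        "FL_WORKER_THREADS",
        std::thread::available_parallelism().map(|n| n.get()).unwrap_or(1),
    );
    let measure_secs: u64 = env_or("FL_MEASURE_SECS", 10);
    // FL_CONCURRENCY_LIST is comma-separated, e.g. "128,512,1024".
    let levels: Vec<usize> = env::var("FL_CONCURRENCY_LIST")
        .map(|s| s.split(',').filter_map(|x| x.trim().parse().ok()).collect())
        .unwrap_or_else(|_| vec![128, 512, 1024]);
    println!("threads={threads} measure_secs={measure_secs} levels={levels:?}");
}
```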

To run the benchmarks yourself you will need:

  • Rust toolchain (stable)
  • Test data in the tests/ directory
  • Recommended: 8+ cores for meaningful throughput tests

Results shown were measured on:

  • CPU: 16 cores (Apple M-series)
  • Memory: 32GB
  • OS: macOS
  • Rust: Stable toolchain
  • Build: Release mode with LTO
  • Allocator: mimalloc

Several factors determine real-world throughput:

  1. Concurrency Level — Higher concurrency enables better CPU utilization, up to ~65K concurrent workflows
  2. Allocator Choice — mimalloc provides ~24% improvement over the system allocator
  3. Node Complexity — Simple data routing is fast; heavy compute nodes dominate execution time
  4. Graph Depth — More sequential nodes mean more steps and longer execution
  5. Data Size — Large payloads increase serialization/deserialization overhead
  6. Tracing Level — Use LogLevel::Fatal for benchmarks; full tracing adds overhead

Both benchmarks execute a comparable task: a simple 2-node workflow. For a fair comparison, we use 4 threads to match the c5a.large (4 vCPU) setup from n8n’s published benchmarks:

| Platform | Setup | Throughput | vs n8n |
| --- | --- | --- | --- |
| Flow-Like | 4 threads, mimalloc | ~124,000 exec/sec | 564× faster |
| Flow-Like | 16 threads, mimalloc | ~244,000 exec/sec | 1,109× faster |
| n8n (single) | c5a.large (4 vCPU) | ~220 exec/sec | baseline |
| n8n (scaled) | 7× c5a.4xlarge | ~2,000 exec/sec | 9× baseline |
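The “vs n8n” column is simply the throughput ratio against the single-instance n8n baseline:

```rust
// Reproduce the "vs n8n" multipliers: speedup = throughput / baseline.
fn main() {
    let n8n_single = 220.0_f64;   // exec/s on c5a.large (4 vCPU)
    let flow_like_4t = 124_000.0; // exec/s, 4 threads + mimalloc
    let flow_like_16t = 244_000.0; // exec/s, 16 threads + mimalloc
    println!("4 threads:  ~{:.0}x faster", flow_like_4t / n8n_single);
    println!("16 threads: ~{:.0}x faster", flow_like_16t / n8n_single);
}
```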

| Platform | Execution Model | Typical Latency |
| --- | --- | --- |
| Flow-Like | Native Rust, typed | ~1-2ms per workflow |
| Node-based tools | JavaScript/Python | ~10-50ms per workflow |
| Cloud workflows | HTTP-based | ~100-500ms per workflow |

Found a performance issue or want to add a benchmark?

  1. Check existing benchmarks in packages/catalog/benches/
  2. Use Criterion for consistent measurement
  3. Document what you’re measuring and why
  4. Submit a PR with before/after results