System Design Lab

Online Judge scaling is mostly worker economics, not API traffic.

Move the sliders from a toy judge to a LeetCode-like workload. The design changes when compilation and sandboxed execution dominate: submit asynchronously, queue work, prewarm containers, cache immutable results, and split heavy submit from light run-code traffic.

Normal evolution scenarios

Click left to right for the intended demo path. Each card changes the workload inputs.

Workload

These are inputs, not preset architecture stages.

Recommended shape

Current architecture path
[Figure: Online Judge architecture diagram. Whiteboard-style diagram of asynchronous code submission, queueing, sandbox workers, result cache, and persistent submission metadata.]
Clients
User: submits code and polls for immutable verdicts
API
API server: validates request, stores metadata, and returns a token
Queueing
Queue: absorbs spikes and gives workers a pull-based backlog
Execution
Scheduler: applies fairness, priority, and queue selection
Worker pool: CPU- and memory-heavy compilation and execution
Warm runners: language-specific sandbox containers
Storage + results
Submission DB: durable append-only verdict history
Result cache: cheap polling reads with TTL
Object store: large source, problem, and testcase blobs
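
The submit-then-poll path listed above can be sketched in a few lines. This is a minimal in-memory illustration, not the lab's implementation: the class `JudgeApi`, its method names, and the status strings are hypothetical, and a dict and a deque stand in for the submission DB and the queue.

```python
import uuid
from collections import deque


class JudgeApi:
    """In-memory stand-in for the API server, submission DB, and queue (all hypothetical)."""

    def __init__(self):
        self.submissions = {}  # token -> metadata record (stands in for the submission DB)
        self.queue = deque()   # tokens waiting for a worker

    def submit(self, language, source):
        """Validate, store metadata, enqueue, and return (202, token) without judging."""
        if not source.strip():
            return 400, None
        token = uuid.uuid4().hex
        self.submissions[token] = {
            "language": language,
            "source": source,
            "status": "queued",
            "verdict": None,
        }
        self.queue.append(token)
        return 202, token

    def poll(self, token):
        """Cheap read of the current status; never triggers execution."""
        record = self.submissions.get(token)
        if record is None:
            return 404, None
        return 200, {"status": record["status"], "verdict": record["verdict"]}

    def work_one(self, judge):
        """A worker pulls one token, runs the judge callable, and records the verdict."""
        if not self.queue:
            return False
        token = self.queue.popleft()
        record = self.submissions[token]
        record["status"] = "running"
        record["verdict"] = judge(record["language"], record["source"])
        record["status"] = "done"
        return True
```

The point of the shape is that `submit` returns as soon as the metadata is stored, so API latency stays independent of how long compilation and execution take.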

Bottlenecks

Worker capacity

Queue pressure

Result lookup

Sandbox pool

Submission storage

Why this changes

    Decision tradeoffs

    Async API

    Message queue

    Pre-warmed runners

    Sandbox isolation

    Result cache

    Run / submit split

    Source-backed rules

    These are the durable system-design claims behind the model. The exact slider thresholds are deliberately labeled as teaching assumptions.

    Verified rule

    Container resource limits are how time-limit-exceeded (TLE) and memory-limit-exceeded (MLE) verdicts are enforced

    CPU and memory limits give worker infrastructure a concrete way to stop submissions that exceed problem constraints.

    Docker Docs
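
As a rough sketch of this rule, a worker could build a `docker run` command with explicit limits and map the outcome to a verdict. The helper names, the verdict strings, and the idea of passing in `timed_out` and `oom_killed` flags are assumptions for illustration; only the Docker flags themselves come from the Docker CLI. Note that `--cpus` throttles CPU rather than killing the process, so this sketch assumes the worker also applies its own wall-clock timeout.

```python
def sandbox_command(image, memory_mb, cpus, pids_limit, run_argv):
    """Build a `docker run` argv with resource limits (hypothetical helper)."""
    return [
        "docker", "run", "--rm",
        "--network", "none",               # no network access for untrusted code
        "--memory", f"{memory_mb}m",       # hard memory cap
        "--memory-swap", f"{memory_mb}m",  # same value as --memory: no extra swap
        "--cpus", str(cpus),               # CPU throttle, not a time limit by itself
        "--pids-limit", str(pids_limit),   # bounds fork bombs
        image, *run_argv,
    ]


def classify(exit_code, timed_out, oom_killed):
    """Map a finished run to a coarse verdict (verdict names are assumptions)."""
    if timed_out:
        return "TLE"  # the worker's wall-clock timeout fired
    if oom_killed:
        return "MLE"  # the container hit its memory cap
    if exit_code != 0:
        return "RE"   # runtime error
    return "OK"       # ran to completion; output comparison happens elsewhere
```

How a worker learns `timed_out` and `oom_killed` is left open here; one plausible option is a subprocess timeout plus the container's reported state.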

    Verified rule

    Seccomp reduces the syscall surface for untrusted code

    A sandbox should block dangerous system calls instead of merely trusting language runtimes or process permissions.

    Docker Docs
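
To make the allowlist idea concrete, the sketch below generates a deny-by-default profile in the JSON shape Docker accepts for custom seccomp profiles. The function names and the syscall list are illustrative assumptions; the list is far too small to start a real language runtime and is only meant to show the structure.

```python
import json


def allowlist_profile(allowed_syscalls):
    """Deny every syscall by default and allow only the named ones."""
    return {
        "defaultAction": "SCMP_ACT_ERRNO",
        "syscalls": [
            {"names": sorted(set(allowed_syscalls)), "action": "SCMP_ACT_ALLOW"}
        ],
    }


def seccomp_flag(profile_path):
    """Arguments that attach a profile file to `docker run`."""
    return ["--security-opt", f"seccomp={profile_path}"]


# Illustrative only: a real runtime needs many more syscalls than this.
ILLUSTRATIVE_ALLOWLIST = [
    "read", "write", "close", "fstat", "brk", "mmap", "munmap",
    "rt_sigreturn", "exit", "exit_group",
]

profile_json = json.dumps(allowlist_profile(ILLUSTRATIVE_ALLOWLIST), indent=2)
```

An allowlist fails closed: a syscall nobody thought about is rejected, which is the safer default when the code being run is untrusted.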

    Verified rule

    Queues decouple submit traffic from worker execution

    A queue absorbs bursts and lets workers consume at their own pace, which matches asynchronous judging.

    AWS SQS Docs
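
The decoupling can be illustrated with a small pull-based queue. This is a sketch using Python's standard `queue` module rather than SQS; the class `SubmissionQueue` and the bounded backlog (used here to model the diagram's "backpressure" label) are assumptions, not properties of any particular queue service.

```python
import queue


class SubmissionQueue:
    """Bounded, pull-based backlog between the API and the workers (hypothetical)."""

    def __init__(self, max_backlog):
        self._q = queue.Queue(maxsize=max_backlog)

    def offer(self, token):
        """Called by the API. Returns False when the backlog is full (backpressure)."""
        try:
            self._q.put_nowait(token)
            return True
        except queue.Full:
            return False

    def pull(self):
        """Called by a worker when it has a free slot. Returns None if idle."""
        try:
            return self._q.get_nowait()
        except queue.Empty:
            return None

    def depth(self):
        """Approximate backlog size, useful as a scaling signal."""
        return self._q.qsize()
```

Because workers call `pull` only when they have capacity, a burst of submissions grows the backlog instead of overloading the sandboxes, and a rejected `offer` gives the API a clear signal to shed or delay load.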

    Verified rule

    TTL-backed key-value results make polling cheap

    A final verdict is immutable, so short polling can read a small cached object until the key expires.

    Redis Docs
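
The polling read path can be sketched as a read-through cache with expiry. The in-memory class below is a stand-in for a key-value store with TTL support (in Redis, roughly what `SET key value EX seconds` provides); `TtlResultCache`, `read_verdict`, and the injectable clock are hypothetical names chosen for this illustration.

```python
import time


class TtlResultCache:
    """Tiny in-memory TTL cache standing in for a key-value store with expiry."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries = {}   # token -> (expires_at, verdict)

    def put(self, token, verdict):
        self._entries[token] = (self._clock() + self._ttl, verdict)

    def get(self, token):
        entry = self._entries.get(token)
        if entry is None:
            return None
        expires_at, verdict = entry
        if self._clock() >= expires_at:
            del self._entries[token]  # lazily drop expired keys
            return None
        return verdict


def read_verdict(token, cache, load_from_db):
    """Serve polls from the cache; fall back to the durable store on a miss."""
    cached = cache.get(token)
    if cached is not None:
        return cached
    verdict = load_from_db(token)
    if verdict is not None:
        cache.put(token, verdict)  # safe to cache: a final verdict never changes
    return verdict
```

Since a final verdict is immutable, there is no invalidation problem here: the TTL only bounds memory use, and an expired key simply costs one extra read from the submission store.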

    Teaching assumptions

    • Worker slot thresholds are teaching thresholds; real capacity depends on language mix, testcase size, CPU model, and isolation overhead.
    • The result cache stores final or in-progress verdict objects, not the source code blob.
    • Kafka is deliberately not required here unless the design needs replay, analytics fanout, or multiple consumers.
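
For the first assumption above, a back-of-the-envelope slot estimate can be written down using Little's law (average busy slots = arrival rate × average time per submission). The function name, the default utilization target, and the example numbers are assumptions for illustration, not the lab's slider thresholds.

```python
import math


def required_worker_slots(submissions_per_second, mean_judge_seconds,
                          target_utilization=0.75):
    """Estimate worker slots from arrival rate and mean judging time.

    Little's law gives the average number of busy slots; dividing by a
    target utilization below 1 leaves headroom for bursts.
    """
    if not 0 < target_utilization <= 1:
        raise ValueError("target_utilization must be in (0, 1]")
    busy_slots = submissions_per_second * mean_judge_seconds
    return math.ceil(busy_slots / target_utilization)
```

For example, 50 submissions per second at 2 seconds each keep 100 slots busy on average, so a 0.5 utilization target asks for 200 slots. The mean judging time is exactly the quantity that shifts with language mix, testcase size, CPU model, and isolation overhead, which is why the result is only a starting estimate.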