High-performance Serverless GPU Task Orchestration System
🌐 wavespeed.ai • 📐 Architecture • 📖 User Guide • 🔧 Developer Guide
- 🚀 Pull-based Architecture - Workers actively pull tasks for better load balancing
- 🔌 RunPod Compatible - Zero-code migration from runpod-python SDK
- ☸️ Multi-Provider - Kubernetes, Novita Serverless, Docker backends
- 📊 Smart Autoscaling - Queue-depth, priority, and resource-aware scaling
- 🛡️ Graceful Shutdown - Zero task loss during rolling updates
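To make the queue-depth part of the autoscaling bullet concrete, here is a minimal sketch of how a target worker count could be derived from the number of queued tasks. The function name `desired_workers` and its parameters (`tasks_per_worker`, `min_workers`, `max_workers`) are hypothetical and chosen for illustration only; the actual autoscaler also weighs priority and resources, which this sketch leaves out.

```python
import math


def desired_workers(queue_depth: int, tasks_per_worker: int,
                    min_workers: int, max_workers: int) -> int:
    """Return a worker count proportional to queue depth, clamped to bounds.

    All parameter names are assumptions made for this sketch.
    """
    if tasks_per_worker <= 0:
        raise ValueError("tasks_per_worker must be positive")
    # Round up so a partially filled batch still gets a worker.
    needed = math.ceil(queue_depth / tasks_per_worker)
    # Never go below the floor or above the ceiling.
    return max(min_workers, min(max_workers, needed))
```

With `tasks_per_worker=4`, a queue of 9 tasks asks for 3 workers, and an empty queue falls back to `min_workers`.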
flowchart TB
subgraph Clients
direction LR
Client[Client V1 API]
WebUI[Web UI]
end
subgraph Core["Waverless API Server"]
direction TB
Queue[Task Queue]
WM[Worker Mgmt]
Autoscaler[Autoscaler]
Store[(Redis + MySQL)]
end
subgraph Provider
direction LR
K8s[K8s]
Novita[Novita]
Docker[Docker]
end
subgraph Workers
direction LR
W1[Worker A]
W2[Worker B]
W3[Worker ...]
end
Clients -->|submit| Core
Core --> Provider
Provider -->|manage| Workers
Workers -->|pull tasks| Core
style Clients fill:#4a90a4,color:#fff
style Core fill:#2d5a7b,color:#fff
style Provider fill:#5d8aa8,color:#fff
style Workers fill:#7fb3d3,color:#000
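The `pull tasks` edge in the diagram can be illustrated with a small in-process simulation. This is only a sketch of the general pull pattern, not the Waverless implementation: the helper `run_pull_workers` is hypothetical, a local `queue.Queue` stands in for the API server's task queue, and real workers would poll over the network and keep waiting instead of exiting when the queue is empty.

```python
import queue
import threading


def run_pull_workers(task_ids, num_workers, handler):
    """Process every task with workers that pull from one shared queue.

    Returns {task_id: (worker_id, output)}. Task ids must be hashable.
    """
    pending = queue.Queue()
    for task_id in task_ids:
        pending.put(task_id)

    results = {}
    lock = threading.Lock()

    def worker_loop(worker_id):
        while True:
            try:
                # A worker asks for work only when it is idle, so faster
                # workers naturally end up taking more tasks.
                task_id = pending.get_nowait()
            except queue.Empty:
                return  # nothing left to pull in this simulation
            output = handler(task_id)
            with lock:
                results[task_id] = (worker_id, output)

    threads = [
        threading.Thread(target=worker_loop, args=(i,))
        for i in range(num_workers)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

Because no component assigns tasks up front, a slow task on one worker does not hold back tasks that another idle worker could take.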
# Local development
docker-compose up -d mysql redis
cp config/config.example.yaml config/config.yaml
go run cmd/main.go
# Kubernetes deployment
./deploy.sh install

# Submit task
curl -X POST http://localhost:8090/v1/my-endpoint/run \
-H "Content-Type: application/json" \
-d '{"input": {"prompt": "hello world"}}'
# Check status
curl http://localhost:8090/v1/status/{task_id}

| Document | Description |
|---|---|
| Architecture | System design, components, data flow, lifecycle |
| User Guide | Deployment, API reference, autoscaling, troubleshooting |
| Developer Guide | Code structure, core design, provider integration |
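
The submit and status calls from the curl examples above can also be scripted. The sketch below builds the same two requests with Python's standard library. The helper names (`build_run_request`, `build_status_request`, `send`) are hypothetical, the base URL and paths are copied from the curl commands, and the response is returned as parsed JSON without assuming any field names, since the response schema is not shown here.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8090"  # taken from the curl examples


def build_run_request(endpoint, payload, base_url=BASE_URL):
    """Build the POST request that submits a task to an endpoint."""
    body = json.dumps({"input": payload}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/v1/{endpoint}/run",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_status_request(task_id, base_url=BASE_URL):
    """Build the GET request that checks the status of a task."""
    return urllib.request.Request(
        f"{base_url}/v1/status/{task_id}", method="GET"
    )


def send(request, timeout=10.0):
    """Send a request and return the decoded JSON body.

    Assumes the server replies with JSON; the schema is not assumed.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

For example, `send(build_run_request("my-endpoint", {"prompt": "hello world"}))` mirrors the submit command, and the task ID from its response can then be passed to `build_status_request`.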
MIT License