Turn-key text inference worker for AI Power Grid. Run a local model, connect to the Grid, and start earning.
Grab the latest binary for your platform from Releases:
| Platform | File |
|---|---|
| Windows x64 | grid-inference-worker-windows-x64.exe |
| macOS ARM64 | grid-inference-worker-macos-arm64.zip |
| Linux x64 | grid-inference-worker-linux-x64 |
| Linux ARM64 | grid-inference-worker-linux-arm64 |
- **Windows** — Double-click the exe. A setup wizard opens in your browser at http://localhost:7861.
- **macOS** — Unzip, then open Grid Inference Worker.app.
- **Linux** — `chmod +x grid-inference-worker-linux-x64 && ./grid-inference-worker-linux-x64`
No Python or dependencies needed. Just install a backend (Ollama is easiest), run the worker, and follow the wizard.
You'll need a Grid API key — register here.
Once your worker is running, chat with your model at aipg.chat — select your model in the upper selector.
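Once the worker is up, you can sanity-check the local dashboard from a terminal. This is just a sketch, assuming the default port 7861; it only checks that something is listening.

```shell
# Probe the local dashboard (default port 7861); prints whether it is up.
STATUS=$(curl -s -o /dev/null -w '%{http_code}' http://localhost:7861 || true)
if [ "$STATUS" = "200" ]; then
  echo "dashboard is up"
else
  echo "no worker on port 7861 yet (got status '$STATUS')"
fi
```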
Override config from the command line. The web dashboard is always available at http://localhost:7861 regardless of how you start the worker.
```
grid-inference-worker \
  --model llama3.2:3b \
  --backend-url http://127.0.0.1:11434 \
  --api-key YOUR_API_KEY \
  --worker-name my-worker
```

```
--model NAME          Model name (e.g. llama3.2:3b)
--backend-url URL     Backend URL (e.g. http://127.0.0.1:11434)
--api-key KEY         Grid API key
--worker-name NAME    Worker name on the grid
--port PORT           Web dashboard port (default: 7861)
--gui                 Show the desktop control window (default for binaries)
--no-gui              Skip the desktop control window
--install-service     Install as a system service (auto-start on boot)
--uninstall-service   Remove the system service
--service-status      Check if the service is installed
```
Copy .env.example to .env and fill in your values, or configure through the web setup wizard.
| Variable | Default | Description |
|---|---|---|
| `GRID_API_KEY` | (required) | Your Grid API key (register) |
| `MODEL_NAME` | | Model to serve (e.g. llama3.2:3b) |
| `BACKEND_TYPE` | `ollama` | `ollama` or `openai` |
| `OLLAMA_URL` | `http://127.0.0.1:11434` | Ollama endpoint |
| `OPENAI_URL` | `http://127.0.0.1:8000/v1` | OpenAI-compatible endpoint (vLLM, SGLang, etc.) |
| `OPENAI_API_KEY` | | API key for OpenAI-compatible backend |
| `GRID_WORKER_NAME` | `Text-Inference-Worker` | Worker name on the grid |
| `GRID_MAX_LENGTH` | `4096` | Max generation length |
| `GRID_MAX_CONTEXT_LENGTH` | `4096` | Max context window (auto-detected from backend) |
| `GRID_NSFW` | `true` | Accept NSFW jobs |
| `WALLET_ADDRESS` | | Base chain wallet for rewards |
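A minimal `.env` sketch for an Ollama backend; every value below is an illustrative placeholder, not a working credential:

```env
# Example .env — placeholders only
GRID_API_KEY=your-grid-api-key
MODEL_NAME=llama3.2:3b
BACKEND_TYPE=ollama
OLLAMA_URL=http://127.0.0.1:11434
GRID_WORKER_NAME=my-worker
GRID_MAX_LENGTH=4096
GRID_MAX_CONTEXT_LENGTH=4096
GRID_NSFW=true
WALLET_ADDRESS=0xYourBaseWalletAddress
```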
Requires Python 3.9+.
```
pip install -e .
grid-inference-worker
```

On Windows you can also use:

```
.\scripts\run.ps1
```

To run with Docker instead:

```
cp .env.example .env
# Edit .env with your values
docker compose up -d
```

The dashboard is available at http://localhost:7861.
Run the worker on boot without needing to stay logged in. Works on Windows (startup registry), Linux (systemd), and macOS (launchd).
```
# Configure the worker first (run it once to set up .env), then:
grid-inference-worker --install-service

# Check status
grid-inference-worker --service-status

# Remove
grid-inference-worker --uninstall-service
```

Supported backends:

| Backend | Type | Setup |
|---|---|---|
| Ollama | `ollama` | Install Ollama, `ollama pull llama3.2:3b`, done |
| LM Studio | `ollama` | Load a model, enable server in Developer tab |
| vLLM | `openai` | `--served-model-name` + set `OPENAI_URL` |
| SGLang | `openai` | Point `OPENAI_URL` at SGLang's OpenAI endpoint |
| LMDeploy | `openai` | `lmdeploy serve api_server` + set `OPENAI_URL` |
| KoboldCpp | `openai` | Enable OpenAI-compatible endpoint |
Ollama is the easiest way to get started. The setup wizard auto-detects it and lets you pick a model.
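Before starting the worker, you can check whether a local Ollama is reachable. A small sketch, assuming Ollama's default port 11434 (which matches the worker's `OLLAMA_URL` default) and its `/api/tags` model-listing endpoint:

```shell
# Check whether a local Ollama is reachable before starting the worker.
OLLAMA_URL="${OLLAMA_URL:-http://127.0.0.1:11434}"
if curl -fsS "$OLLAMA_URL/api/tags" >/dev/null 2>&1; then
  echo "Ollama is up at $OLLAMA_URL"
else
  echo "Ollama not reachable at $OLLAMA_URL (install it and run 'ollama serve')"
fi
```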
For any backend that exposes an OpenAI-compatible API (`/v1/chat/completions`), set `BACKEND_TYPE=openai` and point `OPENAI_URL` at it.
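You can probe such a backend directly before wiring it up. The URL and model name below are assumptions (the URL matches the worker's `OPENAI_URL` default); substitute whatever your server actually serves:

```shell
# Probe an OpenAI-compatible backend; falls through to a note if it is down.
OPENAI_URL="${OPENAI_URL:-http://127.0.0.1:8000/v1}"
REQ='{"model":"llama3.2:3b","messages":[{"role":"user","content":"Say hi"}],"max_tokens":8}'
curl -sS "$OPENAI_URL/chat/completions" \
  -H 'Content-Type: application/json' \
  -d "$REQ" || echo "no OpenAI-compatible backend at $OPENAI_URL"
```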
For high-performance inference with vLLM, see our detailed guides:
- vLLM Setup Guide - Installation, configuration, and integration
- vLLM Optimization Guide - Performance tuning, benchmarking, and production best practices
