Support for 20+ AI generation categories including:
- Text to Image - Generate images from text prompts
- Image to Image - Transform and edit existing images
- Text to Video - Create videos from text descriptions
- Video to Video - Transform and enhance videos
- Text to Audio - Generate audio from text
- Image to 3D - Convert images to 3D models
- And more...
- Single Predictor Node - Access 600+ AI models through one node
- Dynamic Parameters - Model-specific parameters auto-configure
- Fuzzy Search - Quickly find models by name or category (see the sketch after this feature list)
- Category Tabs - Browse models by type
- Smart Caching - Fast model loading after first use (first load: 5-10 seconds)
- Progress Indicators - Real-time generation progress
- Workflow Support - Save and restore complete workflows
- Connection Flexibility - Connect any compatible input/output
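For intuition about the fuzzy search feature, here is a minimal sketch using only Python's standard `difflib`. The model names and threshold are illustrative placeholders, not the plugin's actual implementation:

```python
# Illustrative sketch only: fuzzy model search with Python's standard difflib.
# MODELS is a placeholder list, not the plugin's real catalog.
from difflib import SequenceMatcher

MODELS = ["flux-dev", "sdxl", "kling-v1.6", "minimax-video", "hunyuan-3d-v2"]

def fuzzy_search(query: str, models: list[str], threshold: float = 0.4) -> list[str]:
    """Rank models by similarity to the query, best matches first."""
    scored = [(SequenceMatcher(None, query.lower(), m.lower()).ratio(), m)
              for m in models]
    return [m for score, m in sorted(scored, reverse=True) if score >= threshold]

print(fuzzy_search("flux", MODELS))  # ['flux-dev'] (roughly)
```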
- ComfyUI installed
- WaveSpeed AI API Key (free tier available)
💡 Built on: ComfyUI v0.8.2 (Frontend v1.35.9)
Step 1: Clone the plugin
cd ComfyUI/custom_nodes
git clone https://github.com/WaveSpeedAI/wavespeed-comfyui.git
cd wavespeed-comfyui
Step 2: Find ComfyUI's Python path
Start ComfyUI and check the console output for:
** Python executable: D:\Projects\ComfyUI\.venv\Scripts\python.exe
Copy that path.
Step 3: Install dependencies
Use the Python path from Step 2:
# Windows example:
D:\Projects\ComfyUI\.venv\Scripts\python.exe -m pip install -r requirements.txt
# Linux/Mac example:
/home/user/ComfyUI/.venv/bin/python -m pip install -r requirements.txt
Step 4: Restart ComfyUI
Close and restart ComfyUI. You should see WaveSpeed nodes available.
⚠️ Why this matters: If you just run `pip install -r requirements.txt`, it may install into the wrong Python environment, causing a `ModuleNotFoundError`.
Restart ComfyUI after installation.
- Get your API key from WaveSpeed AI
- In ComfyUI: Settings → WaveSpeed → Enter API Key
- Or create `config.json` in the plugin directory:
{
"api_key": "your_api_key_here"
}
⏱️ First Load Notice: The first load fetches 600+ models across 25 categories and takes approximately 2-4 minutes depending on your network. Subsequent loads use cached data and are instant.
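As an illustration of the configuration options above, a plugin could resolve the key roughly like the sketch below. The `WAVESPEED_API_KEY` environment variable is a hypothetical example, not a documented option, and the real lookup order may differ:

```python
# Sketch of API-key resolution; the environment variable name is hypothetical
# and the plugin's actual lookup order may differ.
import json
import os

def load_api_key(plugin_dir: str):
    key = os.environ.get("WAVESPEED_API_KEY")  # hypothetical override
    if key:
        return key
    # Fall back to config.json in the plugin directory, as described above
    config_path = os.path.join(plugin_dir, "config.json")
    if os.path.exists(config_path):
        with open(config_path, encoding="utf-8") as f:
            return json.load(f).get("api_key")
    return None
```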
📸 High-Quality Image Generation
Use Case: Generate professional-quality images from text descriptions
Models: Flux Dev, SDXL, Ghibli Style, etc.
Key Features:
- Multiple aspect ratios (1:1, 16:9, 9:16, etc.)
- Resolution control (512px - 2048px)
- Seed control for reproducibility
- Negative prompts support
Result:
🎬 Create Videos from Text
Use Case: Generate short videos from text prompts
Models: Kling v1.6, Minimax Video, Wan2.1, etc.
Key Features:
- Duration control (2-10 seconds)
- Resolution options (480p, 720p, 1080p)
- Camera movement control
- Audio generation option
Note: Video generation may take several minutes. The plugin supports a timeout of up to 30 minutes for long-running tasks.
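That long-running-task handling can be pictured as a simple polling loop with a hard deadline. This is a sketch only; `check_status` is an injected placeholder, not the plugin's real API:

```python
# Sketch of a polling loop with a hard deadline. check_status is a caller-supplied
# placeholder expected to return something like {"state": "completed", ...}.
import time

def wait_for_task(task_id: str, check_status, timeout: float = 30 * 60,
                  interval: float = 5.0) -> dict:
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = check_status(task_id)
        if status.get("state") in ("completed", "failed"):
            return status
        time.sleep(interval)  # avoid hammering the API while waiting
    raise TimeoutError(f"Task {task_id} did not finish within {timeout} seconds")
```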
Result:
case2.mp4
🎨 Transform and Edit Images
Use Case: Transform existing images with AI
Models: Flux Redux, Instant Character, Step1X Edit, etc.
Key Features:
- Style transfer
- Image editing with prompts
- Reference image support
- Strength control
Result:
🎬 Animate Static Images
Use Case: Bring static images to life with motion
Models: Stable Video Diffusion, I2VGen-XL, etc.
Key Features:
- Motion generation from single image
- Camera movement control
- Duration control
- Smooth animation
Result:
📹 Download Video | 📥 Download Workflow JSON
🎞️ Enhance and Transform Videos
Use Case: Upscale, stylize, or transform videos
Models: Seedance v1.5, Real-ESRGAN, etc.
Key Features:
- Video upscaling (2x, 4x)
- Style transformation
- Motion preservation
- Frame interpolation
Result:
📹 Download Video | 📥 Download Workflow JSON
🎵 Generate Audio from Text
Use Case: Generate music, sound effects, or voice from text
Models: Dia TTS, MMAudio V2, etc.
Key Features:
- Voice synthesis
- Music generation
- Sound effect creation
- Multiple voice options
Result:
🎲 Convert Images to 3D
Use Case: Generate 3D models from 2D images
Models: Hunyuan 3D V2, etc.
Key Features:
- Multi-view generation
- GLB format export
- Texture mapping
- Mesh optimization
Result:
📦 Download 3D Model (.glb) | 📥 Download Workflow JSON
🔗 Advanced ComfyUI Integration
Use Case: Demonstrate seamless integration with ComfyUI native nodes and complex multi-stage pipelines
Workflow Highlights:
- Multiple WaveSpeed Nodes - Chain multiple AI generation steps (T2I → I2I → I2V)
- Native Node Integration - Mix with ComfyUI's Load Image, Save Image, Preview Image nodes
- Flexible Data Flow - Pass IMAGE/VIDEO tensors between nodes seamlessly
- Real-world Pipeline - Text→Image → Image Enhancement → Image→Video
Pipeline Stages:
- Stage 1: Text-to-Image (WaveSpeed Predictor #1)
  - Generate base image from text prompt
  - Model: Text-to-Image model (e.g., FLUX)
  - Output: ComfyUI IMAGE tensor
  - Native Integration: Output connects directly to ComfyUI Preview Image node
- Stage 2: Image-to-Image Enhancement (WaveSpeed Predictor #2)
  - Enhance and refine the generated image
  - Model: Image-to-Image model (e.g., Flux Redux)
  - Input: IMAGE tensor from Stage 1 (via native ComfyUI connection)
  - Output: Enhanced IMAGE tensor
  - Native Integration: Seamlessly receives IMAGE from the previous WaveSpeed node
- Stage 3: Image-to-Video Animation (WaveSpeed Predictor #3)
  - Animate the enhanced image into video
  - Model: Image-to-Video model (e.g., Stable Video Diffusion)
  - Input: Enhanced IMAGE tensor from Stage 2
  - Output: VIDEO URL
  - Native Integration: Works with ComfyUI's video preview nodes
- Stage 4: Preview & Save (Native ComfyUI Nodes)
  - Use ComfyUI's Preview Image nodes to view intermediate results
  - Use Save Image nodes to export final outputs
  - All connections work exactly like native ComfyUI nodes
Key Integration Features:
✅ Tensor Compatibility
- WaveSpeed nodes accept ComfyUI IMAGE/VIDEO/AUDIO tensors directly
- No manual conversion needed - just connect and run
- Works with any image/video processing node in ComfyUI ecosystem
✅ Output Flexibility
- Outputs can connect to any compatible node
- Support for Preview Image, Save Image, Video Preview, etc.
- Chain multiple WaveSpeed nodes together seamlessly
✅ Workflow Persistence
- All connections saved in workflow JSON
- Model selections preserved across sessions
- Parameter values restored on load
- Full workflow portability
✅ Native ComfyUI Features
- Works with node groups and reroute nodes
- Compatible with workflow templates
- Supports ComfyUI's execution queue
- Full undo/redo support
- Drag-and-drop connections
Results:
Stage 1 & 2 Outputs (Text-to-Image → Image-to-Image):
Stage 1: Generated Image | Stage 2: Enhanced Image
Stage 3 Output (Image-to-Video):
case8.mp4
Input:
- Dynamic parameters - Auto-generated based on the selected model
- Common parameters: `prompt`, `negative_prompt`, `seed`, `resolution`, etc.
- Media inputs: `image`, `video`, `audio` (accepts ComfyUI IMAGE/VIDEO/AUDIO tensors or URLs)
- All parameters can be set via UI widgets or connected from other nodes
Output:
- `output` (ANY) - URL string or list of URLs:
  - Single output → URL string (e.g., `"https://cdn.wavespeed.ai/image.png"`)
  - Multiple outputs → List of URLs (e.g., `["url1", "url2"]`)
  - 3D model tasks → List containing preview images + 3D model URL
Note: The output is a URL, not a tensor. Use the WaveSpeedAI Preview node to convert it to IMAGE/VIDEO tensors for further processing.
Input:
- `input_url` (ANY) - Accepts a URL string, list of URLs, or text content from the WaveSpeed Predictor
Output:
- `image` (IMAGE) - ComfyUI IMAGE tensor (if input is an image URL)
- `video` (VIDEO) - ComfyUI VIDEO tensor (if input is a video URL)
Note: The node automatically detects the input type and converts URLs to tensors. For 3D model tasks, it only displays a preview without tensor conversion.
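As a rough picture of what the image path of that conversion involves, downloading a URL and building a ComfyUI IMAGE tensor (float32, shape [batch, height, width, channels], values in 0-1) might look like this sketch; it assumes `requests`, Pillow, NumPy, and PyTorch are available and is not the plugin's actual code:

```python
# Sketch only: download an image URL and convert it to a ComfyUI IMAGE tensor
# (float32, [batch, height, width, channels], values in 0-1).
import io

import numpy as np
import requests
import torch
from PIL import Image

def url_to_image_tensor(url: str) -> torch.Tensor:
    resp = requests.get(url, timeout=60)
    resp.raise_for_status()
    img = Image.open(io.BytesIO(resp.content)).convert("RGB")
    arr = np.asarray(img).astype(np.float32) / 255.0  # H x W x C in 0-1
    return torch.from_numpy(arr).unsqueeze(0)         # add batch dim: 1 x H x W x C
```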
The plugin automatically generates input fields based on each model's schema (see the sketch after this list). Parameters can be:
- Set via UI widgets (text fields, dropdowns, sliders)
- Connected from other nodes (images, videos, audio, numbers, text)
- Mixed (UI defaults + node connections)
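A minimal sketch of that schema-driven widget generation, assuming a simplified JSON-schema-like input; the real plugin reads richer schemas from the WaveSpeed API:

```python
# Illustrative only: turning a simplified model schema into widget specs.
def schema_to_widgets(schema: dict) -> list:
    widgets = []
    for name, spec in schema.get("properties", {}).items():
        if "enum" in spec:
            # Fixed choices become a dropdown
            widgets.append({"name": name, "widget": "dropdown", "options": spec["enum"]})
        elif spec.get("type") in ("number", "integer"):
            # Numeric ranges become a slider
            widgets.append({"name": name, "widget": "slider",
                            "min": spec.get("minimum"), "max": spec.get("maximum")})
        else:
            # Everything else falls back to a text field
            widgets.append({"name": name, "widget": "text"})
    return widgets

example_schema = {"properties": {
    "prompt": {"type": "string"},
    "seed": {"type": "integer", "minimum": 0, "maximum": 2**32 - 1},
    "size": {"enum": ["auto", "1024*1024", "1024*1536"]},
}}
print(schema_to_widgets(example_schema))
```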
For models that accept multiple inputs (e.g., multiple reference images):
Example: Multiple Reference Images
image_0 → First reference
image_1 → Second reference
image_2 → Third reference
The plugin automatically (see the sketch after this list):
- Expands array parameters to individual inputs
- Limits to API-defined maxItems (typically 5)
- Merges back to array format for API submission
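A toy sketch of that expand/merge cycle; the names and the `maxItems` default are illustrative, not the plugin's actual code:

```python
# Toy sketch of the expand/merge cycle for array parameters.
def expand_array_param(name: str, max_items: int = 5) -> list:
    """'image' -> ['image_0', 'image_1', ..., 'image_4'] (maxItems-limited)."""
    return [f"{name}_{i}" for i in range(max_items)]

def merge_array_param(name: str, values: dict) -> list:
    """Collect image_0, image_1, ... back into one list, skipping empty slots."""
    merged, i = [], 0
    while f"{name}_{i}" in values:
        if values[f"{name}_{i}"] is not None:
            merged.append(values[f"{name}_{i}"])
        i += 1
    return merged

inputs = {"image_0": "https://example.com/a.png",
          "image_1": "https://example.com/b.png",
          "image_2": None}
print(merge_array_param("image", inputs))  # two URLs, empty slot dropped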
The plugin intelligently handles size parameters (a sketch follows the two cases below):
Enum Size (Dropdown):
- Models like Seedance provide fixed size options
- Example: "auto", "10241024", "10241536"
- UI shows dropdown selector
Range Size (Width/Height):
- Models like Flux allow custom dimensions
- UI shows separate width/height inputs
- Ratio buttons for quick selection (1:1, 16:9, 9:16, etc.)
- Each component can be connected independently
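A sketch of how the two size styles could be told apart; the field names here are illustrative, not the plugin's actual schema keys:

```python
# Sketch: choose a widget style from a size parameter spec; field names are illustrative.
def size_widget(spec: dict) -> dict:
    if "enum" in spec:
        # Fixed options such as "auto" or "1024*1024" -> dropdown selector
        return {"widget": "dropdown", "options": spec["enum"]}
    # Free-range sizes -> separate width/height inputs plus ratio presets
    return {
        "widget": "size",
        "width": {"min": spec.get("min", 512), "max": spec.get("max", 2048)},
        "height": {"min": spec.get("min", 512), "max": spec.get("max", 2048)},
        "ratios": ["1:1", "16:9", "9:16"],
    }

print(size_widget({"enum": ["auto", "1024*1024", "1024*1536"]}))  # dropdown
print(size_widget({"min": 512, "max": 2048}))                     # width/height
```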
When you connect image/video/audio nodes, the plugin automatically (see the sketch after the lists below):
- Detects the data type (image/video/audio)
- Converts to appropriate format (PNG/MP4/WAV)
- Uploads to WaveSpeed CDN
- Passes the URL to the API
Supported input types:
- ComfyUI IMAGE tensors
- ComfyUI VIDEO tensors
- ComfyUI AUDIO dicts
- VHS_AUDIO callables
- VideoFromFile objects
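For the image case, the tensor-to-file conversion step could look roughly like this sketch (ComfyUI IMAGE tensors are float32 [batch, height, width, channels] in the 0-1 range); the CDN upload itself is omitted because it is service-specific:

```python
# Sketch: first image of a ComfyUI IMAGE batch -> PNG bytes ready for upload.
import io

import torch
from PIL import Image

def image_tensor_to_png(tensor: torch.Tensor) -> bytes:
    frame = tensor[0]  # take the first image in the batch
    arr = (frame.clamp(0, 1) * 255).to(torch.uint8).cpu().numpy()
    buf = io.BytesIO()
    Image.fromarray(arr).save(buf, format="PNG")
    return buf.getvalue()
```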
The plugin intelligently detects output types and returns the appropriate format (see the sketch after this list):
- Images: URL or tensor (for further processing)
- Videos: URL (for preview or download)
- Audio: URL (for playback)
- 3D Models: URL with .glb/.obj format (for 3D viewer)
- Text: Plain text output
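A simple sketch of extension-based output detection; the real plugin may use richer signals (MIME types, task metadata), so treat this as illustrative only:

```python
# Sketch: classify an output URL by file extension.
from urllib.parse import urlparse

def detect_output_type(url: str) -> str:
    path = urlparse(url).path.lower()
    if path.endswith((".png", ".jpg", ".jpeg", ".webp")):
        return "image"
    if path.endswith((".mp4", ".webm", ".mov")):
        return "video"
    if path.endswith((".wav", ".mp3", ".flac")):
        return "audio"
    if path.endswith((".glb", ".obj")):
        return "3d"
    return "text"

print(detect_output_type("https://cdn.wavespeed.ai/image.png"))  # image
```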
Automatic Workflow Persistence:
- All node states saved in workflow JSON
- Model selections preserved
- Parameter values restored
- Input connections maintained
Q: How do I get an API key?
Visit WaveSpeed AI and sign up. Free tier includes:
- 100 credits per month
- Access to all models
- No credit card required
Q: Why is the first load slow?
The first load fetches the complete model list from the WaveSpeed API (5-10 seconds). Subsequent loads use cached data and are much faster.
Q: Why can't I find a specific model?
- Check if you're in the correct category tab
- Use the fuzzy search feature
- Model might be temporarily unavailable
- Check WaveSpeed AI dashboard for model status
Q: How do I handle "Task timed out" errors?
The default timeout is 30 minutes. For longer tasks:
- Check your network connection
- Try a different model
- Reduce output resolution/duration
- Contact support if issue persists
Q: Can I use local LoRA files?
No, but you can:
- Upload LoRA to Hugging Face
- Use the public URL in the plugin
- Or use WaveSpeed's built-in LoRA library
Issue: "No API key configured"
- Solution: Configure your API key via Settings → WaveSpeed
- Verify key at WaveSpeed Dashboard
Issue: "Model list not loading"
- Solution: Check your internet connection and API key validity
- First load may take 5-10 seconds
- Check ComfyUI console for error messages
Issue: "Task timeout"
- Solution: Video generation can take up to 30 minutes
- Check network connection
- Try reducing output resolution/duration
Issue: "Upload failed"
- Solution: Check file size limits and format compatibility
- Ensure your API key has sufficient credits
Issue: "VideoFromFile is not JSON serializable"
- Solution: Update to latest version (v2.0+)
- This issue has been fixed in the new architecture
Issue: "Cannot find WaveSpeed nodes"
- Restart ComfyUI completely
- Check that `custom_nodes/wavespeed-comfyui` exists
- Check the console for error messages
We welcome contributions! Here's how you can help:
- Use GitHub Issues
- Include workflow JSON
- Provide error messages
- Describe expected vs actual behavior
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests if applicable
- Submit PR with clear description
- Fix typos
- Add examples
- Translate to other languages
- Create video tutorials
This project is licensed under the MIT License - see the LICENSE file for details.
Need help? We're here for you:
- 🌐 Official Website: wavespeed.ai - Live chat support available
- 💬 Discord: Join our community
- 📚 Documentation: WaveSpeed Docs
- 🐛 Bug Reports: GitHub Issues
Our support team is ready to assist you with any questions or issues.
- Complete architecture redesign with unified Predictor node
- Support for 20+ model categories and 600+ models
- Dynamic parameter generation from model schemas
- Fuzzy search and category filtering
- Smart output detection and format conversion
- Automatic tensor upload and conversion
- Real-time progress tracking
- Support for long-running tasks (30-minute timeout)
- VideoFromFile support for video-to-video models
- Size component widget for resolution parameters
- Workflow save/restore functionality
- Array parameter expansion (images, loras, etc.)
- Object array support (bbox_condition, etc.)
Made with ❤️ by the WaveSpeed Team