
🚀 ComfyUI-WaveSpeed: Universal AI Generation Plugin



✨ Features

🎭 Multi-Modal Generation

Support for 20+ AI generation categories including:

  • Text to Image - Generate images from text prompts
  • Image to Image - Transform and edit existing images
  • Text to Video - Create videos from text descriptions
  • Video to Video - Transform and enhance videos
  • Text to Audio - Generate audio from text
  • Image to 3D - Convert images to 3D models
  • And more...

🎯 Unified Interface

  • Single Predictor Node - Access 600+ AI models through one node
  • Dynamic Parameters - Model-specific parameters auto-configure
  • Fuzzy Search - Quickly find models by name or category
  • Category Tabs - Browse models by type

⚡ Performance & UX

  • Smart Caching - Fast model loading after first use (first load: 5-10 seconds)
  • Progress Indicators - Real-time generation progress
  • Workflow Support - Save and restore complete workflows
  • Connection Flexibility - Connect any compatible input/output

📦 Installation

Prerequisites

💡 Built on: ComfyUI v0.8.2 (Frontend v1.35.9)

Installation Steps

Step 1: Clone the plugin

cd ComfyUI/custom_nodes
git clone https://github.com/WaveSpeedAI/wavespeed-comfyui.git
cd wavespeed-comfyui

Step 2: Find ComfyUI's Python path

Start ComfyUI and check the console output for:

** Python executable: D:\Projects\ComfyUI\.venv\Scripts\python.exe

Copy that path.

Step 3: Install dependencies

Use the Python path from Step 2:

# Windows example:
D:\Projects\ComfyUI\.venv\Scripts\python.exe -m pip install -r requirements.txt

# Linux/Mac example:
/home/user/ComfyUI/.venv/bin/python -m pip install -r requirements.txt

Step 4: Restart ComfyUI

Close and restart ComfyUI. You should see WaveSpeed nodes available.

⚠️ Why this matters: If you just run pip install -r requirements.txt, it might install to the wrong Python environment, causing ModuleNotFoundError.


Configuration

  1. Get your API key from WaveSpeed AI
  2. In ComfyUI: Settings → WaveSpeed → Enter API Key
  3. Or create config.json in plugin directory:
{
  "api_key": "your_api_key_here"
}
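If you prefer to script this step, the same config.json can be written with a few lines of Python. A minimal sketch; the plugin path below is an example and depends on where you cloned the plugin:

```python
import json
from pathlib import Path

# Example location -- adjust to your actual ComfyUI install.
plugin_dir = Path("ComfyUI/custom_nodes/wavespeed-comfyui")
plugin_dir.mkdir(parents=True, exist_ok=True)

# Same structure as the config.json shown above.
config = {"api_key": "your_api_key_here"}
(plugin_dir / "config.json").write_text(json.dumps(config, indent=2))
```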

⏱️ First Load Notice: The first load fetches 600+ models across 25 categories, which takes roughly 2-4 minutes depending on your network. Subsequent loads use cached data and are instant.


🎨 Usage Examples

Example 1: Text-to-Image Generation

📸 High-Quality Image Generation

Use Case: Generate professional-quality images from text descriptions

Models: Flux Dev, SDXL, Ghibli Style, etc.

Key Features:

  • Multiple aspect ratios (1:1, 16:9, 9:16, etc.)
  • Resolution control (512px - 2048px)
  • Seed control for reproducibility
  • Negative prompts support

Result:

📥 Download Workflow JSON


Example 2: Text-to-Video Generation

🎬 Create Videos from Text

Use Case: Generate short videos from text prompts

Models: Kling v1.6, Minimax Video, Wan2.1, etc.

Key Features:

  • Duration control (2-10 seconds)
  • Resolution options (480p, 720p, 1080p)
  • Camera movement control
  • Audio generation option

Note: Video generation may take several minutes. The plugin supports up to 30-minute timeout for long-running tasks.

Result:

case2.mp4

📥 Download Workflow JSON


Example 3: Image-to-Image Transformation

🎨 Transform and Edit Images

Use Case: Transform existing images with AI

Models: Flux Redux, Instant Character, Step1X Edit, etc.

Key Features:

  • Style transfer
  • Image editing with prompts
  • Reference image support
  • Strength control

Result:

📥 Download Workflow JSON


Example 4: Image-to-Video Animation

🎬 Animate Static Images

Use Case: Bring static images to life with motion

Models: Stable Video Diffusion, I2VGen-XL, etc.

Key Features:

  • Motion generation from single image
  • Camera movement control
  • Duration control
  • Smooth animation

Result:

📹 Download Video | 📥 Download Workflow JSON


Example 5: Video-to-Video Enhancement

🎞️ Enhance and Transform Videos

Use Case: Upscale, stylize, or transform videos

Models: Seedance v1.5, Real-ESRGAN, etc.

Key Features:

  • Video upscaling (2x, 4x)
  • Style transformation
  • Motion preservation
  • Frame interpolation

Result:

📹 Download Video | 📥 Download Workflow JSON


Example 6: Text-to-Audio Generation

🎵 Generate Audio from Text

Use Case: Generate music, sound effects, or voice from text

Models: Dia TTS, MMAudio V2, etc.

Key Features:

  • Voice synthesis
  • Music generation
  • Sound effect creation
  • Multiple voice options

Result:

case6.mp3

📥 Download Workflow JSON


Example 7: Image-to-3D Model

🎲 Convert Images to 3D

Use Case: Generate 3D models from 2D images

Models: Hunyuan 3D V2, etc.

Key Features:

  • Multi-view generation
  • GLB format export
  • Texture mapping
  • Mesh optimization

Result:

📦 Download 3D Model (.glb) | 📥 Download Workflow JSON


Example 8: Complex Multi-Stage Pipeline

🔗 Advanced ComfyUI Integration

Use Case: Demonstrate seamless integration with ComfyUI native nodes and complex multi-stage pipelines

Workflow Highlights:

  • Multiple WaveSpeed Nodes - Chain multiple AI generation steps (T2I → I2I → I2V)
  • Native Node Integration - Mix with ComfyUI's Load Image, Save Image, Preview Image nodes
  • Flexible Data Flow - Pass IMAGE/VIDEO tensors between nodes seamlessly
  • Real-world Pipeline - Text→Image → Image Enhancement → Image→Video

Pipeline Stages:

  1. Stage 1: Text-to-Image (WaveSpeed Predictor #1)

    • Generate base image from text prompt
    • Model: Text-to-Image model (e.g., FLUX)
    • Output: ComfyUI IMAGE tensor
    • Native Integration: Output connects directly to ComfyUI Preview Image node
  2. Stage 2: Image-to-Image Enhancement (WaveSpeed Predictor #2)

    • Enhance and refine the generated image
    • Model: Image-to-Image model (e.g., Flux Redux)
    • Input: IMAGE tensor from Stage 1 (via native ComfyUI connection)
    • Output: Enhanced IMAGE tensor
    • Native Integration: Seamlessly receives IMAGE from previous WaveSpeed node
  3. Stage 3: Image-to-Video Animation (WaveSpeed Predictor #3)

    • Animate the enhanced image into video
    • Model: Image-to-Video model (e.g., Stable Video Diffusion)
    • Input: Enhanced IMAGE tensor from Stage 2
    • Output: VIDEO URL
    • Native Integration: Works with ComfyUI's video preview nodes
  4. Stage 4: Preview & Save (Native ComfyUI Nodes)

    • Use ComfyUI's Preview Image nodes to view intermediate results
    • Use Save Image nodes to export final outputs
    • All connections work exactly like native ComfyUI nodes

Key Integration Features:

✅ Tensor Compatibility

  • WaveSpeed nodes accept ComfyUI IMAGE/VIDEO/AUDIO tensors directly
  • No manual conversion needed - just connect and run
  • Works with any image/video processing node in ComfyUI ecosystem

✅ Output Flexibility

  • Outputs can connect to any compatible node
  • Support for Preview Image, Save Image, Video Preview, etc.
  • Chain multiple WaveSpeed nodes together seamlessly

✅ Workflow Persistence

  • All connections saved in workflow JSON
  • Model selections preserved across sessions
  • Parameter values restored on load
  • Full workflow portability

✅ Native ComfyUI Features

  • Works with node groups and reroute nodes
  • Compatible with workflow templates
  • Supports ComfyUI's execution queue
  • Full undo/redo support
  • Drag-and-drop connections

Results:

Stage 1 & 2 Outputs (Text-to-Image → Image-to-Image):


Stage 1: Generated Image

Stage 2: Enhanced Image

Stage 3 Output (Image-to-Video):

case8.mp4

📥 Download Workflow JSON


🔧 Core Nodes

WaveSpeedAI Predictor ⚡

Input:

  • Dynamic parameters - Auto-generated based on selected model
    • Common parameters: prompt, negative_prompt, seed, resolution, etc.
    • Media inputs: image, video, audio (accepts ComfyUI IMAGE/VIDEO/AUDIO tensors or URLs)
    • All parameters can be set via UI widgets or connected from other nodes

Output:

  • output (ANY) - URL string or list of URLs:
    • Single output → URL string (e.g., "https://cdn.wavespeed.ai/image.png")
    • Multiple outputs → List of URLs (e.g., ["url1", "url2"])
    • 3D model tasks → List containing preview images + 3D model URL

Note: Output is URL format, not tensor. Use WaveSpeedAI Preview node to convert to IMAGE/VIDEO tensors for further processing.
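Because the predictor result is either a bare URL string or a list, custom downstream code usually normalizes it to a list first. A minimal sketch (normalize_output is a hypothetical helper, not part of the plugin API):

```python
def normalize_output(output):
    """Return a predictor result as a list of URL strings.

    Single-output tasks produce a bare string; multi-output and
    3D-model tasks produce a list.
    """
    if isinstance(output, str):
        return [output]
    return list(output)

print(normalize_output("https://cdn.wavespeed.ai/image.png"))
print(normalize_output(["url1", "url2"]))
```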


WaveSpeedAI Preview 👁️

Input:

  • input_url (ANY) - Accepts URL string, list of URLs, or text content from WaveSpeed Predictor

Output:

  • image (IMAGE) - ComfyUI IMAGE tensor (if input is image URL)
  • video (VIDEO) - ComfyUI VIDEO tensor (if input is video URL)

Note: Automatically detects input type and converts URLs to tensors. For 3D model tasks, only displays preview without tensor conversion.


📚 Advanced Features

Dynamic Parameter System

The plugin automatically generates input fields based on each model's schema. Parameters can be:

  • Set via UI widgets (text fields, dropdowns, sliders)
  • Connected from other nodes (images, videos, audio, numbers, text)
  • Mixed (UI defaults + node connections)

Array Parameters

For models that accept multiple inputs (e.g., multiple reference images):

Example: Multiple Reference Images

image_0 → First reference
image_1 → Second reference
image_2 → Third reference

The plugin automatically:

  • Expands array parameters to individual inputs
  • Limits to API-defined maxItems (typically 5)
  • Merges back to array format for API submission
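The expand/merge round trip above can be sketched roughly as follows. This is an illustration, not the plugin's actual internals; the function names are hypothetical:

```python
def expand_array_param(name, max_items=5):
    """Expand an array parameter into individual input names."""
    return [f"{name}_{i}" for i in range(max_items)]

def merge_array_param(name, values, max_items=5):
    """Collect name_0..name_N back into a list for API submission,
    skipping unset slots and enforcing the API's maxItems limit."""
    merged = []
    for key in expand_array_param(name, max_items):
        value = values.get(key)
        if value is not None:
            merged.append(value)
    return merged

inputs = {"image_0": "ref_a.png", "image_1": "ref_b.png", "image_2": None}
print(merge_array_param("image", inputs))  # ['ref_a.png', 'ref_b.png']
```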

Size Parameter Handling

The plugin intelligently handles size parameters:

Enum Size (Dropdown):

  • Models like Seedance provide fixed size options
  • Example: "auto", "1024x1024", "1024x1536"
  • UI shows dropdown selector

Range Size (Width/Height):

  • Models like Flux allow custom dimensions
  • UI shows separate width/height inputs
  • Ratio buttons for quick selection (1:1, 16:9, 9:16, etc.)
  • Each component can be connected independently
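Conceptually, a ratio button just picks a width/height pair at some base resolution. A rough sketch of that mapping; snapping both dimensions to a multiple of 64 is an assumption (common for diffusion models, but check each model's schema), and the function is illustrative:

```python
RATIOS = {"1:1": (1, 1), "16:9": (16, 9), "9:16": (9, 16)}

def dims_for_ratio(ratio, long_edge=1024, multiple=64):
    """Map an aspect-ratio choice to a (width, height) pair,
    snapping both dimensions down to a multiple of `multiple`."""
    rw, rh = RATIOS[ratio]
    if rw >= rh:
        width, height = long_edge, long_edge * rh // rw
    else:
        width, height = long_edge * rw // rh, long_edge
    return (width // multiple * multiple, height // multiple * multiple)

print(dims_for_ratio("16:9"))  # (1024, 576)
print(dims_for_ratio("9:16"))  # (576, 1024)
```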

Tensor Upload

When you connect image/video/audio nodes, the plugin automatically:

  1. Detects the data type (image/video/audio)
  2. Converts to appropriate format (PNG/MP4/WAV)
  3. Uploads to WaveSpeed CDN
  4. Passes the URL to the API

Supported input types:

  • ComfyUI IMAGE tensors
  • ComfyUI VIDEO tensors
  • ComfyUI AUDIO dicts
  • VHS_AUDIO callables
  • VideoFromFile objects
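The detect → convert → upload sequence boils down to mapping each detected input kind to a container format before pushing it to the CDN. A simplified sketch of the dispatch; the mapping mirrors the PNG/MP4/WAV formats listed above, and the function name is illustrative:

```python
UPLOAD_FORMATS = {
    "image": ("png", "image/png"),
    "video": ("mp4", "video/mp4"),
    "audio": ("wav", "audio/wav"),
}

def upload_target(data_type):
    """Return (file extension, MIME type) for a detected input kind."""
    if data_type not in UPLOAD_FORMATS:
        raise ValueError(f"unsupported input type: {data_type!r}")
    return UPLOAD_FORMATS[data_type]

print(upload_target("image"))  # ('png', 'image/png')
```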

Smart Output Detection

The plugin intelligently detects output types and returns the appropriate format:

  • Images: URL or tensor (for further processing)
  • Videos: URL (for preview or download)
  • Audio: URL (for playback)
  • 3D Models: URL with .glb/.obj format (for 3D viewer)
  • Text: Plain text output
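One way such detection can work is by inspecting the result URL's file extension. A minimal sketch, not the plugin's actual logic:

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

EXTENSION_TYPES = {
    ".png": "image", ".jpg": "image", ".jpeg": "image", ".webp": "image",
    ".mp4": "video", ".webm": "video",
    ".mp3": "audio", ".wav": "audio",
    ".glb": "3d", ".obj": "3d",
}

def classify_output(url):
    """Guess the output kind from a result URL's file extension;
    anything unrecognized is treated as plain text."""
    suffix = PurePosixPath(urlparse(url).path).suffix.lower()
    return EXTENSION_TYPES.get(suffix, "text")

print(classify_output("https://cdn.wavespeed.ai/image.png"))  # image
print(classify_output("https://cdn.wavespeed.ai/model.glb"))  # 3d
```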

Workflow Save/Restore

Automatic Workflow Persistence:

  • All node states saved in workflow JSON
  • Model selections preserved
  • Parameter values restored
  • Input connections maintained

❓ FAQ

Q: How do I get an API key?

Visit WaveSpeed AI and sign up. Free tier includes:

  • 100 credits per month
  • Access to all models
  • No credit card required

Q: Why is the first load slow?

The first load fetches the complete model list from WaveSpeed API (5-10 seconds). Subsequent loads use cached data and are much faster.

Q: Why can't I find a specific model?

  1. Check if you're in the correct category tab
  2. Use the fuzzy search feature
  3. Model might be temporarily unavailable
  4. Check WaveSpeed AI dashboard for model status

Q: How do I handle "Task timed out" errors?

The default timeout is 30 minutes. If a task still times out:

  1. Check your network connection
  2. Try a different model
  3. Reduce output resolution/duration
  4. Contact support if the issue persists

Q: Can I use local LoRA files?

No, but you can:

  1. Upload LoRA to Hugging Face
  2. Use the public URL in the plugin
  3. Or use WaveSpeed's built-in LoRA library

πŸ› οΈ Troubleshooting

Common Issues

Issue: "No API key configured"

  • Solution: Configure your API key using Settings → WaveSpeed
  • Verify key at WaveSpeed Dashboard

Issue: "Model list not loading"

  • Solution: Check your internet connection and API key validity
  • First load may take 5-10 seconds
  • Check ComfyUI console for error messages

Issue: "Task timeout"

  • Solution: Video generation can take up to 30 minutes
  • Check network connection
  • Try reducing output resolution/duration

Issue: "Upload failed"

  • Solution: Check file size limits and format compatibility
  • Ensure your API key has sufficient credits

Issue: "VideoFromFile is not JSON serializable"

  • Solution: Update to latest version (v2.0+)
  • This issue has been fixed in the new architecture

Issue: "Cannot find WaveSpeed nodes"

  • Restart ComfyUI completely
  • Check custom_nodes/wavespeed-comfyui exists
  • Check console for error messages

🤝 Contributing

We welcome contributions! Here's how you can help:

Report Issues

  • Use GitHub Issues
  • Include workflow JSON
  • Provide error messages
  • Describe expected vs actual behavior

Submit Pull Requests

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Add tests if applicable
  5. Submit PR with clear description

Improve Documentation

  • Fix typos
  • Add examples
  • Translate to other languages
  • Create video tutorials

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.


📞 Support

Need help? Our support team is ready to assist you with any questions or issues.


📊 Changelog

v2.0.0

  • Complete architecture redesign with unified Predictor node
  • Support for 20+ model categories and 600+ models
  • Dynamic parameter generation from model schemas
  • Fuzzy search and category filtering
  • Smart output detection and format conversion
  • Automatic tensor upload and conversion
  • Real-time progress tracking
  • Support for long-running tasks (30-minute timeout)
  • VideoFromFile support for video-to-video models
  • Size component widget for resolution parameters
  • Workflow save/restore functionality
  • Array parameter expansion (images, loras, etc.)
  • Object array support (bbox_condition, etc.)

Made with ❤️ by the WaveSpeed Team

⭐ Star us on GitHub | 🌐 Visit WaveSpeed.ai
