Thursday, February 27, 2025

How to Use WAN 2.1 with Comfy UI on Mac, Windows, and Linux: A Comprehensive Guide

On February 25, 2025, Alibaba Cloud stirred the industry by open-sourcing Wan 2.1, an advanced AI video generation model from the acclaimed Tongyi series. This innovative model transforms text prompts into visually impressive videos, handling intricate movements and spatial details with ease. With a standout VBench score of 84.7%, multilingual support, and free access, Wan 2.1 is already a strong contender in a field that includes OpenAI’s Sora, Minimax, Kling from Kuaishou, and Google’s Veo 2.

If you’d rather bypass the setup hassle and start generating videos right away, check out Anakin AI—an all-in-one AI platform that makes using Wan 2.1 a breeze. Otherwise, this guide will walk you through how to use WAN 2.1 with Comfy UI on Mac, Windows, and Linux, covering installation, configuration, and advanced video generation techniques. Enjoy exploring the future of AI video creation!

Introduction and System Preparations

When you're ready to dive into how to use WAN 2.1 with Comfy UI, the first step is to ensure your system meets the necessary hardware and software requirements. Trust me—starting with a strong foundation makes the whole process a lot smoother.

Hardware Specifications

  • Minimum:
      • GPU: NVIDIA GTX 1080 (8GB VRAM) or Apple M1
      • RAM: 16GB DDR4
      • Storage: 15GB of SSD space for models and dependencies
  • Recommended:
      • GPU: NVIDIA RTX 4090 (24GB VRAM) or Apple M3 Max
      • RAM: 32GB DDR5
      • Storage: NVMe SSD with at least 50GB of free capacity

Software Dependencies

  • Python: Versions 3.10 to 3.11 (3.11.6 works best for Apple Silicon)
  • PyTorch: Version 2.2+ with CUDA 12.1 (for Windows/Linux) or Metal support (for macOS)
  • FFmpeg: Version 6.1 for video encoding/decoding
  • Drivers: NVIDIA Studio Drivers 550+ for Windows/Linux
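
Before going further, it's worth confirming these pieces are in place. A quick check like the following works on all three platforms (it uses only PyTorch's standard version and device queries):

import sys
import torch

# Confirm the interpreter is in the supported 3.10 to 3.11 range.
print(f"Python: {sys.version_info.major}.{sys.version_info.minor}")

# Confirm the PyTorch build and which accelerator it can see.
print(f"PyTorch: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")         # Windows/Linux (NVIDIA)
print(f"MPS available: {torch.backends.mps.is_available()}")  # macOS (Apple Silicon)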

Installing ComfyUI on Different Platforms

Follow these detailed steps to set up ComfyUI, a crucial part of how to use WAN 2.1 with Comfy UI.

Windows Installation

Method A: ComfyUI Desktop (Official Beta)

  1. Download: Get the ComfyUI_Desktop_Windows_0.9.3b.exe from comfyui.org/downloads.
  2. Run Installer: Execute the installer and ensure NVIDIA GPU acceleration is enabled.
  3. Verification: Open a command prompt and run:
.\run_nvidia_gpu.bat

This quick check confirms that everything’s set up properly.

Method B: Manual Build

  1. Clone the Repository:
git clone https://github.com/comfyanonymous/ComfyUI
cd ComfyUI

  2. Set Up a Virtual Environment:

python -m venv venv
venv\Scripts\activate

  3. Install PyTorch with CUDA 12.1 support:

pip install torch==2.2.0 torchvision --index-url https://download.pytorch.org/whl/cu121

  4. Install the Requirements:

pip install -r requirements.txt

macOS Installation (M1/M2/M3)

  1. Install Homebrew (if needed):
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

  2. Install Python and FFmpeg:

brew install python@3.11 ffmpeg

  3. Clone and Set Up ComfyUI:

git clone https://github.com/comfyanonymous/ComfyUI
cd ComfyUI
python3.11 -m pip install --pre torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/nightly/cpu
python3.11 -m pip install -r requirements.txt

Linux Installation (Native/WSL2)

For WSL2:

  1. Install WSL2 with Ubuntu 22.04:
wsl --install -d Ubuntu-22.04

  2. Update and Upgrade:

sudo apt update && sudo apt full-upgrade -y

Deploying ComfyUI:

  1. Clone the Repository:
git clone https://github.com/comfyanonymous/ComfyUI

  2. Set Up a Conda Environment (Recommended):

conda create -n comfy python=3.10
conda activate comfy

  3. Install PyTorch with CUDA 12.1 support:

pip install torch==2.2.0 torchvision --index-url https://download.pytorch.org/whl/cu121

  4. Install the Requirements:

pip install -r requirements.txt

Integrating the WAN 2.1 Model

With ComfyUI up and running, the next step in how to use WAN 2.1 with Comfy UI is integrating the WAN 2.1 model.

Model Acquisition and Setup

  • Download Weights:
      • wan_2.1_base.safetensors (approx. 8.4GB)
      • wan_2.1_vae.pth (approx. 1.2GB)
    Download these files using your favorite method (for instance, wget); a short Python sketch follows this list.
  • File Placement:
      • Place wan_2.1_base.safetensors in ComfyUI/models/checkpoints/
      • Place wan_2.1_vae.pth in ComfyUI/models/vae/
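
Here's a minimal download sketch in Python. Note that the URLs below are placeholders, not real download links; substitute the actual locations of the WAN 2.1 weights:

import urllib.request
from pathlib import Path

# Placeholder URLs: replace with the real locations of the WAN 2.1 weights.
WEIGHTS = {
    "models/checkpoints/wan_2.1_base.safetensors": "https://example.com/wan_2.1_base.safetensors",
    "models/vae/wan_2.1_vae.pth": "https://example.com/wan_2.1_vae.pth",
}

comfy_root = Path("ComfyUI")
for rel_path, url in WEIGHTS.items():
    dest = comfy_root / rel_path
    dest.parent.mkdir(parents=True, exist_ok=True)  # creates models/checkpoints, models/vae
    print(f"Downloading {url} -> {dest}")
    urllib.request.urlretrieve(url, dest)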

Custom Nodes Installation

Enhance your workflow by installing custom nodes:

  1. Navigate to the Custom Nodes Directory:
cd ComfyUI/custom_nodes
  2. Clone Essential Extensions:
git clone https://github.com/WASasquatch/was-node-suite-comfyui
git clone https://github.com/Kosinkadink/ComfyUI-VideoHelperSuite

These nodes provide handy features like video frame interpolation and batch processing.


Configuring Your Workflow for WAN 2.1

Building the right pipeline is key when learning how to use WAN 2.1 with Comfy UI.

Setting Up the Text-to-Video Pipeline

Here’s a simplified pipeline structure:

  • Load Checkpoint Node: Loads your WAN 2.1 model weights.
  • CLIPTextEncode Node: Converts text prompts (e.g., “A cybernetic dragon soaring through nebula clouds”) into conditioning data.
  • WANSampler Node: Samples the latent space with parameters such as:
      • Resolution: 1024×576
      • Frames: 48 (adjustable based on needs)
      • Motion Scale: typically between 1.2 and 2.5 for smooth transitions
  • VAEDecode Node: Decodes the latent data into the final video output.
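
To make this concrete, here is a hedged sketch of the same graph expressed in ComfyUI's API format and submitted over its local HTTP endpoint (a POST to /prompt on the default port 8188). The WANSampler class name and its input names follow the description above and may differ in the actual WAN 2.1 node pack, so verify them against the nodes you installed:

import json
import urllib.request

# A minimal text-to-video graph mirroring the pipeline above. The WANSampler
# node and input names are assumptions based on this article; verify locally.
workflow = {
    "1": {"class_type": "CheckpointLoaderSimple",
          "inputs": {"ckpt_name": "wan_2.1_base.safetensors"}},
    "2": {"class_type": "CLIPTextEncode",
          "inputs": {"text": "A cybernetic dragon soaring through nebula clouds",
                     "clip": ["1", 1]}},
    "3": {"class_type": "WANSampler",
          "inputs": {"model": ["1", 0], "conditioning": ["2", 0],
                     "width": 1024, "height": 576,
                     "frames": 48, "motion_scale": 1.8}},
    "4": {"class_type": "VAEDecode",
          "inputs": {"samples": ["3", 0], "vae": ["1", 2]}},
}

request = urllib.request.Request(
    "http://127.0.0.1:8188/prompt",
    data=json.dumps({"prompt": workflow}).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
print(urllib.request.urlopen(request).read().decode())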

Parameter Tweaks & Optimization

  • Motion Scale: Many users prefer around 1.8 to balance smooth transitions with consistency.
  • Temporal Attention: Aim for settings between 0.85 and 0.97 to maintain long-range motion stability.
  • Noise Schedule & Frame Interpolation: Options like Karras and FilmNet help reduce unwanted artifacts.
  • Hybrid Inputs: Combine reference images and depth maps to enhance style transfer and introduce a 3D effect.

Advanced Video Generation Techniques

Take your projects further with these advanced tips:

Multi-Image Referencing

  • Style Transfer: Use multiple reference images to alter the art style.
  • Depth Map Conditioning: Incorporate depth maps to create a pseudo-3D feel.
  • ControlNet & Pose Estimation: Direct the model using human poses or object positioning for more refined outputs.

Camera Motion Simulation

Simulate dynamic camera movements with the CameraController node:

  • Orbit Speed: e.g., 0.12
  • Dolly Zoom: e.g., -0.05
  • Roll Variance: e.g., 2.7
    These adjustments give your videos that cinematic flair.
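
As a rough sketch, those values could map onto a node entry like the one below; the input names are assumptions based on the list above, so check the node's actual sockets in your installation:

# Hypothetical CameraController node entry using the example values above.
camera_node = {
    "class_type": "CameraController",
    "inputs": {
        "orbit_speed": 0.12,    # slow orbital pan around the subject
        "dolly_zoom": -0.05,    # gentle pull-back for a vertigo effect
        "roll_variance": 2.7,   # slight roll jitter for a handheld feel
    },
}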

Performance Optimization & Troubleshooting

VRAM Management Techniques

Keep your system running efficiently:

  • Frame Caching: Enable by setting enable_offload_technique = True and opting for aggressive VRAM optimization.
  • Mixed Precision: Boost performance using:
torch.set_float32_matmul_precision('medium')
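
As a minimal, self-contained sketch of what that looks like in practice (the model here is a stand-in for the real pipeline):

import torch

# Allow reduced-precision float32 matmuls: a speed/precision trade-off.
torch.set_float32_matmul_precision('medium')

device = "cuda" if torch.cuda.is_available() else "cpu"
model = torch.nn.Linear(64, 64).to(device)   # stand-in for the WAN 2.1 pipeline
latents = torch.randn(1, 64, device=device)

# Run inference under autocast: fp16 on GPU, bf16 on CPU.
dtype = torch.float16 if device == "cuda" else torch.bfloat16
with torch.inference_mode(), torch.autocast(device_type=device, dtype=dtype):
    output = model(latents)
print(output.shape)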

Troubleshooting Common Issues

  • Black Frame Output: Verify that your VAE file (wan_2.1_vae.pth) matches your model version and check your temporal attention settings.
  • VRAM Overflow: Launch ComfyUI with a reduced-memory flag such as --lowvram (xformers acceleration is picked up automatically when the xformers package is installed).
  • Log Analysis: Inspect comfy.log for any ERROR or CRITICAL messages to quickly pinpoint problems.
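
For that last check, a small script can surface only the lines that matter (the log path is an assumption; point it at wherever your installation writes comfy.log):

from pathlib import Path

# Print only ERROR/CRITICAL lines from ComfyUI's log.
log_path = Path("ComfyUI/comfy.log")  # adjust to your installation's log location
for line in log_path.read_text(encoding="utf-8", errors="replace").splitlines():
    if "ERROR" in line or "CRITICAL" in line:
        print(line)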

Platform-Specific Installation Differences

Here’s a quick rundown on the main differences between installing ComfyUI on Windows, macOS, and Linux—important to understand when figuring out how to use WAN 2.1 with Comfy UI:

Windows

  • Traditional Method:
  • Involves a portable ZIP extraction, manual Python environment setup, and batch file execution (like running run_nvidia_gpu.bat).
  • Requires a separate 7‑Zip installation and manual configuration of the CUDA toolkit.
  • V1 Desktop App:
  • A one-click installer (about 200MB bundled package) that automates dependency resolution and setup.

macOS

  • Traditional Method:
  • Uses Homebrew for installing core packages and requires manual Python/MPS configuration.
  • Launches via Terminal, and Python 3.11+ is mandatory for optimizing on Apple Silicon.
  • V1 Desktop App:
  • Comes as a universal .dmg package with an integrated Python environment, significantly simplifying installation.

Linux

  • Traditional Method:
  • Relies on terminal-based cloning, conda or pip management, and manual installation of NVIDIA/AMD drivers.
  • May need additional tweaks for AppArmor/SELinux policies.
  • V1 Desktop App:
  • Offers code-signed binaries (via AppImage/DEB packages) that streamline dependency management and updates.

The V1 Desktop App dramatically cuts down on installation headaches by providing automatic dependency resolution and unified model libraries across all platforms.


Final Thoughts

In summary, this guide has walked you through how to use WAN 2.1 with Comfy UI—from getting your system ready to diving into advanced video generation techniques. No matter if you're on Windows, macOS, or Linux, you're now equipped to set up, customize, and optimize your AI video workflow like a pro.

So, grab your system, give it a spin, and enjoy the creative ride. Happy video making, and here’s to pushing your projects to new heights!


