ComfyUI Models Documentation

Comprehensive guide to installed AI models for image and video generation

23GB Total Models · 8.9GB Checkpoints · 8.7GB ControlNet · 655GB Available Space

📊 Storage Overview

Location: /data/comfyui/models/
Total Storage: 23GB currently used, 655GB available

Distribution:

├── checkpoints/         8.9GB - Base generation models
├── controlnet/          8.7GB - Control & guidance models
├── clip_vision/         2.4GB - Vision encoders
├── animatediff_models/  1.7GB - Video generation
├── facerestore_models/  692MB - Face enhancement
├── vae/                 639MB - VAE models
└── upscale_models/       82MB - Image upscaling

🎨 Checkpoint Models (Base Generation)

SDXL Base 1.0 6.5GB

High-quality text-to-image generation, successor to SD 1.5

Purpose:

  • General purpose image generation
  • High resolution (1024x1024 native)
  • Better text rendering and composition

Best For:

Character Design · Game Assets · Concept Art · Textures
Use Case: Creating game character concept art, environmental textures, UI elements

Tips:

  • Works best at 1024x1024, can generate up to 2048x2048
  • Requires ~10GB VRAM for optimal performance
  • Use with ControlNet for precise control
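Generation with these checkpoints can also be scripted against ComfyUI's HTTP API instead of the graph editor. Below is a minimal sketch of a text-to-image request in the API's workflow format; the node class names follow ComfyUI's stock text-to-image graph, while the server address, checkpoint filename, and sampler settings are assumptions to adapt to your install.

```python
import json
import urllib.request

def build_sdxl_prompt(text, width=1024, height=1024, seed=42):
    """Build a minimal text-to-image graph in ComfyUI's API format.

    Node class names follow the stock text-to-image workflow; the
    checkpoint filename is an assumption, match it to your install.
    """
    return {
        "1": {"class_type": "CheckpointLoaderSimple",
              "inputs": {"ckpt_name": "sd_xl_base_1.0.safetensors"}},
        "2": {"class_type": "CLIPTextEncode",          # positive prompt
              "inputs": {"text": text, "clip": ["1", 1]}},
        "3": {"class_type": "CLIPTextEncode",          # negative prompt
              "inputs": {"text": "blurry, low quality", "clip": ["1", 1]}},
        "4": {"class_type": "EmptyLatentImage",
              "inputs": {"width": width, "height": height, "batch_size": 1}},
        "5": {"class_type": "KSampler",
              "inputs": {"model": ["1", 0], "positive": ["2", 0],
                         "negative": ["3", 0], "latent_image": ["4", 0],
                         "seed": seed, "steps": 25, "cfg": 7.0,
                         "sampler_name": "euler", "scheduler": "normal",
                         "denoise": 1.0}},
        "6": {"class_type": "VAEDecode",
              "inputs": {"samples": ["5", 0], "vae": ["1", 2]}},
        "7": {"class_type": "SaveImage",
              "inputs": {"images": ["6", 0], "filename_prefix": "sdxl"}},
    }

def queue_prompt(workflow, host="127.0.0.1:8188"):
    """POST the graph to a running ComfyUI server (not called here)."""
    data = json.dumps({"prompt": workflow}).encode("utf-8")
    req = urllib.request.Request("http://" + host + "/prompt", data=data,
                                 headers={"Content-Type": "application/json"})
    return urllib.request.urlopen(req).read()

workflow = build_sdxl_prompt("concept art of a sci-fi game character")
print(len(workflow))  # 7 nodes
```

Each `["1", 1]` pair wires a node input to (source node id, output index); for example, output 1 of the checkpoint loader is its CLIP model.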

Stable Video Diffusion XT 2.5GB

Image-to-video generation for creating animations

Purpose:

  • Convert static images to short videos (14-25 frames)
  • Camera motion and object animation
  • Short cinematic clips from a single keyframe

Best For:

Character Animation · Product Demos · Cinematics
Use Case: Animate game cutscenes, create character intro videos, product showcase animations

Tips:

  • Start with high-quality still images (SDXL-generated images work great)
  • Keep motion subtle for best results
  • Can generate 576x1024 resolution videos
  • Requires ~16GB VRAM

🎬 Video Generation Models

AnimateDiff v2 1.7GB

Motion module for creating animated sequences from SD 1.5 models

Purpose:

  • Add motion to Stable Diffusion 1.5 outputs
  • Create character animations
  • Generate looping animations

Best For:

Character Walk Cycles · Action Sequences · Sprite Animations
Use Case: Create 2D game character animations, sprite sheets, animated backgrounds

Tips:

  • Works with SD 1.5 checkpoints only (not SDXL)
  • Combine with LoRAs for style consistency
  • Can generate 16-frame sequences
  • Motion LoRAs can enhance specific movements
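The sprite-sheet use case above comes down to packing the 16 generated frames into a grid. A small sketch of the layout math (the 512x512 frame size and 4x4 grid are assumptions):

```python
def sprite_rects(frame_w, frame_h, n_frames, columns):
    """Return (x, y, w, h) pixel rectangles for each frame laid out
    row-major in a sprite sheet, the order a 2D engine reads them back."""
    rects = []
    for i in range(n_frames):
        row, col = divmod(i, columns)
        rects.append((col * frame_w, row * frame_h, frame_w, frame_h))
    return rects

# A 16-frame AnimateDiff sequence at 512x512 fills a 4x4, 2048x2048 sheet
rects = sprite_rects(512, 512, 16, 4)
print(rects[0])   # (0, 0, 512, 512)        first frame, top-left
print(rects[15])  # (1536, 1536, 512, 512)  last frame, bottom-right
```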

🎮 ControlNet Models (Precision Control)

ControlNet Depth (SD 1.5) ~1.5GB

⭐ CRITICAL for RealSense D435i integration - converts depth maps to images

Purpose:

  • Uses depth information to control generation
  • Primary use: RealSense D435i depth camera → 3D character creation
  • Maintain spatial relationships and 3D structure

Best For:

RealSense Scanning · 3D Character Design · Environment Layouts

🎯 RealSense Character Creation Workflow:

  1. Capture depth map with RealSense D435i camera
  2. Feed depth map into ComfyUI
  3. Use Depth ControlNet to generate character maintaining 3D structure
  4. Refine with inpainting or img2img
  5. Animate with AnimateDiff or SVD
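Step 2 above usually involves converting the D435i's raw 16-bit depth frame (millimeters) into the 8-bit inverted grayscale image depth ControlNets expect, where near objects are bright. A minimal numpy sketch with a synthetic frame; the near/far clip range is an assumption to tune per scene.

```python
import numpy as np

def depth_to_controlnet(depth_mm, near_mm=300, far_mm=3000):
    """Convert a raw 16-bit RealSense depth frame (millimeters) into the
    8-bit inverted depth map ControlNet expects: near = bright,
    far = dark, invalid (zero-depth) pixels = black."""
    d = depth_mm.astype(np.float32)
    valid = d > 0                  # the D435i reports 0 where depth is unknown
    d = np.clip(d, near_mm, far_mm)
    norm = (far_mm - d) / (far_mm - near_mm)   # invert so near -> 1.0
    out = (norm * 255).astype(np.uint8)
    out[~valid] = 0
    return out

# Synthetic frame: a "near" object (500mm) in front of a "far" wall (2500mm)
frame = np.full((480, 640), 2500, dtype=np.uint16)
frame[200:280, 300:380] = 500
depth_map = depth_to_controlnet(frame)
print(depth_map[240, 340] > depth_map[0, 0])  # True: near pixel is brighter
```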

Tips:

  • RealSense outputs work directly with this model
  • Adjust ControlNet strength (0.5-1.0) for balance between depth accuracy and creative freedom
  • Combine with OpenPose for character posing

ControlNet Depth SDXL ~2.5GB

Higher quality depth control for SDXL models

Purpose:

  • Same as SD 1.5 depth, but for SDXL quality
  • Better detail preservation
  • Higher resolution outputs

Best For:

High-Quality Characters · Game Marketing Assets

Tips:

  • Use this for final production-quality renders
  • Use SD 1.5 version for quick iteration

ControlNet OpenPose ~1.5GB

Control character poses using skeleton/keypoint detection

Purpose:

  • Pose consistency across generations
  • Character animation pose control
  • Combine with depth for precise character modeling

Best For:

Character Poses · Animation Frames · Reference Matching

📸 Pose-Controlled Character Creation:

  1. Take reference photo or use pose estimation
  2. Extract pose skeleton with OpenPose
  3. Generate character in exact pose
  4. Maintain pose while changing character details

Tips:

  • Can combine with Depth ControlNet for best results
  • Works great for character turnarounds
  • Use lower strength (0.3-0.7) for natural variations
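For reference when building pose inputs by hand: the OpenPose preprocessor works with 18 COCO-format keypoints. Below is a sketch of the joint ordering plus a helper that maps normalized keypoints to pixel coordinates; the confidence cutoff and the example values are assumptions for illustration.

```python
# OpenPose 18-point (COCO) joint ordering used by the preprocessor:
COCO_18 = [
    "nose", "neck",
    "r_shoulder", "r_elbow", "r_wrist",
    "l_shoulder", "l_elbow", "l_wrist",
    "r_hip", "r_knee", "r_ankle",
    "l_hip", "l_knee", "l_ankle",
    "r_eye", "l_eye", "r_ear", "l_ear",
]

def to_pixels(keypoints, width, height):
    """Map normalized (x, y, confidence) keypoints to pixel coordinates,
    dropping low-confidence joints (they render as missing limbs)."""
    out = {}
    for name, (x, y, conf) in zip(COCO_18, keypoints):
        if conf >= 0.3:             # confidence cutoff is an assumption
            out[name] = (int(x * width), int(y * height))
    return out

# Illustrative values only: nose near top-center, neck just below it
pose = [(0.5, 0.1, 0.9), (0.5, 0.2, 0.8)] + [(0, 0, 0.0)] * 16
print(to_pixels(pose, 1024, 1024))  # {'nose': (512, 102), 'neck': (512, 204)}
```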

ControlNet Canny ~1.5GB

Edge detection for preserving line art and structure

Purpose:

  • Preserve edge structure from input images
  • Convert line art to colored images
  • Maintain architectural/structural details

Best For:

Line Art Coloring · Concept to Render · Environment Design
Use Case: Convert hand-drawn sketches to game-ready assets, colorize line art, maintain building structures

Tips:

  • Great for converting sketches to polished art
  • Works well with architectural/environment concepts
  • Lower Canny threshold detects finer edges
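The threshold tip can be seen with a toy gradient check. Real Canny (e.g. OpenCV's `cv2.Canny`) adds smoothing, non-maximum suppression, and hysteresis; this numpy sketch only marks raw intensity jumps, but the threshold effect is the same: lower it and fainter lines survive.

```python
import numpy as np

def edge_mask(img, threshold):
    """Mark pixels whose horizontal or vertical intensity jump exceeds
    the threshold -- a crude stand-in for Canny's gradient stage."""
    a = img.astype(np.int16)
    gx = np.abs(a[:, 1:] - a[:, :-1])   # horizontal jumps
    gy = np.abs(a[1:, :] - a[:-1, :])   # vertical jumps
    mask = np.zeros(img.shape, dtype=bool)
    mask[:, 1:] |= gx > threshold
    mask[1:, :] |= gy > threshold
    return mask

# Synthetic sketch: a strong outline (jump of 200) and a faint fold (jump of 30)
img = np.zeros((64, 64), dtype=np.uint8)
img[:, 32:] = 200   # strong edge at column 32
img[:, 48:] = 230   # faint edge at column 48
print(edge_mask(img, 100).sum())  # 64: strong edge only
print(edge_mask(img, 20).sum())   # 128: lower threshold keeps the faint edge too
```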

🔍 Upscale Models

Real-ESRGAN x4plus ~65MB

General purpose 4x upscaling for textures and game assets

Purpose:

  • 4x resolution increase (512x512 → 2048x2048)
  • Enhance texture details
  • Improve low-resolution game assets

Best For:

Game Textures · UI Elements · Asset Enhancement
Use Case: Upscale generated textures for Unity/Unreal, enhance sprite sheets, improve UI assets

Real-ESRGAN x4plus Anime ~17MB

Specialized upscaler for anime/stylized artwork

Best For:

Anime Art · 2D Game Assets · Stylized Characters

Tips:

  • Better for cel-shaded or cartoon-style art
  • Preserves clean lines better than general model

🔄 Complete Workflows

🎮 Game Character Pipeline (RealSense → Animated Character)

  1. Scan: Capture person/object with RealSense D435i depth camera
  2. Depth Processing: Feed depth map into ComfyUI
  3. Character Generation: Use Depth ControlNet + SDXL to generate styled character
  4. Pose Variations: Use OpenPose ControlNet to generate multiple poses
  5. Animation: Apply AnimateDiff or SVD for movement
  6. Enhancement: Upscale with Real-ESRGAN for final quality

Models Used: Depth ControlNet SDXL, SDXL Base, OpenPose ControlNet, AnimateDiff/SVD, Real-ESRGAN
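Once a pipeline like this has been queued, the results can be fetched over the same HTTP API. A small sketch of the two read endpoints; the endpoint names come from ComfyUI's built-in server, while the host and the filename are illustrative assumptions.

```python
import urllib.parse

def history_url(prompt_id, host="127.0.0.1:8188"):
    """Endpoint that reports a queued prompt's status and output files."""
    return "http://" + host + "/history/" + prompt_id

def view_url(filename, host="127.0.0.1:8188", subfolder="", folder_type="output"):
    """Endpoint that serves a finished image from ComfyUI's output
    directory (/data/comfyui/output/ on this install)."""
    query = urllib.parse.urlencode(
        {"filename": filename, "subfolder": subfolder, "type": folder_type})
    return "http://" + host + "/view?" + query

# Illustrative filename; real names come from the /history response
print(view_url("sdxl_00001_.png"))
```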

🎨 Concept Art to Game Asset

  1. Sketch: Draw rough concept in Krita/Photoshop
  2. Line Detection: Use Canny ControlNet to preserve edges
  3. Generation: SDXL generates detailed, styled version
  4. Variations: Adjust prompts while maintaining structure
  5. Upscale: Real-ESRGAN 4x for production resolution
  6. Export: Ready for Unity/Unreal

Models Used: Canny ControlNet, SDXL Base, Real-ESRGAN

📹 Character Animation for Cutscene

  1. Base Image: Generate high-quality character with SDXL
  2. Pose Series: Create multiple poses using OpenPose ControlNet
  3. Video Generation: Use SVD to animate between keyframes
  4. Refinement: Post-process in Kdenlive
  5. Export: Game-ready cutscene video

Models Used: SDXL Base, OpenPose ControlNet, SVD

📚 Additional Resources

Model Compatibility:

  • SD 1.5 models: AnimateDiff, SD 1.5 ControlNets
  • SDXL models: SDXL Base, SDXL ControlNets, SVD
  • Universal: Upscalers, VAEs

Hardware Requirements:

  • GPU: NVIDIA P5000 (16GB VRAM) ✓ Sufficient for all models
  • Storage: 655GB available on /data/comfyui
  • Recommended batch size: 1-2 for SDXL, 2-4 for SD 1.5

Storage Paths:

Models: /data/comfyui/models/
Output: /data/comfyui/output/
Input: /data/comfyui/input/
Custom Nodes: /data/comfyui/custom_nodes/
Symlink: ~/Book1-Production/ComfyUI → /data/comfyui