ComfyUI Models Documentation

Comprehensive guide to installed AI models for image and video generation

23GB Total Models · 8.9GB Checkpoints · 8.7GB ControlNet · 655GB Available Space

📊 Storage Overview

Location: /data/comfyui/models/
Total Storage: 23GB currently used, 655GB available

Distribution:

├── checkpoints/         8.9GB - Base generation models
├── controlnet/          8.7GB - Control & guidance models
├── clip_vision/         2.4GB - Vision encoders
├── animatediff_models/  1.7GB - Video generation
├── facerestore_models/  692MB - Face enhancement
├── vae/                 639MB - VAE models
└── upscale_models/       82MB - Image upscaling

🎨 Checkpoint Models (Base Generation)

SDXL Base 1.0 6.5GB

High-quality text-to-image generation, successor to SD 1.5

Purpose:

  • General purpose image generation
  • High resolution (1024x1024 native)
  • Better text rendering and composition

Best For:

Character Design · Game Assets · Concept Art · Textures
Use Case: Creating game character concept art, environmental textures, UI elements

Tips:

  • Works best at 1024x1024, can generate up to 2048x2048
  • Requires ~10GB VRAM for optimal performance
  • Use with ControlNet for precise control
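Generation with these checkpoints can also be scripted against ComfyUI's HTTP API instead of the graph editor. Below is a minimal sketch of a text-to-image request in the API's workflow format; the node class names follow ComfyUI's stock text-to-image graph, while the server address, checkpoint filename, and sampler settings are assumptions to adapt to your install.

```python
import json
import urllib.request

def build_sdxl_prompt(text, width=1024, height=1024, seed=42):
    """Build a minimal text-to-image graph in ComfyUI's API format.

    Node class names follow the stock text-to-image workflow; the
    checkpoint filename is an assumption, match it to your install.
    """
    return {
        "1": {"class_type": "CheckpointLoaderSimple",
              "inputs": {"ckpt_name": "sd_xl_base_1.0.safetensors"}},
        "2": {"class_type": "CLIPTextEncode",          # positive prompt
              "inputs": {"text": text, "clip": ["1", 1]}},
        "3": {"class_type": "CLIPTextEncode",          # negative prompt
              "inputs": {"text": "blurry, low quality", "clip": ["1", 1]}},
        "4": {"class_type": "EmptyLatentImage",
              "inputs": {"width": width, "height": height, "batch_size": 1}},
        "5": {"class_type": "KSampler",
              "inputs": {"model": ["1", 0], "positive": ["2", 0],
                         "negative": ["3", 0], "latent_image": ["4", 0],
                         "seed": seed, "steps": 25, "cfg": 7.0,
                         "sampler_name": "euler", "scheduler": "normal",
                         "denoise": 1.0}},
        "6": {"class_type": "VAEDecode",
              "inputs": {"samples": ["5", 0], "vae": ["1", 2]}},
        "7": {"class_type": "SaveImage",
              "inputs": {"images": ["6", 0], "filename_prefix": "sdxl"}},
    }

def queue_prompt(workflow, host="127.0.0.1:8188"):
    """POST the graph to a running ComfyUI server (not called here)."""
    data = json.dumps({"prompt": workflow}).encode("utf-8")
    req = urllib.request.Request("http://" + host + "/prompt", data=data,
                                 headers={"Content-Type": "application/json"})
    return urllib.request.urlopen(req).read()

workflow = build_sdxl_prompt("concept art of a sci-fi game character")
print(len(workflow))  # 7 nodes
```

Each `["1", 1]` pair wires a node input to (source node id, output index); for example, output 1 of the checkpoint loader is its CLIP model.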

Stable Video Diffusion XT 2.5GB

Image-to-video generation for creating animations

Purpose:

  • Convert static images to short videos (14-25 frames)
  • Camera motion and object animation
  • Short cinematic clips from a single keyframe

Best For:

Character Animation · Product Demos · Cinematics
Use Case: Animate game cutscenes, create character intro videos, product showcase animations

Tips:

  • Start with high-quality still images (SDXL-generated images work great)
  • Keep motion subtle for best results
  • Can generate 576x1024 resolution videos
  • Requires ~16GB VRAM

🎬 Video Generation Models

AnimateDiff v2 1.7GB

Motion module for creating animated sequences from SD 1.5 models

Purpose:

  • Add motion to Stable Diffusion 1.5 outputs
  • Create character animations
  • Generate looping animations

Best For:

Character Walk Cycles · Action Sequences · Sprite Animations
Use Case: Create 2D game character animations, sprite sheets, animated backgrounds

Tips:

  • Works with SD 1.5 checkpoints only (not SDXL)
  • Combine with LoRAs for style consistency
  • Can generate 16-frame sequences
  • Motion LoRAs can enhance specific movements
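The sprite-sheet use case above comes down to packing the 16 generated frames into a grid. A small sketch of the layout math (the 512x512 frame size and 4x4 grid are assumptions):

```python
def sprite_rects(frame_w, frame_h, n_frames, columns):
    """Return (x, y, w, h) pixel rectangles for each frame laid out
    row-major in a sprite sheet, the order a 2D engine reads them back."""
    rects = []
    for i in range(n_frames):
        row, col = divmod(i, columns)
        rects.append((col * frame_w, row * frame_h, frame_w, frame_h))
    return rects

# A 16-frame AnimateDiff sequence at 512x512 fills a 4x4, 2048x2048 sheet
rects = sprite_rects(512, 512, 16, 4)
print(rects[0])   # (0, 0, 512, 512)        first frame, top-left
print(rects[15])  # (1536, 1536, 512, 512)  last frame, bottom-right
```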

🎮 ControlNet Models (Precision Control)

ControlNet Depth (SD 1.5) ~1.5GB

⭐ CRITICAL for RealSense D435i integration - converts depth maps to images

Purpose:

  • Uses depth information to control generation
  • Primary use: RealSense D435i depth camera → 3D character creation
  • Maintain spatial relationships and 3D structure

Best For:

RealSense Scanning · 3D Character Design · Environment Layouts

🎯 RealSense Character Creation Workflow:

  1. Capture depth map with RealSense D435i camera
  2. Feed depth map into ComfyUI
  3. Use Depth ControlNet to generate character maintaining 3D structure
  4. Refine with inpainting or img2img
  5. Animate with AnimateDiff or SVD
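Step 2 above usually involves converting the D435i's raw 16-bit depth frame (millimeters) into the 8-bit inverted grayscale image depth ControlNets expect, where near objects are bright. A minimal numpy sketch with a synthetic frame; the near/far clip range is an assumption to tune per scene.

```python
import numpy as np

def depth_to_controlnet(depth_mm, near_mm=300, far_mm=3000):
    """Convert a raw 16-bit RealSense depth frame (millimeters) into the
    8-bit inverted depth map ControlNet expects: near = bright,
    far = dark, invalid (zero-depth) pixels = black."""
    d = depth_mm.astype(np.float32)
    valid = d > 0                  # the D435i reports 0 where depth is unknown
    d = np.clip(d, near_mm, far_mm)
    norm = (far_mm - d) / (far_mm - near_mm)   # invert so near -> 1.0
    out = (norm * 255).astype(np.uint8)
    out[~valid] = 0
    return out

# Synthetic frame: a "near" object (500mm) in front of a "far" wall (2500mm)
frame = np.full((480, 640), 2500, dtype=np.uint16)
frame[200:280, 300:380] = 500
depth_map = depth_to_controlnet(frame)
print(depth_map[240, 340] > depth_map[0, 0])  # True: near pixel is brighter
```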

Tips:

  • RealSense outputs work directly with this model
  • Adjust ControlNet strength (0.5-1.0) for balance between depth accuracy and creative freedom
  • Combine with OpenPose for character posing

ControlNet Depth SDXL ~2.5GB

Higher quality depth control for SDXL models

Purpose:

  • Same as SD 1.5 depth, but for SDXL quality
  • Better detail preservation
  • Higher resolution outputs

Best For:

High-Quality Characters · Game Marketing Assets

Tips:

  • Use this for final production-quality renders
  • Use SD 1.5 version for quick iteration

ControlNet OpenPose ~1.5GB

Control character poses using skeleton/keypoint detection

Purpose:

  • Pose consistency across generations
  • Character animation pose control
  • Combine with depth for precise character modeling

Best For:

Character Poses · Animation Frames · Reference Matching

📸 Pose-Controlled Character Creation:

  1. Take reference photo or use pose estimation
  2. Extract pose skeleton with OpenPose
  3. Generate character in exact pose
  4. Maintain pose while changing character details

Tips:

  • Can combine with Depth ControlNet for best results
  • Works great for character turnarounds
  • Use lower strength (0.3-0.7) for natural variations
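For reference when building pose inputs by hand: the OpenPose preprocessor works with 18 COCO-format keypoints. Below is a sketch of the joint ordering plus a helper that maps normalized keypoints to pixel coordinates; the confidence cutoff and the example values are assumptions for illustration.

```python
# OpenPose 18-point (COCO) joint ordering used by the preprocessor:
COCO_18 = [
    "nose", "neck",
    "r_shoulder", "r_elbow", "r_wrist",
    "l_shoulder", "l_elbow", "l_wrist",
    "r_hip", "r_knee", "r_ankle",
    "l_hip", "l_knee", "l_ankle",
    "r_eye", "l_eye", "r_ear", "l_ear",
]

def to_pixels(keypoints, width, height):
    """Map normalized (x, y, confidence) keypoints to pixel coordinates,
    dropping low-confidence joints (they render as missing limbs)."""
    out = {}
    for name, (x, y, conf) in zip(COCO_18, keypoints):
        if conf >= 0.3:             # confidence cutoff is an assumption
            out[name] = (int(x * width), int(y * height))
    return out

# Illustrative values only: nose near top-center, neck just below it
pose = [(0.5, 0.1, 0.9), (0.5, 0.2, 0.8)] + [(0, 0, 0.0)] * 16
print(to_pixels(pose, 1024, 1024))  # {'nose': (512, 102), 'neck': (512, 204)}
```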

ControlNet Canny ~1.5GB

Edge detection for preserving line art and structure

Purpose:

  • Preserve edge structure from input images
  • Convert line art to colored images
  • Maintain architectural/structural details

Best For:

Line Art Coloring · Concept to Render · Environment Design
Use Case: Convert hand-drawn sketches to game-ready assets, colorize line art, maintain building structures

Tips:

  • Great for converting sketches to polished art
  • Works well with architectural/environment concepts
  • Lower Canny threshold detects finer edges
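The threshold tip can be seen with a toy gradient check. Real Canny (e.g. OpenCV's `cv2.Canny`) adds smoothing, non-maximum suppression, and hysteresis; this numpy sketch only marks raw intensity jumps, but the threshold effect is the same: lower it and fainter lines survive.

```python
import numpy as np

def edge_mask(img, threshold):
    """Mark pixels whose horizontal or vertical intensity jump exceeds
    the threshold -- a crude stand-in for Canny's gradient stage."""
    a = img.astype(np.int16)
    gx = np.abs(a[:, 1:] - a[:, :-1])   # horizontal jumps
    gy = np.abs(a[1:, :] - a[:-1, :])   # vertical jumps
    mask = np.zeros(img.shape, dtype=bool)
    mask[:, 1:] |= gx > threshold
    mask[1:, :] |= gy > threshold
    return mask

# Synthetic sketch: a strong outline (jump of 200) and a faint fold (jump of 30)
img = np.zeros((64, 64), dtype=np.uint8)
img[:, 32:] = 200   # strong edge at column 32
img[:, 48:] = 230   # faint edge at column 48
print(edge_mask(img, 100).sum())  # 64: strong edge only
print(edge_mask(img, 20).sum())   # 128: lower threshold keeps the faint edge too
```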

🔍 Upscale Models

Real-ESRGAN x4plus ~65MB

General purpose 4x upscaling for textures and game assets

Purpose:

  • 4x resolution increase (512x512 → 2048x2048)
  • Enhance texture details
  • Improve low-resolution game assets

Best For:

Game Textures · UI Elements · Asset Enhancement
Use Case: Upscale generated textures for Unity/Unreal, enhance sprite sheets, improve UI assets

Real-ESRGAN x4plus Anime ~17MB

Specialized upscaler for anime/stylized artwork

Best For:

Anime Art · 2D Game Assets · Stylized Characters

Tips:

  • Better for cel-shaded or cartoon-style art
  • Preserves clean lines better than general model

🔄 Complete Workflows

🎮 Game Character Pipeline (RealSense → Animated Character)

  1. Scan: Capture person/object with RealSense D435i depth camera
  2. Depth Processing: Feed depth map into ComfyUI
  3. Character Generation: Use Depth ControlNet + SDXL to generate styled character
  4. Pose Variations: Use OpenPose ControlNet to generate multiple poses
  5. Animation: Apply AnimateDiff or SVD for movement
  6. Enhancement: Upscale with Real-ESRGAN for final quality

Models Used: Depth ControlNet SDXL, SDXL Base, OpenPose ControlNet, AnimateDiff/SVD, Real-ESRGAN
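Once a pipeline like this has been queued, the results can be fetched over the same HTTP API. A small sketch of the two read endpoints; the endpoint names come from ComfyUI's built-in server, while the host and the filename are illustrative assumptions.

```python
import urllib.parse

def history_url(prompt_id, host="127.0.0.1:8188"):
    """Endpoint that reports a queued prompt's status and output files."""
    return "http://" + host + "/history/" + prompt_id

def view_url(filename, host="127.0.0.1:8188", subfolder="", folder_type="output"):
    """Endpoint that serves a finished image from ComfyUI's output
    directory (/data/comfyui/output/ on this install)."""
    query = urllib.parse.urlencode(
        {"filename": filename, "subfolder": subfolder, "type": folder_type})
    return "http://" + host + "/view?" + query

# Illustrative filename; real names come from the /history response
print(view_url("sdxl_00001_.png"))
```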

🎨 Concept Art to Game Asset

  1. Sketch: Draw rough concept in Krita/Photoshop
  2. Line Detection: Use Canny ControlNet to preserve edges
  3. Generation: SDXL generates detailed, styled version
  4. Variations: Adjust prompts while maintaining structure
  5. Upscale: Real-ESRGAN 4x for production resolution
  6. Export: Ready for Unity/Unreal

Models Used: Canny ControlNet, SDXL Base, Real-ESRGAN

📹 Character Animation for Cutscene

  1. Base Image: Generate high-quality character with SDXL
  2. Pose Series: Create multiple poses using OpenPose ControlNet
  3. Video Generation: Use SVD to animate between keyframes
  4. Refinement: Post-process in Kdenlive
  5. Export: Game-ready cutscene video

Models Used: SDXL Base, OpenPose ControlNet, SVD

📚 Additional Resources

Model Compatibility:

  • SD 1.5 models: AnimateDiff, SD 1.5 ControlNets
  • SDXL models: SDXL Base, SDXL ControlNets, SVD
  • Universal: Upscalers, VAEs

Hardware Requirements:

  • GPU: NVIDIA P5000 (16GB VRAM) ✓ Sufficient for all models
  • Storage: 655GB available on /data/comfyui
  • Recommended batch size: 1-2 for SDXL, 2-4 for SD 1.5

Storage Paths:

Models: /data/comfyui/models/
Output: /data/comfyui/output/
Input: /data/comfyui/input/
Custom Nodes: /data/comfyui/custom_nodes/
Symlink: ~/Book1-Production/ComfyUI → /data/comfyui