AI & Machine Learning for CG
Generative AI and ML tools across image, video, 3D, texture, audio. AI mocap lives in §3 (dual-listed).
Machine Learning for CG
ML fundamentals and courses for CG artists.
- DINOv2. Meta self-supervised vision features. Strong general-purpose image embeddings. Useful as backbone for CG ML tasks.
- IEEE Transactions on Neural Networks and Learning Systems. Publishes technical articles on the theory, design, and applications of neural networks and related learning systems.
- Meta 3D Gen. Meta research pipeline for text-to-3D. Combines AssetGen geometry with TextureGen materials. Paper and results.
- PyTorch3D. Meta FAIR library for 3D deep learning. Differentiable rendering, mesh ops, point cloud utils on PyTorch.
Image Generation
Text-to-image models and platforms.
| Software | Description | License | Tags | Best For |
|---|---|---|---|---|
| Adobe Firefly 3 | Commercial-safe, deep Creative Cloud integration. See also: Software Reference → AI Image & Texture Generation Software | Paid | Commercial-Safe · Adobe CC | Commercial-safe Adobe workflows |
| Aga Miko/pixel Character Generator | Generating retro pixel game characters with Generative Adversarial Networks. Dataset "TinyHero" included. See also: Software Reference → AI Image & Texture Generation Software | Open Source | | |
| ashawkey/stable-dreamfusion | A PyTorch implementation of text-to-3D DreamFusion, powered by Stable Diffusion. See also: Software Reference → AI Image & Texture Generation Software | Open Source | | |
| DALL-E 3 (OpenAI) | Text-to-image with strong semantic accuracy. See also: Software Reference → AI Image & Texture Generation Software | Paid | OpenAI · Semantic | Semantic accuracy |
| FLUX 1.1 Pro (Black Forest Labs) | Production-grade open model plus Fill/Canny/Depth tools. See also: Software Reference → AI Image & Texture Generation Software | Freemium | Production-Grade · Open Model | Production-grade image gen |
| Google Gemini Image Gen | High realism, fast, large-scale production. See also: Software Reference → AI Image & Texture Generation Software | Freemium | Google · Realism | Fast large-scale gen |
| Grok Aurora (xAI) | Photorealistic image gen integrated into Grok. See also: Software Reference → AI Image & Texture Generation Software | Paid | xAI · Photoreal | Photorealism in Grok |
| HunyuanDiT (Tencent) | Open-source DiT-based, strong Chinese text. See also: Software Reference → AI Image & Texture Generation Software | Open Source | Tencent · DiT | Chinese text rendering |
| Ideogram 2.0 | Best-in-class typography and text rendering, canvas mode. See also: Software Reference → AI Image & Texture Generation Software | Freemium | Typography · Text Rendering | Typography in images |
| Jimeng / Dreamina (ByteDance) | High quality, integrated with video pipeline. See also: Software Reference → AI Image & Texture Generation Software | Freemium | ByteDance · Video Pipeline | Integrated image + video pipeline |
| Kolors (Kuaishou) | Open-source, bilingual Chinese/English. See also: Software Reference → AI Image & Texture Generation Software | Open Source | Open Source · Bilingual | Bilingual open model |
| Krea AI | Real-time generation, upscaling, and style transfer platform. See also: Software Reference → AI Image & Texture Generation Software | Freemium | Real-time · Upscaling | Real-time + upscaling |
| Leonardo AI | Real-time generation with artistic control. See also: Software Reference → AI Image & Texture Generation Software | Freemium | Real-time · Controlled | Controlled artistic generation |
| Midjourney v7 | Industry-leading artistic coherence and stylized output. See also: Software Reference → AI Image & Texture Generation Software | Paid | Artistic · Subscription | Artistic and stylized imagery |
| MusicGen | Meta's tool for generating music. See also: Software Reference → AI Image & Texture Generation Software | Open Source | | |
| Playground v3 | Mixed-mode design canvas for graphic design. See also: Software Reference → AI Image & Texture Generation Software | Freemium | Design Canvas · Mixed-mode | Graphic design canvas |
| Recraft V3 | Design-oriented gen. Vector, icons, brand assets. See also: Software Reference → AI Image & Texture Generation Software | Freemium | Design Focus · Vector | Design assets, vectors |
| Snowpixel | Generate Images/Videos/Animations/Audio/Music/3D Objects with Text and/or Image. Upload your own data to create custom models. See also: Software Reference → AI Image & Texture Generation Software | Freemium | | |
| Stable Animation SDK | Text-to-animation tool for developers by Stability AI (dev platform). See also: Software Reference → AI Image & Texture Generation Software | Paid | | |
| Stable Diffusion 3.5 | Open-source diffusion (Large, Medium, Turbo sizes). See also: Software Reference → AI Image & Texture Generation Software | Open Source | Open Source · Self-Host | Self-hosted image gen |
| UneeQ Digital Humans | 3D character lib synced. See also: Software Reference → AI Image & Texture Generation Software | Paid | | |
Related:
- Deforum notebook v0.5 for Stable Diffusion animations. Release announcement posted to the StableDiffusion subreddit: math automation, perspective flips, prompt weights, video masking, and waifus.
- 3DALL-E: Integrating Text-to-Image AI in 3D Design Workflows. ArXiv 2022 paper on integrating text-to-image AI in 3D design workflows.
- AI Render. Stable Diffusion in Blender. This add-on renders an AI generated image based on a text prompt and your scene.
- Canva Adds a Free and Unlimited AI Text-to-Image Generator. PetaPixel article.
- COVAL. Covers the architecture of voice AI, from speech recognition to emotional intelligence, and how to build, scale, and evaluate voice AI systems.
- Deforum Stable Diffusion Animation - v5 Math Functions - Demo and Test. YouTube video.
- Generative AI Genius 2024. Course material created by Aishwarya Naresh Reganti.
- Introduction to Generative AI Community Course. By ineuron.
- Point-E (openai/point-e). OpenAI's point cloud diffusion for 3D model synthesis.
- sd-concepts-library (Stable Diffusion concepts library). Stable Diffusion Textual Inversion Concepts Library - browse through objects and styles taught by the community to Stable Diffusion and use them in your prompts!
- Stable Diffusion Frivolous · Because lawsuits based on ignorance deserve a response. Community response for the "Stable Diffusion litigation"
- Stable Diffusion KLMC2 Animation.ipynb. Notebook by @RiversHaveWings to generate animation based on scripted prompts using a technique called KLMC2 discretization of underdamped Langevin dynamics
- Stable Diffusion KLMC2 Animation.ipynb forked. Fork by @DigThatData
- Transforming 2D Images into 3D with the AdaMPI AI Model. Guide on how to use the AdaMPI AI model for creating 3D photos from 2D images
- Word-As-Image for Semantic Typography. Semantically transforming fonts into illustrations
Video Generation
AI video generation platforms and models.
| Software | Description | License | Tags | Best For |
|---|---|---|---|---|
| CogVideoX | Tsinghua/Zhipu open-source, multiple sizes. See also: Software Reference → AI Video Generation Software | Open Source | Open Source · Tsinghua | Research open-source video gen |
| Genmo Mochi 1 | Open-source video gen model. See also: Software Reference → AI Video Generation Software | Open Source | Open Source · Genmo | Open-source video gen |
| Hailuo AI / MiniMax | "Director" model. Strong motion and character consistency. See also: Software Reference → AI Video Generation Software | Freemium | Director Model · Character Consist | Directed motion |
| Haiper | Ex-DeepMind team, animation and video-to-video modes. See also: Software Reference → AI Video Generation Software | Freemium | Ex-DeepMind · V2V | Video-to-video |
| HunyuanVideo 1.5 | Open-source 8.3B params, runs on 14GB VRAM. See also: Software Reference → AI Video Generation Software | Open Source | Open Source · 14GB VRAM | Self-hosted video gen |
| Kling 3.0 | Up to 5min clips, strong human motion. Motion Brush control. See also: Software Reference → AI Video Generation Software | Freemium | 5min Clips · Motion Brush | Long clips, human motion |
| LTX Video | Open-source, real-time generation speeds. See also: Software Reference → AI Video Generation Software | Open Source | Open Source · Real-time | Real-time open-source video gen |
| Luma Dream Machine | Fast generations with good camera motion. See also: Software Reference → AI Video Generation Software | Freemium | Fast · Camera Motion | Fast video gen |
| Pika 2.0 | Extended video gen with improved consistency. See also: Software Reference → AI Video Generation Software | Freemium | Extended Gen · Stylized | Stylized video |
| PixVerse | Style-specific modes (anime, 3D, realistic), character consistency. See also: Software Reference → AI Video Generation Software | Freemium | Style Modes · Consistency | Style-specific video gen |
| Runway Gen-4 | Character consistency across shots, controllable camera. See also: Software Reference → AI Video Generation Software | Freemium | Character Consistency · Camera Control | Character consistency |
| Sora (OpenAI) | Up to ~20s clips with strong temporal coherence. See also: Software Reference → AI Video Generation Software | Paid | OpenAI · 20s Clips | Coherent video clips |
| Veo 3.1 (Google DeepMind) | Top leaderboard, native audio, 60s+ clips. See also: Software Reference → AI Video Generation Software | Paid | Top Leaderboard · Native Audio | Top-quality video gen |
| Vidu | 16s clips, strong human motion (Shengshu). See also: Software Reference → AI Video Generation Software | Freemium | 16s · Human Motion | Long human-motion clips |
| Wan 2.2 (Alibaba) | Cinematic MoE diffusion, 8GB+ VRAM. Open source. See also: Software Reference → AI Video Generation Software | Open Source | Open Source · MoE Diffusion | Cinematic open-source video gen |
Related:
- Emu Video (Meta). Meta's text-to-video research model. Factorized generation, image-then-video. Demos and paper.
3D Generation
AI tools for generating 3D models from text or images.
| Software | Description | License | Tags | Best For |
|---|---|---|---|---|
| 3DTopia | Open-source text-to-3D pipeline (coarse → refined). See also: Software Reference → AI 3D Generation Software | Open Source | Coarse→Refined · Open Source | Text-to-3D open-source |
| InstantMesh | Fast single-image-to-3D reconstruction. See also: Software Reference → AI 3D Generation Software | Open Source | Fast · Image→3D | Fast image→3D |
| Kaedim | Image-to-3D with hybrid AI plus artist cleanup for production quality. See also: Software Reference → AI 3D Generation Software | Paid | Hybrid AI+Artist · Production | Production-quality via hybrid AI+artist |
| Luma Genie | Text/image to 3D, integrated with Dream Machine. See also: Software Reference → AI 3D Generation Software | Freemium | Luma · Integrated | Luma-integrated 3D gen |
| Meshy v4 | Production-reliable, improved topology and PBR textures. See also: Software Reference → AI 3D Generation Software | Freemium | Production-Reliable · PBR | Production-ready AI 3D |
| Rodin Gen-2 (Hyper3D) | 10B params, photorealistic, free generation tier. See also: Software Reference → AI 3D Generation Software | Freemium | 10B Params · Free Tier | Photorealistic AI 3D |
| Sloyd | Procedural 3D generation with parametric control. See also: Software Reference → AI 3D Generation Software | Freemium | Procedural · Parametric | Parametric procedural 3D |
| Spline AI | Generate 3D objects and textures from text prompts in-editor. See also: Software Reference → AI 3D Generation Software | Freemium | In-Editor · Text→3D | In-editor AI 3D |
| Stability SPAR3D | Open-source single-image 3D reconstruction. See also: Software Reference → AI 3D Generation Software | Open Source | Stability AI · Open Source | Single-image open-source 3D |
| TRELLIS.2 (Microsoft) | Full PBR materials, complex topologies. See also: Software Reference → AI 3D Generation Software | Open Source | Microsoft · PBR + Open Source | PBR AI 3D (open source) |
| Tripo v3.0 | Sculpture-level precision, clean quad topology. See also: Software Reference → AI 3D Generation Software | Freemium | Clean Quads · Sculpt-level | Clean topology AI 3D |
| TripoSR | Tripo/Stability collab. Fast open-source image-to-3D. See also: Software Reference → AI 3D Generation Software | Open Source | Open Source · Fast | Fast open-source image→3D |
| Unique3D | High-quality mesh from single image (NeurIPS 2024). See also: Software Reference → AI 3D Generation Software | Open Source | Single-Image · NeurIPS | Single-image open-source 3D |
| Wonder3D++ | Cross-domain diffusion, textured meshes in 2-3min. See also: Software Reference → AI 3D Generation Software | Open Source | Fast · Textured Mesh | Fast textured mesh gen |
Related:
- Pixal3D. Single-image to 3D asset generation via pixel-feature back-projection. Outputs geometry and PBR textures.
- Threestudio. A unified framework for 3D content generation.
Texture & Material Generation
AI-generated PBR textures and materials.
| Software | Description | License | Tags | Best For |
|---|---|---|---|---|
| Adobe Firefly Textures | Prompt-to-edit texture workflows. See also: Software Reference → AI Image & Texture Generation Software | Paid | Adobe · Prompt-to-Edit | Adobe-integrated AI textures |
| InstaMAT | Material authoring with AI workflows (Substance alternative). See also: Software Reference → AI Image & Texture Generation Software | Paid | Substance-alt · AI Workflows | AI-assisted material authoring |
| Meshy Textures | Integrated with 3D pipeline. See also: Software Reference → AI Image & Texture Generation Software | Freemium | Meshy Pipeline · Integrated | Textures tied to Meshy 3D |
| Poly | AI-generated PBR textures and materials, tileable with full map sets. See also: Software Reference → AI Image & Texture Generation Software | Freemium | AI PBR · Tileable | AI PBR textures |
| Ponzu | AI texture gen from text prompts for uploaded meshes. See also: Software Reference → AI Image & Texture Generation Software | Freemium | Mesh-aware · Text→Tex | Mesh-aware AI textures |
| Scenario | Game-ready PBR materials, full map sets. See also: Software Reference → AI Image & Texture Generation Software | Paid | Game-Ready · Full Maps | Game-ready AI materials |
ComfyUI Ecosystem
Node-based AI generation. ComfyUI and its ecosystem.
| Software | Description | License | Tags | Best For |
|---|---|---|---|---|
| ComfyUI | Node-based AI generator for image/video/3D/audio. See also: Software Reference → AI Image & Texture Generation Software | Open Source | Node-based · Multi-modal | Node-based AI workflows |
Related:
- Awesome ComfyUI. Curated custom nodes collection.
- Comflowy. ComfyUI tutorials for ControlNet, SDXL, FLUX workflows.
- ComfyUI Manager. Install and manage custom nodes and dependencies.
- ComfyUI Registry. Community workflow and subgraph sharing.
- ControlNet SD3.5. Blur/Canny/Depth models, free commercial use.
AI-Assisted CG Tools
AI tools that augment traditional CG workflows.
| Software | Description | License | Tags | Best For |
|---|---|---|---|---|
| Autodesk Flow Studio (Wonder Dynamics) | AI VFX, auto CG characters in live footage, USD export. See also: Software Reference → AI-Assisted CG Software | Paid | Auto CG Chars · USD Export | AI CG characters in live plates |
| Blockade Labs Skybox AI | AI-generated 360° skyboxes/HDRIs from text. See also: Software Reference → AI-Assisted CG Software | Freemium | AI Skybox · 360° | AI skyboxes/HDRIs |
| Gigapixel AI | Photo enlargement tool that claims to add detail back to upscaled photos. See also: Software Reference → AI-Assisted CG Software | Paid | | |
| Magnific AI | AI upscaler with "creativity" slider. Reimagines/adds detail (2-16x). See also: Software Reference → AI-Assisted CG Software | Paid | Creative Upscale · 2-16x | Creative AI upscaling |
| Rosebud AI | Vibe coding platform for creating 3D games and interactive web apps with AI. See also: Software Reference → AI-Assisted CG Software | Freemium | AI Game Builder · Vibe Coding | |
| Topaz Photo AI / Video AI | Best-in-class upscaling, denoising, sharpening. See also: Software Reference → AI-Assisted CG Software | Paid | Upscale · Denoise | Upscale/denoise |
| Unity Muse | Unity's suite of AI products. See also: Software Reference → AI-Assisted CG Software | Paid | Unity AI Suite | |
Related:
- AI Game Developer. Unity Editor and Unity Runtime AI integration. Unit Test, Coding, C# Roslyn, Reflection, Assets. Helps create games with AI and run AI logic during gameplay.
- ComfyUI Blender AI. Run ComfyUI workflows inside Blender.
- ControlNet. Depth/pose/edge control for precision CG workflows.
- Coplay (Freemium). AI Copilot for Unity.
- CorridorKey. Corridor Crew's open keying tool. AI-driven chroma key for VFX work.
- CoTracker. Meta FAIR point tracker for video. Tracks arbitrary points jointly across frames. Useful for matchmove and roto prep.
- DEVA (Tracking Anything). Decoupled video segmentation. Long-form, multi-object roto with text or click prompts.
- Dream Textures (Blender). Generate textures in Blender via Stable Diffusion, locally.
- GPT-4. Prompt engineering techniques for GPT-4, including tips, applications, limitations, and additional reading materials.
- RIFE for Nuke. Real-Time Intermediate Flow Estimation for video frame interpolation (i.e., machine learning for frame-rate upscaling) in Nuke.
- Robust Video Matting (RVM). Real-time human video matting. Strong alpha mattes with no greenscreen.
- Rotobot (Paid). AI roto plugin for Nuke and After Effects. Auto-mattes people and objects.
- SAM 2 (Segment Anything Model 2). Meta's promptable segmentation for images and video. Masks objects across frames from a single click. Useful for rotoscope and matte work.
- Sammie Roto 2. SAM-based rotoscope app. Click-to-mask, propagates across frames.
- SolidUI. AI-generated visualization prototyping and editing platform. Supports 2D and 3D models, combined with an LLM (Large Language Model) for quick editing.
- Track Anything. SAM plus XMem for interactive video object tracking and segmentation. Useful for roto and matte propagation.
- Wonder Studio (Autodesk Flow Studio; Freemium). AI character replacement with auto roto, body tracking, and re-lighting from a single video. Now part of Autodesk Flow Studio.
AI Audio & Music
AI music generation, voice, TTS, and SFX.
| Software | Description | License | Tags | Best For |
|---|---|---|---|---|
| ACE Studio | AI singing voice synthesizer with expression control. See also: Software Reference → AI Audio & Music Generation Software | Freemium | Singing · Expression | AI singing synthesis |
| AIVA | AI composition, good for orchestral/cinematic scoring. See also: Software Reference → AI Audio & Music Generation Software | Freemium | Orchestral · Scoring | Orchestral/cinematic scoring |
| DiffRhythm | Open-source full-song gen with vocals from lyrics. See also: Software Reference → AI Audio & Music Generation Software | Open Source | Open Source · Lyrics→Song | Open-source song gen with vocals |
| ElevenLabs | Voice cloning, narration, music generation. See also: Software Reference → AI Audio & Music Generation Software | Freemium | Voice Cloning · TTS | Voice cloning, TTS |
| F5-TTS | Open-source zero-shot voice cloning TTS. See also: Software Reference → AI Audio & Music Generation Software | Open Source | Zero-Shot · Voice Clone | Zero-shot voice cloning |
| Fish Audio | Open-source TTS with voice cloning, fast and multilingual. See also: Software Reference → AI Audio & Music Generation Software | Open Source | Open Source · Multilingual | Open-source TTS |
| Suno v5 | Full song generation, 100M+ users. See also: Software Reference → AI Audio & Music Generation Software | Freemium | Song Gen · 100M users | Full song generation |
| Udio | Strong electronic/pop, licensed for commercial use. See also: Software Reference → AI Audio & Music Generation Software | Freemium | Electronic/Pop · Commercial | Commercial-licensed AI music |
Related:
- AudioCraft (MusicGen / AudioGen / EnCodec). Meta's open audio generation stack. MusicGen for music, AudioGen for SFX, EnCodec for neural audio compression.
Open-Source Models (HuggingFace)
Notable open-source generative models.
- Hugging Face Transformers. NLP models and pipelines.
- Hunyuan Video I2V. Image-to-video, multilingual.
- SV3D (Stable Video 3D). Orbital video from single image.
Research Papers
Research papers on ML for CG, generative 3D, neural rendering, and related topics. Each entry: title (the plain-English summary), year, and venue/links in the description.
- Align Your Gaussians: Text-to-4D with Dynamic 3D Gaussians and Composed Diffusion Models. Project Page
- Animatable Neural Radiance Fields from Monocular RGB Videos. Chen et al., Arxiv 2021 | github | bibtex
- BeyondPixels: A Review of the Evolution of Neural Radiance Fields. AKM Shahariar Azad Rabby and Chengcui Zhang, Arxiv 2023 | bibtex
- CADOps-Net: Jointly Learning CAD Operation Types and Steps from Boundary-Representations. 3DV 2022 | project
- CLA-NeRF: Category-Level Articulated Neural Radiance Field. Tseng et al., ICRA 2022 | bibtex
- Consolidating Attention Features for Multi-view Image Editing. Project Page
- CROSSFIRE: Camera Relocalization On Self-Supervised Features from an Implicit Representation. Moreau et al., ICCV 2023 | bibtex
- Deep Generative Models on 3D Representations: A Survey. Project Page
- DeepCAD: A Deep Generative Network for Computer-Aided Design Models. ICCV 2021 | project | code
- Depth-supervised NeRF: Fewer Views and Faster Training for Free. Deng et al., Arxiv 2021 | github | bibtex
- DeRF: Decomposed Radiance Fields. Rebain et al., Arxiv 2020 | bibtex
- Direct Voxel Grid Optimization: Super-fast Convergence for Radiance Fields Reconstruction. Sun et al., CVPR 2022 | github | bibtex
- DM-NeRF: 3D Scene Geometry Decomposition and Manipulation from 2D Images. Code
- DMV3D: Denoising Multi-View Diffusion using 3D Large Reconstruction Model. Project Page
- Dream3D: Zero-Shot Text-to-3D Synthesis Using 3D Shape Prior and Text-to-Image Diffusion Models. Project Page | Code (coming soon)
- DreamCraft3D: Hierarchical 3D Generation with Bootstrapped Diffusion Prior. Project Page | Code
- DreamFusion: Text-to-3D using 2D Diffusion. Project Page | Code
- DreamGaussian: Generative Gaussian Splatting for Efficient 3D Content Creation. Project Page | Code
- DreamGaussian4D: Generative 4D Gaussian Splatting. Project Page | Code
- DreamWaltz: Make a Scene with Complex 3D Animatable Avatars. Project Page | Code
- En3D: An Enhanced Generative Model for Sculpting 3D Humans from 2D Synthetic Data. Project Page | Code
- EndoGaussian: Real-time Gaussian Splatting for Dynamic Endoscopic Scene Reconstruction. Project Page | Code
- FastNeRF: High-Fidelity Neural Rendering at 200FPS. Garbin et al., Arxiv 2021 | bibtex
- From 2D CAD Drawings to 3D Parametric Models: A Vision-Language Approach. AAAI 2025 | project
- GALA3D: Towards Text-to-3D Complex Scene Generation via Layout-guided Generative Gaussian Splatting. Project Page | Code
- GARF: Gaussian Activated Radiance Fields for High Fidelity Reconstruction and Pose Estimation. Chng et al., ECCV 2022 | bibtex
- GaussianAvatars: Photorealistic Head Avatars with Rigged 3D Gaussians. Project Page
- GaussianObject: High-Quality 3D Object Reconstruction from Four Views with Gaussian Splatting. Project Page | Code
- Gemini Robotics 1.5. Embodied reasoning, thinking, and motion transfer.
- GIRAFFE: Representing Scenes as Compositional Generative Neural Feature Fields. Niemeyer et al., CVPR 2021 | bibtex
- GNeRF: GAN-based Neural Radiance Field without Posed Camera. Meng et al., Arxiv 2021 | bibtex
- GO-NeRF: Generating Objects in Neural Radiance Fields for Virtual Reality Content Creation. Project Page
- GRAF: Generative Radiance Fields for 3D-Aware Image Synthesis. Schwarz et al., NeurIPS 2020 | github | bibtex
- GRF: Learning a General Radiance Field for 3D Scene Representation and Rendering. Trevithick and Yang, Arxiv 2020 | github | bibtex
- IDE-3D: Interactive Disentangled Editing for High-Resolution 3D-aware Portrait Synthesis. Project Page | Code
- Image Sculpting: Precise Object Editing with 3D Geometry Control. Project Page | Code
- Instant3D: Instant Text-to-3D Generation. Project Page | Code
- KiloNeRF: Speeding up Neural Radiance Fields with Thousands of Tiny MLPs. Reiser et al., ICCV 2021 | github | bibtex
- KiloNeuS: Implicit Neural Representations with Real-Time Global Illumination. Esposito et al., Arxiv 2022 | bibtex
- Learned Initializations for Optimizing Coordinate-Based Neural Representations. Tancik et al., CVPR 2021 | github | bibtex
- Learning the 3D Fauna of the Web. Project Page | Code
- Lightning NeRF: Efficient Hybrid Scene Representation for Autonomous Driving. Cao et al., ICRA 2024 | github | bibtex
- Loc-NeRF: Monte Carlo Localization using Neural Radiance Fields. Maggio et al., ICRA 2023 | github | bibtex
- Local 3D Editing via 3D Distillation of CLIP Knowledge. Hyung et al., CVPR 2023 | github | bibtex
- M3DBench: Let's Instruct Large Models with Multi-modal 3D Prompts. Project Page | Code
- Magic3D: High-Resolution Text-to-3D Content Creation. Project Page | Code (coming soon)
- Make-A-Character: High Quality Text-to-3D Character Generation within Minutes. Project Page | Code
- Make-It-3D: High-Fidelity 3D Creation from A Single Image with Diffusion Prior. Project Page | Code
- MinD-3D: Reconstruct High-quality 3D objects in Human Brain. Project Page | Code
- Mixture of Volumetric Primitives for Efficient Neural Rendering. Lombardi et al., SIGGRAPH 2021 | bibtex
- NeO 360: Neural Fields for Sparse View Synthesis of Outdoor Scenes. Irshad et al., ICCV 2023 | github | bibtex
- NeRF-MAE: Masked AutoEncoders for Self-Supervised 3D Representation Learning for Neural Radiance Fields. Irshad et al., ECCV 2024 | bibtex
- NeRF-VAE: A Geometry Aware 3D Scene Generative Model. Kosiorek et al., Arxiv 2021 | bibtex
- NeRF: Neural Radiance Field in 3D Vision, Introduction and Review. Kyle Gao, Yina Gao, Hongjie He, Dening Lu, Linlin Xu, Jonathan Li
- NeRF++: Analyzing and Improving Neural Radiance Fields. Zhang et al., Arxiv 2020 | github | bibtex
- Neural Volume Rendering: NeRF. Dellaert and Yen-Chen, Arxiv 2020 | blog | github
- NeuS: Learning Neural Implicit Surfaces by Volume Rendering for Multi-view Reconstruction. Wang et al., NeurIPS 2021 | github | bibtex
- One-2-3-45++: Fast Single Image to 3D Objects with Consistent Multi-View Generation and 3D Diffusion. Project Page | Code
- ParSeNet: A Parametric Surface Fitting Network for 3D Point Clouds. ECCV 2020 | project | code
- PC2WF: 3D Wireframe Reconstruction from Raw Point Clouds. ICLR 2021 | code
- pixelNeRF: Neural Radiance Fields from One or Few Images. Yu et al., CVPR 2021 | github | bibtex
- PlankAssembly: 3D Reconstruction from Three Orthographic Views with Learnt Shape Programs. ICCV 2023 | project | code
- Point2Cyl: Reverse Engineering 3D Objects from Point Clouds to Extrusion Cylinders. CVPR 2022 | project | code
- PolyGen: An Autoregressive Generative Model of 3D Meshes. ICML 2020 | code
- PonderV2: Pave the Way for 3D Foundation Model with A Universal Pre-training Paradigm. Zhu et al., Arxiv 2023 | github | bibtex
- Progressive3D: Progressively Local Editing for Text-to-3D Content Creation with Complex Semantic Prompts. Project Page | Code
- PSAvatar: A Point-based Shape Model for Real-Time Head Avatar Animation with 3D Gaussian Splatting. Code
- PVDeconv: Point-Voxel Deconvolution for Autoencoding CAD Construction in 3D. ICIP 2020 | project
- Reconstructing Editable Prismatic CAD from Rounded Voxel Models. SIGGRAPH Asia 2022
- RichDreamer: A Generalizable Normal-Depth Diffusion Model for Detail Richness in Text-to-3D. Project Page | Code
- Rig3DGS: Creating Controllable Portraits from Casual Monocular Videos. Project Page
- RT-NeRF: Real-Time On-Device Neural Radiance Fields Towards Immersive AR/VR Rendering. Li et al., ICCAD 2022 | bibtex
- SEEAvatar: Photorealistic Text-to-3D Avatar Generation with Constrained Geometry and Appearance. Project Page
- SIGNeRF: Scene Integrated Generation for Neural Radiance Fields. Project Page | Code
- Streaming Radiance Fields for 3D Video Synthesis. Li et al., NeurIPS 2022 | github | bibtex
- Street Gaussians: Modeling Dynamic Urban Scenes with Gaussian Splatting. Project Page | Code
- Supervised Fitting of Geometric Primitives to 3D Point Clouds. CVPR 2019 | code
- SurfelNeRF: Neural Surfel Radiance Fields for Online Photorealistic Reconstruction of Indoor Scenes. Gao et al., CVPR 2023 | github | bibtex
- Taming Mode Collapse in Score Distillation for Text-to-3D Generation. Project Page | Code
- Text-To-4D Dynamic Scene Generation. Project Page
- Text2CAD: Text to 3D CAD Generation via Technical Drawings. NeurIPS 2024 | project
- TextField3D: Towards Enhancing Open-Vocabulary 3D Generation with Noisy Text Fields. Project Page
- TextureDreamer: Image-guided Texture Synthesis through Geometry-aware Diffusion. Project Page
- TOSS: High-quality Text-guided Novel View Synthesis from a Single Image. Project Page
- Triplane Meets Gaussian Splatting: Fast and Generalizable Single-View 3D Reconstruction with Transformers. Project Page
- ViCA-NeRF: View-Consistency-Aware 3D Editing of Neural Radiance Fields. Code
- Volume Rendering of Neural Implicit Surfaces. Yariv et al., NeurIPS 2021 | github | bibtex
- Wonder3D: Single Image to 3D using Cross-Domain Diffusion. Project Page | Code
- X-NeRF: Explicit Neural Radiance Field for Multi-Scene 360° Insufficient RGB-D Views. Zhu et al., WACV 2023 | github | bibtex
- Zero123++: a Single Image to Consistent Multi-view Diffusion Base Model. Project Page | Code | Hugging Face
AI Image & Texture Generation Software
Also in Software Reference → AI Image & Texture Generation Software
| Software | Description | License | Tags | Best For |
|---|---|---|---|---|
| Adobe Firefly 3 | Commercial-safe, deep Creative Cloud integration. | Paid | Commercial-Safe · Adobe CC | Commercial-safe Adobe workflows |
| Adobe Firefly Textures | Prompt-to-edit texture workflows. | Paid | Adobe · Prompt-to-Edit | Adobe-integrated AI textures |
| Aga Miko/pixel Character Generator | Generating retro pixel game characters with Generative Adversarial Networks. Dataset "TinyHero" included. | Open Source | | |
| ashawkey/stable-dreamfusion | A PyTorch implementation of text-to-3D DreamFusion, powered by Stable Diffusion. | Open Source | | |
| ComfyUI | Node-based AI generator for image/video/3D/audio. | Open Source | Node-based · Multi-modal | Node-based AI workflows |
| DALL-E 3 (OpenAI) | Text-to-image with strong semantic accuracy. | Paid | OpenAI · Semantic | Semantic accuracy |
| FLUX 1.1 Pro (Black Forest Labs) | Production-grade open model plus Fill/Canny/Depth tools. | Freemium | Production-Grade · Open Model | Production-grade image gen |
| Google Gemini Image Gen | High realism, fast, large-scale production. | Freemium | Google · Realism | Fast large-scale gen |
| Grok Aurora (xAI) | Photorealistic image gen integrated into Grok. | Paid | xAI · Photoreal | Photorealism in Grok |
| HunyuanDiT (Tencent) | Open-source DiT-based, strong Chinese text. | Open Source | Tencent · DiT | Chinese text rendering |
| Ideogram 2.0 | Best-in-class typography and text rendering, canvas mode. | Freemium | Typography · Text Rendering | Typography in images |
| InstaMAT | Material authoring with AI workflows (Substance alternative). | Paid | Substance-alt · AI Workflows | AI-assisted material authoring |
| Jimeng / Dreamina (ByteDance) | High quality, integrated with video pipeline. | Freemium | ByteDance · Video Pipeline | Integrated image + video pipeline |
| Kolors (Kuaishou) | Open-source, bilingual Chinese/English. | Open Source | Open Source · Bilingual | Bilingual open model |
| Krea AI | Real-time generation, upscaling, and style transfer platform. | Freemium | Real-time · Upscaling | Real-time + upscaling |
| Leonardo AI | Real-time generation with artistic control. | Freemium | Real-time · Controlled | Controlled artistic generation |
| Meshy Textures | Integrated with 3D pipeline. | Freemium | Meshy Pipeline · Integrated | Textures tied to Meshy 3D |
| Midjourney v7 | Industry-leading artistic coherence and stylized output. | Paid | Artistic · Subscription | Artistic and stylized imagery |
| MusicGen | Meta's music generation model. | Open Source | ||
| Playground v3 | Mixed-mode design canvas for graphic design. | Freemium | Design Canvas · Mixed-mode | Graphic design canvas |
| Poly | AI-generated PBR textures and materials, tileable with full map sets. | Freemium | AI PBR · Tileable | AI PBR textures |
| Ponzu | AI texture gen from text prompts for uploaded meshes. | Freemium | Mesh-aware · Text→Tex | Mesh-aware AI textures |
| Recraft V3 | Design-oriented gen. Vector, icons, brand assets. | Freemium | Design Focus · Vector | Design assets, vectors |
| Scenario | Game-ready PBR materials, full map sets. | Paid | Game-Ready · Full Maps | Game-ready AI materials |
| Snowpixel | Generates images, video, animation, audio, music, and 3D objects from text and/or image prompts. Upload your own data to create custom models. | Freemium | ||
| Stable Animation SDK | Text-to-animation SDK for developers, by Stability AI. | Paid | ||
| Stable Diffusion 3.5 | Open-source diffusion (Large, Medium, Turbo sizes). | Open Source | Open Source · Self-Host | Self-hosted image gen |
| UneeQ Digital Humans | 3D character lib synced | Paid | ||
AI Video Generation Software
Also in Software Reference → AI Video Generation Software
| Software | Description | License | Tags | Best For |
|---|---|---|---|---|
| CogVideoX | Tsinghua/Zhipu open-source, multiple sizes. | Open Source | Open Source · Tsinghua | Research open-source video gen |
| Genmo Mochi 1 | Open-source video gen model. | Open Source | Open Source · Genmo | Open-source video gen |
| Hailuo AI / MiniMax | "Director" model. Strong motion and character consistency. | Freemium | Director Model · Character Consistency | Directed motion |
| Haiper | Ex-DeepMind team, animation and video-to-video modes. | Freemium | Ex-DeepMind · V2V | Video-to-video |
| HunyuanVideo 1.5 | Open-source 8.3B params, runs on 14GB VRAM. | Open Source | Open Source · 14GB VRAM | Self-hosted video gen |
| Kling 3.0 | Up to 5min clips, strong human motion. Motion Brush control. | Freemium | 5min Clips · Motion Brush | Long clips, human motion |
| LTX Video | Open-source, real-time generation speeds. | Open Source | Open Source · Real-time | Real-time open-source video gen |
| Luma Dream Machine | Fast generations with good camera motion. | Freemium | Fast · Camera Motion | Fast video gen |
| Pika 2.0 | Extended video gen with improved consistency. | Freemium | Extended Gen · Stylized | Stylized video |
| PixVerse | Style-specific modes (anime, 3D, realistic), character consistency. | Freemium | Style Modes · Consistency | Style-specific video gen |
| Runway Gen-4 | Character consistency across shots, controllable camera. | Freemium | Character Consistency · Camera Control | Character consistency |
| Sora (OpenAI) | Up to ~20s clips with strong temporal coherence. | Paid | OpenAI · 20s Clips | Coherent video clips |
| Veo 3.1 (Google DeepMind) | Top leaderboard, native audio, 60s+ clips. | Paid | Top Leaderboard · Native Audio | Top-quality video gen |
| Vidu | 16s clips, strong human motion (Shengshu). | Freemium | 16s · Human Motion | Long human-motion clips |
| Wan 2.2 (Alibaba) | Cinematic MoE diffusion, 8GB+ VRAM. Open source. | Open Source | Open Source · MoE Diffusion | Cinematic open-source video gen |
AI 3D Generation Software
Also in Software Reference → AI 3D Generation Software
| Software | Description | License | Tags | Best For |
|---|---|---|---|---|
| 3DTopia | Open-source text-to-3D pipeline (coarse → refined). | Open Source | Coarse→Refined · Open Source | Text-to-3D open-source |
| InstantMesh | Fast single-image-to-3D reconstruction. | Open Source | Fast · Image→3D | Fast image→3D |
| Kaedim | Image-to-3D with hybrid AI plus artist cleanup for production quality. | Paid | Hybrid AI+Artist · Production | Production-quality via hybrid AI+artist |
| Luma Genie | Text/image to 3D, integrated with Dream Machine. | Freemium | Luma · Integrated | Luma-integrated 3D gen |
| Meshy v4 | Production-reliable, improved topology and PBR textures. | Freemium | Production-Reliable · PBR | Production-ready AI 3D |
| Rodin Gen-2 (Hyper3D) | 10B params, photorealistic, free generation tier. | Freemium | 10B Params · Free Tier | Photorealistic AI 3D |
| Sloyd | Procedural 3D generation with parametric control. | Freemium | Procedural · Parametric | Parametric procedural 3D |
| Spline AI | Generate 3D objects and textures from text prompts in-editor. | Freemium | In-Editor · Text→3D | In-editor AI 3D |
| Stability SPAR3D | Open-source single-image 3D reconstruction. | Open Source | Stability AI · Open Source | Single-image open-source 3D |
| TRELLIS.2 (Microsoft) | Full PBR materials, complex topologies. | Open Source | Microsoft · PBR + Open Source | PBR AI 3D (open source) |
| Tripo v3.0 | Sculpture-level precision, clean quad topology. | Freemium | Clean Quads · Sculpt-level | Clean topology AI 3D |
| TripoSR | Tripo/Stability collab. Fast open-source image-to-3D. | Open Source | Open Source · Fast | Fast open-source image→3D |
| Unique3D | High-quality mesh from single image (NeurIPS 2024). | Open Source | Single-Image · NeurIPS | Single-image open-source 3D |
| Wonder3D++ | Cross-domain diffusion, textured meshes in 2-3min. | Open Source | Fast · Textured Mesh | Fast textured mesh gen |
AI Audio & Music Generation Software
Also in Software Reference → AI Audio & Music Generation Software
| Software | Description | License | Tags | Best For |
|---|---|---|---|---|
| ACE Studio | AI singing voice synthesizer with expression control. | Freemium | Singing · Expression | AI singing synthesis |
| AIVA | AI composition, good for orchestral/cinematic scoring. | Freemium | Orchestral · Scoring | Orchestral/cinematic scoring |
| DiffRhythm | Open-source full-song gen with vocals from lyrics. | Open Source | Open Source · Lyrics→Song | Open-source song gen with vocals |
| ElevenLabs | Voice cloning, narration, music generation. | Freemium | Voice Cloning · TTS | Voice cloning, TTS |
| F5 TTS | Open-source zero-shot voice cloning TTS. | Open Source | Zero-Shot · Voice Clone | Zero-shot voice cloning |
| Fish Audio | Open-source TTS with voice cloning, fast and multilingual. | Open Source | Open Source · Multilingual | Open-source TTS |
| Suno v5 | Full song generation, 100M+ users. | Freemium | Song Gen · 100M users | Full song generation |
| Udio | Strong electronic/pop, licensed for commercial use. | Freemium | Electronic/Pop · Commercial | Commercial-licensed AI music |
AI-Assisted CG Software
Also in Software Reference → AI-Assisted CG Software
| Software | Description | License | Tags | Best For |
|---|---|---|---|---|
| Autodesk Flow Studio (Wonder Dynamics) | AI VFX, auto CG characters in live footage, USD export. | Paid | Auto CG Chars · USD Export | AI CG characters in live plates |
| Blockade Labs Skybox AI | AI-generated 360° skyboxes/HDRIs from text. | Freemium | AI Skybox · 360° | AI skyboxes/HDRIs |
| Gigapixel AI | AI photo enlargement, marketed as adding detail back to upscaled photos. | Paid | ||
| Magnific AI | AI upscaler with "creativity" slider. Reimagines/adds detail (2-16x). | Paid | Creative Upscale · 2-16x | Creative AI upscaling |
| Rosebud AI | Vibe coding platform for creating 3D games and interactive web apps with AI. | Freemium | AI Game Builder · Vibe Coding | |
| Topaz Photo AI / Video AI | Best-in-class upscaling, denoising, sharpening. | Paid | Upscale · Denoise | Upscale/denoise |
| Unity Muse | Unity's suite of AI products. | Paid | Unity AI Suite | |