Bookmarks
World Models
Can agents learn inside of their own dreams?
Optimizing Transformer-Based Diffusion Models for Video Generation with NVIDIA TensorRT
State-of-the-art image diffusion models take tens of seconds to process a single image. Video diffusion is even more challenging, demanding significant computational resources and incurring high costs.
Position: Model Collapse Does Not Mean What You Think
The proliferation of AI-generated content online has fueled concerns over "model collapse", a degradation in future generative models' performance when trained on synthetic data generated by earlier models. Industry leaders, premier research journals and popular science publications alike have prophesied catastrophic societal consequences stemming from model collapse. In this position piece, we contend this widespread narrative fundamentally misunderstands the scientific evidence. We highlight that research on model collapse actually encompasses eight distinct and at times conflicting definitions of model collapse, and argue that inconsistent terminology within and between papers has hindered building a comprehensive understanding of model collapse. To assess how significantly different interpretations of model collapse threaten future generative models, we posit what we believe are realistic conditions for studying model collapse and then conduct a rigorous assessment of the literature's methodologies through this lens. While we leave room for reasonable disagreement, our analysis of research studies, weighted by how faithfully each study matches real-world conditions, leads us to conclude that certain predicted claims of model collapse rely on assumptions and conditions that poorly match real-world conditions, and in fact several prominent collapse scenarios are readily avoidable. Altogether, this position paper argues that model collapse has been warped from a nuanced multifaceted consideration into an oversimplified threat, and that the evidence suggests specific harms more likely under society's current trajectory have received disproportionately little attention.
diffusion transformers
Metaphorically, you can think of Vision Transformers as the eyes of the system, able to understand and contextualize what it sees, while Stable Diffusion is the hand of the system, able to generate and manipulate images based on this understanding.
Flow Matching Guide and Code
Flow Matching (FM) is a recent framework for generative modeling that has
achieved state-of-the-art performance across various domains, including image,
video, audio, speech, and biological structures. This guide offers a
comprehensive and self-contained review of FM, covering its mathematical
foundations, design choices, and extensions. By also providing a PyTorch
package featuring relevant examples (e.g., image and text generation), this
work aims to serve as a resource for both novice and experienced researchers
interested in understanding, applying and further developing FM.
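As a hedged sketch of the core idea (an illustration of the framework, not code from the guide or its PyTorch package): with the common linear (optimal-transport) conditional path, the Flow Matching regression target for a noise/data pair is simply x1 - x0.

```python
import numpy as np

rng = np.random.default_rng(0)

def fm_training_pair(x0, x1, t):
    """Linear conditional path x_t = (1 - t) * x0 + t * x1.
    The regression target for the velocity field is x1 - x0."""
    x_t = (1.0 - t) * x0 + t * x1
    target = x1 - x0
    return x_t, target

x0 = rng.standard_normal(4)   # noise sample
x1 = rng.standard_normal(4)   # data sample
x_t, v = fm_training_pair(x0, x1, t=0.3)
# A model v_theta(x_t, t) would be trained with an MSE loss against v,
# then sampled by integrating dx/dt = v_theta(x, t) from t=0 to t=1.
```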
Genie 2: A large-scale foundation world model
Generating unlimited diverse training environments for future general agents
WilliamYi96/Awesome-Energy-Based-Models: A curated list of resources on energy-based models.
A curated list of resources on energy-based models. - WilliamYi96/Awesome-Energy-Based-Models
CBLL, Research Projects, Computational and Biological Learning Lab, Courant Institute, NYU
Yann LeCun's Web pages at NYU
yataobian/awesome-ebm: Collecting research materials on EBM/EBL (Energy Based Models, Energy Based Learning)
Collecting research materials on EBM/EBL (Energy Based Models, Energy Based Learning) - yataobian/awesome-ebm
Oasis: A Universe in a Transformer
Generating Worlds in Realtime
Tutorial on Diffusion Models for Imaging and Vision
The astonishing growth of generative tools in recent years has empowered many
exciting applications in text-to-image generation and text-to-video generation.
The underlying principle behind these generative tools is the concept of
diffusion, a particular sampling mechanism that has overcome some shortcomings
that were deemed difficult in the previous approaches. The goal of this
tutorial is to discuss the essential ideas underlying the diffusion models. The
target audience of this tutorial includes undergraduate and graduate students
who are interested in doing research on diffusion models or applying these
models to solve other problems.
Stretching Each Dollar: Diffusion Training from Scratch on a Micro-Budget
The authors present a method for training large text-to-image diffusion models on a very low budget. They use a technique called deferred masking to minimize performance loss while reducing computational costs. Their approach achieves high-quality results at a fraction of the cost compared to existing models, demonstrating the potential for democratizing AI training.
Picsart-AI-Research/LIVE-Layerwise-Image-Vectorization: [CVPR 2022 Oral] Towards Layer-wise Image Vectorization
The text discusses a new method called LIVE for generating SVG images layer by layer to fit raster images. LIVE uses closed Bézier paths to learn visual concepts in a recursive manner. Installation instructions and references for the method are provided in the text.
Step-by-Step Diffusion: An Elementary Tutorial
A step-by-step tutorial on diffusion models by Preetum Nakkiran, Arwen Bradley, Hattie Zhou, and Madhu Advani.
What are Diffusion Models?
Diffusion models slowly add noise to data and then learn to reverse the process to create desired samples. Unlike other generative models, the forward process is fixed rather than learned, and the latent variables have the same high dimensionality as the data. Training a diffusion model involves approximating the conditioned probability distributions of the reverse process and simplifying the objective function.
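The fixed forward process described above has a convenient closed form: x_t can be sampled at any step without iterating. A minimal sketch, assuming a standard DDPM linear variance schedule (the schedule and names here are illustrative, not from the post):

```python
import numpy as np

rng = np.random.default_rng(0)

T = 1000
betas = np.linspace(1e-4, 0.02, T)      # linear noise schedule
alpha_bar = np.cumprod(1.0 - betas)     # cumulative product \bar{alpha}_t

def q_sample(x0, t, eps):
    """Sample x_t ~ q(x_t | x_0) in closed form:
    x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps."""
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps

x0 = rng.standard_normal(8)
eps = rng.standard_normal(8)
x_noisy = q_sample(x0, T - 1, eps)
# At t = T-1, alpha_bar is near zero: almost all signal has been
# replaced by noise, which is what the reverse process learns to undo.
```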
Iterative α-(de)Blending: a Minimalist Deterministic Diffusion Model
The paper presents a simple and effective denoising-diffusion model called Iterative α-(de)Blending. It offers a user-friendly alternative to complex theories, making it accessible with basic calculus and probability knowledge. By iteratively blending and deblending samples, the model converges to a deterministic mapping, showing promising results in computer graphics applications.
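A minimal sketch of the idea (the names and the oracle "network" below are illustrative stand-ins, not the authors' code): blending is plain linear interpolation, and the deterministic sampler repeatedly steps x_alpha along the predicted x1 - x0 direction.

```python
import numpy as np

rng = np.random.default_rng(0)

def blend(x0, x1, alpha):
    """alpha-blending: linear interpolation between two samples."""
    return (1.0 - alpha) * x0 + alpha * x1

def sample(x0, direction_fn, steps=100):
    """Deterministic sampler: step x_alpha along the predicted
    deblending direction D(x_alpha, alpha) ~= x1 - x0."""
    x, d_alpha = x0.copy(), 1.0 / steps
    for i in range(steps):
        x = x + d_alpha * direction_fn(x, i * d_alpha)
    return x

# Toy oracle: when the direction is known exactly, iterating the
# blending/deblending steps maps the source x0 onto the target x1.
x0 = rng.standard_normal(4)
x1 = rng.standard_normal(4)
oracle = lambda x, a: x1 - x0
x_end = sample(x0, oracle)
```

In practice a trained network replaces the oracle, and the mapping converges to a deterministic transport between the two distributions.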
How diffusion models work: the math from scratch
Diffusion models generate diverse high-resolution images and differ from previous generative methods. Cascade diffusion models and latent diffusion models are used to scale generation to higher resolutions efficiently. Score-based generative models are closely related to diffusion models, likewise using noise perturbations to generate new samples.
The Annotated Diffusion Model
A neural network learns to denoise data by gradually removing noise. Training involves adding noise to an image and teaching the network to reverse that corruption: given a corrupted image and its time step, the network predicts the noise that was added.
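The training objective this describes can be sketched as follows (a toy stand-in model and NumPy instead of the post's PyTorch U-Net; the schedule is an assumed standard one):

```python
import numpy as np

rng = np.random.default_rng(0)

T = 1000
alpha_bar = np.cumprod(1.0 - np.linspace(1e-4, 0.02, T))

def training_loss(model, x0):
    """Simplified DDPM loss: corrupt x0 at a random step t, then
    regress the model's output onto the noise that was added."""
    t = rng.integers(T)
    eps = rng.standard_normal(x0.shape)
    x_t = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return np.mean((eps - model(x_t, t)) ** 2)

# A perfect noise predictor would drive this loss to zero; a toy
# "model" that always outputs zeros leaves a loss near E[eps^2] = 1.
zero_model = lambda x_t, t: np.zeros_like(x_t)
loss = training_loss(zero_model, rng.standard_normal(512))
```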
Defusing Diffusion Models
This post explains the concepts of forward and reverse diffusion processes in diffusion models. By understanding these processes, readers can train diffusion models to generate samples from target distributions effectively. Guided diffusion models are also discussed, showing how conditioning information can be used to guide the diffusion process for specific outcomes.
The Illustrated Stable Diffusion
AI image generation with Stable Diffusion involves an image information creator and an image decoder. Diffusion models use noise and powerful computer vision models to generate aesthetically pleasing images. Text can be incorporated to control the type of image the model generates in the diffusion process.
Memory in Plain Sight: A Survey of the Uncanny Resemblances between Diffusion Models and Associative Memories
Diffusion Models and Associative Memories show surprising similarities in their mathematical underpinnings and goals, bridging traditional and modern AI research. This connection highlights the convergence of AI models towards memory-focused paradigms, emphasizing the importance of understanding Associative Memories in the field of computation. By exploring these parallels, researchers aim to enhance our comprehension of how models like Diffusion Models and Transformers operate in Deep Learning applications.
Memory in Plain Sight: A Survey of the Uncanny Resemblances between Diffusion Models and Associative Memories
Diffusion Models (DMs) have become increasingly dominant on generative benchmarks, but their mathematical descriptions can be complex. In this survey, the authors provide an overview of DMs from the perspective of dynamical systems and Ordinary Differential Equations (ODEs), revealing a mathematical connection to Associative Memories (AMs). AMs are energy-based models that share similarities with denoising DMs, but they allow for the computation of a Lyapunov energy function and gradient descent to denoise data. The authors also summarize the 40-year history of energy-based AMs, starting with the Hopfield Network, and discuss future research directions for both AMs and DMs.
Pen and Paper Exercises in Machine Learning
This is a collection of (mostly) pen-and-paper exercises in machine learning.
The exercises are on the following topics: linear algebra, optimisation,
directed graphical models, undirected graphical models, expressive power of
graphical models, factor graphs and message passing, inference for hidden
Markov models, model-based learning (including ICA and unnormalised models),
sampling and Monte-Carlo integration, and variational inference.
MotionGPT: Human Motion as a Foreign Language
MotionGPT is a unified model for language and motion tasks, achieving top performance in text-driven motion generation. It combines natural language models with human motion tasks, benefiting fields like gaming and robotics. The model treats human motion like a foreign language, offering a versatile solution for diverse motion synthesis problems.
Subcategories
- applications (9)
- compression (9)
- computer_vision (8)
- deep_learning (94)
- ethics (2)
- generative_models (25)
- interpretability (17)
- natural_language_processing (24)
- optimization (7)
- recommendation (2)
- reinforcement_learning (11)
- supervised_learning (1)