The diffusion model does not imagine. It interpolates. When Midjourney or Stable Diffusion generates an image from a text prompt, it is not creating something new; it is navigating a latent space constructed from billions of existing images. The output is a weighted average, a statistical recombination of visual patterns extracted from a training corpus that was itself scraped, without consent, from the networked archive of human visual culture. The model is a mirror, but a distorting one—it reflects back what it has been fed, smoothed and compressed into a probabilistic surface.
This distinction matters. The rhetoric surrounding generative AI insists on novelty: these systems "create," "imagine," "dream." But diffusion models are fundamentally interpolative. They learn to denoise: during training, images are progressively corrupted with random noise and the model learns to undo that corruption; at generation time it runs the process in reverse, starting from pure noise and removing it step by step until a coherent picture emerges. What guides this denoising is not creativity but correlation: the model has learned which pixel patterns co-occur in its training data, and it reconstructs images by predicting the most statistically likely configuration given a prompt. The output is not invented; it is retrieved, recombined, and rendered plausible.
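For readers who want the mechanics spelled out, the sketch below is a minimal version of that sampling loop, following the standard DDPM formulation rather than any particular product's pipeline. The `noise_predictor` is a stand-in for the trained network that has absorbed the corpus's correlations, and the schedule constants are illustrative defaults, not real settings.

```python
# A minimal sketch of reverse-diffusion sampling (standard DDPM form).
# `noise_predictor` is a hypothetical trained network; schedule values are
# illustrative, not those of Midjourney or Stable Diffusion.
import torch

def sample(noise_predictor, prompt_embedding, shape=(1, 3, 64, 64), steps=1000):
    betas = torch.linspace(1e-4, 0.02, steps)      # forward noise schedule
    alphas = 1.0 - betas
    alpha_bars = torch.cumprod(alphas, dim=0)

    x = torch.randn(shape)                         # begin from pure noise
    for t in reversed(range(steps)):
        # The network predicts the noise it believes was added at step t,
        # conditioned on the prompt: learned correlation, nothing more.
        eps = noise_predictor(x, t, prompt_embedding)
        mean = (x - betas[t] / torch.sqrt(1 - alpha_bars[t]) * eps) / torch.sqrt(alphas[t])
        noise = torch.randn_like(x) if t > 0 else torch.zeros_like(x)
        x = mean + torch.sqrt(betas[t]) * noise    # one denoising step
    return x                                       # the "generated" image
```

Nothing in the loop invents; every step is a prediction of what the training distribution makes most probable given the prompt.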
The training set is therefore not an input but a parameter. It determines not just what the model can generate but what it understands as possible. A model trained on Western art history will produce Western art history. A model trained on stock photography will produce stock photography. The biases are not bugs; they are features—encoded at the level of the data, inherited by the weights, reproduced in every output. When critics point out that AI image generators default to white faces, thin bodies, and Eurocentric aesthetics, they are not describing a failure of the algorithm. They are describing the dataset, which is to say, they are describing the internet, which is to say, they are describing us.
This is where the diffusion model becomes diagnostic. It does not show us what AI can imagine; it shows us what the internet has accumulated. The model is a compression of visual culture, and like all compressions, it loses information at the margins. The images that were rare in the training data become rarer in the output. The images that were common become dominant. The model does not diversify; it averages. And the average, in a dataset scraped from a web shaped by advertising, engagement metrics, and platform incentives, is not neutral. It is a portrait of attention under capitalism: optimized, flattened, designed to be consumed.
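To make the averaging claim concrete, the toy calculation below shows what happens when generation sharpens the learned distribution, as high guidance scales and low-temperature sampling tend to do in practice. The category counts and the sharpening exponent are invented purely for illustration.

```python
# Toy illustration: sharpened sampling amplifies the majority and suppresses
# the rare. Counts and the exponent `gamma` are made up for demonstration.
counts = {"stock-photo aesthetic": 800, "Western fine art": 150, "everything else": 50}
total = sum(counts.values())
probs = {k: v / total for k, v in counts.items()}

gamma = 2.0  # sharpening exponent; values > 1 exaggerate the mode
sharpened = {k: p ** gamma for k, p in probs.items()}
z = sum(sharpened.values())
sharpened = {k: v / z for k, v in sharpened.items()}

for k in counts:
    print(f"{k}: {probs[k]:.2%} of training data -> {sharpened[k]:.2%} of outputs")
# "everything else" falls from 5% of the data to well under 1% of the outputs
```

The numbers are fictional, but the direction of the effect is the point: whatever the web over-represents, the model over-represents further.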
Artists working with diffusion models confront this condition in different ways. Some treat the model as a tool, using prompts and fine-tuning to direct outputs toward specific aesthetic goals. Others treat it as a collaborator, engaging in iterative dialogue with the system to discover unexpected results. A smaller number treat it as a material, foregrounding the model's tendencies and limitations as the subject of the work itself. Holly Herndon and Mat Dryhurst's practice falls into this last category: their projects explicitly address the provenance of training data, the ethics of extraction, and the question of who benefits when a model learns from collective cultural production without compensation or consent.
This question—who benefits?—is the political core of the diffusion model debate. The training sets for major image generators were assembled by scraping publicly accessible images, including work by artists who never consented to have their images used for machine learning. The output of these models can then be sold, licensed, or used commercially by the companies that trained them. The value created by the collective labor of millions of image-makers is captured by a handful of corporations. This is not a new dynamic—it is the logic of platform capitalism applied to cultural production—but it is newly visible in the case of AI, because the extraction is so literal. The model contains the images. It cannot function without them.
Some artists have responded by turning their own images into countermeasures: Glaze cloaks a picture with perturbations that are imperceptible to humans but lead models to misread its style, while Nightshade goes further and poisons the well, seeding training data with images that corrupt what a model learns from them. Others have opted out entirely, withdrawing their images from platforms where they might be scraped. But these are defensive measures. They do not address the structural asymmetry between individual creators and corporate infrastructure. The model has already been trained. The images have already been ingested. The question is not whether extraction will occur but how the products of extraction will be distributed.
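The sketch below is a conceptual illustration of the general idea behind such tools, not Glaze's or Nightshade's actual algorithms: a small, bounded perturbation is optimized so that an image's embedding under some feature extractor drifts toward a decoy, while the pixel change stays within a perceptual budget. The resnet18 extractor is a stand-in; a real tool would target the encoders the generative models themselves use.

```python
# Conceptual sketch of feature-space "cloaking" (NOT the Glaze/Nightshade
# algorithms): optimize a bounded perturbation so the image's embedding
# drifts toward a decoy target.
import torch
import torchvision.models as models

# Stand-in feature extractor; a real tool targets the models it aims to confuse.
extractor = models.resnet18(weights=None).eval()

def cloak(image, decoy, steps=100, lr=0.01, eps=0.03):
    """image, decoy: float tensors of shape (1, 3, H, W), values in [0, 1]."""
    delta = torch.zeros_like(image, requires_grad=True)
    optimizer = torch.optim.Adam([delta], lr=lr)
    target = extractor(decoy).detach()              # embedding we drift toward
    for _ in range(steps):
        optimizer.zero_grad()
        features = extractor((image + delta).clamp(0, 1))
        loss = torch.nn.functional.mse_loss(features, target)
        loss.backward()
        optimizer.step()
        with torch.no_grad():
            delta.clamp_(-eps, eps)                 # keep the change imperceptible
    return (image + delta).clamp(0, 1).detach()
```

The asymmetry the essay describes is visible even here: the artist must solve an optimization problem, image by image, to resist a system that ingested their work in bulk.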
This is where the art market enters the frame. Diffusion models have produced a new category of cultural object: the AI-generated image, which can be minted as an NFT, printed and framed, or licensed for commercial use. These objects circulate in the same markets as traditional art, often commanding significant prices. But their relationship to authorship is fundamentally different. The artist who prompts a diffusion model is not the same as the artist who paints a canvas. The labor is distributed differently—across the model's developers, the creators of the training data, the designers of the prompt interface, and the user who types the text. The question of who made the work, and who should be compensated for it, does not have a clear answer.
The legal system is beginning to grapple with this ambiguity. Copyright law, designed for a world of identifiable authors and discrete objects, struggles to accommodate works that are statistically derived from thousands of sources. Courts have ruled that AI-generated images cannot be copyrighted if no human author can be identified, but this leaves open the question of what happens when a human provides the prompt, curates the output, and claims the result as their own. The model becomes a kind of ghost author—present in the work, responsible for its form, but legally invisible.
What diffusion models reveal, finally, is that originality was always a legal and market category before it was an aesthetic one. The Romantic notion of the artist as singular genius, creating ex nihilo, was itself a product of copyright regimes and market structures that rewarded claims of individual authorship. The diffusion model does not disprove this ideology; it exposes its contingency. When images can be generated from a statistical model trained on collective production, the fiction of the lone creator becomes harder to sustain. The work is always already collaborative, always already networked, always already indebted to what came before.
This does not mean authorship disappears. It means authorship is redistributed, made visible as a social relation rather than an individual property. The artist who works with diffusion models is not creating from nothing; they are curating from everything, navigating a space of possibilities defined by the accumulated visual culture of the networked age. The model is a mirror, and what it reflects is not the artist's imagination but the collective image-world from which both artist and model draw. To work with AI is to work with this inheritance—its biases, its exclusions, its compressed and averaged memory of what images have been. The question is not whether to use the mirror but how to look into it critically, knowing that what looks back is not the future but the past, endlessly recombined.