I know algorithms don’t have a sense of humour... but the YouTube algo cued up a talk from the leader of the team that developed the A.I. text-to-image model Stable Diffusion. Of course I watched it. Björn Ommer is a Professor at the University of Munich and leads the Computer Vision and Learning Group. He certainly didn’t endear himself to me with his slide entitled “Making Images Great Again.” His project, as he describes it, is focused on making sense of the world, democratizing image generation, and creating a tool that can be used on a cell phone without the need for a server farm. His demonstration of the tool, as he rifled through a bevy of morphing images, could only be described as pixel porn.
I then defied the algo and selected a dialogue between cognitive scientist John Vervaeke and philosopher Evan Thompson, as they discussed Thompson’s co-authored book, The Blind Spot: Why Science Cannot Ignore Human Experience. An excellent book review by philosophy professor Robert Crease offers the central theme… “philosopher Edmund Husserl's remark… the West has both flourished because of and been afflicted by the "surreptitious substitution" of theories and ideas for the world itself. We bifurcate the world into what is "out there," and thus objective, and what is "in here," and thus subjective, regarding the first as what's real and our experience as a mere glimpsing of it.”
The book’s authors, astrophysicist Adam Frank, theoretical physicist Marcelo Gleiser, and philosopher Evan Thompson call for a new scientific approach, where science includes—rather than ignores or excludes—our lived experience as an essential part of the search for objective truth.
This seemed like a great boxing match: Ommer and his shiny machine-learning picture maker against Vervaeke and Thompson, who raise the fists of a philosophy of embodied experience and perception. In Round 1, V & T dismantle the mathematical intelligence that Ommer offers up as gen A.I.’s core value for helping us understand our world. Thompson quotes Stuart Kauffman, medical doctor, theoretical biologist and complex systems researcher: “Life is based on physics, but life is beyond physics.” We need to “participate in life to form the relevant concepts to individuate a phenomena of life.”
Another chapter in the book deals with ‘lived time’ and ‘clock time’ as a further example of surreptitious substitution. Crease’s review sums this up neatly: “Clocks don't tell time; people do. Nobody pulls out a clock to know what time is, only what time it is.” Thompson goes further, pointing to the pernicious research approach of building a model from rich human sources and then imbuing that model with a sense of value, such that the model’s product is seen as more fundamentally meaningful than the human experience that gave the model life.
Ommer demonstrates this hubris as his model attempts to capture all the phenomena of seeing and experiencing through an autoencoder, forward and backward diffusion, noise prediction, and text pairing. He claims he is offering creativity. But does training a model to add and subtract predicted noise, resulting in novel arrangements of pixels, amount to a unique creative entity? We miss the fundamental human experience that gave life to each individual image.
Ommer makes another claim for his model: that Stable Diffusion is making sense of the world. There is no magic process that adds relevance as human creativity is wrung out of the billions of images this model consumed. The model does rely on the semantic and syntactical approach of living artists (the infringement argument from law professor Tim Dornis and A.I. scientist Sebastian Stober), but this still doesn’t create a new language of seeing or understanding. When used as a text-to-image tool, it has less value to humanity than the Slap Chop.
I understand why it was done. Machine learning needs massive datasets, and data scientists already had their giant corral: all those words waiting for the slaughterhouse of the LLM. Why leave all those images unharvested, the low-hanging fruit of naive artists who had shared their life’s work on platforms in the vain hope of recognition and work? Ripe for the picking.


What should these algos have been built to do? The best answer I have is the example provided by a lonely introvert, browbeaten by his mother, who never left his home on Utopia Parkway in Flushing, Queens, for 43 years, except for trips to New York City to collect the artifacts, objects, and film scraps that he built into his marvelous exploration of the world, nature, and the cosmos. Joseph Cornell never travelled this earth, but in the maps, materials, and spaces he constructed he gave our world new meaning.



The frontiers of a visual science should offer us awe and wonder and allow us to visit places we have never seen, to venture beyond our imagination, and to present the vast diversity of this world. These products were created by research labs, not Hasbro, and what we got is painting in the style of Van Gogh.
Joseph’s work is beautiful and I am sorry he endured so much in his life. I hope making his work gave him access to the privacy and joy he otherwise probably lacked.