"...generating sequential data is the closest computers get to dreaming" — Alex Graves, machine learning researcher
"And the Search is not only an instrument, but a machine." — Gilles Deleuze, philosopher
Machine Proust is a project I built while a student at NYU’s Interactive Telecommunications Program (ITP) in the fall of 2018.
It was motivated by an interest in new modes of creative collaboration between people and machines.
Central to my curiosity about these new modes of collaboration is the notion of intelligence augmentation. Intelligence augmentation is often posed as a kind of alternative to artificial intelligence understood as a tool to which we outsource labor or cognition. Machine learning models are very often employed in instrumental tasks like image recognition (telling a picture of a cat from that of a dog), sentiment prediction (telling whether a user review is “negative” or “positive”), and other similarly rote tasks. Besides this paradigm of simply training machine learning models to do things we’d rather not, there exists another: one where machine learning models are used toward more generative ends, such as surprise, illumination, and revelation, shedding light on unintuitive knowledge or on alternative, alien perspectives manufactured by the trained models. (You can read more about this point of view here and here.)
Machine learning, conceived as such, could also be thought of as a worthy partner for collaboration, whether as an ancillary device that spurs on the user’s imagination, or as one that molds or tunes their contribution to fit a certain template, then iterates on and modulates it until the user is effectively performing a duet with the machine.
The point, both in general and for the Machine Proust project in particular, is not to arrive at a level of fidelity or uncanniness at which one cannot tell the machinations of the model from the thing it is trained on (Proust or Bach or whatever else). It is rather the more holistic dynamic of the user’s imagination being fructified by the synthetic output of the machine learning model. In this sense, this point of view could be considered of a piece with a human-centered-design-inflected approach to machine learning.
This project in particular makes use of the recurrent neural network (RNN) architecture, which is older and less powerful than some of the latest neural network architectures, such as the transformer. I have run many training sessions with a variety of training inputs on some of these newer models, and, as expected, the output is a lot better and a lot harder to tell apart from the input. But that is not always as interesting (though it certainly can be), as this kind of output can read like a duplicate or continuation of the input, leaving less room to dream, interject, cannibalize, and transform than the slightly more broken output that some older architectures produce.
And this room to dream, to interject, and to transform is integral to my project. The output on the screen invites user edits, allowing anyone to generate a sequence, cut it up, add to it, take away from it, and then generate more, resulting in a true collaboration.
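The generate, edit, generate-again loop can be sketched in a few lines. This is only an illustration, not the project's code: a character-level Markov chain stands in for the trained RNN so that the example runs without any training, and the names `train`, `generate`, and `corpus`, along with the placeholder text, are assumptions made for the sketch.

```python
import random
from collections import defaultdict


def train(text, order=3):
    """Map each `order`-character context to the characters that follow it."""
    model = defaultdict(list)
    for i in range(len(text) - order):
        model[text[i:i + order]].append(text[i + order])
    return model


def generate(model, seed, length=80, order=3, rng=None):
    """Continue `seed` by up to `length` characters, one character at a time."""
    rng = rng or random.Random()
    out = seed
    for _ in range(length):
        followers = model.get(out[-order:])
        if not followers:  # unseen context: stop rather than invent text
            break
        out += rng.choice(followers)
    return out


# Placeholder corpus (hypothetical; not the project's training data).
corpus = "the search for lost time is the search for the time that is lost. " * 3
model = train(corpus)

# 1. The machine proposes a sequence.
draft = generate(model, "the search", length=40, rng=random.Random(0))
# 2. The person cuts it up and adds their own words.
edited = draft[:20] + " and the machine dreams of the "
# 3. The machine picks up from the edited text and continues.
final = generate(model, edited, length=40, rng=random.Random(1))
print(final)
```

The part that matters is the third step: the person's edited text, not the machine's original draft, becomes the seed for the next round, so each pass folds the human contribution back into what the model continues from.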