Google AI researcher and piano player Pablo Castro knows that musicians can settle into a comfort zone. It’s something they can rely on in front of a paying audience, but it can also become boring and limit growth. To push those creative boundaries, Castro is developing a deep generative AI model that encourages musicians to tap into uniquely human music through improvisation.
“Because we’ve been trained for so long, we can use our musical training to sort of navigate these uncomfortable areas in a creative way. And that sometimes lends itself to new types of musical expression,” Castro said in a phone conversation with VentureBeat.
Castro plays piano in PSC Trio, a jazz band that performs in Ottawa, Montreal, and other parts of Canada.
ML-Jam, which seeks to bring out the human characteristics of musical improvisation, comes out of Google Brain’s Magenta project for making music with machine learning. ML-Jam deliberately limits itself to premade models that work right out of the box, using Magenta’s DrumsRNN and MelodyRNN.
“Essentially, what I wanted to do is keep my rhythm, since that’s so reflective of the way I play, but replace my notes with the notes that the model is producing. So it’s this hybrid improvisation,” he said. “What I found in my experience is this is often rhythmically not something I would have come up with on my own, because it’s not a rhythm that would have come organically to me. But it often ends up being something kind of interesting to me.”
Castro presented ML-Jam and its open source Python code at the International Conference on Computational Creativity (ICCC) held last week in Charlotte, North Carolina.
The system begins with what Castro calls a deterministic drum groove. Someone plays a bassline, adds other instruments, and then sends the groove to DrumsRNN to generate a unique drum pattern. A musician then improvises over it, staying in control of the rhythm that shapes each musical phrase while the notes come from a melody created by MelodyRNN.
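In rough terms, the hybrid improvisation Castro describes can be sketched as keeping the timing of the human phrase and substituting the pitches with whatever the melody model produces. The data shapes and the hybridize helper below are illustrative assumptions for this article, not ML-Jam’s actual code.

```python
# Hypothetical sketch of the "hybrid improvisation" idea: keep the human
# player's rhythm (note onsets), replace the pitches with model output.
from dataclasses import dataclass
from itertools import cycle
from typing import List


@dataclass
class Note:
    onset: float  # onset time in beats
    pitch: int    # MIDI pitch number


def hybridize(human_phrase: List[Note], model_pitches: List[int]) -> List[Note]:
    """Keep the human player's timing; take the notes from the model.

    Assumes model_pitches is non-empty; pitches are reused in a cycle if the
    phrase contains more notes than the model produced.
    """
    pitches = cycle(model_pitches)
    return [Note(onset=note.onset, pitch=next(pitches)) for note in human_phrase]


# Example: a syncopated phrase played by the human ...
played = [Note(0.0, 60), Note(0.75, 62), Note(1.5, 64), Note(2.25, 65)]
# ... and pitches sampled from a melody model (e.g. MelodyRNN-style output).
generated = [67, 70, 72]

print(hybridize(played, generated))
```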
Python multithreading lets ML-Jam run inference in a separate thread, so new material can be generated in the background and then played live during a performance. Because generation can take an unpredictable amount of time, musicians have to work onstage with a sound they haven’t yet heard.
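A minimal sketch of that pattern, assuming a hypothetical generate_melody() stand-in for model inference: generation runs in a background thread, and the live loop swaps in the new material whenever it arrives.

```python
# Sketch: run slow, unpredictable model inference off the main loop so the
# live performance never blocks. generate_melody() is a placeholder, not
# ML-Jam's real inference call.
import queue
import threading
import time

result_queue: "queue.Queue[list[int]]" = queue.Queue()


def generate_melody(primer: list[int]) -> list[int]:
    # Simulate inference latency; a real system would call the model here.
    time.sleep(1.5)
    return [pitch + 2 for pitch in primer]


def generate_in_background(primer: list[int]) -> None:
    result_queue.put(generate_melody(primer))


threading.Thread(target=generate_in_background, args=([60, 62, 64],), daemon=True).start()

current_melody = [60, 62, 64]  # keep playing something while we wait
for beat in range(8):
    try:
        current_melody = result_queue.get_nowait()  # swap in new material when ready
    except queue.Empty:
        pass
    print(f"beat {beat}: playing {current_melody}")
    time.sleep(0.25)
```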
Finding the groove
Castro tried playing with ML-Jam alongside his jazz trio but said the chemistry between them was lacking. Instead, he plans to incorporate AI into his own music.
His next step is to use ML-Jam or derivative systems to fuel unique content for a live show.
“One thing I’ve started working on is essentially a solo show, where it’s just me and … improvisations are built around this technology. So then it’s a lot more organic, and what’s interesting about that is that it forces me to approach composition in a very different way than what I would do normally,” Castro said.
“I have to think of whether it will work with the type of system I’m using. Like if it’s this drumming thing, it’s using a loop, so I have to have something that kind of works well with the loop, isn’t going to be too repetitive, isn’t going to be boring, but that still fits well within this idea … And so whenever I’m done with it, whatever comes out of it will 100% be very different than anything I would have come up with had I not imposed those constraints on myself.”
Other recent standout AI music models include Magenta’s Piano Genie. The Flaming Lips used a version of Piano Genie, named Fruit Genie, in an onstage performance at I/O last month.
Castro’s tandem performance with AI may incorporate other novel music models — such as Magenta’s Music Transformer, which generates piano melodies, and OpenAI’s MuseNet — to spark more improvised composition. In March, Google created a Music Transformer-powered tool that begins with keys chosen by a person and then generates music that sounds like the work of Johann Sebastian Bach.
“The whole point of it is to explore the space of human-machine collaboration, so the compositions would be written for this collaboration rather than trying to take a system I built externally and putting it into a song that I composed for a human-only trio,” Castro said.
“What I want to do is essentially have each song explore a different type of machine learning model, and they don’t necessarily all have to be music-generating models. The idea is to see how you can integrate different machine learning technologies into composition or improvisation in a way that produces music that likely wouldn’t have come out that way had you not been trying to incorporate these machine learning technologies.”
Castro distinguishes his model from some of the others used by creatives because his must receive human input in order to operate.
For Castro, human purpose — shaped by a person’s history and humanity — is what defines art.
“For me, the question of ‘Is it art or not?’ really boils down to ‘Where is the purpose coming from?’ And I don’t see any model having any purpose right now,” he said. “It’s the humans that put that in there.”