> Q5: In the case of Markov models, etc., I understand that the generation may not be very conformant to the learnt style unless a high-order Markov model is used, but then with the risk of recopying entire sequences from the corpus, and thus plagiarism. But in the case of an RNN-based architecture [9], what is the rationale?

A5: As mentioned before, the RNN does similar work to the LSTM in our work. But without including a discriminator, it only learns transition probabilities between adjacent notes; it does not guarantee that generated sequences look like real ones.
Come on, guys, that's just not true. You do not need an adversarial loss to get good quality melodies. Look at Sturm's char-RNN on ABC notation, or OpenAI's MuseNet, or a bunch of Project Magenta work, or my own GPT-2 ABC music (MIDI in progress): https://www.gwern.net/GPT-2-music Or for that matter, any generative model trained with a non-adversarial loss (anything using GPT-2 for example).
In fact, generally, everyone avoids GANs for sequence generation because they work so badly compared to regular likelihood training... (Just at a skim, their 'baseline' is pretty suspicious. I'd expect an ablation for the GAN, not comparing their 400-unit LSTM to... a 100-unit LSTM https://www.aclweb.org/anthology/N19-4015.pdf ? Really?)
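To make the "plain likelihood training, no discriminator" point concrete, here is a toy sketch (not from the paper, and with a made-up corpus): the simplest possible stand-in for a char-RNN is a character-level bigram model fit by maximum likelihood on ABC-like text and then sampled. It only shows the training/sampling shape, not quality:

```python
from collections import defaultdict
import random

# Hypothetical ABC-like snippet; real models train on whole tunebooks.
corpus = "CDEFGABc CEGc DFAd EGBe"

# "Training" is just counting next-character frequencies: the
# maximum-likelihood estimate for a character-level bigram model.
counts = defaultdict(lambda: defaultdict(int))
for a, b in zip(corpus, corpus[1:]):
    counts[a][b] += 1

def sample(start, length, rng):
    """Generate by repeatedly drawing the next character from the model."""
    out = [start]
    for _ in range(length - 1):
        nxt = counts.get(out[-1])
        if not nxt:          # dead end: character never seen with a successor
            break
        chars, freqs = zip(*nxt.items())
        out.append(rng.choices(chars, weights=freqs)[0])
    return "".join(out)

melody = sample("C", 10, random.Random(0))
print(melody)
```

An actual char-RNN replaces the count table with a recurrent network trained by cross-entropy, but the loss is the same likelihood objective; nothing adversarial is involved.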
feynmanliang's BachBot and hexahedria's biaxial-rnn also have very good melody generation, and these deal with polyphonic music which is quite a bit harder than simple monody/monophonic music.
Oh no, they're ruining popular music as we know it. Anybody can just push some buttons and generate the next hit song. All you need is artificial lyrics, an artificial melody, and artificial vocals (Yamaha Vocaloid), on top of a beat bought on the net.
I know you're being sarcastic but I always thought the concern over the provenance of art is strange. If a computer can generate good music, why does it matter that a computer created it if it's good music? And if computers are just spitting out terrible music, who cares?
Same goes for computer generated visual art, painting created by elephants, or whatever else you can think of that challenges the position of humans as the sole creators of art.
Depends if you consider it pure raw material (I think it is) that you can use to convey something of your own, or if you can relate to it to some degree (and do the same, but to some further extent).
At a rather utilitarian level, one wouldn't consider a spoon exactly the same way whether it was:
1) carved out of wood by someone you can relate to (by history, reasons, knowledge); as a unique piece, or part of a series;
2) or built in a factory, by workers; and available in the same fashion in X instances;
3) or built in a factory by robots; and available as an exact copy in X instances;
4) or a pure unique artefact that got the practical shape of a spoon.
And that would be for a spoon only.
I am of the school of thought that one doesn't make art without the intent of communicating/transmitting (it, or through it) with other people, however remote or intimate that may be. If only because one cannot live alone.
In this line, however pleasing something can be, if there is no transmission/communication/need/desire behind its creation, if there is no social reason behind it, it has no resonance, no affect for me that makes me actually relate to it (and to the person that thought of it, fell upon it, carved it in their words, sounds, silences, colors, shapes, steps, etc.).
To me, whatever the artist's intent was, when anything is done, it is done by someone - this someone cared, if only enough to make it, and in that way attached something to their work.
That is something I can try to relate to (to the product itself, and to the intent perhaps).
When something is produced by a system, there may still be the intent of the person(s) that designed the system. But it may also be totally independent of the designers' scope. That is interesting too, as pure rough material.
But it isn't remotely as profound, interesting, human as something that is done by someone real.
Now consider this:
Something generated by a computer, in this case music, has no intent.
But what happens if I curate it, judge it, only publish the best ones. Through that I add my intent.
Where is the difference between curating a number of computer generated songs into an album, and curating a number of computer generated sounds into a piece of music?
Both require some sort of talent, both add intent to something that a computer generated.
Of course. You add your intent of curation/collection. The part that's original to you is the curation itself. Not the songs being collected themselves.
It's the same with instruments that generate a sound (whether mechanical or electronic ones). The sound source might be more or less generated. But what you do with it, the music you compose and play with it, that is original to your self. But the source sound material is just that: raw material.
How much of that profoundness is within our own heads just because we know someone made it? We might imagine their intent, working back from the final representation to imagine what they might have felt creating it.
So in that regard, as long as we're honest with the viewer, I imagine your point holds; it won't ever be the same.
Yet I have to imagine that how most art is consumed these days already does away with any notion of "intent of the artist." Did the artist intend for their work to be displayed in large galleries, alongside countless other works, hundreds of years later? Who knows? We're constantly re-imagining these works by placing them in new contexts and priming viewers in different ways. However it is different when the artist is still with us.
Would cubist paintings have the same depth if the viewer was led to believe that a computer had generated them?
I don't think art is experienced in one way only, and context is important (as almost always).
In the case of cubism—it was a response to the mode of the time. Form was being questioned. It was a philosophical effort expressed and communicated in its own way. That's one experience. Another is purely aesthetic.
Much the same with music. A computer wouldn't likely [at this stage, anyway] have developed the blues. It's not purely algorithmic, though the resulting form can be used to develop one. A lot of that kind of music only developed as a result of empathy and the ability to relate to the poetry and to the mood and feelings imbued by it. Likewise, that's one experience. Another is purely aesthetic. Yet another begins with the aesthetic and might explore the history to develop a more profound experience through the attempt at understanding.
I'm quite brazenly biased here. I don't resent the efforts, nor would I disparage the results of computer generated music. But I don't think it compares. By that I don't mean one is objectively better or worse than the other; I mean quite strictly that I don't think they compare.
It wasn't experience that resulted in the emergence—just mild variations on a "consciously" used algorithm. (I know I'm oversimplifying—it's not meant to degrade the author's or other commenters' work in the field)
Fair enough, that's a good point. Art is constantly given new meaning, whether initially by the artist, or those who choose to display it, and finally by those who view it (and in some cases those who then try to ban or suppress it!). I do see the argument that a human element is required. Is computer generated noise "music" until someone selects and features it?
I'm not necessarily arguing for one view over the other, I'm just not yet entirely convinced that a computer can't generate art, without devolving into debating what is and isn't art altogether.
I think in that case, and also without getting too into the weeds, I'd put forward that a precursor to art would be consciousness. Until then I figure it's pattern matching, or instruction-following. I guess I also consider origin important... but we'll get into the weeds if I do go on (as I was about to start questioning whether a child learning drawing techniques from a 'how-to' book is creating art in that moment).
Yes - the real medium of art is culture itself. "Content" is just a means to that end.
AI is culturally blind, tone deaf, and dumb. So although it's going to be capable of generating adequate content at some point, it's not going to be changing the foundations of the cultural currents around it in the way a groundbreaking work of art does. (And no - producing "content" mechanically is culturally trivial. It's a change of medium, not a change of message.)
That won't happen until it understands and can influence cultural processes as a personality with agency and something original to say, and isn't just spinning its wheels on MIDI notes and virtual paint splotches.
Supposedly because it devalues manually created artwork, or because people are afraid of change.
> And if computers are just spitting out terrible music, who cares?
Well, I loved the 90s goa trance, but it got snowed under by the 00s psytrance. If that 00s psytrance was never made available, perhaps there would've been more of the style I preferred.
My point being that:
1) What is popular matters
A) in ways which might not be apparent at first glance.
Hello, the '80s called; your moaning about boring synth musicians' interests has been deleted.
We synth nerds have been having precisely this argument for decades now, and the answer, always and unequivocally, is: no matter how you made the music, it's the musician, not the tool. As soon as we all agree "computer music is boring", someone proves us wrong.
This. People seem to not realize that there is much more complexity to music besides just the melody and the chord progression used. Which sort of makes sense, because those are probably the only things that people not familiar with making music can easily grasp.
Did the music of the past 10 years do much in those aspects? No, because chord progressions and melodies have been explored to such depths over the centuries that it is rather difficult to do anything new there without sounding ultra-experimental and, most likely, not palatable to the average ear. But music has made giant leaps in terms of sound synthesis, timbre complexity, etc.
MIDI generation from lyrics is IMO a cool trick, but it doesn't even begin to scratch the surface of automating music-making. It's like saying that using a TypeScript transpiler with syntactic sugar on top gets us to the point of JavaScript code writing itself without any need for software engineers.
When something hard becomes easy with the aid of technology, our expectations and demands of quality tend to rise. Maybe this is a good thing?
What great innovations or paradigms in music have happened recently? For example, the way hip-hop combines beats and poetry revolutionized music as we know it.
Why buy the beat from the net? Just analyze the lyrics' meter and produce a beat based on that. I know, it's lame because that doesn't even need ML :P Or, as a drummer, I recommend just going with (1)bumm (2)tzz (3)bumm (4)tz :(goto 1), maybe occasionally inserting a (2+)bumm if you're feeling fancy.
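For fun, the drummer's recipe above is already a complete generative procedure. A minimal sketch, with made-up function and hit names, reading "bumm" as a kick and "tzz" as a snare with hi-hat on every beat:

```python
def basic_rock_bar(fancy=False):
    """One 4/4 bar: (1)kick (2)snare (3)kick (4)snare, hat throughout."""
    bar = [("kick", "hat"), ("snare", "hat"), ("kick", "hat"), ("snare", "hat")]
    if fancy:
        # the occasional extra (2+)bumm: a kick squeezed in after beat 2
        bar.insert(2, ("kick",))
    return bar

# :(goto 1) -- loop the bar, going fancy on every fourth repetition
song = [basic_rock_bar(fancy=(i % 4 == 3)) for i in range(8)]
print(song[0])
```

No ML required, as promised.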
Benchmark proposal: it's good if a significant number out of N=1000 prefer its output for The Raven (by Poe) over Alan Parsons' hand-crafted version.
Judging by the four provided melodies: 1) the notes have very little rhythmic variation. 2) the melodies don't seem to have any concept of metre or metric accent.
Music, like weaving, was an early adopter of algorithms to do work. For hundreds of years, music theorists have been codifying existing algorithms, or creating new ones, that produce good music (for particular definitions of good). Like other code instantiated in a network, this code is more tailored to specific prior states and is less available to analysis of its details than prior efforts.
I always wanted a "demo engine" whereby one feeds in the melody and chord name, and then a style(s). The AI would then use pattern matching to make a fuller score in the chosen style(s). The output could be midi and/or an audio file (such as .WAV). Bonus points for vocals if given lyrics. I could make Elvis diet parodies: "Ain't nothing but a round dog..." Band-in-a-Box software sort of does this, but lacks realism in my opinion.