Computer evolves to generate baroque music!

[Cary]: Hey Computer!
[Computer]: Hi Cary!
[Cary]: Make me music!
[Computer]: You are in luck! I have a wide range of musical masterpieces available, ranging from Rick Astley’s Never Gonna Give You Up to Darude – Sandstorm.
[Cary]: No, no, no, that won’t do it! I’m a special snowflake, so I want music that only I get to hear.
[Computer]: You want me to create new music from scratch? But I don’t even know what music is!
[Cary]: Yeah. Like, you’re a computer, you’re smart! You should know how to do that.
[Computer]: Well, I’ll give it a try.
[sick beatz]
[Computer]: Did I do a good job?
[Cary]: No! Here, maybe I should give you some examples to learn from, like pieces by Bach.
[Computer]: Ooh! Let me look at this! Now I know exactly what to do!
[more sick beatz]
[Cary]: *sigh* Clearly computer school is not working.
[Computer]: Computers don’t have school.
[Cary]: Because you didn’t even learn the concept of learning! Maybe you should just take a look at Andrej Karpathy’s article.
[Computer]: Woah! Recurrent neural networks! I’ll be sure to remember this unless it passes through a “forget gate”.
[transition music]
[even more sick beatz]
[Cary]: OK, enough of that bizarre skit thing. Computery, you’re free to go!
[Computery]: Yay! Time to become SkyNet!

As you can probably tell by now, the goal of this project is to get a program to replicate pieces of Bach as closely as possible. To do this, I first downloaded as many MIDI files of Bach pieces as I could. I restricted it to keyboard-only pieces to keep it simple. This website is so old, some of the links are older than me! And any file over 30 KB is labeled to warn you of its large size.

The next step was to convert these MIDI files into text files using MIDI CSV. Now they’re slightly easier to deal with, but not by much. We still have a lot of wasted characters just saying “Note on”, “Note off”, “Note on”, “Note off”.
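To make the "wasted characters" concrete, here is a minimal Python sketch (not the Processing script used in the video) that pulls only the note rows out of midicsv-style text. It assumes the usual midicsv row layout of track, time, record type, channel, note, velocity; the function name `note_events` and the sample data are made up for illustration.

```python
import csv
import io

def note_events(csv_text):
    """Return (time, pitch, is_on) for every note row in midicsv-style text."""
    events = []
    for row in csv.reader(io.StringIO(csv_text), skipinitialspace=True):
        # Note rows have six fields: track, time, type, channel, note, velocity.
        if len(row) < 6:
            continue
        kind = row[2].strip()
        if kind not in ("Note_on_c", "Note_off_c"):
            continue
        time, pitch, velocity = int(row[1]), int(row[4]), int(row[5])
        # In MIDI, a "note on" with velocity 0 is treated as a "note off".
        is_on = kind == "Note_on_c" and velocity > 0
        events.append((time, pitch, is_on))
    return events

# Hypothetical sample: one middle C held for 480 ticks.
SAMPLE = """0, 0, Header, 1, 1, 480
1, 0, Start_track
1, 0, Note_on_c, 0, 60, 90
1, 480, Note_off_c, 0, 60, 0
1, 480, End_track
0, 0, End_of_file
"""

if __name__ == "__main__":
    for event in note_events(SAMPLE):
        print(event)
```

Everything except the time, the pitch, and the on/off flag is dropped, which is where most of the size reduction would come from.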
In a custom Processing script, I stripped away these unnecessary characters and converted the 88 pitches of the piano into 88 different ASCII characters. To allow multiple notes to be played at the same time, I inserted spaces to show where each unit of time ticks by. This shrinks the file to about a sixth of its size, which makes it much easier to feed into the next step! Which is… Andrej Karpathy’s magical LSTM!

Seriously though, his blog post about this LSTM learning and replicating complex patterns of text is pretty much famous now. It seems like every machine-learning article links to it somewhere, even Google! But anyway, since this LSTM takes in text as input, we can just feed it our reformatted Bach text, and bam! It’ll start training! After a few hours or days, we can stop the training and let the LSTM output its own text. Hopefully, this text imitates our original training data as closely as possible! Actually, we don’t want the output text to exactly replicate the training data. We just hope that it takes on the same patterns while still being technically original.

Before we go on, I want to point out that, just by coincidence, I found the words ‘Yaeh Yaeh Yaeh Yaeh’. They look like lyrics, but no. They represent the pitches 89, 97, 101, and 104. So, a major chord in first inversion, actually!

Now what? Well, we can just go through all the conversion processes in reverse now. We go from the reformatted text back into the unnecessarily long text using a custom Processing script, from the unnecessarily long text back into MIDI using CSV MIDI, and then finally from MIDI back into WAV or something :/ by just using Windows Media Player and recording the output.

[Cary]: So that’s what you heard Computery-
[Computery]: That’s me!
[Cary]: -output. But, if I recall correctly, you’ve only heard 7 minutes of training! That’s not very long! Let’s see what happens if we train it for longer!
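The compact text format described above, with one character per pitch and a space closing each unit of time, can be sketched roughly as follows. This is an illustration in Python, not the original Processing script. It assumes that a pitch is written as the ASCII character with the same code (which matches ‘Y’ standing for 89 in the ‘Yaeh’ example) and that pitches are numbered 33 to 120, so all 88 values are printable and none collides with the space. The names `encode` and `decode` are hypothetical.

```python
# Assumed pitch numbering: 33..120 inclusive (88 values), chosen so that
# chr(pitch) is always a printable, non-space ASCII character.
LOW, HIGH = 33, 120

def encode(ticks):
    """Encode a list of ticks (each a collection of pitches) as text.

    Every tick becomes one "word" of pitch characters followed by a space,
    so an empty tick (silence) is just a lone space.
    """
    parts = []
    for pitches in ticks:
        for p in pitches:
            if not LOW <= p <= HIGH:
                raise ValueError(f"pitch {p} outside {LOW}..{HIGH}")
        parts.append("".join(chr(p) for p in sorted(pitches)) + " ")
    return "".join(parts)

def decode(text):
    """Invert encode(): split on spaces and map characters back to pitches."""
    words = text.split(" ")[:-1]  # drop the empty piece after the final space
    return [[ord(c) for c in word] for word in words]

if __name__ == "__main__":
    chord = [89, 97, 101, 104]
    print(repr(encode([chord, chord])))  # the same chord held for two ticks
    print(decode("Yaeh "))
```

Under that assumption, ‘Yaeh’ decodes to 89, 97, 101, 104. The gaps between neighbours are 8, 4, and 3 semitones: 97, 101, 104 is a major triad, and 89 is its third an octave lower, which fits the first-inversion reading.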
Sorry for cutting that masterpiece off short, but I’m gonna give some final thoughts.

So, first of all, about cutting things off short: I tried to find a place to end it that felt like it had closure, but one thing about these neural networks is that the output never ends! Like, the stream of notes just keeps going and going and going. So wherever I cut it off, it’s gonna feel incomplete.

The second thing is, what is the ultimate goal of this project? If we want to replicate Bach and Mozart pieces as closely as possible, wouldn’t the perfect neural network just clone those pieces and spit them out verbatim? This process of just memorizing and regurgitating the training data is called overfitting, and we don’t want that, because we’d like to listen to original-sounding music! To anyone who accuses this LSTM of overfitting: just listen to any more than 5 seconds of it and you’ll realize it sounds too wonky to actually appear in Bach or Mozart. So I think that is enough proof that I’m not overfitting, though I could do actual analysis of that. But I feel like that takes a lot of work.

The third thought on my mind is that I could imagine a lot of people downplaying this LSTM’s music, saying, “Oh, it’s nowhere near as good as a human! What’s the point of all this? It sounds like garbage!” And I could understand that. It’s clearly nowhere near as good as what a human could do. But one advantage that computers have over humans (well, two, I guess) is speed and effort. Because if you were to ask a human to create 10 hours of original music for you, that might take them their entire lifetime. But if I want to use this LSTM, I can just let it run overnight, and while I’m sleeping, it’ll happen. So on the quality vs. quantity spectrum, you can see we’re really, really excelling on the quantity side.

By the way, if any of you YouTubers want to use this music in the background of your videos, you can. But, like, it’s not very good music, so why would you want to?
But it’s, like, copyright-free and all that, so you don’t need to worry. Just credit uuuuuuuuuuuuuuuuuuuuuuuuuum Computery, okay?

But imagine a human composer who wants to speed up the rate at which they can produce music: it will take a lot of training and practice just to make music maybe 10% faster. But say I want to make music on my LSTM 10x faster. Well, I just need to buy 10 computers! Which sounds like a lot, but really it’s just a small expense, and now everything runs 10x faster and you get 10x more music per second. That’s pretty crazy.

But speaking of speeding things up, as I said earlier, I need to get a GPU. Well, I have a GPU in my computer, but it’s, like, AMD, which is kind of unusable for TensorFlow. But yeah, a usable one will speed things up by orders of magnitude. In addition, I only trained the biggest, most intensive model of this video for a day. So if I trained it for longer, like a week, with a more powerful GPU, or maybe multiple, then it’s very possible for me to train essentially 100x longer. Which is kind of crazy! In addition, I’ll try to find more pieces by other composers to reduce the chance of overfitting. But I’m just really excited! Because if this is what we get after just one day of training on a pretty crappy processor, just imagine what else is possible if I actually try harder.

Oh, by the way, I forgot to mention: I’m clearly not the first person to try something like this, nor are my results the best so far. Not by a long shot. I’ve heard a lot of better AI-generated music. But this was just me trying something out, and now I know a lot more about what’s happening behind the scenes. Plus, I can now tweak it however I want!

Anyway, thanks to all of you for watching this far into the video. Actually, I should give all my thanks to Andrej Karpathy, because I didn’t really program any of the internal mechanisms; he was the one who did it.
By the way, Andrej, my fastest official competition Rubik’s Cube solve is faster than yours. So, like, no biggie, but I’m beating you there. BYE
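The "actual analysis" of overfitting mentioned in the final thoughts was not done in the video. One simple way it could be approached is to measure how much of the generated text appears verbatim in the training text. The Python sketch below does that with fixed-length character windows; the function name `ngram_overlap` and the window length `n` are illustrative choices, not anything from the original project.

```python
def ngram_overlap(generated, training, n=20):
    """Fraction of length-n windows of `generated` found verbatim in `training`.

    Values near 1.0 suggest the model is copying its training data;
    values near 0.0 suggest the output is mostly new at that window length.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if len(generated) < n:
        return 0.0
    # Every length-n substring of the training text, for fast lookup.
    seen = {training[i:i + n] for i in range(len(training) - n + 1)}
    windows = [generated[i:i + n] for i in range(len(generated) - n + 1)]
    return sum(w in seen for w in windows) / len(windows)

if __name__ == "__main__":
    training_text = "Yaeh Yaeh Yaeh Yaeh ace ace bdf bdf "
    copied = "Yaeh Yaeh Yaeh "
    fresh = "zzz qqq zzz qqq "
    print(ngram_overlap(copied, training_text, n=8))
    print(ngram_overlap(fresh, training_text, n=8))
```

Short windows will always overlap a lot, since single chords recur everywhere, so the interesting question is how the fraction falls off as `n` grows.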

  1. A good example of why AI will never become sentient. Regardless of how much we want it to, it's important to remember the "artificial" part of artificial intelligence. AI emulates the basic aspects of intelligence that we intuitively understand, but there are aspects of real intelligence that we don't fully understand, and no AI can model them, regardless of how long it trains itself.

  2. I just wonder what would happen if he "trained" it on some VPS and then tried to create new music after some days or even a month.

  3. As a composer, that hurt my ears. It was just noise with no consideration for musical syntax. Good try, though.

  4. 30 minutes = Scott Joplin
    45 minutes = Keith Emerson
    52 minutes = Shostakovich / Chick Corea
    60 minutes = Gospel church pianists
    90 minutes = The Bad Plus
    150 minutes = Stephen Foster
    240 minutes = Frank Zappa
    360 minutes = Olivier Messiaen

  5. What about using harmonically simpler music like pop music? I guess this would require supporting different instruments though, although there are some pop pieces that are mainly piano and have no percussion. Also transposing them all into the same key might help since it wouldn't try to play notes from multiple keys at the same time.

  6. Wouldn't it give pretty good results if you somehow allowed the computer to learn that some parts of songs are perceived by humans as happy, sad, dramatic, etc., and then you asked the computer to create a certain type?

  7. You should input as many types of music as you can think of (rock, opera, pop, jazz, classical, gospel) all at the same time, and see what kind of weird chord progressions it makes up.

  8. Are you familiar with the work of David Cope? Not neural networks, but he gets impressive results: Experiments in Musical Intelligence.

  9. This is a bit scary. Imagine original music produced like this in the future: if it gets to the point of sounding good (which I don't doubt it will), it will make human creativity pointless.

  10. There are things that an AI cannot do, especially applying a creative sense to its own creations, because the AI needs more complexity: having a body, suffering pain, some way of feeling things, and so on.

  11. Teach your computer to play
    Vince Clarke. He wrote Just Can't Get Enough by Depeche Mode, then left
    the band after the first album. He didn't like the direction Depeche Mode was going.
    Then Vince Clarke was in Yazoo (UK) / Yaz (US; there was a copyright issue with the name), then The Assembly with Feargal Sharkey, and then teamed up with Paul Quinn before starting Erasure. Erasure is famous for classics like Oh L'amour, A Little Respect, When I Needed You, Always (think Robot Unicorn Attack!), Take Me Back (love) Agnesfashionart's taste in art.),
    Blue Savannah and Chorus. As well as Reason and Be the One.
    They're still going strong!
    Train it for two days.
    Now you're playing synthesizer ear candy. WOW! Yum!

  12. Around the 6 hour mark I'd say it was at least good enough that if I heard that out of context, I'd probably just think it was being played by a really amateur musician

  13. Dude, did you even add constraints such as a fixed scale or chords during the training for a single MIDI piece?? … It will be a mess otherwise.

  14. One question: how do you measure loss rate per note? I'd assume that's one of the success criteria, but in general, what is loss rate?

  15. That was wonderful! I really enjoyed him discovering the DRAMATIC notes, and then the legato chords…. so touching! 💕🙈

  16. If quality of music is not a thing, I can assure you that within 10 hours, I can also make 10 hours of music for you… but it's total crap.

  17. So, after you created life without living your own, what are you going to do with that? And why don't you learn to play something? Make something of your own.

  18. Sense key.. sense rhythm.. then use those as controllers.. if it overtrains, then use statistics to create music from the principal components.. and genetically improve them..
    sounds like a lot of work..

  19. you should take the longer approach and teach it to learn music similar to humans, starting with the basics from scratch and building up into a progression… Not by processing content from the most talented musicians from the beginning.

  20. 0 Minutes: "PRESS ALL THE KEYS!" 4 hours: "i have put on a powdered toupee."
    But seriously, it's impressive and I'd like to replicate it with electronica music.

  21. I feel that you could revisit this with more powerful hardware and a more diverse training set, and get continuously listenable results

