Mozart Ex Machina

In this project, I prototype some new ideas for machine learning algorithms. Currently, artificial neural networks (ANNs) have serious problems of reliability, reproducibility, and stability, which collectively erode trust in their results. The hope with my new prototypes is to add transparency and robustness to ANNs, thereby restoring confidence in their widespread implementation.

We start small, making sure these ideas can work on the simplest of ANNs. I've chosen to focus on the classification and automatic generation of the music of Mozart. I'll be saving my notes and observations in this blog, in the hope that others can make use of them.

Tools and Resources


The first thing I need for this project is a dataset. Since I've chosen Mozart's music as the focus, this means I'll need a selection of his pieces. All of these are in the public domain, and so I am free to re-use them however I like. I've taken the pieces from a website where they are stored as MIDI files (more on this file format later). The website's content is released under a cc-by-sa Germany License, which requires that I acknowledge the original copyright owner (Bernd Krueger) and make any content built on this data available under the same license. This does not apply to the pieces themselves, as they are in the public domain.

The dataset contains 21 movements, three for each of seven pieces. These pieces are numbered 311, 330, 331, 332, 333, 545, and 570. The file names follow the pattern of mz_311_1.mid, indicating composer, piece, and movement.
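The naming scheme can be unpacked with a few lines of Python. This is a minimal sketch: the regular expression is inferred from the single example name mz_311_1.mid, and the names MovementId and parse_name are my own placeholders, not part of any library.

```python
import re
from typing import NamedTuple


class MovementId(NamedTuple):
    composer: str  # short composer tag, e.g. "mz"
    piece: int     # piece number, e.g. 311
    movement: int  # movement number within the piece, e.g. 1


# Pattern assumed from the example name mz_311_1.mid
_NAME_RE = re.compile(r"^([a-z]+)_(\d+)_(\d+)\.mid$")


def parse_name(filename: str) -> MovementId:
    """Split a dataset file name into composer, piece, and movement."""
    match = _NAME_RE.match(filename)
    if match is None:
        raise ValueError(f"unexpected file name: {filename!r}")
    composer, piece, movement = match.groups()
    return MovementId(composer, int(piece), int(movement))


print(parse_name("mz_311_1.mid"))
```

Keeping the piece and movement as separate fields should make it easy to group the three movements of a piece later on.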

Machine learning package

Next, I'll need software to program my ANNs. I've gone with scikit-learn, a Python package designed for simple machine learning. It's open-source, which is always good in research because it makes the programs accessible. It also means the code won't become useless if a company goes bankrupt. The package uses NumPy, SciPy, and matplotlib, so I need to have those installed as well, along with a Python release.

Music analysis toolkit

While scikit-learn will give me the machine learning tools I need, I'm also going to need software to analyze the music itself. For that, I'm using music21, an open-source Python toolkit from MIT. This manipulates MIDI files, letting me dissect the pieces into their features.

To get the essential components of a MIDI file, one can run the following code snippet.

      import music21 as m21
      midi_data = m21.converter.parse('mz_311_1.mid')  # assumed: any MIDI file from the dataset

This extracts information from the MIDI file and puts it into the format for music21. From there, we can pick out specific measures: midi_data.measure(144).show('text'). This returns all information found in measure 144 of this piece, including instruments played, clef, key, tempo, meter, and list of notes.

The format for music21 stores information as streams. These streams contain all other objects relevant to this project. The data extracted from the MIDI files in the dataset is stored as scores. Each of these scores is divided into separate parts, usually Piano Left and Piano Right. Each part in turn consists of the measures that make up the piece, along with information that holds throughout the piece, such as the clef, key, meter, and instrument. Note that some of this information may change for some measures. Finally, within each measure is a list of notes, rests, and tempos.

While individual measures can be called using midi_data.measure(#), individual parts have to be called as list items, as in midi_data.parts[0]. The following code snippet returns relevant information on the piece and follows from the previous code snippet.

      midi_data.parts[0].clef                                                        # returns the clef
      midi_data.parts[0].getInstrument()                                             # returns the instrument
      midi_data.parts[0].measure(1).show('text')                                     # list of chords, notes and rests, including offset from start of measure
      midi_data.parts[0].measure(1).getElementsByClass('KeySignature').show('text')  # key
      midi_data.parts[0].measure(1).getElementsByClass('TimeSignature').show('text') # meter
      midi_data.parts[0].measure(1).getElementsByClass('MetronomeMark').show('text') # tempo(s)

Feature space

To feed data into an ANN, we need to divide each datum into features. For this project, each datum is a selection of music. Since the music is stored as a sequence of chords, each with a set of notes, an offset, a length of time played, etc., the feature space should be precisely these descriptors. That is, for each chord in the selection, there is a feature for the set of notes, the offset, and the length of the chord, as well as the instrument played and other important details.
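One possible way to turn a chord into a fixed-length list of numbers is sketched below. Everything here is an assumption made for illustration: the Chord class is a stand-in of my own (not a music21 object), notes are written as MIDI note numbers, and chords are padded or truncated to MAX_NOTES slots so that every chord yields the same number of features.

```python
from dataclasses import dataclass
from typing import List, Tuple

MAX_NOTES = 4  # assumed cap on the number of notes kept per chord
PAD = -1       # assumed filler for unused note slots (not a valid MIDI note)


@dataclass
class Chord:
    pitches: Tuple[int, ...]  # MIDI note numbers, e.g. 60 is middle C
    offset: float             # position from the start of the measure, in quarter notes
    duration: float           # length of time played, in quarter notes


def chord_features(chord: Chord) -> List[float]:
    """Flatten one chord into MAX_NOTES + 2 numeric features."""
    # Keep the lowest MAX_NOTES pitches; higher notes are dropped in this sketch
    notes = sorted(chord.pitches)[:MAX_NOTES]
    padded = notes + [PAD] * (MAX_NOTES - len(notes))
    return [float(p) for p in padded] + [chord.offset, chord.duration]


c_major = Chord(pitches=(60, 64, 67), offset=0.0, duration=1.0)
print(chord_features(c_major))  # [60.0, 64.0, 67.0, -1.0, 0.0, 1.0]
```

Padding keeps the feature vector the same length for every chord, at the cost of discarding notes from chords larger than MAX_NOTES.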

The size of each datum can be as small as a single chord or as long as the entire song. However, it must be constant. Let the number of chords found in each datum be cpd (Chords Per Datum). Each song must then be divided into data by taking cpd chords at a time.
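Dividing a song into data of cpd chords each could look like the sketch below. The function name split_into_data is a placeholder of my own, and dropping the leftover chords at the end of a song is an assumption; padding the last datum would be another option.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def split_into_data(chords: Sequence[T], cpd: int) -> List[List[T]]:
    """Cut a sequence of chords into consecutive data of exactly cpd chords."""
    if cpd < 1:
        raise ValueError("cpd must be at least 1")
    # Assumption: trailing chords that do not fill a whole datum are discarded
    usable = len(chords) - len(chords) % cpd
    return [list(chords[i:i + cpd]) for i in range(0, usable, cpd)]


# Seven placeholder chords split three at a time; the seventh is dropped
print(split_into_data(["c1", "c2", "c3", "c4", "c5", "c6", "c7"], cpd=3))
```

With cpd set to 1 every chord becomes its own datum, and with cpd equal to the number of chords in a song the whole song becomes a single datum.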

The songs as found in the dataset are divided into measures. This suggests we should use measures as the length of each sample. Note, however, that the feature space will become complicated, as the contents of each measure vary significantly. Can each measure be fed into an ANN as is? How can we annotate such a high dimensional sample? What to do with strictly qualitative features?
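For the strictly qualitative features, one common option is one-hot encoding, where each category gets its own 0/1 slot. The sketch below is only one candidate answer to the question above, and the category list CLEFS is an assumption of mine rather than something read from the dataset.

```python
from typing import List, Sequence


def one_hot(value: str, categories: Sequence[str]) -> List[int]:
    """Encode a qualitative value as a list with a single 1."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return [1 if value == category else 0 for category in categories]


CLEFS = ["treble", "bass"]  # assumed set of clefs for piano music

print(one_hot("bass", CLEFS))  # [0, 1]
```

This avoids imposing an artificial ordering on the categories, though it adds one feature per category.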

Classifier: Name That Tune

Let's begin with an easy task for scikit-learn to handle: classifying a measure as belonging to a specific song. This ANN will be called Name That Tune, or NTT for short.

NTT takes a single datum and returns the song title. I have 21 songs composed by Mozart as a dataset. The available classes should then be these 21 songs. Note that each 'song' is in fact a movement in a larger piece. Ideally, this information can also be encoded into NTT.
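A simple way to set up the 21 classes while keeping the piece and movement recoverable is sketched below. The ordering of the labels and the helper names are assumptions of mine; any consistent ordering would do.

```python
from typing import Tuple

PIECES = [311, 330, 331, 332, 333, 545, 570]
MOVEMENTS = [1, 2, 3]

# One class per (piece, movement) pair, ordered by piece and then movement
CLASSES = [(piece, movement) for piece in PIECES for movement in MOVEMENTS]
LABEL_OF = {cls: label for label, cls in enumerate(CLASSES)}


def decode(label: int) -> Tuple[int, int]:
    """Recover (piece, movement) from an integer class label."""
    return CLASSES[label]


def same_piece(label_a: int, label_b: int) -> bool:
    """True if two labels are movements of the same piece."""
    # With this ordering, each piece owns three consecutive labels
    return label_a // len(MOVEMENTS) == label_b // len(MOVEMENTS)


print(len(CLASSES))          # 21
print(LABEL_OF[(331, 2)])    # 7
print(decode(20))            # (570, 3)
```

Because each piece owns a block of three consecutive labels, a wrong prediction can still be checked for whether it at least landed in the right piece.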