A Tutorial Series for Software Developers, Data Scientists, and Data Center Managers
This is the 20th article in the AI Developer Journey Tutorial Series. It continues the description of data collection and preparation begun in the (Image Data Collection) and (Image Data Exploration) articles with a discussion of data collection and exploration for the music data. Be sure to check out previous articles in this series for help with team formation, project planning, dataset search, and other related topics.
The goal of this project is to:
- Create an application that takes in a set of images.
- Extract the emotional valence of the images.
- Output a piece of music that fits the emotion.
This project’s approach to creating emotion-modulated music is to use an algorithm (Emotion-Based Music Transformation) to alter a base melody according to a specific emotion, and then to harmonize and complete the melody using a deep learning model. To do this, the following music datasets are required:
- A training dataset for the melody completion algorithm (Bach chorales).
- A set of popular melodies that serve as a template for emotion modulation.
Music Data Collection and Exploration
Bach Chorales—Music21* Project1
The choice to use Bach chorales as the training dataset for the music generation was explained in detail in the (Music Dataset Search) article. In that article, the music21* project was briefly introduced. Here, the music21 corpus access will be discussed in more detail.
Music21 is a Python*-based toolkit for computer-aided musicology that includes a complete collection of Bach chorales as part of its core corpus. Thus, data collection was as simple as installing the music21 toolkit (instructions are available for macOS*, Windows*, and Linux*).
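For example, assuming a standard Python environment with pip available, the toolkit can typically be installed from the command line:

pip install music21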
Once installed, the set of Bach chorales may be accessed using the following code:
from music21 import corpus

for score in corpus.chorales.Iterator(numberingSystem='bwv', returnType='stream'):
    pass  # do stuff with scores here
Figure 1: Iterating through all Bach chorales.
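As a quick sanity check while iterating, each score's metadata can be inspected, for example to print the chorale's title and its number of parts. This is a minimal sketch under the assumption that a title is set in each score's metadata:

from music21 import corpus

# Print the title and number of parts for the first few chorales
for i, score in enumerate(corpus.chorales.Iterator(numberingSystem='bwv', returnType='stream')):
    print(score.metadata.title, '-', len(score.parts), 'parts')
    if i >= 4:  # stop after five scores
        break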
Alternatively, the following code returns a list of the filenames of all Bach chorales, which can then be processed with the parse function:
from music21 import corpus

chorales = corpus.getBachChorales()
score = corpus.parse(chorales[0])
# do stuff with score
Figure 2: Getting a list of all Bach chorales.
Exploring the Data
Once a dataset has been collected (or accessed in this case), the next step is to examine and explore the features of this data.
The following code will display a text representation of the music file:
>>> from music21 import corpus
>>> chorales = corpus.getBachChorales()
>>> score = corpus.parse(chorales[0])
>>> score.show('text')
{0.0} <music21.text.TextBox "BWV 1.6 W...">
{0.0} <music21.text.TextBox "Harmonized...">
{0.0} <music21.text.TextBox "PDF ©2004 ...">
{0.0} <music21.metadata.Metadata object at 0x117b78f60>
{0.0} <music21.stream.Part Horn 2>
    {0.0} <music21.instrument.Instrument P1: Horn 2: Instrument 7>
    {0.0} <music21.stream.Measure 0 offset=0.0>
        {0.0} <music21.layout.PageLayout>
        {0.0} <music21.clef.TrebleClef>
        {0.0} <music21.key.Key of F major>
        {0.0} <music21.meter.TimeSignature 4/4>
        {0.0} <music21.note.Note F>
    {1.0} <music21.stream.Measure 1 offset=1.0>
        {0.0} <music21.note.Note G>
        {0.5} <music21.note.Note C>
        {1.0} <music21.note.Note F>
        {1.5} <music21.note.Note F>
        {2.0} <music21.note.Note A>
        {2.5} <music21.note.Note F>
        {3.0} <music21.note.Note A>
        {3.5} <music21.note.Note C>
    {5.0} <music21.stream.Measure 2 offset=5.0>
        {0.0} <music21.note.Note F>
        {0.25} <music21.note.Note B->
        {0.5} <music21.note.Note A>
        {0.75} <music21.note.Note G>
        {1.0} <music21.note.Note F>
        {1.5} <music21.note.Note G>
        {2.0} <music21.note.Note A>
        {3.0} <music21.note.Note A>
    {9.0} <music21.stream.Measure 3 offset=9.0>
        {0.0} <music21.note.Note F>
        {0.5} <music21.note.Note G>
        ...
>>> print(score)
<music21.stream.Score 0x10bf4d828>
Figure 3: Text representation of a chorale.
Figure 3 shows a text representation of the chorale as a music21.stream.Score object. While it is interesting to see how music21 represents music in code, it is not very helpful for examining the important features of the data. Therefore, software that can visualize the scores is required.
As mentioned in Emotion Recognition from Images Model Tuning and Hyperparameters, scores in the music21 corpus are stored as MusicXML* files (.xml or .mxl). A free application that can view these files in staff notation is Finale NotePad* 2 (an introductory version of the professional music notation suite Finale*). Finale NotePad is available for Mac and Windows. When installing NotePad on macOS, the operating system may in some cases block the installer because it has not been digitally signed. Take a look at (Mac) Security settings prevent running MakeMusic installers to avoid this problem. Once Finale NotePad is installed, run the following code to configure music21 to use it:
>>> import music21
>>> music21.configure.run()
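Alternatively, if the interactive configuration is not convenient, the path to the MusicXML viewer can be set directly through music21's environment settings. This is a minimal sketch; the application path shown is only an example and depends on where Finale NotePad was installed:

>>> from music21 import environment
>>> settings = environment.UserSettings()
>>> settings['musicxmlPath'] = '/Applications/Finale NotePad.app'  # example path, adjust to your system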
We can now run the same code as in Figure 3, but with score.show() instead of score.show('text'). This will open the MusicXML file in Finale NotePad, which looks like this:
Figure 4: First page of a Bach chorale in staff notation.
This format gives a clearer visual representation of the chorales. Looking at a couple of the chorales confirms that the data is what we expected it to be: short pieces of music with (at least) four parts (soprano, alto, tenor, and bass), separated into phrases by fermatas. These properties can also be checked programmatically, as in the sketch below.
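For example, the number of parts and the number of fermatas in a chorale could be counted with something like the following (a minimal sketch; the exact counts will vary from chorale to chorale):

from music21 import corpus, expressions

score = corpus.parse(corpus.getBachChorales()[0])

# Number of parts (expected: at least four - soprano, alto, tenor, and bass)
print('Parts:', len(score.parts))

# Count fermatas, which mark the ends of phrases
fermata_count = sum(
    1 for n in score.recurse().notes
    for e in n.expressions
    if isinstance(e, expressions.Fermata)
)
print('Fermatas:', fermata_count)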
A common thing to do as a part of data exploration is to calculate some descriptive statistics. In this case we could find out how many times each key or pitch is used in the corpus. An example of how to calculate and visualize the number of times each key is used is shown below.
from music21 import corpus
import matplotlib.pyplot as plt

chorales = corpus.getBachChorales()
key_counts = {}

# Analyze the key of each chorale and count how often each key occurs
for chorale in chorales:
    score = corpus.parse(chorale)
    key = score.analyze('key').tonicPitchNameWithCase
    key_counts[key] = key_counts.get(key, 0) + 1

ind = list(range(len(key_counts)))
fig, ax = plt.subplots()
ax.bar(ind, list(key_counts.values()))
ax.set_title('Frequency of Each Key')
ax.set_ylabel('Frequency')
plt.xticks(ind, list(key_counts.keys()), rotation='vertical')
plt.show()
Figure 5: Frequency of each key in the corpus. Minor keys are labelled with lowercase letters and major keys with uppercase letters. Flats are notated with a ‘-’.
Below are some other statistics about the corpus.
Figure 6: Distribution of pitches used over the corpus3.
Figure 7: Note occurrence positions calculated as offset from the start of measure in crotchets3.
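Figures 6 and 7 are taken from the BachBot project3. A similar pitch distribution could be computed directly with music21, for example with the sketch below (note that parsing and analyzing the full corpus takes a while):

from collections import Counter
from music21 import corpus

pitch_counts = Counter()

# Count how often each pitch name (e.g. C, F#, B-) appears across the corpus
for chorale in corpus.getBachChorales():
    score = corpus.parse(chorale)
    pitch_counts.update(p.name for p in score.pitches)

# Show the ten most common pitch names
for name, count in pitch_counts.most_common(10):
    print(name, count)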
Which descriptive statistics are interesting to calculate will differ for each project. In general, however, they help you get a grasp of what kind of data you have and can even guide certain preprocessing steps. These statistics can also serve as a baseline for assessing the effects of preprocessing on the data.
Base Melodies
Musical instrument digital interface (MIDI) files for the base melodies were simply downloaded from the Internet (for a discussion of the selection process and links, see the (Music Dataset Search) article).
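Once downloaded, these MIDI files can be loaded into music21 in the same way as the corpus scores. The filename below is only a placeholder for one of the downloaded melodies:

from music21 import converter

# Parse a downloaded MIDI file into a music21 stream
melody = converter.parse('base_melody.mid')  # placeholder filename
melody.show('text')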
Conclusion
Data collection for the music data was a relatively straightforward process and only involved installing the music21 toolkit. Exploration of the dataset involved looking at the different representations of the score as well as calculating descriptive statistics on the data.
Now that all the relevant data has been collected and explored, the project can move on to the exciting part: implementing the deep learning models!
References and Links
1. Cuthbert, M., & Ariza, C. (2008). Music21 Documentation. Retrieved May 24, 2017, from http://web.mit.edu/music21/doc/index.html
2. Finale NotePad [Computer software]. (2012). Boulder, CO: Makemusic. https://www.finalemusic.com/products/finale-notepad/
3. Liang, F. (2016). BachBot: Automatic composition in the style of Bach chorales: Developing, analyzing, and evaluating a deep LSTM model for musical style (Unpublished master's thesis). University of Cambridge.
Find more helpful resources at the Intel® Nervana™ AI Academy.