
Hands-On AI Part 20: Music Data Collection and Exploration


A Tutorial Series for Software Developers, Data Scientists, and Data Center Managers

This is the 20th article in the AI Developer Journey Tutorial Series. It continues the description of data collection and preparation begun in the Image Data Collection and Image Data Exploration articles with a discussion of data collection and exploration for the music data. Be sure to check out previous articles in this series for help on team formation, project planning, dataset search, and other related topics.

The goal of this project is to:

  • Create an application that takes in a set of images.
  • Extract the emotional valence of the images.
  • Output a piece of music that fits the emotion.

This project’s approach to creating emotion-modulated music is to use an algorithm (Emotion-Based Music Transformation) to alter a base melody according to a specific emotion, and then to harmonize and complete the melody using a deep learning model. To do this, two music datasets are required:

  • A training dataset for the melody completion algorithm (Bach chorales).
  • A set of popular melodies that serve as a template for emotion modulation.

Music Data Collection and Exploration

Bach Chorales—Music21* Project [1]

The choice to use Bach chorales as the training dataset for music generation was explained in detail in the Music Dataset Search article, which also briefly introduced the music21* project. Here, access to the music21 corpus is discussed in more detail.

Music21 is a Python*-based toolkit for computer-aided musicology that includes a complete collection of Bach chorales as part of its core corpus. Thus, data collection was as simple as installing the music21 toolkit (instructions are available for macOS*, Windows*, and Linux*).
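For example, the toolkit can typically be installed from the Python Package Index with pip:

pip install music21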

Once installed, the set of Bach chorales may be accessed using the following code:

from music21 import corpus

# iterate over every chorale in the corpus, in BWV numbering order
for score in corpus.chorales.Iterator(numberingSystem='bwv', returnType='stream'):
    pass  # do stuff with each score here

Figure 1: Iterating through all Bach chorales.
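As a minimal sketch of what the loop body might do, the following prints each chorale's title from its attached metadata (titles may be missing for some entries):

from music21 import corpus

for score in corpus.chorales.Iterator(numberingSystem='bwv', returnType='stream'):
    # each chorale is returned as a Score stream with a Metadata object attached
    print(score.metadata.title)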

Alternatively, the following code returns a list of the filenames of all Bach chorales, which can then be processed with the parse function:

from music21 import corpus
chorales = corpus.getBachChorales()
score = corpus.parse(chorales[0])
# do stuff with score

Figure 2: Getting a list of all Bach chorales.

Exploring the Data

Once a dataset has been collected (or accessed, in this case), the next step is to examine and explore the features of the data.

The following code will display a text representation of the music file:

>>> from music21 import corpus
>>> chorales = corpus.getBachChorales()
>>> score = corpus.parse(chorales[0])
>>> score.show('text')

{0.0} <music21.text.TextBox "BWV 1.6  W...">
{0.0} <music21.text.TextBox "Harmonized...">
{0.0} <music21.text.TextBox "PDF ©2004 ...">
{0.0} <music21.metadata.Metadata object at 0x117b78f60>
{0.0} <music21.stream.Part Horn 2>
    {0.0} <music21.instrument.Instrument P1: Horn 2: Instrument 7>
    {0.0} <music21.stream.Measure 0 offset=0.0>
        {0.0} <music21.layout.PageLayout>
        {0.0} <music21.clef.TrebleClef>
        {0.0} <music21.key.Key of F major>
        {0.0} <music21.meter.TimeSignature 4/4>
        {0.0} <music21.note.Note F>
    {1.0} <music21.stream.Measure 1 offset=1.0>
        {0.0} <music21.note.Note G>
        {0.5} <music21.note.Note C>
        {1.0} <music21.note.Note F>
        {1.5} <music21.note.Note F>
        {2.0} <music21.note.Note A>
        {2.5} <music21.note.Note F>
        {3.0} <music21.note.Note A>
        {3.5} <music21.note.Note C>
    {5.0} <music21.stream.Measure 2 offset=5.0>
        {0.0} <music21.note.Note F>
        {0.25} <music21.note.Note B->
        {0.5} <music21.note.Note A>
        {0.75} <music21.note.Note G>
        {1.0} <music21.note.Note F>
        {1.5} <music21.note.Note G>
        {2.0} <music21.note.Note A>
        {3.0} <music21.note.Note A>
    {9.0} <music21.stream.Measure 3 offset=9.0>
        {0.0} <music21.note.Note F>
        {0.5} <music21.note.Note G>
...
>>> print(score)
<music21.stream.Score 0x10bf4d828>

Figure 3: Text representation of a chorale.

Figure 3 shows a text representation of the chorale as a music21.stream.Score object. While it is interesting to see how music21 represents music internally, this view is not very helpful for examining the important features of the data. Therefore, software that can visualize the scores in staff notation is required.

As mentioned in Emotion Recognition from Images Model Tuning and Hyperparameters, scores in the music21 corpus are stored as MusicXML* files (.xml or .mxl). A free application that can display these files in staff notation is Finale NotePad* [2], an introductory version of the professional music notation suite Finale*, available for macOS and Windows. When installing NotePad on macOS, the operating system may block installers that have not been digitally signed for the OS; see (Mac) Security settings prevent running MakeMusic installers to avoid this problem. Once Finale NotePad is installed, run the following code to configure music21 to use it:

>>> import music21
>>> music21.configure.run()

We can now run the same code as in Figure 3, but with score.show() instead of score.show('text'). This opens the MusicXML file in Finale NotePad, which looks like this:

Figure 4: First page of a Bach chorale in staff notation.

This format gives a clearer visual representation of the chorales. Looking at a couple of the chorales confirms that the data is what we expected it to be: short pieces of music with (at least) four parts (soprano, alto, tenor, and bass), separated into phrases by fermatas.
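This structure can also be checked programmatically. For example, a quick sketch (reusing the parsing code from Figure 2) that lists the part names of the first chorale:

>>> from music21 import corpus
>>> score = corpus.parse(corpus.getBachChorales()[0])
>>> [part.partName for part in score.parts]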

A common thing to do as a part of data exploration is to calculate some descriptive statistics. In this case we could find out how many times each key or pitch is used in the corpus. An example of how to calculate and visualize the number of times each key is used is shown below.

from music21 import corpus
import matplotlib.pyplot as plt

chorales = corpus.getBachChorales()
key_counts = {}

# parse each chorale and tally its analyzed key
for chorale in chorales:
    score = corpus.parse(chorale)
    key = score.analyze('key').tonicPitchNameWithCase
    key_counts[key] = key_counts.get(key, 0) + 1

ind = list(range(len(key_counts)))
fig, ax = plt.subplots()
ax.bar(ind, list(key_counts.values()))
ax.set_title('Frequency of Each Key')
ax.set_ylabel('Frequency')
plt.xticks(ind, list(key_counts.keys()), rotation='vertical')
plt.show()


Figure 5: Frequency of each key in the corpus. Minor keys are labeled with lowercase letters and major keys with uppercase letters. Flats are notated with a ‘-’.

Below are some other statistics about the corpus.


Figure 6: Distribution of pitches used over the corpus [3].


Figure 7: Note occurrence positions, calculated as the offset in crotchets from the start of the measure [3].
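As a rough illustration, statistics like those in Figures 6 and 7 can be gathered with music21 along the following lines (a sketch only; the exact method used in [3] may differ, and parsing the full corpus takes several minutes):

from collections import Counter
from music21 import corpus

pitch_counts = Counter()
offset_counts = Counter()

for filename in corpus.getBachChorales():
    score = corpus.parse(filename)
    for note in score.recurse().notes:
        # tally pitch names with octave (e.g., 'C4'); chords contribute all of their pitches
        pitch_counts.update(p.nameWithOctave for p in note.pitches)
        # note.offset is the position in crotchets (quarter notes) within the containing measure
        offset_counts[note.offset] += 1

print(pitch_counts.most_common(10))
print(sorted(offset_counts.items())[:10])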

The descriptive statistics that are worth calculating will differ for each project. In general, however, they help you get a grasp of what kind of data you have, and they can even guide certain preprocessing steps. These statistics can also serve as a baseline for observing the effects of preprocessing on the data.

Base Melodies

Musical instrument digital interface (MIDI) files for the base melodies were simply downloaded from the Internet (for a discussion of the selection process, and links, see the Music Dataset Search article).
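These MIDI files can be loaded into music21 for the later transformation steps using the converter module. A minimal sketch (melody.mid is a hypothetical placeholder for one of the downloaded files):

from music21 import converter

# parse a downloaded MIDI file into a music21 stream for inspection and transformation
melody = converter.parse('melody.mid')
melody.show('text')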

Conclusion

Data collection for the music data was a relatively straightforward process and only involved installing the music21 toolkit. Exploration of the dataset involved looking at the different representations of the score as well as calculating descriptive statistics on the data.

Now that all the relevant data has been collected and explored, the project can move on to the exciting part: implementing the deep learning models!

References and Links

1. Cuthbert, M., & Ariza, C. (2008). Music21 Documentation. Retrieved May 24, 2017, from http://web.mit.edu/music21/doc/index.html

2. Finale NotePad [Computer software]. (2012). Boulder, CO: MakeMusic. https://www.finalemusic.com/products/finale-notepad/

3. Liang, F. (2016). BachBot: Automatic composition in the style of Bach chorales. Developing, analyzing, and evaluating a deep LSTM model for musical style (unpublished master's thesis). University of Cambridge.

Find more helpful resources at the Intel® Nervana™ AI Academy.

