My OS X Programming Blog
Mac OS X Cocoa and CoreMIDI Programming

A blog where I will write mostly about programming in Cocoa and CoreMIDI, and about my experiences porting Emacs and XEmacs to the Mac OS.

More on Expressive Performance
Friday February 27, 2004

I’ve read two more papers on expressive performance today. I was attracted to Widmer’s “Machine Discoveries: A Few Simple, Robust Local Expression Principles” because it promises simple rules that work at the level of individual notes. Machine learning techniques are used to discover rules governing timing, dynamics, and articulation from performances of Mozart sonatas. The approach produces rules such as:

“Given two notes of equal duration followed by a longer note, lengthen the second of the two.”
Unfortunately, because such rules only form a partial model (they are true only for a fraction of positive examples), it’s not clear how one might apply them to generate expressive performances.

I then also read a summary of Friberg’s thesis. This work takes the “reverse” approach, proposing rules for generating expressive performances and essentially testing whether they produce nice-sounding results. Perhaps its main value is its collection of rules from much of the expressive-performance literature up to that point.

Perhaps I’ll have to devise my own algorithm for generating bass lines (mine is a much simpler problem). I don’t think I should vary the timing, since in jazz accompaniment the bass line helps to keep time. It would be interesting, though, to apply some of the rules from these papers to determine note dynamics. Since bass lines are generated in a previous composition step, the role of each note in them (e.g., chord tone, tonal passing note, chromatic passing note) is already known to the program. It should be a simple matter to apply a set of rules to them.
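To make this concrete, here is a minimal sketch of how such role-based rules might be applied to a generated bass line. It is my own illustration, not code from any of the papers: the NoteRole values and the rule constants are ad hoc assumptions.

#include <vector>

// Hypothetical note roles, as assigned by the composition step.
enum NoteRole { ChordTone, TonalPassingNote, ChromaticPassingNote };

struct BassNote {
  int midiNoteNumber;   // e.g. 36 = C2
  NoteRole role;
  int velocity;         // 0-127, filled in by applyDynamicsRules
};

// A few ad hoc rules: chord tones are played more firmly than passing notes,
// and the first note of each bar gets a small accent.
void applyDynamicsRules(std::vector<BassNote>& line, int notesPerBar)
{
  for (std::vector<BassNote>::size_type i = 0; i < line.size(); ++i) {
    int v = 80;                                          // base velocity
    if (line[i].role == ChordTone) v += 10;              // emphasize chord tones
    if (line[i].role == ChromaticPassingNote) v -= 10;   // de-emphasize
    if (i % notesPerBar == 0) v += 8;                    // accent the downbeat
    if (v > 127) v = 127;
    line[i].velocity = v;
  }
}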

Expressive Performance
Thursday February 26, 2004

I’ve been doing some reading on approaches to generating expressive performances, that is, machine imitation of the artistic nuances human performers add when playing a score. I’m looking ahead to how a realistic bass line can be generated once we’ve figured out which notes to play.

My search led me to a paper by Arcos, de Mantaras, and Serra, “SaxEx: a case-based reasoning system for generating expressive musical performance”. Their method uses Narmour’s implication/realization model and Lerdahl and Jackendoff’s GTTM to analyze the score and identify the structure of the piece and the role of each note within that structure. The rest of it is a machine learning system that identifies relevant cases in learned performances and applies the appropriate parameters to the note under consideration.

A paper that provides more details on SaxEx is “AI and Music: From Composition to Expressive Performance” by de Mantaras and Arcos. For a broader view of expressive performance, see “Modelling the Rational Basis of Musical Expression” by Widmer. Many related papers can be found at OFAI by typing in the keyword “expressive performance”, and at the “publications” page of the Music Performance Group at KTH.

A Design for Temporal Musical Objects
Tuesday February 24, 2004

Here’s a design for a set of temporal musical objects that can be used in an accompaniment-generation program. We already have the Note and Chord classes, which represent notes and chords without register/octave information. We’ll need to add classes like MusES’s OctaveDependentNote and OctaveDependentChord. To save typing, let’s just call these MIDINote and MIDIChord. These names are appropriate because a MIDINote object will probably be completely specified by a MIDI note number (e.g., 60 = C4). A MIDIChord object is just a vector of MIDINote objects. Notice also that velocity information is not present in these objects.
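As a rough sketch of what these two classes might look like (the class bodies are my own guesses; only the names and the MIDI-note-number representation come from the description above):

#include <vector>

// A note with register information: completely specified by its MIDI note
// number (e.g. 60 = C4).  No velocity or timing information here.
class MIDINote {
public:
  explicit MIDINote(int noteNumber) : noteNumber_(noteNumber) {}
  int noteNumber() const { return noteNumber_; }
private:
  int noteNumber_;
};

// A chord in a specific voicing: just a vector of MIDINote objects.
class MIDIChord {
public:
  explicit MIDIChord(const std::vector<MIDINote>& notes) : notes_(notes) {}
  const std::vector<MIDINote>& notes() const { return notes_; }
private:
  std::vector<MIDINote> notes_;
};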

The temporal musical object system is designed to have two levels. Level one captures temporal information as it might appear in a score; time intervals at this level are measured in quarter notes, eighth notes, and so on, and musical objects at this level are used in analysis, algorithmic composition, etc. Level two supports temporal information for performances and measures time intervals in MIDI file divisions. “Humanizing” a level-one representation of a bass line produces a corresponding level-two representation, which can then be written to a track in a MIDI file. Note that it is also during this step that note velocity information is added. We will sketch the design of level-one objects below.
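To illustrate the distinction, here is a small sketch of my own (not part of the design below, which covers only level one) of what the level-one-to-level-two step might involve: converting a duration in quarter notes to MIDI file divisions and attaching a velocity. The 480-divisions-per-quarter figure and the PerformedNote name are assumptions.

// Hypothetical sketch of the "humanizing" step from level one to level two.
const int kDivisionsPerQuarter = 480;   // assumed MIDI file resolution

// A level-two note event: duration in MIDI file divisions plus a velocity,
// both of which are absent from the level-one objects.
struct PerformedNote {
  int noteNumber;    // MIDI note number
  long divisions;    // duration in MIDI file divisions
  int velocity;      // 0-127, chosen while humanizing
};

// Convert one level-one note whose duration is given in quarter notes
// (e.g. 0.5 for an eighth note) into a level-two note event.
PerformedNote humanize(int noteNumber, double quarterNotes, int velocity)
{
  PerformedNote p;
  p.noteNumber = noteNumber;
  p.divisions = static_cast<long>(quarterNotes * kDivisionsPerQuarter + 0.5);
  p.velocity = velocity;
  return p;
}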

Our design centers on a class template TimeSeq. A class Duration is introduced to represent the length of level-one musical objects. Then TimeSeq is defined as:

#include <list>
#include <utility>

// A timed sequence: each element pairs a Duration (described above) with a
// payload of type T.
template <class T>
class TimeSeq : public std::list<std::pair<Duration, T> >
{
  ...
};

We can then use TimeSeq<Chord> to represent a set of chord changes, TimeSeq<MIDINote> to represent a bass line, TimeSeq<MIDIChord> to represent a generated piano accompaniment, vector<TimeSeq<MIDINote> > to represent a drum track, and TimeSeq<Scale> to represent the result of a tonality analysis of a set of chord changes.
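As a quick usage sketch (my own, for illustration only; the Duration and Chord constructors shown here are assumptions, since only TimeSeq itself is defined above):

#include <utility>

// Hypothetical usage of TimeSeq, assuming Duration can be built from a
// number of quarter notes and Chord from a chord symbol.
typedef TimeSeq<Chord> ChordChanges;
typedef TimeSeq<MIDINote> BassLine;

void buildExample()
{
  ChordChanges changes;
  changes.push_back(std::make_pair(Duration(4), Chord("Dm7")));    // one bar of 4/4
  changes.push_back(std::make_pair(Duration(4), Chord("G7")));     // one bar
  changes.push_back(std::make_pair(Duration(8), Chord("Cmaj7")));  // two bars

  BassLine bass;
  bass.push_back(std::make_pair(Duration(1), MIDINote(38)));   // D2, quarter note
  bass.push_back(std::make_pair(Duration(1), MIDINote(45)));   // A2, quarter note
}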

It doesn’t take a lot of reflection on the problem of duplicating this in any other language for one to realize how powerful C++/STL really is. In fact I dare anyone to do this and come up with a cleaner implementation :-).

MIDI File Writer
Monday February 23, 2004

To experiment with algorithms for generating bass lines, I’ll need to implement classes for notes and chords with pitch and duration information, much like MusES’s PlayableNote and PlayableChord classes. But to do that I’ll first need to sketch the implementation of the bass-line algorithms. I’ll work on this in the next few days.

I’ll also need a way to play the notes being generated. So I studied the MIDI file format. It turns out to be a simple enough format to output, especially when I only need to write files containing a few tracks and simple timing information, much like what BiaB or MiBAC Jazz will export. I then wrote a test program to output a MIDI file with a quarter-note scale in C.
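The test program itself isn’t posted here, but a minimal sketch along the same lines (my own code, not the actual test program) might look like this: a format-0 file with 480 divisions per quarter note and a single track holding an ascending C major scale in quarter notes.

#include <fstream>
#include <vector>

typedef std::vector<unsigned char> ByteVector;

// Append a big-endian integer of the given byte count.
static void writeBE(ByteVector& out, unsigned long value, int bytes)
{
  for (int i = bytes - 1; i >= 0; --i)
    out.push_back(static_cast<unsigned char>((value >> (8 * i)) & 0xFF));
}

// Append a MIDI variable-length quantity: 7 bits per byte, most significant
// first, continuation bit set on all but the last byte.
static void writeVarLen(ByteVector& out, unsigned long value)
{
  unsigned char bytes[5];
  int n = 0;
  do { bytes[n++] = static_cast<unsigned char>(value & 0x7F); value >>= 7; } while (value);
  while (n > 1) out.push_back(bytes[--n] | 0x80);
  out.push_back(bytes[0]);
}

// Append a four-character chunk ID such as "MThd" or "MTrk".
static void writeChunkID(ByteVector& out, const char* id)
{
  for (int i = 0; i < 4; ++i) out.push_back(static_cast<unsigned char>(id[i]));
}

int main()
{
  const int division = 480;  // MIDI file divisions per quarter note
  const unsigned char scale[8] = { 60, 62, 64, 65, 67, 69, 71, 72 };  // C major

  // Build the track data first so the chunk length is known.
  ByteVector track;
  for (int i = 0; i < 8; ++i) {
    writeVarLen(track, 0);         // delta time 0: note on immediately
    track.push_back(0x90);         // note on, channel 0
    track.push_back(scale[i]);
    track.push_back(96);           // velocity
    writeVarLen(track, division);  // one quarter note later: note off
    track.push_back(0x80);         // note off, channel 0
    track.push_back(scale[i]);
    track.push_back(64);
  }
  writeVarLen(track, 0);           // end-of-track meta event
  track.push_back(0xFF); track.push_back(0x2F); track.push_back(0x00);

  ByteVector file;
  writeChunkID(file, "MThd");      // header chunk: format 0, one track
  writeBE(file, 6, 4);
  writeBE(file, 0, 2);
  writeBE(file, 1, 2);
  writeBE(file, division, 2);
  writeChunkID(file, "MTrk");      // track chunk
  writeBE(file, static_cast<unsigned long>(track.size()), 4);
  file.insert(file.end(), track.begin(), track.end());

  std::ofstream os("cscale.mid", std::ios::out | std::ios::binary);
  os.write(reinterpret_cast<const char*>(&file[0]),
           static_cast<std::streamsize>(file.size()));
  return 0;
}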

I’ve also come across an interesting program called MMA. It’s written in Python and defines quite an elaborate language for describing patterns and styles. In theory this is a nicer approach than editing tables using BiaB’s StyleMaker utility, but I wonder why the author didn’t just extend Python by providing classes; then he’d have the full power of the Python language! Anyway, I’m more interested in systems that are AI/expert-system based rather than pattern-and-probability based, and that can generate more realistic accompaniments.



Copyright © 2003, 2004, 2005 Andrew Choi (Contact Information). Created with FCBlog