Under the Hood: BEATS 1.2.0 Performance Improvements

August 10, 2010

High fives all around! “Hey man, loving the new BEATS! How’d you get it so much faster?” Let’s find out.

First, for executives:

Reminder: BEATS is the top command-line drum machine of 2010. Feed it a song notated in YAML, and it will produce a wave file of impeccable timing and feel. To learn more, visit http://beatsdrummachine.com.

Song Audio is Written in Chunks

My therapist has been telling me I need to lose my all-or-nothing thinking, and I’ve taken it to heart. Let’s say you have a 4 minute song. When tasked with producing its audio, 1.1.0 would create a giant array to hold the entire sample data. It would have about 10.5 million elements (44,100 samples a second, for 240 seconds). The sample data for each pattern would then be painted onto the array at the appropriate place. Finally, the sample data for the entire song would be saved to disk in one operation.

Giant arrays require a lot of memory, and are sloooooooow. One big speed-up in 1.2.0 comes from eliminating the giant master array in favor of smaller per-pattern arrays. The output Wave file is now written in chunks, one pattern at a time. First, BEATS creates an empty Wave file that just has a header, but no sample data. As the sample data array for each pattern is generated, it is then appended to the file. This is much faster than writing a single giant array in one operation.

This optimization especially makes a difference when using the -s option, which saves each track to a separate file.

Patterns are Broken Into Smaller Sub-Patterns

So I said that smaller arrays can be faster, right? Well, a way to work with even smaller arrays is to break each pattern up. Suppose your song has this pattern:

Verse:
  - bass:   X.......X.X.....
  - snare:  ....X.....X.X...

Behind the scenes, BEATS 1.2.0 will break it up into 4 sub-patterns:

Verse0:
  - bass:   X...
  - snare:  ....

Verse4:
  - bass:   ....
  - snare:  X...

Verse8:
  - bass:   X.X.
  - snare:  ..X.

Verse12:
  - bass:   ....
  - snare:  X...

The song flow will also be updated so that the new patterns are played in the right order. (You don’t have to do anything special, BEATS will do all of this automatically).

When breaking up patterns, BEATS will make each one at most 4 steps long (equivalent to a quarter note, or two eighth notes). After some testing, this seems to be the sweet spot for performance - any bigger or smaller, and it gets worse.

By itself, this adds a noticeable speed improvement. But when combined with the next optimization, it gets really interesting.

Identical Patterns are Consolidated

In BEATS, caching occurs at the pattern level. The sample data for a pattern will only be generated the first time it occurs in the song. All subsequent times a cached version will be used.

Notice in the example above how Verse4 and Verse12 are identical. In BEATS 1.1.0, this would result in the same sample data being generated twice. BEATS 1.2.0 is smart enough to notice this, and will replace all occurrences of Verse12 with Verse4 (or vice versa). The end result is that cached sample data will be used in places where it would have previously been generated from scratch. This consolidation happens across the entire song after all patterns are split up. So if a rhythm occurs in both a verse and chorus, it will only get generated once.

Note that this works in concert with the previous optimization. The shorter each pattern is, the more likely it is to occur somewhere else in the song. This effectively lets BEATS look inside each pattern for repeated rhythms, and cache them. Big performance win.

I mentioned above that patterns 4 steps long are the sweet spot. A lot of this likely has to do with the fact that it represents a quarter note or two eighth notes. A basic assumption behind this optimization is that songs will be “musical”. I.e., a typical beat will contain repeated rhythms, and they will often occur on a quarter note (4/4, 3/4, etc.) or two eighth note (6/8, 12/8) boundary. This does mean if you give BEATS a song with totally random, non-repeating rhythms, it will probably run noticeably slower. If this is how you get your kicks though it’s time to re-evaluate your life.

This algorithm is pleasingly simple and mechanical. It’s easy to understand, and easy to test. I also looked into detecting repeated rhythms at arbitrary boundaries. However, that would be more complex and in a proof-of-concept the performance improvement was too small to be worth it.

Other Stuff

The 3 optimizations above account for the lion’s share of the performance improvements seen in 1.2.0. However, a few other optimizations chip in as well:

Final Notes

I don’t have specific numbers to show how much each individual optimization contributes to the overall performance improvement. This is for two reasons: