An update of Aalto and Kaivo to 1.8.5 is out, fixing a possible crashing bug introduced with 1.8.4 and optimizing file handling. Because the bug only manifested for certain users, it was an interesting one to fix and I wrote a little about it, afterwards:
Six days ago, I was feeling happy and content at the end of a work day. I'd just deployed my 1.8.4 release that added a useful expert feature and a handful of bug fixes. I'd tested in all the usual hosts on both MacOS and Windows, and both plugins were making the right sounds and not eating too much CPU and most importantly, not crashing. So with my celebratory bourbon in hand... you already know where this is going.
I got a couple reports back right away that Aalto was crashing immediately on being scanned. One person reported in that Kaivo was crashing too, but most people were not having problems with it. In short, a whole rotten gift basket of ugly issues was collected in the field, but nothing was reproducible here. This was one to sleep on.
The next morning I upgraded one of my Macs to Mojave to match a customer's software and hardware setup exactly. Still no problems here. I spent the day reviewing code I'd just written and looking for bugs. I'd just done some fairly low-level work on timing, changing a lot of objects to use the C++ std::chrono library for their clocks. (C++ is what I use to write all the plugins, a sprawling mess of a language that can make very fast programs, used by most all audio software developers.) I found a couple of suspicious areas and spent a day on that, making the code more clear and obviously correct. Sent out an update to a few folks—same results. Guessing about what might be crashing and fixing anything suspicious is not a great way to track down bugs, but sometimes it's all I can productively do until a better course of action becomes clear.
One of the crashes seemed to be in code that was making lists of files. What if a particular preset was causing it? So, I asked a nice Aalto user to send me their preset library. And boom! Right away the crash.
I looked at the offending C++ code, which began:
void Path::parsePathString(const char* pathStr)
{
const int pathStrBytes = strlen(pathStr);
SmallStackBuffer<char, kShortFragmentSizeInChars> buf(pathStrBytes);
...
What this parsePathString
does is read a raw sequence of characters in memory and turn it into a compact Path
object that can refer to an object in a collection, like DSP objects in a graph, or files on disk.
The strlen
function is from the old C standard library that, for better or worse, is available as part of any modern C++ toolset. It counts the number of characters in a null-terminated string—that's some piece of text defined by where it starts in memory and containing all the bytes in the following memory until a zero is reached. Sensibly, (?) the zero is not counted as part of the length.
The text SmallStackBuffer<char, kShortFragmentSizeInChars> buf(pathStrBytes)
is an description of an object to be made. The presence of the <> brackets tells us the object is defined by a C++ template. Inside the brackets are the arguments to the template, and these describe a kind of variation on the theme of the basic object. In this case, SmallStackBuffer
is one of my own memory utilities that holds some data in a different place depending on how much data there is. Without getting too deeply into it, if there's a small amount of data it can go into a commonly used area called the stack, ensuring that other memory is not allocated, an operation that may lead to audio glitches. kShortFragmentSizeInChars
is the constant that determines how much data will go onto the stack.
Back to the scene of the crash. The file name "Xenos Soundworks - Forbidden Experiments/MA Bass Engine (Use MW)" was being read at the time. (Check out their Soundbanks for Aalto and many other synths.) Calling the strlen
function on this string returns: 64. That's a power of two, which might be a clue, and sure enough, it turns out that 64 was the value of kShortFragmentSizeInChars
. The crash starts to look like a crime scene. Time to look at what happens next, very carefully.
The next thing that happens to a 64-character file name is, it gets written to the SmallStackBuffer
object—including the final null terminator which goes to location 65 out of 64, overflowing the object's memory on the stack. The null might be written to part of another object, which could lead to the program crashing right away. Or, depending on when that other object is accessed, it might not crash for a while, which is why memory corruption like this can be so hard to diagnose.
How do we fix this? The easiest fix is very easy. Just change strlen(pathStr)
to strlen(pathStr) + 1
. Then there will be room for a 63-character file name and its terminator byte. Post update, go upstairs, back to celebratory bourbon. But hold up. To improve stability and efficiency over time I treat each bug as an opportunity to make all the code better, to fix not just this one instance of something having gone wrong but rather a whole class of things with the potential to go wrong. What can I learn from this mistake? Instead of just adding a +1
I'm going to look at all of my code and tests, and I've got four tasks ahead of me.
The first task is to look at my use of strlen
across all of the code for all my plugins. If I messed up this way once, I'm likely to have done it elsewhere. Thankfully I have not used strlen
many times, because of the potential for problems like this very one. The only good place to use it is in a higher-level object or function that will abstract away the confusion. Searching, I find there are only five other places in my own code where I've used strlen
, and I can quickly verify that I'm doing the right thing in all of them.
Secondly, I'm going to look at all the code for parsePathString
. Zoom out a bit. Could I have been doing this whole operation in a different way that didn't open the door to the problem? Yes, it turns out. The stack was getting corrupted because I was writing the path string to a temporary buffer in the process of decoding it to make symbols. That was the easiest way to use the Unicode text library I'm relying on. It should be possible to do the same thing just reading the path and not writing it, which will make the program a little faster, but how to do that wasn't clear the first time around. Digging into the library a little more deeply, I understood how to do it, and made a faster version of parsePathString
that never has to allocate any heap memory at all.
Third, I'm going to add tests that could have caught this problem before it happened to anyone else. The Path
object is a part of my core library madronalib and it really should be tested automatically as part of each release. Testing it with a string of length 64 is the edge case that would have caught this, so I'll add that as well as the neighboring lengths and some weird strings like empty and badly-formed ones.
Finally, there's a simple way I could have figured this out sooner: by having had the magic preset file in my collection to begin with! I usually have on hand only the "factory patches" I've made here, because when I make music I'm always making patches from scratch. But lots of people have made Aalto patches over the years and they are a bigger collection of data that might help shed light on other aspects of the code in the future. So I'll be grabbing some sample banks to have on hand.
Every programmer I know has days like the past few, and could empathize if I'd just said "I got stuck on a stack corrupting off-by-one bug" instead of the above. I wrote this more for myself, to cement good practices, and for the non-programmers out there, to share some feeling of what it's like to do this work. Some days it's even more fun.
I'd say you've earned the whole bottle of bourbon.
What’s the expert feature?
It's the "master_tune" parameter. I wrote more about it with the 1.8.4 release. https://www.madronalabs.com/topics/7588-kaivo-and-aalto-updated-to-1-8-4
Oh. Sorry. I didn’t check. Thanks for the info. Is cpu a bit lighyer in Kaivo with 1.84/5? Just curious. I didn’t directly compare some of the patches with the 1.8.3, but it seems like it is. I could also just be smoking something.
No worries... Audio CPU use should be more or less the same in 1.8.5. I've changed a lot of interface timer code so UI stuff may feel snappier, which I don't have a good way of quantifying but it does feel better to me.
everything fine here - thank you !
Three Thumbs up :o)
I can relate, thanks for the update !
nice sleuthing. thanks for sharing the journey with us.
Hi Randy, great that you are still supporting your fantastic VSTs for us. Much appriciated and also I am looking forward for SUMU :)
Have a great and relaxed time.
By the way, the new looks of the MADRONA LABS page are great. Very relaxing for the eye.
Having lived through the "Microsoft pretends there are long filenames" era I can tell you that names must only be 8 characters and only letters and numbers. No! No lower case, stop trying to break things!