A DoS vulnerability I found in Apigee’s platform

Tue Jul 22 2025

To be clear, this happened about a decade ago and has been patched for very nearly as long. Nothing here will help you hack anything. But maybe it will help you avoid writing code that is this easy to hack.

Apigee, now owned by Google, was one of those business-to-business software products that exist just because businesses will buy any old shit. If you can convince a manager that your overlay can improve accessibility then it doesn’t matter whether or not it works, because they’ve signed a £37,000/year five-year contract and anyway it’s their developers’ fault for integrating it wrong. Apigee was never that useless, but fundamentally it did things that were not terribly hard or necessary to do yourself, and it did them by routing all your API traffic through their servers which just feels like a hugely unnecessary single point of failure and/or data security concern to add to your system.

One of the easily implemented things it did for you was add logging, and like any good logging platform, it allowed you to mask data. After all, you don’t want to log something like this:

{
    "username": "john",
    "password": "hunter2"
}

So you can set the password field to be masked, and it will instead log this:

{
    "username": "john",
    "password": "**********"
}

Rather neatly, the password is now ten asterisks — regardless of the actual password’s length, so the length isn’t being leaked.

Now, the way a normal person would do this would be something like this:

function log(payload, keyToMask) {
    try {
        // Turn the string input into an easily manipulable data structure
        const value = JSON.parse(payload);
        // Edit that structure to remove the sensitive data
        value[keyToMask] = MaskString;
        // Turn it back into a string
        sendToLogger(JSON.stringify(value, null, 2));
    } catch {
        // Invalid JSON, we can’t mask this
        sendToLogger(payload);
    }
}

This will reformat your whitespace, which is arguably a good thing, but if you really want to avoid that you could write your own JSON parser/serialiser that streams through the data, masking as it goes. It’d be a nice little coding exercise, I think — challenging, but fundamentally not terribly difficult, and it would have the lovely advantage of never having to hold the whole string in memory, making it extremely performant.
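If the payload is nested, or more than one key needs masking, the same parse-and-mask idea extends fairly naturally through the replacer argument of JSON.stringify. What follows is only a minimal sketch: MASK and maskJson are hypothetical names of mine, not anything from Apigee's code.

```javascript
// Hypothetical mask value: ten asterisks, whatever the original length.
const MASK = "*".repeat(10);

// Parse the payload, then re-serialise it with a replacer that swaps the
// value of any sensitive key for the mask, at any nesting depth.
function maskJson(payload, keysToMask) {
    const sensitive = new Set(keysToMask);
    let parsed;
    try {
        parsed = JSON.parse(payload);
    } catch {
        // Invalid JSON: we can't mask it, so hand it back unchanged,
        // mirroring the simpler version above.
        return payload;
    }
    return JSON.stringify(
        parsed,
        (key, value) => (sensitive.has(key) ? MASK : value),
        2
    );
}

const masked = maskJson(
    '{"username":"john","password":"pass\\"word","profile":{"surname":"Smith"}}',
    ["password", "surname"]
);
console.log(masked);
```

Because the masking happens on the decoded structure, the contents of the secret never influence what gets replaced.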

I don’t know exactly what Apigee did, or how their code was written, but I know at least one of their products was Node-based and it’s the only explanation for some of the nonsense that is to come. So I am going to go ahead and assume that their code was indeed JavaScript, and that it worked something like this:

function log(payload, keyToMask) {
    // Turn the string input into an easily manipulable data structure
    const value = JSON.parse(payload);
    // Read the password out of that structure
    const valueToMask = value[keyToMask];
    // Mask all occurrences of the password from the original string
    sendToLogger(payload.replaceAll(valueToMask, MaskString));
}

I’m not going to say "this is a reasonable-seeming approach". It isn’t. It’s something I’d expect a novice or ChatGPT to write, and any good senior engineer would look at the pull request and use it as a mentoring opportunity. And there are a surprising number of problems with this approach.

It doesn’t just mask the password

If your password is "password", which to be clear it should not be, then because their code replaces all occurrences of it, the logs will look like this:

{
    "username": "john",
    "**********": "**********"
}

and everyone reading them will know your password.

It doesn’t even mask the password

Because the replaceAll call runs on the raw JSON string, and not the decoded values, it will not work if your password contains a double-quote character. JSON escapes " to \", so if your password is pass"word, replaceAll will look for exactly that, and since all it can find in the payload is the slightly-different pass\"word, it will log this:

{
    "username": "john",
    "password": "pass\"word"
}

and everyone will know your password.
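Both failure modes are easy to reproduce. The sketch below assumes the string-replacement approach works the way I guessed earlier; naiveMask and MASK are hypothetical names for illustration.

```javascript
const MASK = "*".repeat(10);

// Assumed reconstruction of the flawed approach: read the secret out of the
// parsed JSON, then string-replace it in the raw payload text.
function naiveMask(payload, keyToMask) {
    const secret = JSON.parse(payload)[keyToMask];
    return payload.replaceAll(secret, MASK);
}

// Case 1: the secret also appears elsewhere (here, as the key name),
// so the key is masked too and the log gives the password away.
const overMasked = naiveMask('{"username":"john","password":"password"}', "password");
console.log(overMasked);
// {"username":"john","**********":"**********"}

// Case 2: the secret contains a double quote. The raw text holds the escaped
// form pass\"word, the parsed value is pass"word, so nothing matches.
const raw = JSON.stringify({ username: "john", password: 'pass"word' });
const unmasked = naiveMask(raw, "password");
console.log(unmasked);
// {"username":"john","password":"pass\"word"}
```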

String.replaceAll did not exist back then

String.replaceAll has been widely available since August 2020, and all this was happening around 2015. The older String.replace function would only mask the first appearance of your password, which is obviously no good (at least, assuming we’re committed to this terrible string-replacement approach). But there was a workaround: regular expressions.

While "test string".replace("s", "-") will stop after the first match and give you "te-t string", "test string".replace(/s/g, "-") will return "te-t -tring", because the g flag makes the regular expression global.

Which is all well and good, as long as your regular expression can be trusted. But this is running on user input, so they would have ended up with something like:

function log(payload, keyToMask) {
    // Turn the string input into an easily manipulable data structure
    const value = JSON.parse(payload);
    // Read the password out of that structure
    const valueToMask = value[keyToMask];
    // Generate a global regex from the (user-supplied) password
    const valueRegExp = new RegExp(valueToMask, "g");
    // Mask all occurrences of the password in the original string
    sendToLogger(payload.replace(valueRegExp, MaskString));
}

This is extremely bad. Not least because it means that, as well as not masking your password if it has a double-quote in it, we now don’t mask it if it contains any RegExp control characters, which include $, +, ^ and ? (OK, . is probably fine). Ironically, your password will only be masked if it’s laughably insecure.
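To make the control-character problem concrete, here is a small sketch. regexMask is my assumed reconstruction of the logger, and escapeRegExp is a commonly used helper for escaping metacharacters; neither is taken from Apigee's code.

```javascript
const MASK = "*".repeat(10);

// Assumed reconstruction: build a global RegExp straight from the secret.
function regexMask(payload, keyToMask) {
    const secret = JSON.parse(payload)[keyToMask];
    return payload.replace(new RegExp(secret, "g"), MASK);
}

// Backslash-escape every RegExp metacharacter so the text matches literally.
function escapeRegExp(text) {
    return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

const payload = JSON.stringify({ username: "john", password: "pa$$word" });

// "$" means "end of input" in a pattern, so /pa$$word/ can never match
// and the password is logged in the clear.
const leaked = regexMask(payload, "password");
console.log(leaked);

// With escaping, the same secret is matched literally and masked.
const escaped = payload.replace(new RegExp(escapeRegExp("pa$$word"), "g"), MASK);
console.log(escaped);
```

Escaping only closes the regex-injection hole. The key-name and double-quote problems from the previous sections remain, so it is a patch on the wrong approach rather than a fix.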

The memory bomb

Clearly this is a data-logging issue, but how can you use it to create a denial of service attack?

A memory bomb is an exploit based on sending a small payload to a target system, which it will expand into a very large data file, using up all its resources, and crashing the server. In this case, we do it by repeatedly masking our password.

For this, you’ll need to find an Apigee-powered server that masks more than one field. Perhaps we’re collecting data on people and we want to mask their name, gender, address and date of birth. If the payload looks like this:

{
    "profile_id": "a88f932c-ceb4-4f54-a0e5-cecb40186c01",
    "first_name": "John",
    "surname": "Smith",
    "gender": "MALE",
    "address_number": "152",
    "address_street": "Fake Street",
    "address_line_2": "Nowherton",
    "address_city": "Nullchester",
    "address_country": "USA",
    "address_zip": "24601"
}

then we’re looking at masking nine different fields. To trigger the attack, we submit this request:

{
    "profile_id": "f0b9e7cf-5f75-4666-a0a6-23555ac48e00",
    "first_name": ".",
    "surname": ".",
    "gender": ".",
    "address_number": ".",
    "address_street": ".",
    "address_line_2": ".",
    "address_city": ".",
    "address_country": ".",
    "address_zip": "."
}

I should stress that this is the (second) point where I alerted them to an issue with this system, so what follows is mostly untested — we were using Apigee’s systems, so I was somewhat motivated not to destroy them — but I do strongly believe it would have worked.

The dot character is a RegExp control character that matches any single character.

There are 280 characters in our malicious payload, and all of them would have been replaced with ten asterisks when we mask the first_name field, making a 2,800-character masked string. Then we mask the surname, and all 2,800 asterisks will be replaced with ten asterisks each, making the payload 28kB long. Every further field multiplies the size by ten again, so by the time we’ve replaced the zip code, our payload is 280GB long and the server has run out of memory and crashed.
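The growth is easy to check at a harmless scale. This sketch assumes the logger parses the payload once and then masks each configured field in turn; maskFields is a hypothetical name, and only three fields are masked so the result stays tiny.

```javascript
const MASK = "*".repeat(10);

// Assumed reconstruction of the vulnerable logger: parse once, then mask each
// configured field in turn with a RegExp built from that field's value.
function maskFields(payload, keysToMask) {
    const parsed = JSON.parse(payload);
    let text = payload;
    for (const key of keysToMask) {
        text = text.replace(new RegExp(parsed[key], "g"), MASK);
    }
    return text;
}

// Three masked fields instead of nine, so this stays small.
const keys = ["first_name", "surname", "gender"];
const bomb = JSON.stringify({ first_name: ".", surname: ".", gender: "." });
const expanded = maskFields(bomb, keys);

// Each pass multiplies the length by ten.
console.log(bomb.length, "->", expanded.length);

// Projected size for the 280-character, nine-field payload in the text.
console.log(280 * 10 ** 9, "characters");
```

One caveat: in JavaScript, . does not match line terminators unless the s flag is set, so the newlines in a pretty-printed payload would survive each pass. That barely changes the arithmetic.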

The CPU bomb

There is, though, an even worse payload you could send this API:

{
    "profile_id": "0dd08034-de2f-40df-8b98-aa7ba31304aa",
    "first_name": ".",
    "surname": ".",
    "gender": ".",
    "address_number": ".",
    "address_street": ".",
    "address_line_2": ".",
    "address_city": ".",
    "address_country": ".",
    "address_zip": "(\\**)+a"
}

This uses two separate regular expressions. The first is the same “any single character” matcher from before, meaning by the time we’ve masked every field except the zip code, our payload is 28GB of asterisks.

The second regular expression, (\**)+a, kind of means “any number of asterisks followed by an a”, but written in a deliberately inefficient way. Specifically it’s saying “any number of groups, each of which is any number of asterisks, and then an a”. This is a fairly standard “malicious regex”, and it will cause the RegExp engine to check if the expression matches with groups of every size from zero to 28 billion, and when that doesn’t work, it will say “OK, the first asterisk doesn’t match” and proceed to check the other 28 billion of them in the same way. The execution time for this grows exponentially with input size, and our input is big. If you submit a few of these, you should be able to tie up all of the server’s execution threads, so even if it somehow has enough memory to run all these checks at once, it won’t have any free CPU to serve anyone else’s requests. This is why you never parse user input as a regular expression: the user might not have your best interests at heart. Arguably it would be a worse attack with fewer masked values, as if we can keep the payload size under about a gigabyte then it will fit in memory and we’ll tie up the server for ages instead of letting it crash and reboot gracefully.
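You can watch the blow-up on your own machine with very small inputs. This is a rough sketch, assuming a backtracking RegExp engine such as the one Node uses by default; timeFailedMatch is a hypothetical helper, and the timings will vary by machine and engine version.

```javascript
// The pattern from the payload above: nested quantifiers, then a literal "a"
// that a string of asterisks can never supply.
const evil = new RegExp("(\\**)+a");

// Time one failed match against n asterisks.
function timeFailedMatch(n) {
    const input = "*".repeat(n);
    const start = Date.now();
    const matched = evil.test(input);
    return { n, matched, ms: Date.now() - start };
}

// Kept deliberately small: on a backtracking engine, each extra asterisk
// roughly doubles the work before the match finally fails.
for (const n of [14, 16, 18, 20, 22]) {
    console.log(timeFailedMatch(n));
}
```

If the times climb steeply over a few dozen characters, it should be clear why nobody needs 28 billion of them.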

Conclusion

I never tried this on Apigee’s servers. I was sufficiently horrified already and didn’t think I could handle the psychic damage when this attack inevitably worked. They have since patched their logger to not use regular expressions generated from user input.

But I think it’s a good little teachable moment — you can see how they got to this terrible situation making only decisions that seemed passably reasonable in isolation. “We can just string-replace the password,” they thought. “Oh, we need to do replace-all to make sure we get it.” “Oh, I guess we need to use regular expressions for that.” “Hm, we should probably support masking multiple fields.” Most of those are bad ideas, but only one of them is disastrous, and by that point you’re already quite a way along implementing this train wreck.

It’s a really common error for junior (and sometimes quite senior) developers to start down a reasonable-seeming but wrong path like this, and end up doing some quite horrible things to force it to work. Their code usually does wind up working, but there’ll usually be some edge case like this that almost makes you wish it hadn’t. With experience, you learn to recognise things like turning user input into a regular expression as signs that you’ve taken a wrong turn and there was a simpler solution earlier on that you missed.