Credentials Leak with Knex

Credentials Leak with Knex

Knex 2.4.2 and below are vulnerable to a credentials leak. Learn how to mitigate the problem.

Preamble

I wasn't planning on writing this post, but I think it's time to talk about this problem openly. First things first, let me be clear about something before I start: It's not my intention to bash on Knex or its maintainers.

This article will be focused on a security issue that I found in Knex and how to mitigate it, but I'll also talk briefly about the social aspects of this problem.

The Problem

Knex is (in essence) a query builder for Node.js, but its functionality does not end there. It also provides wrappers for database drivers, and a way to execute SQL queries directly (without having to pass them to a driver or client object). As we'll see, the ability to execute SQL queries directly is one of the reasons why it's possible to accidentally leak credentials without even knowing it.

Let's look at a minimal example of how to reproduce the problem:

import Knex from "knex";
import { pino } from "pino";

import { getConnectionConfig } from "./db.js";

const logger = pino();

const knex = Knex({
    client: "pg",
    connection: getConnectionConfig(),
});

const badQuery = knex.raw("CRASH AND BURN");
try {
    await badQuery;
} catch (err) {
    // We might want to log the failed query, so we can debug it later
    logger.error({ err, badQuery }, "It crashed and burned!");
}

So, taking aside the obvious bad code, where does the problem lie? Well, the problem is that the badQuery object holds a reference to the connection object, which in turn holds a reference to the credentials. This, by itself, is not enough to cause a credentials leak, but there are 2 other factors that make it happen:

  • The nested credentials object has enumerable properties (we'll see why this is important later)

  • Our logger (in this case, Pino) will serialize the badQuery object "too well".

What follows is an example of what could end up appearing in our logs:

[18:10:11.777] ERROR: It crashed and burned!
    badQuery: {
      "_events": {},
      "_eventsCount": 0,
      "client": {
        "_events": {},
        "_eventsCount": 4,
        "config": {
          "client": "pg",
          "connection": {
            "database": "svc-db-name",
            "host": "svc-db-host",
            "password": "svc-db-password", <-- here!
            "port": 5432,
            "user": "svc-db-user"
          }
        },
        "connectionConfigExpirationChecker": null,
        "connectionSettings": {
          "database": "svc-db-name",
          "host": "svc-db-host",
          "password": "svc-db-password", <-- and here too!
          "port": 5432,
          "user": "svc-db-user"
        },
                ...

We only wanted to log the poblematic query, but we ended up logging the credentials as well. Not good.

Mitigations

What can we do about it? Let's go over some mitigation strategies.

1. Use toQuery for logging

The first mitigation strategy is to convert the query object to string before logging it.

logger.error(
  { err, badQuery: badQuery.toQuery() },
  "It crashed and burned!"
);

I'm not a fan of this solution, because it requires Knex users to change their code, and to do it everywhere where a query is logged, but also because it's not entirely clear how good is the toQuery method at returning the real query that was executed... 😮‍💨

Don't get me wrong, doing this is better than nothing, and it can be useful in its own right if we want less verbose logs, but it's not a complete solution because it's too easy to reintroduce the problem again.

2. Enable redaction in your logger

Many loggers (including Pino) support redaction for sensitive data, and it is almost always advisable to enable it. The good thing about it is that we have a single place where we can configure redaction, and we don't have to worry about it anymore.

const logger = pino({
    redact: [
        "password",
        "connection.password"
        "connectionSettings.password",
        "*.password",
        "*.*.password",
        "*.*.*.password",
        "some.nested.list[*].password"
    ],
});

This approach is not free of problems either:

  • Although most "redacters" are quite flexible, they still require from us to know the path to the sensitive data, or at least the depth of the object where it is located.

    • As you can see in the snippet above, we can use wildcards if we don't know the exact path, but the depth still matters (at least for Pino, other libraries might be more flexible, but I suspect they won't, mainly for performance reasons).

    • Another problem that we can't solve with wildcards is that, although we can substitute array indices with [*] or property names with *, we still have to tell the redacter if it will encounter an array or a "regular" object.

Let's say for a moment that we have tested our redaction configuration thoroughly, and that we even overshoot by covering more cases than we need to. Sadly, this might not be enough.

A redaction system like this one is essentially a denylist, which means that new usage patterns might not be covered by it. For example, imagine that a developer decides to log an object that has nested query object, instead of directly logging the query. If there are arrays involved, the problem becomes even more complicated, wildcards only take us so far.

3. Do the right thing: fix Knex

I'm going to show the general idea behind the fix here. However, you can find a link to the actual PR in the References section at the end of this article.

The idea is to make the credentials property (in our case password) non-enumerable, so it won't be picked up by the logger serializer.

/**
 * Sets a hidden (non-enumerable) property on the `target` object, copying it
 * from `source`.
 *
 * This is useful when we want to protect certain data from being accidentally
 * leaked through logs, also when the property is non-enumerable on the `source`
 * object and we want to ensure that it is properly copied.
 *
 * @param {object} target
 * @param {object} source - default: target
 * @param {string} propertyName - default: 'password'
 */
function setHiddenProperty(target, source, propertyName = 'password') {
  if (!source) {
    source = target;
  }

    // THIS IS THE IMPORTANT PART
  Object.defineProperty(target, propertyName, {
    enumerable: false,
    value: source[propertyName],
  });
}

// Usage example:
// authentication.options.password becomes non-enumerable, therefore it won't be
// serialized by JSON.stringify, util.inspect or any other serializer.
setHiddenProperty(authentication.options);

We rely on Object.defineProperty to make the property non-enumerable, and we use a helper function to make it easier to use.

There is a caveat, though: as soon as we make a property non-enumerable, it won't be copied when we rely on Object.assign or the spread syntax (...) to perform object shallow copies (if we know the name of the property and access it explicitly, then we can copy it without problems).

4. Don't wait for Knex to fix it

I'll talk about this in the next section, but I don't have much trust in Knex applying the proposed fix any time soon. In the meantime, I recommend you to "patch" your knex dependency.

  • Yarn has a patch command (Yarn Berry only, Yarn classic doesn't support it).

  • PNPM has a patch command too.

  • NPM doesn't have a patch command, but you can use patch-package to achieve the same result.

You can base your patch on my PR, or at least you can use it as a reference.

5. Apply a combination of the previous strategies

None of the previous strategies is perfect, but combining them can result in a more robust solution that accounts for the shortcomings of each individual strategy.

Challenges in Open Source Maintenance

Some notes to set the context:

  • At the time of writing this article, Knex has 95 pending PRs and 737 open issues.

  • The only commits that are being merged are those from dependabot and from "regular" maintainers.

  • They have no clear security policy, so I had to report the security issue as a regular issue, in the open.

  • My report has been waiting for a response for 1 month now. The same goes for the solution that I proposed.

  • This was a known issue for pg developers, and they managed to fix it a long time ago (at the pg level), but the knowledge of this problem didn't reach Knex maintainers.

Ok, that doesn't sound good. But as I said at the beginning, I don't want to bash on Knex or its maintainers (I myself have maintained open source projects, and I neglected them due to lack of time and burnout).

Contributing code, triaging issues, and reviewing PRs all require a significant amount of time and effort. Sadly, most open source projects are understaffed, and most of their contributors are non-paid volunteers. You already know all this, but I think it's important to keep it in mind every time that we encounter these problems.

I have a few loose recommendations that (I believe) can help communities to cope with these problems (if they are not already doing it):

  • Establishing clear contribution guidelines, a code of conduct and a security policy can help a lot to reduce friction and to make the project more welcoming.

  • Defining stewardship rules can help to distribute the workload among maintainers and contributors, and to make it easier to onboard new maintainers.

Conclusion

I hope that this article will be helpful for those who are using Knex and want to ensure that they are not leaking credentials through their logs. I also hope that it served to raise awareness about some generic techniques that can be used to mitigate these kinds of problems, such as log redaction or making sensitive properties non-enumerable..

All the best 🤗

References