Abusing Flow comment syntax for great good

2018-05-23

Flow is a static type system for JavaScript. Code written with Flow looks like normal JavaScript with extra type declarations and type annotations:

type ShoppingCartEntry = {productName: string, count: number};
function totalCount(cart: ShoppingCartEntry[]): number {
    const counts: number[] = cart.map((entry) => entry.count);
    return counts.reduce((a, b) => a + b, 0);
}

These additions mean that the code is no longer valid JavaScript. Flow provides a preprocessor to strip the annotations away, but it also offers an alternative comment syntax for cases where adding a preprocessing step is undesirable:

/*:: type ShoppingCartEntry = {productName: string, count: number}; */
function totalCount(cart /*: ShoppingCartEntry[] */) /*: number */ {
    const counts /*: number[] */ = cart.map((entry) => entry.count);
    return counts.reduce((a, b) => a + b, 0);
}

The semantics are simple. Any block comment that starts with :: (a double colon) is treated as normal Flow code by the Flow parser. For convenience, any block comment that starts with : (a single colon) is treated as normal Flow code that begins with a literal colon, so that type annotations can be written like (17 /*: number */) instead of the more awkward (17 /*:: : number */).
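As a quick check of these rules, the sketch below uses both comment forms. The Point type and the manhattanNorm function are made up for illustration. Because every annotation lives inside a block comment, the file is also plain JavaScript and should run in Node without any preprocessing.

```javascript
// @flow
// Hypothetical example; `Point` and `manhattanNorm` are not from any library.

/*:: type Point = {x: number, y: number}; */

function manhattanNorm(p /*: Point */) /*: number */ {
    return Math.abs(p.x) + Math.abs(p.y);
}

// Per the rules above, Flow should read both casts the same way:
// `/*: T */` is shorthand for `/*:: : T */`.
const viaSingleColon = (17 /*: number */);
const viaDoubleColon = (17 /*:: : number */);
```

To JavaScript, both casts are just the parenthesized number 17 followed by a comment.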

This comment syntax is an entirely reasonable feature that we can abuse to create horrifying, devious contraptions. Sounds like fun!

(Note: All code in this post works on Flow v0.72. These techniques may well be patched in the future.)

Motivation: Incompleteness

Sometimes, we write code that is provably correct in a way that the type checker can’t infer. For instance, suppose that we have an array with elements of type Person | null (“either Person or null”), where Person is an object type with a string field called name. We want to retrieve the names of all the people in the array, ignoring the null elements. In plain JavaScript, we might write something like this:

/*:: type Person = {name: string, favoriteColor: string}; */
function peopleNames(maybePeople /*: (Person | null)[] */) /*: string[] */ {
    return maybePeople
        .filter((person) => person !== null)
        .map((person) => person.name);
}

A human can look at this code and easily see that it returns a valid list of strings. But Flow can’t, for a fully understandable reason. Flow knows that filter takes an array T[] and a predicate (T) => boolean, and returns a new array T[]. However, Flow doesn’t understand the relationship between the inputs and the output—in particular, that every element in the output satisfies the predicate. So, as far as Flow is concerned, the result of the call to filter might still contain null elements, and in that case the expression person.name would indeed be cause for alarm.
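To make the limitation concrete, consider a hand-rolled stand-in for filter with the simplified signature described above. This is only a sketch: myFilter is a hypothetical helper, and the library definition that Flow actually uses for Array.prototype.filter is more detailed. The point is that nothing in the return type T[] mentions the predicate.

```javascript
// @flow
// Hypothetical helper with the simplified signature from the text:
// (T[], (T) => boolean) => T[].
function myFilter/*:: <T> */(
    xs /*: T[] */,
    predicate /*: (T) => boolean */
) /*: T[] */ {
    const result = [];
    for (const x of xs) {
        if (predicate(x)) {
            result.push(x);
        }
    }
    return result;
}

// At runtime the null is gone. The signature, however, only promises
// an array of the same T as the input, and that T still admits null.
const kept = myFilter([1, null, 2], (x) => x !== null);
```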

In situations like these, it is tempting to reach for the any keyword: this is a magic type that is interconvertible with every type and for which all operations are permitted. In effect, it says that “anything goes” whenever a particular variable is involved. We can write:

function peopleNames(maybePeople: (Person | null)[]): string[] {
    return maybePeople
        .filter((person) => person !== null)
        .map((person) => (person: any).name);  // cast through `any`!
}

But here we are losing valuable type safety. We lose the ability to catch many potential errors in our code—for instance, a typo like person.nmae would go completely undetected. We want to refine the type information, not throw it away.
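The cost is easy to demonstrate at runtime. In the sketch below, written in comment syntax so that it runs as plain JavaScript, a misspelled property goes through the any cast. Given the behavior of any described above, Flow should accept it, and the function quietly returns undefined values instead of names. The function name peopleNamesWithTypo is made up for this example.

```javascript
// @flow
/*:: type Person = {name: string, favoriteColor: string}; */

// Hypothetical variant of `peopleNames` with a deliberate typo.
function peopleNamesWithTypo(
    maybePeople /*: (Person | null)[] */
) /*: string[] */ {
    return maybePeople
        .filter((person) => person !== null)
        // `nmae` is a typo for `name`; the `any` cast hides it from Flow.
        .map((person) => (person /*: any */).nmae);
}

const names = peopleNamesWithTypo([
    {name: "Alice", favoriteColor: "red"},
    null,
]);
// The single entry of `names` is undefined, despite the string[] annotation.
```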

We could give Flow a hint, by explicitly checking that each person in the filtered array is actually not null:

function peopleNames(maybePeople: (Person | null)[]): string[] {
    return maybePeople
        .filter((person) => person !== null)
        .map((person) => {
            // Explicit assertion just to appease the typechecker.
            if (person === null) {
                throw new Error("Unreachable!");
            }
            // If we get here, `person` is non-null, so this next line is fine.
            return person.name;
        });
}

Flow is now happy to treat the argument to map as a function taking Person | null and returning string, so this code type-checks and runs correctly. But this is not a great solution. Assertions like this make the code more verbose and harder to read, interrupting (ironically) a reader’s flow. Furthermore, writing code in anything other than the most natural way simply to appease tooling of any sort should always be a red flag: tools exist to help programmers, not hinder them, and if the tools are broken then they must be fixed.

Or: instead of fixing these tools, we can just lie to them.

White lies

Suppose that we had access to a function withoutNulls that gave a copy of its input array with all null elements removed. In that case, Flow would be satisfied by the following code:

function withoutNulls<T>(xs: (T | null)[]): T[] { /* implementation elided */ }
function peopleNames(maybePeople: (Person | null)[]): string[] {
    let people = maybePeople.filter((person) => person !== null);
    people = withoutNulls(people);  // no-op
    return people.map((person) => person.name);
}

Of course, we don’t actually want to call this function, and ideally we don’t even want the function to exist.

In fact, Flow makes it easy for us to declare that a function exists without providing its implementation, because this is commonly needed to talk about external library functions and the like. We can start with the following:

declare function withoutNulls<T>(xs: (T | null)[]): T[];
function peopleNames(maybePeople: (Person | null)[]): string[] {
    let people = maybePeople.filter((person) => person !== null);
    people = withoutNulls(people);  // now fails at runtime: no such function
    return people.map((person) => person.name);
}

Now, Flow is still happy, but our code will fail at runtime unless we actually provide an implementation of the withoutNulls function. We need Flow to think that we’re calling this function without actually having to do so.

Behold:

declare function withoutNulls<T>(xs: (T | null)[]): T[];
function peopleNames(maybePeople: (Person | null)[]): string[] {
    let people = maybePeople.filter((person) => person !== null);
    /*:: people = withoutNulls(people); */  // ta-da!
    return people.map((person) => person.name);
}

The comment syntax was designed to allow including Flow type annotations, declarations, and the like, but nothing stops us from including actual code! As far as Flow is concerned, the middle line of the function is just as real as the other two.
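For completeness, here is the same trick with every Flow-only construct moved into comment syntax, which makes it possible to confirm the runtime behavior directly in Node. Treat it as a sketch rather than a verified Flow program: it assumes that a declare function may sit inside a /*:: */ comment like any other Flow code, as the semantics of the comment syntax suggest.

```javascript
// @flow
/*:: type Person = {name: string, favoriteColor: string}; */
/*:: declare function withoutNulls<T>(xs: (T | null)[]): T[]; */

function peopleNames(maybePeople /*: (Person | null)[] */) /*: string[] */ {
    let people = maybePeople.filter((person) => person !== null);
    // Only Flow sees the next line; JavaScript treats it as a comment.
    /*:: people = withoutNulls(people); */
    return people.map((person) => person.name);
}

const names = peopleNames([
    {name: "Alice", favoriteColor: "red"},
    null,
    {name: "Bob", favoriteColor: "blue"},
]);

// No `withoutNulls` binding exists at runtime, so this is false.
const phantomExists = typeof withoutNulls !== "undefined";
```

The function runs to completion even though withoutNulls was never defined, because the only call to it lives in a comment.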

Now, for something a bit crazier.

Utter fabrications

Suppose that we have some code that requires a generated module: one created at build time, say, or even at runtime. In JavaScript, it is perfectly fine to write

const frobnicateWidgets = require("./frobnicateWidgets");

as long as the module is available when the require expression is evaluated. But such an import is of course incompatible with any static analysis. In particular, Flow will yield an error—“Cannot resolve module”—when the module in question has not yet been generated.

We can’t use exactly the same trick as before, wherein we performed some assertions that only Flow could see. The problem is that Flow knows what require does—it loads a module. If we were in a context where require were a normal function of appropriate type, then this wouldn’t be a problem.

And we can make it so:

const frobnicateWidgets =
    /*:: ((require: any) => */ require("./frobnicateWidgets") /*:: )() */;

Here we see the return of any. Within the body of this lambda expression—which only exists in Flow’s eyes!—require is treated as a normal function that we call with a normal string to get back what we need.

We can even give the result a well-defined type so that code in the rest of the program continues to have statically strong types, instead of being polluted by the any:

type WidgetFrobnicator = (Widget) => void;  // whatever the module signature is
const frobnicateWidgets: WidgetFrobnicator =
    /*:: ((require: any) => */ require("./frobnicateWidgets") /*:: )() */;

(This works because require, at type any, is treated as a function that also returns an any, which is then converted to a WidgetFrobnicator.)
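To try this end to end, the sketch below generates a module at runtime and then loads it with the phantom lambda in place. The temporary directory, the generated module body, and the Widget shape are all assumptions made for this example. The path here is computed at runtime rather than written as a literal; inside the phantom lambda that should make no difference to Flow, since require is just an any there.

```javascript
// @flow
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical types; substitute whatever the generated module exports.
/*::
type Widget = {frobnicated: boolean};
type WidgetFrobnicator = (Widget) => void;
*/

// Generate the module at runtime in a fresh temporary directory.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "frobnicate-"));
const modulePath = path.join(dir, "frobnicateWidgets.js");
fs.writeFileSync(
    modulePath,
    "module.exports = (widget) => { widget.frobnicated = true; };\n"
);

// Flow sees an immediately invoked lambda whose parameter shadows
// `require`; JavaScript sees an ordinary call to the real `require`.
const frobnicateWidgets /*: WidgetFrobnicator */ =
    /*:: ((require: any) => */ require(modulePath) /*:: )() */;

const widget = {frobnicated: false};
frobnicateWidgets(widget);
// `widget.frobnicated` is now true.
```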

In the peopleNames example, we added some phantom statements to the body of a function. Here, we’re actually changing the structure of the AST. Dangerous? Perhaps. Brittle? Probably. Interesting? Certainly!

Conclusion

We have seen how to bend Flow to our will by splicing arbitrary code into its token stream.

Ridiculous as it seems, this method has some benefits. It’s more precise than using casts through any. Using this method, we lie to Flow in a very specific and explicit way, instead of declaring that “all bets are off” for a particular variable and anything that it touches. Indeed, the keyword any is itself a grand lie, just one that tends to be better documented and supported.

The observant reader may recall our motivating suggestion that an ideal solution should be unsurprising to readers and should be written like natural JavaScript code, and protest that we have failed on both these counts.

Such a reader is 100% correct, but is also no fun at parties, because this hack is way cooler than any “practical”, “enterprise-grade” solution—so there.
