No holds barred refactoring

David Veen

published:

updated:

16 november 2019

21 december 2020

How many times have you started tidying up on one end only to break something you thought was completely unrelated? You change an object reference somewhere and it breaks a counter somewhere else - that kind of deal. This almost always boils down to the curse of mutable state. A trap that is indelibly connected to an imperative programming style.

I recently came across a bit of code that's the perfect measure of small, isolated, and wrong, and which I'd like to share. I've taken the liberty of taking it out of its domain and converting it into plain JavaScript.

A bit of context: we have a few filters on screen that govern what data is shown to the user. If a filter is ON the data for that category is displayed, and vice versa. Rather than simply toggling on and off individually the filters act as a group:

if all filters are ON, turn all filters OFF except the one selected
if switching a filter OFF means that all filters are OFF then all filters are turned ON instead
otherwise, toggle the selected filter.

// DRAGONS LIVE HERE

const filters = {
  'blue': true,
  'green': true,
  'orange': true
};

function toggleFilter(name) {
  if (isAllChecked()) {
    Object.keys(filters).forEach(key => {
      filters[key] = false
    })
  }
  filters[name] = !filters[name]
  if (isNoneChecked()) {
    Object.keys(filters).forEach(key => {
      filters[key] = true
    })
  }
}

function isAllChecked() {
  let flag = true;
  Object.keys(filters).forEach(key => {
    flag = flag && filters[key];
  })
  return flag;
}

function isNoneChecked() {
  let flag = true;
  Object.keys(filters).forEach(key => {
    flag = flag && !filters[key];
  })
  return flag;
}

What exactly is wrong with this code? It's hard to articulate why, but it's hard to read, isn't it? The functions are small and don't take on too much responsibility. But they also throw a lot of implementation details at us without specifying intent. The main function here is called toggleFilter, but how exactly do the ifs and foreaches reach that goal? We have to grok every line to see what it does. Secondly, it seems verbose for what it's trying to achieve - especially the isAllChecked and isNoneChecked functions. Third, the toggleFilter function mutates state outside of its own scope. Rather than producing an output it produces a side-effect. Though we know our objective is to change the filters object, mutating it as a side-effect is a kind of casual cruelty to the future maintainer of this code. As the saying goes, we should assume he or she is a violent psychopath and knows where we live. All compelling reasons to refactor.

Mutating the filters object as a side-effect is a kind of casual cruelty to the maintainer of this code

Refactoring guidelines

Let's establish some guidelines before we dive in. After all, it would be entirely possible to solve this problem using binary arithmetic (flip a switch, change a bit!), but would that be an acceptable solution? How do we measure success?

As a first rule, let's assume that the original implementation is correct. We want to keep the tried and tested algorithm, not change it. That's our dot on the horizon.

A nice colloquial metric for code quality is 'what-the-f*cks per line of code' (WTF/LoC). Realising that our functions mutate a global object is clearly a WTF moment, so we want to get rid of that. There are a few more things we would like to achieve: we want our end result to be readable (self-documenting), small (no more than we need), and predictable (we do what we say we do). We also want to be as declarative as possible: we communicate what we want to achieve, not how we achieve it.

So with that in mind, let's change a few things:

#1: humble beginnings

// STILL DRAGON TERRITORY

const filtersMap = new Map([
  ['blue', true],
  ['green', true],
  ['orange', true]
])

function toggleFilter(name) {
  function isAll(value) {
    for (const val of filters.values()) {
      if (value !== val) {
        return false
      }
    }
    return true
  }

  function setAll(toValue) {
    for (const key of filters.keys()) {
      filters.set(key, toValue) // <--
    }
  }

  function toggleKey(key) {
    filters.set(key, !filters.get(key)) // <--
  }

  if (isAll(true)) {
    setAll(false)
  }

  toggleKey(name)

  if (isAll(false)) {
    setAll(true)
  }
}

We exchanged our filters object for a Map. Since we want to store key/value pairs that we can iterate over, a Map mirrors our intent much more closely. It also plays much nicer with arrays, in that a map can both be constructed from and destructured into one. When we're just using an object as a dictionary, Map is generally the superior choice.

Since we're not reusing our separated functions anywhere we should bring them back into the fold and hide our implementation details. Speaking of functions; turns out we don't need two separate functions to implement isAll and isNone but can parametrise the abstraction into one. We can also create a second abstraction; a function to set all values. Both of these functions enjoy the simplicity of the humble for loop. They're not fussy.

We've also added a third abstraction called toggleKey. Though it's just one line of code, separating it as a function gives us a major benefit: it means that we have fully separated our actions from our logic. This should make it much easier to read and maintain our code. Compare the resulting ifs structure with the previous implementation. Isn't it much more explicit in its intent? The beauty is that we don't have to know about the details of the implementation to understand the result. The only thing that might seem weird is that when we start reading toggleFilter from the top it doesn't do anything at first except define a bunch of functions.

Immutable with Style

While this version is quite an improvement it didn't fix the issue that got us so riled up in the first place - we're still mutating state. Our function pretends to be a well-trained pet, except it hasn't been house-broken: lines 21 and 26 are where it wets the bed.

We can taper over this problem by mutating a copy of the object instead. Add a line const copy = new Map(filters) then change all further references to refer to the copy, and return it. Done. But it feels dirty, doesn't it? Dangerous, even. Hiding the body in the closet doesn't mean it won't start to rot. How about a coding style where it's impossible to mutate state? Let's turn to the power of arrays.

There's an xkcd for this

#2: immutable by default

function toggleFilter(name, filters) {
  const isAll = isValue => pairs.every(([_, value]) => value === isValue)
  const setAll = toValue => pairs.map(([key, _]) => [key, toValue])
  const toggleKey = onKey => pairs.map(([key, value]) => key === onKey ? [key, !value] : [key, value])

  let pairs = [...filters]
  if (isAll(true)) {
    pairs = setAll(false)
  }

  pairs = toggleKey(name)

  if (isAll(false)) {
    pairs = setAll(true)
  }

  return new Map(pairs)
}

Our code still has the exact same abstractions as before, but the implementation is different (an easy thing to do now that we've separated logic and implementation). Instead of iterating over a Map we're iterating over an array of key-value pairs (a destructured Map). Then, when we're done with our function we construct a new Map from our intermediate result. The nice thing here is that the array function map always returns a new array rather than performing a mutation. So even though toggleKey's implementation might look a little more awkward than mutating the Map like we did before, this implementation is much safer.

Note that our desire to be pure also necessitates a change in interface. Our function signature changed from a chasmic unary string -> void to a more civilized binary string, map -> map. Friends and colleagues of our function need to be informed - this may or may not be easy to do, depending on the context.

With all that out of the way, there's just one more thing that should concern us. It's that isAll, setAll and toggleKey aren't generic. They know we're iterating over the pairs array, but we'd prefer to keep that information strictly need-to-know. The alternative, unfortunately, is a little verbose:

#3: separating the data from the implementation

function toggleFilter(name, filters) {
  const isAll = (isValue, arr) => arr.every(([_, value]) => value === isValue)
  const setAll = (toValue, arr) => arr.map(([key, _]) => [key, toValue])
  const toggleKey = (onKey, arr) => arr.map(([key, value]) => key === onKey ? [key, !value] : [key, value])
  
  let pairs = [...filters]
  if (isAll(true, pairs)) {
    pairs = setAll(false, pairs)
  }

  pairs = toggleKey(name, pairs)

  if (isAll(false, pairs)) {
    pairs = setAll(true, pairs)
  }

  return new Map(pairs)
}

What's with all the pairs growing in our code like mushrooms?

You'll like this, maybe

Though it seems silly to do so, our insistence on separating the hows from the whats teases out something that's been lingering at the back of our minds for all the iterations of this function so far: we're performing consecutive operations on a single data structure. Two of those operations are conditional, one of them is not. These if statements are actually getting in the way of conveying this information to the reader. Unfortunately there just isn't a nice way to tell JavaScript to maybe map over the data structure and chain some actions. At least not out of the box. But what if we could magically do something like this:

#4: mapping over a data structure

function toggleFilter(name, filters) {
  const isAll = isValue => arr => arr.every(([_, value]) => value === isValue)
  const setAll = toValue => arr => arr.map(([key, _]) => [key, toValue])
  const toggleKey = onKey => arr => arr.map(([key, value]) => key === onKey ? [key, !value] : [key, value])

  const pairs = Pairs.of([...filters])
    .map(maybe(isAll(true), setAll(false)))
    .map(toggleKey(name))
    .map(maybe(isAll(false), setAll(true)))
  
  return new Map(pairs.value)
}

Our function creates a Pairs data structure, and maps over it to create our end result. Map accepts a function, like toggleKey or setAll. But we can nest this function in another one called maybe, which does exactly what we'd expect; when some predicate function like isAll returns truthy we perform an action; otherwise we don't. There's a lot going on here. But just reading it; isn't our function suddenly wonderfully declarative? We pour our filters into a Pairs container, perform some actions, then return whatever value came out the other end. Some more things to note about this implementation:

no if statements in sight
it's impossible to mutate state
the chain is extensible and flexible; we can add as many operations as we want
our Pairs data structure can contain additional logic for null or error handling, without that getting in the way of our implementation (no nested ifs or try-catches)

It's like we get to the checkin at the airport and find out that our seats were upgraded. We don't know how it happened, but all that extra legroom sure is comfortable.

"But hang on", you will say - "we've cheated!" This function is no longer self-contained. Where does the Pairs object come from? How do we know it will work? That's a fair point. We've entered a world called Fantasy Land, ruled by algebraic data structures. Hooded figures roam the streets, waiting to pounce if we don't understand the difference between a lift and a blackbird combinator.

On the bright side, this means that the concepts we're using here are not some accidental by-product of this function, but fundamental building blocks to use and reuse in this style of programming. To see what's going on let's take a peek at a home-grown implementation inspired by Professor Frisby's Mostly Adequate Guide to Functional Programming. It only takes about 10 lines of code to get what we need. It's not that scary.

Prof. Frisby - the stop motion hedgehog

helper functions

// partial implementation of a pointed identity functor
class Pairs {
  static of(value) {
    return new Pairs(value)
  }
  get value() {
    return this._value;
  }
  constructor(value) {
    this._value = value;
  }
  map(fn) {
    return Pairs.of(fn(this.value))
  }
}

// heaven?
const maybe = (p, f, g=v=>v) => x => p(x) ? f(x) : g(x)
const pipe = (...fn) => (...xs) => fn.reduce((res, f) => [f.call(null, ...res)], xs)[0]

Looking under the hood of a partial implementation, the identity functor is at its core just a box storing our data. Mapping over it with a function returns a new box with new data. It's the most boring functor out there. No-one even remembers its birthday. But the power it brings to the table is phenomenal. Suddenly we can chain operations on a piece of immutable data.

The maybe function is also quite simple. It takes three functions: a predicate, an action and an alternative. The predicate determines which function is executed at run-time. The default alternative is a no-op that simply returns the input of the function.

That leaves us with the more scary-looking pipe function - which we haven't used up till now. It's the variadic implementation that makes it a bit hairy, but it's a simple idea: we provide a list of functions and when we call the pipe with an input, it will then call the second function in the sequence with the result of the first function, and so on. It allows us to apply the composition law to our consecutive mapping of operations.

compose(map(f), map(g)) === map(compose(f, g))

— Composition law

#5: look mommy, no intermediate data structures!

function toggleFilter(name, filters) {
  const isAll = isValue => arr => arr.every(([_, value]) => value === isValue)
  const setAll = toValue => arr => arr.map(([key, _]) => [key, toValue])
  const toggleKey = onKey => arr => arr.map(([key, value]) => key === onKey ? [key, !value] : [key, value])

  const pairs = Pairs.of([...filters])
    .map(pipe(
      maybe(isAll(true), setAll(false)),
      toggleKey(name),
      maybe(isAll(false), setAll(true))
    ))

  return new Map(pairs.value)
}

Getting rid of .map().map().map() in favour of a single pipe not only gives us more street cred, it is also a more performant implementation than any since version #2. Combining the operations prevents the engine from having to create three intermediate structures (one at the end of each map), which can make quite a difference when the datasets are large. It's nice to keep this in mind, even if the gains are negligible for us at the moment.

Go forth and functor it up?

We've made quite a journey from stateful, hard to read code to a declarative chain of operations over immutable data. You may wonder what the point was of all these intermediate versions. Functors ftw. If so, I'm happy to have impressed the beauty of the functional paradigm upon you. Most of the time, however, the Fantasy Land implementations stray a little too far beyond the pale. Measured in WTFs/LoC our final iterations certainly raise some red flags. Moreover, we don't want to import a library just to be able to run a single function. So, do we want to adopt this style everywhere in the application? What would that look like? And what would that mean for development speed in the short run? What about maintainability? Even if these challenges have real and liberating answers - insert zealot speech <here> - that decision is not ours to make.

While a 'no holds barred' refactoring is a useful and educational exercise, the prime directive at the end of the day still reads 'reduce complexity'. So the version I submitted was the boring ol' 2nd one - the arrays version. It makes two wonderful improvements: it separates logic from implementation details, and is immutable by default. It also takes a lot less time to explain to current and future colleagues.

That said, I really want to dive deeper into this tantalizing Fantasy Land. Maybe I'll write a blog about it.