Hamish Rickerby

Technology Consultant & iOS Developer based in Sydney, Australia

Refactoring Legacy Code - Gilded Rose Kata


Getting legacy code into a state where you’re comfortable making changes to it can often be challenging. Legacy code can have varying levels of quality throughout the codebase, as well as different levels of testing applied to it. I’ve taken over my fair share of old projects that have no tests, or haven’t followed SOLID principles, and understanding how the code works and what it does can be a challenge.

The Gilded Rose Kata is an exercise built around a system that calculates “quality” and expiry dates for items in a shop as time passes. When you start the Kata, you have a single method that calculates the changes in quality and expiry for the items. The business logic implemented is hard to understand, and features nested if statements with conditionals that feel inverted. To complete the Kata, you also need to add functionality to support a new item type in the shop.

This post will describe how I refactored the code. All code and commits are available at https://github.com/rickerbh/GildedRose-Refactoring-Kata-Swift

My Approach

Acceptance Tests

When refactoring, you’re typically looking to improve the internal structure of the code without changing the externally observable behaviour of the system. To ensure the existing behaviour didn’t change whilst I was refactoring, I followed the advice in the readme that recommends “Text-Based Approval Testing”. This is essentially an acceptance test in which the behaviour of the system is captured before you start refactoring, and then, while refactoring, you can compare the changed system’s output with the original output.

To implement this, I captured the text output from running the application in a file and stored it in the repo. I then implemented a test that would replicate running the application, but store the results in memory, and compare those results with the output from the original application. As an aside, when I first implemented this I compared the full output in one go. I quickly found this to be a terrible solution, and switched to a line-by-line comparison so I could easily tell which part of the calculations was going wrong. I also found an issue with the original project setup, so submitted an (accepted) PR to the original repo that corrected the store definition.
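
To give a flavour of what that looks like, here’s a rough sketch of such a test in Swift using XCTest. The file name and the runApplication(days:) helper are placeholders for illustration, not the actual code from the repo.

import XCTest

class ApprovalTests: XCTestCase {
  func testCurrentOutputMatchesOriginalOutput() throws {
    // Load the output captured from the original application (stored in the repo).
    let url = try XCTUnwrap(Bundle(for: type(of: self)).url(forResource: "ApprovedOutput", withExtension: "txt"))
    let approved = try String(contentsOf: url, encoding: .utf8).components(separatedBy: "\n")

    // runApplication(days:) stands in for replicating the app run in memory,
    // returning one line of output per item per day.
    let actual = runApplication(days: 30)

    XCTAssertEqual(actual.count, approved.count)

    // Compare line by line, so a failure points at the exact calculation that diverged.
    for (index, (actualLine, approvedLine)) in zip(actual, approved).enumerated() {
      XCTAssertEqual(actualLine, approvedLine, "Mismatch at line \(index)")
    }
  }
}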

Understanding the Existing Logic

I’ve recently read Working Effectively With Legacy Code by Michael Feathers, and one approach from the book jumped out at me as a way to understand the Gilded Rose’s logic - Sprout Method. This is where you take a small block of existing functionality (e.g., an if statement), extract it out to its own function, and give it a sensible name. Then, rather than seeing a block of code such as

if (items[i].name != "Sulfuras, Hand of Ragnaros") {
  items[i].sellIn = items[i].sellIn - 1
}

you see something like

decrementSellDate(items[i])

Example of this here.

The behaviour of the system doesn’t change, but suddenly it becomes more readable. Each “sprouted” method is also accompanied by a test. This also lets you validate that the extracted code continues to work as expected, even with other changes taking place.
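
For illustration, the test accompanying a sprouted method like decrementSellDate might look something like the sketch below (the Item initialiser follows the kata’s usual shape; these aren’t the exact tests from the repo).

import XCTest

class SproutedMethodTests: XCTestCase {
  func testDecrementSellDateReducesSellInByOne() {
    let item = Item(name: "Elixir of the Mongoose", sellIn: 5, quality: 7)
    decrementSellDate(item)
    XCTAssertEqual(item.sellIn, 4)
  }

  func testDecrementSellDateDoesNotTouchSulfuras() {
    let item = Item(name: "Sulfuras, Hand of Ragnaros", sellIn: 5, quality: 80)
    decrementSellDate(item)
    XCTAssertEqual(item.sellIn, 5)
  }
}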

For this refactor, I continued to write tests and extract methods to get the giant block of if statements to a point where I could understand what they were doing.

After this, I considered the if statement itself. It used a bunch of negation statements to control the flow of execution through the application, e.g., if (items[i].name != "Aged Brie" && items[i].name != "Backstage passes to a TAFKAL80ETC concert"). Personally, I find this style of if statement creates a higher cognitive load than “positive” statements, so I restructured it to use statements more like if (items[i].name == "Aged Brie" || items[i].name == "Backstage passes to a TAFKAL80ETC concert"). Maybe that’s a personal thing, but I find it simpler to understand - being explicit about what the predicate matches, rather than “everything but these items”. During this activity, having the acceptance tests running was crucial to ensure the system behaved the same while the statements were changing.
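
In rough terms, the restructuring turned flow like the first block below into the second. The increase/decrease helpers are placeholders here, just to show the shape of the change.

// Before: the predicate describes everything the item is *not*.
if items[i].name != "Aged Brie" && items[i].name != "Backstage passes to a TAFKAL80ETC concert" {
  decreaseQuality(items[i])
}

// After: the predicate is explicit about what it matches.
if items[i].name == "Aged Brie" || items[i].name == "Backstage passes to a TAFKAL80ETC concert" {
  increaseQuality(items[i])
} else {
  decreaseQuality(items[i])
}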

After this, I extracted one more method for managing expiry dates. I felt that I had a good understanding of how the existing system worked and the logic was clear in the codebase. Time for some bigger changes.

Restrictions

One of the key restrictions in the Gilded Rose Kata is that you’re not allowed to alter the Item class or the Items property. Without this restriction the refactor would have been relatively trivial, as you could just create different subclasses that represent the behaviour of each of the types of items. To do that, however, would have involved altering the Items property to instantiate the different item types, or turning the Item class into a factory that would return items of different types. So, I needed a way to move the rules for specific types of items away from the main calculation flow, so that new item types could be introduced without impacting the existing processes.

Behaviours

I decided to add an ItemBehaviour class that would act as a superclass for each of the items, to encapsulate the rules/logic for that specific item. The goal was to separate out the process of incrementing a day in the store from how each item changes as the days increment. Then I created subclasses (Brie, Sulfuras, BackStagePass) to override the default behaviour for each item that has unique logic. To surface this behaviour, a BehaviourFactory class was implemented that would return an ItemBehaviour based on the item’s name (not the approach I would have liked to implement, but not being allowed to change the Item class is quite a restriction).
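
The shape of this was roughly as below - a sketch rather than the repo’s exact code, with the overridden subclass bodies sketched a little further down.

// Default behaviour: the rules that apply to a plain item.
class ItemBehaviour {
  func updateQualityPreDateChange(_ item: Item) {
    if item.quality > 0 {
      item.quality -= 1
    }
  }

  func decrementSellDate(_ item: Item) {
    item.sellIn -= 1
  }

  func processExpiredItem(_ item: Item) {
    if item.sellIn < 0 && item.quality > 0 {
      item.quality -= 1
    }
  }
}

// Items with unique rules override just what differs.
class Brie: ItemBehaviour {}
class Sulfuras: ItemBehaviour {}
class BackStagePass: ItemBehaviour {}

// The factory maps an item's name to its behaviour, since Item itself can't be touched.
class BehaviourFactory {
  static func getBehaviour(_ item: Item) -> ItemBehaviour {
    switch item.name {
    case "Aged Brie": return Brie()
    case "Sulfuras, Hand of Ragnaros": return Sulfuras()
    case "Backstage passes to a TAFKAL80ETC concert": return BackStagePass()
    default: return ItemBehaviour()
    }
  }
}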

The behaviour was then integrated into the main application flow by replacing the if-statement/logic that takes place before the date change with a single call to updateQualityPreDateChange.

Next, the functionality to decrement the sales date was extracted from the sprouted method and placed into the specific item behaviour classes. And finally, the changes to process expired items were extracted out into the item behaviour classes.
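
Filling in a couple of those subclasses gives a feel for where the rules ended up - again an illustrative sketch rather than the exact code from the repo.

class Sulfuras: ItemBehaviour {
  // A legendary item: its quality and sell date never change.
  override func updateQualityPreDateChange(_ item: Item) {}
  override func decrementSellDate(_ item: Item) {}
  override func processExpiredItem(_ item: Item) {}
}

class Brie: ItemBehaviour {
  // Aged Brie improves with age, capped at a quality of 50.
  override func updateQualityPreDateChange(_ item: Item) {
    if item.quality < 50 {
      item.quality += 1
    }
  }

  override func processExpiredItem(_ item: Item) {
    if item.quality < 50 {
      item.quality += 1
    }
  }
}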

Done

The main body of the store updating function has been dramatically simplified. It now maps through all the items, retrieves the behaviour for an item, and calls the functions to advance the item through a day. Simples.

let _ = items.map { item in
  let itemBehaviour = BehaviourFactory.getBehaviour(item)
  itemBehaviour.updateQualityPreDateChange(item)
  itemBehaviour.decrementSellDate(item)
  itemBehaviour.processExpiredItem(item)
}

I believe that’s significantly easier to understand than the original code.

New Item

Now that the item behaviour is decoupled from the main day-end processing, adding a new item type is trivial. The new type gets added and then integrated into the factory. The only other thing that needs attention is the old acceptance test, as this item didn’t exist in the original output. Ideally the acceptance test would be updated (or retired) at this point, as the system behaviour has moved on, but for the purposes of this exercise I excluded the new item from the test.
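
For example, the “Conjured” items the kata asks for (they degrade in quality twice as fast as normal items) only need one more behaviour subclass and one more case in the factory. The sketch below assumes the item name used in the kata’s examples.

class Conjured: ItemBehaviour {
  // Conjured items degrade twice as fast as normal items.
  override func updateQualityPreDateChange(_ item: Item) {
    item.quality = max(0, item.quality - 2)
  }

  override func processExpiredItem(_ item: Item) {
    item.quality = max(0, item.quality - 2)
  }
}

// And in BehaviourFactory.getBehaviour:
// case "Conjured Mana Cake": return Conjured()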

Wrap Up

Legacy system maintenance is hard, but not impossible. You can always get a piece of business logic under test, and restructure messy code so it’s simple and understandable. Techniques such as the sprout method make it relatively easy to gain an initial understanding and make the code a bit clearer. Working Effectively With Legacy Code was a super-valuable resource, and covers many, many techniques to get an unwieldy codebase under control, both in terms of gaining understanding as well as getting code testable.

Personally, I found this kata a valuable experience in getting an existing system under control whilst maintaining existing behaviour. If you have a spare hour or so, I recommend giving it a go yourself.

Refactoring Hard to Test Code


As part of the Apprenticeship, I need to create an HTTP server. The purpose of the activity is to create a relatively complex, real-world system with TDD, following SOLID/clean code principles. It’s going well so far. There is a FitNesse spec for acceptance tests that the server needs to pass, and I have to create it in Java. I’m getting there.

However, I recently found that I was having to jump through some hoops to test my socket connection completion handler, and my mentor suggested this might be a code smell, and that I should look at refactoring the code to make it easier to test. As an aside, I’m using the “new” non-blocking IO socket classes rather than the traditional socket classes, so multithreading should be simpler as I shouldn’t need to manage thread pools.

The design I had come up with for handling HTTP requests and responses was quite tightly tied to the Java CompletionHandler generic interface, and specifically the completed method that receives AsynchronousSocketChannels. Testing it by generating fake AsynchronousSocketChannels felt super hacky. The code for dealing with processing requests and responses was also tied in with the channel handling code, so the responsibility of the class was not very clear.

public class HTTPCompletionHandler implements CompletionHandler<AsynchronousSocketChannel, Void> {
    ResponseFactory responseFactory;
    AsynchronousServerSocketChannel listeningChannel;

    public HTTPCompletionHandler(String rootDirectory, AsynchronousServerSocketChannel listeningChannel) {
        responseFactory = new ResponseFactory(rootDirectory);
        this.listeningChannel = listeningChannel;
    }

    @Override
    public void completed(AsynchronousSocketChannel ch, Void attachment) {
        listeningChannel.accept(null, this);
        String requestText = extractRequestText(ch);

        Request request = new Request(requestText);
        Response response = responseFactory.makeResponse(request);

        sendResponse(ch, response);

        closeChannel(ch);
    }

    @Override
    public void failed(Throwable exc, Void attachment) {

    }

    private String extractRequestText(AsynchronousSocketChannel ch) {
        ByteBuffer buffer = ByteBuffer.allocate(8192);
        byte[] requestBytes = null;
        try {
            int bytesRead = ch.read(buffer).get(20, TimeUnit.SECONDS);
            requestBytes = new byte[bytesRead];

            if (bytesRead > 0 && buffer.position() > 2) {
                buffer.flip();
                buffer.get(requestBytes, 0, bytesRead);
                buffer.clear();
            }
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            return new String(requestBytes);
        }
    }

    private void sendResponse(AsynchronousSocketChannel ch, Response response) {
        ch.write(ByteBuffer.wrap(response.getBytes()));
    }

    private void closeChannel(AsynchronousSocketChannel ch) {
        if (ch.isOpen()) {
            try {
                ch.close();
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }

}

Time to refactor

Below are a series of steps that I went through to refactor this class to ensure single-responsibility, and make it significantly easier to test the functionality.

Links to the steps as commits on GitHub are provided at the end. GitHub also contains the tests. 100% TDD baby!

Create interfaces

First of all, I needed to break the dependency between the HTTP Request/Response code and the AsynchronousSocketChannel code. I introduced an abstraction for reading and writing data.

public interface ByteReader {
    byte[] read();
}
public interface ByteWriter {
    void write(byte[] bytes);
}

I could have put these in a single interface, but splitting them gives easier support for different implementation models, such as reading from a socket, and writing out to a file.

Concrete implementations

Then, implement classes that conform to these interfaces and use an AsynchronousSocketChannel to read from and write to.

public class AsynchronousSocketChannelReader implements ByteReader {

    AsynchronousSocketChannel channel;

    public AsynchronousSocketChannelReader(AsynchronousSocketChannel channel) {
        this.channel = channel;
    }

    @Override
    public byte[] read() {
        ByteBuffer buffer = ByteBuffer.allocate(8192);
        byte[] requestBytes = null;
        try {
            int bytesRead = channel.read(buffer).get(20, TimeUnit.SECONDS);
            requestBytes = new byte[bytesRead];

            if (bytesRead > 0 && buffer.position() > 2) {
                buffer.flip();
                buffer.get(requestBytes, 0, bytesRead);
                buffer.clear();
            }
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            return requestBytes;
        }
    }
}
public class AsynchronousSocketChannelWriter implements ByteWriter {
    private AsynchronousSocketChannel channel;

    public AsynchronousSocketChannelWriter(AsynchronousSocketChannel channel) {
        this.channel = channel;
    }

    @Override
    public void write(byte[] bytes) {
        channel.write(ByteBuffer.wrap(bytes));
    }
}

Integrate these back in

The next step is to start to use these within the HTTPCompletionHandler class. The bulk of the extractRequestText and sendResponse functions can also be dropped from the HTTPCompletionHandler.

...

    private String extractRequestText(AsynchronousSocketChannel ch) {
        ByteReader reader = new AsynchronousSocketChannelReader(ch);
        return new String(reader.read());
    }

...

    private void sendResponse(AsynchronousSocketChannel ch, Response response) {
        ByteWriter writer = new AsynchronousSocketChannelWriter(ch);
        writer.write(response.getBytes());
    }

...

The code in sendResponse at this interim step has grown longer, but what we’ve actually done here is enable a very clean break in the dependency between the HTTP request/response processing, and the AsynchronousSocketChannel, as we’ll see in the next step.

Extract HTTP Request/Response handling

Now, we need to move the HTTP Request and Response handling out of this completion handler. The completion handler will soon have a very specific responsibility: providing an interface adapter between AsynchronousSocketChannels and the code that processes their data. The reading, writing, and orchestration of the data flow will no longer be part of that class’s responsibility.

We’re creating a new class called HTTPHandler. This class will take the configuration required (a directory) on instantiation, and for each request it needs to process it’ll receive a ByteReader and ByteWriter.

public class HTTPHandler {
    ResponseFactory responseFactory;

    public HTTPHandler(String rootDirectory) {
        responseFactory = new ResponseFactory(rootDirectory);
    }

    public void run(ByteReader reader, ByteWriter writer) {
        String requestText = extractRequestText(reader);

        Request request = new Request(requestText);
        Response response = responseFactory.makeResponse(request);

        sendResponse(writer, response);
    }

    private String extractRequestText(ByteReader reader) {
        return new String(reader.read());
    }

    private void sendResponse(ByteWriter writer, Response response) {
        writer.write(response.getBytes());
    }
}

The beauty of creating a class like this, where the reader and writer are abstract and provided to it, is that it’s not dependent on any particular type of Socket, Stream, or Buffer.

Fake Reader and Writer for testing

The interface for these two types is very simple, and very, very, very easy to create Fake versions of to test.

public class FakeReader implements ByteReader {

    private byte[] byteData;

    public FakeReader(String data) {
        byteData = data.getBytes();
    }

    @Override
    public byte[] read() {
        return byteData;
    }
}

public class FakeWriter implements ByteWriter {

    private byte[] byteData;

    @Override
    public void write(byte[] bytes) {
        byteData = bytes;
    }

    public byte[] readWrittenBytes() {
        return byteData;
    }
}

Remove HTTP functionality from CompletionHandler

We can drop the extractRequestText and sendResponse methods from the HTTPCompletionHandler altogether, as these are now provided by the HTTPHandler.

The relevant parts of the HTTPCompletionHandler will change as below.

public class HTTPCompletionHandler implements CompletionHandler<AsynchronousSocketChannel, Void> {
    HTTPHandler handler;
...

    public HTTPCompletionHandler(String rootDirectory, AsynchronousServerSocketChannel listeningChannel) {
        responseFactory = new ResponseFactory(rootDirectory);
        handler = new HTTPHandler(rootDirectory);
        this.listeningChannel = listeningChannel;
    }

    @Override
    public void completed(AsynchronousSocketChannel ch, Void attachment) {
        listeningChannel.accept(null, this);

        ByteReader reader = new AsynchronousSocketChannelReader(ch);
        ByteWriter writer = new AsynchronousSocketChannelWriter(ch);
        handler.run(reader, writer);

        closeChannel(ch);
    }

...
}

Complete

The completed method now only instantiates a ByteReader and ByteWriter that can handle AsynchronousSocketChannels, and asks the instantiated handler to begin processing with the reader and writer.

The concrete ByteReader and ByteWriter now encapsulate the logic for reading from and writing to AsynchronousSocketChannels, rather than this being tangled up with dealing with HTTP requests and responses.

The HTTPHandler now supports any sort of interface that can conform to ByteReader and ByteWriter for receiving and sending HTTP content.

And, all classes are small, all methods are small, and due to the interface abstractions, super simple to test.

End

Refactoring is never really over, but I hope this has been a useful example of how you could refactor a class to

  • extract dependencies
  • provide a simpler API
  • ensure single-responsibility, and
  • make functionality easier to test

If you want to see the whole refactor as a single diff, check out this commit, or if you’re interested in seeing each individual step, check the last 8 commits on this branch.

Software Apprenticeships


We have trouble hiring developers. The number of people in Australia with Scala/FP experience is small, and the number of high-quality developers with this experience is even smaller. It also costs us a huge amount of money to hire new staff. Recruiters’ fees are massive, and international candidates attract giant work visa application fees. We typically also only hire Senior Developers with 10+ years of commercial experience - people who can lead projects, architect high-performing systems, and understand not only software development but all the auxiliary systems as well. Our staff need to be as comfortable with code as they are engaging with clients and stakeholders, presenting at user groups, mentoring, training etc. We need fully rounded software professionals.

We can’t find them. Or at least, it’s becoming harder to find them. We’ve tried to sort this out on the demand side - we have open positions with very competitive salaries, in one of the most stunning cities in the world. Our teams work on interesting greenfields projects, with a heavy functional focus. With all this, we’re still struggling to find people.

So, how can we solve this problem? We’re going to try to create these sorts of professionals, or at least help create them - experimenting on the supply side of this equation. Apprenticeship programmes can be a way of quickly upskilling motivated, capable, junior and mid-level devs who show great promise. We’re currently putting two of our staff (I’m one of them) through a programme so we can get hands-on experience of what it’s like, and see how applicable something like this might be to our business. 8th Light are administering the programme for us.

The apprenticeship programme will cover the fundamentals of software development - both OO and FP styles. There will be a focus on developing software in a maintainable, understandable, clean style. It’ll cover building scalable, robust systems. But we don’t want people to come out of this only knowing how to code. We need them to have all the other skills a true software professional has. Skills such as team leadership, the ability to design and architect simple and complex systems, understanding of infrastructure setup and costs, estimating projects, client communications, and project management. They’ll need to be able to perform the non-technical day-to-day activities as well - running standups, planning sessions, project retrospectives, and user story analysis. Our apprentices will need to be able to engage with communities, so things like maintaining blogs and speaking at user groups and conferences are super important. And they’ll need to want to continue this education, so reading books, mentoring, and training will be included too. We’re looking to enable people to become true Software Professionals - not just “coders”.

We feel software apprenticeships have promise as a bridge between theoretical university CS (or self-taught) developers, and professionalism in software. We also feel that having a shared training experience across all consultants within the company will help increase internal trust in terms of quality and capability, as well as improve our already stellar marketplace reputation for high quality systems development and consulting.

I plan on writing occasionally on how it’s all going. Watch this space.

(BTW - if either the apprenticeships, or our work in general sounds like something you’d be interested in, get in touch with me on twitter! - @rickerbh)

Interface Segregation


The Interface Segregation principle (part of S.O.L.I.D.) is about, very simply, not making subclasses or clients implement interfaces they’re not concerned with. Robert Martin calls these ‘fat’ interfaces. They contain functions or methods that are unrelated to each other, and could be split out into more cohesive interfaces.

Forcing clients to implement interfaces they’re not concerned with causes unnecessary tight coupling of the client to the interface. If the interface changes, the client needs to reimplement/update itself even if it doesn’t use that specific interface function. The result is wasted development effort in maintaining unnecessary code for testing and implementation. This should be avoided.

Issues with interface definition can arise in languages that support inheritance, subtype conformance, or concepts like Interface or Protocol. Specifically, issues are more likely to occur when an object or type can only inherit from/implement one super-class or protocol/interface, as with inheritance in (most) object-oriented languages. C++ is a notable exception here, with support for multiple inheritance and implementation of protocols/interfaces via abstract base classes with pure virtual functions. The majority of languages I’ve seen that support the concept of interfaces also support multiple interface inheritance. This is supported in Swift, Java, and Objective-C. Ruby can support this via the include statement, although duck typing removes the formal need for this definition - same with Python. Haskell supports this via type class conformance.

Show me some code

Here’s a contrived example of an interface that tries to do too much. It’s in Swift, but it should be understandable. Let’s say we’re modelling animals.

protocol Animal {
  var species: String { get }
  var legs: Int { get }
  func speak() -> String
  func birth() -> Animal
}

And we define a couple of animals.

class Dog: Animal {
  var species = "Canis lupus familiaris"

  var legs = 4

  func speak() -> String {
    return "Woof"
  }

  func birth() -> Animal {
    return Dog()
  }
}

class Cat: Animal {
  var species = "Felis catus"

  var legs = 4

  func speak() -> String {
    return "Meow"
  }

  func birth() -> Animal {
    return Cat()
  }
}

This all seems fine, but comes unstuck when we attempt to model something Oviparous, or a sterile hybrid.

class Chicken: Animal {
  var species = "Gallus gallus domesticus"

  var legs = 2

  func speak() -> String {
    return "Cluck"
  }

  func birth() -> Animal {
    // Chickens have eggs, not chickens. 
    // And they don't "birth" them. They lay them.
  }
}

class Mule: Animal {
  var species = "Equus asinus x Equus caballus"

  var legs = 4

  func speak() -> String {
    return "Bray"
  }

  func birth() -> Animal {
    // Uh oh, Mules are typically sterile. They can't reproduce.
  }
}

The interface for Animal forces all animals to be able to birth things, but not all animals do. To solve this, I see a couple of options.

  • We make the birth() function optional
  • We extract the birth() function out to a separate Protocol and compose multiple protocols together

Personally, I prefer extracting the function to a separate protocol. If we made the birth() function optional, anything that attempts to use it on an Animal would need to check it’s available before calling it, and consumers of any function that uses it could also be forced to deal with optional returns.

func birthSays(parent: Animal) -> String? {
  if let child = parent.birth?() {
    return child.speak()
  } else {
    return nil
  }
}

If the birth() function is extracted out, then:

  1. Chickens and Mules won’t need to implement the birth() method, and
  2. We can typecheck methods so we don’t need optionals, in languages that support this construct.
protocol Animal {
  var species: String { get }
  var legs: Int { get }
  func speak() -> String
}

protocol Egg {
  func hatch() -> Animal
}

protocol Viviparous {
  func birth() -> Animal
}

protocol Oviparous {
  func lay() -> Egg
}

class ChickenEgg: Egg {
  func hatch() -> Animal {
    return Chicken()
  }
}

class Chicken: Animal {
  var species = "Gallus gallus domesticus"

  var legs = 2

  func speak() -> String {
    return "Cluck"
  }
}

extension Chicken: Oviparous {
  func lay() -> Egg {
    return ChickenEgg()
  }
}

func birthSays(parent: Viviparous) -> String {
  return parent.birth().speak()
}

You can see above that the Chicken is no longer required to implement birth. Through conformance to multiple specific/detailed protocols, it only needs to support the functions and properties that make sense for that specific class. This splitting of protocols also ensures that we can typecheck inputs to functions, reducing the need for boilerplate code performing nil checks on optionals.

Surprising Usage

To illustrate another benefit of small interfaces, we consider the relationship between parents and children. If the relationship between two entities is abstracted out and made generic, we can think of it as a Node in a Graph, with a parent (node), and multiple children (other nodes).

protocol Node {
  var parent: Node { get }
  var children: [Node] { get }
}

With this view of a Node, we can model families of Viviparous animals. A Dog can return its children, and they can reference their parents. This Node, however, can also be reused for any directed graph, such as dependencies between different software libraries. If a client implements a function to produce a family tree of Dogs via the Node interface, the exact same code can be reused to produce a tree of library dependencies, as it’s based on the generic Node, not Animal.

hat tip to @triggerNZ for this example
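
As a small illustration of that reuse (my own sketch, not part of the original example), a traversal written once against Node doesn’t care whether the nodes are Dogs or libraries.

// Counts every descendant reachable from a node. Works for a family tree of Dogs
// or a graph of library dependencies - anything that conforms to Node.
func countDescendants(of node: Node) -> Int {
  return node.children.reduce(0) { total, child in
    total + 1 + countDescendants(of: child)
  }
}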

Interface Segregation is one of the S.O.L.I.D. principles (the I). By ensuring that your interfaces are small, targeted, and cohesive, you simplify implementation for clients. Clients won’t be required to implement interfaces that don’t make sense in the context of their object. Your interfaces also have greater opportunities for reuse, due to being more composable. And clients will be forced to change less, as only changes that impact their operation will need to be managed, rather than interface changes that they don’t care about.

Wrapping 3rd Party Code


I’ve been integrating a system I’ve been developing with a third party service for data synchronisation. We’re looking to synchronise tasks out to systems like Pivotal Tracker, Jira, Trello etc, but were unsure which of those systems we would actually use. I’m currently reading Clean Code, and there’s a really interesting and relevant chapter on “Boundaries”. It covers a similar scenario to what we were facing: Using Third-Party Code.

Isolation

Generally, you should wrap any third party code that you’re dependent on with your own interfaces for that code. That way you get to define how your main application logic interfaces with the third party code, rather than having to bend your application to conform to a third party library, API, or application’s structure. Isolating the complexities of dealing with third parties to a particular class or module, behind an interface that you control, also allows you to:

  1. deal with changes to that third party (e.g., API upgrade)
  2. swap out that integration with another one

If we were building direct integration (with a project management tool like Jira, Pivotal Tracker, Trello) into our application, the models and interfaces of the third party system would leak into our core system.

Our application has support for Projects, Stories and Epics. Pivotal Tracker models these entities as well, but the relationship between Stories and Epics is based on a labelId assigned to the Epic at Pivotal. To attach a story to an epic we actually attach it to the label of the epic. Pivotal Tracker also handles story creation with labels/epics differently from story updates with labels/epics. Epics were introduced with v5 of the Tracker API. Previously, they didn’t exist as a concept.

const saveStoryCallback = (response, error) => {
  if (error) {
    handleError(error)
    return
  }

  const story = response.data
  // Perform other post-save actions (eg, update aggregated or count fields)
  ...

  // Sync with Pivotal
  let labelId = null
  if (story.epic) {
    const epicModel = story.epic.syncData
    if (epicModel) {
      labelId = pivotal.getEpic(epicModel.id).labelId
    } else {
      const data = pivotal.createEpic(story.epic.pivotalModel)
      story.epic.syncData = data
      labelId = data.labelId
    }
  }

  const storyModel = story.syncData
  if (storyModel && pivotal.getStory(storyModel.id)) {
    pivotal.updateStory(story.pivotalModel)
  } else {
    story.syncData = pivotal.createStory(story.pivotalModel)
  }

  if (labelId) {
    pivotal.attachLabel(story, labelId)
  }
}

Our save story callback will need to deal with the intricacies of the external system/library. If we wanted to swap this out for another tool (e.g. Trello) we’d need to completely rewrite the logic in our save story callback, which is intermingled with core application logic. If we wanted to model the same synchronisation process with Trello, the logic would be different because the external model is different. We’d model epics as a Board, and stories as a Card.

To avoid changes to our core application logic, we need to isolate the synchronisation logic behind an interface that represents entities in the language of our application, not the third party application.

const getStory = (story) => { ... }
const syncStory = (story) => { ... }
const deleteStory = (story) => { ... }
const syncProject = (project) => { ... }
const deleteProject = (project) => { ... }
const syncEpic = (epic) => { ... }
const deleteEpic = (epic) => { ... }

Then, we need to implement an adapter to interface with one of the third parties. Let’s say we’re interacting with Pivotal again:

import { syncEpic, syncStory } from "sync-adapter"

const saveStoryCallback = (response, error) => {
  if (error) {
    handleError(error)
    return
  }

  const story = response.data
  // Perform other post-save actions (eg, update aggregated or count fields)
  ...

  syncEpic(story.epic)
  syncStory(story)
}

Then in our sync-adapter:

const syncEpic = (epic) => {
  if (epic) {
    const epicModel = epic.syncData
    if (epicModel) {
      labelId = pivotal.getEpic(epicModel.id).labelId
    } else {
      const data = pivotal.createEpic(epic.pivotalModel)
      epic.syncData = data
    }
  }
}

If we need to swap out Pivotal for Trello, we can simply replace the contents of the sync-adapter with the implementation for the different provider. The core application callback won’t have to change if the interface is isolated in this way.

Multiple Third Parties

Let’s say in the future we have a requirement to support more than one external system. The existing isolation model between the callback and the sync-module still applies. We’d just need to inject another adapter in the middle of this flow.

  1. Rename sync-adapter to represent the specific external system it relates to: pivotal-adapter.
  2. Implement the appropriate adapter for the new external system: trello-adapter
  3. Implement a new sync-adapter that will interface with both of these modules.
import { syncEpic as syncPivotalEpic } from 'pivotal-adapter'
import { syncEpic as syncTrelloEpic } from 'trello-adapter'

const syncEpic = (epic) => {
  const syncData = epic.syncData
  if (syncData.service === 'pivotal') {
    syncPivotalEpic(epic)
  } else if (syncData.service === 'trello') {
    syncTrelloEpic(epic)
  }
}

The addition of more third parties can take place in the future without further changes to the core application logic. All the changes are pushed out to the boundaries of the systems.

Upgrades and Changes

If a third party changes their interface, or even their domain model, all changes will be isolated to the integration module alone. Your core application flow should be unaffected by the change, as the interface it interacts with should remain stable.

When Pivotal changed their API from v3 to v5, they introduced the concept of the Epic. Previously our application would have had epics internally, but the adapter would have converted epics to labels to support Pivotal’s model. With the changes for their v5 API, Epics became a first class citizen as far as they are concerned; we’d update the adapter, but our core application would not need to change.

Wrap Up

Isolating third party code via adapters provides benefits in terms of abstracting logic and complexity out of your main application flow. Your application will communicate with the third parties in a consistent manner via the interface, and not be forced to change when the third party does. It ensures that any future conceptual, model, flow/logic and interface changes in the third party only impact the code that deals with that third party. It gives you the ability to swap out your external dependencies with little to no impact on your core applications, as well as inject new functionality.

Function Arguments


In Clean Code it’s advocated that the ideal number of arguments for a function is 0. Then, 1, 2, and in very rare cases 3. No functions should take more than 3 arguments.

In general, I agree with this. Fewer function arguments means fewer separate parts, which should aid readability (and understanding). There are a couple of complexities, though, that I’d like to run through here. One is levels of abstraction, and the other is testing.

Abstraction

When a function has 0 arguments, it can only do 3 things:

  1. It can call other functions
  2. It can access data (object state) from itself and return it.
  3. Nothing - if it doesn’t call other functions, or query object state, then why bother calling it?

A useful niladic function is effectively a wrapper for other code, providing a higher level of abstraction. For example, if we wanted to get a set of data from a report, in a particular format, we could call functions as below.

extract_report_variables_as_json()

# vs

extract_report_variables_from(a_report, FORMAT_JSON)

The niladic (0 argument) version reads nicely, and seems simple to understand on the surface. It also provides a higher level of abstraction than the function where you have to provide the report to extract the data from, and the format to return the data in. This (in my mind) is a double-edged sword. The extract_report_variables_as_json() function completely hides where it gets data from, and how the formatting request is passed in (and potentially what other formatting options there might be).

def extract_report_variables_as_json(self):
    return extract_report_variables_from(self.report, FORMAT_JSON)

It’s also unclear if there are side effects from this. To figure that out, you need to delve down into the functions to bottom out what they’re all doing. For example, to generate the JSON format extract, it may write the data to disk, and this may cause exceptions due to disk space or permissions problems. You could receive an error seemingly unrelated to the task you’re trying to perform, due to side effects happening inside an abstracted function.

In the OO world, the general takeaway is that this is a perfect situation. The object provides abstract functions, and they hide the complexity of what is happening under the covers. The main issue I have with niladic functions is the tying of the function to particular state in the object - e.g., in our example above the report used will always be self.report. To avoid this, the developer has to repeat the abstraction to generalise the functions, all the way down. The developer also has to consider which abstract, and which more detailed (generalised), functions their object might like to expose, and set appropriate access controls on these functions. For this to happen in reality requires diligence and dedication from the developer to their craft. They must continually strive for clean code - to wrap and abstract out functions. This is a good thing, but not something that’s always done by default.

Testing

Testing functions has (I believe) an interesting effort curve, depending on the number of arguments in the function, as well as some more fundamental constructs of the language you’re developing in.

Niladic Functions

With niladic functions, you may have to perform more test setup around the function to be able to test it effectively. When a function takes no arguments, you have to setup the data elsewhere for it to operate on. If your function operates on the data it’s passed, then you only need to deal with that function to test it. I consider (in general) the test creation effort for niladic functions to be greater than for functions that take arguments.

Functions with 1+ arguments

I believe the easiest function to test is one with one argument. You can pass in the data you want the function to operate on, receive the result, and check it. This is also true for functions with more than one argument, but the problem with these is that the permutations of possible arguments explode. If you’re looking to exhaustively test a function, you need to multiply the possible values for each argument together. Let’s say you have a function that takes a single boolean. This has 2 possible values you can pass in to it, so it is relatively simple to test exhaustively. If you have a function that takes 3 boolean arguments, then you have 8 possible permutations of the arguments (2 x 2 x 2). Things get crazy when we start looking at other data types such as String or Int. How many different values can you have for a String?

Typed Arguments

One other issue for consideration is whether the language you’re using supports typed arguments. If it does, a function can define (and ideally enforce) the type of data it can accept. In languages that don’t support this, your function could receive data of a different type to what it was expecting.

format = "My Custom Format"
report_data = True
extract_report_variables_from(report_data, format)

To generate tests to exhaustively test this is virtually impossible. You won’t practically be able to create all the different permutations of options available.

How to solve?

I think there is a way to get a level of confidence, via testing, in your functions that receive arguments. Unit tests can cover the expected, common scenarios to make sure that the function behaves correctly under normal conditions. This should always be done.

If your function can support it (and your language has appropriate libraries), Property Based Testing can provide a level of confidence that your function can handle all other scenarios. I’m not going to go into it in depth here, but in general, you construct test specifications that will randomly generate test data and pass it to your function. They will repeat this over and over again a number of times, and determine if your function operates as expected in a wider range of scenarios than is humanly possible to code manually. This should give confidence that your function works correctly, even in the face of misuse.
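
To make the idea concrete, here’s a hand-rolled sketch of the technique in Swift. A real property-based testing library adds shrinking of failing cases and smarter generators, but the shape is the same, and the sorted-array property is just an example of mine rather than anything from the discussion above.

import XCTest

class SortPropertyTests: XCTestCase {
  func testSortedOutputIsOrderedAndSameLength() {
    // Run the property against many randomly generated inputs.
    for _ in 0..<1000 {
      let input = (0..<Int.random(in: 0...50)).map { _ in Int.random(in: -1000...1000) }
      let output = input.sorted()

      // Property 1: sorting doesn't add or drop elements.
      XCTAssertEqual(output.count, input.count)

      // Property 2: every element is <= the one that follows it.
      for (a, b) in zip(output, output.dropFirst()) {
        XCTAssertLessThanOrEqual(a, b)
      }
    }
  }
}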

A Small Functional Refactoring of Javascript


I’m working on a web based productivity application at the moment, and have been modifying some old code. This code deals with synchronisation of data with external services, and storage of metadata about the synchronisation of that data. The application had quite a bit of duplication in dealing with this metadata; specifically in extracting data from the stored structures. In the interests of having very DRY Javascript, it was time to refactor.

The configuration/metadata structures typically look like

{
  ...
  config: [{ key: 'attribute-name', value: { value: 'attribute-value' }}],
}

Sidenote: value is embedded in value because the top-level value item actually receives an object to store, so other fields can be added in the future

The metadata object was then optionally attached to other user-entered data entities, and queried when updates that require synchronisation are triggered.

const containerObject = {...
  config: [
      { key: 'itemId', value: { value: 12345 }}, 
      { key: 'itemState', value: { value: 'active' }}
    ],
};

The application was already using RamdaJS to extract data from these types of structures. To query, there was a whole bunch of duplicated code that traversed the containing objects, and extracted data. Example below.

// Extract config object, maybe.
const syncData = R.propOr({}, 'config')(containerObject);
// Find a structure with a specific key value
const itemSyncData = R.find(R.propEq('key', 'itemId'), syncData) || {};
// Extract the value from the entity with that key
const itemId = R.path(['value', 'value'], itemSyncData);

To DRY all this up, I looked at the possibility of using Currying and Partial Application to help me define generic extraction functions, and reuse them.

First of all, I turned the above into a composed function, with the results of one step flowing as the inputs to the next.

// compose reads from the bottom up, like you're feeding in the object from
// the end and it's consuming it, right to left.
const composedFunction = R.compose(
  R.path(['value', 'value']),
  R.find(R.propEq('key', 'itemId')),
  R.propOr({}, 'config')
); 

// How to use?
const itemId = composedFunction(containerObject);

However, this only supports config objects in the container, and itemId keys inside that. We can make this more generic.

const composedGenericFunction = (data, key) => R.compose(
  R.path(['value', 'value']),
  R.find(R.propEq('key', key)),
  R.propOr({}, data)
); 

// How to use?
const itemId = composedGenericFunction('config', 'itemId')(containerObject);
const itemState = composedGenericFunction('config', 'itemState')(containerObject);

Better, because we get more reuse, but we’re still repeating ourselves with the definition of the attribute that houses the config. So, let’s make our function even more reusable with currying and partial application.

const composedGenericCurriedFunction = R.curry((data, key) => R.compose(
  R.path(['value', 'value']),
  R.find(R.propEq('key', key)),
  R.propOr({}, data)
)); 

// How to use?
const configGetter = composedGenericCurriedFunction('config'); // Partial Application
const itemId = configGetter('itemId')(containerObject);
const itemState = configGetter('itemState')(containerObject);

We can take this a step further with something like below, and generate a getter that will retrieve the itemId from different containerObjects.

const idGetter = configGetter('itemId');
const idOne = idGetter(containerObject);
const idTwo = idGetter(containerObjectTwo);

We can also extract fields from other objects that conform to the same structure, but aren’t under a config key.

const syncDataGetter = composedGenericCurriedFunction('syncData');

So, with the use of currying and partial application with Ramda, we can create generic, reusable functions that are then used to generate other functions that we can use in our application. Super easy and effective way to DRY out your javascript.

Meteor, Heroku, and Bcrypt


Just had a very frustrating day. Was getting a deployment error on Heroku with a Meteor app I’m making.

Starting process with command `node build/bundle/main.js`
/app/build/bundle/programs/server/boot.js:324
}).run();
   ^
Error: Module did not self-register.
    at Module.require (module.js:365:17)
    at Module._compile (module.js:460:26)
    at Module.load (module.js:355:32)
    at Object.<anonymous> (/app/build/bundle/programs/server/npm/node_modules/meteor/npm-bcrypt/node_modules/bcrypt/bcrypt.js:3:35)
    at Error (native)
    at require (module.js:384:17)
    at Function.Module._load (module.js:310:12)
    at Object.Module._extensions..js (module.js:478:10)
    at bindings (/app/build/bundle/programs/server/npm/node_modules/meteor/npm-bcrypt/node_modules/bcrypt/node_modules/bindings/bindings.js:74:15)
    at Module.load (module.js:355:32)
Process exited with status 1
State changed from starting to crashed

I was using the build pack at https://github.com/srbartlett/heroku-buildpack-meteor due to its support for Meteor 1.3.

The issue seems to be that the npm-bcrypt atmosphere package doesn’t force a recompilation of the npm bcrypt package via node-gyp (I could be wrong here, but that’s what it seems like). It seems that the atmosphere package includes the wrong (or a fixed?) architecture, whereas what we actually need is to recompile for the current target. To fix this, I’ve altered the buildpack to remove the bundled bcrypt package, reinstall it from source, and copy it back to the bundled location. The reinstall from source seems to force node-gyp to compile for the correct architecture.

My altered buildpack is available at https://github.com/rickerbh/heroku-buildpack-meteor if anyone wants it.

Back to Basics - Editors


My boss bought me a copy of the seminal classic The Pragmatic Programmer. It’s full of sensible advice for both beginners and experienced software professionals.

I have lots of thoughts about the contents of this book, but what I’d like to cover is the usage of text editors, and specifically becoming proficient with an editor that’s typically universally available. I can think of 3 of these for UNIX systems: vi, emacs, and nano.

When I was at university I used emacs for all CS course text editing. In my first job, the UNIX servers we had didn’t have emacs installed. I wanted to install emacs, but the application vendor didn’t allow us to, as they had a support contract and hadn’t tested their application with emacs on the box, so no dice. vi was it for a few years. Then I just stopped using these types of editors for about 11 years. The advent of IDEs (Eclipse, Xcode) meant I didn’t have to use these “basic” editors and had a lot of point-and-click, plugin, and refactoring functionality available.

Until now.

The Pragmatic Programmer has convinced me to relearn basic (and powerful) UNIX tools, and I’m starting with emacs. In my job (at the moment) I typically do about 50-80% dev, the rest other stuff (sales, management, architecture, analysis). So I spend a fair amount of time in development environments. I’m not a big shortcut key user, and this needs to change - it’s just wasting time moving that mouse around to get to menu items and select text.

So far, I’ve started with base emacs installations, Haskell packages, git packages, Clojure packages, some convenience packages, and I’m about to start with some JavaScript packages. Also getting used to the keystrokes again is a bit tricky, but I’m getting there. To help out, I’ve moved other IDEs I use (like WebStorm) to use emacs bindings as well.

I’m enjoying the ability to use the same environment for text + 3 languages + code control. Super powerful.

If you’re interested in my setup, I have an emacs file in my dotfiles git repo. Check it out (ho ho ho).

If you have any tips for emacs please let me know in the comments or on Twitter - @rickerbh

Reading Not Reading


I recently finished a book called Apprenticeship Patterns (more on why later) and one of the things it suggested is keeping an online list of books you read, and books you want to read. So here is mine.

http://hamishrickerby.com/books/

I’m going to add a few more books to the read list as well - ones I think other people may be interested in.

If you’ve got any recommendations please let me know in the comments or on Twitter - @rickerbh