Leaky Abstractions

October 23, 2018 @ 8:56 am Posted to .Net by Antony Koch
As a software engineer, it’s important to understand what leaky abstractions are, how to spot them, and how to mitigate them. The term often goes unused by teams, most likely due to gaps in knowledge or training, or for fear of causing offence to colleagues. Understanding its core concepts will help you become a better developer, allowing you to spot potentially poisonous changes that may wreak havoc on your systems in the not-too-distant future. So what does the term mean? How can we spot it? What are some good examples? And when we do spot it in a codebase or architecture diagram, what mitigating actions can we take?
Put simply, a leaky abstraction is any API in your codebase or architecture that reveals too much information about its underlying implementation. One example most of us have seen is an API that returns its fields in all upper case letters, each of which matches the underlying database’s column name. Another is an API whose functions and parameters match precisely those of its underlying implementation. Good API design – and by API I mean any interface we use to communicate with a software program, not just an API in the sense of a RESTful service – shields us from the information we don’t need to know. It asks only for that which it needs, and hides any unsightly legacy software that might ultimately be serving up its functionality.
When implemented correctly, an abstraction holds up against any and all use cases you can throw at it. If you’ve missed something, the gap is usually easy to plug. Writing software to the right level of abstraction, avoiding leaks, is our toughest challenge as developers, but it also offers the greatest rewards. Developers love to be correct, and writing code or a design that holds up to any and all scrutiny feels great. But more than that, it proves that we have understood the problem and have created a solid solution that tackles it well enough for now.
One example comes from my current contract, where a third party introduced a leaky abstraction into one of the solutions I was working on. We had held meetings regarding service contracts, and had proposed a minimal contract consisting of an entity type and ID. Consequently, the HTTP API would need to do some fetching of resources under the covers, but in doing so it would shield the outside world from needing to know more than the bare minimum. This allowed existing systems in the architecture to consume the service without retrieving additional information in order to call the new API.
Unfortunately, the third party didn’t grasp this concept, and decided to introduce several additional fields. It’s at this point I should mention that this was an API over a well-known, big-name document management platform, the client’s system of choice for storing documents. The structure in which the documents were held involved some client metadata, some subsystem data from the client’s CRM system of choice, and one or two other friendly names from said system. These references, metadata and IDs were now all required as part of the API contract. The rationale was that the API shouldn’t need to fetch additional data from the CRM system ‘because it would be too slow.’
This set alarm bells ringing. A classic leaky abstraction: the document management system’s underlying file system was bleeding profusely out of their API. I set about trying to explain this to the third party. The issue was that any consumer of this API now needed to know a hell of a lot more than it ought to. Want to upload one attachment to a note on a back office incident? You now need to fetch customer details along with the parent hierarchy of IDs relating to the note you’re administering. These changes would permeate throughout the system.
I tried to explain that the ‘slow’ call made in the API would now be spread into tens, if not hundreds, of calls from outlying systems that need to fetch the aforementioned data. However, the third party – and unfortunately the client – simply did not get the concept. I had failed to appropriately explain the issue and its potential for cost. Several weeks later we wound up specifying a new piece of the solution which would cost the client a lot of money to implement, all because the leaky abstraction had not been plugged.
So how can you spot a leaky abstraction? Start with an empty API, and introduce a method or endpoint that has a requirement. Now what’s the smallest amount of information you can introduce that allows you to achieve the API’s goal? Make sure you understand whose job it is to know what information. In the above example, whose job is it to know the underlying folder structure of your document management platform? Is it a website that allows users to add notes to incidents, or the API whose job it is to shield users from such mundanity? Consider why you are building this API in the first place. You are attempting to build something of value: a streamlined way to access a capability. The most streamlined way possible is a clear, concise, and expressive API, with an ability to perform small, discrete tasks to manipulate its internal state while shielding the outside world from how you are manipulating that state.
Perhaps the best question we can ask, again using the example above, is: how would our API contract look were we to move away from the current document management platform? If we can say that our concepts and naming conventions stand up well, then we have a solid abstraction for document storage within our domain. If we realise we have parameters called parent folder or vnd_doc_sp_search_id, then we have a leaky abstraction.
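
To make the contrast concrete, here is a minimal sketch of the two contract shapes discussed above. Every name in it (LeakyUploadRequest, UploadRequest, DocumentService, the resolveFolder and save delegates) is hypothetical and invented for illustration; it is not the client’s actual API, only one way such a contract might look under those assumptions.

```csharp
using System;

// Leaky (hypothetical): every caller must know how the document store is laid out.
public sealed class LeakyUploadRequest
{
    public string CustomerId { get; set; }
    public string CrmAccountReference { get; set; }
    public string ParentFolder { get; set; }
    public string VndDocSpSearchId { get; set; }
    public byte[] Content { get; set; }
}

// Minimal (hypothetical): the caller only names the entity the document belongs to.
public sealed class UploadRequest
{
    public UploadRequest(string entityType, string entityId, byte[] content)
    {
        EntityType = entityType;
        EntityId = entityId;
        Content = content;
    }

    public string EntityType { get; }
    public string EntityId { get; }
    public byte[] Content { get; }
}

public sealed class DocumentService
{
    // Both delegates are assumptions: one looks up the storage location
    // (e.g. via the CRM), the other writes to the document platform.
    private readonly Func<string, string, string> resolveFolder;
    private readonly Action<string, byte[]> save;

    public DocumentService(Func<string, string, string> resolveFolder, Action<string, byte[]> save)
    {
        this.resolveFolder = resolveFolder;
        this.save = save;
    }

    // The 'slow' lookup happens once, here, instead of in every consumer.
    public Guid Upload(UploadRequest request)
    {
        var id = Guid.NewGuid();
        var folder = resolveFolder(request.EntityType, request.EntityId);
        save(folder + "/" + id, request.Content);
        return id;
    }
}
```

With the second shape, swapping the document management platform means changing the two delegates handed to DocumentService; no consumer of Upload needs to change.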

Taking a step back

April 12, 2017 @ 8:00 am Posted to Productivity by Antony Koch

Taking a step back is something we talk about doing, yet rarely find time to. We ponder things in between other thoughts, usually while moving from place to place or standing in the shower. Do these stolen moments offer the most benefit though? What are we missing when we’re lost in thought? Is there a more effective way to think about things?

In Zen Mind, Beginner’s Mind, Shunryu Suzuki says

Or you may say, “This is bad, so I should not do this.” Actually, when you say, “I should not do this,” you are doing not-doing in that moment. So there is no choice for you. When you separate the idea of time and space, you feel as if you have some choice, but actually, you have to do something, or you have to do not-doing. Not-to-do something is doing something.

I boil this down to the idea that if I am going to think about something, then let that be the thing that I am doing. Doing something tends to require some kind of output, be it notes, a decision, deferring a decision to a later date, or some other measurable. Passing thoughts in the shower which are soon forgotten yield no tangible output. When some thought takes you out of what you’re currently doing, and that thought feels important, note it down, carry on with the task at hand, and come back to the thought later.

This is an approach to productivity I have found massively useful. I use Evernote, and create a new note containing a table with two columns. One column is for the current focus of my attention, the other is for thoughts that arise outside the scope of my current focus. I use Egg Timer’s Pomodoro timer to manage my focused efforts, with 25 minutes devoted to items in the left column, then 5 minutes devoted to items in the right column. I do some brainstorming at the start, usually filling the left column with 90% of what I need to do. If anything arises in the mind that cannot fit into the left column, I stick it in the right-hand column. If an item cannot be done within the 5 minutes, it gets a new note and I tackle that thought process using pomodoros too.

This is taking a step back from my usual mental process of flitting from one thing to another, and instead using targeted attention to get more done in a shorter space of time. Give it a try. Let me know how you get on in the comments.


Conversations with Little People

April 11, 2017 @ 9:34 am Posted to Fatherhood by Antony Koch

There is something wonderful about having a complete conversation with one of your children. My eldest son is almost four and a half, meaning he’s now able to hold a brilliant conversation lasting several minutes. This evening’s topic was why George didn’t want to sleep in the top bunk of his bed. We bought bunk beds about 6 weeks ago with some money we came into, money that certainly belonged to the boys, so we opted to get a lovely big wooden bed. George loved being in the top at first; however, he’s since developed a fear of darkness and monsters, which has meant him sharing the bottom bunk with Charlie.

Tonight I talked to George about it, trying to get to the bottom of his fear. It turns out he was scared of, in this order:

  • Monsters
  • Dinosaurs
  • Ninjas
  • Pirates
  • Polar bears

I discussed each one in order:

  • Not real
  • Extinct
  • More important people to kill
  • Seldom found off the coast of Burgess Hill
  • Isn’t icy today

And he seemed to really take my comments on board. He’s now asleep in his bed, and as ever I am proud and humbled by fatherhood.


F# on aspnetcore: Escaping the framework

February 15, 2017 @ 2:24 pm Posted to .Net, dotnetcore, F# by Antony Koch

Mark Seemann has both blogged and talked about escaping the OO .Net Web API framework in order to use a more idiomatic functional style. This is achieved by providing a function per verb to the controller’s constructor, and replacing the IHttpControllerActivator:

open System.Web.Http.Controllers
open System.Web.Http.Dispatcher

type CompositionRoot() =
    interface IHttpControllerActivator with
        member this.Create(request, controllerDescriptor, controllerType) =
            if controllerType = typeof<HomeController> then
                new HomeController() :> IHttpController
            elif controllerType = typeof<DoesSomethingController> then
                // The controller receives its behaviour as a plain function.
                let imp x = x * x
                new DoesSomethingController(imp) :> IHttpController
            else
                invalidArg
                    "controllerType"
                    (sprintf "Unknown controller type requested: %O" controllerType)

Then in the startup for your app (global or Startup):

GlobalConfiguration.Configuration.Services.Replace(
    typeof<IHttpControllerActivator>,
    CompositionRoot())

This works great, and I love its honesty. It makes you feel the pain, to quote Greg Young, and in composing tight workflows in your composition root the ‘what’ of your domain is laid bare.

However, this won’t work in aspnetcore because it’s more Mvc and less WebApi, or – to use MS phraseology – more Web and less Http, meaning there’s no IHttpControllerActivator. The fix is simple, and aligned with the terminology: drop the ‘Http’! Instead, one registers an IControllerActivator instance with the aspnetcore DI framework, and the same results are achieved:

type CustomControllerActivator() =  
    interface IControllerActivator with
        member this.Create(c : ControllerContext) : obj =
            if c.ActionDescriptor.ControllerTypeInfo.AsType() = typeof<DoesSomethingController> then
                let imp x = x * x
                new DoesSomethingController(imp) |> box
            else   
                invalidArg "controllerType" "Cannot find controller"

        member this.Release (c : ControllerContext, ctrl : obj) =   
            ()

And in your Startup class:

    member this.ConfigureServices (services:IServiceCollection) =
        services.AddSingleton<IControllerActivator>(new CustomControllerActivator()) |> ignore

        services.AddMvc() |> ignore

Sorted!


A non-generic AutoFixture Create method

October 24, 2016 @ 8:03 am Posted to .Net by Antony Koch

Sometimes I need to dynamically generate test fixtures, but the test context doesn’t let me use generics, leaving me with only an instance of a Type.

I looked through the AutoFixture code and managed to find a reflection-friendly way to return me an object that can then be changed using Convert.ChangeType where necessary. Here’s the snippet:


// Create<T>(ISpecimenBuilder) is a static extension method, so Invoke needs no target instance.
typeof(SpecimenFactory)
    .GetMethods()
    .Single(x => x.IsStatic
        && x.IsGenericMethod
        && x.Name == "Create"
        && x.GetParameters().Length == 1
        && x.GetParameters().Single().ParameterType == typeof(ISpecimenBuilder))
    .MakeGenericMethod(type)
    .Invoke(null, new object[] { fixture });
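
The same reflection pattern can be wrapped in a reusable non-generic method. The sketch below runs without AutoFixture by using a hypothetical stand-in, GenericFactory, in place of SpecimenFactory; both GenericFactory and NonGeneric are names made up for this example, so treat it as an illustration of the technique rather than AutoFixture’s own API.

```csharp
using System;
using System.Linq;
using System.Reflection;

// Hypothetical stand-in for a library class that only offers a generic method.
public static class GenericFactory
{
    public static T Create<T>(object source)
    {
        return (T)Convert.ChangeType(source, typeof(T));
    }
}

public static class NonGeneric
{
    // Closes the generic method over a Type known only at runtime and invokes it.
    public static object Create(Type type, object source)
    {
        MethodInfo open = typeof(GenericFactory)
            .GetMethods(BindingFlags.Public | BindingFlags.Static)
            .Single(m => m.IsGenericMethod
                && m.Name == "Create"
                && m.GetParameters().Length == 1);

        // Static method, so the Invoke target is null.
        return open.MakeGenericMethod(type).Invoke(null, new[] { source });
    }
}
```

Calling NonGeneric.Create(typeof(int), "42") hands back the converted value boxed as an object, which is the same shape of result the AutoFixture snippet produces.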


Gauge your code’s adherence to the single responsibility principle

September 26, 2016 @ 7:40 pm Posted to .Net, OO by Antony Koch

During a routine ponderance on software engineering, I was thinking about a conversation recently in which we discussed how and when to copy and paste code. The team agreed that it’s OK to copy code once or twice, and to consider refactoring on the third occasion. This led me to wonder about the size of the code we copy, and how it might indicate adherence to the single responsibility principle.

For the uninitiated reader, the single responsibility principle – one of the first five principles of Object Oriented Programming and Design – can be succinctly summed up as:

“A class should have one reason to change”

The subject of many an interview question, often recited by rote yet often misunderstood, the core premise can – I think – be understood by talking about the code block, or class, at hand, in response to the question: “Tell me all the reasons you might need to edit this class?” Reciting responses such as these in one’s mind is all that is required:

  • “If we want to change the backing store, we have to change this class.”
  • “If we want to change the business rules for persisting, we have to change this class.”
  • “If we want to change the fields used in the response, we have to change this class.”

By considering the answers carefully, we can make an informed decision about whether to refactor, or whether we’re actually happy with what we have and are left with the option to change the class easily at a later date. That latter option is critical to becoming a better developer, because what we have might be acceptably incomplete, and refactoring might take an inordinate amount of time and fail to offer significant business value to justify the expense.

This said, is copying and pasting code OK? Mark Seemann wrote an excellent blog post on the subject – which I won’t attempt to better – suffice to say I agree, and that it’s OK to copy and paste under a certain set of circumstances. The primary concern is the tradeoff: to suitably generify code requires at most a deeper understanding of the abstractions in play, and at least the ability to introduce dependencies between classes and modules that might not have otherwise been required. A quick copy paste of code that’s unlikely to change is not going to kill anyone. It might introduce an overhead should the code’s underlying understanding change, however volatile concepts do not in the first place represent good candidates for copying and pasting.

Now to wistfully return to the subject at hand – how can we use copying and pasting to judge our code’s adherence to the single responsibility principle? Quite simply: if we can copy only a line or two, then the surrounding code within the method body is perhaps not doing as targeted a job as we might hope. If we can copy entire classes, we can say that we’ve adhered strictly to the core tenets of the single responsibility principle: this class has such a defined purpose that it can be lifted and shifted around the codebase with ease.

This means we can judge any of our code in a couple of ways: by answering the question “what reasons does this class have to change?”, and by being honest with ourselves about our ability to copy and paste the code into a different codebase without refactoring it. Would half of the class be thrown away? Would we have to change a bunch of code in order to fit a different persistence model, say copying from a SQL Server backed system to a system backed by Event Store? I think it’s an interesting idea, and definitely one I’m going to keep trying in the coming days.


A simple Dapper Wrapper

August 5, 2016 @ 10:28 am Posted to .Net by Antony Koch

If you need to sub out some Dapper functionality, and aren’t too worried about the specifics of the call, then I’ve crafted a nifty class you can use to perform just such a task.

In all its glory, the subbable IDapperWrapper, with its default implementation:


    using System;
    using System.Collections.Generic;
    using System.Data;

    public interface IDapperWrapper
    {
        IEnumerable<T> Return<T>(IDbConnection connection, Func<IDbConnection, IEnumerable<T>> toRun);
        T Return<T>(IDbConnection connection, Func<IDbConnection, T> toRun);
        void Void(IDbConnection connection, Action<IDbConnection> toRun);
    }

    public class DapperWrapper : IDapperWrapper
    {
        public IEnumerable<T> Return<T>(IDbConnection connection, Func<IDbConnection, IEnumerable<T>> toRun)
        {
            return toRun(connection);
        }

        public T Return<T>(IDbConnection connection, Func<IDbConnection, T> toRun)
        {
            return toRun(connection);
        }

        public void Void(IDbConnection connection, Action<IDbConnection> toRun)
        {
            toRun(connection);
        }
    }
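
Here is a rough sketch of how the wrapper might be subbed out in a test. The interface is repeated so the example compiles on its own; StubDapperWrapper and CustomerNames are hypothetical names invented for illustration, and the lambda marks where a real Dapper call would go.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

// Repeated from above so the sketch compiles on its own.
public interface IDapperWrapper
{
    IEnumerable<T> Return<T>(IDbConnection connection, Func<IDbConnection, IEnumerable<T>> toRun);
    T Return<T>(IDbConnection connection, Func<IDbConnection, T> toRun);
    void Void(IDbConnection connection, Action<IDbConnection> toRun);
}

// Hypothetical stub: hands back a canned result and never runs the query.
public class StubDapperWrapper : IDapperWrapper
{
    private readonly object canned;

    public StubDapperWrapper(object canned) { this.canned = canned; }

    public IEnumerable<T> Return<T>(IDbConnection connection, Func<IDbConnection, IEnumerable<T>> toRun)
    {
        return (IEnumerable<T>)canned;
    }

    public T Return<T>(IDbConnection connection, Func<IDbConnection, T> toRun)
    {
        return (T)canned;
    }

    public void Void(IDbConnection connection, Action<IDbConnection> toRun) { }
}

// Hypothetical consumer that depends on the wrapper rather than on Dapper directly.
public class CustomerNames
{
    private readonly IDapperWrapper dapper;
    private readonly IDbConnection connection;

    public CustomerNames(IDapperWrapper dapper, IDbConnection connection)
    {
        this.dapper = dapper;
        this.connection = connection;
    }

    public int Count()
    {
        // In production this lambda would hold the Dapper query; the stub never invokes it.
        Func<IDbConnection, IEnumerable<string>> query =
            c => throw new NotSupportedException("replace with a real Dapper query");
        return dapper.Return(connection, query).Count();
    }
}
```

Because the stub ignores both the connection and the query, the test needs no database: new CustomerNames(new StubDapperWrapper(new[] { "Ada", "Grace" }), null).Count() simply counts the canned names.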


How can one ‘keep it simple’ in a complex system?

June 5, 2016 @ 8:56 pm Posted to .Net, Tech by Antony Koch

The often-used phrase “Keep it simple, stupid,” abbreviated KISS, is solid advice in the field of software development. We strive for simplicity, planning and refactoring continuously to ensure our code is extensible, reusable, and all those other words ending in ‘ble’ that apply. But what is simple? And how can things be kept simple in a complex system with several complex domains?

Simple is subjective. To some it means few moving parts, to others it means code that reads like a book, even if it’s repeated in several places. Disagreements can easily arise when deciding what simple means. Keeping an open mind is critical with regards to the definition of simplicity, because all sides can have valid arguments.

Complexity, like simplicity, is also subjective. Code can be composed in a complex way, however the component parts may be implemented simply. So can only simple things be over complicated?

There’s a saying: work smart, not hard, that applies to software development more deeply than in many other domains. Finding smart solutions usually means less code, fewer moving parts and a more concept-based approach.

To some developers a concept-based approach can be perplexing: simplicity masquerading as complexity that remains obscure until scrutinised further. It can also, however, be over-engineered, with six classes used where one might have sufficed until a later date.

Does this mean those developers who find smart solutions complex aren’t up to scratch, or should the codebase cater to the needs of the team and be legible to all? In my opinion, no. Smart trumps legibility every time, for the simple reason that legibility is subjective and based on the abilities of the reader. Some are baffled by lambdas and some aren’t. This doesn’t mean teams should avoid lambdas, it means teams should shift dead weight.

All of this raises the question: can something that seems complex always be reduced to something simple? In most cases, yes. A video I watched (which I will need to find later as the author escapes me) stated that in most cases he could walk into a company and reduce a code base by 80%. That is to say that a 100,000 line codebase could be reduced to roughly 20,000 lines of code.

Part of the reason this is, in my eyes, true is that teams wilfully introduce technical debt, qualifying its introduction with ‘We’ll fix it if we need to later’. This is a flag to me that says “we know we aren’t doing it properly.” This does not run counter to Ayende’s JFHCI; in fact it works with it: work smart, not hard. His example of hard coding is not an introduction of technical debt, it’s a forward-thinking solution with a minimal down payment now.

So how do we keep things simple in complex domains? Here’s how it can be done:

Limit your abstractions

Don’t introduce a phony abstraction in order to make it mockable. If you see an IFoo with a single implementation Foo, you’re overcomplicating and you’re missing the point of interfaces in the first place. Code should be written to concepts as per Ayende’s limit your abstractions post.

Test outside in

Test your components using their public API. Don’t test a component’s internals, because you’re then testing implementation. This brings two benefits:

  • Get the internals working correctly, quickly, with minimal fuss and with good test coverage
  • Once complete, it allows you to refactor into any concepts you may have uncovered along the way.
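
As a small illustration of testing outside in, consider the sketch below. Basket and its discount rule are hypothetical, invented purely for this example; the point is that the test only touches Add and Total, so the private helper can be renamed, inlined, or extracted into its own concept without the test changing.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical component: the public API is Add and Total, nothing else.
public class Basket
{
    private readonly List<decimal> prices = new List<decimal>();

    public void Add(decimal price)
    {
        prices.Add(price);
    }

    public decimal Total()
    {
        return ApplyBulkDiscount(prices.Sum());
    }

    // Internal detail: three or more items earn 10% off. Not tested directly.
    private decimal ApplyBulkDiscount(decimal subtotal)
    {
        return prices.Count >= 3 ? subtotal * 0.9m : subtotal;
    }
}

public static class BasketTests
{
    // Exercises the behaviour through the public API only.
    public static void ThreeItemsEarnTenPercentOff()
    {
        var basket = new Basket();
        basket.Add(10m);
        basket.Add(10m);
        basket.Add(10m);

        if (basket.Total() != 27m)
        {
            throw new Exception("Expected a total of 27 after the bulk discount");
        }
    }
}
```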

Work smart, not hard

Highly focussed components with specific jobs connected in a smart way. Some people might not understand them; it’s your job to enlighten them. If they still don’t understand it, cut them loose and hire someone who does. The inverse is true too, though — if everyone disagrees with your code it’s either wrong and you need to learn, or it’s right and you need to leave.


The Purpose of Point-based Estimates

December 16, 2015 @ 8:06 am Posted to Agile by Antony Koch

It’s easy to forget precisely what the purpose of point-based estimates is, often resulting in attempts to equate them to time. However, that’s not what they’re for.

Point-based estimates using the Fibonacci scale, t-shirt sizes, and any other method of measuring relative complexity are tools to help the business prioritise a backlog. These finger-in-the-air estimates are useful insofar as they can provide a crude method of deciding whether to tackle three simpler tasks or one more complex one in the upcoming sprint. This represents the limit of their usefulness. Beyond that these estimates hold no value, especially when attempts are made to attribute a period of time to them.

The time a story takes does not correlate to its original points value. Traditional burndowns work in the sense of points-per-sprint, but there is no basis for turning these into a real period of time: they merely highlight a team’s ability to reliably compare complexity to a base story’s estimate.

Smalls, mediums and larges will blur into each other when time is taken into consideration. Metrics off the back of finger-in-the-air estimates only compound incorrect thinking in the upper echelons of the business as to a team’s ability to deliver production code.

If you need to know hours because you’ve a deadline, or you’re billing out to a client, then sprint planning is the place for estimates in half days or above. The stories you choose to plan are in the upcoming sprint because of your finger in the air point estimates, but now it’s time to get into the nitty gritty and figure out just how long this thing will take. These discussions often make stories considerably more or considerably less complex when compared to their original estimate.

In summary, if you need to know how long a story will take in hours, get your best guys to estimate in half day chunks. Don’t use points.


Dive into open source

November 10, 2015 @ 2:05 pm Posted to OS by Antony Koch

I had often wondered how to get into contributing to open source software. What are the rules? What is the etiquette?

Then one day, I realised that I knew the answer to both those questions:

What are the rules? Be nice. Don’t be a dick. Follow the standards.

What is the etiquette? Follow the above rules. People are nice and are there to help so long as you have tried your hardest at obeying the above rules.

I’ve now created PRs for a very small number of projects. Seeing my name as a contributor, though, feels great. It also looks great to prospective clients.

Short update, but the crux of it is to just do it. Fail, retry, succeed.
