July 24, 2019
Have you ever had to jump into a codebase you’ve never seen before and start being productive immediately? As a freelancer/consultant-y type, this is pretty much what I do for a living and it can be horrendously painful. To the point where I now automagically build in time and an “audit” to my initial client estimates.
And then sometimes it’s wonderful. Or at least mostly not painful. Getting up and running is relatively straightforward. There’s well-written code and there are some decent tests.
We’re writing this code for other humans and we need to remember we have these higher-level programming languages to make it easier for us and our colleagues to read, write, and maintain. When I first jump into a codebase and when I work in one, here are some of the qualities I look for.
How you name classes, functions, and variables, how you structure your code, and the complexity of it all play an important part in communicating what is happening. However, these don’t necessarily communicate why or give any context about it.
Have you ever looked at a codebase and seen something like this? Or was this you? Please don’t do this!
for i in list do
  process(i)
end
The name of a class, function, or variable should tell us exactly what it is. If you need to add a comment to explain it further, it quite possibly doesn’t reveal its intention (see Clean Code). In the example above, I have no idea what list is a list of, what i is, or what process actually does!
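As a rough sketch of the alternative, here is the same loop with intention-revealing names. The invoice domain and the names Invoice, overdue_invoices, and send_payment_reminder are hypothetical, made up purely for illustration; the original snippet gives no hint of what it actually processed.

```ruby
# Hypothetical domain: the original loop gives no clue what it handled.
Invoice = Struct.new(:number, :days_overdue)

# Builds the reminder text for one invoice (a stand-in for real delivery).
def send_payment_reminder(invoice)
  "Reminder: invoice #{invoice.number} is #{invoice.days_overdue} days overdue"
end

overdue_invoices = [Invoice.new("INV-1", 12), Invoice.new("INV-2", 45)]

# The names now say what the collection holds and what happens to each item.
overdue_invoices.each do |invoice|
  puts send_payment_reminder(invoice)
end
```

Nothing about the behavior changed from the original loop. Only the names did, and now a reader can guess what the code is for without opening another file.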
When we slack on our naming, it is difficult to figure out what’s going on. We normally read from top to bottom, but as we read and consume code, our brains have to dig into each layer from class to function to variable and the different possible code paths (the nearly infinite number of code paths) and then retain it all as we start hopping back out of the stack to the original functionality we were trying to understand. If we’re trying to keep track of variables like list and functions named process() and what our overly complex, poorly structured code is doing, we’re making our job so much more difficult.
Another cool benefit we get from intention-revealing code is better code completion in our IDE, which reduces the jumping around we have to do to understand what a function does and what the arguments need to be.
Structure & Complexity
In addition to the quality of your naming, the structure and complexity of your code affect how difficult it is to understand what it’s doing and what it’s supposed to be doing. Our classes and functions should be small, simple, and focused on one thing (see Single Responsibility Principle). Your classes, files, and objects don’t need to be four scrolling pages long. If your app has an app.js file, don’t shove everything in there and cross your fingers.
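To make “focused on one thing” concrete, here is a minimal sketch with two small classes, one that calculates an order total and one that formats it. OrderTotal, ReceiptFormatter, and the line-item hash shape are all hypothetical names chosen for this example, not taken from any real project.

```ruby
# Hypothetical example: each class has one job and one reason to change.
class OrderTotal
  def initialize(line_items)
    @line_items = line_items
  end

  # Sums price * quantity across all line items, in cents.
  def cents
    @line_items.sum { |item| item[:price_cents] * item[:quantity] }
  end
end

class ReceiptFormatter
  def initialize(total_cents)
    @total_cents = total_cents
  end

  # Formats an integer number of cents as a dollar string.
  def to_s
    dollars, cents = @total_cents.divmod(100)
    format("$%d.%02d", dollars, cents)
  end
end

line_items = [
  { price_cents: 500, quantity: 2 },
  { price_cents: 250, quantity: 1 }
]
total = OrderTotal.new(line_items)
puts ReceiptFormatter.new(total.cents)
```

If the pricing rules change, only OrderTotal needs to change; if the display format changes, only ReceiptFormatter does. Each class also fits on one screen.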
It’s freeing to remove dead code! Do it. It’s confusing when I run across unused code and it can be more harmful when left in the codebase. It’s often left unmaintained (because how do you maintain it if it’s not even executable in production?). There is overhead to leaving it in the codebase. We have version control. If you find you need to resurrect it, you have tools to do just that.
Standards and Conventions and Consistency
Standards, conventions, and consistency are key. Standards and conventions give us cues as to how to do something or how it works in addition to making setup, configuration or even use easier on us. But there are times you can’t or don’t want to follow a convention or a standard. Just because you’ve always done it that way doesn’t mean you always have to (please don’t fall into that trap). Maybe it just doesn’t work for this particular use case. Maybe you want to try a new way of implementing it because it will save your team time. Maybe you want to deviate because it will offer a huge performance improvement for your uses. Whatever the reason, make sure there is a reason and be intentional about this decision to deviate. When you do deviate, document the why behind this decision.
Comments come in many forms — commentary about what you’re doing, why you’re doing it, as metadata for code and documentation generators — but sometimes they can be used excessively or in a way that does more harm than good.
My personal preference is to pair well-written code (structure, complexity, and naming) with clear, concise comments that add context to the ‘why’ that I can’t communicate with the actual code.
Comments have the benefit of adding context about why you took a particular approach. While comments are not instructions for the machine, they live alongside your code and are version controlled and maintained together. They need to be updated when the code is updated. Dangling comments and stale comments can actually be more harmful than no comments at all.
This is going to bite me as soon as I find a bug in some_gem or want to upgrade it.
gem 'some_gem', git: 'https://github.com/some_gem/some_gem.git'
Still Not Great
Not much better. It’s still going to bite me, but at least there is a clue for future me that I’ll have some extra work to do. Maybe I’ll even be able to build that time into my client estimates ahead of time.
# gem not maintained
gem 'some_gem', git: 'https://github.com/some_gem/some_gem.git'
Now, this is a little more helpful. I have an idea of what to look at when I do have to address this and there is a little context as to why we’re pulling from github rather than rubygems.org.
# gem not maintained; try some_gem/some_better_gem for updates or migrate to best_gem
gem 'some_gem', git: 'https://github.com/some_gem/some_gem.git'
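The same idea applies inside application code. Here is a small sketch of a comment that records the why rather than restating the what. The export cap, the vendor limit, and the names MAX_EXPORT_ROWS and rows_for_export are assumptions invented for this example.

```ruby
# Hypothetical example: the comment carries context the code cannot.
MAX_EXPORT_ROWS = 5_000

# We cap exports instead of sending everything because the (assumed)
# reporting vendor rejects files over 5,000 rows. Revisit this if the
# vendor raises that limit.
def rows_for_export(rows)
  rows.first(MAX_EXPORT_ROWS)
end

puts rows_for_export((1..6_000).to_a).length
```

A comment like “take the first 5,000 rows” would only repeat what the code already says. The version above tells future me why the cap exists and when it is safe to remove.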
Once upon a time I was obsessed with test-driven development and automated testing and getting to 100% test coverage. So much so that my business card read something like “software engineer, technical lead, and automated test fanatic”. I’ve since moved on from the very dogmatic view of test all the things all the time to a more pragmatic approach to testing. One where I test what and when I need to test. I write tests to understand the code. I write tests to create better code. I write tests to communicate intent. I write tests to prevent future regressions in a particularly difficult or critical area of the code.
As with everything in software, what you test and when you test depends on your own situation (which I’ll save for another blog post). When I take on a new project, though, one of the first things I seek out is the test suite. I use it not only as a way to assess the health of the codebase but also to learn more about the business and how the code meets those business rules. If I step into a new project with a bunch of tests like:
require 'test_helper'

class MyDomainModelTest < ActiveSupport::TestCase
  # test "the truth" do
  #   assert true
  # end
end
What happens when I need to change the underlying code in MyDomainModel? I don’t have the context of past decisions and likely don’t fully understand the business yet so I have no confidence in my changes.
But if I step into a project with some tests around the hard parts, the critical parts, or the not-so-frequently changed parts, I can learn a few things about this codebase that will be helpful as I start changing it.
I can learn where some of the more complex logic lies. I can learn where the critical business logic is handled. I might even be able to learn about the parts of the codebase that are changed frequently or not-so-frequently. And using this information, I can make decisions about how I’m going to go about my next set of changes. Will I write some new tests to learn about the code? To make sure I don’t break the existing logic? Or are there enough tests there to just dive right in?
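For contrast with the empty generated test, here is a minimal sketch of tests that document a business rule. It uses plain Minitest rather than ActiveSupport::TestCase so it runs without Rails. The free-shipping rule, the amounts, and the ShippingPolicy class are hypothetical, invented for this example.

```ruby
require "minitest/autorun"

# Hypothetical business rule: orders of $100 or more ship free,
# everything else pays a flat rate.
class ShippingPolicy
  FREE_SHIPPING_THRESHOLD_CENTS = 10_000
  FLAT_RATE_CENTS = 795

  def self.cost_cents(order_total_cents)
    order_total_cents >= FREE_SHIPPING_THRESHOLD_CENTS ? 0 : FLAT_RATE_CENTS
  end
end

class ShippingPolicyTest < Minitest::Test
  # The test names state the rule; the boundary values pin it down.
  def test_orders_at_the_threshold_ship_free
    assert_equal 0, ShippingPolicy.cost_cents(10_000)
  end

  def test_orders_just_below_the_threshold_pay_the_flat_rate
    assert_equal 795, ShippingPolicy.cost_cents(9_999)
  end
end
```

Someone new to the codebase can read the two test names and learn where the threshold sits and what happens on either side of it, before ever opening the class itself.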
Tests aren’t a silver bullet for communicating what’s going on in the code though. They need to be maintained just like comments and other documentation. They need to be correct and if the developer didn’t fully understand the problem, then a test won’t help you out.
The README is part of your code repo. It’s the introduction to this specific codebase and it’s likely the first thing a new developer sees. Make it useful. When I see the default Rails README, I cringe. This is not helpful.
# README

This README would normally document whatever steps are necessary to get the
application up and running.

Things you may want to cover:

* Ruby version
* System dependencies
* Configuration
* Database creation
* Database initialization
* How to run the test suite
* Services (job queues, cache servers, search engines, etc.)
* Deployment instructions
* ...
It’s great that you built a “standard Rails app” but oftentimes getting started (even for someone who’s been doing this for years) can be painful if there’s something custom about it. And let’s face it. You’ve built custom software, so there is going to be something custom. I’ve never picked up a Rails project that didn’t have some deviation from the “standard” setup even when a prior developer told me there wasn’t anything special about it.
We have to remember that people are opinionated about technology and how they go about building and deploying it. Remember the whole tabs vs spaces debate? We haven’t even solved that yet! When working with less experienced folks, they may not know every standard or convention (or maybe they weren’t documented for the team). So just because it’s a standard or a convention doesn’t mean it was followed.
And what about deployment? Is it deployed manually or automagically? When? On merge to master or when tests run successfully? When you press a button? Do you have Heroku pipelines set up? Or maybe you have some deployed to AWS and some to Digital Ocean. What if some of it runs against a Firebase datastore? These are all details that a full stack developer running your app and working in the codebase probably will need to know to do real work. It might feel quicker to just omit these details because of deadlines, but when you’re in the habit of writing these down it actually becomes easier and quicker to do and the long term benefits will pay off. For you and for your clients.
These are just a few of the things I look for when I take on a project in an existing codebase or when I start from scratch. This is what helps me wrap my head around the system when I wasn’t around to be a part of the original decisions and to build it myself. This is what helps me when I’m context switching between different projects, clients, and tech stacks every few hours or days. We have a lot to get up to speed on (and to remember) — everything from the tech stack, languages, and syntax to how it’s all connected together, how it’s integrated with other systems, why certain technical decisions were made to understanding the business domain and the actual functionality of the system. Be consistent and intentional in these decisions to comment or not, to document or not, to deviate or not, in how you name and structure your code, what and why you test, and make these decisions in the context of your environment, team, and product.