Dr. Jekyll and Mr. Tumblr

Yesterday morning this tweet from Mark Sands showed up in my mentions:

@jaredsinclair Looks like Tumblr deleted your blog. Any hope for reviving it?

Up until yesterday, my blog was hosted via Tumblr with a CNAME redirect from blog.jaredsinclair.com to the appropriate Tumblr domain. I set everything up seven or eight years ago, and it had worked fine all that time. I tried visiting the site and, like Mark said, the blog was gone. I got the Tumblr equivalent of a 404 instead of my blog. Next I tried logging into my Tumblr account to figure out what was wrong. Here’s what I saw:

W. T. Fuck.

I thought perhaps I’d missed a warning from Tumblr, but I had no recollection of receiving anything. I double and triple checked all my email accounts. Nada. There wasn’t anything from Tumblr since one day in March when I used a magic link to sign into my account rather than username and password. Nothing archived, nothing in spam. Without warning Tumblr had terminated my account for no discernable reason. I tried logging in from a desktop browser to be sure there wasn’t some mobile-rendering goof. Same thing. Your account has been terminated in classic Tumblr chunky white.

Wait, back up.

Ever since I slung screwdrivers around repairing Macs at a third-party Mac store, I’ve lived in mortal fear of losing data. Practically every day we had to tell somebody your baby pictures, your wedding pictures, your dissertation: all gone. There’s only so many times you can watch folks sob over the consequences of a spilled cup of coffee before you’re sobbing with them. For the most part, my backup strategy has been comfortably paranoid:

Yet my blog was the one corner of my digital life where I had gotten lazy. I didn’t have a backup of my posts anywhere, only scattered draft versions in a Dropbox folder. The canonical versions of all my blog posts were whatever Tumblr had saved for my account. The Tumblr versions had lots of small edits and corrections that weren’t saved anywhere, even if there was a draft copy in Dropbox. Now suddenly, everything that Tumblr had was gone.

Moving to Jekyll

I used the spartan contact form to ask Tumblr why my account was terminated. But rather than wait for a response that might never come, I decided it was past time to bring my blogging setup in line with the rest of my backup paranoia.

I decided to move everything over to Jekyll, backed by a comprehensive GitHub repository. I’ve had a Media Temple account for years and have been really happy with them. I knew once I figured out how to actually use Jekyll in a comfortable way, my Media Temple Grid Service would be able to host the static files easily. Why Jekyll? The short answer is that it’s used by GitHub Pages. In a sea of alternatives, I’m content to follow the smart folks at GitHub wherever they go.

If you’re reasonably comfortable with basic web programming, and know enough about the shell to get yourself in trouble, using Jekyll isn’t so bad. I got a proof-of-concept site up and running pretty quickly. Since Media Temple is also the domain registrar for jaredsinclair.com, it was easy to replace the Tumblr CNAME record for blog.jaredsinclair.com with an A record pointing to the same IP address as jaredsinclair.com (this was a necessary part of ensuring that old blog post links resolved to their new Jekyll permalink). Within hours, anyone that wanted to visit this blog was able to see something here. The real problem was recovering all my old posts.

Recovering all my old posts

Felix Lapalme recommended a command-line tool that downloads an entire site’s content from the Internet Wayback Machine. Before anything else could go wrong, I immediately ran that tool, which worked as advertised (isn’t it great when things work like they say on the tin?). This archive was missing a lot of posts, particularly my more recent stuff, but something was better than nothing. I was pretty worn out from getting Jekyll set up so I went to bed, putting off figuring out how to transform this backup into something formatted for Jekyll.

When I woke up this morning, I had a pleasant email in my inbox. It wasn’t from Tumblr, natch. It was from Ben Ubois, the founder of Feedbin, the RSS aggregator:

Hey Jared,

I saw on Twitter about Tumblr closing your account. That sounds lame!

Feedbin has posts from your blog going back to 2012. I’ve attached all 234 of them as JSON.

The structure looks like:

{ title: “Title”, url: …

Hope it helps!

Thanks to Ben, I now had everything I needed. Unlike the Internet Archive backup, in which each post would need to be heavily transformed to unsleeve the post content from all the page chrome that got captured with each crawl, the RSS backup was already free of such chrome. Better still, the JSON data structure would make it possible to automate the capture of the critical metadata for each page, namely the title and published date.

Using the Swift Package Manager, I made a quick n’ dirty utility that transforms a JSON-encoded file of Feedbin posts into a directory of HTML files formatted for Jekyll. After running this utility, all I had to do to republish all my old content (in correct chronological order, too) was to copy those HTML files into the _posts directory in my Jekyll project, run jekyll build, and upload them to the right directory on my Media Temple server.

The one truly unfortunate downside of moving to Jekyll is that all existing links to my Tumblr-hosted blog are now defunct. Tumblr blog post URL paths take one of the following forms:

/post/123456789
/post/123456789/title-slug-for-post
/post/123456789/title-slug-for-post/index.html

Whereas Jekyll links use a date-based path:

/2019/04/07/title-slug-for-post.html

At least that’s the Jekyll default. I like this default and have decided to keep it and fix broken Tumblr links on a case-by-case basis. For the posts that I care about most (the ones that at one time or another got a lot of traffic, like this one or this one), I’ve updated the .htaccess file on my Media Temple server with redirects like this:

RedirectMatch 301 /post/97655887470.*$ /2014/09/16/good-design-is-about-process-not.html

I don’t know if I’ll do it this way forever (there might be a better way that Jekyll supports), but this was effective and took only a few minutes.

For all the posts that I haven’t redirected, I’ve updated the 404 with a blurb about what happened to my blog this weekend, with a link to my archive page.

Looking ahead

To make day-to-day life easier going forward, I added a script to my Jekyll project that uses rsync to upload the _sites directory to my Media Temple server, authenticated with ssh. From the root directory of my Jekyll repo, I can just run publish.sh to rebuild and upload everything. I’m continually impressed by how efficient rsync is. Small changes to the site, like correcting a typo in some markup, are published in seconds.

Update: Tumblr replies

By the time I finished writing this post, I received a reply from Tumblr:

Hello,

We’ve restored your account.

Thank you for bringing this problem to our attention. We’re sorry that it occurred, and we’ll do our best to make sure that it doesn’t happen again.

You should now be able to log in just fine with your email address and password.

Please let me know if there’s anything else I can help you with!

Drew

Community Manager

Too little, too late, Drew. I hope Tumblr understands that I simply cannot trust them anymore. I’m grateful that they responded to my contact request and that my account has been reopened, but it’s unacceptable that a years-old account still in good standing can be terminated without any advance warning or preventative recourse. This weekend’s debacle is a textbook case for why folks should own their own data. It’s also a good reminder that no single service provider can be wholly trusted.

If something isn’t backed up in more than one place, it’s not backed up at all.

|  7 Apr 2019




Please Pardon Our Mess

Since Tumblr decided to terminate my account without warning or explanation, I’ve decided it’s past time to move my blog to something under my control. Unfortunately I don’t have a backup of my old posts ready to go. In lieu of anything better, I’ll fill my Jekyll queue with my favorite lorem ipsum alternative, the “Clipper Ships” poem from Little Man Tate.

|  6 Apr 2019




Clipper Ships

Me and my dad make models of clipper ships. Clipper ships sail on the ocean. Clipper ships never sail on rivers or lakes. I like clipper ships because they are fast. Clipper ships have lots of sails and are made of wood.

(Matt Montini)

|  6 Apr 2019




Unit Testing is Easier Than You Think

I am ashamed to admit how many years I avoided incorporating unit tests into my iOS projects. The simple truth is that I was afraid of what I didn’t know. I don’t have a CS degree. I never studied programming formally. The terminology itself is intimidating. What is a unit? How do I know if my app has units in it? What does it mean to test them? Not understanding what they are or even what good unit tests look like, my anxiety filled the gaps in my knowledge with frightening mental imagery.

After struggling with them for a few years, and after finding the occasional inspiring tech talk, I have come to understand that not only is unit testing not scary, but in fact good unit testing is surprisingly easy. The simplest and best unit test looks exactly like this:

XCTAssertEqual(actual, expected)

That’s it. A straightforward comparison of some unknown value against what you expect that value to be. The goal with unit testing is to write simple, direct assertions like that one. Every other choice you make is just a means to that end. To see how, first let’s widen our field of vision to the code surrounding that assertion:

let input = ... // hard-coded inputs
let actual = SomeWidget().doSomething(with: input)
let expected = ... // hard-coded output
XCTAssertEqual(actual, expected)

A good unit test answers the question, “When I pass something into this other thing, what value do I get out?” Answering that question is easier if your input and expected output are written using simple, hard-coded constants. Unlike writing regular code, when you’re writing a unit test, using hard-coded data is mandatory. Swift literals are your friends. You jot down some hard-coded input values, and also a hard-coded expected output value. Sandwiched in the middle is the behavior you’re testing. Imagine if you wanted to test String.lowercased():

let input = "unIT TesTING Is NoT SO BAD"
let actual = input.lowercased()
let expected = "unit testing is not so bad"
XCTAssertEqual(actual, expected)

I’m calling a method called lowercased(). I’m passing a string into it (input) and I’m getting another string out of it (actual). I hope that the returned value is the same as another string (expected). By using string literals (instead of, say, dynamic values obtained from a networked resource), you’ve eliminated unpredictability from the test. There’s now only a single variable (in the algebraic sense) at play, the behavior of lowercased(). This is a good unit test.

This may strike you as overly simplistic, but I assure you it isn’t. Even the most complex behaviors in your app can be tested in this manner. If you have some dark corner of your app that you wish had unit tests, start by building a mental model of the problem that’s oriented towards that XCTAssert assertion you’re going to write. Say you want to add unit tests to some code that interacts with a web service. You have a class that looks like this:

class APIManagerHamburgerHelper {
    func getUser(withId id: String, completion: @escaping (Result) -> Void) {...}
}

Right now there’s no way to unit test that getUser method, not in the way that I’m advocating. There are several things hindering you. The method has no return value. It requires making a roundtrip request to an actual server. There are many jobs hiding inside the implementation of that method: building a URL request, evaluating a URLSession response envelope (response, data, and error), decoding JSON-encoded data, mapping any error along the way to your APIError type. Each of these hidden jobs is itself something that needs unit test coverage. To test them, you’ll need to expose those jobs in a form that is “shaped” like the .lowercased() example above. There’s no one single way to do this, but here’s a rough example. You can break out these jobs into a single-purpose utilities:

struct URLRequestBuilder {
    func getUserRequest(userId: String) -> URLRequest
}

struct URLResponseEnvelopeEvaluator {
    struct Success: Equatable {
        let response: HTTPURLResponse
        let data: Data
    }

    struct Failure: Swift.Error, Equatable {
        let response: URLResponse?
        let error: APIError?
    }

    typealias Result = Result<Success, Failure>

    func evaluate(data: Data?, response: URLResponse?, error: Error?) -> Result {...}
}

struct User: Decodable {
    let id: String
    let name: String
    let displayName: String
}

The knowledge of how to implement each of these jobs (building requests, evaluating responses, parsing data) has been extracted out of the untestable getUser method and into discrete types that lend themselves to straightforward unit tests. Testing the request builder might look something like this:

let id = "abc"
let actual = URLRequestBuilder().buildGetUserProfileRequest(userId: id)
let expected: URLRequest = {
    let url = URL(string: "https://baseurl.com/user/\(id)")!
    var request = URLRequest(url: url)
    request.addValue("foo", forHTTPHeaderField: "Bar")
    return request
}()
XCTAssertEqual(actual, expected)

Note how the input value and expected output value are all written using hard-coded constants as possible. As with all good unit tests, we pass hard-coded input into the member being tested, and compare the actual output against a hard-coded expected output value. Because inputs and expected outputs are hard-coded, we can write unit tests to cover any imaginable scenario. Perhaps you want to test a specific error pathway, what happens when the web service replies with a 401 status code. We set up the input values to closely reflect what a URLSession would actually present to the developer in a completion block:

let data: Data? = nil
let response = HTTPURLResponse(
    url: URL(string: "https://baseurl.com/user/abc")!,
    statusCode: 401,
    httpVersion: "1.0",
    headerFields: nil
)
let error = NSError(
    domain: NSURLErrorDomain,
    code: 401,
    userInfo: ["foo": "bar"]
)

Then we use those values as inputs to the method being unit tested, as well as to the expected result (where applicable):

let actual = URLResponseEnvelopeEvaluator().evaluate(
    data: data,
    response: response,
    error: error
)
let expected: URLResponseEnvelopeEvaluator.Result = .failure(Failure(
    response: response,
    error: .authenticationError401(error)
))
XCTAssertEqual(actual, expected)

In all the foregoing examples, no matter how hairy the subject matter, all the unit tests take the same shape:

This simple, repeatable pattern is what makes good unit tests “easy”. The hardest part isn’t writing the tests themselves, but rather structuring your code so that the behaviors are unit-testable in the first place. Doing that takes experience and much trial-and-error. That effort will come more easily to you once you have internalized the essential simplicity of a good unit test.

If you would like to learn more about refactoring your code for unit testing, I have a screencast on Big Nerd Ranch’s The Frontier with some live coding examples that you may find helpful.

|  1 Apr 2019




PSA: Please Don’t Double Space Between Sentences

In the nineteenth century, which was a dark and inflationary age in typography and type design, many compositors were encouraged to stuff extra space between sentences. Generations of twentieth-century typists were then taught to do the same, by hitting the spacebar twice after every period. Your typing as well as your typesetting will benefit from unlearning this quaint Victorian habit. As a general rule, no more than a single space is required after a period, a colon, or any other mark of punctuation.

~ Robert Bringhurst, The Elements of Typographic Style

|  27 Mar 2019




Think Twice Before Downgrading to a Free GitHub Account

Today I learned that if you downgrade from a paid to a free GitHub account, you’ll lose any branch protection rules you’ve added to your private repositories. It’s my fault for not reading the fine print more carefully, but still – it would have been helpful for them to toss up an alert or something that makes it obvious that by downgrading to the free tier there will be destructive side effects on features you probably set-and-forgot years ago and have taken for granted. I live in mortal fear of making a dumb mistake and losing irreplaceable source code. Downgrading to free account is, in my estimation, Step One on the path to me making such a mistake.

Please note also that when upgrading back to a Pro account, any branch protection rules you had before were permanently deleted when you downgraded to the free tier. They will all have to be recreated from scratch. So if you were considering downgrading to a free GitHub account, I don’t recommend doing so if you use private repositories for code that you care about. And if you have already downgraded to a free account, double check that you can live with the consequences of accidentally force pushing or deleting an important branch.

Update: I received a friendly reply from GitHub CEO Nat Friedman. :)

|  24 Mar 2019




iOS App Analytics a Necessary Evil, or Maybe Just an Evil

I have yet to see an iOS project where implementing client-side analytics (page loads, event logging, behavior tracking) wasn’t unspeakably awful to implement. A litany of sins:

And these are just the problems with implementation details. Let’s not forget how morally and/or legally perilous the entire enterprise is, stockpiling user data without regard for its half-life — which is effectively infinite — or for all the incalculable damage that can be done in five, ten, twenty years when Faild Strtup Dotcom goes bellyup and all it’s data gets dumped onto the lap of the highest bidder.

There are many reasons to just abandon this foul mess, and only one indestructible, immovable reason not to: you can’t run a business if you don’t know what your customers want. We all understand this dilemma, but understanding it doesn’t make it any easier to stomach.

|  19 Mar 2019




Reference Types vs Value Types: the Original Sin of Programming

Gentle reader, you might know where this is written, better than I have written it here, and perhaps canonically: there aren’t that many kinds of code, right? It seems to me that in any language, any given statement or expression can be reduced to one (or a composite) of these two kinds of activities:

A notion I’ve been struggling to nail down in words over the past several years is how—still struggling here—all pain seems to stem from one original sin: real-world programs require both kinds of activities, reference and value, but those two activities are as incompatible as oil and water. To write a useful program, you need to undertake reference activities and value activities, but the two don’t want to be mixed. The act of writing the program is itself the cause of the problem!

Programming paradigms — OO, Functional, Procedural, Imperative — and application design patterns — MVC, MVVM, VIPER, YADA•YADA – all seem to be answers to the problem of how to resolve the impedance mismatch between reference and value activities, but a side effect is that in the act of proposing a solution they implicitly suggest to the developer the paradigm has done the hard work of understanding the diagnosis for you, so you don’t have to. When inevitably a given paradigm runs aground on a blind spot, too often the ensuing debate becomes about the merits of particular paradigms and not nearly enough about achieving a universal understanding of the nature of the problem that all paradigms aim to solve.

I believe that it is vastly more important for a developer to internalize the “simple” lesson of how reference types and value types differ. It’s Programming 101 material, but so much of what we do is merely a footnote to that difference. The more one internalizes that insight, the easier it becomes to reason about this or that paradigm, the easier it becomes to jump to a different ship as project needs change, lacking loyalty to any solution but fiercely deepening one’s understanding of the diagnosis.

|  12 Dec 2018




Little Weeds of Dread

A sobering comparison occurred to me last night: the same way that I can never again fully enjoy the software on the little devices in our pockets and on our wrists, knowing as I do how an app is made and being able to spot when one is not made well, perpetually bracing myself for whatever horrors await at the end of each flickering transition and every unrelenting activity indicator – it is like the anxiety of becoming parent, a parent who was once a child also but whose simple childhood joy has since been choked by the little weeds of dread that take root in the soul of every adult, as one cannot reach adulthood without acquiring unspeakable knowledge, accompanied by horrifying detail, of how a life is made and how easily a life can be unmade.

|  7 Dec 2018




TIL: Boy, Have I Been Misusing SCNetworkReachability

After reading this discussion — courtesy of Jeremy Sherman — I learned that I’ve been misusing SCNetworkReachability for years. I’ve been allowing certain user-facing states and features to be influenced by the current reachability state, even to the point of blocking some user-initiated network requests. In ’sodes, for example, I’m currently preventing a playback attempt whenever the network is unreachable.

Turns.

Out.

SCNetworkReachability, like all networking, is not reliable enough to support that kind of behavior. If there’s a false negative (which is much more common than one might think), it means the app becomes needlessly unusable.

SCNetworkReachability should only be used to influence what you do about a network request that has already failed, not an initial request that has yet to be attempted. Use a negative status to determine whether or not you attempt an automatic retry, or to tweak the user-facing language of an alert. Use a positive status to consider retrying an earlier failed request. Never prevent a user-initiated request from being attempted just because SCNetworkReachability thinks there’s not a reachable network.

You can see the code I’m using to monitor reachability status right here on GitHub. To drive the point home to myself, I’m probably going to change the public API of my network reachability wrapper from this:

var isReachable: Bool {...}

to something that more accurately models the truth:

enum ReachabilityStatus {
    case probablyNotButWhoKnows
    case itWorkedThatOneTimeRecently
}

var status: ReachabilityStatus {...}

SCNetworkReachability, or rather the realities of real-world networking at whose mercy SCNetworkReachability remains, is just not reliable enough to deserve a Bool.

|  15 Oct 2018