Dr. Jekyll and Mr. Tumblr
Yesterday morning this tweet from Mark Sands showed up in my mentions:
@jaredsinclair Looks like Tumblr deleted your blog. Any hope for reviving it?
Up until yesterday, my blog was hosted via Tumblr with a CNAME redirect from blog.jaredsinclair.com to the appropriate Tumblr domain. I set everything up seven or eight years ago, and it had worked fine all that time. I tried visiting the site and, like Mark said, the blog was gone. I got the Tumblr equivalent of a 404 instead of my blog. Next I tried logging into my Tumblr account to figure out what was wrong. Here’s what I saw:
W. T. Fuck.
I thought perhaps I’d missed a warning from Tumblr, but I had no recollection of receiving anything. I double and triple checked all my email accounts. Nada. There wasn’t anything from Tumblr since one day in March when I used a magic link to sign into my account rather than username and password. Nothing archived, nothing in spam. Without warning Tumblr had terminated my account for no discernible reason. I tried logging in from a desktop browser to be sure there wasn’t some mobile-rendering goof. Same thing. “Your account has been terminated” in classic Tumblr chunky white.
Wait, back up.
Ever since I slung screwdrivers around repairing Macs at a third-party Mac store, I’ve lived in mortal fear of losing data. Practically every day we had to tell somebody: your baby pictures, your wedding pictures, your dissertation, all gone. There’s only so many times you can watch folks sob over the consequences of a spilled cup of coffee before you’re sobbing with them. For the most part, my backup strategy has been comfortably paranoid:
- All photos and videos are backed up to my iCloud Photo Library. This entire library is synced to the family computer. The family computer is synced to Backblaze. Google Photos adds a redundant backup of everything, as well, using the iOS app. There’s also a good chunk of our most important photos and videos in the Dropbox accounts of my family members. Last but not least, we print a book of photos every year so there’s a hard copy that’ll survive once I die and there’s no one else in the family willing to be as paranoid as I am.
- Every paper document that comes through our house that has even a scant chance of being useful later gets digitized with Scanner Pro and Dropbox-ed. Scanner Pro has a cool feature where you can have every scan automatically synced to a predetermined Dropbox folder. At the start of each year, I update this to point to an Inbox 20-- folder for that year. If I need to move something out of there to a more appropriate folder, I do, but otherwise I just let things accumulate. The perfect is the enemy of the good and all that.
- Both my and my wife’s Dropbox accounts are fully synced to the family computer, and thus also to Backblaze along with the rest of the stuff on it.
- GitHub: use it! I use it religiously, even for the small stuff, like the secret Gist I use to track my .bash_profile. I enable branch protections on everything, too, just in case.
- An external hard drive or two, however outdated, are kept in our safe deposit box. So even if Apple, Dropbox, Google, GitHub, and everyone else all decide to terminate my accounts in one day, hopefully there will still be something available on those old drives that I can salvage.
Yet my blog was the one corner of my digital life where I had gotten lazy. I didn’t have a backup of my posts anywhere, only scattered draft versions in a Dropbox folder. The canonical versions of all my blog posts were whatever Tumblr had saved for my account. The Tumblr versions had lots of small edits and corrections that weren’t saved anywhere, even if there was a draft copy in Dropbox. Now suddenly, everything that Tumblr had was gone.
Moving to Jekyll
I used the spartan contact form to ask Tumblr why my account was terminated. But rather than wait for a response that might never come, I decided it was past time to bring my blogging setup in line with the rest of my backup paranoia.
I decided to move everything over to Jekyll, backed by a comprehensive GitHub repository. I’ve had a Media Temple account for years and have been really happy with them. I knew once I figured out how to actually use Jekyll in a comfortable way, my Media Temple Grid Service would be able to host the static files easily. Why Jekyll? The short answer is that it’s used by GitHub Pages. In a sea of alternatives, I’m content to follow the smart folks at GitHub wherever they go.
If you’re reasonably comfortable with basic web programming, and know enough about the shell to get yourself in trouble, using Jekyll isn’t so bad. I got a proof-of-concept site up and running pretty quickly. Since Media Temple is also the domain registrar for jaredsinclair.com, it was easy to replace the Tumblr CNAME record for blog.jaredsinclair.com with an A record pointing to the same IP address as jaredsinclair.com (this was a necessary part of ensuring that old blog post links resolved to their new Jekyll permalink). Within hours, anyone that wanted to visit this blog was able to see something here. The real problem was recovering all my old posts.
Recovering all my old posts
Felix Lapalme recommended a command-line tool that downloads an entire site’s content from the Internet Wayback Machine. Before anything else could go wrong, I immediately ran that tool, which worked as advertised (isn’t it great when things work like they say on the tin?). This archive was missing a lot of posts, particularly my more recent stuff, but something was better than nothing. I was pretty worn out from getting Jekyll set up so I went to bed, putting off figuring out how to transform this backup into something formatted for Jekyll.
When I woke up this morning, I had a pleasant email in my inbox. It wasn’t from Tumblr, natch. It was from Ben Ubois, the founder of Feedbin, the RSS aggregator:
Hey Jared,
I saw on Twitter about Tumblr closing your account. That sounds lame!
Feedbin has posts from your blog going back to 2012. I’ve attached all 234 of them as JSON.
The structure looks like:
{ title: “Title”, url: …
Hope it helps!
Thanks to Ben, I now had everything I needed. Unlike the Internet Archive backup, in which each post would need to be heavily transformed to unsleeve the post content from all the page chrome that got captured with each crawl, the RSS backup was already free of such chrome. Better still, the JSON data structure would make it possible to automate the capture of the critical metadata for each page, namely the title and published date.
Using the Swift Package Manager, I made a quick n’ dirty utility that transforms a JSON-encoded file of Feedbin posts into a directory of HTML files formatted for Jekyll. After running this utility, all I had to do to republish all my old content (in correct chronological order, too) was to copy those HTML files into the _posts directory in my Jekyll project, run jekyll build, and upload them to the right directory on my Media Temple server.
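A minimal sketch of what the core of such a utility might look like follows. It is not the actual tool: the email only shows the title and url fields, so the published and content field names, the assumption that published starts with a YYYY-MM-DD date, and the slug rules are all guesses made for illustration.

```swift
import Foundation

// Hypothetical shape of one Feedbin post. Only `title` and `url` appear in
// the email; `published` and `content` are assumed field names.
struct FeedbinPost: Decodable {
    let title: String
    let url: String
    let published: String // assumed to start with "YYYY-MM-DD"
    let content: String   // assumed to be the post body as HTML
}

// A file that is ready to be written into Jekyll's _posts directory.
struct JekyllPost: Equatable {
    let filename: String
    let contents: String
}

// Lowercases the title and joins its alphanumeric runs with hyphens.
func slug(from title: String) -> String {
    title.lowercased()
        .split(whereSeparator: { !($0.isLetter || $0.isNumber) })
        .joined(separator: "-")
}

// Builds the YYYY-MM-DD-slug.html filename Jekyll expects, plus front matter.
func jekyllPost(from post: FeedbinPost) -> JekyllPost {
    let day = String(post.published.prefix(10))
    let escapedTitle = post.title.replacingOccurrences(of: "\"", with: "\\\"")
    let frontMatter = """
        ---
        layout: post
        title: "\(escapedTitle)"
        ---
        """
    return JekyllPost(
        filename: "\(day)-\(slug(from: post.title)).html",
        contents: frontMatter + "\n" + post.content + "\n"
    )
}

// Decodes a JSON array of posts and converts each one.
func jekyllPosts(fromJSON data: Data) throws -> [JekyllPost] {
    try JSONDecoder().decode([FeedbinPost].self, from: data).map(jekyllPost(from:))
}
```

Keeping the conversion as a pure function from JSON data to filename and contents means writing the files to disk is the only step that touches the file system.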
Fixing broken links
The one truly unfortunate downside of moving to Jekyll is that all existing links to my Tumblr-hosted blog are now defunct. Tumblr blog post URL paths take one of the following forms:
/post/123456789
/post/123456789/title-slug-for-post
/post/123456789/title-slug-for-post/index.html
Whereas Jekyll links use a date-based path:
/2019/04/07/title-slug-for-post.html
At least that’s the Jekyll default. I like this default and have decided to keep it and fix broken Tumblr links on a case-by-case basis. For the posts that I care about most (the ones that at one time or another got a lot of traffic, like this one or this one), I’ve updated the .htaccess file on my Media Temple server with redirects like this:
RedirectMatch 301 /post/97655887470.*$ /2014/09/16/good-design-is-about-process-not.html
I don’t know if I’ll do it this way forever (there might be a better way that Jekyll supports), but this was effective and took only a few minutes.
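If the list of redirects grows, the rules could be generated rather than typed by hand. The sketch below is one hypothetical way to do that in Swift. The function names are made up, and the pattern is a slightly stricter variant of the hand-written rule: it anchors the start of the path and only allows an optional slash-prefixed suffix, so that one post ID cannot match a longer ID that begins with the same digits.

```swift
import Foundation

// Builds one mod_alias RedirectMatch rule covering the three Tumblr URL
// forms: /post/ID, /post/ID/slug, and /post/ID/slug/index.html.
func redirectRule(tumblrPostID: String, jekyllPath: String) -> String {
    "RedirectMatch 301 ^/post/\(tumblrPostID)(/.*)?$ \(jekyllPath)"
}

// Turns a table of post IDs and Jekyll paths into .htaccess lines,
// sorted by ID so the output is stable between runs.
func redirectRules(for posts: [String: String]) -> [String] {
    posts.sorted { $0.key < $1.key }
        .map { redirectRule(tumblrPostID: $0.key, jekyllPath: $0.value) }
}
```

Because both functions only build strings, the output can be checked against hard-coded expected rules before anything is copied into .htaccess.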
For all the posts that I haven’t redirected, I’ve updated the 404 with a blurb about what happened to my blog this weekend, with a link to my archive page.
Looking ahead
To make day-to-day life easier going forward, I added a script to my Jekyll project that uses rsync to upload the _site directory to my Media Temple server, authenticated with ssh. From the root directory of my Jekyll repo, I can just run publish.sh to rebuild and upload everything. I’m continually impressed by how efficient rsync is. Small changes to the site, like correcting a typo in some markup, are published in seconds.
Update: Tumblr replies
By the time I finished writing this post, I received a reply from Tumblr:
Hello,
We’ve restored your account.
Thank you for bringing this problem to our attention. We’re sorry that it occurred, and we’ll do our best to make sure that it doesn’t happen again.
You should now be able to log in just fine with your email address and password.
Please let me know if there’s anything else I can help you with!
Drew
Community Manager
Too little, too late, Drew. I hope Tumblr understands that I simply cannot trust them anymore. I’m grateful that they responded to my contact request and that my account has been reopened, but it’s unacceptable that a years-old account still in good standing can be terminated without any advance warning or preventative recourse. This weekend’s debacle is a textbook case for why folks should own their own data. It’s also a good reminder that no single service provider can be wholly trusted.
If something isn’t backed up in more than one place, it’s not backed up at all.
Please Pardon Our Mess
Since Tumblr decided to terminate my account without warning or explanation, I’ve decided it’s past time to move my blog to something under my control. Unfortunately I don’t have a backup of my old posts ready to go. In lieu of anything better, I’ll fill my Jekyll queue with my favorite lorem ipsum alternative, the “Clipper Ships” poem from Little Man Tate.
Clipper Ships
Me and my dad make models of clipper ships. Clipper ships sail on the ocean. Clipper ships never sail on rivers or lakes. I like clipper ships because they are fast. Clipper ships have lots of sails and are made of wood.
(Matt Montini)
Unit Testing is Easier Than You Think
I am ashamed to admit how many years I avoided incorporating unit tests into my iOS projects. The simple truth is that I was afraid of what I didn’t know. I don’t have a CS degree. I never studied programming formally. The terminology itself is intimidating. What is a unit? How do I know if my app has units in it? What does it mean to test them? Not understanding what they are or even what good unit tests look like, my anxiety filled the gaps in my knowledge with frightening mental imagery.
After struggling with them for a few years, and after finding the occasional inspiring tech talk, I have come to understand that not only is unit testing not scary, but in fact good unit testing is surprisingly easy. The simplest and best unit test looks exactly like this:
XCTAssertEqual(actual, expected)
That’s it. A straightforward comparison of some unknown value against what you expect that value to be. The goal with unit testing is to write simple, direct assertions like that one. Every other choice you make is just a means to that end. To see how, first let’s widen our field of vision to the code surrounding that assertion:
let input = ... // hard-coded inputs
let actual = SomeWidget().doSomething(with: input)
let expected = ... // hard-coded output
XCTAssertEqual(actual, expected)
A good unit test answers the question, “When I pass something into this other thing, what value do I get out?” Answering that question is easier if your input and expected output are written using simple, hard-coded constants. Unlike writing regular code, when you’re writing a unit test, using hard-coded data is mandatory. Swift literals are your friends. You jot down some hard-coded input values, and also a hard-coded expected output value. Sandwiched in the middle is the behavior you’re testing. Imagine if you wanted to test String.lowercased():
let input = "unIT TesTING Is NoT SO BAD"
let actual = input.lowercased()
let expected = "unit testing is not so bad"
XCTAssertEqual(actual, expected)
I’m calling a method called lowercased(). I’m passing a string into it (input) and I’m getting another string out of it (actual). I hope that the returned value is the same as another string (expected). By using string literals (instead of, say, dynamic values obtained from a networked resource), you’ve eliminated unpredictability from the test. There’s now only a single variable (in the algebraic sense) at play, the behavior of lowercased(). This is a good unit test.
This may strike you as overly simplistic, but I assure you it isn’t. Even the most complex behaviors in your app can be tested in this manner. If you have some dark corner of your app that you wish had unit tests, start by building a mental model of the problem that’s oriented towards that XCTAssert assertion you’re going to write. Say you want to add unit tests to some code that interacts with a web service. You have a class that looks like this:
class APIManagerHamburgerHelper {
    func getUser(withId id: String, completion: @escaping (Result<User, APIError>) -> Void) {...}
}
Right now there’s no way to unit test that getUser method, not in the way that I’m advocating. There are several things hindering you. The method has no return value. It requires making a roundtrip request to an actual server. There are many jobs hiding inside the implementation of that method: building a URL request, evaluating a URLSession response envelope (response, data, and error), decoding JSON-encoded data, mapping any error along the way to your APIError type. Each of these hidden jobs is itself something that needs unit test coverage. To test them, you’ll need to expose those jobs in a form that is “shaped” like the .lowercased() example above. There’s no one single way to do this, but here’s a rough example. You can break out these jobs into single-purpose utilities:
struct URLRequestBuilder {
    func getUserRequest(userId: String) -> URLRequest {...}
}
struct URLResponseEnvelopeEvaluator {
    struct Success: Equatable {
        let response: HTTPURLResponse
        let data: Data
    }
    struct Failure: Swift.Error, Equatable {
        let response: URLResponse?
        let error: APIError?
    }
    // Qualified with Swift. so the alias doesn't refer to itself.
    typealias Result = Swift.Result<Success, Failure>
    func evaluate(data: Data?, response: URLResponse?, error: Error?) -> Result {...}
}
struct User: Decodable {
    let id: String
    let name: String
    let displayName: String
}
The knowledge of how to implement each of these jobs (building requests, evaluating responses, parsing data) has been extracted out of the untestable getUser method and into discrete types that lend themselves to straightforward unit tests. Testing the request builder might look something like this:
let id = "abc"
let actual = URLRequestBuilder().getUserRequest(userId: id)
let expected: URLRequest = {
    let url = URL(string: "https://baseurl.com/user/\(id)")!
    var request = URLRequest(url: url)
    request.addValue("foo", forHTTPHeaderField: "Bar")
    return request
}()
XCTAssertEqual(actual, expected)
Note how the input value and expected output value are written using hard-coded constants as much as possible. As with all good unit tests, we pass hard-coded input into the member being tested, and compare the actual output against a hard-coded expected output value. Because inputs and expected outputs are hard-coded, we can write unit tests to cover any imaginable scenario. Perhaps you want to test a specific error pathway: what happens when the web service replies with a 401 status code. We set up the input values to closely reflect what a URLSession would actually present to the developer in a completion block:
let data: Data? = nil
let response = HTTPURLResponse(
    url: URL(string: "https://baseurl.com/user/abc")!,
    statusCode: 401,
    httpVersion: "1.0",
    headerFields: nil
)
let error = NSError(
    domain: NSURLErrorDomain,
    code: 401,
    userInfo: ["foo": "bar"]
)
Then we use those values as inputs to the method being unit tested, as well as to the expected result (where applicable):
let actual = URLResponseEnvelopeEvaluator().evaluate(
    data: data,
    response: response,
    error: error
)
// .init resolves to the nested URLResponseEnvelopeEvaluator.Failure type.
let expected: URLResponseEnvelopeEvaluator.Result = .failure(.init(
    response: response,
    error: .authenticationError401(error)
))
XCTAssertEqual(actual, expected)
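For a test like that to pass, evaluate needs a body. The post never shows one, so what follows is only a sketch of how it might be filled in as a pure function. The APIError cases are hypothetical and carry no underlying error (unlike the authenticationError401(error) case used above), which keeps the example short, and the rules for what counts as success are assumptions.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking // URLResponse and HTTPURLResponse live here on Linux
#endif

// Hypothetical, payload-free stand-in for the post's APIError type.
enum APIError: Error, Equatable {
    case authenticationError401
    case unexpectedStatus(Int)
    case transportFailure
}

struct URLResponseEnvelopeEvaluator {
    struct Success: Equatable {
        let response: HTTPURLResponse
        let data: Data
    }
    struct Failure: Swift.Error, Equatable {
        let response: URLResponse?
        let error: APIError?
    }
    typealias Result = Swift.Result<Success, Failure>

    // Pure function: the same three inputs always produce the same result.
    func evaluate(data: Data?, response: URLResponse?, error: Error?) -> Result {
        // No HTTP response at all: treat it as a transport-level failure.
        guard let http = response as? HTTPURLResponse else {
            return .failure(Failure(response: response, error: .transportFailure))
        }
        switch http.statusCode {
        case 200..<300:
            // A 2xx status with a body and no error is the only success path.
            if let data = data, error == nil {
                return .success(Success(response: http, data: data))
            }
            return .failure(Failure(response: http, error: .transportFailure))
        case 401:
            return .failure(Failure(response: http, error: .authenticationError401))
        default:
            return .failure(Failure(response: http, error: .unexpectedStatus(http.statusCode)))
        }
    }
}
```

Since nothing in evaluate touches the network, every branch of the switch can be reached from a test simply by changing the hard-coded status code, data, or error.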
In all the foregoing examples, no matter how hairy the subject matter, all the unit tests take the same shape:
- define input
- pass input into tested member, getting actual output
- define expected output
- compare the actual and expected outputs
This simple, repeatable pattern is what makes good unit tests “easy”. The hardest part isn’t writing the tests themselves, but rather structuring your code so that the behaviors are unit-testable in the first place. Doing that takes experience and much trial-and-error. That effort will come more easily to you once you have internalized the essential simplicity of a good unit test.
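The same four steps apply to the one job from the list that the examples above don’t cover: decoding JSON into a User. Here is a small sketch under a few assumptions. The decodeUser(from:) helper and the JSON payload are made up for illustration, User gains an Equatable conformance so it can be compared, and a plain Bool-returning function stands in for an XCTest method so the snippet can run outside a test target.

```swift
import Foundation

// Same fields as the User type above, plus Equatable for comparisons.
struct User: Decodable, Equatable {
    let id: String
    let name: String
    let displayName: String
}

// Hypothetical helper that isolates the decoding job.
func decodeUser(from data: Data) throws -> User {
    try JSONDecoder().decode(User.self, from: data)
}

// The four steps, written as a plain function. Inside an XCTestCase the
// last line would be XCTAssertEqual(actual, expected).
func userDecodingMatchesExpectation() throws -> Bool {
    // 1. define input
    let input = Data(#"{"id": "abc", "name": "jared", "displayName": "Jared"}"#.utf8)
    // 2. pass input into the tested member, getting actual output
    let actual = try decodeUser(from: input)
    // 3. define expected output
    let expected = User(id: "abc", name: "jared", displayName: "Jared")
    // 4. compare the actual and expected outputs
    return actual == expected
}
```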
If you would like to learn more about refactoring your code for unit testing, I have a screencast on Big Nerd Ranch’s The Frontier with some live coding examples that you may find helpful.
PSA: Please Don’t Double Space Between Sentences
In the nineteenth century, which was a dark and inflationary age in typography and type design, many compositors were encouraged to stuff extra space between sentences. Generations of twentieth-century typists were then taught to do the same, by hitting the spacebar twice after every period. Your typing as well as your typesetting will benefit from unlearning this quaint Victorian habit. As a general rule, no more than a single space is required after a period, a colon, or any other mark of punctuation.
~ Robert Bringhurst, The Elements of Typographic Style