Truck Tech: You've Got Mail (and TLS and other stuff)

August 27, 2016 at 2:21 PM

Source: Amalgamated from Jim Browne Chevy and Bob 'n' Bee. Basically, I took a way nicer, newer version of my truck and Photoshopped someone else's mail icon onto it.

I've gotten into this wonderful rhythm of publishing posts that are tragically untimely. My Spring Cleaning post was five days too late, I'd had the insulation and skylight for weeks before I'd recounted the tale, and I'd successfully picked up and put down my weight goals way before I picked up the pen and put down a post. So it's only fitting that I'm just getting around to talking about a feature I added to the site over eight months ago.

Every so often, I get bored and fidgety and change things around on the site. Sometimes I'll tweak the savings calculation, sometimes I'll move around items on the mobile layout, and yet other times I'll sit down with a coffee and some music and plop a few new features into existence. The title and picture probably provide plenty perspective, but this time around, I added email subscriptions. I noted in this post (a relative eternity ago) that I was thinking about adding them because, in theory, it's pretty simple. Someone gives me their email address and clicks a button, then they get an email every time I write a new post. Unfortunately, like pretty much every problem I've sized up thus far, it wasn't nearly as simple as I was expecting, and I definitely hit a few snags along the way.

Forewarning: this is another post for my fellow nerds. For anyone else, it's a valid substitute for a tranquilizer.

The Nitty Grity Technical Details

Depending on how closely you've been following my various love affairs with trendy web technologies, you may or may not know that my site runs on Google's App Engine platform. Knowing very little about how email works, it logically seemed like the App Engine Mail API documentation could maybe potentially be a good place to start. The more I read though, the more foreign concepts I wandered across. What is DKIM? How does SPF work? Do I need an alias? Whose SMTP server am I using? I started to wonder, how bad do I really want email subscriptions anyway? But I, with ample hesitation, plodded down the path anyway.

DKIM

The ideas behind DKIM are pretty similar to the ones that power TLS and consequently all the important things that happen on the Internet. My understanding of DKIM is that, like, you have a DNS record saying "Yo, this is my public key, use it if you want to see if an email from me is legit" and anyone can see that when they look up records for your domain name (FromInsideTheBox.com in my case). Then, every time I send a message [at] frominsidethebox [dot] com, AppEngine adds a special magic signature in the header that's signed with the corresponding private key, and the recipient can use the public key to verify that the signature makes sense and thus the email must have really been sent by me. Cool stuff.

SPF

Turns out it's not just for sunscreen anymore. SPF works kind of like DKIM, minus all of the fancy encryption stuff. SPF works by adding another DNS record, but this one says "Bro, if you get a message from my buddy App Engine, you can trust it, he's chill." This way, you can send mail from someone else's mail server and still show that it can be trusted.

At this point, you may be wondering a few things, like Why are all of these things necessary? and Why is Brandon personifying computing protocols and making them sound like frat bros? To answer the former question (and ignore the latter), it's because the Internet is a giant pile of dusty, wobbling, unstable, duct-taped together systems stacked on top of each other like some shoddy version of digital Jenga. The underlying protocol for sending emails reared its head in 1982, and offered literally no way to authenticate where the messages are coming from. It makes sense, it evolved at a time when the "Internet" consisted entirely of academics and the military, so you could just trust the network. It wasn't like some drunk researcher was going to send prank emails, like:

From: Ronald Reagan <rreagan@whitehouse.gov>

To: Alexander Haig <ahaig@whitehouse.gov>

Subject: fire the nukes

Date: April 1, 1985

Ay Hags,

Launch one of those suckers at the moon, I wanna make it rain nacho cheese.

LOL,
Ya boy El Prezzo

...so yeah. That'd definitely be a problem today, thus DKIM and SPF.

CAN SPAM

There was one other acronym I had forgotten to take into consideration: CAN SPAM. CAN SPAM is a US law signed in 2003. It's also the reason all of the (legitimate) emails you get have an "Unsubscribe" link. Not wanting to be at the mercy of the US Government (well, any more than my taxes already cause me to be), I figured it prudent to add one of these "Unsubscribe" links myself. That takes work and effort though, now we're not just blindly (and indefinitely) sending out emails to anyone who sends me an address. Instead, now we have to maintain an active database, and remove people when they ask.

I'm making this sound harder and more complicated than it actually was. All I had to do was add a few web endpoints, and add a new type to my Datastore. The end code (removing the extra boring parts) looked something like:

// Email actions
http.HandleFunc("/subscribe", confirmationHandler)
http.HandleFunc("/confirm", subscribeHandler)
http.HandleFunc("/unsubscribe", unsubscribeHandler)

// When someone submits their email
func confirmationHandler(w http.ResponseWriter, r *http.Request) {
  address := strings.TrimSpace(r.PostFormValue("email"))

  // Do a few basic validations

  exists, err := db.EMailExists(address)
  if err != nil || exists{
    // Return an error
  }

  id, err := db.CreateEMail(address)
  if err != nil {
    // Return a different error
  }

  data := struct {
    ID int64
  }{
    id,
  }
  var buf bytes.Buffer
  if err := emailTemplate.ExecuteTemplate(&buf, "confirm.html", data); err != nil {
    // Yet another error
  }
// Send email with data in buffer, that basically says "Click here to subscribe"
}

// Someone clicking the confirm link I sent them
func subscribeHandler(w http.ResponseWriter, r *http.Request) {
  id, err := strconv.ParseInt(r.FormValue("key"), 10, 64)
  if err != nil {
    // Uh oh, looks like nobody is subscribing today
  }

  if err := db.ConfirmEMail(id); err != nil {
    // Experiencing some technical difficulties
  }

  render(w, BasicTemplate{
    Description: "Your subscription has been added.",
    Header:      "You're all set!",
    Message:     "Good to go.",
    Type:        "success",
  })
}

// Someone doesn't want my ramblings in their inbox anymore
func unsubscribeHandler(w http.ResponseWriter, r *http.Request) {
  id, err := strconv.ParseInt(r.FormValue("key"), 10, 64)
  if err != nil {
    // Uh oh, they aren't going to like this
  }

  if err := db.DeleteEMail(id); err != nil {
    // Looks like they're stuck with me until I learn how to program better
  }

 render(w, BasicTemplate{
    Description: "Your subscription has been removed.",
    Header:      "You're all set!",
    Message:     "Good to go.",
    Type:        "success",
  })
}

// The struct used to hold subscriptions
type Subscription struct {
  Address   string
  Confirmed bool
  Added     time.Time
  ID        int64 `datastore:"-"`
}

Woah Brandon, when did you add highlighted code snippets to the site?!

I'm glad you noticed, hypothetical reader! I added it a few months ago, but this is the first post to actually utilize it. It uses highlight.js behind the scenes, and took like five lines of code to set up. But I'm getting sidetracked here. As for the code above, that's really all I needed to satisfy the CAN SPAM act requirements, all in all not too bad. Granted, that doesn't include the code for actually sending emails, but we're getting there.

Actually Sending Mail

Okay, so at this point in the narrative, we've set up DKIM and SPF to work with App Engine, and we've got a storage system set up for subscriptions. According to the documentation, all we need to do now is add:

subs, err := db.Subscriptions(ctx)
if err != nil {
  // Looks like no emails are going out today
}

for _, sub := range subs {
  data := struct {
    ID int64
  }{
    sub.ID,
  }
  var buf bytes.Buffer
  if err := emailTemplate.ExecuteTemplate(&buf, "new_post.html", data); err != nil {
    // Guess we're not sending this one
  }
  mail.Send(ctx, &mail.Message{
    Sender: "post-notifier@frominsidethebox.com",
    To: sub.Address,
    Subject: "New Post on From Inside The Box: " + post.Title,
    Body: buf.String(),
  })
}

Hook that up to some authenticated web endpoint, pass in the post ID as a parameter, and boom, you've got a working subscription system. Shockingly enough, that's pretty much all it took to get working. Not only did it work, it worked on (nearly) the first try, no less. In fact, it was so easy that I was sure something would break in the not-too-distant future, as my experience with trucks and things and life in general has shown me to be universally true.

Mistake #1: Not Reading the Manual

A bit of background: I have a special button I push when I want to send out emails to subscribers. It's separate from the special button for publishing posts, because in the event that I accidentally publish something too early, I'd rather not spam everyone with my half-baked ramblings. And if anything goes wrong in the email-sending process after I push my special button, the server will return a message. For a month or two, this process worked well. Things were going along swimmingly, and then one time I pushed the special email button, and things stopped swimming. I got an error that of my 106 subscription emails, 6 of them didn't send.

Huh, that's a bit suspicious, exactly 100 emails sent successfully...

After the slightest bit of sleuthing, I found out that App Engine has some limits and quotas, a particualrly relevant one being that I can't send more than 100 emails a day. This works fine when ≤100 people are subscribed, but not so well when 106 people are subscribed. Sorry to the 6 people who didn't receive an email when I put up this post, unfortunately my logging was also bad enough that I didn't know which six subscribers didn't get a letter, so the best thing I could do was fix the problem for the future. Enter stage right, SendGrid.

SendGrid is a mail-delivery service. They do lots of other things too, but those aren't quite as relevant or interesting to me. All I care about was that they'd happily send way more than 100 emails. So I ripped out the App Engine mailing code, and pulled in SendGrid's Go client library. It took about an hour to throw together, and the only real difference is that I have to pass the SendGrid API key, which I embed directly in my code because I have no sense for design or regard for security.*

Mistake #2: Just Generally Being Incompetent

If you look at the mail snippet above, you might notice one teensy little problem, mainly that I'm just iterating over all the subscriptions and making a blocking call to send a message. This is fine when you only have a few users, but it scales linearly with the number of subscriptions. Not only that, but most of the time spent sending each message is waiting for the request to trot off to The Internet At Large™ and mosey on back. This is a shame, because Go has all these fancy concurrency primitives that I'm not taking advantage of, and would be particularly well-suited to this embarrassingly parallel problem. Using goroutines, we can fire off all the requests, which will then do all that waiting in the background. Similar to my last performance post, it means we can send all of the emails in (almost, kinda sort if you squint a little bit) constant time, instead of watching it get slower as I get more subscribers.

Sounds great right? Well it would be if I was a less shoddy programmer. Concurrency can be tricky to get right (even in a language built for it) if you aren't being careful, and I wasn't thinking. Here's the problem: each email is slightly different because the unsubscribe link has a unique ID for each subscriber, so I can't fire off the same email for everyone. See if you can spot where I went wrong in this first implementation:

var buf bytes.Buffer
data := struct {
  Unsub  int64
  PostID int64
  Desc   string
}{
  0,
  post.ID,
  post.Desc(),
}

var wg sync.WaitGroup
// Iterate over list of subscribers
for _, email := range emails {
  wg.Add(1)
  go func(email *EMail) {
    defer wg.Done()
    data.Unsub = email.ID
    if err := emailTemplate.ExecuteTemplate(&buf, "new_post.html", data); err != nil {
      // Always check your errors kids
    }

    message := sendgrid.NewMail()
    // Build the rest of the message here

    if err := sg.Send(message); err != nil {
      // Log the errors and whatnot
    }

    buf.Reset()
  }(email)
}
// Wait for all the emails to be sent
wg.Wait()

...did you see the ~~gorilla~~ mistake? I thought I was being clever by using one buffer for the emails to save memory, but I was really setting myself up for an awful and awfully obvious race condition by writing to the same buffer from 100 different goroutines with no synchronization whatsoever. The end result could have been a number of problems of varying unpleasantness, but what happened, in reality, was that every person got an email containing the emails for EVERYONE concatenated together. Aside from just looking silly, it means that anyone who got one of those emails could have (and still can) click each and every one of the unsubscribe links in the email and unsubscribe >100 people from my mailing list. I greatly appreciate everyone continuing to not do that, but I regularly back up the mailing list just in case.

The first thing I did to fix this was to give each goroutine its own buffer. Then, I wrote some beefy unit and integration tests for the mailing system. Once I was satisfied with the passing tests, I ripped out all the mailing code, and replaced it with a pool of mailer threads that get fed subscriber IDs via a channel. That way, I only need as many buffers as workers in the pool. The first few attempts caused the tests to fail for one reason or another (malformed message bodies, deadlocks, etc), but I eventually got it working. I'm going to skip including that code, because this post is already probably too long, but you get the idea.

Last Thing: TLS

All this talk of email is great, but there was one oversight on my part that I'd been avoiding because it was convenient to do so. My site was being served over HTTP, not HTTPS, so if you clicked the "Subscribe" button, it'd be sending your email address over the open Internet. Not a huge deal, but it's 2016 and I can do better than that. So with that in mind (and a bit of prodding from a friend), I started looking up the process for getting TLS on a custom domain with App Engine. I also wasn't ecstatic about the idea of paying a signing authority for a certificate, so I looked into Let's Encrypt. I ended up finding a few great guides on how to set it all up, then I just had to switch my image serving over to HTTPS, and add a "secure: true" in my App Engine app.yaml configuration so that it would always redirect to the HTTPS version of the site and that was it! I don't know about you, but the green lock in the address bar gives me a warm, fuzzy feeling.

*I'm joking, of course. What kind of idiot would mix credentials with code? Kidding again, I'm that kind of idiot and that's exactly what I do, but I swear I have every intention of having a separate Datastore table to store all of the private keys I need for the application. In the mean time, my source code is stored in a private Github repository, only accessible via two-factor or private key authentication. Granted, I'm still putting quite a bit of faith in Github here, but the stakes aren't all that high for this particular case.

Site Stuff Truck Tech

Thoughts from Inside the Box