Oct 12, 2015

Chef seemed like a big hack when I first started with it. Chef "cookbooks" have "recipes" and there's something called "kitchens" and "data bags" and servers are called "nodes." Recipes seem like the important things - they define what setup will happen on your servers.

Recipes are Ruby scripts written at the Kernel level, not reasonably contained in classes or modules like one would expect. You can include other recipes that define "lightweight resource providers" - neatly acronymed to the unpronounceable LWRP. These define helper methods also at the top level, and make them available to your recipe. What's to keep one recipe's methods from clobbering another's? Nothing, as far as I can tell.
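
To make the clobbering worry concrete, here is a plain-Ruby sketch. It is not how Chef actually loads recipes or LWRPs; the RecipeContext class and both helper snippets are made up for illustration. It only shows what happens when two pieces of code define a same-named method in one shared scope: the last definition quietly wins.

```ruby
# Hypothetical stand-in for a shared scope that several cookbooks
# evaluate their helper code in. This is NOT Chef's real loader.
class RecipeContext; end

# Two imaginary cookbooks that each define a helper called `install_dir`.
cookbook_a = "def install_dir; '/opt/app'; end"
cookbook_b = "def install_dir; '/usr/local/app'; end"

RecipeContext.class_eval(cookbook_a)
first = RecipeContext.new.install_dir

RecipeContext.class_eval(cookbook_b)
second = RecipeContext.new.install_dir

puts first   # /opt/app
puts second  # /usr/local/app (cookbook A's helper is gone, and nothing raised)
```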

Recipes run with "attributes," which can be specified in a number of different ways:

  • on the node itself: scoped for a specific box
  • inside a "role:" for every server of an arbitrary type
  • inside an "environment:" for overriding attributes on dev/stg/prd
  • from the recipe's defaults

Last time I used Chef, attributes had to be defined in a JSON file - an unusual choice for Ruby, which usually goes with YAML. Now apparently there's a Ruby DSL, which uses Hashies, which also appear to run at the Kernel level. I couldn't get it to work in my setup. Chef munges these different levels together with something like inheritance - defaults get overridden in a seemingly sensible order. Unless you told them to override each other somewhere. Then whatever happens, happens.
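
Roughly, the munging looks like a layered hash merge. The sketch below is plain Ruby, not Chef code, and the layer order (recipe defaults, then role, then environment, then node) is my simplifying assumption; Chef's real precedence rules have more levels than this. The deep_merge helper and the attribute names are made up for the example.

```ruby
# Simplified, hypothetical layering: later layers win.
# Chef's actual precedence table is more involved than this.
def deep_merge(base, override)
  base.merge(override) do |_key, old, new|
    old.is_a?(Hash) && new.is_a?(Hash) ? deep_merge(old, new) : new
  end
end

recipe_defaults = { 'postgres' => { 'version' => '9.3', 'port' => 5432 } }
role            = { 'postgres' => { 'port' => 6543 } }
environment     = { 'postgres' => { 'version' => '9.4' } }
node            = { 'postgres' => { 'data_dir' => '/srv/pg' } }

layers = [recipe_defaults, role, environment, node]
attributes = layers.reduce { |merged, layer| deep_merge(merged, layer) }

puts attributes['postgres']['version']   # 9.4
puts attributes['postgres']['port']      # 6543
puts attributes['postgres']['data_dir']  # /srv/pg
```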

"Data bags" are an arbitrary set of JSON objects. Or is the Ruby DSL supposed to work there, too? I dunno. Anyway, they store arbitrary data you can access from anywhere in any recipe, and who doesn't love global state? They seem necessary for things like usernames and keys, so I can forgive some globalization.

This seems like a good enough structure / convention, until you start relying on external recipes. Chef has apparently adopted Berkshelf, a kind of Bundler for Chef. You can browse available cookbooks at "the supermarket:" are you tired of the metaphors yet?

The problem here is that recipe names are not unique or consistent! I was using an rbenv recipe. But then I cloned my Chef repo on a new machine, ran berks install, and ended up with a totally different cookbook! I mean, what the hell guys? You can't just pull the rug out like that. It's rude.

Sure, I could vendor said recipes and store them with my repo. Like an animal. But we don't do that with Bundler, because it seems like the absolute bloody least a package manager can do. Even Bower can handle that much, and basically all it does is clone repos from Github.

These cookbooks often operate in totally different ways. Many cookbooks include a recipe you can run with all the setup included; i's dotted and t's crossed. They install something like Postgres 9.3 from a package manager or source with a configuration specified in the munged-together attributes for your box. Others rely on stuff in data bags, and you have to specify a node name in the data bag attributes or something awful. Some cookbooks barely have any recipes and you have to write your own recipe using their LWRPs, even if attributes would be totally sensible.

Coming back to Chef a few months after doing my last server setup, it seems like they are trying to make progress: using a consistent Ruby DSL rather than JSON, making a package manager official, etc. But in the process it's become even more of a nightmarish hack. The best practices keep shifting, and the cookbook maintainers aren't keeping up. You can't use any tutorials or guides more than a few months old - they'll recommend outdated practices that will leave you more confused about the "right" way to do things. Examples include installing Berkshelf as a gem when it now requires the ChefDK, using Librarian-Chef despite adoption of Berkshelf, storing everything in data bags instead of attributes, etc, etc, etc.

Honestly, I'm just not feeling Chef any more. Alternatives like Ansible, Puppet, and even Fucking Shell Scripts are not exactly inspiring. Docker is not for system configuration, even though it kinda looks like it is. It's for isolating an app environment, and configuring a sub-system for that. Maybe otto is the way to go? But damn, their config syntax is weirder than anything else I've seen so far.

I'm feeling pretty lost, overall.

Oct 1, 2015

Everything you know about html_safe is wrong.

As pointed out in the World of Rails Security talk at RailsConf this year, even the name is kind of crap. Calling .html_safe on some string sounds kind of like it would make said string safe to put in your HTML. In fact, it does the opposite.

Essentially, you need to ensure that every bit of user-supplied content is escaped before it is output. The defaults make things pretty safe: form inputs, links, etc. are all escaped by default. There are a few small holes, though.

Safe

  • link_to user_name, 'http://hired.com'
  • image_tag user_image, alt: user_image_title
  • HAML: .xs-block= user_text
  • ERB: <%= user_text %>

Not Safe

  • link_to user.name, user_entered_url
  • .flashbar= flash[:alert].html_safe # with, say, username included
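
Here's a rough, Rails-free sketch of why the name is backwards. It uses only ERB::Util.html_escape from the standard library; the TrustedString class and the render method are hypothetical stand-ins for the idea of "this string is marked as safe, so skip escaping", not Rails' actual implementation.

```ruby
require 'erb'

# Hypothetical marker class: a string flagged as "already safe, do not escape".
class TrustedString < String; end

# Hypothetical render step: escape everything unless it was marked trusted.
def render(value)
  value.is_a?(TrustedString) ? value.to_s : ERB::Util.html_escape(value)
end

user_text = '<script>alert(1)</script>'

escaped   = render(user_text)                     # default path: escaped
unescaped = render(TrustedString.new(user_text))  # "safe" path: passed through raw

puts escaped    # &lt;script&gt;alert(1)&lt;/script&gt;
puts unescaped  # <script>alert(1)</script>
```

Marking something as safe does not clean it; it just turns the escaping off for that string.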

Sep 23, 2015

Fight for the Future wrote an open letter to Salesforce/Heroku regarding their endorsement of the Cybersecurity Information Sharing Act (pdf link). The bill would, according to FFTF, leak personally identifying information to DHS, NSA, etc.

The first sentence of the letter bothered me, though:

I was disappointed to learn that Salesforce joined Apple, Microsoft, and other tech giants last week in endorsing the Cybersecurity Information Sharing Act of 2015 (CISA).

Apple is proud of their lack of knowledge about you. They encrypt a lot of things by default. They have a tendency to use random device identifiers instead of linking things to an online account, which is better security but causes annoying bugs and edge cases for users. Tim Cook has specifically touted privacy and encryption as advantages of using Apple devices and software. The FBI has given Apple flak for using good encryption, and there were rumors they would take Apple to court.

Has Apple reversed their stance? Are they lying to their customers? I haven't seen them do that, ever. It would be really weird if they started now.

Oh, wait, they're not:

Microsoft and Apple, two of the world's largest software companies, did not directly endorse CISA. They -- along with Adobe, Autodesk, IBM, Symantec, and others—signed the letter from BSA | The Software Alliance generally encouraging the passage of data-sharing legislation. They also specifically praised four other bills, two of which focused on electronic communications privacy.

But who cares about the details, right? Get outraged! Get mad! Go to the window, open it, stick your head out and yell: "I'm as mad as hell, and I'm not going to take this any more!"

The second sentence of the letter is also problematic:

This legislation would grant blanket immunity for American companies to participate in government mass surveillance programs like PRISM...

This implies a conflation I've seen around the internet a lot: that Apple willingly and knowingly participated in an NSA data-harvesting program codenamed PRISM because Apple's name appeared on one of the Snowden-leaked slides about the program. Also appearing: Google, Microsoft, Facebook, etc.

Apple responded that they did not participate knowingly or willingly. Google said the same thing. Microsoft spouted some weasel words; damage control as opposed to "what the fuck?!"

The NSA may have been using the OpenSSL "Heartbleed" bug for some or all of the data collection from these companies. Apple issued a patch for that bug with timing that subtly suggests it was in response to PRISM - pure speculation, but plausible.

Point is, if the three-letter agencies were using exploits like Heartbleed, they wouldn't tell Apple or Google. To all appearances, Apple and Google didn't know anything about PRISM. The FFTF letter is making a weird insinuation that Apple, Google, and other companies would knowingly participate in such a scheme if the bill were passed.

I'm sick and tired of web sites, Twitter, news, etc. telling me to be outraged. Virtually all of them reduce big, complex issues to sound bites so we can get mad about them. I flat-out refuse to have any reaction (positive or negative) to anything "outrageous" I find on the internet, until I've done my own homework.

Aug 5, 2015

It started with TextMate when I first discovered Ruby on Rails in 2006 or so. TextMate went for ages without an update, Sublime Text was getting popular, and appeared to have mostly-complete compatibility with TextMate, so I switched.

Now Sublime has finally annoyed me. The Ruby and Haml packages just try too hard to be helpful, throwing brackets and indents around like there's no tomorrow, often in places I don't even want them. Time to try out Atom, especially since Github had a rather amusing video about it.

It takes quite a few packages to get up to the level I had Sublime at, but I think I'm basically there. Here's my setup:

  • Sync Settings - back up your Atom settings to Gist. Here's mine. Like dotfiles, these are meant to be shared. In Sublime this was a PITA involving symlinking things to Dropbox.
  • Sublime Word Navigation - nothing is more frustrating than having to hit alt+← twice just to get past a stupid dash.
  • Editorconfig - keep your coding style consistent.
  • Local Settings - I've wanted this in Sublime for ages. Simple things like max line length, soft wrap settings, and even package settings like "should RubyTest use rspec or zeus" on a per-project basis.
  • RubyTest - speaking of... Does everything I need from Sublime's RubyTest, just had to re-map the keyboard shortcuts.
  • Pigments - shows CSS colors in the editor, an alternative to Sublime's GutterColor.
  • Aligner - works way better than Sublime's AlignTab package.
  • Git History - step through the history of any file.
  • Git Blame - shows the last committer for each line in the gutter. Unfortunately, the gutter is too small for many names, so it craps out and shows "min". Also, the gutter can't keep up with the main window's scrolling, which is janky.
  • Git Plus - I still end up doing Git on the command line. This often didn't support the stuff I need to do on a daily basis.
  • Language-haml - if you're unfortunate enough to have to deal with HAML, this kinda helps. Like putting a band-aid on a bullet wound.
  • Rails Transporter - this is a nice idea, but it still doesn't cover the functionality that Sublime's RubyTest had. cmd+. would let you jump from a file to the spec file and back, and transporter just gives up if you're in a namespace, form object, worker, etc.

How's it working out? Well, Atom still feels a bit unpolished overall. Some of the packages above don't work quite right, or aren't as helpful as they advertise. And Atom's auto-completion is annoying as bloody hell. It seems to use ctags or some variant, so it pulls in all symbols from everywhere, and the one I want is never even close to the top. And it pops up on every. single. thing. I. type. in a big flashy multi-colored box that randomly switches whether it's above or below the cursor.

Finally, the quick-tab-switch is terrible compared to Sublime's. Its fuzzy matching is way worse, it ignores punctuation like underscores, and definitely maintains no concept of how "nearby" a file is, nor how recently I've opened it.

I might switch back.

May 26, 2015

Or: I'm getting too old for magic tricks

So much of what we try to do is get to a point where the solution seems inevitable: you know, you think "of course it's that way, why would it be any other way?" It looks so obvious, but that sense of inevitability in the solution is really hard to achieve.

~ Jony Ive, July 2003

I've been doing Rails for nearly a decade. I've seen bits of magic come and go, I've written too-fancy abstractions that leak like sieves, and mostly I've worked both solo and on teams. I've come to like boring code. Code with little to no magic, that looks "enterprisey," that has too many classes and objects, and uses boring old things like inheritance instead of composition.

Boring code is easy to read, and easy to debug. When you don't define methods and classes dynamically, you can actually use the stacktrace. When you don't use mixins, modules and concerns, you never have to wonder where a method is defined. You can grep your codebase. When you separate domain logic from the underlying technology, it's very clear what is happening where.

That's very helpful for working on teams. Everyone should be able to read and understand your code. The ability for someone else to understand and work with your code has an inverse, exponential correlation with the number of files, objects, and messages between input and output. Layers of indirection and metaprogrammed magic make the curve even steeper.

I want to make it really hard for the most annoying, stupid member of my team to screw it up: future me. Me in three months, when I've lost context and forgotten why I wrote any of this, or how. I want him to pick it up. Maybe he'll say "man, this code is stodgy," but he'll understand it immediately.

Let's get to work.

No side effects in model code

Code in your models should not change any other models, send emails, call APIs, or write to anything other than the primary data store. Especially in callbacks.

Callbacks are great for setting and verifying internal state. A callback to normalize a url, email, or url slug is great. You're just ensuring the model's data is consistent. A callback to send an email is total bullshit. There will be times, probably many of them, when you do not want to send that email. Data migrations, actions from admins, a hundred other cases. Put those actions in another class, or make a method that is never called automatically. Force yourself to be explicit about when that is happening in your controllers, background workers, etc.

Of course there are exceptions. touch: true is generally fine, as long as the touched model has no side effects on update.
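
A minimal sketch of what I mean, in plain Ruby so it runs without Rails. Everything here is hypothetical: in a real app User would be an ActiveRecord model with a normalizing callback, FakeMailer would be an ActionMailer class, and SignupService is just one possible name for "another class".

```ruby
# Plain-Ruby stand-in for a model. Normalizing its own data is fine.
class User
  attr_reader :email

  def initialize(email)
    # Internal consistency only, the kind of work a callback is good for.
    @email = email.strip.downcase
  end
end

# Stand-in mailer that records who would have been emailed.
class FakeMailer
  attr_reader :sent

  def initialize
    @sent = []
  end

  def welcome(user)
    @sent << user.email
  end
end

# The side effect lives in its own class and only runs when called explicitly.
class SignupService
  def initialize(mailer)
    @mailer = mailer
  end

  def call(email)
    user = User.new(email)
    @mailer.welcome(user)
    user
  end
end

mailer    = FakeMailer.new
imported  = User.new('  Old@Example.com ')                   # data migration: no email
signed_up = SignupService.new(mailer).call('New@Example.com') # real signup: email sent

puts imported.email       # old@example.com
puts mailer.sent.inspect  # ["new@example.com"]
```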

No observers

Observers were removed in Rails 4 for a reason. They are invisible logic that no one knows to anticipate. Use explicit calls in controllers or workers.

No default scopes

When you write an ActiveRecord query, you should see exactly what it does. No one should have to wonder why they are getting unexpected ordering, joins or n+1 queries.

No state machines for models

Everyone thinks state machines for your models are a great idea, and I've no idea why. Look at all these state machines. These put your business logic inside your models. That's great, right? I mean, it gets them out of the controller. But models are not your junk drawer for business logic.

Models will get to invalid states, as inevitably as the fucking tides. The business logic will change. You will deploy bugs. Then you have to do some ugly hack like update_columns status: 'fml' to herd them back into line. You have to do a ton of setup in tests. State machines define tons of magic methods. Guard methods, state-specific methods, and transitions will fail.

State machines are for in-line processing. Regular Expressions are a great example. They are not for asynchronous changes over time that sync to an external service like a database.

Just use a bloody string field, or better yet an ActiveRecord Enum. You can use conditional validations, but really you should put your business logic elsewhere.
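
Something like this, sketched in plain Ruby so it runs on its own. The Order and ShipOrder classes and the status names are hypothetical; in Rails the model half would be roughly an ActiveRecord enum or a string column with an inclusion validation.

```ruby
# The model only guards its own data: the status must be a known value.
class Order
  STATUSES = %w[pending paid shipped].freeze

  attr_reader :status

  def initialize(status = 'pending')
    self.status = status
  end

  def status=(value)
    raise ArgumentError, "unknown status: #{value}" unless STATUSES.include?(value)
    @status = value
  end
end

# Rules about *when* a status may change live outside the model.
class ShipOrder
  def call(order)
    return false unless order.status == 'paid'
    order.status = 'shipped'
    true
  end
end

order = Order.new
too_early = ShipOrder.new.call(order)  # refused, order is still pending
order.status = 'paid'                  # an admin or fix-up script can set it directly
shipped = ShipOrder.new.call(order)

puts too_early     # false
puts shipped       # true
puts order.status  # shipped
```

There are no generated methods here, so when a record ends up in a weird state you set the field, and everything that touches it can be found with grep.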

Avoid instance variables in views & helpers

I write partials like this:

<%# app/views/blog_posts/_byline.html.erb %>
<% post = local_assigns[:post] || @post %>

<div class="byline">
  <span class="author-avatar"><%= fetch_author_avatar(post.author) %></span>
  <span class="author-name"><%= post.author.name.titleize %></span>
  <span class="post-date"><%= localize post.updated_at %></span>
</div>

# app/helpers/blog_posts_helper.rb
module BlogPostsHelper
  def fetch_author_avatar(author)
    # ...
  end
end

Even that's not great, since post.author may be an n+1 query, but that's manageable with the Bullet gem.

Explicitly declaring variables and passing dependencies downward makes it crystal clear where everything is coming from. When you want to render this partial in some other view, and you inevitably will, you won't have to dig through the whole chain and figure out what to set in the controller. Instance variables are effectively global variables for the view scope, and nobody likes globals.

Locals are excellent for making sure your partial doesn't depend on instance variables, but they're bloody annoying when it isn't clear where they're coming from. The local_assigns hash prevents cryptic undefined method errors, makes the partial's dependencies explicit, and allows you to override them when you're using the instance variable for something else. I even pull a local out of this hash for the partial-name-variable passed in with render partial: 'my_partial', object: obj - byline in this case. This allows for defensive coding, sensible defaults, and makes it an explicit dependency.

Helpers that depend on instance variables are less clear and less reusable than helpers with arguments. They compound the problem of instance variables in views or partials, since they're not immediately visible when looking at the view code.

No view helpers in models or controllers

"Convention over configuration" is one of the huge benefits of Ruby on Rails. You don't wonder where to put this or that bit of code, and other devs don't wonder where to find it. If you have a method on a model that formats a name so it can be used in a view, you've made it harder for anyone else to find. Same thing if you define a helper in a controller that is used in the view.

Use additional conventions

Some really smart people in the Rails community have invented more specialized objects for parts of a Rails app, and they had some good reasons. Form Objects, Service Objects, Presenters, and other conventions exist to help you keep your code clean and DRY.

Don't always or dogmatically use these things - a form to update a string in a model doesn't need a form object. A controller that saves one model and makes an API call doesn't need a service object. But when code gets re-used or specialized, these can be super helpful. Having more conventions for your team helps keep it obvious where any given piece of code is or should be.

Don't go too far, either - I think Trailblazer or Hexagonal Architecture make it harder for Rails devs to understand where things are, and tempt you into using more magic to wire everything up.

Remember that abstractions hurt

All abstractions leak, and these are some of the most aggravating bugs to deal with. You end up poring through someone else's source code trying to figure out what the hell is going on. Not to pick on Trailblazer (it really does look interesting), but when I saw the contract / validation DSL I immediately shook my head. Knowing when something is invoked and how is pretty important. The more of that you have to keep in your head, the less working memory you have for actually writing your code.

To justify an abstraction, it has to be 10x easier than operating without it. Not using the abstraction has to be so painful that you're actively losing hair over it.

For example, this is my main issue with HAML. It's a big abstraction - it takes you very far away from the actual HTML you want to render - and the only value it provides is "it's pretty." And it's not even pretty, for non-trivial apps. If you use BEM notation, any amount of data attributes, conditional classes, or I18n, you end up with Perl-like punctuation soup. You can't even add arbitrary white space to make it more readable.

Sass (in its SCSS form) is a great counter-example. Lacking variables, comprehensions, and clear inheritance is a massive pain when writing CSS. Sass keeps you pretty close to the generated CSS, and provides 100x the power.

DSLs, Concerns, transpiled languages, and syntax sugar gems are all suspect. Be mindful about when and how you introduce new layers of abstraction.

Don't monkey patch

Duh. Use Decorators to make it explicit where your methods are coming from.
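
For instance, a decorator built on SimpleDelegator from Ruby's standard library. The Author struct and AuthorDecorator are hypothetical names for this sketch; the point is that the extra method lives in one named class instead of being bolted onto someone else's.

```ruby
require 'delegate'

# A plain object we do not want to reopen or patch.
Author = Struct.new(:first_name, :last_name)

# The decorator wraps an Author and adds presentation-only behavior.
class AuthorDecorator < SimpleDelegator
  def full_name
    "#{first_name} #{last_name}"
  end
end

author = AuthorDecorator.new(Author.new('Ada', 'Lovelace'))

puts author.full_name   # Ada Lovelace
puts author.first_name  # Ada (delegated to the wrapped Author)
puts AuthorDecorator.instance_method(:full_name).owner  # AuthorDecorator
```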

These are all very general guidelines. Rules are meant to be broken, and you totally should if it makes your code 10x easier. I'll add more if I can think of anything else.

May 8, 2015

My experience with analytics & measurement:

The Analytics Cycle

May 6, 2015

I've been getting into Ruby & other software engineering talks lately, as they complement my usual diet of quantum physics, neuroscience, and social psychology lectures. I'm not actually that smart, a lot of it goes over my head, but sometimes I get concepts and other times they prompt me to poke through Wolfram Alpha, Wikipedia, etc.

Anyway, Technical Talks:

Starts off with some fun ranting about some bad code, then gets real. Fantastic.

How people on dev teams interact, and how to maintain sanity.

How to design your Ruby gem so people will actually want to use it.

How to make sure your open source project doesn't die, and actually get other people to contribute.

Why estimation is important and how it goes wrong.

Gets into some of the low-level capabilities the Ruby engine gives you. I thought I knew a lot about Kernel and Object methods, but this taught me otherwise.

I guess most of these are "soft talks" in that they aren't about some new library or specific functions of programming. But these topics are critical to working on a team. Even if you know some or all the material, it's worth a refresher course now and again.

May 3, 2015

I had the pleasure of attending RailsConf 2015 this year with my company Hired.

It was exhausting.

That's the biggest thing I learned - I haven't been to many tech conferences, and I've only once been paid to fly somewhere else on business. The factors added up, and I spent nearly every minute tired, exhausted, and not so functional.

  • 5-hour flight on Monday
  • Jet lag
  • Being out of my familiar places
  • Going to talks all day, and trying to learn something at each of them
  • Socializing during any breaks or downtime
  • Syncing up with the team
  • Trying to bang out some code here and there
  • The Hired semi-official after party on Weds

For an introvert like me, that kind of chaos took everything I had.

The upshot was, there were some great talks, and I met a lot of cool fellow Rubyists. We bounced ideas and war stories off each other during lunch & breaks, talked about our respective companies, and made the place more of a community than a business conference.

Favorite Talks

The videos are up on ConFreaks, who are by far the best conference-talk recording people I've ever seen. Some of my favorites:

A few of my notes

  • DHH's motivation on Rails is making a framework for small teams that does 90% of what you need. Twitter took this as him crapping on microservices and front-end frameworks, which he did a little bit. I respect that motivation, as that's what let me pick up Rails and do cool stuff with it in the first place. And for small teams (<100ish) it allows everyone to be self-sufficient.

  • He mentioned Writeboard as a terrible experience developing a microservice. Couldn't agree more - it was a PITA to use. I've gone down a similar path with at least one team, and the added overhead becomes awful.

  • If you're running an open source project, respond to ALL pull requests within 48hrs. If you wait more than a week, they'll never contribute to you again.

  • Don't hoard the keys to your open source project - make sure someone else has access to the domain, can publish the gem, etc.

  • Kubernetes is pronounced kübêrnêtēs - thanks to Aja "Thagomizer" for the clarification. And the quick intro to Rails on Docker.

  • Model callbacks in Rails 5 will not halt the callback chain unless you explicitly throw(:abort) . For a ridiculously long discussion why, check out this ginormous PR.

  • From Koichi - keyword params are still 2x slower than normal params (30x slower on Ruby 2.0)

  • I left the "Crossing the Bridge" talk on Rails with client-side js frameworks. The architecture he outlined (ActiveRecord, ActiveModel::Serializers, ng-rails-resource) is terrible. 20x the overhead of server-side rendering, and your client-side app ends up a disastrous mess.

  • I did get to talk to Mike Perham, the creator of Sidekiq. We had an interesting chat about memory usage and Ruby GC. I was hoping that the OS would clean up memory used by a separate thread -- i.e., ending a Sidekiq job cleans out memory much faster than letting Ruby's GC run. Unfortunately, that's not the case, and there's still no real way to predict when Ruby GC will run, short of calling it manually.
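
On the throw(:abort) note above: the mechanism underneath is Ruby's ordinary catch/throw. The run_callbacks method below is a hypothetical mini runner, not Rails' implementation; it just shows a chain that ignores a false return value and only stops on an explicit throw.

```ruby
# Hypothetical mini callback runner built on Kernel#catch / Kernel#throw.
def run_callbacks(callbacks)
  ran = []
  completed = catch(:abort) do
    callbacks.each do |name, callback|
      ran << name
      callback.call
    end
    true # reached only if nothing threw :abort
  end
  [completed == true, ran]
end

callbacks = {
  normalize: -> { false },         # returning false does NOT halt the chain
  check:     -> { throw :abort },  # explicit halt
  notify:    -> { :never_reached }
}

completed, ran = run_callbacks(callbacks)

puts completed    # false
puts ran.inspect  # [:normalize, :check]
```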

Mar 20, 2015

On the second day of Elastic{ON}, I woke up to an email from my VPS provider saying that my server was participating in a DDoS attack. Network access had been suspended, and I needed to back up any data and kill the server. I had console access via their portal, so I logged in.

Turned out Elasticsearch was the culprit. I found a bash console running under the elasticsearch user, so I killed all their processes (and Elasticsearch). If you are not on the latest version, you need to be. And if you have dynamic scripting on (the default in previous versions), you need to make sure it's off.

I didn't have much of import on there anyway, so I just blew away the server. Then it was time to figure out a new, more secure setup. I use this server to try out quick apps I do on the side. They don't take very much in terms of resources. Usually they just need a basic app run, and a service like Postgres, Redis, or Mongo at very low scale. There's no reason to have one or more servers per app.

Heroku has the auto-sleep thing, which sucks, and not all addons are free at the intro tier. For example, Found.

My first thought was Docker, because it's the new hotness.

  • Dokku is the simplest solution, but it seems to be very oriented towards having one app.
  • Deis seems production-ready, but it's very focused on having multiple servers
  • Flynn has single-server examples, but no way to add "appliances" (stateful applications) besides Postgres

While I could run just base Docker, I just can't justify having to do these things manually. For now, I'm sticking with the "just a linux box" architecture.

Enter chef-solo. I'd been itching to write a setup & config script for a while, especially since my apps have so many components in common. Upstart, monit, logrotate, cron jobs - it's way better to have this stuff in a repo than just sitting on a server somewhere.

Plus, the recipes for the most part come with secure defaults and recommended best practices right in the README. My final repo stack ended up using:

This made it super easy to write some chef scripts, run a test build on a Vagrant box, and then deploy it to my shiny new dev server. My blahg here is running on nginx on it, since it's built with Jekyll, Grunt, and rsync, modified from the super-nice yeoman generator.

My new setup is hopefully more secure, and won't be going down again for a while.

Dec 2, 2014

Why? Because job security. See also: Coding in Emoji with Swift