Category Archives: programming

SQL Is Your Friend, Part 3

Yo dawg, I heard you like queries. So I put a query in your query so you can query while you query.

Now we are starting to get into some really cool and powerful SQL. Subqueries! Let’s jump right in, shall we?


Subqueries For Change


I want to count up all the times something changed across some set of data. You could pull all the data down and iterate over it in your code, and maybe that runs just fine. You can also let the database do the heavy lifting and pull down just the results that you need.

Using the LEAD windowing function we can look forward one record to see what happens next, and include that in the “current” record. We saw that in yesterday’s post. That’s not quite what we need, though. Fortunately we can query against that query, or use that as a subquery for the query we actually want to run.

SELECT *
FROM (
  SELECT id,
         major,
         minor,
         LEAD(minor) OVER (ORDER BY reading_timestamp) AS next_minor
  FROM readings
) AS amazing_subquery
WHERE minor != next_minor;

Right, so what we have there is a nested subquery. The outer query runs against the output of the inner query. If you were to run an EXPLAIN plan against this you would see how many rows the inner query returns and at what cost. I do tend to run the query plans against any of these and then keep an eye on them as the data scales up.

See the nested subquery for yourself in this SQL Fiddle.


Subqueries For A Field


We can also pull in a specific data point with a subquery and stuff it into a row. Let’s say I just want to know the last reading ID for an event.

SELECT id,
         phone,
         major,
         minor,
         reading_timestamp,
         ( SELECT MAX(id)
           FROM readings r2
           WHERE r2.phone = r1.phone AND r2.id > r1.id
         ) AS last_id
FROM readings r1
ORDER BY 2, 5;

Sorcery! You can see that I refer to the main query in the subquery. That’s pretty handy, but you don’t have to. You could just ask Postgres to give you the max value from this or some other table.

The reading ID may not be the most useful thing, so maybe instead you pull a timestamp. If you did that then you could do math and calculate the duration, which Postgres can also do for you. You can also wrap that up in another outer query. That’s when I start looking to DRY up the base query into a function.

You can play with this some more in this SQL Fiddle


Common Table Expression (CTE)


You can see where all this subquerying can maybe get out of hand. I’ll confess that I have stopped myself when I’ve nested a few queries deep and asked if maybe there is a better way. I try to avoid Russian Dolls when I program. You can also JOIN on a subquery, which is powerful when you need it and also pretty ugly.

Enter the Common Table Expression, or CTE. This is also known as a WITH query. Let’s take a look at that first query and see how that would look in CTE.

WITH amazing_subquery AS (
  SELECT id,
         major,
         minor,
         LEAD(minor) OVER (ORDER BY reading_timestamp) AS next_minor
  FROM readings
)
SELECT *
FROM amazing_subquery
WHERE minor != next_minor;

Here is the SQL Fiddle. This is a pretty simplified example, so maybe that doesn’t so much knock your socks off.

Let’s take this up to 11, shall we?

WITH interval_query AS (
  -- Start by making a series of 12 time intervals
  SELECT (ts ||' hour')::INTERVAL AS hour_interval
  FROM generate_series(0,11) AS ts
), time_series AS (
  -- Then take that interval and convert it into a TIMESTAMP
  SELECT date_trunc('hour', now()) + INTERVAL '60 min' * ROUND(date_part('minute', now()) / 60.0) - interval_query.hour_interval AS start_time
  FROM interval_query
), time_intervals AS (
  -- Then take that TIMESTAMP and add an hour to it to make an hour timebox
  SELECT start_time, start_time + '1 hour'::INTERVAL AS end_time
  FROM time_series
  ORDER BY start_time
)
-- RIGHT JOIN on those times s.t. we get all the times even when data is missing
SELECT f.start_time, f.end_time, r1.minor, count(DISTINCT r1.phone) readings
FROM readings r1
RIGHT JOIN time_intervals f
  ON r1.reading_timestamp >= f.start_time AND r1.reading_timestamp < f.end_time
GROUP BY f.start_time, f.end_time, r1.minor
ORDER BY f.start_time, r1.minor

BOOM. Time series data with counts. Here is that SQL Fiddle.


55502354


Now it is entirely possible that there are better ways to do this. I am no DBA. I’m just a simple caveman programmer who likes to take data out for a spin and see what it can do for me.

When I had projects in Sybase and SQL Server I would build these complex queries up over a series of temp tables. You can do that with Postgres, too. I just tend to prefer CTE and subqueries. However, there is another technique that Postgres brings to the table that I do use from time to time. Tomorrow.

1 Comment

Filed under programming, ruby, SQL

SQL Is Your Friend, Part 2

Yesterday I began to make the case that it’s OK to skirt your ORM and write some raw SQL every once in a while.

I’ll be honest with you. I have no idea how well the ActiveRecord postgres adapter handles the JSON datatype. I mentioned in that post that I always store entire JSON payloads. I also never select the entire column. I always pick what I want out, like I showed in the post.

Today, we have a brand spanking new version of Postgres! Version 9.4 shipped today, and it has an update that I had already planned to talk about. But we aren’t quite ready to talk about it yet.

I’m working through this blog series the same way I work through a refactoring process:

  • I have some data, and I want to query it.
  • Huh. That’s a giant pile of SQL. How can I clean that up?
  • Maybe there is a more performant way to access this gnarly data?

So today we will be in the first 2 bullets.


Windowing Functions


You have a lot of data. You can take that data and fold it on itself to see how one record is related to or changes from another.

Let’s say you want to look and see when someone has moved from one place to another. You can take a row and fold in data from the next row to make that comparison by using the LEAD windowing function. Like so:

SELECT major,
       minor,
       LEAD(minor) OVER (ORDER BY reading_timestamp) AS next_minor
FROM readings
ORDER BY reading_timestamp;

See the query in action with this SQL Fiddle

So what’s happening here?

We’ve asked for the next record — the LEAD. We use the OVER keyword to tell Postgres how to construct the window. This is how we fold the data. So I’ve said that I want the data sorted by the reading timestamps, and given that same sort I want to pull in the value for the next record’s minor column.


Repetition Is Lame


Now imagine that query is the basis for a couple of different queries. You can store that query in the database in a function. You saw how to create a function in the previous post. This is something that I actually do every once in a while.

I will, at this point, take a step back and say that I am not advocating that you stuff all of your business logic in the database. I have fought in that stored procedure holy war on the side of keeping business logic in the application that really needs to know about it.

I see this in a different light, though. This is a code optimization. I’m in control of the code and the logic, and I’ve decided where it makes the most sense for what code to run. Yes, the SQL in the function is compiled and optimized and whatever in Postgres, but I don’t honestly know how important that is. This is just me DRYing up redundant queries and making joins less hairy.

CREATE OR REPLACE FUNCTION readings_with_lead()
RETURNS TABLE (
  id INT, phone TEXT, major INT, minor INT,
  reading_timestamp TIMESTAMP,
  prev_beacon_minor INT, next_beacon_minor INT
)
LANGUAGE sql
AS $$
SELECT
  id,
  phone,
  major,
  minor,
  reading_timestamp,
  lag(minor) OVER (ORDER BY phone, reading_timestamp) AS prev_beacon_minor,
  lead(minor) OVER (ORDER BY phone, reading_timestamp) AS next_beacon_minor
FROM readings
ORDER BY phone, reading_timestamp;$$

This gives us a nice way to call on an enriched version of our data. It quacks like a table, too, so you can join on it and do anything else you would do with a normal table.

SELECT *
FROM readings_with_lead()
WHERE next_beacon_minor IS NOT NULL;

See the function in action with this SQL Fiddle


Leveling Up Your Windowing


Now I want to take the data and start to piece together sequences of events. What I really want to do is put together specific events in sequence. And I can totally do that with a windowing function:

SELECT phone,
       major,
       minor,
       reading_timestamp,
       FIRST_VALUE(reading_timestamp) OVER (PARTITION BY phone, major ORDER BY phone, reading_timestamp)
FROM readings
ORDER BY 1, 4;

I’ve asked Postgres to give me the first reading_timestamp value across every row in a window. That window is defined as a grouping of like phone numbers and major values, and those partitions are also sorted.

See it in action with this SQL Fiddle

Once you have this information you can start looking at all sorts of interesting things. How long was an event? What happened in that event?

We will look at some more techniques that make this even more interesting soon. Next up: subqueries.

8 Comments

Filed under programming, ruby, SQL

SQL Is Your Friend, Part 1

This is the kickoff of a series of posts on using Postgres and SQL to do some interesting things that you may not have known were possible. If you typically use an ORM, like ActiveRecord, you probably do not write a lot of raw SQL. I like Rails, and ActiveRecord has it’s place. This is not a rant where I shake my fist in the air and tell ActiveRecord to get off my lawn.

Instead I want to highlight a handful of techniques that I find really useful, and also perhaps suggest that it’s OK to venture off the golden path and do a little raw SQL where it makes sense. The fact of the matter is that you cannot simply switch databases and have the ORM just handle it for you. At least not once your application starts to get interesting. So use the right tool for the job, and I hope to highlight a few tools that Postgres brings to the table.

TL;DR: SQL is your friend.


JSON Datatype


You may already be familiar with the HSTORE datatype that came in Postgres 9.0. It’s nice, but does not handle nesting. In Postgres 9.2 we got something even awesomer: the JSON datatype. It’s pretty spectacular, and I have used it to store entire raw payloads now anytime I consume resources.  You can also use this datatype in ActiveRecord migrations without requiring any additional gems.

What makes the JSON datatype especially nice is that you can reach into the data and pull out specific elements. You can also create custom functions and even operators to do even more powerful things with the data in the JSON column.

Here is an example that takes a column called “beacons” which contains a list of bluetooth beacons that a phone has ranged. We want to look into that list and pull out the minor number from the closest beacon, and the accuracy from the next closest beacon. We also cast values from the JSON datatype to a more meaningful datatype. Yes, the cast to TEXT and then to the actual datatype is a little awkward, but it works.

SELECT beacons->0,
  json_array_length(beacons),
  json_extract_path(beacons->0, 'minor')::TEXT::INT AS minor2,
  json_extract_path(beacons->1, 'accuracy')::TEXT::FLOAT AS accuracy2
FROM readings;

See this code in action with this SQL Fiddle


Custom Function and Operator


There are a lot of good functions in Postgres to handle the JSON datatype, but let’s say that I want to do something that’s not built in. I want to look at ALL of the values for a given key.

Take a look at this discussion. You can create a function in Postgres, and even create a new operator for that function:

CREATE OR REPLACE FUNCTION json_pluck(j json, field TEXT)
RETURNS TEXT[]
LANGUAGE sql
AS
$$ SELECT array_agg(j->>$2) FROM json_array_elements($1) j $$;

DROP OPERATOR IF EXISTS | (json, text);
CREATE OPERATOR | (procedure = json_pluck, leftarg = 'json', rightarg = 'text');

Wait. What even is that?! You create a function called json_pluck. Then you create a custom operator | that allows you to invoke json_pluck by simply calling the pipe character and telling it for what key to return the values.

Like so:

SELECT beacons->0,
  json_array_length(beacons),
  json_extract_path(beacons->0, 'minor')::TEXT::INT AS minor2,
  json_extract_path(beacons->1, 'accuracy')::TEXT::FLOAT AS accuracy2,
  beacons | 'minor' AS minors
FROM readings;

You’ll note that I use CREATE OR REPLACE wherever possible. You can’t do that with an operator, though, so I preface that with a guard to drop it if it already exists, and then create it.

See this code in action with this SQL Fiddle

Yeah. That’s pretty cool, no? You can drop all of those into ActiveRecord migrations, too:

class CreateFunctionJsonPluck < ActiveRecord::Migration
  def up
    sql = <<-SQL
      CREATE OR REPLACE FUNCTION json_pluck(j json, field TEXT)
      RETURNS TEXT[]
      LANGUAGE sql
      AS
      $$ SELECT array_agg(j->>$2) FROM json_array_elements($1) j $$;
    SQL
    ActiveRecord::Base.connection.execute(sql)

    sql = <<-SQL
      DROP OPERATOR IF EXISTS | (json, text);
      CREATE OPERATOR | (procedure = json_pluck, leftarg = 'json', rightarg = 'text');
    SQL
    ActiveRecord::Base.connection.execute(sql)
  end

  def down
    sql = "DROP OPERATOR IF EXISTS | (json, text);"
    ActiveRecord::Base.connection.execute(sql)
    sql = "DROP FUNCTION IF EXISTS json_pluck(j json, field TEXT);"
    ActiveRecord::Base.connection.execute(sql)
  end
end

So that’s the JSON datatype. In this series plan to discuss window functions, subqueries, WITH queries (Common Table Expression), and materialized views.

8 Comments

Filed under programming, ruby, SQL

Remembering Ruby

I’ve seen some discussion over the years on how to onboard new Ruby developers.  What about re-onboarding “old” developers though?

I’ve been a Ruby developer for several years.  I remember the bump to Rails 1.0.  We used to have to hand-crank our app servers while walking uphill.  You kids don’t know how good you have it.  I digress while I shake my cane.
 
As I was saying, I’ve been in the community for a long time, but I have also been away from ruby for several months on other languages.  I consider Ruby my “home” language, though, and the Ruby community is a special community that I enjoy feeling a connection with.
 
So how do I reorient myself?  How do I stay connected with the language and the community that I am not always directly involved in?
  • Twitter
  • Local Ruby Brigade
  • Practice
  • Playing Around
Twitter
 
I follow a lot of developers on Twitter, as one does.  Sometimes I binge add people, and sometimes I cull the list.  I love the community, but I don’t love the drama.  The drama sometimes eats at me, actually.  The things that we get worked up about also eat at me, but sometimes we get a little caught up in the swirl.
 
I like seeing the evolution of the language and ideas.  I like seeing other languages come into play, too.  I don’t think Ruby is necessarily the best language solution for all problems, and I believe that it’s important to have a varied language toolkit at your disposal.
 
Local Ruby Brigade
 
User groups are a fantastic resource.  The Dallas Ruby Brigade has always been very open and nurturing in my experience.  There are monthly meetings where people give talks, and I’ve given a few over the years.  There are also weekly meetups where we code and socialize.  We also have an active mailing list where we discuss current events, upcoming meetings, and also ask and answer questions.
 
I felt safe as an inexperienced developer, and I am tremendously grateful.  A local user group is a really great place to keep your chops up and get experience talking in a group setting (whether in the front or in the audience).
 
After a couple of years I felt like it was my turn to “give back” and get more involved, which has been very fulfilling.
 
Practice
 
Certainly you can practice the wrong things, but I find practice very valuable.  Repetition helps me learn and cements ideas for me.  I also learn by doing.  I need a concept to have a practical application or else I just will not see it.  For the past several months I have done Exercism exercises in several languages.  For a while I was doing Ruby backend services, so I did Javascript exercises to keep up those chops.  Then we started doing a lot of native apps, so I focused on Ruby exercises.  I’ve also looked at a few katas here and there.  Sometimes they’re really hard and I don’t get them.  Sometimes they’re too easy and I don’t get much value out of those either.  Usually they’re just right though.
 
Scratch An Itch (Play Around)
 
I’ll occasionally try to cook up a little project to work on.  I can’t tell you how many times I’ve redone my blog.  Not each of those has seen the light of day, and that’s ok.  It’s more about the process.  I usually also have a new data model in mind, which means that the data needs to be transformed.  I love data, and this is a really fun exercise for me.  ActiveRecord is great, but it can also be pretty limiting.  Sometimes raw SQL really is the best way to express what you need to do.
 
Other
 
I also try to make it to at least one Ruby conference a year, as well as submit talks on various things.  Thinking about what I might be able to shed light on or what things of value I might be able to share helps me take a critical look at my own growth and evolution.
 
Addendum
 
I realized as I was walking around thinking about doing something in Xcode that this isn’t just Ruby for me.  I bounce around languages a lot based on the projects I work on.  One thing that is unique to Ruby, though, is the community.

Leave a Comment

Filed under programming, ruby

Rake Task To Load Production Data

When I’m working on a Rails app that relies on live data, like a reporting app for example, I find myself constantly wanting to pull live data to my local dev database. So I made a rake task to do this:

namespace :db do
  namespace :heroku do
    desc "Load the PROD database from Heroku to the local dev database"
    task :load => :environment do
      if Rails.env == 'development'
        Bundler.with_clean_env do
          config = Rails.configuration.database_configuration[Rails.env]
          system <<-CMD
            heroku pgbackups:capture
            curl -o latest.dump `heroku pgbackups:url`
            pg_restore --verbose --clean --no-acl --no-owner -h localhost \
              -U #{config["username"]} -d #{config["database"]} latest.dump
            rm -rf latest.dump
          CMD
        end
      end
    end
  end
end

3 Comments

Filed under programming, ruby

Polyglot Life – Benefits And Detriments

Perspective. Being a polyglot is good, and I think the natural evolution as you progress from beginner to seasoned developer. I’m not here to discuss the benefits of a multi-lingual cross-platform architecture. This is about me, and this is about you. This is about how we spend our time honing our craft on our time.

The Sapir-Whorf hypothesis says (about natural languages) that the languages we know limit what we think. Continuing that thought, I believe that the programming languages you know, the communities you participate in, the things you feel passionate about, all become the lens through which you see the world. Sort of like Kant’s subject-object problem, where he suggests that we experience the world as conveyed by our senses. You don’t experience (or evaluate) anything directly, but instead through rose-colored glasses (as I remember the lesson going).

As you participate more, and as you learn more, you expand your field of vision. You recognize patterns and see where different problems are better solved in different ways, and perhaps a language is a good match for some problems and not as good for others.

A single language (or pattern) isn’t the silver bullet for all programs.

Varied experience also allows you to think about problems at a higher level rather than getting lost in the weeds of implementation specifics or community dogma.

Another benefit to the polyglot life is that it’s just fun to explore and play around. We are naturally curious. The Pragmatic Programmer suggests that we make ourselves uncomfortable and learn a new language every year. A different language with different constructs.

As a reminder, I’m talking about personal growth and how we spend our off time honing our craft. Some of these new languages and patterns will make their way into production code or glue processes, but that’s not really what I’m driving at here. This is more along the lines of what Steve Klabnik has called “Irresponsible Ruby.” You’re playing with language to explore and push the boundaries.

I have noticed some downsides to the polyglot life, though.

I am a long-time Rubyist. I remember the bump to Rails 1.0. I remember Chunky Bacon and Rails is a Ghetto. I’ve forgotten more Rails API changes than I remember now. I haven’t spent as much time doing Ruby lately, and I miss it. I miss the community. I think the community sometimes takes itself a little too seriously, but I miss my “home” community.

Context switching can also be expensive. Even just between different tasks in a single project, but especially between different projects in different languages.

I also joked with someone recently that I’m not “good” at anything anymore. I once had a boss suggest (when I was just getting into web development) that I should narrow my focus. Remember all the things you have to get a handle on to do web development? HTML, CSS, JavaScript, some app server backend, some web server to run the app server, linux to run the web server that runs the app server, ….

He was trying to look out for me. He was concerned that I would be a Jack of all trades and master of none. I feel that way now as I bounce around different projects in different languages.

Maybe it’s like that for everyone, and maybe that’s just where I am personally in my own growth. Perhaps I am back in that “beginner” stage, but on a different field. I’ll take it, though.

Perspective.

Leave a Comment

Filed under programming

How I Docker, Part 2

Yesterday I laid out a high level overview of how I use Docker. I neglected to point out 1 thing — the environment variables. I exposed the POSTGRES_USER and POSTGRES_PASSWORD, but didn’t explicitly call out what to do with those in the context of the Rails app.

database.yml

default: &amp;default
  adapter: postgresql
  encoding: unicode
  pool: 5
  host: &lt;%= ENV['POSTGRES_PORT_5432_TCP_ADDR'] || 'db' %&gt;
  username: &lt;%= ENV['POSTGRES_USER'] || 'developer' %&gt;
  password: &lt;%= ENV['POSTGRES_PASSWORD'] || 'password' %&gt;

development:
  &lt;&lt;: *default
  database: rails_4_1_6_development

test:
  &lt;&lt;: *default
  database: rails_4_1_6_test

production:
  &lt;&lt;: *default
  database: rails_4_1_6_production

Docker exposes some variables for you based on the containers and their names. You can also specify your own with that -e switch.

Alright. With that little bit cleared up, let’s talk about how I actually use all this now.

How I Actually Use All This Now

Remember the intent of this exercise was to come up with a way to easily share an app with it’s dependencies in some nice tidy package such that nobody else needs to understand how to line up all those dependencies and deal with the maintenance.

Docker is great, but one of the things that I liked about Vagrant was how easily provisioning and reprovisioning happens. But I don’t really need all the overhead of some big complicated thing that requires 3 separate licenses and all that. We have boot2docker, which handles the virtualization on OSX with VirtualBox. It’s maybe not the most amazing software, but it hasn’t left a bad taste in my mouth yet.

I like the simplicity of Docker, and I like the simplicity of just using the tooling that Docker provides us.

So to simplify the workflow (using Docker and Boot2Docker) I wrote a bunch of rake tasks. Feel free to comment or submit updates.

https://gist.github.com/barrettclark/114a4441a1a542c6daae

The Rakefile

So what does this buy us, this giant pile of rake tasks? This gives us a thin wrapper around some pretty basic functionality. Rake also has the ability to chain things together, so you can ask to clean up all the old instances and it will make sure they’re all stopped and that boot2docker is even running.

It makes is so that we can pass this around and have a simplified workflow.

Sometimes boot2docker loses it’s mind and quits on you, and you don’t even have to care about that anymore because the rake tasks will make sure boot2docker is running for you.

Here are all the rake tasks that I currently have in place:
$ rake -T
rake boot2docker:install_nsenter # Install nsenter
rake boot2docker:shellinit # Setup shell
rake boot2docker:ssh[container] # SSH into the given container
rake boot2docker:start # Start VM from any states
rake boot2docker:stop # Gracefully shutdown the VM
rake docker:fig:build # Use fig to build the Dockerfile image
rake docker:fig:up # Use fig to run the app as described in fig.yml
rake docker:info:hostname[container,port] # Give the hostname and port for a container
rake docker:info:images # List all the images on localhost (docker images)
rake docker:info:ps # List all the container (docker ps -a)
rake docker:maintenance:rm[container] # Remove one or more containers
rake docker:maintenance:rmi[container] # Remove one or more images
rake docker:maintenance:stop[container] # Stop running containers
rake git:pull[remote,branch] # Pull the freshest code
rake git:stash # Stash files

Generalized Yet Specific

So we have Dockerfiles and environment variables that we can pass into containers. We have boot2docker to handle the virtualization for us on OSX. We have rake to manage docker and boot2docker. But what about apps and their dependencies?

Fig

We need a DSL that is a little bit generalized, but not as generalized as Chef or Puppet. We just need to be able to tell Docker how to do all those things that were in my little container startup script. Enter Fig.

Whereas before we had dockerup.sh

docker run -d \
  --name rails-basic-postgres \
  -e POSTGRES_USER=docker -e POSTGRES_PASSWORD=docker \
  barrettclark/postgis

docker build -t=&quot;barrettclark/rails-basic:devel&quot; .

docker run -d -P \
  --name rails-basic-app \
  --link rails-basic-postgres:postgres \
  -e POSTGRES_USER=docker -e POSTGRES_PASSWORD=docker \
  -v /Users/barrettclark/temp/rails_basic/project:/rails \
  barrettclark/rails-basic:devel

We now have fig.yml

db:
  image: barrettclark/postgis
  ports:
    - &quot;5432&quot;
  environment:
    - POSTGRES_USER=docker
    - POSTGRES_PASSWORD=docker
web:
  build: .
  environment:
    - POSTGRES_USER=docker
    - POSTGRES_PASSWORD=docker
  volumes:
    - ./project:/rails
  links:
    - db
  ports:
    - &quot;3000&quot;

You’ll note in the Rakefile that the default rake task is docker:fig:up, which will first run the docker:fig:build task (and also make sure boot2docker is running and the environment is set).

Putting It All Together

So let’s run this app, shall we?
$ rake
db uses an image, skipping
Building web...
---> d90ac1ef8a16
Step 1 : RUN apt-get update
---> Using cache
---> 8e44373a6c4e
Step 2 : RUN apt-get upgrade -y
---> Using cache
---> 7c8a817248f8
Step 3 : RUN apt-get -y --fix-missing install libpq-dev nodejs
---> Using cache
---> 0fd7db2ace6d
Step 4 : RUN gem install bundler --no-ri --no-rdoc
---> Using cache
---> d653711e99ef
Step 5 : ADD start.sh start.sh
---> Using cache
---> b5298535c564
Step 6 : RUN chmod +x /start.sh
---> Using cache
---> 51f55dde7927
Step 7 : ADD project/Gemfile Gemfile
---> Using cache
---> 49f2911395c8
Step 8 : ADD project/Gemfile Gemfile.lock
---> Using cache
---> f1f6b6cf4863
Step 9 : RUN bundle install
---> Using cache
---> 73271fb83da5
Step 10 : RUN rm /Gemfile*
---> Using cache
---> be0105be5b00
Step 11 : ADD project/ /rails
---> ff9da0621df5
Removing intermediate container b18ed235b2a5
Step 12 : WORKDIR /rails
---> Running in 045c488e22c0
---> 4c85897ee553
Removing intermediate container 045c488e22c0
Step 13 : EXPOSE 3000
---> Running in 0909d1f73082
---> 43df967060e7
Removing intermediate container 0909d1f73082
Step 14 : CMD /start.sh
---> Running in 8c515adc0285
---> 69d3e7832b58
Removing intermediate container 8c515adc0285
Successfully built 69d3e7832b58
Creating railsbasic_db_1...
Creating railsbasic_web_1...
Attaching to railsbasic_db_1, railsbasic_web_1
web_1 | *** STARTING RAILS APP ***
web_1 | rm: cannot remove /rails/tmp/pids/server.pid': No such file or directory
db_1 | 2014-10-28 15:00:14 UTC LOG: database system was shut down at 2014-10-17 16:06:21 UTC
db_1 | 2014-10-28 15:00:14 UTC LOG: autovacuum launcher started
db_1 | 2014-10-28 15:00:14 UTC LOG: database system is ready to accept connections
web_1 | Don't run Bundler as root. Bundler can ask for sudo if it is needed, and
web_1 | installing your bundle as root will break this application for all non-root
web_1 | users on this machine.
web_1 | Fetching gem metadata from https://rubygems.org/..........
web_1 | Using rake 10.3.2
web_1 | Using i18n 0.6.11
web_1 | Using json 1.8.1
web_1 | Using minitest 5.4.2
web_1 | Using thread_safe 0.3.4
web_1 | Using tzinfo 1.2.2
web_1 | Using activesupport 4.1.6
web_1 | Using builder 3.2.2
web_1 | Using erubis 2.7.0
web_1 | Using actionview 4.1.6
web_1 | Using rack 1.5.2
web_1 | Using rack-test 0.6.2
web_1 | Using actionpack 4.1.6
web_1 | Installing mime-types 2.3
web_1 | Using mail 2.6.1
web_1 | Using actionmailer 4.1.6
web_1 | Using activemodel 4.1.6
web_1 | Using arel 5.0.1.20140414130214
web_1 | Using activerecord 4.1.6
web_1 | Using coffee-script-source 1.8.0
web_1 | Installing execjs 2.2.1
web_1 | Using coffee-script 2.3.0
web_1 | Using thor 0.19.1
web_1 | Using railties 4.1.6
web_1 | Using coffee-rails 4.0.1
web_1 | Using hike 1.2.3
web_1 | Using multi_json 1.10.1
web_1 | Installing jbuilder 2.1.3
web_1 | Using jquery-rails 3.1.2
web_1 | Using pg 0.17.1
web_1 | Using bundler 1.7.4
web_1 | Using tilt 1.4.1
web_1 | Using sprockets 2.11.0
web_1 | Installing sprockets-rails 2.1.4
web_1 | Using rails 4.1.6
web_1 | Using rdoc 4.1.2
web_1 | Using sass 3.2.19
web_1 | Using sass-rails 4.0.3
web_1 | Using sdoc 0.4.1
web_1 | Using spring 1.1.3
web_1 | Using turbolinks 2.4.0
web_1 | Using uglifier 2.5.3
web_1 | Your bundle is complete!
web_1 | Use
bundle show [gemname]` to see where a bundled gem is installed.
web_1 | == 20141004114903 CreateGrilledThings: migrating ==============================
web_1 | -- create_table(:grilled_things)
web_1 | -> 0.0099s
web_1 | == 20141004114903 CreateGrilledThings: migrated (0.0100s) =====================
web_1 |
web_1 | [2014-10-28 15:00:43] INFO WEBrick 1.3.1
web_1 | [2014-10-28 15:00:43] INFO ruby 2.1.2 (2014-05-08) [x86_64-linux]
web_1 | [2014-10-28 15:00:43] INFO WEBrick::HTTPServer#start: pid=25 port=3000

Then you can ask docker about the instance:
$ rake docker:info:hostname[railsbasic_web_1,3000]
http://192.168.59.103:49154

Even cooler, you can open that up in your browser with
open $(rake docker:info:hostname[railsbasic_web_1,3000])

And using a little bit on linux magic (nsenter) you can SSH into the running container:
$ rake boot2docker:ssh[railsbasic_web_1]
SSH into container: f52a7f422ec2
root@f52a7f422ec2:/#

Summary

So that’s where my head is at the moment. Docker + boot2docker + rake + fig. That gives you a flexible system that is still specific enough. You can run multiple apps, and you can have multiple apps that have dependencies in each other. I’ve had a little more trouble getting this running with a Scala stack, so YMMV.

Again, I’m not saying that any of this is particularly correct. It’s just what has evolved for me in my particular team’s use case. I would love to hear feedback on what you think works best for you and your needs. I would also like to thank Clifton King, who gave me a couple of docker-related tips along the way.

11 Comments

Filed under programming

How I Docker

I have spent the past several months playing with Docker off and on. I have to say that I think I really like it.

I love the idea of really small, specific, containers. I also love the idea of building up a toolkit of things that you use.

Some Background

I work at Sabre Labs, and we get to explore and experiment with trends and technologies that we think could make a difference in the travel world. This means that we spin up a lot of projects in a lot of different technologies and languages. So it is really nice to have a toolkit of things that we’ve used and done. That also helps with knowledge sharing.

About a year ago we set upon a path to attempt to make DevOps easier in a cross-functional team. We have a designer, a front-end developer, and a couple of back-end or full-stack developers. DevOps can be very time consuming when you’re working in a Java stack one day and a Rails stack the next.

Vagrant

First we looked at Vagrant, using Chef to setup the dependencies. It worked, generally. It’s expensive. It’s messy.

The licenses are frustrating. You pay for Vagrant, which is fine. If you want to use VMWare, which we did, you pay for that. Also fine. If you want to use Vagrant with VMWare you pay for a third license. That feels weird.

Chef. I’ve used Chef before in previous engagements, and it’s worked ok. It’s better than nothing (or shell scripts) for sure. My frustration with Chef is how messy the recipes are. The dependencies in particular. We didn’t use Chef Server, so we would just copy in all the recipes that we needed, and then their dependencies. Look at the dependency chain for git (on Ubuntu). Before you’re done you’ve brought in Windows dependencies. And also the Yum package manager. Why? That makes me crazy.

I know that Vagrant supports Puppet, and I just never got around to looking at Puppet. I know that Vagrant also supports Docker now, and I can’t get it to dance. But I don’t really want to. I think that’s overly complicated.

Docker

What’s so great about Docker then? It’s complicated in different ways. Or can be complicated.

Like I said at the top, I love how simple and concise the containers are. I also like the DSL. It’s specific, and it lets me do what feels natural on that particular environment. I don’t need to overly generalize. I don’t need a DSL that can work on all *nix distros. I just want to say what I need and make it so.

Teach Me How To Docker

Look around at blog posts on how to Docker. I’ve noticed that a great many of them fall into the “install ALLTHETHINGS” camp. You build a single container that includes the database and the app server. Bleh. Why bother with containerization if you’re just going to stick everything in a single container?

Small. Simple. Concise. That’s how I Docker.

So, without all of that as the background, here is a simple Docker setup for a simple Rails app.

Dockerfile:

FROM barrettclark/ruby-2.1.2

RUN apt-get update
RUN apt-get upgrade -y
RUN apt-get -y --fix-missing install libpq-dev nodejs

RUN gem install bundler --no-ri --no-rdoc

ADD start.sh start.sh
RUN chmod +x /start.sh
ADD project/Gemfile Gemfile
ADD project/Gemfile Gemfile.lock
RUN bundle install
RUN rm /Gemfile*

ADD project/ /rails
WORKDIR /rails

EXPOSE 3000

CMD /start.sh

You see here that I have a base container image called barrettclark/ruby-2.1.2. Docker containers can inherit from other containers. This is where you can end up with a bit of a Russian Doll situation because that container takes a base image and puts Ruby 2.1.2 on it. You see it then runs some basic apt-get commands. We are using Postgres for this Rails app, so we need those header files for the pg gem, hence libpq-dev.

Next we add some files. Docker caches each step to see if it needs to be rebuilt. That’s why we don’t just copy in the whole project yet. Once we’ve handled the gem dependencies, then we can copy over the app. Rails will run on port 3000, and because we know it’s the only thing running in that container, and also in this machine for that matter, we can just expose that port and call it good.

CMD /start.sh tells the container to run that file when the container is started (unless you tell it to do something else when you start it). You can see that it does some housekeeping, database setup, and fires up the app.

start.sh

echo "*** STARTING RAILS APP ***"
rm /rails/tmp/pids/server.pid
bundle install
rake db:create
rake db:migrate
rake db:seed
bundle exec rake log:clear
rails s

Build the container: docker build -t="barrettclark/rails-basic:devel" . Great. Now you’ve got everything you need to run a simple Rails app in Docker.

Run All The Things

I usually write a little shell script to help with this step. We need 2 containers, and we want them eventually linked together.


docker run -d \
--name rails-basic-postgres \
-e POSTGRES_USER=docker -e POSTGRES_PASSWORD=docker \
barrettclark/postgis

docker run -d -P \
--name rails-basic-app \
--link rails-basic-postgres:postgres \
-e POSTGRES_USER=docker -e POSTGRES_PASSWORD=docker \
-v /Users/barrettclark/temp/rails_basic/project:/rails \
barrettclark/rails-basic:devel /bin/bash -l -c "/start.sh"

What’s all that? The first docker run spins up a Postgres container named ‘rails-basic-postgres’ and sets some environment variables. If you don’t have the given image on your localhost, Docker will go try to fetch it from Dockerhub. The ‘-d’ switch tells Docker to daemonize the container — run it in the background. Next we run the Rails container and link that Postgres container to it.

Remember how I said the whole point of all this was to be able to give this service (or whatever) to everyone else in a cross-functional team. I don’t want them to have to know how to line up all these dependencies. This allows me to handle that for them. The real kicker here is that you can mount a directory on the localhost inside the container. Yes! The caveat is that the localhost directory MUST live in /Users. A symlink is not sufficient. Now this allows a designer to play with the javascript or markup on his or her computer in his or her preferred editor, and nobody has to know where the code is actually running.

Well, that last piece isn’t entirely true. Remember how I sort of waved my hands at just exposing port 3000 and calling it good? The -P Docker option allows us to expose ports, and Docker will map them to a port on the localhost. If you want to be specific you can do that as well with a different switch.

Run docker ps -a to see all the instances running and what ports they’ve been mapped to.

I’m not saying that any of this is particularly correct. It’s just what has evolved and makes the most sense to me. The next blog post will go into a little more detail on how I actually use all of this, and some tooling that I’ve written to make it a little easier.

6 Comments

Filed under programming

Ruby Minitest

I had the privilege of attending DCamp this past weekend. It was awesome. While doing the Game Of Life pairing sessions, one of my pair partners wanted to explore the exercise using Minitest. Neither of us had ever used it, so we moved on with RSpec and I made a note to come back to it.

I try to be discerning about bringing additional dependencies (including gems) into a project. I’ve always just used test/unit, and I prefer that syntax over the RSpec syntax. Minitest is baked into the stdlib (starting with Ruby 1.9.3), so you’ve already got everything you need. You can also write tests using either the test/unit or spec style. Win-win right? If you want more there are also additional gems that can supplement functionality. Everybody wins!!!

Here is a basic class, taken from the Minitest gem documentation.

lib/meme.rb

class Meme
  def i_can_has_cheezburger?
    "OHAI!"
  end

  def will_it_blend?
    "YES!"
  end
end

Unit tests, using Minitest:

test/test_meme.rb

require "minitest/autorun"
require_relative "../lib/meme"

class TestMeme < Minitest::Unit::TestCase
  def setup
    @meme = Meme.new
  end

  def test_that_kitty_can_eat
    assert_equal "OHAI!", @meme.i_can_has_cheezburger?
  end

  def test_that_it_will_not_blend
    refute_match /^no/i, @meme.will_it_blend?
  end

  def test_that_will_be_skipped
    skip "test this later"
  end
end

And a spec testing the same things:

spec/meme_spec.rb

require 'minitest/autorun'
require_relative "../lib/meme"

describe Meme do
  before do
    @meme = Meme.new
  end

  describe "when asked about cheeseburgers" do
    it "must respond positively" do
      @meme.i_can_has_cheezburger?.must_equal "OHAI!"
    end
  end

  describe "when asked about blending possibilities" do
    it "won't say no" do
      @meme.will_it_blend?.wont_match /^no/i
    end
  end
end

Outstanding. You can run either of these from the command line:

$ ruby spec/meme_spec.rb
Run options: –seed 25656

# Running tests:

..

Finished tests in 0.001430s, 1398.6014 tests/s, 2097.9021 assertions/s.

2 tests, 3 assertions, 0 failures, 0 errors, 0 skips

You can also use rake to run the tests, and this is the preferred method.

Rakefile

require 'rake/testtask'

Rake::TestTask.new(:spec) do |t|
  t.warning    = false    # this is the default -- shown as example of options
  t.verbose    = false
  t.test_files = FileList['spec/*_spec.rb']
end

Rake::TestTask.new(:unit) do |t|
  t.test_files = FileList['test/test_*.rb']
end

desc "Run ALL tests"
task :default do
  Rake::Task[:spec].invoke
  Rake::Task[:unit].invoke
end

Now you can ask rake what tasks it has for you:

$ rake -T
rake default # Run ALL tests
rake spec # Run tests for spec
rake unit # Run tests for unit

Cool. How about we just run the default rake task (ALLTHETESTS):

$ rake
Run options: –seed 28918

# Running tests:

..

Finished tests in 0.001187s, 1684.9200 tests/s, 2527.3799 assertions/s.

2 tests, 3 assertions, 0 failures, 0 errors, 0 skips
Run options: –seed 26846

# Running tests:

..S

Finished tests in 0.000936s, 3205.1282 tests/s, 3205.1282 assertions/s.

3 tests, 3 assertions, 0 failures, 0 errors, 1 skips

Awesome. You can also run all the tests, just a group of tests, or even just a single test file. This looks tasty enough to overtake test/unit as my go-to testing framework.

Leave a Comment

Filed under programming, ruby

SFO API Swift Implementation

A pattern that we’ve started using at Sabre Labs in our iOS projects is that the view controllers are written in Objective-C and utility classes are written in Swift. We also try to keep as much of the heavy lifting out of the view controllers, so that the VC is responsible for just brokering the information to the view and handling all the necessary delegate methods therein. Anything that a utility class needs to bubble up is handled through a delegate method.

Here is a simple example that hits the SFO API.

First, the header file of the view controller. Note that this has been pulled out of a real project, so I present it here merely as example.

//
//  ViewController.h
//  YourProject
//
//  Created by Barrett Clark on 8/25/14.
//  Copyright (c) 2014 Barrett Clark. All rights reserved.
//

#import <UIKit/UIKit.h>
#import <YourProject-Swift.h>

@interface ViewController : UIViewController <SFOProtocol> {
  SFO *sfo;
}

@end

And now the implementation file:

//
//  ViewController.m
//  YourProject
//
//  Created by Barrett Clark on 8/25/14.
//  Copyright (c) 2014 Barrett Clark. All rights reserved.
//

#import "ViewController.h"

@interface ViewController ()

@end

@implementation ViewController
            
- (void)viewDidLoad {
  [super viewDidLoad];
  // Do any additional setup after loading the view, typically from a nib.
  sfo = [[SFO alloc] initWithApiKey:@"sekretkey"];
  sfo.delegate = self;
  [sfo getAirlinesWithOffset:0 limit:5];
  [sfo getDiningWithOffset:0 limit:5];
  [sfo getPassengerServicesWithOffset:0 limit:5];
  [sfo getShoppingWithOffset:0 limit:5];
  [sfo getThingsToDoWithOffset:0 limit:5];
}

#pragma mark SFO Protocol
- (void) didReceiveAirports:(NSArray *)results {
  NSLog(@"Airports: %@", results);
}
- (void) didReceiveDining:(NSArray *)results {
  NSLog(@"Dining: %@", results);
}
- (void) didReceivePassengerServices:(NSArray *)results {
  NSLog(@"Passenger Services: %@", results);
}
- (void) didReceiveShopping:(NSArray *)results {
  NSLog(@"Shopping: %@", results);
}
- (void) didReceiveThingsToDo:(NSArray *)results {
  NSLog(@"Things To Do: %@", results);
}

@end

And finally, the “utility” file. I don’t generally use AFNetworking. The standard library HTTP calls are good enough for me. Perhaps the completion block that is mostly redundant can be streamlined. I welcome your comments.

//
//  SFO.swift
//  YourProject
//
//  Created by Barrett Clark on 8/26/14.
//  Copyright (c) 2014 Barrett Clark. All rights reserved.
//

import Foundation

@objc protocol SFOProtocol {
  optional func didReceiveAirports(results: NSArray)
  optional func didReceiveDining(results: NSArray)
  optional func didReceiveShopping(results: NSArray)
  optional func didReceiveThingsToDo(results: NSArray)
  optional func didReceivePassengerServices(results: NSArray)
}

@objc class SFO: NSObject {
  var delegate: SFOProtocol?
  let queue = NSOperationQueue()
  var apiKey: NSString
  let baseURL = "http://www.flysfo.com/api"
  
  init(apiKey: NSString) {
    self.apiKey = apiKey
    super.init()
  }
  
  /*
   * Airlines
   * http://www.flysfo.com/api/airlines.json?limit=20&key=[Your API Key]
   */
  func getAirlines(offset: Int = 0, limit: Int = 30) {
    var request = requestForURL(limit: limit, offset: offset, endpoint: "airlines")
    
    let completion = { (response: NSURLResponse!, data: NSData!, error: NSError!) -> Void in
      let res = response as NSHTTPURLResponse!
      if ((error) != nil) {
        println(error.description)
        return
      }
      var error: NSError?
      if (data != nil) {
        var e: NSError?
        var jsonResult = NSJSONSerialization.JSONObjectWithData(data, options: NSJSONReadingOptions(0), error: &e) as NSArray
        self.delegate?.didReceiveAirports!(jsonResult)
      }
    }
    NSURLConnection.sendAsynchronousRequest(request, queue: self.queue, completionHandler: completion)
  }

  /*
   * Dining
   * http://www.flysfo.com/api/dining.json?limit=20&key=[Your API Key]
   */
  func getDining(offset: Int = 0, limit: Int = 30) {
    var request = requestForURL(limit: limit, offset: offset, endpoint: "dining")
    
    let completion = { (response: NSURLResponse!, data: NSData!, error: NSError!) -> Void in
      let res = response as NSHTTPURLResponse!
      if ((error) != nil) {
        println(error.description)
        return
      }
      var error: NSError?
      if (data != nil) {
        var e: NSError?
        var jsonResult = NSJSONSerialization.JSONObjectWithData(data, options: NSJSONReadingOptions(0), error: &e) as NSArray
        self.delegate?.didReceiveDining!(jsonResult)
      }
    }
    NSURLConnection.sendAsynchronousRequest(request, queue: self.queue, completionHandler: completion)
  }

  /*
   * Shopping
   * http://www.flysfo.com/api/shopping.json?limit=20&key=[Your API Key]
   */
  func getShopping(offset: Int = 0, limit: Int = 30) {
    var request = requestForURL(limit: limit, offset: offset, endpoint: "shopping")
    
    let completion = { (response: NSURLResponse!, data: NSData!, error: NSError!) -> Void in
      let res = response as NSHTTPURLResponse!
      if ((error) != nil) {
        println(error.description)
        return
      }
      var error: NSError?
      if (data != nil) {
        var e: NSError?
        var jsonResult = NSJSONSerialization.JSONObjectWithData(data, options: NSJSONReadingOptions(0), error: &e) as NSArray
        self.delegate?.didReceiveShopping!(jsonResult)
      }
    }
    NSURLConnection.sendAsynchronousRequest(request, queue: self.queue, completionHandler: completion)
  }

  /*
   * Things To Do
   * http://www.flysfo.com/api/things-to-do.json?limit=20&key=[Your API Key]
   */
  func getThingsToDo(offset: Int = 0, limit: Int = 30) {
    var request = requestForURL(limit: limit, offset: offset, endpoint: "things-to-do")
    
    let completion = { (response: NSURLResponse!, data: NSData!, error: NSError!) -> Void in
      let res = response as NSHTTPURLResponse!
      if ((error) != nil) {
        println(error.description)
        return
      }
      var error: NSError?
      if (data != nil) {
        var e: NSError?
        var jsonResult = NSJSONSerialization.JSONObjectWithData(data, options: NSJSONReadingOptions(0), error: &e) as NSArray
        self.delegate?.didReceiveThingsToDo!(jsonResult)
      }
    }
    NSURLConnection.sendAsynchronousRequest(request, queue: self.queue, completionHandler: completion)
  }

  /*
   * Passenger Services
   * http://www.flysfo.com/api/passenger-services.json?limit=20&key=[Your API Key]
   */
  func getPassengerServices(offset: Int = 0, limit: Int = 30) {
    var request = requestForURL(limit: limit, offset: offset, endpoint: "passenger-services")
    
    let completion = { (response: NSURLResponse!, data: NSData!, error: NSError!) -> Void in
      let res = response as NSHTTPURLResponse!
      if ((error) != nil) {
        println(error.description)
        return
      }
      var error: NSError?
      if (data != nil) {
        var e: NSError?
        var jsonResult = NSJSONSerialization.JSONObjectWithData(data, options: NSJSONReadingOptions(0), error: &e) as NSArray
        self.delegate?.didReceivePassengerServices!(jsonResult)
      }
    }
    NSURLConnection.sendAsynchronousRequest(request, queue: self.queue, completionHandler: completion)
  }

  private func requestForURL(#limit: Int, offset: Int, endpoint: NSString) -> NSMutableURLRequest{
    let theURL = "\(baseURL)/\(endpoint).json?limit=\(limit)&offset=\(offset)&key=\(apiKey)"
    let url = NSURL(string: theURL)
    var request = NSMutableURLRequest(URL: url)
    request.cachePolicy = .ReloadIgnoringLocalAndRemoteCacheData
    request.timeoutInterval = 30.0
    request.HTTPMethod = "GET"
    request.setValue("application/json", forHTTPHeaderField: "Accept-Encoding")
    return request
  }
}

Leave a Comment

Filed under iOS, programming