SQL Is Your Friend, Part 1

This is the kickoff of a series of posts on using Postgres and SQL to do some interesting things that you may not have known were possible. If you typically use an ORM, like ActiveRecord, you probably do not write a lot of raw SQL. I like Rails, and ActiveRecord has it’s place. This is not a rant where I shake my fist in the air and tell ActiveRecord to get off my lawn.

Instead I want to highlight a handful of techniques that I find really useful, and also perhaps suggest that it’s OK to venture off the golden path and do a little raw SQL where it makes sense. The fact of the matter is that you cannot simply switch databases and have the ORM just handle it for you. At least not once your application starts to get interesting. So use the right tool for the job, and I hope to highlight a few tools that Postgres brings to the table.

TL;DR: SQL is your friend.


JSON Datatype


You may already be familiar with the HSTORE datatype that came in Postgres 9.0. It’s nice, but does not handle nesting. In Postgres 9.2 we got something even awesomer: the JSON datatype. It’s pretty spectacular, and I have used it to store entire raw payloads now anytime I consume resources.  You can also use this datatype in ActiveRecord migrations without requiring any additional gems.

What makes the JSON datatype especially nice is that you can reach into the data and pull out specific elements. You can also create custom functions and even operators to do even more powerful things with the data in the JSON column.

Here is an example that takes a column called “beacons” which contains a list of bluetooth beacons that a phone has ranged. We want to look into that list and pull out the minor number from the closest beacon, and the accuracy from the next closest beacon. We also cast values from the JSON datatype to a more meaningful datatype. Yes, the cast to TEXT and then to the actual datatype is a little awkward, but it works.

SELECT beacons->0,
  json_array_length(beacons),
  json_extract_path(beacons->0, 'minor')::TEXT::INT AS minor2,
  json_extract_path(beacons->1, 'accuracy')::TEXT::FLOAT AS accuracy2
FROM readings;

See this code in action with this SQL Fiddle


Custom Function and Operator


There are a lot of good functions in Postgres to handle the JSON datatype, but let’s say that I want to do something that’s not built in. I want to look at ALL of the values for a given key.

Take a look at this discussion. You can create a function in Postgres, and even create a new operator for that function:

CREATE OR REPLACE FUNCTION json_pluck(j json, field TEXT)
RETURNS TEXT[]
LANGUAGE sql
AS
$$ SELECT array_agg(j->>$2) FROM json_array_elements($1) j $$;

DROP OPERATOR IF EXISTS | (json, text);
CREATE OPERATOR | (procedure = json_pluck, leftarg = 'json', rightarg = 'text');

Wait. What even is that?! You create a function called json_pluck. Then you create a custom operator | that allows you to invoke json_pluck by simply calling the pipe character and telling it for what key to return the values.

Like so:

SELECT beacons->0,
  json_array_length(beacons),
  json_extract_path(beacons->0, 'minor')::TEXT::INT AS minor2,
  json_extract_path(beacons->1, 'accuracy')::TEXT::FLOAT AS accuracy2,
  beacons | 'minor' AS minors
FROM readings;

You’ll note that I use CREATE OR REPLACE wherever possible. You can’t do that with an operator, though, so I preface that with a guard to drop it if it already exists, and then create it.

See this code in action with this SQL Fiddle

Yeah. That’s pretty cool, no? You can drop all of those into ActiveRecord migrations, too:

class CreateFunctionJsonPluck < ActiveRecord::Migration
  def up
    sql = <<-SQL
      CREATE OR REPLACE FUNCTION json_pluck(j json, field TEXT)
      RETURNS TEXT[]
      LANGUAGE sql
      AS
      $$ SELECT array_agg(j->>$2) FROM json_array_elements($1) j $$;
    SQL
    ActiveRecord::Base.connection.execute(sql)

    sql = <<-SQL
      DROP OPERATOR IF EXISTS | (json, text);
      CREATE OPERATOR | (procedure = json_pluck, leftarg = 'json', rightarg = 'text');
    SQL
    ActiveRecord::Base.connection.execute(sql)
  end

  def down
    sql = "DROP OPERATOR IF EXISTS | (json, text);"
    ActiveRecord::Base.connection.execute(sql)
    sql = "DROP FUNCTION IF EXISTS json_pluck(j json, field TEXT);"
    ActiveRecord::Base.connection.execute(sql)
  end
end

So that’s the JSON datatype. In this series plan to discuss window functions, subqueries, WITH queries (Common Table Expression), and materialized views.

8 Comments

Filed under programming, ruby, SQL