External Database, Phase 1

We’ve gone pretty far with our application, but it’s time to make that big leap, and convert things over to an actual database. We’ll be running PostgreSQL in a Docker container. [1]

We’re definitely going to be taking two steps backward before taking further steps forward, but most of the changes will be confined to the clojure-game-geek.db namespace; the majority of the application, including the field resolvers, will be unaffected.

Dependency Changes

project.clj
(defproject clojure-game-geek "0.1.0-SNAPSHOT"
  :description "A tiny BoardGameGeek clone written in Clojure with Lacinia"
  :url "https://github.com/walmartlabs/clojure-game-geek"
  :license {:name "Eclipse Public License"
            :url "http://www.eclipse.org/legal/epl-v10.html"}
  :dependencies [[org.clojure/clojure "1.9.0"]
                 [com.stuartsierra/component "0.3.2"]
                 [com.walmartlabs/lacinia "0.30.0"]
                 [com.walmartlabs/lacinia-pedestal "0.10.0"]
                 [org.clojure/java.jdbc "0.7.8"]
                 [org.postgresql/postgresql "42.2.5.jre7"]
                 [com.mchange/c3p0 "0.9.5.2"]
                 [io.aviso/logging "0.3.1"]])

We’re bringing in the latest versions of lacinia and lacinia-pedestal available at the time this page was written (something we’ll likely do almost every chapter). Since these are now based on Clojure 1.9, it’s a fine time to upgrade to that.

We’re also adding several new dependencies for accessing a PostgreSQL database:

  • The Clojure java.jdbc library
  • The PostgreSQL driver that plugs into the library
  • A java library, c3p0, that is used for connection pooling

Database Initialization

We’ve added a number of scripts to the project.

First, a file used to start PostgreSQL:

docker-compose.yml
version: '3'
services:
  db:
    ports:
    - 25432:5432
    image: postgres:10.2-alpine

This file is used with the docker-compose command to set up one or more containers. We only define a single container right now.

The image key identifies the name of the image to download from hub.docker.com.

The port mapping is part of the magic of Docker … the PostgreSQL server, inside the container, listens for requests on its normal port, 5432, but our code, running on the host operating system, can reach the server as port 25432 on localhost.

The docker-up.sh script is used to start the container:

bin/docker-up.sh
#!/usr/bin/env bash

docker-compose -p cgg up -d

There’s also a bin/docker-down.sh script to shut down the container, and a bin/psql.sh to launch a SQL command prompt for the cggdb database.

After starting the container, it is necessary to create the cggdb database and populate it with initial data, using the setup-db.sh script:

bin/setup-db.sh
#!/usr/bin/env bash

docker exec -i --user postgres cgg_db_1 createdb cggdb

docker exec -i --user postgres cgg_db_1 psql cggdb -a  <<__END
create user cgg_role password 'lacinia';
__END

docker exec -i cgg_db_1 psql -Ucgg_role cggdb -a <<__END
drop table if exists designer_to_game;
drop table if exists game_rating;
drop table if exists member;
drop table if exists board_game;
drop table if exists designer;

CREATE OR REPLACE FUNCTION mantain_updated_at()
RETURNS TRIGGER AS \$\$
BEGIN
   NEW.updated_at = now();
   RETURN NEW;
END;
\$\$ language 'plpgsql';
 
create table member (
  member_id int generated by default as identity primary key,
  name text not null,
  created_at timestamp not null default current_timestamp,
  updated_at timestamp not null default current_timestamp);

create trigger member_updated_at before update
on member for each row execute procedure 
mantain_updated_at();
  
create table board_game (
  game_id int generated by default as identity primary key,
  name text not null,
  summary text,
  min_players integer,
  max_players integer,
  created_at timestamp not null default current_timestamp,
  updated_at timestamp not null default current_timestamp);

create trigger board_game_updated_at before update
on board_game for each row execute procedure 
mantain_updated_at();
 
create table designer (
  designer_id int generated by default as identity primary key,
  name text not null,
  uri text,
  created_at timestamp not null default current_timestamp,
  updated_at timestamp not null default current_timestamp);

create trigger designer_updated_at before update
on designer for each row execute procedure 
mantain_updated_at();
 
create table game_rating (
  game_id int references board_game(game_id),
  member_id int references member(member_id),
  rating integer not null,
  created_at timestamp not null default current_timestamp,
  updated_at timestamp not null default current_timestamp);

create trigger game_rating_updated_at before update
on game_rating for each row execute procedure 
mantain_updated_at();

create table designer_to_game (
  designer_id int  references designer(designer_id),
  game_id int  references board_game(game_id),
  primary key (designer_id, game_id));

insert into board_game (game_id, name, summary, min_players, max_players) values
  (1234, 'Zertz', 'Two player abstract with forced moves and shrinking board', 2, 2),
  (1235, 'Dominion', 'Created the deck-building genre; zillions of expansions', 2, null),
  (1236, 'Tiny Epic Galaxies', 'Fast dice-based sci-fi space game with a bit of chaos', 1, 4),
  (1237, '7 Wonders: Duel', 'Tense, quick card game of developing civilizations', 2, 2);

alter table board_game alter column game_id restart with 1300;

insert into member (member_id, name) values
  (37, 'curiousattemptbunny'),
  (1410, 'bleedingedge'),
  (2812, 'missyo');

alter table member alter column member_id restart with 2900;

insert into designer (designer_id, name, uri) values
  (200, 'Kris Burm', 'http://www.gipf.com/project_gipf/burm/burm.html'),
  (201, 'Antoine Bauza', 'http://www.antoinebauza.fr/'),
  (202, 'Bruno Cathala', 'http://www.brunocathala.com/'),
  (203, 'Scott Almes', null),
  (204, 'Donald X. Vaccarino', null);

alter table designer alter column designer_id restart with 300;

insert into designer_to_game (designer_id, game_id) values
  (200, 1234),
  (201, 1237),
  (204, 1235),
  (203, 1236),
  (202, 1237);

insert into game_rating (game_id, member_id, rating) values
  (1234, 37, 3),
  (1234, 1410, 5),
  (1236, 1410, 4),
  (1237, 1410, 4),
  (1237, 2812, 4),
  (1237, 37, 5);
__END

The DDL for the cggdb database includes a pair of timestamp columns, created_at and updated_at, in most tables. Defaults and database triggers ensure that these are maintained by PostgreSQL.

Primary Keys

There’s a problem with the data model we’ve used in prior chapters: the primary keys.

We’ve been using simple numeric strings as primary keys, because it was convenient. Literally, we just made up those values. But eventually, we’re going to be writing data to the database, including new Board Games, new Designers, and new Members.

With the change to using PostgreSQL, we’ve switched to using numeric primary keys. Not only are these more space efficient, but we have set up PostgreSQL to allocate them automatically. This is great news once we have multiple Clojure Game Geek servers running, as it ensures that primary keys are truly unique, enforced by the database. We’ll circle back to this issue when we add mutations to create new entities.

In the meantime, the schema has changed; the id fields have changed from type ID (which, in GraphQL, is a kind of opaque string) to type Int (a signed 32-bit integer).

resources/cgg-schema.edn
{:objects
 {:BoardGame
  {:description "A physical or virtual board game."
   :fields
   {:game_id {:type (non-null Int)}
    :name {:type (non-null String)}
    :rating_summary {:type (non-null :GameRatingSummary)
                     :resolve :BoardGame/rating-summary}
    :summary {:type String
              :description "A one-line summary of the game."}
    :description {:type String
                  :description "A long-form description of the game."}
    :designers {:type (non-null (list :Designer))
                :description "Designers who contributed to the game."
                :resolve :BoardGame/designers}
    :min_players {:type Int
                  :description "The minimum number of players the game supports."}
    :max_players {:type Int
                  :description "The maximum number of players the game supports."}
    :play_time {:type Int
                :description "Play time, in minutes, for a typical game."}}}

  :GameRatingSummary
  {:description "Summary of ratings for a single game."
   :fields
   {:count {:type (non-null Int)
            :description "Number of ratings provided for the game.  Ratings are 1 to 5 stars."}
    :average {:type (non-null Float)
              :description "The average value of all ratings, or 0 if never rated."}}}

  :Member
  {:description "A member of Clojure Game Geek.  Members can rate games."
   :fields
   {:member_id {:type (non-null Int)}
    :member_name {:type (non-null String)
                  :description "Unique name of member."}
    :ratings {:type (list :GameRating)
              :description "List of games and ratings provided by this member."
              :resolve :Member/ratings}}}

  :GameRating
  {:description "A member's rating of a particular game."
   :fields
   {:game {:type (non-null :BoardGame)
           :description "The Game rated by the member."
           :resolve :GameRating/game}
    :rating {:type (non-null Int)
             :description "The rating as 1 to 5 stars."}}}

  :Designer
  {:description "A person who may have contributed to a board game design."
   :fields
   {:designer_id {:type (non-null Int)}
    :name {:type (non-null String)}
    :url {:type String
          :description "Home page URL, if known."}
    :games {:type (non-null (list :BoardGame))
            :description "Games designed by this designer."
            :resolve :Designer/games}}}}

 :queries
 {:game_by_id
  {:type :BoardGame
   :description "Select a BoardGame by its unique id, if it exists."
   :args
   {:id {:type (non-null Int)}}
   :resolve :query/game-by-id}

  :member_by_id
  {:type :Member
   :description "Select a ClojureGameGeek Member by their unique id, if it exists."
   :args
   {:id {:type (non-null Int)}}
   :resolve :query/member-by-id}}

 :mutations
 {:rate_game
  {:type :BoardGame
   :description "Establishes a rating of a board game, by a Member.

   On success (the game and member both exist), selects the BoardGame.
   Otherwise, selects nil and an error."
   :args
   {:game_id {:type (non-null Int)}
    :member_id {:type (non-null Int)}
    :rating {:type (non-null Int)
             :description "Game rating as a number between 1 and 5."}}
   :resolve :mutation/rate-game}}}

In addition, the id field on the BoardGame, Member, and Designer objects has been renamed: to game_id, member_id, and designer_id, respectively. This aligns the field names with the database column names.

As Clojure developers, we generally follow the kebab case convention of using dashes in names. GraphQL, JSON, and most databases use snake case, with underscores. Snake case keywords in Clojure look slightly odd, but are 100% valid.
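For instance (a throwaway sketch, not part of the project code), snake case keywords are created and looked up exactly like any other Clojure keywords:

```clojure
;; Snake case keywords are ordinary keywords; creation and lookup
;; work exactly as with kebab case names.
(def row {:game_id 1234
          :min_players 2
          :max_players 2})

(:game_id row)
;; => 1234
```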

There’s nothing that prevents you from reading database data and converting the column names to kebab case … but you’ll just have to undo that somehow in the GraphQL schema, as kebab case is not valid for GraphQL names.

Much better to have as consistent a representation of the data as possible, spanning the database, the GraphQL schema, the Clojure data access code, and the over-the-wire JSON format … and not buy yourself any extra work that has no tangible benefits.

org.clojure/java.jdbc

This library is the standard approach to accessing a database from Clojure code. java.jdbc can access, in a uniform manner, any database for which there is a Java JDBC driver.

The clojure.java.jdbc namespace contains a number of functions for accessing a database, including functions for executing arbitrary queries, and specialized functions for performing inserts, updates, and deletes.

For all of those functions, the first parameter is a database spec, a map of data used to connect to the database. In a trivial case, this identifies the Java JDBC driver class, and provides extra information to build a JDBC URL, including details such as the database host, the user and password, and the name of the database.
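As a hypothetical illustration (this map is not from the project code; the values simply mirror this chapter’s Docker setup), a driver-class-based spec might look like:

```clojure
;; A minimal database spec map for clojure.java.jdbc.  The :classname
;; identifies the JDBC driver; :subprotocol and :subname combine into
;; the JDBC URL jdbc:postgresql://localhost:25432/cggdb.
(def db-spec
  {:classname   "org.postgresql.Driver"
   :subprotocol "postgresql"
   :subname     "//localhost:25432/cggdb"
   :user        "cgg_role"
   :password    "lacinia"})
```

With a plain map like this, each call such as (jdbc/query db-spec [...]) opens a brand-new connection built from these values.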

In practice, opening a new connection for each operation is unacceptable, so we’ll jump right in with a database connection pooling library, C3P0.

clojure-game-geek.db

In prior chapters, the :db component was just a wrapper around an Atom; starting here, we’re going to update it to be a wrapper around a connection to the PostgreSQL database running in the Docker container.

Our goal in this chapter is to update just one basic query to use the database, the query that retrieves a board game by its unique id. We’ll make just the changes necessary for that one query before moving on.

src/clojure_game_geek/db.clj
(ns clojure-game-geek.db
  (:require
    [com.stuartsierra.component :as component]
    [clojure.java.jdbc :as jdbc])
  (:import (com.mchange.v2.c3p0 ComboPooledDataSource)))

(defn ^:private pooled-data-source
  [host dbname user password port]
  {:datasource
   (doto (ComboPooledDataSource.)
     (.setDriverClass "org.postgresql.Driver" )
     (.setJdbcUrl (str "jdbc:postgresql://" host ":" port "/" dbname))
     (.setUser user)
     (.setPassword password))})

(defrecord ClojureGameGeekDb [ds]

  component/Lifecycle

  (start [this]
    (assoc this
           :ds (pooled-data-source "localhost" "cggdb" "cgg_role" "lacinia" 25432)))

  (stop [this]
    (-> ds :datasource .close)
    (assoc this :ds nil)))

(defn new-db
  []
  {:db (map->ClojureGameGeekDb {})})


(defn find-game-by-id
  [component game-id]
  (first
    (jdbc/query (:ds component)
                ["select game_id, name, summary, min_players, max_players, created_at, updated_at
               from board_game where game_id = ?" game-id])))

(defn find-member-by-id
  [component member-id]
  (->> component
       :db
       deref
       :members
       (filter #(= member-id (:id %)))
       first))

(defn list-designers-for-game
  [component game-id]
  (let [designers (:designers (find-game-by-id component game-id))]
    (->> component
         :db
         deref
         :designers
         (filter #(contains? designers (:id %))))))

(defn list-games-for-designer
  [component designer-id]
  (->> component
       :db
       deref
       :games
       (filter #(-> % :designers (contains? designer-id)))))

(defn list-ratings-for-game
  [component game-id]
  (->> component
       :db
       deref
       :ratings
       (filter #(= game-id (:game_id %)))))

(defn list-ratings-for-member
  [component member-id]
  (->> component
       :db
       deref
       :ratings
       (filter #(= member-id (:member_id %)))))

(defn ^:private apply-game-rating
  [game-ratings game-id member-id rating]
  (->> game-ratings
       (remove #(and (= game-id (:game_id %))
                     (= member-id (:member_id %))))
       (cons {:game_id game-id
              :member_id member-id
              :rating rating})))

(defn upsert-game-rating
  "Adds a new game rating, or changes the value of an existing game rating."
  [db game-id member-id rating]
  (-> db
      :db
      (swap! update :ratings apply-game-rating game-id member-id rating)))

The requires for the db namespace have changed; we’re using the clojure.java.jdbc namespace to connect to the database and execute queries, and also making use of the ComboPooledDataSource class, which allows for pooled connections.

The ClojureGameGeekDb record has changed; it now has a ds (data source) field, and that is the connection to the PostgreSQL database. The start method now opens the connection pool to the database, and the stop method shuts down the connection pool.

For now, we’ve hardwired the connection details (hostname, username, password, and port) to our Docker container. A later chapter will discuss approaches to configuration. Also note that we’re connecting to port 25432 on localhost; Docker will forward that port to the container’s port 5432.

That leaves the revised implementation of the find-game-by-id function; the only data access function rewritten to use the database connection. It simply constructs and executes the SQL query.

With clojure.java.jdbc the query is a vector consisting of a SQL query string followed by zero or more query parameters. Each ? character in the query string corresponds to a query parameter, based on position.
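For example (a sketch using the same table as this chapter; ds stands in for the component’s data source):

```clojure
;; The first element of the vector is the SQL string; the remaining
;; elements are bound, in order, to the ? placeholders.
(def game-query
  ["select game_id, name from board_game where game_id = ?" 1234])

;; (jdbc/query ds game-query) would then execute the parameterized query.
```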

The clojure.java.jdbc/query function returns a seq of matching rows. By default, each selected row is converted into a Clojure map, and the column names are converted from strings into keywords.

For an operation like this one, which returns at most one map, we use first.

If no rows match, then the seq will be empty, and first will return nil. That’s a perfectly good way to identify that the provided Board Game id was not valid.
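This is easy to see without a database; first on an empty seq (which is what query yields when nothing matches) returns nil:

```clojure
(first [])
;; => nil

;; A caller can therefore branch on the result with a simple nil check:
(if-some [game (first [])]
  (:name game)
  :not-found)
;; => :not-found
```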

At the REPL

Starting a new REPL, we can give the new code and schema a test:

(start)
=> :started
(q "{ game_by_id(id: 1234) { game_id name summary min_players max_players }}")
=>
{:data {:game_by_id {:game_id 1234,
                     :name "Zertz",
                     :summary "Two player abstract with forced moves and shrinking board",
                     :min_players 2,
                     :max_players 2}}}

Great! That works!

Notice how everything fits together: the column names in the database (game_id, summary, etc.) become keywords (:game_id, :summary, etc.) in the row map; the GraphQL field names follow the same convention, so everything meets in the middle, with GraphQL fields selecting those same keys from the map.

Meanwhile all the other clojure-game-geek.db namespace functions, expecting to operate against a map inside an Atom, are now broken. We’ll fix them in the next couple of chapters.

User Namespace Improvements

We’ve made some tiny changes to the user namespace:

dev-resources/user.clj
(ns user
  (:require
    [com.walmartlabs.lacinia :as lacinia]
    [clojure.java.browse :refer [browse-url]]
    [clojure-game-geek.system :as system]
    [clojure.walk :as walk]
    [com.stuartsierra.component :as component])
  (:import (clojure.lang IPersistentMap)))

(defn simplify
  "Converts all ordered maps nested within the map into standard hash maps, and
   sequences into vectors, which makes for easier constants in the tests, and eliminates ordering problems."
  [m]
  (walk/postwalk
    (fn [node]
      (cond
        (instance? IPersistentMap node)
        (into {} node)

        (seq? node)
        (vec node)

        :else
        node))
    m))

(defonce system nil)

(defn q
  [query-string]
  (-> system
      :schema-provider
      :schema
      (lacinia/execute query-string nil nil)
      simplify))

(defn start
  []
  (alter-var-root #'system (fn [_]
                             (-> (system/new-system)
                                 component/start-system)))
  (browse-url "http://localhost:8888/")
  :started)

(defn stop
  []
  (when (some? system)
    (component/stop-system system)
    (alter-var-root #'system (constantly nil)))
  :stopped)

(comment
  (start)
  (stop)
  )

To make loading and reloading the user namespace easier, we’ve changed the system Var to be a defonce. This means that even if the code for the namespace is reloaded, the system Var will maintain its value from before the code was reloaded.
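The difference is easy to demonstrate in isolation (hypothetical Var name, not from the project):

```clojure
;; defonce only binds the Var when it is currently unbound, so
;; re-evaluating the form (as happens when a namespace is reloaded)
;; leaves the existing value untouched.
(defonce app-state (atom 0))
(swap! app-state inc)

(defonce app-state (atom 0))  ; no effect on the running value

@app-state
;; => 1
```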

A common cycle is to make code changes, stop, then start the system.

We’ve moved the code that constructs a new system into the start function, and changed the stop function to reset the system Var to nil after stopping the system, if a system is in fact running.

Lastly, there’s a comment containing expressions to start and stop the system. This is great for REPL-oriented development; we can use the Cursive send form before caret to REPL command (Shift-Ctrl E) [2] to quickly and accurately execute those commands.

Next Up

We’ve been sloppy about one aspect of our application: we’ve entirely been testing at the REPL. It’s time to write some tests, then convert the rest of the db namespace.

[1]

A Docker container is the Inception of computers; a container is essentially a light-weight virtual machine that runs inside your computer.

To the PostgreSQL server running inside the container, it will appear as if the entire computer is running Linux, just as if Linux and PostgreSQL were installed on a bare-metal computer.

Docker images are smaller and less demanding than full operating system virtual machines. In fact, you will frequently run several interconnected containers together.

Docker includes infrastructure for downloading images from a central repository. Ultimately, it’s faster and easier to get PostgreSQL running inside a container than to install the database onto your computer.

[2]The author uses Cursive, but Emacs and other editors all have similar functionality.