Import data into postgres quickly, implemented using postgresql’s
COPY
in binary format.
Because sometimes jdbc/insert!
and friends aren’t fast enough.
This library uses type-based dispatch for determining the correct postgres binary format. See the mapping section for more info.
Add the library to your project. See description on Clojars for more information.
Given a table and some data:
(require '[clj-pgcopy.core :as pgcopy])
(jdbc/with-db-connection [conn conn-spec]
(jdbc/db-do-commands conn
["drop table if exists example"
"create table example(
internal_id bigint primary key,
external_id uuid,
height float8,
title varchar(64) not null,
description text,
values bytea,
dob date,
created_at timestamptz
)"]))
(def data
[{:internal_id 201223123
:external_id #uuid "1902c205-2bc6-40b8-943b-f5b199241316"
:height nil
:title "Mr. Sandman"
:description nil
:values (.getBytes "not very secret" "UTF-8")
:dob (java.time.LocalDate/of 1954 8 20)
:created_at (java.util.Date.)}
{:internal_id 2012391238
:external_id nil
:height 160.2
:title "Prince"
:description "Tonight we're gonna party"
:values (.getBytes "2000 Two Zero" "UTF-8")
:dob (java.time.LocalDate/of 1999 12 31)
:created_at (java.util.Date.)}])
With clojure.java.jdbc
, open a connection, prepare data rows (as
tuples, not maps), and call clj-pgcopy.core/copy-into!
:
(let [columns [:internal_id :external_id :height
:title :description :values :dob :created_at]]
(jdbc/with-db-connection [conn conn-spec]
(pgcopy/copy-into! (:connection conn)
:example
columns
(map (apply juxt columns) data))))
2
The table has been populated with the data:
(jdbc/with-db-connection [conn conn-spec]
(jdbc/query conn "table example"))
({:internal_id 201223123, :external_id #uuid "1902c205-2bc6-40b8-943b-f5b199241316", :height nil, :title "Mr. Sandman", :description nil, :values [110, 111, 116, 32, 118, 101, 114, 121, 32, 115, 101, 99, 114, 101, 116], :dob #inst "1954-08-20T04:00:00.000-00:00", :created_at #inst "2019-07-23T01:24:38.466000000-00:00"} {:internal_id 2012391238, :external_id nil, :height 160.2, :title "Prince", :description "Tonight we're gonna party", :values [50, 48, 48, 48, 32, 84, 119, 111, 32, 90, 101, 114, 111], :dob #inst "1999-12-31T05:00:00.000-00:00", :created_at #inst "2019-07-23T01:24:38.466000000-00:00"})
Note: depending on how you’ve set up clojure.java.jdbc
and its
IResultSetReadColumn
protocol, the types that come back on query may
differ from the above.
JVM type | Postgres type |
---|---|
Short | int2 (aka smallint) |
Integer | int4 (aka integer) |
Long | int8 (aka bigint) |
Float | float4 (aka real) |
Double | float8 (aka double presicion) |
BigDecimal | numeric/decimal |
Boolean | boolean |
String | text/varchar/char |
java.util.UUID | uuid |
JVM type | Postgres type |
---|---|
java.sql.Date | date |
java.time.LocalDate | date |
java.util.Date | timestamp[tz] |
java.sql.Timestamp | timestamp[tz] |
java.time.Instant | timestamp[tz] |
java.time.ZonedDateTime | timestamp[tz] |
java.time.OffsetDatetime | timestamp[tz] |
org.postgres.util.PGInterval | interval |
JVM type | Postgres type |
---|---|
java.net.Inet4Address | inet |
java.net.Inet6Address | inet |
JVM type | Postgres type |
---|---|
org.postgres.geometric.PGpoint | point |
org.postgres.geometric.PGline | line |
org.postgres.geometric.PGpath | path |
org.postgres.geometric.PGbox | box |
org.postgres.geometric.PGcircle | circle |
org.postgres.geometric.PGpolygon | polygon |
Things that are String-like, or serialized in string form, should work
using the String -> text mapping. An exception is the jsonb
type,
because the binary format requires a version signifier. Wrapping a
JSON string in a JsonB
handles that, which is provided by the
library.
Impemented for the following JVM-typed arrays for:
JVM type | Postgres type |
---|---|
int[] | int4[] (aka integer[]) |
long[] | int8[] (aka bigint[]) |
float[] | float4[] (aka real[]) |
double[] | float8[] (aka double precision[]) |
byte[] | bytea |
String[] | text[] (or varchar[]) |
java.util.UUID[] | uuid[] |
Currently, only 1-dimensional Postgres arrays are supported.
- more array types (date, timestamp, etc)
- range types
- hstore
- cidr, macaddr, macaddr8
- bit strings
- composite types / records
- multi-dimensional arrays