Using Postgres 9.5, how can I query a table with a JSONB column, joining two JSON fields, crucially utilising a GIN index to maximise performance?
I'm prototyping a system which can accommodate different schemas/data models for different customers on a single platform. Entity-Attribute-Value (EAV) is a common pattern for this and I'm keen to benchmark how well Postgres can handle queries on entities stored using its JSONB data type.
I have a single table which stores entities of different type e.g. customer, sales transaction.
CREATE TABLE entity_jsonb (
id uuid PRIMARY KEY,
type text,
created_at timestamp without time zone,
properties jsonb,
org_id integer
);
CREATE INDEX entity_jsonb_created_at_idx ON entity_jsonb USING btree (created_at);
CREATE INDEX entity_jsonb_org_id_idx ON entity_jsonb USING btree (org_id);
CREATE INDEX entity_jsonb_type_idx ON entity_jsonb USING btree (type);
CREATE INDEX entity_jsonb_properties_idx ON entity_jsonb USING gin (properties);
I have a customer record in this table with type 'customer' and properties:
{"uuid": "8f8896c7-f41c-49f7-ad6e-4613f7b51a23", "email": "[email protected]", "lastname": "McCarthy", "createdAt": "May 27, 2015 12:06:45 PM", "firstname": "Nathan"}
I have a sales transaction record in this table with type 'sales_transaction' and properties:
{"uuid": "54243d48-e69f-4bb6-ab33-6defb8a0f626", "amount": 99817, "status": 0, "paymentType": 1, "currencyCode": "USD", "customerUuid": "8f8896c7-f41c-49f7-ad6e-4613f7b51a23", "transactionId": "471170"}
How can I take advantage of the GIN index (operators: @>, ?, ?&, ?|) to query transactions efficiently, but joining with customers to return data from both e.g. populating a list of transactions from the last 7 days, showing customer name as well as transaction amount?
I've tried this query to find a subset of transaction objects (for amount = 99817) and related customer details:
SELECT t.properties AS transaction, c.properties AS customer
FROM entity_jsonb t
JOIN entity_jsonb c ON (c.properties->>'uuid' = t.properties->>'customerUuid' AND c.type = 'customer')
WHERE t.type = 'sales_transaction' AND t.properties @> '{"amount" : 99817}';
It works but it's not using the GIN index i.e. not as fast as I'd like it to be. I tried this approach (using the GIN @> containment operator) but it doesn't include the customer details. What am I doing wrong?
SELECT t.properties AS transaction, c.properties AS customer FROM entity_jsonb t LEFT JOIN entity_jsonb c ON (c.properties @> json_build_array(json_build_object('uuid', t.properties->'customerUuid'))::jsonb AND c.type = 'customer') WHERE t.type = 'sales_transaction' AND t.properties @> '{"amount" : 99817}';
I understand that this is not an optimum solution compared with a traditional relational design but I'm interested to see how efficient the query can be purely storing the entity data as JSON.