Tell me more ×
Stack Overflow is a question and answer site for professional and enthusiast programmers. It's 100% free, no registration required.

I am using a PostgreSQL multi-dimensional array to mimic an array of hashes, and I am looking for a way to locate a record by a key-value pair in that array like e.g ["key1","value1"]. An example array is:

[ ["key1","value1"], ["key2","value2"] ]

The keys are in my case languages and a key may occur more than once.

Is there an efficient way to locate a record by a key-value pair using a PostgreSQL index with Rails 4?

Edit: fixed typo

share|improve this question
 
can you use extensions? then look on Hstore extension - it is exactly what you want - stackoverflow.com/questions/13766300/hstore-and-rails –  Pavel Stehule Jun 25 at 7:34
 
I looked at Hstore, but as Craig says it only supports single-level keys and due to key collision I would need to return an array for a value like e.g { "key1" => ["value1", "value2"] }. –  Andreas Sæbjørnsen Jun 25 at 8:56
 
Is there any particular reason for looking for a convoluted storage solution instead of database fields and tables? –  Denis Jun 25 at 9:15
 
Denis: I have a table that must be auditable, and that also need slugs per locale. I am trying to create a slugging library that does not need extra tables to simplify interactions with the audit requirement, but it might be a blind alley. –  Andreas Sæbjørnsen Jun 25 at 9:20
1  
@AndreasSæbjørnsen What does single-field give you that a two-level table doesn't in audit terms? Either way, you have a transaction that adds a change or set of changes to an audit table using a trigger when the table changes. –  Craig Ringer Jun 25 at 9:41
show 2 more comments

1 Answer

up vote 2 down vote accepted

I am using a PostgreSQL multi-dimensional array to mimic an array of hashes

Those two things aren't really all that similar, and I wouldn't recommend attempting to use multidimensional arrays to model nested hashes.

Pavel is quite right that hstore is probably a lot closer to what you want, and it's indexable too. However, the current version of hstore (in Pg 9.3 and older) supports only single-level keys; it's a dictionary/hash that can contain only scalar string values. A planned enhancement to hstore for PostgreSQL 9.4 will hopefully bring multi-level nesting and JSON syntax compatibility.

Ordinary tables

You can model arbitrary-depth key/value chains(and trees/graphs) using edgelists and recursive CTEs, but this probably rather more complexity than you really want.

If you only need a fixed two-level key/value list, simply use a table that lists both key levels:

CREATE TABLE twolevel(key1 text, key2 text, thevalue text not null, PRIMARY KEY(key1,key2));

This lets you constrain against duplicate key pairs, which is nice.

You can also use two tables with a foreign key relationship between them. This gives you cascade deletes if you want, so removing a top level key removes all sub-level keys and associated values. It's easy enough to do that with the single-table approach, though.

Use one of these two approaches unless you have good reasons to do otherwise.

Hstore as text

Until the extended hstore is available, one option would be to store text representations of nested hstore fields. This is not pretty or efficient, but it's probably better than trying to search a multidimensional array.

CREATE TABLE nested_hstore(id integer, blah hstore);

insert into nested_hstore(id, blah) values 
(1, hstore( ARRAY['key1','key2'], ARRAY['"key1.1"=>"value1.1", "key1.2"=>"value1.2"', '"key2.1"=>"value2.1", "key2.2"=>"value2.2"']::hstore[]::text[]));

Test:

regress=> select (blah->'key1')::hstore->'key1.1' from nested_hstore ;
 ?column? 
----------
 value1.1
(1 row)

Because the hstore must be parsed each time it's not going to be super-fast, and you won't get the usual indexing benefits on the second level. Still, it's an option if you really genuinely need two-level hashes in fields.

Tables of hstore values

You can combine these two quite reasonably.

CREATE TABLE twolevel(key1 text, level2keyvalues hstore);

It seems pretty ugly to me, though; I'd prefer to be consistent one way or the other.

SQL/XML

Another option is to use SQL/XML, which you can index along arbitrary XPATH expressions. Again, this seems a bit too complicated.

share|improve this answer
 
It is hard for me to make an informed choice about EdgeLists and recursive CTEs since I do not have much experience with them. However, you are correct that it seems like an undesirable solution because of its complexity. Maybe I am stuck using multiple arrays, one for each language? That is also not desirable. –  Andreas Sæbjørnsen Jun 25 at 9:09
1  
@AndreasSæbjørnsen Added some detail. Basically I think you should probably just use an ordinary table for this. –  Craig Ringer Jun 25 at 9:39
 
I ended up using a more traditional schema design as you suggested. It does not seem like the current implementation of the hstore and array feature is that useful in this context. –  Andreas Sæbjørnsen Jul 11 at 20:05
 
@AndreasSæbjørnsen I tend to agree. If the v2 hstore work that supports nesting and json syntax goes well then hopefully that'll change, maybe in 9.5... –  Craig Ringer Jul 11 at 23:07

Your Answer

 
discard

By posting your answer, you agree to the privacy policy and terms of service.

Not the answer you're looking for? Browse other questions tagged or ask your own question.