Including multiple columns in a single index in Postgres

Question

I have a 'users' table with two columns, 'email' and 'new_email'. I need:

A case-insensitive uniqueness constraint covering both columns - i.e., if "[email protected]" appears in one row's 'email' column, then inserting "[email protected]" into another row's (or even the same row's) 'new_email' column should fail.
Fast case-insensitive searching for a given email address in either the 'email' or 'new_email' fields - i.e. find the row where the new_email OR email is "[email protected]", case-insensitive.

I know that I could do this more easily by creating a related 'emails' table, but I'm expecting to be looking up users in this table (by primary key) from several applications, and I'd like to avoid duplicating the join logic in various places to also retrieve their emails. So I think some kind of expression index would be best, if that's possible.

If this isn't possible, I suppose my next best option would be to create a view that the other applications could use to easily fetch a user's emails along with their other information, but I'm not sure how to do that either.

I'm using Postgres 8.4. Thank you!

mu is too short · Accepted Answer · 2011-07-03 01:06:41Z

I think you'll have to use a trigger to enforce your cross-column uniqueness constraint. If you add unique indexes on each column and then a trigger something like this (untested off the top of my head code):

CREATE FUNCTION no_dups_allowed() RETURNS trigger AS $$
DECLARE
    r ROW;
BEGIN
    SELECT 1 INTO r
    FROM users
    WHERE LOWER(email)     = LOWER(NEW.email_new)
       OR LOWER(email_new) = LOWER(NEW.email);
    IF FOUND THEN
        -- Found a duplicate so it is time for a hissy fit!
        RAISE 'Duplicate email address found' USING ERRCODE = 'unique_violation';
    END;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

You'd want something like that as a BEFORE INSERT and BEFORE UPDATE trigger. That trigger would take care of catching cross-column duplicates and the unique indexes would take care of in-column duplicates.

Some useful references:

You'll want the individual indexes for your queries anyway and using the uniqueness half of the indexes simplifies your trigger by leaving it to only deal with the cross-column part; if you try to do it all in the trigger, then you'll have to watch out for updating a row without really changing the email or email_new columns.

For the querying half, you could create a view that used a UNION to combine the two columns. You could also create a function to merge the user's email addresses into one list. Hard to say which would be best without know more details of these other queries but I suspect that fixing all the other queries to know about email and email_new would be the best approach; you'll have to update all the other queries to use the view or function anyway so why build a view or function at all?

Two lowercase UNIQUE INDEX functions with a table check is your best bet. It isn't necessary to invoke a trigger to solve this problem. Use a derivation of what @Scott suggested below. — Sean, Jul 3 '11 at 16:42
@Sean: Scott's solution is nice but the the COALESCE leaves a hole. — mu is too short, Jul 3 '11 at 20:45
Agreed, I wouldn't use COALESCE or two columns to store old and current values in a single table. — Sean, Jul 3 '11 at 21:07

Scott Marlowe · Answer 2 · 2011-07-03 14:55:07Z

up vote 0 down vote

No need for triggers. Try this:

create  table et (email text, email2 text);
create unique index et_u on et (coalesce(lower(email),lower(email2)));
insert into et (email,email2) values ('[email protected]',NULL);
insert into et (email,email2) values ('[email protected]',NULL);
ERROR:  duplicate key value violates unique constraint "et_u"
insert into et (email,email2) values (NULL,'[email protected]');
ERROR:  duplicate key value violates unique constraint "et_u"
insert into et (email,email2) values (NULL,'[email protected]');
ERROR:  duplicate key value violates unique constraint "et_u"

answered Jul 3 '11 at 14:55

Scott Marlowe
1,89847

Well, it's not perfect, I can enter the same email into both fields at the same time. So you'd also need a check constraint that email<>email2 here. alter table et add constraint et_nn check (email<>email2); – Scott Marlowe Jul 3 '11 at 14:58

That's interesting. Would Postgres be smart enough to use that index on queries for an email address? Like WHERE lower(email) = '[email protected]'? Or is there another way I could phrase the query to get it to do so? – PreciousBodilyFluids Jul 3 '11 at 17:16

This fails if you insert (NULL,'[email protected]') and then ('[email protected]','[email protected]') when there isn't a "[email protected]" address yet, the COALESCE effectively hides the email2 from the uniqueness checks when email is not NULL. – mu is too short Jul 3 '11 at 20:43

add comment

asked	2 years ago
viewed	667 times
active	2 years ago

Explore our sites

Including multiple columns in a single index in Postgres

2 Answers

Your Answer

Not the answer you're looking for? Browse other questions tagged sql postgresql indexing index or ask your own question.

Community Bulletin

Hot Network Questions

Explore our sites

Including multiple columns in a single index in Postgres

2 Answers

Your Answer

Sign up or login

Post as a guest

Not the answer you're looking for? Browse other questions tagged sql postgresql indexing index or ask your own question.

Community Bulletin

Related

Hot Network Questions