
I am trying to find a good solution for this question -

Implement two functions that assign/release unique IDs from a pool. Memory usage should be minimized, and assign/release should be fast even under high contention. alloc() returns an available ID; release(id) releases a previously assigned ID.

My first thought was to maintain a map of IDs and their availability (as booleans), something like this:

Map<Integer, Boolean> availabilityMap = new HashMap<>();

public Integer alloc() {
    for (Map.Entry<Integer, Boolean> es : availabilityMap.entrySet()) {
        if (!es.getValue()) {               // false = available
            Integer key = es.getKey();
            availabilityMap.put(key, true); // mark as assigned
            return key;
        }
    }
    return null; // no ID available
}

public void release(Integer id) {
    availabilityMap.put(id, false);
}

However this is not ideal for multiple threads and "Memory usage should be minimized and the assign/release should be fast, even under high contention."

What would be a good way to optimize both memory usage and speed? For memory usage, I think the map should be replaced with some other data structure, but I am not sure which one. Something like a bit map or bit set? How would I maintain the ID-to-availability mapping in that case? For concurrency I will have to use locks, but I am not sure how to handle contention effectively. Maybe put the available IDs in separate chunks so that each chunk can be accessed independently? Any good suggestions?
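For the bit set idea, a minimal sketch might look like the following (the class name and the coarse synchronized locking are my own choices, not from the question). Bit i set means ID i is currently assigned, and BitSet.nextClearBit finds the lowest free ID:

```java
import java.util.BitSet;

// Sketch of the bit-set idea: one bit per ID, set = assigned.
public class BitSetIdPool {
    private final BitSet used = new BitSet();

    public synchronized int alloc() {
        int id = used.nextClearBit(0); // lowest unassigned ID
        used.set(id);                  // mark it assigned
        return id;
    }

    public synchronized void release(int id) {
        used.clear(id);                // mark it available again
    }
}
```

This shrinks the footprint to one bit per ID in the range, but alloc() is O(n/64) in the worst case and the single lock serializes all threads, so it addresses the memory half of the question more than the contention half.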

2 Answers


First of all, you do not want to iterate over the entire map to find an available ID.

Instead, you can maintain two sets of IDs: one for available IDs and one for allocated IDs.

That makes allocation and release easy and fast.

You can also use concurrent sets (e.g. backed by ConcurrentHashMap) for both containers, which will reduce the contention.
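A rough sketch of that two-set idea (class and field names are illustrative, not from the answer) using sets backed by ConcurrentHashMap might look like this. alloc() only takes the first ID the iterator yields rather than scanning the whole set, and remove() is atomic, so only one thread wins if several race for the same ID:

```java
import java.util.Iterator;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of the two-set idea: a free set and an allocated set.
public class TwoSetIdPool {
    private final Set<Integer> free = ConcurrentHashMap.newKeySet();
    private final Set<Integer> allocated = ConcurrentHashMap.newKeySet();
    private final AtomicInteger next = new AtomicInteger();

    public int alloc() {
        // Reuse a freed ID if one exists; remove() arbitrates races.
        Iterator<Integer> it = free.iterator();
        while (it.hasNext()) {
            Integer id = it.next();
            if (free.remove(id)) {
                allocated.add(id);
                return id;
            }
        }
        int id = next.getAndIncrement(); // nothing free: mint a new ID
        allocated.add(id);
        return id;
    }

    public void release(int id) {
        if (allocated.remove(id)) { // ignore IDs we never handed out
            free.add(id);
        }
    }
}
```

Note that this trades memory for concurrency: both sets together hold every ID ever minted, so it does not minimize memory the way the bit-set or free-list approaches do.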



Edit: Changed bottom sentinel, fixed a bug


First, don't iterate the entire map to find an available ID; allocation should take only constant time.

What you could do to make it fast is this:

  1. Create an int index = 1; for your counter. This is technically the number of IDs generated + 1, and is always > 0.

  2. Create an ArrayDeque<Integer> free = new ArrayDeque<>(); to hold the free IDs. It guarantees constant-time access.

  3. When you allocate an ID, if the free ID queue is empty, you can just return the counter and increment it (i.e. return index++;). Otherwise, grab its head and return that.

  4. When you release an ID, push the previously used ID to the free deque.

  5. Remember to synchronize your methods.

This guarantees O(1) allocation and release, and it also keeps memory allocation quite low (at most one boxed Integer per freed ID). Although the methods are synchronized, they're fast enough that contention shouldn't be a problem.

An implementation might look like this:

import java.util.ArrayDeque;

public class IDPool {
    int index = 1;
    ArrayDeque<Integer> free = new ArrayDeque<>();

    public synchronized int acquire() {
        if (free.isEmpty()) return index++;
        else return free.pop();
    }

    public synchronized void release(int id) {
        free.push(id);
    }
}

Additionally, if you want to ensure the free ID list is unique (as you should for anything important) as well as persistent, you can do the following:

  1. Use a HashMap<Integer, Integer>, mapping each generated ID to the previous free ID, to hold all generated IDs. Remember it doesn't need to be ordered or even iterated.

    • This is technically going to be a stack encoded inside a hash map.
    • Highly efficient implementations of this exist.
    • In reality, any unordered int -> int map will do here.
  2. Track the top ID of the free ID stack. Since real IDs are always ≥ 2 here, 1 can represent "no free IDs" and 0 can mark an ID as in use, so you don't have to box anything. Initially, this would just be int top = 1;

  3. When allocating an ID, if there are free IDs (i.e. top >= 2), do the following:

    • Set the new top to the old head's value in the free map.
    • Set the old top's value in the map to 0, marking it used.
    • Return the old top.
  4. When releasing an old ID, do this instead:

    • If the old ID is already in the pool, return early, so we don't corrupt it.
    • Set the ID's value in the map to the old top.
    • Set the new top to the ID, since it's always the last one to use.

The optimized implementation would end up looking like this:

import java.util.HashMap;

public class IDPool {
    int index = 2;
    int top = 1;
    HashMap<Integer, Integer> pool = new HashMap<>();

    public synchronized int acquire() {
        int id = top;
        if (id == 1) return index++;
        top = pool.replace(id, 0);
        return id;
    }

    public synchronized void release(int id) {
        if (pool.getOrDefault(id, 0) != 0) return; // already in the free list
        pool.put(id, top);
        top = id;
    }
}

If need be, you could use a growable integer array instead of the hash map (it's always contiguous), and realize significant performance gains. Matter of fact, that is how I'd likely implement it. It'd just require a minor amount of bit twiddling to do so, because I'd maintain the array's size to be rounded up to the next power of 2.
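As a rough illustration of that array-backed variant (my own sketch under the same sentinel scheme, not the answerer's actual code), the free list can be threaded through a plain int[], with the array grown to the next power of two as needed:

```java
import java.util.Arrays;

// Sketch of the array-backed free list: pool[id] holds the previous
// free ID, or 0 if id is in use; 1 is the "empty free list" sentinel.
public class ArrayIdPool {
    private int index = 2;            // next never-used ID; real IDs start at 2
    private int top = 1;              // top of the free-list stack
    private int[] pool = new int[16]; // grown in powers of two

    public synchronized int acquire() {
        int id = top;
        if (id == 1) {                // free list empty: mint a new ID
            id = index++;
            if (id >= pool.length) grow(id);
            return id;                // pool[id] is already 0 (in use)
        }
        top = pool[id];               // pop the stack
        pool[id] = 0;                 // mark in use
        return id;
    }

    public synchronized void release(int id) {
        if (id < 2 || id >= index || pool[id] != 0) return; // unknown or already free
        pool[id] = top;               // push onto the stack
        top = id;
    }

    private void grow(int needed) {
        int size = pool.length;
        while (size <= needed) size <<= 1; // round up to the next power of two
        pool = Arrays.copyOf(pool, size);
    }
}
```

Compared with the HashMap version, this avoids boxing entirely and keeps the data contiguous, at the cost of 4 bytes per ID ever minted.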


Yeah...I had to write a similar pool in JavaScript because I actually needed moderately fast IDs in Node.js for potentially high-frequency, long-lived IPC communication.

The good thing about this is that it generally avoids allocations (worst case being once per acquired ID when none are released), and it's very amenable to later optimization where necessary.

4 Comments

Do you have a copy of your JavaScript code for this part? I'm interested in the bit-twiddling way of doing it.
I use the more naïve variant in my code (I need it fast, but I don't need to count CPU cycles for it)
Actually, the memory footprint of this is pretty large; at least 4 bytes per number in the range. If each bit in a sequence represents a corresponding ID (which equals its index), then your footprint drops by a factor of 32. Then the problem really is rapidly finding a free number, since you can't use the FIFO queue to make life easy.
"The good thing about this is that it generally avoids allocations (worst case being once per acquired ID when none are released), and it's very amenable to later optimization where necessary." - It's a pretty straightforward optimization to implement. And unless you're dealing with tens of millions of IDs, I doubt it's going to matter. Now if I were to make one that scaled to billions, I'd of course use an optimized priority queue plus "find first set".
