You are not calculating any samples. Instead, you are creating all possible sequences out of the elements on population
of length n
. Consider population = {1, 1, 2, 3}
and n = 3
. Then you would produce:
1 1 1 1 1 1 2 1 1 3 1 1
1 1 1 1 1 1 2 1 1 3 1 1
1 1 2 1 1 2 2 1 2 3 1 2
1 1 3 1 1 3 2 1 3 3 1 3
1 1 1 1 1 1 2 1 1 3 1 1
1 1 1 1 1 1 2 1 1 3 1 1
1 1 2 1 1 2 2 1 2 3 1 2
1 1 3 1 1 3 2 1 3 3 1 3
1 2 1 1 2 1 2 2 1 3 2 1
1 2 1 1 2 1 2 2 1 3 2 1
1 2 2 1 2 2 2 2 2 3 2 2
1 2 3 1 2 3 2 2 3 3 2 3
1 3 1 1 3 1 2 3 1 3 3 1
1 3 1 1 3 1 2 3 1 3 3 1
1 3 2 1 3 2 2 3 2 3 3 2
1 3 3 1 3 3 2 3 3 3 3 3
The number of sequences is population.sizen
(here: 4³ = 64).
A Bernoulli sample would return one of these sequences with equal probability.
If you want to return a Bernoulli sample, then the following code would be better:
static <T> List<T> bernoulliSample(int size, List<T> population, Random rng) {
assert population != null;
assert rng != null;
List<T> sequence = new ArrayList<T>(size);
for (int i = 0; i < size; i++) {
sequence.add(population.get(rng.nextInt(population.size())));
}
return sequence;
}
If you want to construct all possible sequences (≠ permutations), then we can pre-allocate a data structure that holds all sequences. However, it might be better to return an Iterator
or Iterable
object that computes the sequences lazily.
Eager solution
static <T> List<List<T>> allSequences(int size, List<T> population) {
assert population != null;
assert size >= 0;
int sequenceCount = ipow(population.size(), n); // see http://stackoverflow.com/a/10517609/1521179
List<List<T>> sequences = new ArrayList<List<T>>(sequenceCount);
for (int i = 0; i < sequenceCount; i++) {
sequences.add(sequenceAt(i, population));
}
return sequences;
}
Now what is sequenceAt
? Each integer in the range [0, sequenceCount)
can be translated into a specific sequence, much like decimal numbers can be translated into octal. The difference here is that instead of digits, we have items in population
.
private static <T> List<T> sequenceAt(int size, int n, List<T> population) {
List<T> sequence = new ArrayList<T>(size);
for (int i : asBase(size, n, population.size())) {
sequence.add(population.get(i));
}
return sequence;
}
Notice that we don't copy around any half-finished lists here, and we don't rely on recursion.
This method here will transform an integer into a list of digits in another base:
private static int[] asBase(int size, int n, int base) {
int digits[] = new int[size];
for (int i = size - 1, rem = n; i >= 0; i--, rem /= base) {
digits[i] = rem % base;
}
return digits;
}
Lazy solution
This can reuse the helper methods from above solution.
class Sequences<T> implements Iterable<List<T>> {
private final List<T> population;
private final int count;
private final int size;
public Sequences(int size, List<T> population) {
assert population != null;
this.population = population;
assert size >= 0;
this.size = size:
this.count = ipow(population.size(), size); // see http://stackoverflow.com/a/10517609/1521179
}
public Iterator<List<T>> iterator() {
return new Iterator<List<T>>() {
int cursor = 0;
public boolean hasNext() {
return cursor < count;
}
public List<T> next() {
if (!hasNext()) throw new NoSuchElementException();
return sequenceAt(size, cursor++, population);
}
public void remove() {
throw new UnsupportedOperationException();
}
// sequenceAt, asBase
};
}
// ipow
}
Now we could print out all sequences like
for (List<Integer> sequence : new Sequences<Integer>(3, Arrays.asList(1, 1, 2, 3))) {
for (int i : sequence) System.out.print(i + " ");
System.out.println();
}
With both of these solutions we could skip the indirection from translating a number to a specific sequence. Instead, we could maintain an array of indices directly and update it after each sequence. Such a solution could be more efficient, as it results in fewer allocations.