Generate conditioned random binary array matrix in python

Question

what could be the most efficient way to generate a mxn binary array matrix which is contrained to the sum per column is equal 0 or 1 ?

somethin like this

[[0,0,1,0,0],
[1,1,0,0,0],
[0,0,0,0,1]
[0,0,0,0,0]]

m and n are going to be fixed, but n is larger than 500000 so iteration methods could take a long time until an appropied matrix is found.

What are the expectations in the row axis? Single value of 1, no values of 1, etc.? — Benjamin
– Benjamin, Commented Dec 14, 2016 at 20:52

Warren Weckesser · Accepted Answer · 2016-12-14 21:00:32Z

You are selecting a random subset of the columns, and then for each column, a random row. Here's one way to do that using numpy. The binomial distribution is used to select which columns get a 1. Change the second argument of numpy.random.binomial to tweak the density of columns with a 1.

In [156]: m = 5

In [157]: n = 12

In [158]: a = np.zeros((m, n), dtype=int)

In [159]: cols = np.random.binomial(1, 0.7, size=n)

In [160]: a[np.random.randint(0, m, size=cols.sum()), np.nonzero(cols)[0]] = 1

In [161]: a
Out[161]: 
array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 1, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0],
       [1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0],
       [0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0]])

If you want a 1 in each column, here's a fairly concise method:

In [103]: m = 5

In [104]: n = 12

In [105]: a = (np.random.randint(0, m, size=n) == np.arange(m).reshape(-1, 1)).astype(int)

In [106]: a
Out[106]: 
array([[0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1],
       [0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 1, 0],
       [0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0],
       [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])

np.random.randint(0, m, size=n) is the random selection of row indices, one for each column. np.arange(m).reshape(-1, 1) is the sequence [0, 1, ..., m-1] stored in an array with shape (m, 1). When this is compared to the random values, broadcasting applies, so an boolean array with shape (m, n) is created. Just convert that to integers, and you have your result.

Benjamin · Accepted Answer · 2016-12-14 20:38:25Z

1

You can select which column gets a value of 1:

a = numpy.zeros((ysize, xsize))
a[numpy.arange(ysize), numpy.random.choice(numpy.arange(xsize), ysize, replace=False)] = 1

answered Dec 14, 2016 at 20:38

Benjamin

11.9k13 gold badges74 silver badges120 bronze badges

Comments

Jean-François Fabre · Accepted Answer · 2016-12-14 20:42:05Z

First, generate a zero-filled matrix using a list comprehension and list multiplication (even faster) to generate each row. This is the step where you have to be fast. Transposing/copying data would hinder performance.
Then loop and choose a number. Between 0 and number of rows * 2 (so it sometimes goes off limits, leaving a 0-filled column). If it's in range, put a 1 there (you could change by (3*nb_rows)//2 to increase the number of ones. This step is fast: memory is already allocated, just mark cells.

code:

import random

m=7
n=5

matrix = [[0] * m for _ in range(n)]
print(matrix)
for i in range(m):
    a = random.randint(0,(3*n)//2)  # 66% chances to get a 1 somewhere
    if a < n:
        matrix[a][i] = 1

print(matrix)

a result:

[[0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0]]
[[0, 0, 0, 1, 0, 0, 0], [1, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0], [0, 1, 0, 0, 0, 1, 0]]

Brobin · Accepted Answer · 2016-12-14 20:48:13Z

1

You can use the standard library and a few clever list comprehensions to do this. First, we create the columns, with one in each being 1 or 0 so we satisfy the sum constraint. Then we flip the columns into rows to get the result.

from random import choice, randint

def generate_matrix(m, n):
    # Generate the columns
    columns = []
    for x in range(n):
        column = [0 for y in range(m)]
        column[randint(0, m - 1)] = choice([0, 1])
        columns.append(column)
    # Rotate the columns into rows
    rows = [
        [c[x] for c in columns]
        for x in range(m)
    ]
    return rows

m, n = 5, 4
matrix = generate_matrix(m, n)

Example output:

[0, 1, 0, 0]
[0, 0, 0, 0]
[0, 0, 0, 0]
[0, 0, 1, 0]
[0, 0, 0, 0]

edited Dec 14, 2016 at 20:48

answered Dec 14, 2016 at 20:20

Brobin

3,3362 gold badges21 silver badges35 bronze badges

2 Comments

jmparejaz Over a year ago

but the problem with this formulation is that it is constraining the matrix with a mandatory column equal to 1, it is avoiding the posibility that a column sum could be equal to 0, so in the line column[random.randint(0, n - 1)] = 1 #could be write as: column[random.randint(0, n - 1)] = random.randint(0,1)

Brobin Over a year ago

Updated. It now chooses 0 or 1.

Collectives™ on Stack Overflow

Generate conditioned random binary array matrix in python

4 Answers 4

Comments

Comments

Comments

2 Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

4 Answers 4

Comments

Comments

Comments

2 Comments

Your Answer

Sign up or log in

Post as a guest

Related