For reasons unknown I've recently taken up generating word squares, or more accurately double word squares. Below you can see my implementation in Python, in about 40 lines. The code uses this word list. It takes around 10ms for dim=2 (2x2 word squares) and around 13 seconds for dim=3 (3x3 squares). For dim=4 it explodes to something like 3.5 hours in cpython, and causes a MemoryError in pypy. For dim=20 and dim=21 it returns 0 solutions in a few seconds, but for dim=18 and dim=19 it causes a MemoryError in both pypy and cpython.
So, I am looking for improvements that will allow me to explore values of dim >4, but also to explore ways of solving the MemoryErrors for the values <20. Small improvements of a few %, improvements in how things are expressed, as well as large improvements in the algorithm are all welcome.
import time
start = time.clock()
dim = 3 #the dimension of our square
posmax = dim**2 #maximum positions on a dim*dim square
words = open("word.list").read().splitlines()
words = set([w for w in words if len(w)==dim])
print 'Words: %s' % len(words)
prefs = {}
for w in words:
for i in xrange(0,dim):
prefs[w[:i]] = prefs.get(w[:i], set())
prefs[w[:i]].add(w[i])
sq, options = ['' for i in xrange(dim)], {}
for i in prefs:
for j in prefs:
options[(i,j)] = [(i+o, j+o)
for o in prefs[i] & prefs[j]]
schedule = [(p/dim, p%dim) for p in xrange(posmax)]
def addone(square, isquare, position=0):
if position == posmax: yield square
else:
x,y = schedule[position]
square2, isquare2 = square[:], isquare[:]
for o in options[(square[x], isquare[y])]:
square2[x], isquare2[y] = o
for s in addone(square2, isquare2, position+1):
yield s
print sum(1 for s in addone(sq, sq[:]))
print (time.clock() - start)