1. Missing GCD
You didn't give the code for your gcd
function, so I'll assume it's something like this:
def gcd(m, n):
"""Return the greatest common divisor of positive integers m and n."""
while n:
m, n = n, m % n
return m
2. Decomposition: primitive Pythagorean triples
You're having trouble finding the error here because your program consists of a single block of code. You have to run all of it in order to run any of it. It would make your program easier to debug if you split it up into functions with straightforward inputs and outputs, that can be tested individually. This technique is known as functional decomposition.
The first bit of functionality to pull out is the computation of the primitive Pythagorean triples. It's convenient to write this kind of function in the form of a generator — a function that produces its results one by one using the yield
statement.
So, we could start with a function like this:
def f(LIMIT):
for m in xrange(1, LIMIT):
for n in xrange(m + 1, LIMIT):
a = n ** 2 - m ** 2
b = 2 * m * n
if (m - n) % 2 == 0 or gcd(m, n) != 1:
continue
yield a, b
There are some immediate improvements that can be made:
The function needs a descriptive name and a docstring.
It's conventional to use ALL_CAPITALS
only for global constants (see the Python style guide, PEP 8, for this and other recommendations).
It would be make sense to generate all three sides of the triangle (you only use two of them here, but the third one is useful for testing, and it
will come in handy in other Project Euler problems).
The continue
statement can be avoided by reversing the sense of the condition.
itertools.combinations
can be used to reduce the two nested loops to one.
Here's the improved code:
from itertools import combinations
def primitive_pythagorean_triples(limit):
"""Generate the primitive Pythagorean triples whose shortest side is
no longer than 2 * limit - 3, together possibly with some other
primitive Pythagorean triples whose shortest side is longer than
that.
"""
for m, n in combinations(xrange(1, limit), 2):
if (m - n) % 2 and gcd(m, n) == 1:
a = n ** 2 - m ** 2
b = 2 * m * n
c = n ** 2 + m ** 2
yield a, b, c
The hardest part of doing this was writing the docstring. (a
might be as large as (limit - 1) ** 2 - (limit - 2) ** 2
which is 2 * limit - 3
.) When it's hard to write a docstring, this is often a hint that the function is poorly specified. In this case, surely it would be more convenient to pass in the actual maximum shortest side?
Let's check that these are really Pythagorean triples:
>>> all(a ** 2 + b ** 2 == c ** 2 for a, b, c in primitive_pythagorean_triples(100))
True
and have a quick look at the output:
>>> list(primitive_pythagorean_triples(8))
[(3, 4, 5), (15, 8, 17), (35, 12, 37), (5, 12, 13), (21, 20, 29), (45, 28, 53),
(7, 24, 25), (9, 40, 41), (33, 56, 65), (11, 60, 61), (13, 84, 85)]
Does that list include all the primitives triples with shortest side no more than 13? It's hard to tell. It would be easier to read the output if each triple were sorted by side length, so that (15, 8, 17) appears as (8, 15, 17). It would also be convenient to reject triples whose shortest side is too long.
Making these changes gives us a third version of the function:
def primitive_pythagorean_triples(max_side):
"""Generate the primitive Pythagorean triples whose shortest side is
no longer than max_side.
"""
for m, n in combinations(xrange(1, max_side // 2 + 3), 2):
if (m - n) % 2 and gcd(m, n) == 1:
a = n ** 2 - m ** 2
b = 2 * m * n
c = n ** 2 + m ** 2
if a < b and a <= max_side:
yield a, b, c
elif b < a and b <= max_side:
yield b, a, c
And now we can take the output:
>>> sorted(primitive_pythagorean_triples(13))
[(3, 4, 5), (5, 12, 13), (7, 24, 25), (8, 15, 17), (9, 40, 41), (11, 60, 61),
(12, 35, 37), (13, 84, 85)]
and compare it to, say, OEIS sequence A020884. This looks good so far, but to be more certain that this is correct, let's write a naïve version of the function (one that would be far too slow to solve the Project Euler problem) and compare the output of the two functions.
from math import hypot
def primitive_pythagorean_triples_slow(max_side):
for a in xrange(1, max_side + 1):
for b in xrange(a + 1, (a ** 2 - 1) // 2 + 1):
c = hypot(a, b)
if gcd(a, b) == 1 and c == int(c):
yield a, b, int(c)
>>> sorted(primitive_pythagorean_triples(300)) == sorted(primitive_pythagorean_triples_slow(300))
True
3. Decomposition: all Pythagorean pairs
The next bit of functionality to decompose is the computation of the Pythagorean pairs. Your code looks like this:
pyt = {}
k = 1
while True:
temp_a = a * k
temp_b = b * k
max_pyt = max(temp_a, temp_b)
min_pyt = min(temp_a, temp_b)
# I only need the minimum to be smaller than the LIMIT
# but the maximum can't be larger than 2*LIMIT (check below)
if min_pyt > LIMIT:
break
pyt[min_pyt, max_pyt] = 1
k += 1
There are lots of small improvements we can make here:
Make it into a function we can test separately.
Write a docstring.
Generate these pairs instead of storing them in a dictionary.
Use itertools.count
to simplify the loop.
Choose better names for the variables. For example, what's temporary about temp_a
and why is that important? I'd prefer a name like ka
to remind us that this variable holds the value k * a
.
Here's the improved code:
from itertools import count
def pythagorean_pairs(max_side):
"""Generate pairs of integers (a, b), where a is the short leg and b
the long leg of a Pythagorean triangle, and where a <= max_side
and b <= 2 * max_side.
"""
for a, b, _ in primitive_pythagorean_triples(max_side):
for k in count(1):
ka, kb = k * a, k * b
if ka > max_side or kb > 2 * max_side:
break
yield ka, kb
Again, at this point it is useful to have a naïve computation of the same set of pairs, to test the faster version of the code.
def pythagorean_pairs_slow(max_side):
for a in xrange(1, max_side + 1):
for b in xrange(a + 1, 2 * max_side + 1):
c = hypot(a, b)
if c == int(c):
yield a, b
>>> sorted(pythagorean_pairs(1000)) == sorted(pythagorean_pairs_slow(1000))
True
4. Decomposition: counting cuboids
Your code for counting the cuboids contains a lot of comments, which is a sign that it's not very clearly written. Let's pull out the body of the loop into a function and see if we can make sense of it:
def count_cuboids(a, b, LIMIT):
"""Given a Pythagorean pair (a, b), return the number of cuboids (p, q, r)
such that 1 <= p, q, r <= LIMIT and ????
"""
counter = 0
temp = b // 2
if temp >= LIMIT:
return 0
elif b < LIMIT:
counter += (a // 2)
if temp <= a:
counter += (a - temp)
if (a - temp) % 2 == 0:
counter += 1
return counter
Now, the problem I have here is that I can't figure out exactly which cuboids this is supposed to count, so I don't know how to write the docstring.
Your comment at the top says:
# 1, 5, 8 | 2, 4, 8 | 3, 3, 8 (when decomposing 6)
# 6, 2, 6 | 6, 3, 5 | 6, 4, 4 (when decomposing 8, although 6, 1, 7 are not
# good, the largest of the 3 needs to be a member of the Pythagorean triplet)
and indeed:
>>> count_cuboids(6, 8, 100)
6
so this suggests that the docstring for count_cuboids
might need to be:
"""Given a Pythagorean pair (a, b), return the number of cuboids (p, q, r)
such that 1 <= p, q, r <= LIMIT and
either: p <= q <= r and p + q == a and r == b
or: p == a and q <= r <= p and q + r == b."""
but then I tried the simplest example:
>>> count_cuboids(3, 4, 100)
2
which doesn't match your specification. According to your specification, there ought to be three possible cuboids here, not two: (1, 2, 4), (3, 1, 3) and (3, 2, 2).
So your code for counting the cuboids does not match your specification. One or both of them is wrong.
5. Conclusion
I don't want to spoil the Project Euler problem, so I'll leave you to take it from here. I'll finish by repeating the key points:
Use functional decomposition to split your code up into parts that you can test separately.
Write docstrings explaining what each function does. If you are having trouble writing concise and clear docstrings, this is a sign that the way you have decomposed your code was poorly chosen and needs to be reconsidered.
Make use of generators to produce and consume sequences of results without having to store them in intermediate data structures.
When you have a complicated, optimized function, also implement a naïve, straightforward version of the same function that you can use to test it.