Pythagorean number decomposition

Question

I have a pretty interesting problem that I'm trying to solve, and I am kind of stuck.

I want to count the triplets of distinct combinations a, b, c, that are smaller than LIMIThave the following properties:

a >= b, a >= c, and let's note m = a + b (a, b are integer)
a and m are the smallest 2 numbers in a Pythagorean triplet (a^2 + m^2 = x^2). m can be bigger than a, but a has to be larger than b or c

Here is the code I have so far. First, I generate all Pythagorean tripplets (of which I only use the first 2). The smallest one should be under the limit, but the largest one can be above the limit.

LIMIT = 10
pyt = {}

for m in xrange(1, LIMIT):
    for n in xrange(m + 1, LIMIT):
        k = 1
        a = n ** 2 - m ** 2
        b = 2 * m * n
        if (m - n) % 2 == 0 or gcd(m, n) != 1:
            continue

        while True:
            temp_a = a * k
            temp_b = b * k
            max_pyt = max(temp_a, temp_b)
            min_pyt = min(temp_a, temp_b)
            # I only need the minimum to be smaller than the LIMIT
            # but the maximum can't be larger than 2*LIMIT (check below)
            if min_pyt > LIMIT:
                break
            pyt[min_pyt, max_pyt] = 1
            k += 1

Once I have them, I just loop through them, and count the possible combinations that can be generated.

For example:

# pythagorean numbers: 6, 8 => (6^2 + 8^2 = 10^2)
# we can generate the numbers: 
# 1, 5, 8 | 2, 4, 8 | 3, 3, 8 (when decomposing 6)
# 6, 2, 6 | 6, 3, 5 | 6, 4, 4 (when decomposing 8, although 6, 1, 7 are not good, the largest of the 3 needs to be a member of the Pythagorean triplet )

counter = 0
print pyt
for a, b in pyt:
    # if half of b is bigger than the limit, just continue
    temp = b // 2 
    if temp >= LIMIT:
        continue
    elif b < LIMIT:
        # a is smaller than b, so any unordered combination of numbers with the sum a
        # will be good (4 = 1 + 3 or 4 = 2 + 2, 5 = 1 + 4, 5 = 2 + 3 - ie // works)
        counter += (a // 2)

    # at this point we know that temp (half of b) is not higher than the limit
    # and if it's less than a, there are combinations to be counted
    if temp <= a:
        # with the limit 10, we can have the following situation:
        # a = 8, b = 15, so we can count the set (8, 8, 7)
        counter += (a - temp) # 
        if (a - temp) % 2 == 0:
            counter += 1

I hope I explained it well enough. The script seems to be working with lower values, but when going to higher numbers, I'm starting to miss things in limit situations, but I can't figure out what.

For example, with LIMIT = 99 I should be getting 1975, but I get 1943. Any ideas what I could be missing ? Any hint is more than welcome.

Also, since I'm still learning python, I wouldn't mind some tips on how to improve my code, and follow python coding standards.

Edit: The gcd function is from the fractions module (from fractions import gcd)

Yes, that's true. And even though there are plenty of solutions out there, I still want to pursue my idea to get the result myself. — Vlad Preda, Oct 8 '13 at 5:55

Gareth Rees · Accepted Answer · 2013-10-08 13:47:54Z

1. Missing GCD

You didn't give the code for your gcd function, so I'll assume it's something like this:

def gcd(m, n):
   """Return the greatest common divisor of positive integers m and n."""
   while n:
       m, n = n, m % n
   return m

2. Decomposition: primitive Pythagorean triples

You're having trouble finding the error here because your program consists of a single block of code. You have to run all of it in order to run any of it. It would make your program easier to debug if you split it up into functions with straightforward inputs and outputs, that can be tested individually. This technique is known as functional decomposition.

The first bit of functionality to pull out is the computation of the primitive Pythagorean triples. It's convenient to write this kind of function in the form of a generator — a function that produces its results one by one using the yield statement.

So, we could start with a function like this:

def f(LIMIT):
    for m in xrange(1, LIMIT):
        for n in xrange(m + 1, LIMIT):
            a = n ** 2 - m ** 2
            b = 2 * m * n
            if (m - n) % 2 == 0 or gcd(m, n) != 1:
                continue
            yield a, b

There are some immediate improvements that can be made:

The function needs a descriptive name and a docstring.
It's conventional to use ALL_CAPITALS only for global constants (see the Python style guide, PEP 8, for this and other recommendations).
It would be make sense to generate all three sides of the triangle (you only use two of them here, but the third one is useful for testing, and it will come in handy in other Project Euler problems).
The continue statement can be avoided by reversing the sense of the condition.
itertools.combinations can be used to reduce the two nested loops to one.

Here's the improved code:

from itertools import combinations

def primitive_pythagorean_triples(limit):
    """Generate the primitive Pythagorean triples whose shortest side is
    no longer than 2 * limit - 3, together possibly with some other
    primitive Pythagorean triples whose shortest side is longer than
    that.

    """
    for m, n in combinations(xrange(1, limit), 2):
        if (m - n) % 2 and gcd(m, n) == 1:
            a = n ** 2 - m ** 2
            b = 2 * m * n
            c = n ** 2 + m ** 2
            yield a, b, c

The hardest part of doing this was writing the docstring. (a might be as large as (limit - 1) ** 2 - (limit - 2) ** 2 which is 2 * limit - 3.) When it's hard to write a docstring, this is often a hint that the function is poorly specified. In this case, surely it would be more convenient to pass in the actual maximum shortest side?

Let's check that these are really Pythagorean triples:

>>> all(a ** 2 + b ** 2 == c ** 2 for a, b, c in primitive_pythagorean_triples(100))
True

and have a quick look at the output:

>>> list(primitive_pythagorean_triples(8))
[(3, 4, 5), (15, 8, 17), (35, 12, 37), (5, 12, 13), (21, 20, 29), (45, 28, 53),
 (7, 24, 25), (9, 40, 41), (33, 56, 65), (11, 60, 61), (13, 84, 85)]

Does that list include all the primitives triples with shortest side no more than 13? It's hard to tell. It would be easier to read the output if each triple were sorted by side length, so that (15, 8, 17) appears as (8, 15, 17). It would also be convenient to reject triples whose shortest side is too long.

Making these changes gives us a third version of the function:

def primitive_pythagorean_triples(max_side):
    """Generate the primitive Pythagorean triples whose shortest side is
    no longer than max_side.

    """
    for m, n in combinations(xrange(1, max_side // 2 + 3), 2):
        if (m - n) % 2 and gcd(m, n) == 1:
            a = n ** 2 - m ** 2
            b = 2 * m * n
            c = n ** 2 + m ** 2
            if a < b and a <= max_side:
                yield a, b, c
            elif b < a and b <= max_side:
                yield b, a, c

And now we can take the output:

>>> sorted(primitive_pythagorean_triples(13))
[(3, 4, 5), (5, 12, 13), (7, 24, 25), (8, 15, 17), (9, 40, 41), (11, 60, 61),
 (12, 35, 37), (13, 84, 85)]

and compare it to, say, OEIS sequence A020884. This looks good so far, but to be more certain that this is correct, let's write a naïve version of the function (one that would be far too slow to solve the Project Euler problem) and compare the output of the two functions.

from math import hypot

def primitive_pythagorean_triples_slow(max_side):
    for a in xrange(1, max_side + 1):
        for b in xrange(a + 1, (a ** 2 - 1) // 2 + 1):
            c = hypot(a, b)
            if gcd(a, b) == 1 and c == int(c):
                yield a, b, int(c)

>>> sorted(primitive_pythagorean_triples(300)) == sorted(primitive_pythagorean_triples_slow(300))
True

3. Decomposition: all Pythagorean pairs

The next bit of functionality to decompose is the computation of the Pythagorean pairs. Your code looks like this:

pyt = {}
k = 1
while True:
    temp_a = a * k
    temp_b = b * k
    max_pyt = max(temp_a, temp_b)
    min_pyt = min(temp_a, temp_b)
    # I only need the minimum to be smaller than the LIMIT
    # but the maximum can't be larger than 2*LIMIT (check below)
    if min_pyt > LIMIT:
        break
    pyt[min_pyt, max_pyt] = 1
    k += 1

There are lots of small improvements we can make here:

Make it into a function we can test separately.
Write a docstring.
Generate these pairs instead of storing them in a dictionary.
Use itertools.count to simplify the loop.
Choose better names for the variables. For example, what's temporary about temp_a and why is that important? I'd prefer a name like ka to remind us that this variable holds the value k * a.

Here's the improved code:

from itertools import count

def pythagorean_pairs(max_side):
    """Generate pairs of integers (a, b), where a is the short leg and b
    the long leg of a Pythagorean triangle, and where a <= max_side
    and b <= 2 * max_side.

    """
    for a, b, _ in primitive_pythagorean_triples(max_side):
        for k in count(1):
            ka, kb = k * a, k * b
            if ka > max_side or kb > 2 * max_side:
                break
            yield ka, kb

Again, at this point it is useful to have a naïve computation of the same set of pairs, to test the faster version of the code.

def pythagorean_pairs_slow(max_side):
    for a in xrange(1, max_side + 1):
        for b in xrange(a + 1, 2 * max_side + 1):
            c = hypot(a, b)
            if c == int(c):
                yield a, b

>>> sorted(pythagorean_pairs(1000)) == sorted(pythagorean_pairs_slow(1000))
True

4. Decomposition: counting cuboids

Your code for counting the cuboids contains a lot of comments, which is a sign that it's not very clearly written. Let's pull out the body of the loop into a function and see if we can make sense of it:

def count_cuboids(a, b, LIMIT):
    """Given a Pythagorean pair (a, b), return the number of cuboids (p, q, r)
    such that 1 <= p, q, r <= LIMIT and ????

    """
    counter = 0
    temp = b // 2 
    if temp >= LIMIT:
        return 0
    elif b < LIMIT:
        counter += (a // 2)
    if temp <= a:
        counter += (a - temp)
        if (a - temp) % 2 == 0:
            counter += 1
    return counter

Now, the problem I have here is that I can't figure out exactly which cuboids this is supposed to count, so I don't know how to write the docstring.

Your comment at the top says:

# 1, 5, 8 | 2, 4, 8 | 3, 3, 8 (when decomposing 6)
# 6, 2, 6 | 6, 3, 5 | 6, 4, 4 (when decomposing 8, although 6, 1, 7 are not
# good, the largest of the 3 needs to be a member of the Pythagorean triplet)

and indeed:

>>> count_cuboids(6, 8, 100)
6

so this suggests that the docstring for count_cuboids might need to be:

"""Given a Pythagorean pair (a, b), return the number of cuboids (p, q, r)
such that 1 <= p, q, r <= LIMIT and 
either: p <= q <= r and p + q == a and r == b
    or: p == a and q <= r <= p and q + r == b."""

but then I tried the simplest example:

>>> count_cuboids(3, 4, 100)
2

which doesn't match your specification. According to your specification, there ought to be three possible cuboids here, not two: (1, 2, 4), (3, 1, 3) and (3, 2, 2).

So your code for counting the cuboids does not match your specification. One or both of them is wrong.

5. Conclusion

I don't want to spoil the Project Euler problem, so I'll leave you to take it from here. I'll finish by repeating the key points:

Use functional decomposition to split your code up into parts that you can test separately.
Write docstrings explaining what each function does. If you are having trouble writing concise and clear docstrings, this is a sign that the way you have decomposed your code was poorly chosen and needs to be reconsidered.
Make use of generators to produce and consume sequences of results without having to store them in intermediate data structures.
When you have a complicated, optimized function, also implement a naïve, straightforward version of the same function that you can use to test it.

Wow, thank you for the awesome answer. I will review everything tonight, but I just wanted to say that the gcd function is from the fractions module (from fractions import gcd) — Vlad Preda, Oct 8 '13 at 14:19

asked	10 months ago
viewed	275 times
active	10 months ago

current community

your communities

more stack exchange communities

Pythagorean number decomposition

1 Answer 1

1. Missing GCD

2. Decomposition: primitive Pythagorean triples

3. Decomposition: all Pythagorean pairs

4. Decomposition: counting cuboids

5. Conclusion

Your Answer

Not the answer you're looking for? Browse other questions tagged python algorithm project-euler or ask your own question.

Hot Network Questions

current community

your communities

more stack exchange communities

Pythagorean number decomposition

1 Answer 1

1. Missing GCD

2. Decomposition: primitive Pythagorean triples

3. Decomposition: all Pythagorean pairs

4. Decomposition: counting cuboids

5. Conclusion

Your Answer

Sign up or log in

Post as a guest

Not the answer you're looking for? Browse other questions tagged python algorithm project-euler or ask your own question.

Related

Hot Network Questions