Sign up ×
Code Review Stack Exchange is a question and answer site for peer programmer code reviews. It's 100% free, no registration required.

I was doing timing with variations of function largest_prime_factor. I was able to write a better one largest_prime_factor2. There were some repetitions in the new function so I wrote another function largest_prime_factor3.

I thought the 3rd one has to be faster than the 2nd because I broke it up into more functions but it wasn't faster but slower.

  • I wanted to know whether my method of code reuse is good or not?
  • Is there a better way of reusing code in Python?
  • Did I miss something due to which the new function became slower?

My functions alongwith the testing() function used to test all three.

def largest_prime_factor(num):
    '''returns the largest prime factor of a number'''
    ans = 0
    if num % 2 == 0:
        ans = 2
        num /= 2
        while num % 2 == 0:
            num /= 2

    i = 3
    while i <= num:
        if num % i == 0:
            ans = i
            num /= i
            while num % i == 0:
                num /= i

        i += 2

    return ans

def largest_prime_factor2(num):
    '''returns the largest prime factor of a number'''    
    ans = 0
    if num % 2:
        pass
    else:
        ans = 2
        while True:
            num //= 2
            if num % 2:
                break

    i = 3
    while i <= num:
        if num % i:
            pass
        else:
            ans = i
            while True:
                num //= i
                if num % i:
                    break

        i += 2

    return ans

def largest_prime_factor3(num):
    '''returns the largest prime factor of a number'''
    def check(j):
        nonlocal ans
        nonlocal num
        if num % j:
            return
        ans = j
        while True:
            num //= j
            if num % j:
                return
    ans = 0
    check(2)
    i = 3
    while i <= num:
        check(i)
        i += 2

    return ans

def testing():
    from timeit import Timer
    import random

    def tests(f, times):
        def test1(f):
            f(random.randint(1, 1000))
        def test2(f):
            f(random.randint(100000, 1000000))

        print(f.__name__)
        print(Timer(lambda: test1(f)).timeit(number = times))
        print(Timer(lambda: test2(f)).timeit(number = times//10))
        print()


    tests(largest_prime_factor, 10000)
    tests(largest_prime_factor2, 10000)
    tests(largest_prime_factor3, 10000)

if __name__ == "__main__":
    testing()

The timing results that I found:

largest_prime_factor
0.338549207387318
16.599112750324068

largest_prime_factor2
0.21575289594063918
12.086949738861868

largest_prime_factor3
0.36032199674939847
18.893444539398857

This format of results is produced by the testing function. See that for details.

share|improve this question
    
As this question is about time efficiency any discussions for this can take place in this chat room. Please use this to avoid extended discussions. –  Aseem Bansal Jul 19 '13 at 13:45
    
"I thought the 3rd one has to be faster than the 2nd because I broke it up into more functions but it wasn't faster but slower. " Why would you think that? Using function call instead of "inlining the code" can only produce a slowdown. Depending on the interpreter/compiler the code might be optimized to obtain the same efficiency as the inlined version, otherwise you end up with function-call overhead which means slower code. This is especially true in python where the interpreter doesn't do this kind of optimization and function calls are relatively expensive. –  Bakuriu Jul 20 '13 at 15:28
1  
In general no. In your example you had to copy the loop to special case 2. But you can create an iterable that first yields 2 and then only the odd numbers. For example using itertools.chain: for i in it.chain([2], range(3, num, 2)): ... would do what you want. (Last remark: instead of while True you could use while n % i == 0 and remove the break). –  Bakuriu Jul 20 '13 at 15:58
1  
Did you remove the if when timing that? for i in range(3, num, 2): while n %i == 0: ... n //= i. Also, do not obsess to much about timing. Readable code is much better than a code that take 0.1 msec less. Having while Trues around doesn't make code readable. –  Bakuriu Jul 20 '13 at 16:03
1  
By the way: you are considering only micro-optimizations when you could optimize the number of iterations in a much more significant way. You don't have to check numbers over sqrt(n) since they cannot be factors of n. the only case is when n is prime). –  Bakuriu Jul 20 '13 at 16:09

2 Answers 2

up vote 1 down vote accepted

In order to avoid code duplication and the performance regression that you get when putting that code into a function you can use itertools.chain:

import itertools as it

def largest_prime_factor(num):
    '''returns the largest prime factor of a number'''
    ans = num
    for div in it.chain([2], range(3, num, 2)):
        while not num % div:
            num //= div
            ans = div
    return ans

To improve performances in a significant way you can reduce the number of iterations noting that you don't have to check factors bigger than sqrt(num), since if num isn't a prime, then it can have at most one prime factor bigger than its square root(otherwise, if p and q where prime factors bigger than sqrt(n) you'd get that n = A * p *q > A * sqrt(n) * sqrt(n) = A * n > n which cannot be true when A != 1).

This leads to:

def largest_prime_factor(num):
    '''returns the largest prime factor of a number'''
    ans = num
    sqrt_num = int(num**.5)
    for div in it.chain([2], range(3, sqrt_num + 1, 2)):
        while not num % div:
            num //= div
            ans = div
    return ans if num == 1 else num

Note that, at the end of the for loop, the value of num is 1 if all its prime factors were smaller than the square root, otherwise it is equal to the only prime factor bigger than the square root.

Sample run:

In [34]: list(map(largest_prime_factor, range(2, 25)))
Out[34]: [2, 3, 2, 5, 3, 7, 2, 3, 5, 11, 3, 13, 7, 5, 2, 17, 3, 19, 5, 7, 11, 23, 3]

In fact we can further optimize the code noting that, whenever num becomes 1 inside the for loop we can safely break out of it(while the current code tries all the other factors up to sqrt(n) anyway).

def largest_prime_factor(num):
    '''returns the largest prime factor of a number'''
    ans = num
    sqrt_num = int(num**.5)
    for div in it.chain([2], range(3, sqrt_num + 1, 2)):
        while not num % div:
            num //= div
            ans = div
        if div > num // div:
            return ans
    return num

We could also do something like:

def largest_prime_factor(num):
    '''returns the largest prime factor of a number'''
    ans = num
    sqrt_num = int(num**.5)
    candidates = it.chain([2], range(3, sqrt_num + 1, 2))
    for div in it.takewhile(lambda div: div <= num // div, candidates):
        while not num % div:
            num //= div
            ans = div
    return max(ans, num)

However this last version does a function call for each iteration and it will be slightly slower.

share|improve this answer
    
I figured that there is a bigger number. There is slight formatting problem in the beginning of the question when importing itertools. Also when sqrt_num = int(num**.5) + 1 then you don't need the + 1 in the range. Redundant. Please fix that –  Aseem Bansal Jul 23 '13 at 8:49
    
@AseemBansal Fixed. –  Bakuriu Jul 23 '13 at 9:26
2  
largest prime factor of (2^200 * 9) is 3. To find this, the maximal divisor to try by is 3, not (3 * 2^100). :) –  Will Ness Jul 23 '13 at 15:02
    
@WillNess Thanks for the tip. –  Bakuriu Jul 23 '13 at 18:19
1  
@WillNess I see, changed; now it should be as fast as possible for a simple implementation. Regardin the rest I don't see why should I use a while loop where a for will do(note that in python using a while will decrease performances, even though in this case, hopefully, the operations inside the loop should take most of the time). Also the OP specifically asked how to do this without testing 2 separately. Regarding the not: it's common practice in python to use not or simply if expression to check for truth values(e.g. if some_sequence: means some_sequence isn't empty. –  Bakuriu Jul 23 '13 at 19:18

Usually, splitting your code into different smaller functions is a good idea as it makes things easier to read, to reuse and to test.

However, this comes at a (cheap) price as far as performances are concerned because functions calls are not for free (nor are function definitions).

Now, for additional advices, whenever you want to compare performances of functions, it might also be a good idea to return what they compare. It's not good having a fast function if it doesn't return the right thing. By using assert, you could ensure you get notified quickly whenever something goes wrong.

This being said, I'd like to run and comment your code but I don't have a python3 handy and it seems that it is the version you are using so I'll try to do this later :-)

share|improve this answer
    
I am not using many features of Python3. You just need to change the print in the testing function. Maybe the nested function in that also. Everything else is same. Otherwise you can use ideone.com if you want. –  Aseem Bansal Jul 19 '13 at 13:35
    
ideone.com is your friend :-) The code: ideone.com/565hyx (Note that I divided the number of tests by 100.) –  Lstor Jul 19 '13 at 13:35
    
@Lstor Making the numbers too small would give wrong results. the large numbers were the reason it is giving reproducible results. Instead just changing the testing function should suffice. Not much Python3 in this. –  Aseem Bansal Jul 19 '13 at 13:37
    
@AseemBansal I fully agree, and larger numbers are definitely better. But ideone.com has a time limitation, and the purpose of running the code there is running it, not timing it. –  Lstor Jul 19 '13 at 13:39
1  
Thanks for the advice about assert. About function calls costing performance then what about this? –  Aseem Bansal Jul 19 '13 at 14:01

Your Answer

 
discard

By posting your answer, you agree to the privacy policy and terms of service.

Not the answer you're looking for? Browse other questions tagged or ask your own question.