Code Review Stack Exchange is a question and answer site for peer programmer code reviews. Join them; it only takes a minute:

Sign up
Here's how it works:
  1. Anybody can ask a question
  2. Anybody can answer
  3. The best answers are voted up and rise to the top

I recently applied for a job as a Python coder but was rejected.

This was the problem:

Write a python code to check if an array has a sequence (1,3,4)

Assuming they were looking for expert Python programmers, what could I have done better?

# Tested with Python 2.7
import unittest

# Runtime: O(n)

def doesSeqAppear(int_arr):
    #check if input is a list
    if not isinstance(int_arr, list):
        raise TypeError("Input shall be of type array.")

    # check all elements are of type int
    if not all(isinstance(item, int) for item in int_arr) :
        raise ValueError("All elements in array shall be of type int.")

    arr_len = len(int_arr)
    if arr_len < 3: 
        return False

    # Loop through elements
    for i in range(arr_len-2):
        if int_arr[i] == 1 and \
            int_arr[i+1] == 3 and \
            int_arr[i+2] == 4 : 
            return True
    return False


class TestMethodDoesSeqAppear(unittest.TestCase):
    def test_only_single_seq(self):
        #Single time
        assert doesSeqAppear([1,3,4]) == True

    def test_multiple_seq(self):
        #multiple
        assert doesSeqAppear([2,2,1,3,4,2,1,3,4]) == True


    def test_neg_seq(self):
        #multiple
        assert doesSeqAppear([9,-1,1,3,4,-4,4]) == True

    def test_only_empty_seq(self):
        #empty
        assert doesSeqAppear([]) == False
    def test_only_single_elem_seq(self):
        #Single element
        assert doesSeqAppear([1]) == False

    def test_input_is_none(self):
        self.assertRaises(TypeError, doesSeqAppear, None)

    def test_raises_type_error(self):
        self.assertRaises(TypeError, doesSeqAppear, "string")

    def test_raises_value_error(self):
        self.assertRaises(ValueError, doesSeqAppear, [1,2,'a', 'b'])

if __name__ == '__main__':
    unittest.main()

#
share|improve this question
4  
Well for one, Python lists can have multiple types of values, are you sure they required each array to have all ints. For example: ['a','b','c',1,3,4] seems like a valid sequence that would return false in your implementation – Navidad20 11 hours ago
    
Maybe you should have used xrange? ;) – Tamoghna Chowdhury 11 hours ago
1  
Also, they asked about an array. Maybe you should have used a numpy.array instead of a list? With NumPy arrays, this could be done in 2 function calls. – Tamoghna Chowdhury 11 hours ago
3  
As I said in another comment, there are so many unknown points in this question that I think the main reason for rejection may actually be him making assumptions instead of asking questions. – ChatterOne 11 hours ago
    
Why not just str([1,3,4])[1:-1] in str([array])? – Samuel Shifterovich 4 hours ago

By PEP 8, doesSeqAppear should be does_seq_appear. You used the right naming convention for your unit tests, though. Personally, I would prefer def contains_seq(arr, seq=[1, 3, 4]).

Your arr_len < 3 test is superfluous and should therefore be eliminated. Don't write a special case when the regular case works correctly and just as quickly.

Your all(isinstance(item, int) for item in int_arr) check was not specified in the problem, and is therefore harmful. The question does not say that doesSeqAppear([3.1, 1, 3, 4]) should return False, nor does it say that it should fail with an exception. In fact, by my interpretation, it does contain the magic sequence and should therefore return True. In any case, you have wasted a complete iteration of the list just to perform a check that wasn't asked for.

Checking isinstance(int_arr, list) is un-Pythonic, since is the norm in Python. In any case, the code would likely fail naturally if it is not a list.

After cutting all that excess, you should drop the # Loop through elements comment as well.

share|improve this answer
    
I wouldn't say the arr_len < 3 test is superfluous. It is preventing an index error. – Darthfett 5 hours ago
    
@Darthfett: I don't think it prevents any errors. If the length is less than 3, the for loop will execute 0 iterations. – user2357112 5 hours ago
    
@user2357112 oops, you are correct. :) – Darthfett 5 hours ago
    
If dropping the arr_len < 3 check, it would be appropriate to specify why it still works without it, since it takes a moment to figure that out. – jpmc26 2 hours ago

Per the problem definition, I would expect a function thas is able to check any sequence in an array. Not necessarily (1, 3, 4) which was given as an example. In this case, the sequence should also be a parameter of the function, giving the signature:

def has_sequence(array, sequence):

Next, I would rely on Python iterations to "check" if array is a list, or at least an iterable. As there is no obvious reasons, to me, that has_sequence('once upon a time', 'e u') should fail. It seems like a valid usecase.

Following, I would use a variation of the itertools recipe pairwise to group elements of array in tuples of the same length than sequence:

import itertools


def lookup(iterable, length):
    tees = itertools.tee(iterable, length)
    for i, t in enumerate(tees):
        for _ in xrange(i):
            next(t, None)
    return itertools.izip(*tees)


def has_sequence(array, sequence):
    # Convert to tuple for easy testing later
    sequence = tuple(sequence)
    return any(group == sequence for group in lookup(array, len(sequence)))

Now, other things that could have been done better:

  • # Tested with Python 2.7 can be replaced by #!/usr/bin/env python2
  • if int_arr[i] == 1 and int_arr[i+1] == 3 and int_arr[i+2] == 4 : can be replaced by if int_arr[i:i+3] == [1, 3, 4]: removing the need for the ugly \
  • assert in unit tests should be replaced by self.assertTrue(…) or self.assertFalse(…)
  • you should be more consistent in your usage of whitespace (putting one after each comma, none before any colon…).
share|improve this answer
    
Something like tuplewise might be a more evocative name than lookup. (I had the same idea though.) – David Z 7 hours ago

KIS[S]

def sequence_contains_subsequence(sequence, subsequence):
    for i in range(0, len(sequence) - len(subsequence) + 1):
        if subsequence == sequence[i:i+len(subsequence)]:
            return True
    return False

We can't know why your interviewer rejected your application, but these types of questions are often starting points for conversation--not finished product endpoints. If you write the simplest, most straightforward code you can, you and your interviewer can then talk about things like expansion, generalization, and performance.

Your interviewer knows that asking you to change your function interface is more problematic because you'll also have to change all your [unasked for] unit tests. This slows down the process and might make the interviewer worry that you'll pollute their codebase with a lot of brittle code.

share|improve this answer

I have taken the approach of a generator get_triplet:

def get_triplet(data):
    for i, d in enumerate(data):
        if i + 2 < len(data):
            yield(i, data[i:i+3])


haystack = [2, 4, 5, 1, 3, 9, 6, 8, 1, 3, 4, 5]
needle = [1, 3, 4]
for i, d in get_triplet(haystack):
    if d == needle:
        print("Found {} at index {:d}.".format(needle, i))
        break
else:
    print("Search item {} not found.".format(needle))
share|improve this answer

This optimization allows you to reduce the number of times checking an element. It also eliminates the need to make a new iterator which you had created with range

# Loop through elements
index = 0
while index < (arr_len-2):
    if int_arr[index] != 1:
        index += 1
        continue
    if int_arr[index+1] != 3:
        index += 1
        continue
    if int_arr[index+2] != 4:
        index += 1 
        continue
    return True
return False
share|improve this answer
    
What about the sequence [1,1,3,4]? The first test will find 1 and pass, the second test will not find a 3 and will therefore move on 2 places and miss the pattern that is there. – neil 9 hours ago
    
You make a good point. I could add more if statements to fix that, but i'm not getting the top answer so its not worth it – Navidad20 9 hours ago

This may not be optimal but tries to solve the problem generically with very little code:

#! /usr/bin/env python3
def main():
    print('PASS' if contains((0, 1, 3, 4, 5), (1, 3, 4)) else 'FAIL')


def contains(iterable, sequence):
    offset, length = 0, len(sequence)
    for item in iterable:
        if item == sequence[offset]:
            offset += 1
            if offset == length:
                return True
        elif offset:
            offset = 0
    return False


if __name__ == '__main__':
    main()
share|improve this answer
    
Please provide more explanation behind this code. We discourage code solutions alone. – Jamal 2 mins ago

I think your answer is much too long. Here's mine:

def check_for_1_3_4(seq):
    return (1, 3, 4) in zip(seq, seq[1:], seq[2:])

Here are some tests:

>>> check_for_1_3_4([1, 3, 4, 5, 6, 7])
True
>>> check_for_1_3_4([5, 6, 7, 1, 3, 4])
True
>>> check_for_1_3_4([5, 6, 1, 3, 4, 7, 8])
True
>>> check_for_1_3_4([1, 3])
False
>>> check_for_1_3_4([])
False
>>> 

My code may seem terse, but it's still readable for anyone who understands slicing and zip. I expect Python experts to at least know about slicing.

share|improve this answer

Your Answer

 
discard

By posting your answer, you agree to the privacy policy and terms of service.

Not the answer you're looking for? Browse other questions tagged or ask your own question.