Take the 2-minute tour ×
Code Review Stack Exchange is a question and answer site for peer programmer code reviews. It's 100% free, no registration required.

I'm currently going through some lessons on Codility. I've just spent a couple of hours with GenomicRangeQuery, which is intended to demonstrate the use of prefix sums.

The task description is here. Broadly, the idea is that given a string and a set of arbitrary cut points one should be able to return the lowest value character in a given slice. O(N*M) complexity is what we want to avoid.

My solution scores 100%, but I would still value any feedback on style and readability.

def solution(str, slice_starts, slice_ends)
    # Initiate an array to hold rolling totals of character occurrences
    prefix_sums = [[0,0,0,0]]
    # Map possible characters to their positions in the prefix sum array
    chars = { "A" => 0, "C" => 1, "G" => 2, "T" => 3 }

    str.split('').each_with_index do |char, i|
        prefix_sums[i + 1] = prefix_sums[i].clone
        prefix_sums[i + 1][chars[char]] += 1
    end

    slice_starts.each_with_index.map do |slice_start, i|
        s = slice_start
        e = slice_ends[i]

        chars.each do |char, ii|
            occurrences = prefix_sums[e + 1][ii] - prefix_sums[s][ii]
            break (ii + 1) if occurrences > 0
        end
    end
end
share|improve this question

1 Answer 1

Some notes:

  • Use 2-space indentation.
  • [[0,0,0,0]]. Always a space after a comma.
  • str.split('') - >str.chars
  • prefix_sums = [[0,0,0,0]]. This kind of implicit structures hinder readability, I'd simply use a hash. A small space penalty (not in big-Oh, though), better readibility.
  • each_with_index. Use zip instead.
  • As you say, the problem is about getting partial sums (frequencies). You can implement and use a very generic abstraction (Enumerable#scanl), which can be used in more codility problems.

I'd write this purely functional solution:

def solution(input_str, slice_start, slice_end)
  impact_factors = {"A" => 1, "C" => 2, "G" => 3, "T" => 4}
  frequencies = input_str.chars.scanl(Hash.new(0)) do |freq, n|
    freq.merge(n => freq[n] + 1)
  end

  slice_start.zip(slice_end).map do |from, to|
    difference_count_by_factor = frequencies[to+1].map do |n, count| 
      [impact_factors[n], count - frequencies[from][nucleotide]]
    end.to_h
    difference_count_by_factor.reject { |factor, count| count.zero? }.keys.min
  end
end
share|improve this answer
    
This is a great answer, thanks. My only reservation is implementing Enumerable#scanl might confuse other Rubyists, especially those like me without a functional background. I like your solution, but from my experience it perhaps doesn't feel like the most idiomatic Ruby. –  Dae Aug 26 at 15:23

Your Answer

 
discard

By posting your answer, you agree to the privacy policy and terms of service.

Not the answer you're looking for? Browse other questions tagged or ask your own question.