First of all, your method is not really correct: for `AAAABBBBAAB` it returns `[A+, B+, A+]` instead of the required `[A+, B+, A+, B]`. That's because the last group is never added to the list of groups.
In terms of being Pythonic, don't write this:

    if accumulate == False:

Write it like this instead:

    if not accumulate:

Also, instead of iterating over the "alphabet" using indexes, it would be more Pythonic to iterate over each letter directly, in the style `for letter in alphabet`.
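To illustrate the difference, here is a small sketch. The sample string and the list names (`by_index`, `by_letter`, `with_position`) are made up for the illustration and are not taken from your code:

```python
alphabets = "AAB"  # sample input, assumed for illustration

# Index-based iteration: works, but the index is only used to look up the letter.
by_index = []
for i in range(len(alphabets)):
    by_index.append(alphabets[i])

# Direct iteration: same result, no index bookkeeping.
by_letter = []
for letter in alphabets:
    by_letter.append(letter)

# If the position is genuinely needed, enumerate() provides both.
with_position = [(i, letter) for i, letter in enumerate(alphabets)]
```

Both loops collect the same letters; the second one simply says what it means.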
`alphabets` is not a good name. It seems `letters` would be better.
The algorithm can be simplified, and you could eliminate several intermediary variables:

    def create_groups(letters):
        """Group the letters into a list of A(+)s and B(+)s."""
        prev = letters[0]
        count = 0
        groups = []
        for current in letters[1:] + '\0':
            if current == prev:
                count += 1
            else:
                group_indicator = prev + '+' if count > 0 else prev
                groups.append(group_indicator)
                count = 0
            prev = current
        return groups

In the `for` loop, I appended `'\0'` to the end, as a dirty trick to make the loop do one more iteration, so that the last letter group gets appended to `groups`. For this to work, it must be a character that's different from the last letter in `letters`.
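If you'd rather not rely on the `'\0'` trick, one possible variant is to flush the last group after the loop. This is only a sketch: the name `create_groups_no_sentinel` is made up here, and unlike the version above it also returns an empty list for empty input instead of raising an `IndexError`:

```python
def create_groups_no_sentinel(letters):
    """Group consecutive equal letters; runs of two or more get a '+' suffix."""
    groups = []
    prev = None   # no letter seen yet
    count = 0     # length of the current run
    for current in letters:
        if current == prev:
            count += 1
            continue
        # A new run starts: record the previous one, if there was one.
        if prev is not None:
            groups.append(prev + '+' if count > 1 else prev)
        prev = current
        count = 1
    # Flush the final run, which the loop itself never records.
    if prev is not None:
        groups.append(prev + '+' if count > 1 else prev)
    return groups
```

The price is that the "record a group" line appears twice, which is exactly what the extra character avoids.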
The above is sort of a "naive" solution, in the sense that there is probably a Python library that can do this more easily. That's kinda like what @jonrsharpe suggested, but he didn't complete the solution by converting `[['A', 'A', 'A', 'A'], ['B', 'B', 'B', 'B'], ['A', 'A'], ['B']]` into the format that you need. Based on his solution, you could do something like this:

    from itertools import groupby

    def create_groups(letters):
        return [x + '+' if list(g)[1:] else x for x, g in groupby(letters, str)]

What I don't like about this is the way we put the letters in a list just to know if there are 2 or more of them (the `list(g)[1:]` step). There might be a better way.
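One candidate, as a sketch only: pull at most two items from each group with `next()` instead of building the whole list. The names `create_groups_lazy` and `_MISSING` are made up for this example:

```python
from itertools import groupby

_MISSING = object()  # hypothetical marker meaning "the group had no more items"


def create_groups_lazy(letters):
    """Like create_groups, but never turns a whole run into a list."""
    groups = []
    for letter, run in groupby(letters):
        next(run)  # a groupby run always holds at least one item
        # A second item exists only if the letter repeats.
        repeated = next(run, _MISSING) is not _MISSING
        groups.append(letter + '+' if repeated else letter)
    return groups
```

It is no longer a one-liner, so whether this counts as better is a matter of taste. For short inputs the `list(g)[1:]` version is perfectly fine.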