I am writing a Python library that parses various working-hours strings and produces a standard hours format. I am stuck on the following case:
For Mon - Fri 7am - 5pm Sat 9am - 3pm, my regex should return the groups
['Mon - Fri 7am - 5pm ', 'Sat 9am - 3pm']
but if there is a comma between the first and second entries, it should return [].
A comma may appear anywhere else in the string, just not between the two weekday-and-duration entries. E.g. Mon - Fri 7am - 5pm Sat 9am - 3pm and available upon email, phone call
should return ['Mon - Fri 7am - 5pm ', 'Sat 9am - 3pm'].
This is what I have tried:
import re
pattern = r"""(
(?:mon|tue|wed|thu|fri|sat|sun|mo|tu|we|th|fr|sa|su|m|w|f|thurs) # Start weekday
\s*[-|to]+\s* # Separator
(?:mon|tue|wed|thu|fri|sat|sun|mo|tu|we|th|fr|sa|su|^(?![ap])m|w|f|thurs)? # End weekday
\s*[from]*\s* # Separator
(?:\d{1,2}(?:[:]\d{1,2})?)\s*(?:[ap][.]?m.?) # Start hour
\s*[-|to]+\s* # Separator
(?:\d{1,2}(?:[:]\d{1,2})?)\s*(?:[ap][.]?m.?) # Close hour
)"""
regEx = re.compile(pattern, re.IGNORECASE|re.VERBOSE)
print(re.findall(regEx, "Mon - Fri 7am - 5pm Sat 9am - 3pm"))
# output ['Mon - Fri 7am - 5pm ', 'Sat 9am - 3pm']
print(re.findall(regEx, "Mon - Fri 7am - 5pm Sat - Sun 9am - 3pm"))
# output ['Mon - Fri 7am - 5pm ', 'Sat - Sun 9am - 3pm']
print(re.findall(regEx, "Mon - Fri 7am - 5pm, Sat 9am - 3pm"))
# expected output []
# but I get ['Mon - Fri 7am - 5pm,', 'Sat 9am - 3pm']
print(re.findall(regEx, "Mon - Fri 7am - 5pm , Sat 9am - 3pm"))
# expected output []
# but I get ['Mon - Fri 7am - 5pm ', 'Sat 9am - 3pm']
I also tried adding a negative lookahead to my regex:
pattern = r"""(
(?:mon|tue|wed|thu|fri|sat|sun|mo|tu|we|th|fr|sa|su|m|w|f|thurs)
\s*[-|to]+\s*
(?:mon|tue|wed|thu|fri|sat|sun|mo|tu|we|th|fr|sa|su|^(?![ap])m|w|f|thurs)?
\s*[from]*\s*
(?:\d{1,2}(?:[:]\d{1,2})?)\s*(?:[ap][.]?m.?)
\s*[-|to]+\s*
(?:\d{1,2}(?:[:]\d{1,2})?)\s*(?:[ap][.]?m.?)
(?![^,])
)"""
But I didn't get the expected result. Do I have to write explicit code to check this condition, or is there a way to do it just by changing my regex?
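A minimal sketch of what the explicit check could look like, under some loud assumptions: the pattern here is a simplified stand-in for the one above (three-letter day names only, and a time suffix that does not swallow a trailing comma), and the names DAY, TIME, ENTRY and split_hours are hypothetical. It finds every entry with finditer and returns [] as soon as the text between two neighbouring entries contains a comma.

```python
import re

# Simplified, hypothetical stand-ins for the full pattern in the question.
DAY = r"\b(?:mon|tue|wed|thu|fri|sat|sun)"
TIME = r"\d{1,2}(?::\d{2})?\s*[ap]\.?m\.?"
ENTRY = re.compile(
    r"{day}(?:\s*(?:-|to)\s*{day})?\s+{time}\s*(?:-|to)\s*{time}".format(
        day=DAY, time=TIME
    ),
    re.IGNORECASE,
)


def split_hours(text):
    """Return the matched entries, or [] if a comma separates two of them."""
    matches = list(ENTRY.finditer(text))
    # Inspect only the gap between consecutive entries; commas that occur
    # elsewhere in the string are ignored.
    for prev, cur in zip(matches, matches[1:]):
        if "," in text[prev.end():cur.start()]:
            return []
    return [m.group(0) for m in matches]


print(split_hours("Mon - Fri 7am - 5pm Sat 9am - 3pm"))
# ['Mon - Fri 7am - 5pm', 'Sat 9am - 3pm']
print(split_hours("Mon - Fri 7am - 5pm , Sat 9am - 3pm"))
# []
print(split_hours("Mon - Fri 7am - 5pm Sat 9am - 3pm and available upon email, phone call"))
# ['Mon - Fri 7am - 5pm', 'Sat 9am - 3pm']
```

Note that this sketch does not keep the trailing space that the original pattern captures in the first group. The check lives outside the regex because findall evaluates each match independently, so a lookahead inside the pattern can reject one entry but cannot by itself empty the whole result list.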
Another approach I am considering is to insert a comma between the two weekday-duration entries when it is missing, and then change my regex to group or split by comma: "Mon - Fri 7am - 5pm Sat 9am - 3pm"
=> "Mon - Fri 7am - 5pm, Sat 9am - 3pm"
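The comma-insertion idea could be sketched roughly as follows. This is only an illustration under assumptions: DAY and TIME are simplified, hypothetical patterns (three-letter day names only), and insert_commas is a made-up helper name. It looks for a time that is followed by whitespace and then a weekday, and replaces that whitespace with ", ".

```python
import re

# Simplified, hypothetical patterns; the real day and time alternatives
# would need to be as permissive as the ones in the question.
DAY = r"\b(?:mon|tue|wed|thu|fri|sat|sun)"
TIME = r"\d{1,2}(?::\d{2})?\s*[ap]\.?m\.?"

# A time, then whitespace, then (lookahead only) the start of a weekday.
BOUNDARY = re.compile(
    r"({time})\s+(?={day})".format(time=TIME, day=DAY), re.IGNORECASE
)


def insert_commas(text):
    """Insert ', ' where one entry is directly followed by the next."""
    return BOUNDARY.sub(r"\1, ", text)


print(insert_commas("Mon - Fri 7am - 5pm Sat 9am - 3pm"))
# Mon - Fri 7am - 5pm, Sat 9am - 3pm
print(insert_commas("Mon - Fri 7am - 5pm, Sat 9am - 3pm"))
# Mon - Fri 7am - 5pm, Sat 9am - 3pm (already separated, left unchanged)
```

Because the weekday is only checked in a lookahead, it is not consumed, so the rest of the string is left untouched and can then be handed to a comma-based parser.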
re
. – CppLearner Feb 7 '13 at 9:24"Mon - Fri 7am - 5pm Sat 9am - 3pm"
. I will edit my question now. "further processing" - I already have a parser to standardize the string if a comma exists. – underscore Feb 7 '13 at 9:40