I have a regular expression like this:

import re

regex_string = r'(?P<x>\d+)\s(?P<y>\w+)'
r = re.compile(regex_string)

Before I start matching with it, I'd like to replace the named group x with a particular value, say 2014, so that searching with the resulting regular expression only finds matches where x=2014. What is the best way to approach this?

The challenge is that both the original regular expression regex_string and the arbitrary replacement value x=2014 are specified by an end user. Ideally I would have a function like replace_regex_variables:

r = re.compile(regex_string)
r = replace_regex_variables(r, x=2014)
for match in r.finditer(really_big_string):
    do_something_with_each_match(match)

I'm open to any solution, but I'm specifically interested in whether it's possible to do this without checking matches after finditer returns them, so as to take advantage of re's performance. In other words, preferably NOT this:

r = re.compile(regex_string)
for match in r.finditer(really_big_string):
    # Groups are captured as strings, so compare against '2014', not 2014.
    if match.group('x') == '2014':
        do_something_with_each_match(match)
    
Not without re-building the regular expression pattern itself, no. That requires parsing the string pattern, replacing the group with the literal text it must match, recompiling the pattern and returning that. – Martijn Pieters Apr 18 '14 at 13:19
It'll be much easier to just verify that r.group('x') is equal to '2014'. The parsing will have to take into account nested groups, for example. – Martijn Pieters Apr 18 '14 at 13:20
    
@MartijnPieters Recompiling the regular expression is totally fine by me. Any suggestions on how to replace the original variable in the regex string with values in a smart way? – dino Apr 18 '14 at 13:21
    
Care to limit this to a subset of regex? Can the pattern match literal parenthesis, question marks and angle brackets, for example? Can there be nested groups? – Martijn Pieters Apr 18 '14 at 13:22
    
@MartijnPieters The pattern, for all intents and purposes, could match anything, but I think it is safe to assume that there will not be nested groups. – dino Apr 18 '14 at 13:29
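
The rebuild-and-recompile approach described in the comments could be sketched roughly as follows. This is only an illustration, not a tested answer from the thread: the helper _skip_atom is a hypothetical name, the scanner only understands backslash escapes and simple character classes, it works on str patterns (not bytes), and it does not check that the replacement value would have matched the group's original sub-pattern.

```python
import re

# Opening of a named group, e.g. "(?P<x>".
_NAMED_OPEN = re.compile(r"\(\?P<(\w+)>")


def _skip_atom(pattern, i):
    """Return the index just past the escape, character class, or single
    character starting at pattern[i] (hypothetical helper)."""
    if pattern[i] == "\\":
        return i + 2
    if pattern[i] == "[":
        j = i + 1
        if pattern[j:j + 1] == "^":
            j += 1
        if pattern[j:j + 1] == "]":  # a leading ']' is a literal
            j += 1
        while j < len(pattern) and pattern[j] != "]":
            j += 2 if pattern[j] == "\\" else 1
        return j + 1
    return i + 1


def replace_regex_variables(pattern, **values):
    """Recompile `pattern` so each named group listed in `values`
    matches only the escaped literal text of its value."""
    flags = getattr(pattern, "flags", 0)         # keep flags of a compiled pattern
    source = getattr(pattern, "pattern", pattern)
    out, i = [], 0
    while i < len(source):
        m = _NAMED_OPEN.match(source, i)
        if m and m.group(1) in values:
            # Walk to the parenthesis that closes this group.
            depth, j = 1, m.end()
            while j < len(source):
                if source[j] == "(":
                    depth += 1
                elif source[j] == ")":
                    depth -= 1
                    if depth == 0:
                        break
                j = _skip_atom(source, j)
            else:
                raise ValueError("unbalanced group %r" % m.group(1))
            literal = re.escape(str(values[m.group(1)]))
            out.append(m.group(0) + literal + ")")
            i = j + 1
        else:
            nxt = _skip_atom(source, i)
            out.append(source[i:nxt])
            i = nxt
    return re.compile("".join(out), flags)


r = replace_regex_variables(r'(?P<x>\d+)\s(?P<y>\w+)', x=2014)
print(r.pattern)
```

One caveat with this kind of substitution: the rewritten pattern can match where the original would have captured a longer value. For example, (?P<x>2014) will match the tail of 12014, whereas \d+ would have captured 12014 as a whole, so the result is not always identical to filtering the original matches on x == '2014'.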

1 Answer

You want something like this, don't you?

r = r'(?P<x>%(x)s)\s(?P<y>\w+)'
r = re.compile(r % {'x': 2014})
for match in r.finditer(really_big_string):
    do_something_with_each_match(match)
    
No, the OP wants to be able to use the original pattern still too, and/or do the same to an arbitrary number of named patterns. – Martijn Pieters Apr 18 '14 at 13:21
    
Nice idea, but as @MartijnPieters mentioned, the regex is provided by an end user and, separately, so is the x=2014 bit. I don't know a priori which parts of the regex will be matched. I'll clarify that in the question. – dino Apr 18 '14 at 13:24
