I have a list of patterns:

patterns = ['php', 'java', 'c++']

and I want to match each of them against another string, say, r'c++ primer'. I want to use Python's re module for this, but the problem is that if I use:

for pattern in patterns:
    re.findall(pattern, r'php php java java c++ c++')

I get an error, because '+' has a special meaning in regular expressions.
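For readers who want to reproduce this, here is a minimal sketch, not part of the original post. The helper name try_findall is made up for illustration. One caveat: on the Python versions current when this was asked, compiling 'c++' raises re.error, while on Python 3.11 and later `++` is parsed as a possessive quantifier, so the call no longer fails but still does not match the literal text c++.

```python
import re

text = 'php php java java c++ c++'

def try_findall(pattern, text):
    """Hypothetical helper: return the matches, or the re.error raised by the pattern."""
    try:
        return re.findall(pattern, text)
    except re.error as exc:
        return exc

# Unescaped: either an error object or matches that are not the literal 'c++'.
raw_result = try_findall('c++', text)
# Escaped by hand: both '+' characters are treated literally.
escaped_result = try_findall(r'c\+\+', text)

print(repr(raw_result))
print(escaped_result)
```

Either way, the unescaped pattern does not do what is wanted here, which is why some form of escaping is needed.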

So how can I handle patterns like c++ or c* in this situation?

Note that I have a lot of patterns to match, so I don't want to convert each one, like c++ to c\+\+, manually.

Thanks for your attention.

    
Made an edit to the question that clarified \+\+ was not what the asker is looking for. But you SHOULD do a Ctrl+H search with "c++" and replace with "c\+\+". – leewangzhong Dec 8 '13 at 7:51
    
Sorry what do you mean by Ctrl+H search? – Warbean Dec 8 '13 at 7:54
    
@Warbean He means, Search & Replace in the editor. – thefourtheye Dec 8 '13 at 7:54
    
Can someone with the relevant power please accept my edit to the question? He means he doesn't want to change to "c\+\+". – leewangzhong Dec 8 '13 at 7:56
    
Depending on what you are doing, the regular string methods might be enough, so no worrying about making valid regex patterns, e.g: for p in patterns: print "php php java java c++ c++".count(p) will show how many times the string occurs, or for p in patterns: print p in "php php java ..." will show if the string contains the pattern at all – dbr Dec 8 '13 at 8:36

2 Answers

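Use a character class, or escape the + with a backslash. Outside of a character class, + has a special meaning (it is a quantifier), so 'c++' will not work as is. Inside a character class, as in [+], it is a literal, and so is the escaped form r'c\+\+'.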

>>> import re
>>> re.findall(r'[+]{2}', r'c++ primer') 
['++']
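As a small sketch of the two spellings side by side (the variable names are only for illustration), both the character-class form and the backslash form match the literal text c++:

```python
import re

text = 'c++ primer'

# '+' inside a character class is an ordinary character.
with_class = re.findall(r'c[+][+]', text)
# '+' preceded by a backslash is also an ordinary character.
with_backslash = re.findall(r'c\+\+', text)

print(with_class, with_backslash)
```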

Update 1:

If you have a predefined list of literal strings, apply re.escape to each one before using it as a pattern:

>>> patterns = ['php', 'java', 'c++']
>>> for pattern in patterns:
...     print(re.findall(re.escape(pattern), r'php php java java c++ c++'))
...
['php', 'php']
['java', 'java']
['c++', 'c++']
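If there are many literals, one possible variation (an assumption on my part, not something the question asked for) is to escape them all and join them into a single alternation, so the text is scanned once. The names ordered, combined and counts are only illustrative.

```python
import re

patterns = ['php', 'java', 'c++']
text = 'php php java java c++ c++'

# Longest first, so a shorter literal never shadows a longer one
# that starts with the same characters.
ordered = sorted(patterns, key=len, reverse=True)
combined = re.compile('|'.join(re.escape(p) for p in ordered))

# Count how often each literal occurs in the text.
counts = {}
for match in combined.findall(text):
    counts[match] = counts.get(match, 0) + 1

print(counts)
```

Whether this is preferable to the loop above depends on whether you need the results grouped per pattern or just a single pass over the text.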

Update 2:

To escape only some of the patterns, list the ones that need escaping and leave the rest untouched:

>>> patterns = ['php', 'java', 'c++', '.net']
>>> to_be_escaped = ('c++',)  # patterns that need to be escaped
>>> new_patterns = [re.escape(p) if p in to_be_escaped else p for p in patterns]
>>> for pattern in new_patterns:
...     print(re.findall(pattern, r'php php java java c++ c++ .net'))
...
['php', 'php']
['java', 'java']
['c++', 'c++']
['.net']
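A word of caution on leaving '.net' unescaped: in a regular expression, '.' matches any character (except a newline by default), so the unescaped pattern can match more than the literal text. A minimal sketch, using a made-up sample string:

```python
import re

text = 'dotnet .net'

# Unescaped: '.' matches any character, so 'tnet' inside 'dotnet' also matches.
unescaped = re.findall('.net', text)
# Escaped: only the literal '.net' matches.
escaped = re.findall(re.escape('.net'), text)

print(unescaped)
print(escaped)
```

So if every entry in the list is meant as literal text, escaping all of them with re.escape is the safer choice.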
    
Sorry, that is not my situation. I have edited my question to make myself understood. Can you please review it to help me? Thank you. – Warbean Dec 8 '13 at 7:52
    
@Warbean I've updated my solution. – Ashwini Chaudhary Dec 8 '13 at 7:53
    
+1 Neat solution :) – thefourtheye Dec 8 '13 at 7:55
    
That's what I want! Thank you so much! – Warbean Dec 8 '13 at 7:55
    
Sorry, I have a new problem. I just want to convert 'c++', but not something like '.net', because '.' is not special in regex. But re.escape will convert '.net' to '\.net', too. Can I choose which to escape and which not? Thanks so much! – Warbean Dec 8 '13 at 8:06

Escape each + with a backslash, like this:

import re
pattern = r'c\+\+'
print(re.findall(pattern, r'c++ primer'))

Output

['c++']

Edit:

import re
# Raw string, so the backslashes reach the regex engine unchanged.
patterns = ['php', 'java', r'c\+\+']
for pattern in patterns:
    print(re.findall(pattern, r'php php java java c++ c++'))

Output

['php', 'php']
['java', 'java']
['c++', 'c++']
    
Sorry, that is not my situation. I have edited my question to make myself understood. Can you please review it to help me? Thank you. – Warbean Dec 8 '13 at 7:51
    
@Warbean Just change the patterns like this patterns = ['php', 'java', 'c\+\+'] – thefourtheye Dec 8 '13 at 7:53
    
That still can't solve my problem. @Ashwini Chaudhary has solved it. Thank you all the same.^_^ – Warbean Dec 8 '13 at 7:56
    
@Warbean You mean to say that the sample which I have shown will not work? – thefourtheye Dec 8 '13 at 7:59
    
I don't mean that. I just don't want to convert 'c++' to 'c\+\+' manually, because I have a whole list of strings like 'c++' on hand, and all I want is automatic conversion. So re.escape('c++') helps me. – Warbean Dec 8 '13 at 8:02
