I'm trying to achieve the following:

string = 'C:/some path to mp3/song (7) title and so on (1).mp3'

should become:

C:/some path to mp3/song (7) title and so on.mp3

To match it, I'm using the following regex:

pattern = '.*(\s\([0-9]+\))\.mp3'

The match groups contain: (u' (1)',)
However, when I try to substitute the match like so:

processed = re.sub(pattern, '', string)

processed contains an empty string. How can I get re.sub() to replace only the match found above?

Why use regex when all you need is string.replace('(1)', '')? – alfasin May 14 '14 at 15:40
@alfasin What if they want to match other numbers as well, like '(2)'? – SethMMorton May 14 '14 at 15:42
@SethMMorton a. The question wasn't a general one. b. If it's only (1) and (2), I would still use a string replace ;) – alfasin May 14 '14 at 15:43
@alfasin I think one can assume that if the OP used [0-9]+, they will be matching any integer, not just 1 or 2. – SethMMorton May 14 '14 at 15:44
@alfasin I think the OP's point is to remove the (###) appended to duplicate filenames. So the point is that it occurs right before the extension, not that it is a digit in parentheses (hence the persisting (7) in the example). – Sam May 14 '14 at 15:46

1 Answer

Accepted answer (score: 2)

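Your pattern starts with .*, so it matches the entire string, and re.sub() replaces the whole match rather than just the captured group. Instead, use a lookahead and match only the whitespace and (1) that come right before the final extension.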
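
To make the difference concrete, here is a small sketch using the string and pattern from the question. The variable names and the capture-and-restore alternative (keep_pattern) are illustrative assumptions, not part of the original answer.

```python
import re

string = 'C:/some path to mp3/song (7) title and so on (1).mp3'
original = r'.*(\s\([0-9]+\))\.mp3'

match = re.match(original, string)
whole_match = match.group(0)  # everything the pattern consumed (the full path)
captured = match.group(1)     # only the parenthesised group, ' (1)'

# re.sub() replaces the whole match (group 0), not the captured group,
# so nothing of the original string is left.
emptied = re.sub(original, '', string)

# One alternative without a lookahead: capture the parts to keep
# and put them back in the replacement.
keep_pattern = r'(.*)\s\([0-9]+\)(\.mp3)$'
restored = re.sub(keep_pattern, r'\1\2', string)
```

The lookahead approach avoids the need to capture and restore the surrounding text, which is why it is usually the tidier option here.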

Expanded RegEx:

\s*     (?# 0+ characters of leading whitespace)
\(      (?# match a literal opening parenthesis)
[0-9]+  (?# match 1+ digits)
\)      (?# match a literal closing parenthesis)
(?=     (?# start lookahead)
  \.    (?# match . literally)
  mp3   (?# match the mp3 extension)
  $     (?# match the end of the string)
)       (?# end lookahead)

Demo: Regex101

Implementation:

import re
pattern = r'\s*\([0-9]+\)(?=\.mp3$)'  # raw string keeps the backslashes intact
processed = re.sub(pattern, '', string)
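
As a quick check, the pattern can be wrapped in a helper and run against the path from the question. This is only a sketch; the function name strip_duplicate_suffix and the second sample path are made up for illustration.

```python
import re

# Same pattern as above, compiled once for reuse.
DUPLICATE_SUFFIX = re.compile(r'\s*\([0-9]+\)(?=\.mp3$)')


def strip_duplicate_suffix(path):
    """Remove a trailing ' (N)' that sits directly before the .mp3 extension."""
    return DUPLICATE_SUFFIX.sub('', path)


# The (1) before the extension is removed; the (7) in the middle is kept.
result = strip_duplicate_suffix(
    'C:/some path to mp3/song (7) title and so on (1).mp3')

# A number in parentheses that is not directly before the extension is left alone.
unchanged = strip_duplicate_suffix('C:/music/song (7) title.mp3')
```

Because the lookahead does not consume the extension, only the whitespace and the parenthesised number are replaced.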

Notes:

  • mp3 can be replaced by [^.]+ to match any extension or (mp3|mp4) to match multiple extensions.
  • Use \s+ instead of \s* to require at least one whitespace character before (1), thanks @SethMMorton.
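
The variations in these notes might look like the following sketch. The sample filenames are invented for illustration, and a non-capturing group (?:mp3|mp4) is used in place of (mp3|mp4), which behaves the same inside the lookahead.

```python
import re

# \s+ requires whitespace before the number; [^.]+ accepts any extension.
any_extension = r'\s+\([0-9]+\)(?=\.[^.]+$)'
# Restrict the lookahead to a fixed set of extensions.
several_extensions = r'\s+\([0-9]+\)(?=\.(?:mp3|mp4)$)'

a = re.sub(any_extension, '', 'holiday video (2).mp4')
b = re.sub(several_extensions, '', 'holiday video (2).mp4')
# .txt is not in the allowed set, so nothing is replaced.
c = re.sub(several_extensions, '', 'notes (3).txt')
# No whitespace before the parenthesis, so \s+ prevents a match.
d = re.sub(any_extension, '', 'track(4).mp3')
```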

Based on the example, they might want '\s+' instead of '\s*'. – SethMMorton May 14 '14 at 15:43
Updated and credited @SethMMorton. – Sam May 14 '14 at 15:44
