I am trying to split a string into a list in which every whitespace character and every other non-digit character is its own item, while runs of digits stay together.
For example, the string:

"1 2 +="  

would end up as:

["1", " ", "2", " ", "+", "="]

The code I currently have is

temp = re.findall('\d+|\S', input)  

This separates the string as intended but also removes the whitespace. How do I stop this?

perhaps you need \s? –  rogaos Nov 18 '13 at 22:00
 
Are you writing a postfix parser? –  sudo_O Nov 18 '13 at 22:05

2 Answers

You can use \D to find anything that is not a digit:

\d+|\D

Python:

temp = re.findall(r'\d+|\D', input)
# Output: ['1', ' ', '2', ' ', '+', '=']

It would also work if you just used . since the \d+ alternative is tried first anyway, but it's probably cleaner not to. Note that . does not match a newline unless re.DOTALL is set, whereas \D does.

\d+|.
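
A minimal sketch of how the two patterns compare on input that contains a newline. The sample string is made up for illustration and is not from the question:

```python
import re

s = "12 3\n+"  # hypothetical input containing a newline

# \D matches any non-digit character, including the newline.
with_nondigit = re.findall(r'\d+|\D', s)

# By default . does not match a newline, so that character is skipped.
with_dot = re.findall(r'\d+|.', s)

# With re.DOTALL, . matches the newline too.
with_dotall = re.findall(r'\d+|.', s, re.DOTALL)

print(with_nondigit)  # ['12', ' ', '3', '\n', '+']
print(with_dot)       # ['12', ' ', '3', '+']
print(with_dotall)    # ['12', ' ', '3', '\n', '+']
```

For single-line input such as the string in the question, all three calls return the same list.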

Just add \s or \s+ to your current regular expression (use \s+ if you want consecutive whitespace characters to be grouped together). For example:

>>> s = "1 2 +="
>>> re.findall(r'\d+|\S|\s+', s)
['1', ' ', '2', ' ', '+', '=']

If you don't want consecutive whitespace to be grouped together, then instead of r'\d+|\S|\s' it would probably make more sense to use r'\d+|\D'.
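
The difference between these patterns only shows up when the input has consecutive whitespace. A short sketch, assuming a variant of the question's string with two spaces after the 1:

```python
import re

s = "1  2 +="  # assumed variant: two spaces between 1 and 2

# \s+ groups the run of whitespace into a single item.
grouped = re.findall(r'\d+|\S|\s+', s)

# \s yields one item per whitespace character.
separate = re.findall(r'\d+|\S|\s', s)

# \D is equivalent to "\S or \s" once digits are handled by \d+.
nondigit = re.findall(r'\d+|\D', s)

print(grouped)   # ['1', '  ', '2', ' ', '+', '=']
print(separate)  # ['1', ' ', ' ', '2', ' ', '+', '=']
print(nondigit)  # ['1', ' ', ' ', '2', ' ', '+', '=']
```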

