
I'm trying to split a string using a regular expression and the split function in JavaScript. For example, I have the string olej sojowy, sorbitol, czerwień koszenilową and my regex is:

/, (?!(któ))/g

When I test it here: http://regexr.com/38ps8 I get 2 matches, as expected, so I should get 3 elements after the split.

But when I use this expression in the split function:

var parts="olej sojowy, sorbitol, czerwień koszenilową".split(/, (?!(któ))/g);
console.log("Num of elements:" + parts.length); 
console.log(parts.join("!\n!"));

the result is different: it returns an array of 5 elements, with two additional empty strings:

Num of elements:5 
olej sojowy!
!!
!sorbitol!
!!
!czerwień koszenilową 

Why isn't it working as expected? Is it a problem with the split function? Does it use a regular expression in a different way than I would expect?

Edit: I've also just noticed that if I change my regular expression to /, /g then I get just what I wanted (3 elements in the result), but there are other strings which I don't want to split if there is któ after the comma and space. So why is this operator changing the behaviour of split?


2 Answers


From Mozilla's JS ref:

If separator is a regular expression that contains capturing parentheses, then each time separator is matched, the results (including any undefined results) of the capturing parentheses are spliced into the output array. However, not all browsers support this capability.
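
As a small illustration of the quoted behaviour, here is a sketch comparing a separator with and without capturing parentheses. The sample string and the variable names are made up for this example and do not come from the question.

```javascript
// Hypothetical sample input, not taken from the question.
const sample = "a1b2c";

// Capturing group: each matched digit is spliced into the result.
const withGroup = sample.split(/(\d)/);

// Non-capturing group: only the text between separators is returned.
const withoutGroup = sample.split(/(?:\d)/);

console.log(withGroup);    // [ 'a', '1', 'b', '2', 'c' ]
console.log(withoutGroup); // [ 'a', 'b', 'c' ]
```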

If the regex in split contains capturing groups, the contents of each group are inserted into the result as well. Since you have a capturing group (któ), that is what you get. The inserted values are undefined (join prints them as empty strings) because a group inside a negative lookahead never holds a match when the lookahead succeeds. Adding the text , któ to your string does not make it appear either; the lookahead fails there, so no split happens at that comma:

var parts="olej sojowy, któ sorbitol, czerwień koszenilową".split(/, (?!(któ))/g);

This gives 3 elements: "olej sojowy, któ sorbitol", then undefined for the capturing group, then "czerwień koszenilową". The only split is at the ", " that is not followed by któ.
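
To see what the extra elements actually are, one can inspect each element instead of joining them. This is only a sketch, assuming an engine that follows the ECMAScript behaviour for split (the Mozilla quote notes that not all browsers did); the name described is introduced here for illustration.

```javascript
// The string and pattern from the question.
const parts = "olej sojowy, sorbitol, czerwień koszenilową".split(/, (?!(któ))/g);

// Describe each element so that undefined is distinguishable from "".
const described = parts.map(function (p) {
  return p === undefined ? "undefined" : JSON.stringify(p);
});

console.log(parts.length);          // 5
console.log(described.join(" | ")); // "olej sojowy" | undefined | "sorbitol" | undefined | "czerwień koszenilową"
```

The two extra entries are undefined rather than empty strings; join simply renders undefined as nothing, which is why the question's output shows blank lines.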

If you omit the parentheses inside your lookahead, it works as you expect it to:

var parts="olej sojowy, któ sorbitol, czerwień koszenilową".split(/, (?!któ)/g);

There are no capturing groups, so you get only the text that remains after the matched separators are removed.
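
A minimal sketch of the lookahead without a capturing group, run against both strings from this thread. The variable names are chosen for this example, and the g flag is left out because split finds every match regardless of it.

```javascript
// Lookahead without a capturing group.
const separator = /, (?!któ)/;

// Original string from the question: splits at both commas.
const plain = "olej sojowy, sorbitol, czerwień koszenilową".split(separator);

// String containing ", któ": that comma is skipped by the lookahead.
const guarded = "olej sojowy, któ sorbitol, czerwień koszenilową".split(separator);

console.log(plain);   // [ 'olej sojowy', 'sorbitol', 'czerwień koszenilową' ]
console.log(guarded); // [ 'olej sojowy, któ sorbitol', 'czerwień koszenilową' ]
```

If grouping is needed inside the lookahead, for example for alternation, a non-capturing group such as (?:któ) should give the same result without adding entries to the array.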

    
Omitting the parentheses makes no difference. The negative lookahead asserts that it is impossible to match the regex któ –  Cocoa Puffs yesterday
    
Thanks! It works without parentheses and now I can understand why it didn't :) Just what I needed. –  krajol yesterday
    
@CocoaPuffs: it does make a difference, per the documentation. Including parentheses will always add "group" matches, empty or otherwise. –  Jongware yesterday
    
The negative lookahead is pointless though. What is it actually accomplishing? –  l'L'l yesterday
    
@l'L'l: I get different results with and without the lookahead, so it does do something. Does it return the same matches with and without for you? (See, possibly, the final sentence of the Mozilla quote.) –  Jongware yesterday

It's working exactly as it should. You've used , as the delimiter so it gives you five elements:

[1] olej sojowy
[2]   
[3] sorbitol
[4]   
[5] czerwień koszenilową

The empty elements are indicators of where the split(s) are located.

I don't think it works the way it should. I've just noticed that if I change my regular expression to /, /g then I get just what I wanted (3 elements in the result), but there are other strings which I don't want to split if there is któ after the comma and space. So why is this operator changing the behaviour of split? –  krajol yesterday
Your regex pattern is not working the way it should; meaning the negative lookahead does absolutely nothing. Take a look: , (?!(anything)). I'm really not sure what you're wanting to accomplish - the pattern you've got in the negative lookahead (któ) is not even in the string of text you're searching. –  l'L'l yesterday
Sure, thanks, obviously I could do it, but it would be great to know why it is working that way using RegEx and I'd love to know how to do it using purely RegEx and sadly I can't figure it out :) –  krajol yesterday
I'm not sure what you mean; regex is designed for finding patterns in strings. When you use those patterns as delimiters (splitting strings) it splits it into an array — There's nothing overly complicated about it really. –  l'L'l yesterday
The statement "The empty elements are indicators of where the split(s) are located" is not correct. –  Jongware yesterday
