JavaScript: efficient parsing of a CSS selector

Question

What would be the most efficient way of parsing a CSS selector input string that features any combination of:

[key=value] : attributes, 0 to * instances
#id : ids, 0 to 1 instances
.class : classes, 0 to * instances
tagName : tag names, 0 to 1 instances (found at start of string only)

(note: '*', or other applicable combinator could be used in lieu of tag?)

Such as:

div.someClass#id[key=value][key2=value2].anotherClass

Into the following output:

['div','.someClass','#id','[key=value]','[key2=value2]','.anotherClass']

Or for bonus points, into this form efficiently (read: a way not just based on using str[0] === '#' for example):

{
  tags : ['div'],
  classes : ['someClass','anotherClass'],
  ids : ['id'],
  attrs : {
    key : 'value',
    key2 : 'value2'
  }
}

(note removal of # . [ = ])

I imagine some combination of regex and .match(..) is the way to go, but my regex knowledge is nowhere near advanced enough for this situation.

Many thanks for your help.

Regex is rarely the right solution for parsing complex languages. You should have a look at the many libraries that do this (like Sizzle) — dystroy, Jul 26 '13 at 18:04
I know sizzle does it, but I'm looking to implement my own simple solution. The domain is not as complex as a language, there is no whitespace etc, and a limited format for delimiters (as listed in the question) — ComethTheNerd, Jul 26 '13 at 18:05
I was suggesting looking at the source, not using it. If you want to parse CSS selectors, you should take whitespace into account. — dystroy, Jul 26 '13 at 18:06
OK I will consult the source, but I'm talking about tokens already split by whitespace. This question is about the next step after splitting the tokens delimited by whitespace — ComethTheNerd, Jul 26 '13 at 18:07
@dystroy I think this is about parsing the selector "sub-syntax" for a single element match; I'm not sure what that's called. Also SCRIPTONITE note that it's not just splitting on whitespace - whitespace is an operator in the CSS selector syntax, comparable to the + and ~ connectors. — Pointy, Jul 26 '13 at 18:07

dystroy · Answer 1 · 2013-07-26 18:18:13Z

You might do the splitting using

var tokens = subselector.split(/(?=\.)|(?=#)|(?=\[)/)

which changes

div.someClass#id[key=value][key2=value2].anotherClass

to

["div", ".someClass", "#id", "[key=value]", "[key2=value2]", ".anotherClass"]

After that, you simply have to look at how each token starts (and, for tokens starting with [, check whether they contain a =).
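As a quick check of the splitting step, here is a small sketch. The variable names (tokensCompact, kinds) are illustrative and not part of the answer's code. The single character-class lookahead is an equivalent way of writing the three alternated lookaheads, shown only for comparison.

```javascript
// Illustrative input taken from the question.
var subselector = 'div.someClass#id[key=value][key2=value2].anotherClass';

// Pattern from the answer: split before '.', '#' or '[' without consuming them.
var tokens = subselector.split(/(?=\.)|(?=#)|(?=\[)/);

// Equivalent single lookahead using a character class.
var tokensCompact = subselector.split(/(?=[.#\[])/);

// Classify each token by its first character; anything else counts as a tag.
var kinds = tokens.map(function (token) {
  return {'#': 'id', '.': 'class', '[': 'attr'}[token[0]] || 'tag';
});
// kinds is ['tag', 'class', 'id', 'attr', 'attr', 'class']
```

Because the lookaheads are zero-width, each delimiter stays attached to the token it introduces, which is what makes the first-character check possible.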

Here's the complete working code, building the object you describe:

function parse(subselector) {
  // attrs is a plain object, as in the question: {key: value, ...}
  var obj = {tags:[], classes:[], ids:[], attrs:{}};
  subselector.split(/(?=\.)|(?=#)|(?=\[)/).forEach(function(token){
    switch (token[0]) {
      case '#':
        obj.ids.push(token.slice(1));
        break;
      case '.':
        obj.classes.push(token.slice(1));
        break;
      case '[':
        // strip the brackets, then separate the name from the value
        var pair = token.slice(1,-1).split('=');
        obj.attrs[pair[0]] = pair[1];
        break;
      default :
        obj.tags.push(token);
        break;
    }
  });
  return obj;
}

edited Jul 26 '13 at 18:18

answered Jul 26 '13 at 18:11

dystroy

[key="val#ue"] – Gumbo Jul 26 '13 at 18:13

This is a great start, though I do agree with @Gumbo's point. Is there a way to make the attribute search 'greedier' than the other searches to avoid this problem? – ComethTheNerd Jul 26 '13 at 18:15

@Gumbo I answered the question as written, not a broader question about every kind of CSS selector, because trying to handle that in a few lines of JavaScript would be doomed. – dystroy Jul 26 '13 at 18:27
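On the [key="val#ue"] case raised in the comments: one possible approach, sketched here under stated assumptions, is to match whole tokens instead of splitting on delimiters, so that an attribute is consumed as a unit. This is not the answer's method. The names parseCompound and tokenRe are hypothetical. The sketch assumes every attribute has the form [name=value], that values may be wrapped in double or single quotes with no escaped quotes inside, and that the input contains no whitespace or combinators. It does not enforce that a tag name appears only at the start.

```javascript
// Hypothetical helper, not from the answer above: parses one compound
// selector (no whitespace or combinators) by matching whole tokens.
function parseCompound(subselector) {
  var obj = {tags: [], classes: [], ids: [], attrs: {}};
  // Alternatives, in order: [name=value] with an optionally quoted value,
  // then #id or .class, then a bare tag name.
  var tokenRe = /\[([^=\]]+)=(?:"([^"]*)"|'([^']*)'|([^\]]*))\]|([#.])([^#.\[]+)|([^#.\[]+)/g;
  var pos = 0;
  while (pos < subselector.length) {
    tokenRe.lastIndex = pos;
    var m = tokenRe.exec(subselector);
    // Reject input that the pattern would otherwise silently skip over.
    if (m === null || m.index !== pos) {
      throw new SyntaxError('Unexpected input at position ' + pos);
    }
    if (m[1] !== undefined) {
      // Exactly one of groups 2-4 holds the attribute value.
      var value = m[2] !== undefined ? m[2] : m[3] !== undefined ? m[3] : m[4];
      obj.attrs[m[1]] = value;
    } else if (m[5] === '#') {
      obj.ids.push(m[6]);
    } else if (m[5] === '.') {
      obj.classes.push(m[6]);
    } else {
      obj.tags.push(m[7]);
    }
    pos = tokenRe.lastIndex;
  }
  return obj;
}

// Delimiters inside a quoted value no longer split the token:
var parsed = parseCompound('a[href="x.y#z"].ext');
// parsed.attrs.href is 'x.y#z', parsed.classes is ['ext']
```

Because the attribute alternative is tried first and only ends at the closing bracket after the quoted value, the characters #, . and [ inside quotes are treated as part of the value. Anything the pattern cannot match at the current position raises an error instead of being skipped.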
