Performance issues when parsing code

Question

I am wondering about performance when parsing a source file while the user is editing it (for example, to provide syntax highlighting).
The simplest approach, I think, is to reparse every time the code changes and replace the displayed code with the newly highlighted version. With large files this may be a problem, though. Is there a better way to do it?

I suppose one solution would be to parse only the "area" around the last edit. Is this a good idea?

Have you identified this as an actual issue? Bear in mind that, while editing text, actual typing takes up a very small percentage of the time.

Darknight · Answer 1 · 2011-04-13 11:12:22Z

Why not use an existing open source project?

I suggest Scintilla:

http://www.scintilla.org/

SK-logic · Answer 2 · 2011-04-13 11:12:14Z

If you're using a Packrat parser, you can invalidate only the cached terminals that overlap the edited region and then reparse the whole buffer; only the missing terminals will actually be reparsed.
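To make the invalidation idea concrete, here is a minimal sketch in Python. It is not a full Packrat parser: it memoizes only terminals (tokens), keyed by start offset, which is the part of the memo table this answer is about. The names `TOKEN`, `tokenize`, and `apply_edit`, and the toy token grammar, are assumptions made for illustration.

```python
import re

# Toy terminal grammar (an assumption): whitespace, identifiers, integers,
# or any other single character.
TOKEN = re.compile(r"\s+|[A-Za-z_]\w*|\d+|.", re.S)


def tokenize(text, memo, stats):
    """Scan the whole buffer, reusing memoized terminals where available.

    memo maps a start offset to the end offset of the terminal found there.
    stats["scanned"] counts terminals that had to be matched from scratch.
    """
    pos, tokens = 0, []
    while pos < len(text):
        end = memo.get(pos)
        if end is None:
            end = TOKEN.match(text, pos).end()
            memo[pos] = end
            stats["scanned"] += 1
        tokens.append(text[pos:end])
        pos = end
    return tokens


def apply_edit(memo, start, old_end, new_end):
    """Return a memo that is valid after text[start:old_end] was replaced
    by new text occupying [start, new_end) in the edited buffer."""
    delta = new_end - old_end
    kept = {}
    for s, e in memo.items():
        if e < start:
            # Ends strictly before the edit: untouched. A terminal that
            # ends exactly at the edit is dropped, because the edit could
            # extend it (e.g. typing more letters after an identifier).
            kept[s] = e
        elif s >= old_end:
            # Starts after the replaced range: still valid, just shifted.
            kept[s + delta] = e + delta
        # Anything else overlaps or touches the edit and is discarded.
    return kept
```

With the buffer `foo = bar + 42`, replacing `bar` with `quux` corresponds to `apply_edit(memo, 6, 9, 10)`. The next `tokenize` call on the new text then rescans only two terminals (the space just before the edit and the new identifier) and takes the other seven from the memo. A real Packrat parser would also have to invalidate nonterminal entries whose matched or looked-ahead span covers the edit.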

Otherwise you'll need a more customised approach: for example, use regular expressions to detect the beginnings of top-level declarations, then start parsing only from those points, and only within the invalidated region. That's what most modern IDEs actually do.
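A rough sketch of the regex-based approach, again in Python. It assumes a language in which top-level declarations start at column 0 with `def` or `class`; the pattern `DECL_START` and the helper names are hypothetical, and a real editor would need a pattern suited to its own language (and some care with matches inside strings or comments).

```python
import re
from bisect import bisect_left, bisect_right

# Assumed convention: a top-level declaration begins at the start of a
# line with one of these keywords.
DECL_START = re.compile(r"^(?:def|class)\b", re.M)


def declaration_starts(text):
    """Offsets at which a top-level declaration appears to begin."""
    return [m.start() for m in DECL_START.finditer(text)]


def region_to_reparse(text, edit_start, edit_end):
    """Return (begin, end) offsets of the smallest run of whole top-level
    declarations that covers the edited range [edit_start, edit_end)."""
    starts = declaration_starts(text)
    # Last declaration that begins at or before the edit.
    i = bisect_right(starts, edit_start) - 1
    begin = starts[i] if i >= 0 else 0
    # First declaration that begins at or after the end of the edit,
    # and strictly after `begin` so the region is never empty.
    j = bisect_left(starts, max(edit_end, begin + 1))
    end = starts[j] if j < len(starts) else len(text)
    return begin, end
```

Only `text[begin:end]` then needs to be handed to the real parser (or highlighter); the results for the declarations before and after it can be reused, with offsets after the edit shifted by the change in length.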
