Programmers Stack Exchange is a question and answer site for professional programmers interested in conceptual questions about software development.

Edited the question to be clearer:

It is known that interpreting bytecode is much faster than interpreting source code or some IL version of the source code. The interpreter has a much easier time understanding bytecode than source code or source-code-like IL.

Virtual machines interpret bytecode. This bytecode interpreted by VMs is the result of compiling source code to bytecode.

However, 'independent interpreters' (not inside a VM) are known to interpret source code or source-code-like IL, instead of bytecode.

Why is that? Why don't interpreters interpret bytecode, like VMs? All that is needed is to first compile the source code to bytecode (like done for VMs), and then the interpreter can interpret this bytecode.

Is the reason for this that an interpreter that interprets bytecode is a VM by definition? (Just guessing here.) Or is it something else?


closed as unclear what you're asking by gnat, MichaelT, Bart van Ingen Schenau, GlenH7, DougM Mar 11 '14 at 18:56


    
To the downvoters - May I ask why the downvotes? Isn't this a legitimate question? –  NPElover Mar 9 '14 at 20:03
    
It looks to me that you want to understand why the overhead of running an interpreter is generally higher than the overhead of starting a VM. Is that correct? –  Kevin Mar 9 '14 at 20:18
    
@Kevin yep correct :) I heard that a lot of programming languages started as languages run by an interpreter, but when the need for better performance rose, the interpreter was converted to a VM to optimize performance. I don't understand how this would optimize performance. –  NPElover Mar 9 '14 at 20:25
    
I edited your question, then. Do you think the edits help? –  Kevin Mar 9 '14 at 20:30
@Kevin I rejected your edits but I figured out more accurately what I wanted to ask. Edited the question. –  NPElover Mar 9 '14 at 20:40

3 Answers

Source code is intended to be read by humans.

Byte code is intended to be read by VMs.

If I give you some byte code to read, you could probably slowly decipher what it's doing. The same thing happens in reverse when an interpreter works on the human-oriented source code: It is possible to write an interpreter that works with source code, but it will be slow.

While humans need mnemonics and are more productive with nice syntax, the opposite is true for a computer program: textual representations have low information density, and fancy syntax is harder to parse. Bytecodes are usually very simple “regular languages” that are extremely cheap to decode, whereas the syntax of high-level programming languages tends to be “context-free” (and many languages include elements that are even harder to parse).
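To illustrate how cheap decoding a regular bytecode language is, here is a minimal sketch of a stack-machine dispatch loop. The opcodes (`PUSH`, `ADD`, `MUL`, `HALT`) are made up for this example and don't correspond to any real VM's format; the point is that "decoding" is just reading a byte and branching on it, with no tokenizing or grammar involved.

```python
# Hypothetical one-byte opcodes; PUSH carries a one-byte operand.
PUSH, ADD, MUL, HALT = 0, 1, 2, 3

def run(code):
    stack, pc = [], 0
    while True:
        op = code[pc]                    # decoding = one array read
        if op == PUSH:                   # PUSH <n>: push the next byte
            stack.append(code[pc + 1])
            pc += 2
        elif op == ADD:                  # pop two values, push their sum
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
            pc += 1
        elif op == MUL:                  # pop two values, push their product
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
            pc += 1
        elif op == HALT:
            return stack.pop()

# (2 + 3) * 4, already "compiled" to bytecode:
program = bytes([PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, HALT])
print(run(program))  # 20
```

Compare this with what an interpreter for the textual expression `(2 + 3) * 4` would need: a lexer, a grammar for operator precedence and parentheses, and a tree walk, all before any arithmetic happens.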

    
What do you mean by “high-level programming languages are 'context free'”? –  Robert Harvey Mar 9 '14 at 18:42
    
@RobertHarvey the syntax of high-level languages tends to approximately follow a context-free grammar. will edit my post to clarify –  amon Mar 9 '14 at 18:43
    
I think what the OP is asking (which you don't address) is, why is the overhead of running an interpreted language generally higher than the overhead of starting an entire VM? You only explain that interpreters have overhead, which I think the OP understands. –  Kevin Mar 9 '14 at 20:16
    
You're saying that computers take long to understand human-readable code with fancy syntax, and I understand this. I understand why computers would need simpler code to interpret, and that's why often source code is compiled to bytecode, and only then interpreted by a VM. But what I don't understand is why the bytecode can't be of textual format. If the only issue is for the interpreted language to have a syntax which will be easy for the interpreter to understand and execute, then what's wrong with having bytecode of textual form (i.e., something that could be included in a .txt file) instead of binary form? –  NPElover Mar 9 '14 at 20:21
    
And yes, Kevin is right. I'd very much like to know the answer to what I asked in the previous comment, but that's not my original question. My original question is why do interpreters take longer to execute a program, than VMs (which as far as I understand, 'contain' an interpreter inside them, and also interpret code). –  NPElover Mar 9 '14 at 20:24

A typical grammar for a present-day structured language is a context-free grammar (or at least is parsed as one).

Bytecode, however, is more like an unstructured regular language, with gotos and conditional jumps.

This means that when a naive interpreter reaches the end of a block, it needs to remember whether it has to jump back (for a for or while loop). Reading line by line (or in any kind of delimited chunks) isn't very efficient, whereas reading a known number of bytes is much cheaper.
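The point about jumps can be sketched concretely. In the hypothetical bytecode below (made-up opcodes, not any real VM's), "looping back" is just resetting the program counter to a byte offset; the interpreter never has to remember block structure or re-read any text.

```python
# Hypothetical opcodes for illustration only.
PUSH, DEC, JNZ, HALT = 0, 1, 2, 3

def run(code):
    stack, pc, jumps = [], 0, 0
    while True:
        op = code[pc]
        if op == PUSH:                   # PUSH <n>: push the next byte
            stack.append(code[pc + 1]); pc += 2
        elif op == DEC:                  # decrement top of stack
            stack.append(stack.pop() - 1); pc += 1
        elif op == JNZ:                  # jump to offset code[pc+1] if top != 0
            target = code[pc + 1]
            pc = target if stack[-1] != 0 else pc + 2
            jumps += 1
        elif op == HALT:
            return stack.pop(), jumps

# Equivalent of: n = 5; while (n != 0) n -= 1;
program = bytes([PUSH, 5,    # offset 0: n = 5
                 DEC,        # offset 2: n -= 1
                 JNZ, 2,     # offset 3: if n != 0 goto offset 2
                 HALT])      # offset 5
print(run(program))  # (0, 5): the loop body ran 5 times
```

A line-oriented source interpreter running the same loop would have to re-find (and usually re-parse) the loop body's text on every one of those five iterations.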

    
Does that mean that any interpreter that executes code in a native format (if I understand correctly, that means code composed of bytes instead of characters) - is a VM? –  NPElover Mar 9 '14 at 20:51
technically that's an emulator, but the lines start to blur when you think about it too much –  ratchet freak Mar 9 '14 at 21:07

A virtual machine gets an instruction stream in its native format. Basically, its operation looks like this: execute; execute; execute; ...

An interpreter receives instructions in a different language, so it has to translate every instruction before executing it: translate; execute; translate; execute; translate; execute; ...

An interpreter has to translate the same thing again and again if it executes a loop, while the VM has outsourced that effort to a separate step (bytecode compilation), and even that step only had to do the translation once.

On top of that, translating an instruction can involve things like tokenizing and symbol table manipulation, which are rather slow, while bytecode execution typically amounts to looking things up in binary indexes.
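The "translate; execute" cost can be made visible with a contrived line interpreter for a hypothetical mini-language (the `set`/`dec`/`jnz` statements are invented for this sketch). Because it works on text, it re-tokenizes every line each time control passes over it, so a loop body pays the translation cost on every iteration.

```python
import re

def interpret(lines):
    env, pc, translations = {}, 0, 0
    while pc < len(lines):
        # "translate": tokenize the current line -- done on EVERY visit
        tokens = re.findall(r"\w+", lines[pc])
        translations += 1
        # "execute": dispatch on the first token
        if tokens[0] == "set":           # set x 5
            env[tokens[1]] = int(tokens[2])
        elif tokens[0] == "dec":         # dec x
            env[tokens[1]] -= 1
        elif tokens[0] == "jnz":         # jnz x 1 -> goto line 1 if x != 0
            if env[tokens[1]] != 0:
                pc = int(tokens[2])
                continue
        pc += 1
    return env, translations

env, translations = interpret(["set x 5", "dec x", "jnz x 1"])
print(env, translations)  # {'x': 0} 11 -- 3 lines, but 11 tokenizations
```

A bytecode VM running the same loop would pay the translation cost exactly once, at compile time, and then only execute.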

All of this conspires to make pure interpretation a slow business. Many interpreters employ tricks to avoid some of this effort, and often that moves them a considerable distance into virtual-machine territory; interpreters as pure and simple as the one sketched above are actually somewhat rare.

    
Thanks for your answer. So - are you saying that an interpreter that executes code in a native format (if I understand correctly, that means code composed of bytes instead of characters) - is a VM? –  NPElover Mar 9 '14 at 20:49
