Programmers Stack Exchange is a question and answer site for professional programmers interested in conceptual questions about software development.

Edited the question to be clearer:

It is known that interpreting bytecode is much faster than interpreting source code or some IL version of the source code. The interpreter has a much easier time understanding bytecode than source code or source-code-like IL.

Virtual machines interpret bytecode. This bytecode interpreted by VMs is the result of compiling source code to bytecode.

However, 'independent interpreters' (not inside a VM) are known to interpret source code or source-code-like IL, instead of bytecode.

Why is that? Why don't interpreters interpret bytecode, like VMs? All that is needed is to first compile the source code to bytecode (like done for VMs), and then the interpreter can interpret this bytecode.

Is the reason for this that an interpreter that interprets bytecode is a VM by definition? (Just guessing here.) Or is it something else?


closed as unclear what you're asking by gnat, MichaelT, Bart van Ingen Schenau, GlenH7, DougM Mar 11 '14 at 18:56


    
To the downvoters - May I ask why the downvotes? Isn't this a legitimate question? –  NPElover Mar 9 '14 at 20:03
    
It looks to me that you want to understand why the overhead of running an interpreter is generally higher than the overhead of starting a VM. Is that correct? –  Kevin Mar 9 '14 at 20:18
    
@Kevin yep correct :) I heard that a lot of programming languages started as languages run by an interpreter, but when the need for better performance rose, the interpreter was converted to a VM to optimize performance. I don't understand how this would optimize performance. –  NPElover Mar 9 '14 at 20:25
    
I edited your question, then. Do you think the edits help? –  Kevin Mar 9 '14 at 20:30
@Kevin I rejected your edits but I figured out more accurately what I wanted to ask. Edited the question. –  NPElover Mar 9 '14 at 20:40

3 Answers

Source code is intended to be read by humans.

Byte code is intended to be read by VMs.

If I give you some byte code to read, you could probably slowly decipher what it's doing. The same thing happens in reverse when an interpreter works on the human-oriented source code: It is possible to write an interpreter that works with source code, but it will be slow.

While humans need mnemonics and are more productive with nice syntax, the opposite is true for a computer program: textual representations have low information density, and fancy syntax is harder to parse. Bytecodes are usually very simple “regular languages” that are extremely cheap to decode, whereas the syntax of high-level programming languages tends to be “context-free” (and many languages include elements that are even harder to parse).
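To illustrate how cheap decoding a regular bytecode language is, here is a minimal sketch of a stack-machine dispatch loop. The opcodes (`PUSH`, `ADD`, `MUL`, `HALT`) are made up for this example and don't correspond to any real VM's format; the point is that "decoding" is just reading a byte and branching on it, with no tokenizing or grammar involved.

```python
# Hypothetical one-byte opcodes; PUSH carries a one-byte operand.
PUSH, ADD, MUL, HALT = 0, 1, 2, 3

def run(code):
    stack, pc = [], 0
    while True:
        op = code[pc]                    # decoding = one array read
        if op == PUSH:                   # PUSH <n>: push the next byte
            stack.append(code[pc + 1])
            pc += 2
        elif op == ADD:                  # pop two values, push their sum
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
            pc += 1
        elif op == MUL:                  # pop two values, push their product
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
            pc += 1
        elif op == HALT:
            return stack.pop()

# (2 + 3) * 4, already "compiled" to bytecode:
program = bytes([PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, HALT])
print(run(program))  # 20
```

Compare this with what an interpreter for the textual expression `(2 + 3) * 4` would need: a lexer, a grammar for operator precedence and parentheses, and a tree walk, all before any arithmetic happens.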

    
What do you mean by “high-level programming languages are 'context free'”? –  Robert Harvey Mar 9 '14 at 18:42
    
@RobertHarvey the syntax of high-level languages tends to approximately follow a context-free grammar. will edit my post to clarify –  amon Mar 9 '14 at 18:43
    
I think what the OP is asking (which you don't address) is, why is the overhead of running an interpreted language generally higher than the overhead of starting an entire VM? You only explain that interpreters have overhead, which I think the OP understands. –  Kevin Mar 9 '14 at 20:16
    
You're saying that computers take long to understand human-readable code with fancy syntax, and I understand this. I understand why computers would need simpler code to interpret, and that's why often source code is compiled to bytecode, and only then interpreted by a VM. But what I don't understand is why the bytecode can't be of textual format. If the only issue is for the interpreted language to have a syntax which will be easy for the interpreter to understand and execute, then what's wrong with having bytecode of textual form (i.e., something that could be included in a .txt file) instead of binary form? –  NPElover Mar 9 '14 at 20:21
    
And yes, Kevin is right. I'd very much like to know the answer to what I asked in the previous comment, but that's not my original question. My original question is why do interpreters take longer to execute a program, than VMs (which as far as I understand, 'contain' an interpreter inside them, and also interpret code). –  NPElover Mar 9 '14 at 20:24

A typical grammar for a present-day structured language is a context-free grammar (or at least is parsed as one).

Bytecode, however, is more like an unstructured regular language, with gotos and conditional jumps.

This means that when a naive interpreter reaches the end of a block, it needs to remember whether it has to jump back (for a for or while loop). Reading line by line (or in any kind of delimited chunks) isn't very efficient, whereas reading a known number of bytes is much cheaper.
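The point about jumps can be sketched concretely. In the hypothetical bytecode below (made-up opcodes, not any real VM's), "looping back" is just resetting the program counter to a byte offset; the interpreter never has to remember block structure or re-read any text.

```python
# Hypothetical opcodes for illustration only.
PUSH, DEC, JNZ, HALT = 0, 1, 2, 3

def run(code):
    stack, pc, jumps = [], 0, 0
    while True:
        op = code[pc]
        if op == PUSH:                   # PUSH <n>: push the next byte
            stack.append(code[pc + 1]); pc += 2
        elif op == DEC:                  # decrement top of stack
            stack.append(stack.pop() - 1); pc += 1
        elif op == JNZ:                  # jump to offset code[pc+1] if top != 0
            target = code[pc + 1]
            pc = target if stack[-1] != 0 else pc + 2
            jumps += 1
        elif op == HALT:
            return stack.pop(), jumps

# Equivalent of: n = 5; while (n != 0) n -= 1;
program = bytes([PUSH, 5,    # offset 0: n = 5
                 DEC,        # offset 2: n -= 1
                 JNZ, 2,     # offset 3: if n != 0 goto offset 2
                 HALT])      # offset 5
print(run(program))  # (0, 5): the loop body ran 5 times
```

A line-oriented source interpreter running the same loop would have to re-find (and usually re-parse) the loop body's text on every one of those five iterations.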

    
Does that mean that any interpreter that executes code in a native format (if I understand correctly, that means code composed of bytes instead of characters) - is a VM? –  NPElover Mar 9 '14 at 20:51
technically that's an emulator, but the lines start to blur when you think about it too much –  ratchet freak Mar 9 '14 at 21:07

A virtual machine gets an instruction stream in its native format. Basically, its operation looks like this: execute; execute; execute; ...

An interpreter receives instructions in a different language, so it has to translate every instruction before executing it: translate; execute; translate; execute; translate; execute; ...

An interpreter has to translate the same thing again and again if it executes a loop, while the VM has outsourced that effort to a separate step (bytecode compilation), and even that step only had to do the translation once.

On top of that, translating an instruction can involve things like tokenizing and symbol table manipulation, which are rather slow, while bytecode execution typically amounts to looking things up in binary indexes.
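The "translate; execute" cost can be made visible with a contrived line interpreter for a hypothetical mini-language (the `set`/`dec`/`jnz` statements are invented for this sketch). Because it works on text, it re-tokenizes every line each time control passes over it, so a loop body pays the translation cost on every iteration.

```python
import re

def interpret(lines):
    env, pc, translations = {}, 0, 0
    while pc < len(lines):
        # "translate": tokenize the current line -- done on EVERY visit
        tokens = re.findall(r"\w+", lines[pc])
        translations += 1
        # "execute": dispatch on the first token
        if tokens[0] == "set":           # set x 5
            env[tokens[1]] = int(tokens[2])
        elif tokens[0] == "dec":         # dec x
            env[tokens[1]] -= 1
        elif tokens[0] == "jnz":         # jnz x 1 -> goto line 1 if x != 0
            if env[tokens[1]] != 0:
                pc = int(tokens[2])
                continue
        pc += 1
    return env, translations

env, translations = interpret(["set x 5", "dec x", "jnz x 1"])
print(env, translations)  # {'x': 0} 11 -- 3 lines, but 11 tokenizations
```

A bytecode VM running the same loop would pay the translation cost exactly once, at compile time, and then only execute.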

All of this conspires to make pure interpretation a slow business. Many interpreters employ tricks to avoid some of this effort, and often that moves them a considerable distance into virtual-machine territory; interpreters as pure and simple as the one sketched above are actually somewhat rare.

    
Thanks for your answer. So - are you saying that an interpreter that executes code in a native format (if I understand correctly, that means code composed of bytes instead of characters) - is a VM? –  NPElover Mar 9 '14 at 20:49
