
The Java string pool, coupled with reflection, can produce some surprising results:

import java.lang.reflect.Field;

class MessingWithString {
    public static void main (String[] args) {
        String str = "Mario";
        toLuigi(str);
        System.out.println(str + " " + "Mario");
    }

    public static void toLuigi(String original) {
        try {
            Field stringValue = String.class.getDeclaredField("value");
            stringValue.setAccessible(true);
            stringValue.set(original, "Luigi".toCharArray());
        } catch (Exception ex) {
            // Ignore exceptions
        }
    }
}

The above code will print:

"Luigi Luigi" 

What happened to Mario?

possible duplicate of Is a Java string really immutable? –  Joe 15 hours ago

6 Answers

Accepted answer (65 votes)

What happened to Mario ??

You changed it, basically. Yes, with reflection you can violate the immutability of strings... and due to string interning, that means any use of "Mario" (other than in a larger string constant expression, which would have been resolved at compile-time) will end up as "Luigi" in the rest of the program.

This kind of thing is why reflection requires security permissions...

Note that the expression str + " " + "Mario" does not perform any compile-time concatenation, due to the left-associativity of +. It's effectively (str + " ") + "Mario", which is why you still see Luigi Luigi. If you change the code to:

System.out.println(str + (" " + "Mario"));

... then you'll see Luigi Mario, as the compiler will have interned " Mario" as a different string from "Mario".
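The compile-time versus run-time distinction can be observed without any reflection by comparing object identity. This is a minimal sketch; the class and method names (ConcatDemo, folded, runtime, withFinal) are made up for illustration. It relies only on the language rules that constant expressions are interned and that a non-constant concatenation creates a new String object.

```java
class ConcatDemo {
    // A compile-time constant expression: the compiler folds it into the
    // single literal " Mario", which is interned.
    static String folded() {
        return " " + "Mario";
    }

    // Parsed as (str + " ") + "Mario". Because str is a non-final local,
    // nothing is folded and the result is built at run time.
    static String runtime() {
        String str = "Mario";
        return str + " " + "Mario";
    }

    // A final local initialized with a constant is itself a constant, so
    // the whole expression is folded into the literal "Mario Mario".
    static String withFinal() {
        final String str = "Mario";
        return str + " " + "Mario";
    }

    public static void main(String[] args) {
        System.out.println(folded() == " Mario");                // true
        System.out.println(runtime() == "Mario Mario");          // false
        System.out.println(runtime().intern() == "Mario Mario"); // true
        System.out.println(withFinal() == "Mario Mario");        // true
    }
}
```

Only the folded forms become separate pooled literals, which is why they would not pick up a change made to the pooled "Mario" object.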

The "other than in a larger string constant expression" bit may not be 100% true, all of the time. In the question, the System.out.println call uses a compile-time constant expression (" " + "Mario"), yet that instance of "Mario" still ends up changed. I suspect this is due to an optimization whereby " Mario" is interned and "Mario" refers to the same memory space due to being a suffix match, though I haven't confirmed it. An interesting edge case to what is a generally true statement, though. (Or I'm just misinterpreting whether this is a compile-time constant.) –  Chris Hayes 21 hours ago
@ChrisHayes: No, that isn't a compile-time constant expression due to the associativity of +. It's evaluated as (str + " ") + "Mario". If you print just " " + Mario or " " + "Mario" + str then you have compile-time concatenation, and you still get Mario in the output. –  Jon Skeet 21 hours ago
    
Ah, I see. That makes sense, if it's not immediately intuitive. Thanks for the explanation. –  Chris Hayes 21 hours ago
@ChrisHayes: Have added more explanation in the answer, given that it's generally useful. –  Jon Skeet 21 hours ago
Yeah, if you declare the str variable final, it’ll change the outcome radically. Note that in theory, it is possible that the manipulated "Mario" instance gets garbage collected after the first concatenation has been performed and a new canonical "Mario" gets created for the next occurrence of that string. But it’s very unlikely. It’s also possible that the string deduplication feature of recent JVM implementations causes the wrong array to get copied to other, non-interned "Mario" instances. Also, there might remain a cached hashcode reflecting the old contents, causing funny effects… –  Holger 16 hours ago

It was set to Luigi. Strings in Java are immutable; thus, the compiler can interpret all mentions of "Mario" as references to the same String constant pool item (roughly, "memory location"). You used reflection to change that item, so every "Mario" in your code now behaves as if you had written "Luigi".

"... as references to the same memory location ..." - The compiler doesn't deal in memory locations, and the runtime system doesn't either, because the memory location of any String can be changed at any time by the garbage collector. (I understand what you are trying to say ... but you are expressing it incorrectly. If you were talking about C or C++, this explanation would be roughly correct. For Java it isn't.) –  Stephen C 21 hours ago
    
@StephenC: While it would have been better to say "same index in the String constant pool", in the end the effect is identical: "Mario" is stored in a memory location (because even JVM needs to eventually be interpreted on the underlying architecture, where it will be allocated somewhere), and if gc moves it, it still remains true that all mentions of "Mario" will refer to the same (moved) location. Still, you have a point - I should use Java-appropriate jargon, so I'll change it. –  Amadan 21 hours ago
The best way to put it is that they are all the same object. And it is ultimately the Java runtime system that ensures this, not the compiler. –  Stephen C 19 hours ago

To explain the existing answers a bit more, let's take a look at your generated byte code (Only the main() method here).

[Image: byte code of main()]

Now, any change to the contents of that location will affect both references (and any others you create too).


String literals are stored in the string pool and their canonical value is used. Both "Mario" literals aren't just strings with the same value, they are the same object. Manipulating one of them (using reflection) will modify "both" of them, as they are just two references to the same object.


You just changed the String constant pool entry "Mario" to "Luigi". That entry was referenced by multiple strings, so every literal "Mario" is now "Luigi".

Field stringValue = String.class.getDeclaredField("value");

You fetched the char[] field named value from the String class.

stringValue.setAccessible(true);

You made it accessible.

stringValue.set(original, "Luigi".toCharArray());

You replaced the value field of original with the characters of "Luigi". But original is the string literal "Mario", and literals are interned in the string pool, which means all literals with the same content refer to the same object.

String a = "Mario"; // created in the string pool
String b = "Mario"; // refers to the same pooled "Mario"
boolean same = (a == b); // true: both variables reference one object
// Reflection changed the object that 'a' refers to. 'b' was never
// reassigned, but it refers to the same object, so it sees the change too.

Basically, you have changed the "Mario" in the string pool, and that change is visible through every reference to it. If you create a String object (i.e. new String("Mario")) instead of using a literal, you will not see this behavior, because then you will have two different Marios.
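The "two different Marios" point can be checked with plain identity comparisons and no reflection. This is a minimal sketch, and the class and method names (TwoMariosDemo, literal, fresh) are made up for illustration.

```java
class TwoMariosDemo {
    // Every "Mario" literal in the program refers to the one pooled object.
    static String literal() {
        return "Mario";
    }

    // new String(...) always creates a separate object with equal content.
    static String fresh() {
        return new String("Mario");
    }

    public static void main(String[] args) {
        String pooled = literal();
        String copy = fresh();
        System.out.println(pooled == "Mario");       // true: same pooled object
        System.out.println(pooled == copy);          // false: a different object
        System.out.println(pooled.equals(copy));     // true: same characters
        System.out.println(pooled == copy.intern()); // true: intern() returns the pooled one
    }
}
```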


The other answers adequately explain what's going on. I just wanted to add the point that this only works if there is no security manager installed. When running code from the command line, by default there is not, and you can do things like this. However, in an environment where trusted code is mixed with untrusted code, such as an application server in a production environment or an applet sandbox in a browser, there would typically be a security manager present and you would not be allowed these kinds of shenanigans, so this is less of a terrible security hole than it seems.
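The question's catch block ignores every exception, so a runtime that refuses the reflective write would fail silently and the program would simply print Mario Mario. Here is a minimal sketch that reports the outcome instead. The names ReportingToLuigi and tryToLuigi are made up for illustration. It assumes only that a security manager's refusal surfaces as a SecurityException from setAccessible, and it makes no claim about which branch a particular JVM takes.

```java
import java.lang.reflect.Field;

class ReportingToLuigi {
    // Same steps as toLuigi in the question, but returns a short
    // description of what happened instead of swallowing failures.
    static String tryToLuigi(String original) {
        try {
            Field stringValue = String.class.getDeclaredField("value");
            stringValue.setAccessible(true);
            stringValue.set(original, "Luigi".toCharArray());
            return "changed";
        } catch (SecurityException ex) {
            // Thrown by setAccessible when a security manager denies the request.
            return "denied: " + ex;
        } catch (ReflectiveOperationException | RuntimeException ex) {
            // Any other refusal, e.g. the field is missing or cannot be written.
            return "failed: " + ex;
        }
    }

    public static void main(String[] args) {
        // A fresh object is used so the pooled "Mario" literal is left alone.
        String target = new String("Mario");
        System.out.println(tryToLuigi(target));
        System.out.println(target);
    }
}
```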
