
I have a table that includes special characters such as ™.

This character can be entered and viewed using phpMyAdmin and other software, but when I use a SELECT statement in PHP and output the result to a browser, I get the replacement character (a diamond with a question mark in it).

The table type is MyISAM. The encoding is UTF-8 Unicode. The collation is utf8_unicode_ci.

The first line of the html head is

<meta http-equiv="Content-Type" content="text/html; charset=utf-8">

I tried using the htmlentities() function on the string before outputting it. No luck.

I also tried adding this to php before any output (no difference):

header('Content-type: text/html; charset=utf-8');

Lastly I tried adding this right below the initial mysql connection (this resulted in additional odd characters being displayed):

$db_charset = mysql_set_charset('utf8',$db);

What have I missed?

Unrelated to the question itself, but please use mysqli or PDO rather than mysql extension, which is deprecated. –  Nathan Bouscal Apr 9 '13 at 3:10
 
Are you sure that whatever is in your database is actually utf8? –  Jack Apr 9 '13 at 3:12
 
 
how would I be sure that "whatever is in your database is actually utf8"? I'm typing the ™ character directly into phpMyAdmin, and everywhere I look in phpMyAdmin I see utf8 for both the field and the table... –  Jason Wood Apr 9 '13 at 3:39

marked as duplicate by Jack, hjpotter92, EdChum, Stony, Sindre Sorhus Apr 9 '13 at 8:04

This question has been asked before and already has an answer. If those answers do not fully address your question, please ask a new question.

3 Answers

Accepted answer (1 vote)

The code below works for me.

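// Assumes a connection has already been opened with mysql_connect().
mysql_set_charset("UTF8");  // set the connection charset before querying
$sql = "SELECT * FROM chartest";
$rs = mysql_query($sql);
header('Content-type: text/html; charset=utf-8');  // must be sent before any output
while ($row = mysql_fetch_array($rs)) {
    echo $row['name'];
}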
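
For readers on current PHP, where the mysql_* functions no longer exist (the extension was removed in PHP 7), the same idea might look roughly like this with mysqli. This is only a sketch: the function name fetch_names is made up for illustration, the connection details are whatever you already use, and utf8mb4 is my assumption for the connection charset (it is a superset of MySQL's utf8). The table chartest and column name are the ones from the code above.

```php
<?php
// Sketch only: the caller passes in a connected mysqli handle, for example
// new mysqli('localhost', 'user', 'secret', 'mydb') with your own values.
// fetch_names is a hypothetical helper, not part of any library.
function fetch_names(mysqli $db): array
{
    // Tell the server which encoding this connection sends and expects.
    // utf8mb4 is an assumption; plain 'utf8' also covers the trademark sign.
    $db->set_charset('utf8mb4');

    $result = $db->query('SELECT name FROM chartest');
    if ($result === false) {
        throw new RuntimeException($db->error);
    }

    $names = [];
    while ($row = $result->fetch_assoc()) {
        $names[] = $row['name'];
    }
    return $names;
}
```

The important part is the same as in the mysql_* version: set the connection charset through the API before running the query.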
 
arg! "mysql_set_charset("UTF8");" DID fix the problem. Just not while also using htmlentities(). I didn't realize that htmlentities() ALSO requires a charset to be specified, as discussed here: stackoverflow.com/questions/9103801/… –  Jason Wood Apr 9 '13 at 5:50
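
To illustrate the htmlentities() point from the comment above: before PHP 5.4 the function assumed ISO-8859-1 unless told otherwise, so UTF-8 input was processed byte by byte. A minimal sketch, passing the charset explicitly in both calls so the result does not depend on the PHP version (the variable names are just for illustration):

```php
<?php
// The three UTF-8 bytes of the trademark sign, written as escapes so the
// example does not depend on how this file itself is saved.
$tm = "\xE2\x84\xA2";

// Told the input is UTF-8, htmlentities() sees one character.
$good = htmlentities($tm, ENT_QUOTES, 'UTF-8');

// Told the input is ISO-8859-1, it treats each byte as a separate character.
$bad = htmlentities($tm, ENT_QUOTES, 'ISO-8859-1');

echo $good, "\n";          // prints &trade;
echo bin2hex($bad), "\n";  // hex dump of the mangled result
```

The second result is probably where output like "â�¢" comes from: the first and last bytes become entities of their own, and the middle byte is left behind as an invalid stray byte.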

There are a couple of things that might help. First, even though you're setting the charset to UTF-8 in the header, that might not be enough. I've seen the browser ignore that before. Try forcing it by adding this in the head of your HTML:

<meta charset='utf-8'>

Next, as mentioned here, try doing this:

mysql_query ("set character_set_client='utf8'");
mysql_query ("set character_set_results='utf8'");
mysql_query ("set collation_connection='utf8_general_ci'");

EDIT

So I've just done some reading up and playing around a bit. First let me tell you, despite what I mentioned in the comments, utf8_encode() and utf8_decode() will not help you here. It helps to actually understand UTF-8 encoding. I found the Wikipedia page on UTF-8 very helpful. Assuming the value you are getting back from the database is in fact already UTF-8 encoded, and you simply dump it out right after getting it, then it should be fine.

If you are doing anything with the database result (especially manipulating the string in any way) and you don't use the Unicode-aware functions from the PHP mbstring extension, then it will probably mess it up, since the standard PHP string functions work on bytes and are not Unicode-aware.

Once you understand how UTF-8 encoding works you can do something cool like this:

$test = "™";
for($i = 0; $i < strlen($test); $i++) { 
    echo sprintf("%b ", ord($test[$i]));
}

Which dumps out something like this:

11100010 10000100 10100010

That's a properly encoded UTF-8 '™' character. If you don't have a character like that in your data retrieved from the database then something is messed up.
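
If you want to see why those three bytes are '™', here is a small sketch that strips the UTF-8 marker bits and joins what is left into the code point. The variable names are mine; the bytes are the ones from the dump above.

```php
<?php
// The bytes from the dump above: 11100010 10000100 10100010.
$bytes = [0b11100010, 0b10000100, 0b10100010];

// A three-byte UTF-8 sequence has the layout 1110xxxx 10xxxxxx 10xxxxxx.
// Masking off the marker bits and joining the x bits gives the code point.
$codePoint = (($bytes[0] & 0x0F) << 12)
           | (($bytes[1] & 0x3F) << 6)
           |  ($bytes[2] & 0x3F);

printf("U+%04X\n", $codePoint); // prints U+2122
```

U+2122 is the Unicode code point for the trademark sign.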

To check, try searching for a special character that you know is in the result using mb_strpos():

var_dump(mb_strpos($db_result, '™'));

If that returns anything other than false (an integer position, including 0) then the data from the database is fine, as long as the PHP file itself is saved as UTF-8. Otherwise we can at least establish that it's a problem between PHP and the database.
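
Another way to check, which does not depend on how your PHP file is saved, is to dump the raw bytes and test whether they are valid UTF-8. A rough sketch: describe_bytes is a made-up helper name, and the two sample strings stand in for values you would fetch from the database.

```php
<?php
// Hypothetical helper: describe a string fetched from the database.
function describe_bytes(string $value): string
{
    // preg_match() with the /u modifier returns false when the subject
    // is not valid UTF-8, so an empty pattern works as a validity check.
    $valid = preg_match('//u', $value) === 1;
    return bin2hex($value) . ($valid ? ' (valid UTF-8)' : ' (not valid UTF-8)');
}

echo describe_bytes("\xE2\x84\xA2"), "\n"; // e284a2 (valid UTF-8)
echo describe_bytes("\x99"), "\n";         // 99 (not valid UTF-8)
```

The second sample is the trademark sign as a single Windows-1252 byte, which is what I would expect a connection still set to latin1 to hand back. A UTF-8 page cannot decode that byte, so the browser shows the diamond with a question mark.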

 
There was no change after adding <meta charset='utf-8'>. After Adding the other stuff, the problem seemed to get worse. Instead of "�" for ™, I got "â�¢". –  Jason Wood Apr 9 '13 at 3:51
 
Just to make sure the character encoding on the page is set right, if you're using Firefox you can right-click on the page and go to 'View Page Info', where it shows the encoding. Does it show 'UTF-8' or something like 'ISO-8859-1'? –  Justin Warkentin Apr 9 '13 at 3:56
 
I'm no expert with character encodings, but I've gotten it working before. I don't know if it'll help but you should probably check out some of the unicode related PHP functions like utf8_decode and the mbstring functions. –  Justin Warkentin Apr 9 '13 at 4:00
 
Yes, Firefox confirms it's UTF-8. I'll have a look at those functions. –  Jason Wood Apr 9 '13 at 4:13
 
I just added more to my answer after doing a little bit of research. Let me know if anything helps. –  Justin Warkentin Apr 9 '13 at 6:14

You need to execute the following query first, before any SELECT:

mysql_query("SET NAMES utf8");   
 
Please don't use this, it can create SQL injection problems under certain circumstances. Use the "official" mysql_set_charset API, which the OP already does. –  deceze Apr 9 '13 at 4:06