I know this has been discussed several times, but I'm still going crazy over this problem. I have a form with a submit.php action. At first I didn't change anything about the charsets and didn't use any UTF-8 header information. The result was that I could read all the ä, ö, ü etc. correctly inside the database. But exporting them to .csv and importing that into Excel as UTF-8 (I also tested all the other charsets) results in garbled characters.
Now what I tried:
PHP:
header("Content-Type: text/html; charset=utf-8");
$mysqli->set_charset("utf8");
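For completeness, the relevant part of submit.php now looks roughly like this (just a sketch - the credentials, table and column names are placeholders, not my real ones):

// Sketch of the connection setup in submit.php (credentials/table are placeholders)
header("Content-Type: text/html; charset=utf-8");

$mysqli = new mysqli("localhost", "user", "pw", "db");
if ($mysqli->connect_errno) {
    die("Connect failed: " . $mysqli->connect_error);
}
// Set the client/connection/results charset before running any queries
$mysqli->set_charset("utf8");

// Example insert from the form (prepared statement, placeholder columns)
$name   = $_POST["name"];
$street = $_POST["street"];
$stmt = $mysqli->prepare("INSERT INTO customers (name, street) VALUES (?, ?)");
$stmt->bind_param("ss", $name, $street);
$stmt->execute();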
MySQL: I dropped my database and created a new one:
create database db CHARACTER SET utf8 COLLATE utf8_general_ci;
create table ...
I changed my my.cnf and restarted the MySQL server:
[mysqld]
character-set-server=utf8
collation-server=utf8_general_ci
[mysql]
default-character-set=utf8
If I connect to my DB from the shell and check the character set variables, I get the following output:
| character_set_client | utf8 |
| character_set_connection | utf8 |
| character_set_database | utf8 |
| character_set_filesystem | binary |
| character_set_results | utf8 |
| character_set_server | utf8 |
| character_set_system | utf8 |
| character_sets_dir | /usr/local/mysql/share/charsets/ |
A PHP test:
var_dump($mysqli->get_charset());
Giving me:
Current character set: utf8 object(stdClass)#3 (8) { ["charset"]=> string(4) "utf8" ["collation"]=> string(15) "utf8_general_ci" ["dir"]=> string(0) "" ["min_length"]=> int(1) ["max_length"]=> int(3) ["number"]=> int(33) ["state"]=> int(1) ["comment"]=> string(13) "UTF-8 Unicode" }
Now I use:
mysql -uroot -ppw db < require.sql > /tmp/test.csv
require.sql simply contains:
select * from table;
And again I'm unable to import it as a CSV into Excel, no matter whether I choose UTF-8 or anything else. The special characters always come out garbled.
Hopefully someone has a hint about what might have gone wrong here.
Cheers
Edit: TextMate shows the output correctly, so it seems the conversion actually worked and it's an Excel issue? I'm using Microsoft Office 2011.
Edit 2: I also tried the same thing with latin1 - same issue, I cannot import special characters into Excel without breaking them. Any hint or workaround?
Edit 3: I found a workaround that works with Excel's import feature but not when double-clicking the .csv:
iconv -f utf8 -t ISO-8859-1 test.csv > test_ISO.csv
Now I'm able to import the CSV into Excel as Windows (ANSI). It's still annoying to have to use the import feature instead of double-clicking. I also really don't get why UTF-8 isn't working, not even with the import feature, with a BOM added and the complete database in UTF-8.
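To be clear about what I mean by "BOM added": I prepended the three UTF-8 BOM bytes to the file, roughly like this on the PHP side (just a sketch, the file names are placeholders):

// Sketch: prepend the UTF-8 byte order mark so Excel can (in theory) detect the encoding
$csv = file_get_contents("/tmp/test.csv");
file_put_contents("/tmp/test_bom.csv", "\xEF\xBB\xBF" . $csv);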
Comma separation turned out to be a mess as well:

1. CONCAT_WS only partly works, because it adds a stupid concat_ws(..) header line to the .csv. Also, "file test.csv" doesn't report it as "comma separated", which means that even though everything is separated by commas, Excel won't recognize that on double-click.

2. sed/awk: I found some code snippets, but all of them split the table badly. E.g. a street column containing "streetname number" ended up as 'streetname','number', which turned one column into two and screwed up the table.
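For what it's worth, a PHP-side export with fputcsv should at least get the quoting right, so "streetname number" stays one column. Rough sketch only, with placeholder credentials, table and column names:

// Sketch: export with fputcsv so fields containing spaces/commas get quoted properly
$mysqli = new mysqli("localhost", "user", "pw", "db");
$mysqli->set_charset("utf8");

$out = fopen("/tmp/test.csv", "w");
fwrite($out, "\xEF\xBB\xBF");            // optional UTF-8 BOM for Excel

$result = $mysqli->query("SELECT name, street FROM customers");
while ($row = $result->fetch_assoc()) {
    fputcsv($out, $row);                 // comma-separated, quoted where needed
}
fclose($out);

That still leaves the encoding question open, though.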
So it seems to me that Excel can only open a .csv on double-click if it a) is encoded in ISO-8859-1 (and only under Windows, because the standard Mac charset is Macintosh) and b) has the "comma separated" attribute. This means that if I create a .csv through Excel itself, the output of
file test1.csv
would be
test1.csv: ISO-8859 text, with CRLF line terminators
while a file converted with iconv, with a regex used to add the commas, looks like:
test1.csv: ISO-8859 text
Pretty weird behaviour - maybe someone has a working solution.