Tagged Questions

2 votes, 2 answers, 121 views

Remove copyright symbol

I'm trying to parse an RSS feed on the command line. The code works so far, but the feed contains a copyright symbol that I am trying to remove (the feed is Latin-1 encoded). How do I remove the copyright symbol ...
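
The excerpt is truncated, so how the feed is actually read is unknown. As a minimal sketch, assuming the raw feed bytes are Latin-1 (where the copyright sign is the single byte 0xA9), one could decode first and then drop the character. The function name `strip_copyright` and the sample bytes are hypothetical, not taken from the question.

```python
def strip_copyright(raw: bytes, encoding: str = "latin-1") -> str:
    """Decode raw feed bytes and drop every copyright sign (U+00A9)."""
    # Assumption: the feed really is Latin-1, as the question states.
    text = raw.decode(encoding)
    return text.replace("\u00a9", "")


# Hypothetical sample: 0xA9 is the Latin-1 byte for the copyright sign.
sample = b"Copyright \xa9 2012 Example Feed"
print(strip_copyright(sample))
```

Decoding before replacing avoids treating 0xA9 as a stray byte in a stream that is later interpreted as UTF-8, where that byte on its own is invalid.
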
3 votes, 4 answers, 420 views

How to convert hex chars to normal chars?

I tried, but I'm stuck at "escaping" the "sed's":
    sed -i 's/\\x0/NUL/g' $1
    sed -i 's/\\x1/SOH/g' $1
    sed -i 's/\\x2/STX/g' $1
    sed -i 's/\\x3/ETX/g' $1
    sed -i 's/\\x4/EOT/g' $1
    sed -i 's/\\x5/ENQ/g' ...
2 votes, 1 answer, 92 views

XML text file with `^@` characters in it?

I have an XML file that I need to parse. When I open it in nano, nano gives me the message "(converted from Mac format)". However, between each character there is a ^@ sequence, like so: ^@t^@h^@e^@ ...
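
nano displays a NUL byte as `^@`, and a NUL in front of every ASCII character is what UTF-16 big-endian text looks like when viewed byte by byte. That reading is an assumption here, since the excerpt is truncated and the file itself is not shown. Under that assumption, a rough sketch of decoding such bytes (the helper name `decode_utf16_guess` is hypothetical):

```python
import codecs


def decode_utf16_guess(raw: bytes) -> str:
    """Decode bytes that are assumed to be UTF-16 text."""
    if raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        # With a byte order mark, the generic codec picks the byte
        # order from the mark and removes it from the result.
        return raw.decode("utf-16")
    # Assumption: no mark and NUL-before-letter means big-endian.
    return raw.decode("utf-16-be")


# The pattern from the excerpt, ^@t^@h^@e, written as bytes.
print(decode_utf16_guess(b"\x00t\x00h\x00e"))
```

If the NUL bytes followed each letter instead of preceding it, `utf-16-le` would be the codec to try.
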
5 votes, 2 answers, 89 views

What is the connection between a gedit bug and a Unix-&-Linux Q/A href?

While answering a Unix-&-Linux question, I observed that Gedit and two other editors, Leafpad and Medit (I tested 12 editors altogether), exhibit a certain bug. As it turns out, the bug is ...
2 votes, 2 answers, 469 views

How can I test the encoding of a text file… Is it valid, and what is it?

I have several .htm files which open in Gedit without any warning or error, but when I open these same files in Jedit, it warns me of invalid UTF-8 encoding... The html meta tag states ...
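
The "is it valid" half of this question can be checked mechanically for UTF-8; the "what is it" half generally cannot, because many byte sequences are valid in several encodings at once. A minimal sketch of the validity check, reporting where the first bad byte sits (the name `first_invalid_utf8_offset` and the sample bytes are hypothetical):

```python
from typing import Optional


def first_invalid_utf8_offset(raw: bytes) -> Optional[int]:
    """Return the offset of the first invalid UTF-8 byte, or None."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError as err:
        # err.start is the index of the first byte that failed to decode.
        return err.start
    return None


# "cafe" with an accented e: once as UTF-8, once as a lone Latin-1 byte.
print(first_invalid_utf8_offset(b"caf\xc3\xa9"))
print(first_invalid_utf8_offset(b"caf\xe9 au lait"))
```

A file that fails this check but contains bytes in the 0x80 to 0xFF range is often Latin-1 or Windows-1252, though that is a guess a program cannot confirm from the bytes alone.
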
6 votes, 2 answers, 1k views

Filtering invalid utf8

I have a text file in an unknown or mixed encoding. I want to see the lines that contain a byte sequence that is not valid UTF-8 (by piping the text file into some program). Equivalently, I want to ...
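
One way to approach the filter described above, offered as a sketch and not as any answer's actual method: read the input as bytes, split on newlines, and keep only the lines that a strict UTF-8 decoder rejects. Splitting on the newline byte is safe because 0x0A never occurs inside a multi-byte UTF-8 sequence. The function name `invalid_utf8_lines` and the sample data are hypothetical.

```python
import io
from typing import BinaryIO, Iterator, Tuple


def invalid_utf8_lines(stream: BinaryIO) -> Iterator[Tuple[int, bytes]]:
    """Yield (line number, raw line) for lines that are not valid UTF-8."""
    for lineno, line in enumerate(stream, start=1):
        try:
            line.decode("utf-8")  # strict by default
        except UnicodeDecodeError:
            yield lineno, line


# Hypothetical mixed input: only the second line has an invalid byte.
sample = io.BytesIO(b"plain ascii\n\xff not utf-8\ncaf\xc3\xa9\n")
for lineno, line in invalid_utf8_lines(sample):
    print(lineno, repr(line))
```

To use this on piped input, `sys.stdin.buffer` can be passed in place of the `BytesIO` object, since it is also a binary stream that iterates line by line.
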