Remove invalid character from file
I was having trouble opening a .C file. It would open fine with
nano but geany didn't like it.
Running file -i MAIN.C only gave me
MAIN.C: application/octet-stream; charset=binary
which is what file reports when it can't recognize the charset.
Detecting the actual charset
Using Python's chardet:
chardetect MAIN.C
MAIN.C: Windows-1254 with confidence 0.3362399919117238
A confidence of about 0.34 is low, so this is only a guess. Windows-1254 on its own shouldn't be a problem for geany, but let's try converting the file from Windows-1254 to UTF-8.
Converting the charset
recode Windows-1254..UTF8 MAIN.C
recode: MAIN.C failed: Invalid entry in « CP1254..UTF-8 »

iconv -f Windows-1254 -t utf-8 -o OUT.C MAIN.C
iconv: illegal input sequence at position 198
So it turns out there's some rogue data at position 198 that prevents the file from being interpreted correctly.
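To see which bytes are actually the problem, something like the following Python sketch might help. It is not what I ran at the time. The function name find_undecodable is made up for illustration, and it assumes a single-byte encoding such as Python's cp1254 codec, whose table of undefined bytes may not match iconv's exactly.

```python
from typing import List


def find_undecodable(data: bytes, encoding: str = "cp1254") -> List[int]:
    """Return the offsets of bytes that `encoding` cannot decode.

    Hypothetical helper: assumes a single-byte encoding, so each
    offending byte is skipped one at a time.
    """
    offsets = []
    pos = 0
    while pos < len(data):
        try:
            # Try to decode everything from the current position onward.
            data[pos:].decode(encoding)
            break  # the rest of the data decodes cleanly
        except UnicodeDecodeError as err:
            # err.start is relative to the slice, so shift it by pos.
            offsets.append(pos + err.start)
            pos += err.start + 1
    return offsets


if __name__ == "__main__":
    with open("MAIN.C", "rb") as handle:
        content = handle.read()
    for offset in find_undecodable(content):
        print(offset, hex(content[offset]))
```

If Python and iconv agree on which bytes are invalid, the list would presumably include position 198, along with any later offenders that iconv never reached because it stops at the first error.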
Removing non-printable characters from the file
sed $'s/[^[:print:]\t]//g' MAIN.C > OUT.C
did the trick.
file OUT.C
OUT.C: C source, Non-ISO extended-ASCII text, with LF, NEL line terminators
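For comparison, here is a rough Python sketch of the same kind of cleanup. It is an assumption-laden variant, not an exact equivalent of the sed command: sed's [:print:] class depends on the locale, while this version keeps only printable ASCII plus tab and newline, so it is stricter and would also drop any accented characters. The names strip_nonprintable and clean_file are made up for illustration.

```python
from pathlib import Path

# Bytes to keep: printable ASCII (0x20-0x7E), tab and newline.
# Assumption: anything else counts as "non-printable" here, which is
# stricter than sed's locale-dependent [:print:] class.
KEEP = frozenset(range(0x20, 0x7F)) | {0x09, 0x0A}


def strip_nonprintable(data: bytes) -> bytes:
    """Drop every byte that is not in KEEP."""
    return bytes(b for b in data if b in KEEP)


def clean_file(src: str, dst: str) -> None:
    """Read src as raw bytes, strip it, and write the result to dst."""
    Path(dst).write_bytes(strip_nonprintable(Path(src).read_bytes()))
```

Calling clean_file("MAIN.C", "OUT.C") would write a pure-ASCII copy. Working on raw bytes avoids having to guess the charset first, at the cost of losing any legitimate non-ASCII text in comments or strings.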