NuSphere Forums
Bug in file encoding detection *false alarm*


Joined: 08 Mar 2006
Posts: 63
PhpED 4513.

Every time I try to open a file with the extension .html, PhpED tries to use UTF-8 encoding, which fails since the file is actually ISO-8859-1. What is going on here? It is not because I previously opened it as UTF-8: I have even thrown away the fileenc.cfg file in the phped folder, to no avail. This happens both with Default file encoding set to ISO-8859-1 and with it left at System default.

This is a serious bug, since it means I cannot use PhpED on those files. OK, I could rewrite the fileenc.cfg file myself, but that is of course only a workaround, not a solution.
Posted by svenax
Site Admin

Joined: 13 Jul 2003
Posts: 8344
Could you please submit your file here?
Posted by dmitri


Joined: 08 Mar 2006
Posts: 63
Sure. I can't attach files here but, as I said, it happens with ALL .html files. Try this one, for instance:

Name: testing.html
Content: abcåäö

Here's a link: http://mywebsite/files/testing.html

The problem occurs when the file uses characters from the upper half of the ISO code page. Opening it then reports "Failed to translate from UTF-8 to UTF-16LE".
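For readers who want to reproduce the failure outside the IDE, here is a minimal Python sketch. It assumes the sample content above was saved as ISO-8859-1; the helper name `try_decode` is made up for illustration and has nothing to do with PhpED itself.

```python
# The sample content from the post, stored as ISO-8859-1 bytes.
text = "abcåäö"
raw = text.encode("iso-8859-1")
print(raw)  # b'abc\xe5\xe4\xf6'

def try_decode(data: bytes, encoding: str):
    """Return the decoded text, or None if the bytes are invalid in that encoding."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None

# The upper-half bytes 0xE5 0xE4 0xF6 do not form valid UTF-8 sequences.
print(try_decode(raw, "utf-8"))               # None
print(try_decode(raw, "iso-8859-1") == text)  # True
```

Any tool that insists on reading these bytes as UTF-8 has to either fail or substitute characters, which matches the error reported above.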
Posted by svenax
Site Admin

Joined: 13 Jul 2003
Posts: 8344
I can't replicate the problem here.
If an HTML file does not declare an encoding in its meta tag, the system default encoding is used.
If it declares, for example, UTF-8, that encoding should be detected and used by the IDE. See the example below:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">

If your file is ISO-8859-1, please change it to
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">

or remove it altogether, and make sure that fileenc.cfg does not still hold a previously selected encoding.
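The rule described in this post (use the charset from the meta tag if there is one, otherwise fall back to a default) can be sketched roughly as follows. This is only an illustration in Python, not PhpED's actual detection code; the function name `pick_encoding`, the regular expression, and the 4096-byte scan window are all assumptions.

```python
import re

# Deliberately loose pattern: finds charset=... inside a <meta ...> tag.
META_CHARSET = re.compile(rb"<meta[^>]*charset\s*=\s*[\"']?([\w-]+)", re.IGNORECASE)

def pick_encoding(data: bytes, default: str = "iso-8859-1") -> str:
    """Return the charset declared in a meta tag, or the default if none is found."""
    match = META_CHARSET.search(data[:4096])  # only scan the start of the file
    if match:
        return match.group(1).decode("ascii").lower()
    return default

with_meta = b'<meta http-equiv="Content-Type" content="text/html; charset=utf-8">'
fragment = b"<p>template fragment without any meta tag</p>"

print(pick_encoding(with_meta))  # utf-8
print(pick_encoding(fragment))   # iso-8859-1
```

A template fragment has no meta tag, so the result depends entirely on which default is passed in.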
Posted by dmitri


Joined: 08 Mar 2006
Posts: 63
No, that's not the case. I have the PhpED Default encoding set to iso-8859-1. Or do you mean the Windows default encoding?

I would imagine it is pretty common to work with files containing HTML fragments when using template-based systems, so not all files will contain encoding or charset attributes.

If PhpED could at least use its own Default file encoding in these situations, that would be good enough for now.
Posted by svenax


Joined: 08 Mar 2006
Posts: 63
Update: Perhaps it is my installation of PhpED that has gone crazy. Now I get a "Failed to translate from UTF-16LE to ISO-8859-1" when saving a perfectly normal PHP file that at this point only contains ASCII-7 characters.
Posted by svenax
Site Admin

Joined: 13 Jul 2003
Posts: 8344
Quote:
only contains ASCII-7 characters

In that case it would not raise any error boxes.
Try saving as UTF-8 and check whether there are any 8-bit characters.

(I know that it's hard to find problems when such transliteration errors happen, because line and position are not displayed, YET)
Posted by dmitri
Save failed *was* false alarm


Joined: 08 Mar 2006
Posts: 63
OK, I *did* actually have a spurious Unicode character in the file. As you said, it wasn't easy to find - the character was an en-dash, which looks exactly the same as a minus character when using a monospaced font.
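A stray character like this can also be located outside the IDE. The following Python sketch is just one possible approach; the function name `find_unencodable` and the sample PHP line are made up for illustration. It reports every character that the target encoding cannot represent, with its line and position.

```python
def find_unencodable(text: str, encoding: str = "iso-8859-1"):
    """Yield (line, position, char) for characters the target encoding cannot represent."""
    for line_no, line in enumerate(text.splitlines(), start=1):
        for position, ch in enumerate(line, start=1):
            try:
                ch.encode(encoding)
            except UnicodeEncodeError:
                yield line_no, position, ch

# Hypothetical source with U+2013 EN DASH where a minus sign was intended.
source = "<?php\n$total = $a \u2013 $b;\n"

for line_no, position, ch in find_unencodable(source):
    print(f"line {line_no}, position {position}: U+{ord(ch):04X}")
# line 2, position 13: U+2013
```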
Posted by svenax
Site Admin

Joined: 13 Jul 2003
Posts: 8344
Build 4514 will fill this gap. This version displays transliteration errors as shown below:
Posted by dmitri


Joined: 08 Mar 2006
Posts: 63
That is great!

Note that the problem with *opening* files still remains, though.
Posted by svenax
Site Admin

Joined: 13 Jul 2003
Posts: 8344
"Opening" and "saving" are both processed through the same procedures, and the error messages are very similar:
"Failed to read file in blahblahblah encoding, some characters will be replaced with "?" symbols, Line 27, Position 15, Character: 0xF2".
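A message of this shape can be approximated on the reading side as well. The Python sketch below is only an illustration and not how PhpED produces its message; the function name `describe_decode_error` and the sample bytes are assumptions. It finds the first byte that cannot be decoded and reports its line, its byte position within that line, and its value.

```python
def describe_decode_error(data: bytes, encoding: str = "utf-8"):
    """Return (line, position, byte) for the first undecodable byte, or None."""
    try:
        data.decode(encoding)
        return None
    except UnicodeDecodeError as err:
        line = data.count(b"\n", 0, err.start) + 1
        line_start = data.rfind(b"\n", 0, err.start) + 1
        position = err.start - line_start + 1  # 1-based byte offset within the line
        return line, position, data[err.start]

# Hypothetical ISO-8859-1 file content with one upper-half byte (0xF2) on line 2.
raw = "first line\nabc\u00f2\n".encode("iso-8859-1")

line, position, byte = describe_decode_error(raw)
print(f"Line {line}, Position {position}, Character: 0x{byte:02X}")
# Line 2, Position 4, Character: 0xF2
```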
Posted by dmitri


Joined: 08 Mar 2006
Posts: 63
The problem described earlier in this thread.
Posted by svenax