Detecting UTF-8 correctly *solved*
|
Site Admin
|
I would recommend splitting the mixed file(s) into separate files, each with its own encoding.
PhpED does not attempt to "detect" anything. It just follows your instructions. If you set a Default system encoding, it will be used.
BOM stands for Byte Order Mark, and it has no relation to single-byte encodings like ISO-8859-1 or UTF-8. If you need BOM'ed encodings, use the UTF-16 family, which has LE (little endian) and BE (big endian) variants.
|
It seems we have a serious misunderstanding here. Of course I am not trying to use different encodings in the same file. That is obviously impossible.
Well, if that is the case, how can PHPEd know that a file is UTF-8 after it has been saved as such (from within PHPEd)? It is still just a bunch of bytes.
No, I know that. I only mentioned that PHPEd doesn't add a BOM (which certainly can be used for UTF-8 files) to show that the file content hadn't changed at all. Even so, my UTF-8 files were not detected as such until I had saved them from within PHPEd.
|
Site Admin
|
PhpED remembered that a different encoding (UTF-8) was selected when you saved the file. The next time you open the file, it applies this encoding instead of the "system default".
A BOM mostly makes sense for encodings that use 2 or 4 bytes per symbol, and since UTF-8 is a single-byte encoding, I am not aware of a BOM being used with it. For example, many XML files are UTF-8 encoded. Have you ever seen any BOMs in them?
|
OK, so the answer to my question is no: PHPEdit doesn't know which files are UTF-8 encoded until they have been saved as such by the program itself. That is pretty inconvenient. Other editors can usually detect the encoding using some suitable heuristics (such as checking for a BOM).
Information about Byte Order Marks: http://www.unicode.org/faq/utf_bom.html#25
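As an illustration of the BOM heuristic mentioned above, here is a minimal Python sketch. It is not how PhpED or any particular editor works; the function name `detect_bom` and the returned encoding labels are assumptions made for the example. It only compares the first bytes of a file against the BOM constants from Python's standard `codecs` module.

```python
import codecs

# Order matters: the UTF-32 LE BOM begins with the same two bytes as the
# UTF-16 LE BOM, so the longer UTF-32 marks are checked first.
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def detect_bom(data: bytes):
    """Return (encoding, bom_length) if data starts with a known BOM, else None.

    detect_bom is a hypothetical helper name used only for this sketch.
    """
    for bom, name in BOMS:
        if data.startswith(bom):
            return name, len(bom)
    return None
```

A file without a BOM simply yields `None`, so a check like this can only confirm an encoding, never rule one out.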
|
Site Admin
|
First, phped. It's phped, not phpedit.
And it does not use a BOM, true. A BOM is really rarely used, so I personally do not think it's a big deal at all. If you have a UTF-8 file, just open it as UTF-8 (select UTF-8 in the appropriate combo box in the File Open dialog) and it will work fine. Regarding "suitable heuristics", do they work reliably and return correct results in all cases?
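On the question of how reliable such heuristics are, one common check is to attempt a strict UTF-8 decode. The Python sketch below is an illustration only; the function name `classify_bytes` and its three labels are assumptions for the example. The check is dependable in one direction: bytes that fail strict decoding are definitely not valid UTF-8. Bytes that pass and contain non-ASCII characters are very likely UTF-8, while pure ASCII content is ambiguous because it is valid in many encodings.

```python
def classify_bytes(data: bytes) -> str:
    """Classify data as 'ascii', 'utf-8' or 'unknown'.

    The labels are hypothetical and chosen only for this sketch.
    """
    try:
        # Pure ASCII is valid in ISO-8859-1, UTF-8 and many other encodings,
        # so it says nothing about which one the author intended.
        data.decode("ascii")
        return "ascii"
    except UnicodeDecodeError:
        pass
    try:
        # Strict decoding rejects malformed multi-byte sequences.
        data.decode("utf-8", errors="strict")
        return "utf-8"
    except UnicodeDecodeError:
        return "unknown"
```

For instance, the single byte `0xE9` ("é" in ISO-8859-1) is not a valid UTF-8 sequence, so it falls into the `'unknown'` bucket rather than being misread as UTF-8.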
|
Yes, yes, let's forget about BOMs for the moment. The problem is, as you can well understand, that I have a lot of files that must have their encoding set. I'd rather not open every single one of them from the File menu. Where is this information kept? Perhaps the encoding information that PHPEd uses can be updated directly.
And as to heuristics: I would say this is a pretty good indicator, don't you agree? <?xml version="1.0" encoding="utf-8"?>
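To make the XML declaration heuristic concrete, here is a small Python sketch that reads the `encoding` pseudo-attribute from the start of a file. It is an assumption-laden illustration, not any editor's actual logic: the names `XML_DECL` and `xml_declared_encoding` are invented for the example, and the pattern only works when the file is in an ASCII-compatible encoding (a UTF-16 file would not match).

```python
import re

# Matches an XML declaration at the very start of the data and captures the
# value of its encoding pseudo-attribute.
XML_DECL = re.compile(
    rb"""^<\?xml[^>]*?\bencoding\s*=\s*["']([A-Za-z][A-Za-z0-9._-]*)["']"""
)


def xml_declared_encoding(data: bytes):
    """Return the declared encoding in lower case, or None if none is declared.

    xml_declared_encoding is a hypothetical helper name for this sketch.
    """
    # Skip a UTF-8 BOM if one precedes the declaration.
    if data.startswith(b"\xef\xbb\xbf"):
        data = data[3:]
    match = XML_DECL.match(data)
    if match is None:
        return None
    return match.group(1).decode("ascii").lower()
```

The declaration only states what the author intended, so a careful tool would still verify that the bytes actually decode in the declared encoding.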
|
Site Admin
|
It is kept in fileenc.cfg.
No doubt about that.
|
Good. I'll have a look.
No, but PhpED can edit other file types too, can't it?
|
Site Admin
|
We released build 4510.
It now recognizes a BOM and the encoding specified in XML and HTML files. Thanks for pointing out the problem.
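For readers wondering what recognizing an encoding in HTML can look like, here is a rough Python sketch that searches the start of a file for a `charset` in a `<meta>` tag. This is not PhpED's implementation, which the thread does not describe; the names `META_CHARSET` and `html_declared_encoding` and the 1024-byte scan limit are assumptions for the example.

```python
import re

# Covers both <meta charset="..."> and the older
# <meta http-equiv="Content-Type" content="text/html; charset=..."> form.
META_CHARSET = re.compile(
    rb"""<meta[^>]+charset\s*=\s*["']?([A-Za-z0-9._-]+)""",
    re.IGNORECASE,
)


def html_declared_encoding(data: bytes, scan_limit: int = 1024):
    """Return the charset named in an HTML meta tag, in lower case, or None.

    html_declared_encoding and scan_limit are hypothetical names for this sketch.
    """
    # Only the beginning of the file is inspected, where <head> normally sits.
    match = META_CHARSET.search(data[:scan_limit])
    if match is None:
        return None
    return match.group(1).decode("ascii").lower()
```

A regular expression is a simplification here, since it can be fooled by a `<meta>` tag inside a comment or script, so this should be read as a heuristic rather than a parser.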
Content © NuSphere Corp., PHP IDE team
Powered by phpBB © phpBB Group, Design by phpBBStyles.com | Styles Database.