
[JDEV] International Char Sets..



  I think these questions from the expat FAQ may add some useful information to this conversation.

---
How can I get expat to deal with non-ASCII characters?

By default, expat assumes that documents are encoded in UTF-8. In UTF-8, ASCII characters are represented by a single byte, as they would be in ASCII, but non-ASCII characters are represented by a sequence of two or more bytes, all with the 8th bit set. The encoding most widely used for European languages is ISO 8859-1, which is not compatible with UTF-8 for non-ASCII characters. To use this encoding, expat must be told so, either by supplying an argument of "iso-8859-1" to XML_ParserCreate, or by starting the document with <?xml version="1.0" encoding="iso-8859-1"?>.
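
As a rough illustration of the two options above, here is a small sketch using Python's standard xml.parsers.expat module, which wraps expat. Its ParserCreate(encoding) call plays the role of the encoding argument to XML_ParserCreate. The helper name parse_text and the sample bytes are made up for this example; they do not come from the FAQ.

```python
import xml.parsers.expat


def parse_text(data, encoding=None):
    """Parse raw bytes with expat and return the character data it reports.

    parse_text is a hypothetical helper written for this example.
    """
    parser = xml.parsers.expat.ParserCreate(encoding)
    chunks = []
    # expat may deliver text in several pieces, so collect and join them.
    parser.CharacterDataHandler = chunks.append
    parser.Parse(data, True)
    return "".join(chunks)


# "cafe" with an accented e, encoded as ISO 8859-1.
# The lone byte 0xE9 is not a valid UTF-8 sequence.
latin1_doc = b"<a>caf\xe9</a>"

# Option 1: tell the parser the encoding when creating it.
text_from_argument = parse_text(latin1_doc, "iso-8859-1")

# Option 2: declare the encoding at the start of the document.
declared_doc = b'<?xml version="1.0" encoding="iso-8859-1"?>' + latin1_doc
text_from_declaration = parse_text(declared_doc)

# Without either hint, expat assumes UTF-8 and rejects the document.
try:
    parse_text(latin1_doc)
    default_rejected = False
except xml.parsers.expat.ExpatError:
    default_rejected = True
```

Both options should yield the same text; the last case shows why the hint is needed at all when the input is not UTF-8.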

What encodings does expat support?

expat has built in support for the following encodings:

utf-8
utf-16
iso-8859-1
us-ascii

Additional encodings can be supported by using XML_SetUnknownEncodingHandler.
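
For what it is worth, Python's xml.parsers.expat module does not expose XML_SetUnknownEncodingHandler directly. As far as I can tell, the CPython binding registers such a handler internally and fills in the byte-to-character map from Python's own codec tables, so single-byte encodings outside the built-in four can still be parsed. The sketch below relies on that assumption; parse_declared and the sample document are hypothetical.

```python
import xml.parsers.expat


def parse_declared(data):
    """Parse bytes using whatever encoding the document itself declares.

    parse_declared is a hypothetical helper written for this example.
    """
    parser = xml.parsers.expat.ParserCreate()
    chunks = []
    # Text can arrive in several callbacks, so collect and join the pieces.
    parser.CharacterDataHandler = chunks.append
    parser.Parse(data, True)
    return "".join(chunks)


# A few non-ASCII bytes to be interpreted as ISO 8859-2, which is not one
# of expat's four built-in encodings.
raw = b"\xb3\xf3d\xbc"
doc = b'<?xml version="1.0" encoding="iso-8859-2"?><a>' + raw + b"</a>"

text = parse_declared(doc)

# Compare against Python's own decoding of the same bytes.
matches_codec = (text == raw.decode("iso-8859-2"))
```

In C, the equivalent would be to call XML_SetUnknownEncodingHandler on the parser and fill in the encoding map yourself.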


---
Thomas Charron

