
Re: [JDEV] Character Encodings and Languages thread



> Jer, you noted that it's not possible to change encodings on the fly. :) I
> would like to respectfully point out that it would be perfectly possible
> (and could even simplify the Jabber parse routines) if we modified the
> Jabber protocol to be one-document-to-one-packet instead of using a
> one-document-to-many-packets approach. This way, expat switches between
> encodings on the fly.

Yes, the thought ran through my head, again :)

That approach is very similar to how the protocol originally worked long,
long ago, but we moved to a more streamed-document approach because of
various issues.

The heart of the matter is that if you stream multiple XML "chunks"
together, you need some way of identifying the break between them, and
this is difficult to do when each chunk can contain just about any binary
pattern (based on the allowable characters and encodings within each
chunk).
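
To make the problem concrete, here is a rough sketch of one way the break
between chunks could be marked: prefix each chunk with its byte length, so
the reader never has to scan the payload for a delimiter. This is only an
illustration, not anything the Jabber protocol does; the 4-byte big-endian
header and the frame/unframe names are assumptions made up for this example.

```python
import struct

# Hypothetical framing: 4-byte big-endian length, then the raw chunk bytes.
def frame(chunk: bytes) -> bytes:
    """Prefix one already-encoded XML chunk with its byte length."""
    return struct.pack(">I", len(chunk)) + chunk

def unframe(stream: bytes):
    """Yield each chunk from a buffer of concatenated frames."""
    pos = 0
    while pos < len(stream):
        if pos + 4 > len(stream):
            raise ValueError("truncated length header")
        (size,) = struct.unpack_from(">I", stream, pos)
        pos += 4
        if pos + size > len(stream):
            raise ValueError("truncated chunk")
        yield stream[pos:pos + size]
        pos += size

# Two chunks in different encodings; their bytes share no safe delimiter.
chunks = [
    '<?xml version="1.0" encoding="UTF-8"?><message>hello</message>'.encode("utf-8"),
    '<?xml version="1.0" encoding="UTF-16"?><message>hello</message>'.encode("utf-16"),
]
wire = b"".join(frame(c) for c in chunks)
decoded = list(unframe(wire))
```

Each recovered chunk could then be handed to a fresh parser instance, which
is what would let the encoding change from one chunk to the next. The cost
is the extra framing layer on both ends, which is the added complexity
mentioned below.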

From what I understand, the W3C is working on a proposal for doing just
this: streaming/packaging chunks of XML in a standard format.  When a
recommendation comes out on this topic I think we will want to revisit this
approach, but it's probably not worth the added complexity and time it
would require at this point.

> Just out of curiosity, how does AIM handle this? Or ICQ for that matter?

That part I'm not sure about... anyone else?

Jer