On Mon, 13 Oct 2003, S. wrote:
Quote:
if in my website i am using the sgml &#123; notation, is it accurate
to say to my users that the site uses unicode or that it requires
unicode?
Yes, no, maybe. HTML4 and later have "Unicode" (well, to be pedantic
iso-10646) as their character set, so - potentially - any HTML4 client
agent has to at least _understand_ the use of that character set. The
specification doesn't actually require every client agent to be able
to *render* the entire character set - that would hardly be practical.
Anyhow, &#123; is an ascii character ;-)
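To illustrate that last remark, here is a minimal Python sketch (Python is my choice for illustration, not something the original question mentioned). It assumes the reference in question is the decimal numeric character reference &#123; and uses the standard library's html.unescape to resolve it:

```python
import html

# Resolve the numeric character reference to the character it names.
ch = html.unescape("&#123;")

# The number in the reference is the code point itself (decimal 123).
code_point = ord(ch)

# Code points below 128 are the ASCII range.
is_ascii = code_point < 128

print(repr(ch), code_point, is_ascii)
```

The number inside the reference is a code point in the document character set, whatever encoding the page itself is served in.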
Quote:
is there a mathematical formula to calculate a unicode value given its
utf8 value?
Sure: but if you needed to ask, I doubt that you'd want to program it
yourself. Why don't you ask the question about what you _really_ want
to achieve, rather than this detail which probably isn't really going
to help?
You do understand, don't you, that utf-8 is one of the recommended
encodings of Unicode/iso-10646? Maybe a bit of browsing around
www.unicode.org (obvious as it might seem) would help you to put the
details into context.
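For the curious, the arithmetic behind that "formula" is short: the lead byte says how many bytes the sequence has and carries the top bits, and each continuation byte (of the form 10xxxxxx) contributes six more bits. What follows is only a sketch in Python under those assumptions; the function name utf8_to_code_point is made up for illustration, and it does not reject overlong forms or surrogates the way a strict decoder would.

```python
def utf8_to_code_point(data: bytes) -> int:
    """Decode exactly one UTF-8 sequence (1 to 4 bytes) to its code point.

    Illustrative only: overlong encodings and surrogates are not rejected.
    """
    if not data:
        raise ValueError("empty input")
    first = data[0]
    if first < 0x80:               # 0xxxxxxx: 1 byte, 7 payload bits
        length, value = 1, first
    elif first >> 5 == 0b110:      # 110xxxxx: 2 bytes, 5 payload bits
        length, value = 2, first & 0x1F
    elif first >> 4 == 0b1110:     # 1110xxxx: 3 bytes, 4 payload bits
        length, value = 3, first & 0x0F
    elif first >> 3 == 0b11110:    # 11110xxx: 4 bytes, 3 payload bits
        length, value = 4, first & 0x07
    else:
        raise ValueError("invalid lead byte")
    if len(data) != length:
        raise ValueError("wrong number of bytes for this lead byte")
    for byte in data[1:]:
        if byte >> 6 != 0b10:      # continuation bytes look like 10xxxxxx
            raise ValueError("invalid continuation byte")
        value = (value << 6) | (byte & 0x3F)  # append 6 payload bits
    return value


# Compare against Python's own decoder for a few characters.
for ch in ["{", "\u00e9", "\u20ac", "\U0001F600"]:
    assert utf8_to_code_point(ch.encode("utf-8")) == ord(ch)
```

In practice you would let the language's built-in decoder do this, which is the point of the advice that follows.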
Perl (5.8.0 or later) understands this stuff internally, so
if you talk to it nicely, it'll do anything you need. Sure, there are
plenty of other ways too. But your question is in one sense too vague
(no proper context) and in another sense too specific (you asked a
question to which the answer can only be "yes", but we don't know how
that can help you to achieve your real aims).
good luck