HighDots Forums  

Re: mime charset problem

Brandy Red
 

Re: mime charset problem - 05-26-2008, 11:31 AM






On 25 May, 20:55, "Martin Nadoll" <mar... (AT) nadoll (DOT) de> wrote:
Quote:
I will check all this advice.
Thanks a lot for spending so much time on my problem.
Looks like you are very capable of handling all these issues.
A lot of my content comes from a database, so there I have many of those
German characters.
That's why I thought it's not a good idea to switch to UTF-8.

I'm from Norway. In Norway we also have some characters that don't
match with UTF-8. In Norway it's primarily æ, ø and å. Should I stick
to ISO-8859-1 or move to UTF-8? That might seem like a hard question,
but it is really simple. Every US letter in UTF-8 is encoded with
one "letter";
most European letters, like those from German and Norwegian, are encoded
with two "letters". Your letters (äöü), the double s (ß), and so on are
encoded with
two "letters". Just one little image will make the difference
unimportant.



Jukka K. Korpela
 

Re: mime charset problem - 05-26-2008, 12:51 PM






Scripsit Brandy Red:

Quote:
On 25 May, 20:55, "Martin Nadoll" <mar... (AT) nadoll (DOT) de> wrote:
- -
A lot of my content comes from a database, so there I have many of
those German characters.
That's why I thought it's not a good idea to switch to UTF-8.

I'm from Norway.

A small world, isn't it? This little peninsula of Asia that we call
"Europe" is inhabited by interesting people, even though it is
linguistically relatively uniform. A collection of only about 1,000
characters (the so-called Multilingual European Subset 2, MES-2) covers
virtually all letters and punctuation used in European languages. But I
digress.

First I'd like to mention that data coming from a database tends to be
in the encoding used by the database, but quite often you can easily convert
it to a different encoding. Even PHP, which is primitive in many ways
and doesn't really support Unicode, has tools for converting from
ISO-8859-1 to UTF-8 (which is a fairly trivial conversion anyway).
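
To illustrate why the conversion is considered trivial, here is a minimal Python sketch (Python rather than PHP, purely for illustration). Every ISO-8859-1 octet equals its Unicode code point, so octets below 0x80 are copied and the rest become exactly two octets. The function name `latin1_to_utf8` and the sample string are assumptions made up for this example, not anything from the thread.

```python
def latin1_to_utf8(data: bytes) -> bytes:
    """Re-encode ISO-8859-1 octets as UTF-8 by hand (illustrative sketch)."""
    out = bytearray()
    for byte in data:
        if byte < 0x80:
            # ASCII range: identical in both encodings.
            out.append(byte)
        else:
            # Code points U+0080..U+00FF: two octets, 110xxxxx 10xxxxxx.
            out.append(0xC0 | (byte >> 6))
            out.append(0x80 | (byte & 0x3F))
    return bytes(out)


# Hypothetical value as it might come out of an ISO-8859-1 database.
row = "Grüße fra Tromsø".encode("iso-8859-1")

# The hand-written loop agrees with the built-in codecs.
assert latin1_to_utf8(row) == row.decode("iso-8859-1").encode("utf-8")
```

In practice one would simply use the built-in codecs (the `decode`/`encode` pair in the last line); the loop is only there to show how little work is involved.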

Quote:
In Norway we also have some characters that don't
match with UTF-8.

Pardon? I was very confused... but I think you mean characters that have
a different representation in UTF-8 than in ISO-8859-1.

Quote:
In Norway it's primarily æ, ø and å.

Right. And perhaps some punctuation marks, though some of those don't
exist in ISO-8859-1 at all.

Quote:
Should I stick to
ISO-8859-1 or move to UTF-8? That might seem like a hard question,
but it is really simple.

It depends.

Quote:
Every US letter in UTF-8 is encoded with
one "letter";
most European letters, like those from German and Norwegian, are encoded
with two "letters". Your letters (äöü), the double s (ß), and so on are
encoded with
two "letters". Just one little image will make the difference
unimportant.

That sounds very confusing, but there is a point behind it. What you
really mean is that letters like æ, ø, å, ä, ö, ü, ß etc. each occupy
one octet (one 8-bit byte) in ISO-8859-1, two octets in UTF-8, and that
this difference is not very important in terms of efficiency.
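
The octet counts can be checked directly. The following Python sketch measures each of the letters mentioned above in both encodings and then estimates the size overhead for a piece of running text. The helper `utf8_overhead` and the Norwegian sample sentence are made up for this illustration; real pages will differ, and markup and images push the relative overhead down further.

```python
letters = "æøåäöüß"

# Each of these letters takes one octet in ISO-8859-1 and two in UTF-8.
for ch in letters:
    n_latin1 = len(ch.encode("iso-8859-1"))
    n_utf8 = len(ch.encode("utf-8"))
    print(f"{ch}: {n_latin1} octet in ISO-8859-1, {n_utf8} octets in UTF-8")


def utf8_overhead(text: str) -> float:
    """Relative size increase of UTF-8 over ISO-8859-1 for the given text."""
    latin1_size = len(text.encode("iso-8859-1"))
    utf8_size = len(text.encode("utf-8"))
    return (utf8_size - latin1_size) / latin1_size


# Made-up sample sentence; only the non-ASCII letters cost an extra octet.
sample = "Blåbærsyltetøy og rømmegrøt smaker godt på fjellet."
print(f"overhead for the sample: {utf8_overhead(sample):.1%}")
```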

The real issues are elsewhere. Can you work with UTF-8 in your authoring
software? Can other people who edit the pages later do the same? Can you
change the Content-Type header sent by the server? And so on. The
efficiency impact is mostly negligible.
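
On the Content-Type question: what matters is that the charset declared in the header is the encoding the body is actually sent in. Here is a minimal Python sketch of that pairing, assuming a hand-built HTTP response; `build_http_response` is a hypothetical helper for illustration only, and on a real server the header is normally set through the server's or framework's own configuration.

```python
def build_http_response(html: str, charset: str = "utf-8") -> bytes:
    """Build a raw HTTP response whose declared charset matches the body.

    Hypothetical helper: the point is only that the same `charset`
    value is used both for encoding and for the header.
    """
    body = html.encode(charset)
    head = "\r\n".join([
        "HTTP/1.1 200 OK",
        f"Content-Type: text/html; charset={charset}",
        f"Content-Length: {len(body)}",  # length in octets, not characters
        "",
        "",
    ])
    return head.encode("ascii") + body


page = "<p>æ, ø and å</p>"
as_utf8 = build_http_response(page)
as_latin1 = build_http_response(page, charset="iso-8859-1")

# The UTF-8 response is three octets longer in the body (one per letter).
print(len(as_utf8), len(as_latin1))
```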

--
Jukka K. Korpela ("Yucca")
http://www.cs.tut.fi/~jkorpela/


