#11
David Mark wrote:
> On Jan 13, 6:11 pm, Bart Van der Donck <b... (AT) nijlen (DOT) com> wrote:
>> It is the browser itself who silently converts \n (or \r) into \r\n, before the data is sent to the server. The script at the server only reads out what was offered.
> But the database should store in a predetermined canonical form, regardless of what the browser says. Whether that is \n, \n\r or \r is up to the DBA.

You probably mean '\r\n' instead of '\n\r'. I would say that it's rather up to the operating system. I haven't seen a case where the DBA interferes with these OS settings when it comes to _storing_ data. From http://en.wikipedia.org/wiki/Newline:

\n: Multics, Unix and Unix-like systems (GNU/Linux, AIX, Xenix, Mac OS X, etc.), BeOS, Amiga, RISC OS, and others
\r\n: DEC RT-11 and most other early non-Unix, non-IBM OSes, CP/M, MP/M, DOS, OS/2, Microsoft Windows
\r: Commodore machines, Apple II family and Mac OS up to version 9

http://www.rfc-editor.org/EOLstory.txt says:

| ASCII text (ed.: like percent-encoded form-data) transmitted across
| the network *must* use the two-character sequence: CR LF (ed.: \r\n).

>> I don't agree with your suggestion to store end-of-line characters as \n by force; I would always store \r\n, as offered by the browser.
> As offered by which browser? As mentioned, some don't send \r\n.

When a browser doesn't send '\r\n', it violates RFC (see quotation above from http://www.rfc-editor.org/EOLstory.txt). The word *must* means:

| MUST  This word, or the terms "REQUIRED" or "SHALL", mean that the
| definition is an absolute requirement of the specification.

http://www.faqs.org/rfcs/rfc2119.html

One can safely conclude that a browser which doesn't send '\r\n' is a bad browser.
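
To illustrate the point that the server-side script only reads out what was offered, here is a minimal Python sketch. The field name `comment` and the helper `decode_comment` are made up for the example; it assumes the TEXTAREA arrives as ordinary percent-encoded form data, with the line break sent as `%0D%0A`.

```python
from urllib.parse import parse_qs


def decode_comment(body: str) -> str:
    """Decode a percent-encoded form body and return the 'comment' field.

    'comment' is a hypothetical field name used only for this sketch.
    """
    fields = parse_qs(body)
    # parse_qs returns a list of values per field; take the first one.
    return fields["comment"][0]


# A conforming browser sends the TEXTAREA line break as %0D%0A (CR LF).
submitted = "comment=first+line%0D%0Asecond+line"
text = decode_comment(submitted)
print(repr(text))
```

Nothing in the decoding step changes the line ending: whatever the browser put on the wire is what the script sees.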
#12
I re-read the OP as I thought it had implied that some browsers were sending \n alone. If they all send \r\n and a text field is used in the database (which would likely be the norm in this case), then you are right on all counts.
#13
I have a related question. Many of my webpages use simple flat files as their "database" with one line added per transaction. This is fine until the data to be stored comes from a TEXTAREA, because that can contain embedded CRLF/CR/LF sequences which would screw up the lines in my file.

I've adopted the convention of converting CRLF or CR or LF into x'0102' on the assumption that no one (certainly no one in their right mind) will ever enter hex 01 or 02 characters into a text area.

I'm curious to know if anyone sees a problem with this; I've not encountered one in many years of practice.
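
For readers who want to see the convention spelled out, a rough Python sketch follows. The names `encode_field` and `decode_field` are invented for illustration, and rejecting input that already contains x'01' or x'02' is an added assumption, not part of the convention as described.

```python
import re

# Two-character stand-in for a line break, following the x'0102' convention.
LINE_MARK = "\x01\x02"

# CRLF is listed first so it is replaced as a single line break.
_NEWLINE = re.compile(r"\r\n|\r|\n")


def encode_field(text: str) -> str:
    """Replace every CRLF, lone CR, or lone LF with the marker."""
    # Assumption for this sketch: refuse input that already holds a marker
    # character, since it could not be told apart from an encoded line break.
    if "\x01" in text or "\x02" in text:
        raise ValueError("input already contains a marker character")
    return _NEWLINE.sub(LINE_MARK, text)


def decode_field(stored: str, eol: str = "\n") -> str:
    """Turn the marker back into a line ending when reading the file."""
    return stored.replace(LINE_MARK, eol)
```

The encoded value contains no CR or LF, so it can be written as one line of the flat file; `decode_field` restores the breaks with whatever line ending the reader wants.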
#14
Checking for a separate CR or LF should not be necessary; CR+LF should be enough. Newlines in a TEXTAREA that are not transmitted as '\r\n' are in violation of the RFC.
#15
Bart Van der Donck wrote:
> Checking on a separate CR or LF is not necessary; CR+LF should be enough. Newlines in a TEXTAREA which are not transmitted as '\r\n', are in violation of RFC.

Bart,

Thank you for confirming what I'd noticed in practice. I do, however, have a few examples where single x'0A' characters have found their way into my data files, and since this is the line-end sequence on my Linux server, it caused problems.

I checked my code till I was blue in the face, and never found any way this could happen unless a browser had submitted an x'0A' as a line end from a TEXTAREA control. Of course, I have no control over what strange browsers people might be using, so I took the pragmatic approach of translating both x'0A' and x'0D' to my x'0102' "line-end" sequence. There have been no re-occurrences of the problem.
#16
Steve Swift wrote:
> Bart Van der Donck wrote:
>> Checking on a separate CR or LF is not necessary; CR+LF should be enough. Newlines in a TEXTAREA which are not transmitted as '\r\n', are in violation of RFC.
> ....

I'm thinking of 4 possibilities:
#17
> I checked my code 'till I was blue in the face, and never found any way this could happen unless a browser had submitted an x'0A' as a linend from a TEXTAREA control. Of course, I have no control over what strange browsers people might be using, so I took the pragmatic approach of translating both x'0A' and x'0D' to my x'0102' "line-end" sequence. There have been no re-occurrences of the problem.

I'm just waiting for the browser that sends x'0A0D' now, but hope to retire before that occurs. :-)
#18
One needs an algorithm to convert bad newlines to good ones. |
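
One possible shape for such an algorithm, as a sketch only: treat CRLF, a lone CR, and a lone LF each as one line break and rewrite all of them to a single chosen form. The function name `normalize_newlines` and the default of CR LF are assumptions made for this example.

```python
import re

# CRLF must come first in the alternation so it counts as one line break
# rather than a CR followed by a separate LF.
_ANY_NEWLINE = re.compile(r"\r\n|\r|\n")


def normalize_newlines(text: str, eol: str = "\r\n") -> str:
    """Rewrite every CRLF, lone CR, or lone LF as the chosen line ending."""
    return _ANY_NEWLINE.sub(eol, text)


mixed = "one\ntwo\rthree\r\nfour"
print(repr(normalize_newlines(mixed)))
print(repr(normalize_newlines(mixed, eol="\n")))
```

Note that this sketch reads an x'0A0D' (LF CR) pair as two line breaks, not one; a sender using that order would need its own rule.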