#21
Thomas 'PointedEars' Lahn <PointedE... (AT) web (DOT) de> writes:
> Lasse Reichstein Nielsen wrote:
>> var re = /('(?:[^']|\\')*')/g;
>> alert(re.exec(code)[0]);
>
> It alerts the string "'abc\\'", i.e., it does end at the first "'", even if the quote is escaped.

The reason it does so is that [^'] matches a backslash as well, and it has higher priority than the alternative that comes after it, so the backslash is consumed before \\' gets a chance to match. The immediate fix is to swap the alternatives:

    var re = /('(?:\\'|[^'])*')/g;

Giving \\' priority over [^'] will match "\\'" as a non-string-ender, but it will also skip over the quote in "\\\\'", where the backslash itself is escaped and the quote really does end the string. It's necessary to know whether there is an even number of backslashes before the quote in order to know whether it's escaped or not. The RegExp below is the simplest one I have found to do that.
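To make the failure mode described above concrete, here is a small sketch that runs both orderings of the alternation against two inputs. The sample strings and variable names are made up for illustration and are not taken from the thread.

```javascript
// Hypothetical sample inputs; the trailing comments show the raw characters.
var escapedQuote = "'abc\\' def'";        // 'abc\' def'   (the inner quote is escaped)
var escapedBackslash = "'abc\\\\' def'";  // 'abc\\' def'  (the backslash is escaped, the quote ends the literal)

var original = /('(?:[^']|\\')*')/;  // [^'] is tried first
var swapped  = /('(?:\\'|[^'])*')/;  // \\' is tried first

// [^'] swallows the backslash, so the match ends at the escaped quote.
var originalOnQuote = original.exec(escapedQuote)[0];        // 'abc\'

// With the alternatives swapped, the escaped quote is skipped as intended...
var swappedOnQuote = swapped.exec(escapedQuote)[0];          // 'abc\' def'

// ...but a quote after an escaped backslash is skipped too, so the match overruns.
var swappedOnBackslash = swapped.exec(escapedBackslash)[0];  // 'abc\\' def'
```

The last case is the one that needs the even/odd count of backslashes: the literal should have ended after `'abc\\'`.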
    /* 'foo \\' */
    var code = "'foo \\\\' '";

    /* ["'foo \\'", "'foo \\'"] */
    /('(?:[^']|\\')*')/.exec(code)

Glad to be of service.

ECMAScript syntax is ... interesting. Context-dependent lexing combined with semicolon insertion gives ample room to make mistakes:

    var b=2,g=1;
    var a = 84 /b/g; // <- it's division

#22
On Nov 6, 6:36 pm, Lasse Reichstein Nielsen <lrn.unr... (AT) gmail (DOT) com> wrote:
> Thomas 'PointedEars' Lahn <PointedE... (AT) web (DOT) de> writes:
>> Lasse Reichstein Nielsen wrote:
>>> var re = /('(?:[^']|\\')*')/g;
>>> alert(re.exec(code)[0]);
>>
>> It alerts the string "'abc\\'", i.e., it does end at the first "'", even if the quote is escaped.

The above recognizes everything from one single quote to the next single quote in the string. One may just as well write

    var re = /('.*?')/g
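As a quick check of that comparison, the sketch below runs both patterns on one made-up input (the sample strings and variable names are assumptions for illustration). Both stop at the first quote after the opening one. They are not fully interchangeable, though: [^'] also matches a line break, while . does not.

```javascript
// Hypothetical input; raw characters: 'abc\' def'
var code = "'abc\\' def'";

var fromThread = /('(?:[^']|\\')*')/.exec(code)[0];  // 'abc\'
var lazy = /('.*?')/.exec(code)[0];                  // 'abc\'
var sameMatch = fromThread === lazy;

// A quoted run that spans a line break separates the two patterns.
var multiLine = "'a\nb'";
var classAcrossLines = /('(?:[^']|\\')*')/.test(multiLine);  // [^'] matches "\n"
var lazyAcrossLines = /('.*?')/.test(multiLine);             // . does not match "\n"
```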
#23
It is really merely an issue to recognize and ignore string literals first, then to recognize and ignore RegExp initializers outside of them. My replace function already implements the former; adapting it to also take care of the latter is left as an exercise to the reader.
#24
Thomas 'PointedEars' Lahn wrote:
> It is really merely an issue to recognize and ignore string literals first, then to recognize and ignore RegExp initializers outside of them. My replace function already implements the former; adapting it to also take care of the latter is left as an exercise to the reader.

Your replace function so far converts a syntactically correct source into a syntactically incorrect one:

    /foobar//foobar

becomes

    /foobar

which is an "unterminated regular expression literal".

P.S. It is a bit of fun to watch people making a robust parser algorithm for an algorithmically unparseable matter.

But keep going, I have more...
#25
> and giving \\' priority over [^'], will match "\\'" as a non-string-ender, but will also ignore "\\\\'". It's necessary to know whether there is an even number of backslashes before the quote in order to know whether it's escaped or not. The RegExp below is the simplest one I have found to do that.

Lasse, to me, the RegExp below looks identical to the first one above. So, since I cannot see it, here is a regular expression that recognizes single-quoted strings. It will match from a single quote to the next single quote not preceded by an odd number of backslashes.

    var re = /'(?:\\.|[^\\'])*'/g
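A short sketch of how that pattern behaves, using made-up inputs (the sample strings and variable names are assumptions for illustration). The idea is that \\. always consumes a backslash together with the character after it, so backslashes are eaten in pairs and a quote only closes the literal when no unpaired backslash precedes it.

```javascript
var re = /'(?:\\.|[^\\'])*'/g;

// Raw characters: 'abc\' def'   -> one literal, the inner quote is escaped.
var withEscapedQuote = "'abc\\' def'".match(re);

// Raw characters: 'abc\\' def'  -> the literal ends right after the escaped backslash.
var withEscapedBackslash = "'abc\\\\' def'".match(re);

// Raw characters: x = 'a' + 'b\'c';  -> two separate literals.
var twoLiterals = "x = 'a' + 'b\\'c';".match(re);
```

In the second case the trailing `def'` is left unmatched, because the single remaining quote has no partner.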
> var b=2,g=1;
> var a = 84 /b/g; // <- it's division

This is highly interesting: the interpretation of that final line also depends on what comes before it. For example:

    var b=2,g=1;
    var a = 84; /b/g; // <- it's a regular expression

or

    whole(truth) /b+c/g; // division

vs.

    while(truth) /b+c/g; // RegExp

I wonder about other examples of (non-embedded) code being interpreted differently depending on what precedes it.
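One way to observe the two readings is to hand both source texts to eval and look at the value each produces. This is only a sketch: the use of eval and the names asDivision and asRegExp are my own choices for illustration.

```javascript
// After "84" the slash continues the expression: (84 / b) / g.
var asDivision = eval("var b = 2, g = 1; var a = 84 /b/g; a");

// After "84;" a new statement starts, so "/b/g" is lexed as a RegExp literal.
// eval returns the value of the last expression statement it evaluated.
var asRegExp = eval("var b = 2, g = 1; var a = 84; /b/g;");

var divisionIsNumber = typeof asDivision === "number";  // the quotient
var regExpIsRegExp = asRegExp instanceof RegExp;        // the pattern b with the g flag
```

The only difference between the two source strings is the semicolon after 84.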
Also, while your example of [^] works on my FF1.5, it does not compile on my IE 6. I.e., adding

    var re=/[^]/;

results in an error message from IE.
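If the goal of [^] is "any character, line breaks included", a commonly used spelling that avoids the empty negated class is [\s\S]. The sketch below also probes for [^] support at run time; it assumes that an engine which rejects the pattern reports it as a catchable exception from the RegExp constructor, and all names in it are made up for illustration.

```javascript
// Build the pattern from a string so that an engine rejecting [^] can fail
// here, inside the try block, rather than while parsing the whole script.
var supportsEmptyNegatedClass;
try {
  new RegExp("[^]");
  supportsEmptyNegatedClass = true;
} catch (e) {
  supportsEmptyNegatedClass = false;
}

// [\s\S] ("whitespace or non-whitespace") matches any single character,
// line breaks included, without relying on the empty negated class.
var anyChar = /[\s\S]/;
var anyCharMatchesNewline = anyChar.test("\n");
var dotMatchesNewline = /./.test("\n");
```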
