Remove trailing comments exercise

#21
Csaba Gabor
Re: Remove trailing comments exercise - 11-07-2009, 05:21 AM

On Nov 6, 6:36 pm, Lasse Reichstein Nielsen <lrn.unr... (AT) gmail (DOT) com>
wrote:
Quote:
Thomas 'PointedEars' Lahn <PointedE... (AT) web (DOT) de> writes:
Lasse Reichstein Nielsen wrote:
var re = /('(?:[^']|\\')*')/g;
alert(re.exec(code)[0]);

It alerts the string "'abc\\'", i.e., it does end at the first
"'", even if the quote is escaped.

The above recognizes from one single quote to
the final single quote in the string. One may
just as well write var re = /('.*?')/g

Quote:
The reason it does so is that [^'] matches backslash as well, and
with a higher priority than what comes after, so it matches the
backslash as well.

The immediate fix of swapping the alternatives:
var re = /('(?:\\'|[^'])*')/g;

The above recognizes from a single quote to the next single
quote not preceded by a backslash, if such a single quote
exists; otherwise to the last single quote. To observe:

var code = "abc'def\\'ghi'jkl\\\\'mno\\\\'pqr";
var re = /'(?:\\'|[^'])*'/g
alert (code.replace(re, "XXX"));

Quote:
and giving \\' priority over [^'], will match "\\'" as a non-string-ender,
but will also ignore "\\\\'". It's necessary to know whether there is an
even number of backslashes before the quote in order to know whether it's
escaped or not. The RegExp below is the simplest one I have found to do that.

Lasse, to me, the RegExp below looks identical to the first
one above. So, since I cannot see it, here is a
regular expression that recognizes single-quoted strings.
It will match from a single quote to the next single quote
not preceded by an odd number of backslashes.

var re = /'(?:\\.|[^\\'])*'/g
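To make the difference between the three patterns discussed so far concrete, here is a minimal sketch. The variable names and the two sample inputs are my own assumptions for illustration, not taken from the posts; the patterns are the ones quoted above, without the capturing group and the g flag.

```javascript
// The three candidate patterns for a single-quoted string literal.
var naive   = /'(?:[^']|\\')*'/;     // [^'] is tried first and swallows the backslash
var swapped = /'(?:\\'|[^'])*'/;     // \' is tried before [^']
var robust  = /'(?:\\.|[^\\'])*'/;   // a backslash always consumes the next character

// Hypothetical helper: first match of each pattern against the same text.
function firstMatches(text) {
  return [naive, swapped, robust].map(function (re) {
    return re.exec(text)[0];
  });
}

// Source text: 'abc\'def' x   (an escaped quote inside the literal)
// naive: 'abc\'    swapped: 'abc\'def'    robust: 'abc\'def'
var escapedQuote = firstMatches("'abc\\'def' x");

// Source text: 'foo \\' 'bar'   (the literal ends with an escaped backslash)
// naive: 'foo \\'    swapped: 'foo \\' '    robust: 'foo \\'
var escapedBackslash = firstMatches("'foo \\\\' 'bar'");

console.log(escapedQuote);
console.log(escapedBackslash);
```

Only the last pattern gives the intended match on both inputs: the first stops too early at an escaped quote, and the swapped one runs past a quote that follows an even number of backslashes.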

Quote:
/* 'foo \\' */
var code = "'foo \\\\' '";

/* ["'foo \\'", "'foo \\'"] */
/('(?:[^']|\\')*')/.exec(code)

Glad to be of service
ECMAScript syntax is ... interesting. Context depending lexing combined
with semicolon-insertion gives ample room to make mistakes

var b=2,g=1;
var a = 84
/b/g; // <- it's division

This is highly interesting: the interpretation of that
final line also depends on what comes before it. For example:

var b=2,g=1;
var a = 84;
/b/g; // <- it's a regular expression

or

whole(truth) /b+c/g; // division
vs.
while(truth) /b+c/g; // RegExp
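The first pair of examples can be checked directly. This is a small sketch; the second variable name, a2, is my own addition so that both variants can live in one script.

```javascript
var b = 2, g = 1;

// No semicolon after 84, and "/" can continue the expression, so no
// semicolon is inserted: the initializer is 84 / b / g.
var a = 84
/b/g;

// With the explicit semicolon, the next line is a separate statement:
// a RegExp literal whose value is simply discarded.
var a2 = 84;
/b/g;

console.log(a, a2); // prints 42 84
```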

I wonder about other examples of (non embedded) code being
interpreted differently depending on what precedes it.


Also, while your example of [^] works on my FF1.5, it does
not compile on my IE 6. I.e., adding
var re=/[^]/;
results in an error message from IE.
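For reference, [^] is an empty negated class, which in ECMAScript matches any single character, including line terminators. A commonly used spelling that avoids the empty class is [\s\S]. The sketch below only shows that the two agree in an engine that accepts both; that [\s\S] is accepted where [^] is rejected is an assumption I have not verified on IE 6.

```javascript
var text = "a\nb";

// "." does not match a line terminator, so this fails.
var dotMatches = /a.b/.test(text);

// The empty negated class matches any character at all.
var emptyNegatedMatches = /a[^]b/.test(text);

// Whitespace-or-non-whitespace: the same set of characters, spelled differently.
var portableMatches = /a[\s\S]b/.test(text);

console.log(dotMatches, emptyNegatedMatches, portableMatches); // prints false true true
```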

#22
Csaba Gabor
Re: Remove trailing comments exercise - 11-07-2009, 06:15 AM

On Nov 7, 11:21 am, Csaba Gabor <dans... (AT) gmail (DOT) com> wrote:
Quote:
On Nov 6, 6:36 pm, Lasse Reichstein Nielsen <lrn.unr... (AT) gmail (DOT) com>
wrote:

Thomas 'PointedEars' Lahn <PointedE... (AT) web (DOT) de> writes:
Lasse Reichstein Nielsen wrote:
var re = /('(?:[^']|\\')*')/g;
alert(re.exec(code)[0]);

It alerts the string "'abc\\'", i.e., it does end at the first
"'", even if the quote is escaped.

The above recognizes from one single quote to
the final single quote in the string. One may
just as well write var re = /('.*?')/g

final => next. Sorry about that.

If the ? in the RegExp I supplied is omitted, then
it captures up to the final single quote.
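The effect of the ? can be seen on a line with two string literals. This is a minimal sketch; the sample line is my own, and the capturing group and g flag are dropped for brevity.

```javascript
var sample = "x = 'one' + 'two';";

// Lazy quantifier: stops at the next single quote.
var lazy = /'.*?'/.exec(sample)[0];    // 'one'

// Greedy quantifier: runs to the final single quote on the line.
var greedy = /'.*'/.exec(sample)[0];   // 'one' + 'two'

console.log(lazy);
console.log(greedy);
```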

#23
VK
Re: Remove trailing comments exercise - 11-07-2009, 12:34 PM

Thomas 'PointedEars' Lahn wrote:
Quote:
It is really merely an issue to recognize and ignore string literals first,
then to recognize and ignore RegExp initializers outside of them. My
replace function already implements the former; adapting it to also take
care of the latter is left as an exercise to the reader.

Your replace function so far converts a syntactically correct source
into a syntactically incorrect one:
/foobar//foobar
becomes
/foobar
which is an "unterminated regular expression literal".
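The breakage is easy to reproduce with a deliberately naive stripper. Note that naiveStrip and parses below are hypothetical helpers written for this illustration; this is not the replace function under discussion, which is not shown in this part of the thread.

```javascript
// Hypothetical naive stripper: cut everything from the first "//" onward.
function naiveStrip(line) {
  return line.replace(/\/\/.*$/, "");
}

// True when the text parses as a function body (the body is never executed).
function parses(source) {
  try {
    new Function(source);
    return true;
  } catch (e) {
    return false;
  }
}

// The RegExp literal /foobar/ divided by the identifier foobar.
var original = "/foobar//foobar";
var stripped = naiveStrip(original);

console.log(stripped);          // prints /foobar
console.log(parses(original));  // prints true
console.log(parses(stripped));  // prints false
```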

P.S. It is a bit of fun to watch people making a robust parser
algorithm for an algorithmically unparseable matter. But keep going, I
have more...

#24
Thomas 'PointedEars' Lahn
Re: Remove trailing comments exercise - 11-07-2009, 02:04 PM

VK wrote:

Quote:
Thomas 'PointedEars' Lahn wrote:
It is really merely an issue to recognize and ignore string literals
first, then to recognize and ignore RegExp initializers outside of them.
My replace function already implements the former; adapting it to also
take care of the latter is left as an exercise to the reader.

Your replace function so far converts a syntactically correct source
into a syntactically incorrect one:
/foobar//foobar
becomes
/foobar
which is an "unterminated regular expression literal".

If you had paid attention, you would have known that I am aware of the
RegExp issue.

Quote:
P.S. It is a bit of fun to watch people making a robust parser
algorithm for an algorithmically unparseable matter.

It is not algorithmically unparseable. Otherwise there would be no script
engine that accepts RegExp initializers, would there? The contexts in which
`/' is not recognized as the start of a RegExp initializer are grammatically
well-defined, and if you had cared to read the Specification you would have
known that.
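As a rough illustration of deciding by context, here is a single-line scanner that skips string literals and RegExp literals before looking for "//". It is a heuristic sketch under stated assumptions, not the grammar from the Specification and not the replace function from this thread: stripTrailingComment and REGEX_MAY_FOLLOW are hypothetical names, block comments and multi-line input are ignored, and a "/" after ")", "}" or a keyword such as return is always treated as division, which is wrong in cases like while(truth) /b+c/g.

```javascript
// Hypothetical helper: remove a trailing "//" comment from ONE line of source.
// Heuristic: "/" starts a RegExp literal when the previous significant
// character is an operator or opening punctuator, or when there is none.
function stripTrailingComment(line) {
  var REGEX_MAY_FOLLOW = "(,=:[!&|?{};+-*%<>~^/";
  var regexAllowed = true;   // true at the start of the line
  var i = 0;
  while (i < line.length) {
    var ch = line.charAt(i);
    if (ch === "'" || ch === '"') {
      // Skip a string literal; a backslash consumes the next character.
      i++;
      while (i < line.length && line.charAt(i) !== ch) {
        if (line.charAt(i) === "\\") i++;
        i++;
      }
      i++;                   // step over the closing quote
      regexAllowed = false;  // a literal ends an expression
      continue;
    }
    if (ch === "/" && line.charAt(i + 1) === "/") {
      // A RegExp body cannot be empty, so "//" here starts a comment.
      return line.slice(0, i).replace(/\s+$/, "");
    }
    if (ch === "/" && regexAllowed) {
      // Skip a RegExp literal; "/" inside a [...] class does not end it.
      var inClass = false;
      i++;
      while (i < line.length) {
        var c = line.charAt(i);
        if (c === "\\") { i += 2; continue; }
        if (c === "[") inClass = true;
        else if (c === "]") inClass = false;
        else if (c === "/" && !inClass) break;
        i++;
      }
      i++;                   // step over the closing slash
      regexAllowed = false;
      continue;
    }
    if (!/\s/.test(ch)) {
      regexAllowed = REGEX_MAY_FOLLOW.indexOf(ch) !== -1;
    }
    i++;
  }
  return line;               // no comment found
}

console.log(stripTrailingComment("/foobar//foobar"));           // unchanged
console.log(stripTrailingComment("var s = 'a // b'; // note")); // var s = 'a // b';
```

Tracking only the previous significant character keeps the sketch short; a faithful implementation would have to track tokens and statement context instead.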

Quote:
But keep going, I have more...

You would.


PointedEars
--
Use any version of Microsoft Frontpage to create your site.
(This won't prevent people from viewing your source, but no one
will want to steal it.)
-- from <http://www.vortex-webdesign.com/help/hidesource.htm> (404-comp.)

#25
Lasse Reichstein Nielsen
Re: Remove trailing comments exercise - 11-08-2009, 10:38 AM

Csaba Gabor <danswer (AT) gmail (DOT) com> writes:

[correct description of how the regexps work]

Quote:
and giving \\' priority over [^'], will match "\\'" as a non-string-ender,
but will also ignore "\\\\'". It's necessary to know whether there is an
even number of backslashes before the quote in order to know whether it's
escaped or not. The RegExp below is the simplest one I have found to do that.

Lasse, to me, the RegExp below looks identical to the first
one above. So in the absence of me seeing it, here is a
regular expression that recognizes single quoted strings.
It will match from a single quote to the next single quote
not preceded by an odd number of backslashes.

var re = /'(?:\\.|[^\\'])*'/g

My mistake. The "RegExp below" that I was referring to was one that I
had written in a double-quoted message, but I managed to remove that
quote before posting.

It was indeed equivalent to the one you wrote here (I think it had the
alternatives in the opposite order, but that's not important since they
are mutually exclusive).

Quote:
var b=2,g=1;
var a = 84
/b/g; // <- it's division

This is highly interesting, where the interpretation of that
final line also depends on what comes before it. For example:

var b=2,g=1;
var a = 84;
/b/g; // <- it's a regular expression

or

whole(truth) /b+c/g; // division
vs.
while(truth) /b+c/g; // RegExp

I wonder about other examples of (non embedded) code being
interpreted differently depending on what precedes it.

There are a few:
An object literal, {foo: 42}, is also a valid statement block
containing a labeled expression statement. In an expression context,
it can only be the object literal; in a statement context, it
can only be the statement block. Since expressions can be
statements (ExpressionStatement), there is a rule that says that
an ExpressionStatement cannot begin with "{" (or "function").
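The two readings of {foo: 42} can be observed with eval, which parses its argument as a program and returns the completion value. This is a small sketch; the variable names are my own.

```javascript
// Statement context: "{" opens a block, "foo:" is a label and 42 is an
// expression statement, so the completion value is the number 42.
var asBlock = eval("{foo: 42}");

// Expression context, forced by the parentheses: an object literal.
var asObject = eval("({foo: 42})");

console.log(typeof asBlock, asBlock);        // prints number 42
console.log(typeof asObject, asObject.foo);  // prints object 42
```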


Quote:
Also, while your example of [^] works on my FF1.5, it does
not compile on my IE 6. I.e., adding
var re=/[^]/;
results in an error message from IE.

Tsk, tsk.

/L
--
Lasse Reichstein Holst Nielsen
'Javascript frameworks is a disruptive technology'
