Links, PageRank, Referrer Spam and a Future Reality of Web Traffic

Search Engine Optimization Discussion about SEO/Search Engine Optimization (alt.internet.search-engines)


#1
Roy Schestowitz
Links, PageRank, Referrer Spam and a Future Reality of Web Traffic - 10-13-2005, 01:15 PM






I have had a zombie attack directed at my Web server for the past week or
two. Yesterday it peaked with over 30,000 attacks by Windows machines
world-wide. The motive was referrer spam and probably vandalism too. When I
contacted my host for assistance (requiring installation of software), the
sysadmin pointed me to the following page:

http://www.abcseo.com/papers/referrer-spam.htm

I immediately recognised it as davidof's site and I quite like the following
bit, which I will quote below for the group to read:

(context is link spam)

===
Isn't Spam really Google's fault?

It seems that Heisenberg was right. By observing something you affect the
observations. Google's PhDs, for all their brains, are a somewhat innocent
bunch; unable to see the consequences of their actions or understand how
the real world operates. By basing their search engine rankings on
inbound-links and anchor text they encourage unscrupulous people to exploit
weaknesses in the system to boost their websites to the top of Google's
rankings.

Google is a great resource for finding information. However the Webosphere
didn't ask Google to set up shop. Google is a business. Like the spammers
they are in it for the money. Now I'm not saying Google is evil but they
need to mature as a business. Instead of focussing on propeller-heads
who've probably never had a girlfriend they need to employ some guys with
street smarts who can think through the latest whizzy idea before it gets
beta'd on the rest of us. Other large businesses have to take some
responsibility for their actions (well ok not Microsoft, they have the
EULA), so why not Google?

Still there are differences between setting up the environment that
encourages spam and actually generating the damn stuff. But if we don't
act, search engine spam will harm the web just as surely as UCE has harmed
email. We don't want to reach the stage where 85% of all requests to a
website are spam do we?

There are other actors. Microsoft for selling a completely insecure
operating system in the form of Windows must shoulder a lot of blame. ISPs
and Web hosting companies for supporting the spammers.
===

The last two paragraphs are similar to stuff I said earlier today, before
reading David George's take. I think the analysis above is insightful. In a
nutshell,

* Blame ISPs for harbouring spammy traffic

* Blame Microsoft for unleashing a faulty O/S out of the box

* Blame Google for unintentionally giving incentive for Web spam

E-mail traffic worldwide is about 50% spam. If all goes as planned or
predicted, expect 50% of content to be mirrors, 50% of links to be
synthetic and 50% of Web traffic to be utter garbage. Great future ahead!
Enjoy the Net today... before it's destroyed. I have been manually
filtering (human filter) for my site for the past 24 hours. Had I not done
that, my shared host would not have coped and I would have been 'separated'
from the Web. I have not done any work whatsoever today or yesterday.
Luckily, my supervisor understands.
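
[Aside: the manual filtering described above can be partly automated. Below is a minimal Python sketch, assuming requests are available as (path, referrer) pairs and that a blocklist of spamming domains has already been compiled by hand. The domain names, function names and data layout are all made up for illustration; this is not the software the host was asked to install.]

```python
from urllib.parse import urlparse

# Hypothetical blocklist: placeholder domains, not real offenders.
SPAM_REFERRER_DOMAINS = {"spam-example.test", "pills-example.test"}


def is_spam_referrer(referrer, blocklist=SPAM_REFERRER_DOMAINS):
    """Return True if the referrer's host is a blocked domain or a subdomain of one."""
    # urlparse yields hostname=None for empty or scheme-less values such as "" or "-".
    host = (urlparse(referrer).hostname or "").lower()
    return any(host == domain or host.endswith("." + domain) for domain in blocklist)


def filter_requests(requests, blocklist=SPAM_REFERRER_DOMAINS):
    """Split (path, referrer) pairs into (kept, rejected) lists."""
    kept, rejected = [], []
    for path, referrer in requests:
        if is_spam_referrer(referrer, blocklist):
            rejected.append((path, referrer))
        else:
            kept.append((path, referrer))
    return kept, rejected
```

A check like this only catches referrers that are already on the list, so it reduces the manual work rather than replacing it. On a shared host it would more likely live in the web server configuration than in application code.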

Roy

#2
www.1-script.com
Re: Links, PageRank, Referrer Spam and a Future Reality of Web Traffic - 10-13-2005, 02:47 PM






Roy Schestowitz wrote:


Quote:
E-mail traffic worldwide is about 50% spam. If all goes as planned or
predicted, expect 50% of content to be mirrors, 50% of links to be
synthetic and 50% of Web traffic to be utter garbage. Great future ahead!
Enjoy the Net today... before it's destroyed. I have been manually
filtering (human filter) for my site for the past 24 hours. Had I not done
that, my shared host would not have coped and I would have been 'separated'
from the Web. I have not done any work whatsoever today and yesterday.
Luckily, my supervisor understands.

What's with the pessimism, Roy?

Site’s under attack? Big deal. I am sitting here looking at a list of 1200+
unauthorized root access attempts on just one of my hosts over the last
24 hours. Business as usual. Sometimes 3000, sometimes 500.
Never less than 500 in a day.

The point is: how is the Websphere (not a trade mark but an analogue to
Noosphere or Biosphere ;-) ) different in this sense from the rest of your
life? Unless you were raised under a glass dome, you might have noticed
that life is not fair. There’s crime and there are accidents and
everything else that makes life less than perfect. I assume you wouldn’t
leave your car unlocked in a bad part of the city you live in, would you?
If you did, would you be surprised if you saw it vandalized upon your
return?

90% of my mail is junk. That’s my US Post-delivered mail I’m talking
about. If I did not dump all the junk from my PO box into the adjacent trash
bin at least once a week, the box would explode. Does that mean the US Post has
to be shut down? God, no! I wouldn’t be able to get that new PC mouse I
ordered recently.

My feeling is the exact opposite of yours: the Web was a saintly
place compared to the rest of the world, and now it is just getting back to
normal.

Hey, look at the bright side: there are good people out there! Say hello
to your supervisor ;-)

--
Cheers, and I mean CHE-E-E-RS!
Dmitri


#3
Roy Schestowitz
Re: How Search Engines SHOULD Be Managed - 10-14-2005, 12:16 AM



__/ [John A.] on Friday 14 October 2005 02:33 \__

Quote:
<snip />

* Blame ISPs for harbouring spammy traffic

* Blame Microsoft for unleashing a faulty O/S out of the box

* Blame Google for unintentionally giving incentive for Web spam

SE spammers had incentive to spam long before Google, and they did
plenty of it. Anyone else remember the TV commercial - I forget for
which SE - where they had a bunch of old people ("same old links") call
out their site name when a searcher calls out their search terms?
There was one old guy in a leather harness calling out "Hot Leather
Action!" or something like that, to which another replied "Oh, you
come up for everything!"

At the time it was mostly content and keyword tag spamming. Google's
system of analyzing links was just the ticket to sift out the real
stuff (which legit sites generally linked to) from the scum.

The problem is they let it come out how they did it. (It was, of
course, just a matter of time before the spammers figured it out, even
if they hadn't leaked it via their patents.) Any criteria by which
sites can be evaluated for relevancy & authority can be targeted if
it's known. Google has, of course, refined their system, mostly
plugging the holes in ways that seem to be aimed at forcing spam to be
more obvious to the user. The holy grail is, of course, to get the
criteria to the point that a page absolutely *has to be* relevant
and/or authoritative to meet the criteria and where any relevant
and/or authoritative page will meet it. That point may be approached,
but short of some degree of AI, it will never actually be met, and
probably not even then.
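
[Aside: to illustrate why a ranking criterion can be targeted once it is known, here is a rough Python sketch of a simplified PageRank-style calculation (power iteration with the commonly cited damping factor of 0.85). It is a toy model, not Google's actual ranking system; the page names and the 20-page link farm are invented for the example.]

```python
def pagerank(links, damping=0.85, iterations=50):
    """Toy PageRank. `links` maps each page to the list of pages it links to."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with the same small baseline score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its damped score on, split evenly over its links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score over all pages.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical web: two legitimate pages link to each other; nobody links to "spam".
honest_web = {"a": ["b"], "b": ["a"], "spam": ["a"]}

# Same web after the spammer adds 20 throwaway pages that all link to "spam".
farmed_web = dict(honest_web)
farmed_web.update({f"farm{i}": ["spam"] for i in range(20)})
```

Comparing pagerank(honest_web)["spam"] with pagerank(farmed_web)["spam"] shows the spam page's score rising once the farm pages link to it, even though no legitimate page ever did.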

Interesting take. Some months ago I argued that in order to avoid bias and
avoid corruption, the following steps should at least be considered:

- Make a search engine a public service[1], much like the W3C's validation
services and ICANN/whois.net/relatives. The Web belongs to everyone in this
world and search -- the means by which data gets organised -- should be a
service. Likewise, an operating system should be nobody's property.
Hardware should, but not the platform upon which people communicate.
Conflicting interests lead to protocol breakage... (I am going endlessly
off topic, so I will stop)

- Have sites register in one form or another to state their aims and scope.
DMOZ goes some way towards that, but the whole Google-DMOZ-mozilla.com
(corporation) love affair is disturbing in my eyes.

- Use more proper methods for exploiting knowledge and information. Don't
tell me (Schmitt) how long it will take you to index all human knowledge
(300 years, he said - reference available on demand). Do the task
_properly_! See the URL in the bottom of my sig as I truly believe search
engines are lagging behind what science (AI in particular) has to offer.


[1] Funding of crawling resources can be managed in the same way Google
does, e.g. paid listing in SERPs (not sponsored links in the actual
results), much like Yellow Pages where yellow/white tells apart ham from
spam.

I think there needs to be a strategic movement like GNU in order to release
ourselves from commercial search engines (and all-round public information
domination). The financial entry barrier is high though. See:

* http://iuron.com/documents/manifest/draft/node4.html

* http://www.google.com/intl/en/corpor...tory.html#1998


<snip>

All they (Brin, Page) needed was a little cash to move out of the dorm -- and
to pay off the credit cards they had maxed out buying a terabyte of memory.
So they wrote up a business plan, put their Ph.D. plans on hold, and went
looking for an angel investor. Their first visit was with a friend of a
faculty member.

</snip>

Best Regards (happy to have heard your thoughts),

Roy

--
Roy S. Schestowitz | Previous signature has been conceded
http://Schestowitz.com | SuSE Linux | PGP-Key: 74572E8E
4:55am up 49 days 17:09, 3 users, load average: 0.79, 0.51, 0.55
http://iuron.com - next generation of search paradigms


#4
John Bokma
Re: How Search Engines SHOULD Be Managed - 10-14-2005, 02:08 AM



Roy Schestowitz <newsgroups (AT) schestowitz (DOT) com> wrote:

Quote:
Interesting take. Some months ago I argued that in order to avoid bias
and avoid corruption, the following steps should at least be
considered:

- Make a search engine a public service[1], much like the W3C's
validation services and ICANN/whois.net/relatives.

- Who's going to pay for that?
- Who's going to decide how it is going to work? The public? Will fail.

Quote:
The Web belongs to
everyone in this world and search -- the means by which data gets
organised -- should be a service.

Who defines this service?

Quote:
Likewise, an operating system should
be nobody's property.

Why not?

Quote:
Hardware should,

Why?

Quote:
but not the platform upon which
people communicate. Conflicting interests lead to protocol
breakage...

Open Source doesn't mean that protocols will become clear and well
defined. Also, protocols are not limited to software; they are in
hardware as well. Don't you just hate it when your 1 year old hardware
doesn't all work in your new motherboard?

Quote:
(I am going endlessly off topic, so I will stop)

- Have sites register in one form or another to state their aims and
scope. DMOZ goes some way towards that, but the whole
Google-DMOZ-mozilla.com (corporation) love affair is disturbing in my
eyes.

The same would happen if it became independent: there is an editor,
there is someone who wants in it -> conflicts, corruption.

Quote:
- Use more proper methods for exploiting knowledge and information.
Don't tell me (Schmitt) how long it will take you to index all human
knowledge (300 years, he said - reference available on demand). Do the
task _properly_! See the URL in the bottom of my sig as I truly
believe search engines are lagging behind what science (AI in
particular) has to offer.

TANSTAAFL, that's the problem.

Quote:
[1] Funding of crawling resources can be managed in the same way
Google does, e.g. paid listing in SERPs (not sponsored links in the
actual results), much like Yellow Pages where yellow/white tells apart
ham from spam.

I think there needs to be a strategic movement like GNU in order to
release ourselves from commercial search engines (and all-round public
information domination). The financial entry barrier is high though.

Yup, that's the whole point. GNU.... have a look at HURD...

--
John Perl SEO tools: http://johnbokma.com/perl/
or have them custom made
Experienced (web) developer: http://castleamber.com/


#5
Roy Schestowitz
Re: Releasing Ourselves from Google and Microsoft Tyranny? - 10-14-2005, 03:06 AM



__/ [John Bokma] on Friday 14 October 2005 07:08 \__


Okay, you pulled my finger, so I'll have to answer your questions. *smile*


Quote:
Roy Schestowitz <newsgroups (AT) schestowitz (DOT) com> wrote:

Interesting take. Some months ago I argued that in order to avoid bias
and avoid corruption, the following steps should at least be
considered:

- Make a search engine public service[1], much like the W3C's
validation services and ICANN/whois.net/relatives.

- Who's going to pay for that?
- Who's going to decide how it is going to work? The public?
Will fail.

You must have hit reply before reading it the first time. *grin*


Quote:
The Web belongs to
everyone in this world and search -- the means by which data gets
organised -- should be a service.

Who defines this service?

A panel of people who are said to be suitable and are knowledgeable in the
field in question.


Quote:
Likewise, an operating system should
be nobody's property.

Why not?

By owning an O/S, you partly own a person's computer. You definitely have
/control/ over it. If a commercial body controls your computer, it can
steer you towards elements that serve its financial agenda, to name just
one aspect of the problem.


Quote:
Hardware should,

Why?

Software can be duplicated. Hardware cannot.

Quality control and competition encourage development. You could use
the same arguments when referring to software, but let us assume that there
are many experts out there (there already are) who contribute to Open
Source and will continue to do so for reputation, not direct profit.


Quote:
but not the platform upon which
people communicate. Conflicting interests leads to protocol
breakage...

Open Source doesn't mean that protocols will become clear, and well
defined. Also, protocols are not limited to software, they are in
hardware as well. Don't you just hate it when your 1 year old hardware
doesn't all work in your new motherboard?

That is true. Need I raise the fact, however, that some hardware is designed
to work only with Windows? (references on demand)


Quote:
(I am going endlessly off topic, so I will stop)

- Have sites register in one form or another to state their aims and
scope. DMOZ goes some way towards that, but the whole
Google-DMOZ-mozilla.com (corporation) love affair is disturbing in my
eyes.

The same would happen if it became independent: there is an editor,
there is someone who wants in it -> conflicts, corruption.

Yes, definitely. As we seek a way of verifying that a site is worthwhile, how
about specifying clear protocols for acceptance, classification, and
rating? You could say the same thing about taxation, but the system still
appears to work (let us pretend).


Quote:
- Use more proper methods for exploiting knowledge and information.
Don't tell me (Schmitt) how long it will take you to index all human
knowledge (300 years, he said - reference available on demand). Do the
task _properly_! See the URL in the bottom of my sig as I truly
believe search engines are lagging behind what science (AI in
particular) has to offer.

TANSTAAFL, that's the problem.

[1] Funding of crawling resources can be managed in the same way
Google does, e.g. paid listing in SERPs (not sponsored links in the
actual results), much like Yellow Pages where yellow/white tells apart
ham from spam.

I think there needs to be a strategic movement like GNU in order to
release ourselves from commercial search engines (and all-round public
information domination). The financial entry barrier is high though.

Yup, that's the whole point. GNU.... have a look at HURD...

I wonder what Torr has to say on the subject...

I also wonder if the next step for Google would be to steer users to
OpenOffice.org and other Java Desktop and JRE stuff. That now seems a more
realistic and defensible view than a Google operating system (a fantasy to
some).

Roy


--
Roy S. Schestowitz | "Black holes are where God is divided by zero"
http://Schestowitz.com | SuSE Linux | PGP-Key: 74572E8E
7:50am up 49 days 20:04, 4 users, load average: 1.67, 1.03, 0.73
http://iuron.com - next generation of search paradigms

