http://research.yahoo.com/news/psych...tai n_of_data
Psychologists are busy digging around in a mountain of data
Dec 04, 2006
(Extracted from: Financial Times)
The internet was designed, and continues to be maintained, by
technologists rather than psychologists and economists. This may be a
good thing in terms of its robustness and efficiency, but does not make
it simple to work with. It may be easy enough to learn a mark-up
language or two in order to create a website but that does not
guarantee its success.
So it is no surprise that a survey in the UK sponsored by Google, the
world's leading internet search company, and the employers' lobby, the
CBI, found that most consumer companies were diffident about the
effectiveness of their marketing online and felt they had a lot to
learn about a medium that is still less than 20 years old.
It would, of course, be fascinating to see what practitioners of the
"dismal science" of economics could come up with, given a clean sheet
and unlimited technology to play with, but that option is no longer
available. And the engineers are pressing on: the current aim is to
expand the internet's address space from a maximum of 4.3bn addresses
(under the IPv4 protocol) to 340 trillion trillion trillion (under
IPv6), enough to label uniquely everything on Earth.
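
As a rough check on those figures, the arithmetic can be sketched in a few lines of Python. The only inputs are the standard address widths (32 bits for IPv4, 128 bits for IPv6); the variable names are illustrative and do not come from the article.

```python
# Address-space sizes follow directly from the address widths:
# an IPv4 address is 32 bits long, an IPv6 address is 128 bits long.
IPV4_BITS = 32
IPV6_BITS = 128

ipv4_addresses = 2 ** IPV4_BITS   # roughly 4.3bn
ipv6_addresses = 2 ** IPV6_BITS   # roughly 340 trillion trillion trillion

# "trillion trillion trillion" is (10**12)**3 = 10**36
trillion_cubed = (10 ** 12) ** 3

print(f"IPv4: {ipv4_addresses:,} addresses")
print(f"IPv6: about {ipv6_addresses // trillion_cubed} trillion trillion trillion addresses")
```

The exact IPv4 figure is 4,294,967,296. Dividing the IPv6 total by a trillion cubed gives the whole number 340, which matches the rounded figure quoted above.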
Speaking at the inaugural meeting of the Internet Governance Forum a
few weeks ago, Vint Cerf, one of the internet's founding architects and
today Google's "internet evangelist", noted that the whole area of
applications on the internet had hardly been broached. "There seem to
be an endless array of potential ideas left to be considered, limited
only by the imagination and our ability to produce the software to make
these ideas real," he said.
It may take more than advanced engineering capabilities, however.
Knowledge of how people interact with the web will be increasingly
important. Fortunately, the internet is a self-generating cornucopia of
information, which can be mined to reveal - and influence - patterns of
behaviour.
The combined data warehouses of Google, Yahoo, Microsoft's MSN and AOL
- to pick the largest directories/search engines - contain phenomenal
amounts of information, growing at a stupendous rate.
According to Prabhakar Raghavan, head of Yahoo Research, 12 terabytes
(12 trillion bytes) of data are added every day to the Yahoo warehouse
alone. The other search organisations are growing at much the same
rate, and this, Mr Raghavan observes, is only a decade after Walmart,
the US retail chain recognised for its effective use of technology, was
celebrating its creation of a 1 terabyte data store.
Storage is cheap, the former IBM scientist says, and so no data is
thrown away: indeed, it is the raw material for Mr Raghavan's research
into the science that underpins the internet and which will determine
the shape of the web of the future. All the big internet companies are
carrying out similar research.
Yahoo, for example, lagging behind the meteorically growing Google in
pure search, has seven research centres across the globe dedicated to
understanding the basic principles underlying the internet.
Yahoo Answers, a poll of the internet that started life as the nebulous
concept "social search", is an example of the kind of application that
owes its success as much to psychology as to computing power.
Mr Raghavan says Answers has the ability to unlock information held in
people's heads: "People seem to be fundamentally willing to share their
knowledge," he says, adding that too little is known about incentive,
reward and trust in this area of human psychology.
There are five sciences important to understanding the future of the
internet. First, information management and retrieval, the essence of
internet search which Google has revolutionised. Second, machine
learning, which makes possible predictive models: for example, how will
a 19-year-old male living in New York respond to a particular
advertisement? Third, large scale computing systems: supercomputers
built out of off-the-shelf components. Fourth, cognitive psychology:
here is where the internet comes into its own. On a good day, Mr
Raghavan says, an anthropologist may find 100 people to take part in an
experiment: 100m people may use Yahoo over the same period.
Fifth, microeconomics: here there are similarities with the offline
world. The way airlines price seats on aeroplane journeys, for example,
involves yield management of a fundamentally similar nature to the way
advertising slots are priced on internet search services.
The point Mr Raghavan makes, and which is driving his research and that
of his contemporaries at rival internet directory/search companies, is
that monetisation has to be built into an internet product from the
beginning. The danger is that engineers will make what they believe to
be engineering decisions but which turn out to be deep revenue
decisions. The cost can run into hundreds of millions of dollars.
A bigger point is that Google, Yahoo and their competitors are tapping
into human behavioural patterns on an unprecedented scale. Their
findings will be of value to themselves and their customers. But they
should also be broadly disseminated, probably through the traditional
route of peer review and publication, for the good of all.