HighDots Forums  

Re: getting title, description for webpages

Jukka K. Korpela
 
Re: getting title, description for webpages - 06-07-2008, 09:44 AM






Scripsit रवींदर ठाकुर (ravinder thakur):

Quote:
i am trying to find some generic way of getting the title and
description of webpages [...] i will be doing this in python.

Try googling with words like
python html parse

The first hit I got is
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/286269
which might suit your needs.

It's probably easier to write two good HTML parsers than to decide which
of them is better. But for extracting the <title> element and the <meta>
element with name="description", any good or half-good parser should do.
Just make sure you recognize the tag and attribute names and the value
"description" in a case-insensitive manner and do not change the case of
anything in the title and description you extract (unless you really
want to).
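
For what it's worth, here is a minimal sketch of that idea using only Python's standard-library html.parser module. The class and function names (TitleDescriptionParser, extract_title_and_description) are made up for illustration and are not taken from the recipe linked above. It assumes you already have the page as a decoded string, and it keeps only the first <title> and the first matching <meta> it sees.

```python
from html.parser import HTMLParser


class TitleDescriptionParser(HTMLParser):
    """Collect the first <title> text and the first <meta name="description"> content.

    Hypothetical helper class, written for illustration only.
    """

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.title = None
        self.description = None
        self._in_title = False
        self._title_parts = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us,
        # but attribute values keep their original case.
        if tag == "title" and self.title is None:
            self._in_title = True
        elif tag == "meta" and self.description is None:
            attr_map = dict(attrs)
            name = attr_map.get("name") or ""
            # Compare the value case-insensitively; leave the content untouched.
            if name.strip().lower() == "description":
                self.description = attr_map.get("content")

    def handle_data(self, data):
        # Title text may arrive in several chunks, so collect them all.
        if self._in_title:
            self._title_parts.append(data)

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self.title = "".join(self._title_parts).strip()


def extract_title_and_description(html_text):
    """Return (title, description); either may be None if not found."""
    parser = TitleDescriptionParser()
    parser.feed(html_text)
    parser.close()
    return parser.title, parser.description
```

Calling extract_title_and_description(page_source) returns a (title, description) pair, with None for whichever piece the page lacks. Fetching the page and working out its character encoding are separate problems that this sketch leaves alone.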

--
Jukka K. Korpela ("Yucca")
http://www.cs.tut.fi/~jkorpela/



Ben C
 
Re: getting title, description for webpages - 06-07-2008, 10:00 AM






On 2008-06-07, Jukka K. Korpela <jkorpela (AT) cs (DOT) tut.fi> wrote:
Quote:
Scripsit रवींदर ठाकुर (ravinder thakur):

i am trying to find some generic way of getting the title and
description of webpages [...] i will be doing this in python.

Try googling with words like
python html parse

The first hit I got is
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/286269
which might suit your needs.

Most people use BeautifulSoup
http://crummy.com/software/BeautifulSoup

