
A tutorial on generating and parsing RSS in Python

Editor: Python

I have spent the last couple of days looking into how to fetch RSS content, so here is a short summary.

RSS generation: PyRSS2Gen

Generates an RSS feed from the content you specify. Works with Python 2.3 through 3.3.

Download: http://www.dalkescientific.com/Python/PyRSS2Gen.html

Installation: python setup.py install
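After running the install command it can help to confirm that the module is actually importable. The following is a minimal sketch, assuming a modern Python 3 where importlib.util.find_spec is available (it is not part of the older versions listed above); the helper name is_installed is made up for this illustration.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found on the import path."""
    # find_spec() only looks the module up; it does not import it.
    return importlib.util.find_spec(module_name) is not None


for name in ("PyRSS2Gen",):
    status = "installed" if is_installed(name) else "missing"
    print("%s: %s" % (name, status))
```

The same check works for any other top-level module name.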

Example:

import datetime
import PyRSS2Gen

rss = PyRSS2Gen.RSS2(
    title="Andrew's PyRSS2Gen feed",
    link="http://www.dalkescientific.com/Python/PyRSS2Gen.html",
    description="The latest news about PyRSS2Gen, a "
                "Python library for generating RSS2 feeds",

    # PyRSS2Gen labels dates as GMT, so real feeds should pass UTC times.
    lastBuildDate=datetime.datetime.now(),

    items=[
        PyRSS2Gen.RSSItem(
            title="PyRSS2Gen-0.0 released",
            link="http://www.dalkescientific.com/news/030906-PyRSS2Gen.html",
            description="Dalke Scientific today announced PyRSS2Gen-0.0, "
                        "a library for generating RSS feeds for Python. ",
            guid=PyRSS2Gen.Guid("http://www.dalkescientific.com/news/"
                                "030906-PyRSS2Gen.html"),
            pubDate=datetime.datetime(2003, 9, 6, 21, 31)),
        PyRSS2Gen.RSSItem(
            title="Thoughts on RSS feeds for bioinformatics",
            link="http://www.dalkescientific.com/writings/diary/"
                 "archive/2003/09/06/RSS.html",
            description="One of the reasons I wrote PyRSS2Gen was to "
                        "experiment with RSS for data collection in "
                        "bioinformatics. Last year I came across...",
            guid=PyRSS2Gen.Guid("http://www.dalkescientific.com/writings/"
                                "diary/archive/2003/09/06/RSS.html"),
            pubDate=datetime.datetime(2003, 9, 6, 21, 49)),
    ])

# Write the feed to disk; the with-block closes the file afterwards.
with open("pyrss2gen.xml", "w") as f:
    rss.write_xml(f)
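In practice the items usually come from your own data rather than being written out by hand. The sketch below shows one way that might look. The record keys ("title", "url", "summary", "published") and the helper names item_fields and build_feed are hypothetical and chosen only for this illustration; the PyRSS2Gen calls are the same ones used in the example above. PyRSS2Gen is imported inside build_feed so the mapping helper can be tried without the library installed.

```python
import datetime


def item_fields(record):
    """Map one record (a plain dict) to RSSItem keyword arguments.

    The record keys "title", "url", "summary" and "published" are
    hypothetical names chosen for this sketch.
    """
    return {
        "title": record["title"],
        "link": record["url"],
        "description": record.get("summary", ""),
        "pubDate": record["published"],
    }


def build_feed(title, link, description, records):
    # Imported here so item_fields() can be used without the library.
    import PyRSS2Gen

    items = []
    for record in records:
        fields = item_fields(record)
        # Reuse the item's link as its guid, as the example above does.
        fields["guid"] = PyRSS2Gen.Guid(fields["link"])
        items.append(PyRSS2Gen.RSSItem(**fields))
    return PyRSS2Gen.RSS2(
        title=title,
        link=link,
        description=description,
        lastBuildDate=datetime.datetime.now(),
        items=items,
    )


records = [
    {
        "title": "First post",
        "url": "http://example.org/posts/1",
        "summary": "A short summary of the first post.",
        "published": datetime.datetime(2003, 9, 6, 21, 31),
    },
]

# With PyRSS2Gen installed, the feed could then be written like this:
# feed = build_feed("Example feed", "http://example.org/", "Sample feed", records)
# with open("example.xml", "w") as f:
#     feed.write_xml(f)
```

Keeping the record-to-field mapping in its own function means a different data source only needs a different item_fields.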

 

RSS parsing: feedparser

Parses RSS, Atom and RDF feeds. Works with Python 2.4 through 3.3.

Download: https://code.google.com/p/feedparser/downloads/list

Installation: python setup.py install

Example:

>>> import feedparser
>>> d = feedparser.parse("http://feedparser.org/docs/examples/atom10.xml")
>>> d['feed']['title'] # feed data is a dictionary
u'Sample Feed'
>>> d.feed.title # get values attr-style or dict-style
u'Sample Feed'
>>> d.channel.title # use RSS or Atom terminology anywhere
u'Sample Feed'
>>> d.feed.link # resolves relative links
u'http://example.org/'
>>> d.feed.subtitle # parses escaped HTML
u'For documentation only'
>>> d.channel.description # RSS terminology works here too
u'For documentation only'
>>> len(d['entries']) # entries are a list
1
>>> d['entries'][0]['title'] # each entry is a dictionary
u'First entry title'
>>> d.entries[0].title # attr-style works here too
u'First entry title'
>>> d['items'][0].title # RSS terminology works here too
u'First entry title'
>>> e = d.entries[0]
>>> e.link # easy access to alternate link
u'http://example.org/entry/3'
>>> e.links[1].rel # full access to all Atom links
u'related'
>>> e.links[0].href # resolves relative links here too
u'http://example.org/entry/3'
>>> e.author_detail.name # author data is a dictionary
u'Mark Pilgrim'
>>> e.updated_parsed # parses all date formats
(2005, 11, 9, 11, 56, 34, 2, 313, 0)
>>> e.content[0].value # sanitizes dangerous HTML
u'<div>Watch out for <em>nasty tricks</em></div>'
>>> d.version # reports feed type and version
u'atom10'
>>> d.encoding # auto-detects character encoding
u'utf-8'
>>> d.headers.get('Content-type') # full access to all HTTP headers
u'application/xml'
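The interactive session above can be turned into a small script. This is a rough sketch rather than a definitive recipe: the function names entry_summaries and print_feed are made up for illustration, and the "(untitled)" fallback is an arbitrary choice. It uses dict-style access only, which the session shows feedparser results support, and it checks the bozo flag that feedparser sets when a feed is not well-formed.

```python
def entry_summaries(parsed):
    """Return (title, link) pairs from a feedparser-style result.

    Uses dict-style access only, so it works with feedparser's result
    object and with plain dicts of the same shape.
    """
    summaries = []
    for entry in parsed.get("entries", []):
        title = entry.get("title", "(untitled)")
        link = entry.get("link", "")
        summaries.append((title, link))
    return summaries


def print_feed(url):
    # Imported here so entry_summaries() stays usable without the library.
    import feedparser

    parsed = feedparser.parse(url)
    if parsed.get("bozo"):
        # feedparser sets bozo when the feed is not well-formed;
        # it still returns whatever it managed to parse.
        print("warning: feed may be malformed")
    print(parsed.get("feed", {}).get("title", "(no feed title)"))
    for title, link in entry_summaries(parsed):
        print("%s -> %s" % (title, link))


# With feedparser installed and network access available:
# print_feed("http://feedparser.org/docs/examples/atom10.xml")
```

Using .get() with defaults keeps the loop from failing on feeds whose entries lack a title or link.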
 
