Common Atom Elements

Atom feeds generally contain more information than RSS feeds because the format requires more elements, but the most commonly used elements are still the title, link, tagline or summary, various dates, and ID.

This sample Atom feed is at

<?xml version="1.0" encoding="iso-8859-1"?>
<!-- several elements omitted in this example -->
<feed version="0.3">
  <title type="text/plain" mode="escaped">Sample Feed</title>
  <tagline type="text/html" mode="escaped">
    For documentation &lt;em&gt;only&lt;/em&gt;
  </tagline>
  <link rel="alternate" type="text/html" href="/"/>
  <entry>
    <title>First entry title</title>
    <link rel="alternate" type="text/html" href="/entry/3"/>
    <summary type="text/plain" mode="escaped">Watch out for nasty tricks</summary>
    <content type="application/xhtml+xml" mode="xml"
             xml:base="" xml:lang="en-US">
      <div xmlns="">Watch out for
      <span style="background: url(javascript:window.location='')">
      nasty tricks</span></div>
    </content>
  </entry>
</feed>

Feed elements are available in d.feed.

Example: Accessing Common Feed Elements

>>> import feedparser
>>> d = feedparser.parse('')
>>> d.feed.title
u'Sample Feed'
>>> d.feed.tagline
u'For documentation <em>only</em>'
>>> d.feed.modified
>>> d.feed.modified_parsed
(2004, 4, 20, 11, 56, 34, 1, 111, 0)
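
The parsed date is a plain 9-tuple, which is awkward to format or compare against other timestamps. As a minimal sketch, assuming the tuple is expressed in UTC, it can be turned into a timezone-aware datetime using only the standard library. The variable names here are illustrative and not part of feedparser.

```python
from datetime import datetime, timezone

# The 9-tuple shown above for d.feed.modified_parsed.
modified_parsed = (2004, 4, 20, 11, 56, 34, 1, 111, 0)

# Assumption: the tuple is in UTC, so its first six fields
# (year, month, day, hour, minute, second) map directly onto a UTC datetime.
modified = datetime(*modified_parsed[:6], tzinfo=timezone.utc)

print(modified.isoformat())  # 2004-04-20T11:56:34+00:00
```

The remaining three fields (weekday, day of year, DST flag) are derived values, so dropping them loses no information.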

Entries are available in d.entries, which is a list. You access entries in the order in which they appear in the original feed, so the first entry is d.entries[0].
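
Feed order is not always chronological order. If newest-first ordering matters, one possible approach is to sort on the parsed date tuple. The sketch below uses plain dictionaries as stand-ins for entries; the helper name newest_first, the second and third entries, and their dates are hypothetical and exist only for illustration.

```python
def newest_first(entries):
    """Return a new list sorted by modified_parsed, newest first.

    Entries without a parsed date keep their feed order and go last.
    """
    dated = [e for e in entries if e.get("modified_parsed")]
    undated = [e for e in entries if not e.get("modified_parsed")]
    # 9-tuples compare field by field (year, month, day, ...),
    # so tuple ordering matches chronological ordering.
    dated.sort(key=lambda e: tuple(e["modified_parsed"]), reverse=True)
    return dated + undated


# Stand-in data: only the first entry comes from the sample feed.
entries = [
    {"title": "Hypothetical older entry",
     "modified_parsed": (2004, 4, 18, 9, 0, 0, 6, 109, 0)},
    {"title": "Hypothetical undated entry"},
    {"title": "First entry title",
     "modified_parsed": (2004, 4, 20, 11, 56, 34, 1, 111, 0)},
]

for entry in newest_first(entries):
    print(entry["title"])
```

The original list is left untouched, so the feed order remains available if it is needed later.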

Example: Accessing Common Entry Elements

>>> import feedparser
>>> d = feedparser.parse('')
>>> d.entries[0].title
u'First entry title'
>>> d.entries[0].link
>>> d.entries[0].id
>>> d.entries[0].created
>>> d.entries[0].created_parsed
(2004, 4, 19, 7, 45, 0, 0, 110, 0)
>>> d.entries[0].issued
>>> d.entries[0].issued_parsed
(2004, 4, 20, 0, 23, 47, 1, 111, 0)
>>> d.entries[0].modified
>>> d.entries[0].modified_parsed
(2004, 4, 20, 11, 56, 34, 1, 111, 0)
>>> d.entries[0].summary
u'Watch out for nasty tricks'
>>> d.entries[0].content
[{'type': u'application/xhtml+xml',
 'mode': u'xml',
 'base': u'',
 'language': u'en-US',
 'value': u'<div>Watch out for <span>nasty tricks</span></div>'}]

The parsed summary and content are not the same as they appear in the original feed. The original elements contained dangerous HTML markup, which was sanitized. See HTML Sanitization for details.

Because Atom entries can have more than one content element, d.entries[0].content is a list of dictionaries. Each dictionary contains metadata about a single content element. The two most important values in the dictionary are the content type, in d.entries[0].content[0].type, and the actual content value, in d.entries[0].content[0].value.
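
Because the content is a list, code that wants a single value has to choose one element. The following sketch shows one way to do that by content type. The helper pick_content and its preference order are assumptions made for illustration, not part of feedparser, and a plain list of dictionaries stands in for d.entries[0].content.

```python
def pick_content(contents, preferred_types=("text/html",
                                            "application/xhtml+xml",
                                            "text/plain")):
    """Return the first content dictionary matching the preferred types.

    Types are tried in preference order. If none match, fall back to the
    first element; return None when the list is empty.
    """
    for wanted in preferred_types:
        for item in contents:
            if item.get("type") == wanted:
                return item
    return contents[0] if contents else None


# Stand-in for d.entries[0].content, reduced to the keys used here.
contents = [
    {"type": "application/xhtml+xml",
     "value": "<div>Watch out for <span>nasty tricks</span></div>"},
]

best = pick_content(contents)
print(best["type"])
print(best["value"])
```

Returning None for an empty list lets the caller decide whether to fall back to the entry summary instead.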

You can get this level of detail on other Atom elements too.
