Transform & display an external RSS feed
It is quite common to display content from an external XML feed, and many good extensions in TER cater for this need. Most of them import the external source into the TYPO3 database (for example into tt_news, which is then used to display the records). When you just want to retrieve, parse, format and display a feed, the XPATH content object can step in as a “TypoScript only” solution.
Say we want to display the official TYPO3 news feed from http://news.typo3.org/rss.xml on our website. Using the XPATH object, we can retrieve the XML feed, select its items and format them with TypoScript’s parseFunc. Let’s start with the basics:
page.10 = XPATH
page.10 {
  # first we set the source to the news feed URL
  source = http://news.typo3.org/rss.xml
  # each news entry is wrapped in <item> tags; fetch them with an XPath expression
  expression = //item
  # return the <item>s as XML, which is going to be formatted with parseFunc later on
  return = xml
  # before we do the parseFunc stuff, just return the content to see what we've got
  resultObj {
    cObjNum = 1
    1.current = 1
  }
  # escape the markup so the raw XML is shown on the website for analysis
  stdWrap.htmlSpecialChars = 1
}
When we reload the page we can see the items matched by our XPATH query:
<item>
  <title>FLOW3 1.0.3 has been released</title>
  <link>
    http://news.typo3.org/news/article/flow3-103-has-been-released/
  </link>
  <description>
    FLOW3 1.0.3, the third patch release of the PHP application framework has been released.
  </description>
  <category>Development</category>
  <category>FLOW3</category>
  <category>www.typo3.org</category>
  <pubDate>Sat, 25 Feb 2012 21:30:00 +0100</pubDate>
</item>
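If only the first few entries are wanted, the selection can be narrowed in the expression itself. The following is a minimal sketch rather than part of the original setup: it assumes that the XPATH object hands the expression unchanged to a standard XPath 1.0 engine, so that the position() function is available, and that the feed lists its newest items first. The limit of three is an arbitrary example value.

```typoscript
page.10 {
  # assumption: position() is evaluated by the underlying XPath 1.0 engine
  # keep only the first three <item> elements of the feed
  expression = //item[position() <= 3]
}
```

Placed after the basic setup, this overrides only the expression; source, return and resultObj stay as they are.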
We want to translate each <item> into the following HTML:
<div>
  <h1>FLOW3 1.0.3 has been released</h1>
  <p>Tags:
    <span class="category">Development</span>,
    <span class="category">FLOW3</span>,
    <span class="category">www.typo3.org</span>
  </p>
  <p>
    FLOW3 1.0.3, the third patch release of the PHP application framework has been released.
  </p>
  <p>
    <a href="http://news.typo3.org/news/article/flow3-103-has-been-released/">
      Read more...
    </a>
  </p>
</div>
Several things need to be considered. First the easy ones: the <item>, <title> and <description> tags need to be transformed to <div>, <h1> and <p> respectively. That shouldn’t be too hard. The <link> tag needs to become an <a> tag whose href attribute is set to the former tag’s content and whose link text is “Read more...”. Finally, the <category> tags need to be collected within one <p>, transformed to <span> tags with the class “category” and separated by commas. Let’s see what TSRef and parseFunc have to offer for this scenario.
parseFunc’s “externalBlocks” property comes to our aid. In tt_content, “externalBlocks” is used to pre-split bodytext content and parse <table> and <blockquote> tags together with their child elements. In our case, we can use it to replace the incoming <item> tags and pass their content on for a second round of processing that collects the <category> tags and converts the <link> tag.
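To see “externalBlocks” on its own before it is combined with everything else, here is a minimal sketch that does nothing but swap the <item> tag for a <div>. The object name lib.itemToDiv is hypothetical and only serves as an illustration; the sketch assumes the XML is handed in as the current value. It uses callRecursive.alternativeWrap from TSRef, whereas the full setup needs stdWrap instead because it has to load further content objects.

```typoscript
# hypothetical object name, used for illustration only
lib.itemToDiv = TEXT
lib.itemToDiv {
  # work on the value that is handed in as "current"
  current = 1
  parseFunc {
    # split the input at <item> blocks
    externalBlocks = item
    externalBlocks.item {
      # send the content of each block through this parseFunc again
      callRecursive = 1
      # emit <div> in place of the original <item> tag
      callRecursive.alternativeWrap = <div> | </div>
    }
  }
}
```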
The next step shows you the finished setup:
page.10 = XPATH
page.10 {
  # first we set the source to the news feed URL
  source = http://news.typo3.org/rss.xml
  # each news entry is wrapped in <item> tags; fetch them with an XPath expression
  expression = //item
  # return the <item>s as XML, which is going to be formatted with parseFunc later on
  return = xml
  # configure the resultObj
  resultObj {
    cObjNum = 1
    1.current = 1
    1.parseFunc {
      # use externalBlocks to select the <item> tags
      externalBlocks = item
      externalBlocks.item {
        # and send their content once more into parseFunc
        callRecursive = 1
        # take out the <item> tag
        callRecursive.dontWrapSelf = 1
        # use stdWrap to wrap with <div>
        stdWrap {
          wrap = <div> | </div>
          # and now load a COA to work on the rest of the XML content
          cObject = COA
          cObject {
            # get the current XML data first
            5 = LOAD_REGISTER
            5.item.data = current:1
            # and now use some XPATH cObjects to select the content; <title> first
            10 = XPATH
            10 {
              # item register from 5
              source.data = register:item
              return = string
              expression = //title
              resultObj {
                cObjNum = 1
                1.wrap = <h1>|</h1>
                1.current = 1
              }
            }
            # <category> collection next
            15 < .10
            15 {
              expression = //category
              resultObj {
                # use option split, so the last <category> doesn't get a trailing comma
                cObjNum = |*|1|*|2
                1.wrap >
                1.noTrimWrap = |<span class="category">|</span>, |
                2.current = 1
                2.wrap = <span class="category">|</span>
                stdWrap.noTrimWrap = |<p>Tags: |</p>|
              }
            }
            # next select the <description> and wrap it in <p>
            20 < .10
            20 {
              expression = //description
              resultObj.1.wrap = <p>|</p>
            }
            # and finally select the <link> and wrap it in an <a> tag
            30 < .10
            30 {
              expression = //link
              resultObj.1.wrap = <p><a href="|">Read more...</a></p>
            }
          }
        }
      }
    }
  }
}
Admittedly, this is quite a bit of TypoScript ;) On the other hand, it uses only standard functionality and at the same time demonstrates how you can “chain” XPATH objects to work flexibly on your XML data.
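The chain is easy to extend with the same pattern. As a sketch, the following adds the <pubDate> element from the sample item as one more link in the chain. The position 40 and the class name “date” are illustrative choices that do not appear in the setup above, and the date is output as the raw string from the feed without any reformatting.

```typoscript
page.10.resultObj.1.parseFunc.externalBlocks.item.stdWrap.cObject {
  # hypothetical extra step: copy the <title> object and point it at <pubDate>
  40 < .10
  40 {
    expression = //pubDate
    # the class name "date" is an illustrative choice
    resultObj.1.wrap = <p class="date">|</p>
  }
}
```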
The transformation could have been achieved much more simply with an XSL stylesheet. This is precisely what the XSLT content object is all about. Check it out in TER; you’ll find a tutorial very similar to this one in which the transformation is done with an XSL stylesheet.