Thursday, December 1, 2011

PHP and RSS: Getting it together


RSS is one of the most common TLAs around (TLA stands for Three Letter Acronym). As an acronym, RSS has stood for various things, but the current expansion is Really Simple Syndication, the most recent variation of this very common and very useful standard.
Back when the Internet was young(er), a piece of software called PointCast pushed data to a screensaver application on a user's computer, providing news updates of all kinds. Browser developers such as Netscape and Microsoft worked to create something similar to this immensely popular service. Netscape produced the most widely accepted variant, which was released into the development wilds of the Internet and eventually became the RSS of today.
RSS distributes recently updated information to many receivers, much like a broadcast system. Once you have a substantial number of users, the RSS feed acts like a beacon to draw your users back to look at updates. It is little wonder that RSS has increased in popularity and use among content providers, as it provides a much needed method of maintaining an audience's attention.
When you see the icon in Figure 1, you can bet that an RSS feed is available on that site. This icon is the de facto standard for indicating that a site offers RSS updates. The curved lines represent radio waves, a symbol of the broadcast nature of the RSS feed.

A good number of applications, many of them free, can read an RSS feed and many of them allow you to aggregate the feeds. The aggregation features allow a user even further refinement over the amount and nature of content they receive. Each reader has different features, designed to help make sense of the incredible amount of information coming from the Internet.
Some examples are Thunderbird and Firefox by Mozilla, Internet Explorer 7 and upcoming versions of Office by Microsoft and many others, as close to you as the nearest search engine. With all the various ways to get and read feeds, it is very likely that you will find something that suits you. Unless of course you are a picky software developer and want to write your own! This article will get into that soon enough!

Your site has content that you want to get out to the masses, which is why you put it on the Internet in the first place. Once a substantial number of users know about your site and content, will they come back each day to check for updates? Probably not. Of all the sites you frequent, do you go to each one daily to check for updates? Probably not. This is where RSS comes in.
For your users, RSS can be a huge benefit, especially if they value opinions or news listed on your site. Without having to return to your site frequently, they will know exactly when you update or add content, allowing them to save time and effort, and they won't miss anything either!
Content generation isn't a problem, if you incorporate RSS feeds to fuel content aggregation for your own site. If you pull data off a feed and include it in your site, it can add a good amount of content to your site with only a little bit of time investment.
Personally, I like to use RSS to follow filtered results from various sites such as Craigslist (www.craigslist.org). One trick I use is for shopping for used electronics. You can set up a site search and then subscribe to the RSS feed for the results page. If you set up a feed for a search for cameras within a certain price range, your RSS reader shows you when anyone posts a camera for sale within that range. This gives you a big advantage when you are trying to be the first bidder!

The RSS standard defines how the content of a feed is structured. A feed can come from any data source that describes Internet documents and, in a very basic sense, is a list of links and their descriptions.
Look at the RSS format in Listing 1, which uses a sample document from the NASA "Liftoff News" feed.

Listing 1. A sample RSS 2.0 document
                
<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Liftoff News</title>
    <link>http://liftoff.msfc.nasa.gov/</link>
    <description>Liftoff to Space Exploration.</description>
    <language>en-us</language>
    <pubDate>Tue, 10 Jun 2003 04:00:00 GMT</pubDate>
    <lastBuildDate>Tue, 10 Jun 2003 09:41:01 GMT</lastBuildDate>
    <docs>http://blogs.law.harvard.edu/tech/rss</docs>
    <generator>Weblog Editor 2.0</generator>
    <managingEditor>editor@example.com</managingEditor>
    <webMaster>webmaster@example.com</webMaster>

    <item>
      <title>Star City</title>
      <link>http://liftoff.msfc.nasa.gov/news/2003/news-starcity.asp</link>
      <description>How do Americans get ready to work with Russians aboard the
        International Space Station? They take a crash course in culture, language
        and protocol at Russia's Star City.</description>
      <pubDate>Tue, 03 Jun 2003 09:39:21 GMT</pubDate>
      <guid>http://liftoff.msfc.nasa.gov/2003/06/03.html#item573</guid>
    </item>
    
    <item>
      <title>Space Exploration</title>
      <link>http://liftoff.msfc.nasa.gov/</link>
      <description>Sky watchers in Europe, Asia, and parts of Alaska and Canada
        will experience a partial eclipse of the Sun on Saturday, May 31st.</description>
      <pubDate>Fri, 30 May 2003 11:06:42 GMT</pubDate>
      <guid>http://liftoff.msfc.nasa.gov/2003/05/30.html#item572</guid>
    </item>
    
    <item>
      <title>The Engine That Does More</title>
      <link>http://liftoff.msfc.nasa.gov/news/2003/news-VASIMR.asp</link>
      <description>Before man travels to Mars, NASA hopes to design new engines
        that will let us fly through the Solar System more quickly.  The proposed
        VASIMR engine would do that.</description>
      <pubDate>Tue, 27 May 2003 08:37:32 GMT</pubDate>
      <guid>http://liftoff.msfc.nasa.gov/2003/05/27.html#item571</guid>
    </item>
    
    <item>
      <title>Astronauts' Dirty Laundry</title>
      <link>http://liftoff.msfc.nasa.gov/news/2003/news-laundry.asp</link>
      <description>Compared to earlier spacecraft, the International Space
        Station has many luxuries, but laundry facilities are not one of them.
        Instead, astronauts have other options.</description>
      <pubDate>Tue, 20 May 2003 08:56:02 GMT</pubDate>
      <guid>http://liftoff.msfc.nasa.gov/2003/05/20.html#item570</guid>
    </item>
  </channel>
</rss>

The first child object of the XML-formatted document is the definition of a <channel>. A channel is simply the feed itself and its associated information. An RSS 2.0 document contains exactly one channel object; if you want to separate content by an arbitrary filter, publish a separate feed for each. The title, link, and description objects are required by the channel object. They define the basic descriptive information about the feed. The optional objects are: language, copyright, managingEditor, webMaster, pubDate, lastBuildDate, category, generator, docs, cloud, ttl, image, rating, textInput, skipHours, and skipDays.
A channel can contain any number of items. All child elements of an item are optional; however, at least one of title or description must be present for the item to validate. The elements are: title, link, description, author, category, comments, enclosure, guid, pubDate, and source.
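To make these rules concrete, here is a minimal sketch that checks a feed against them using PHP's built-in SimpleXML extension. The function name checkRssStructure() is hypothetical (it is not part of PHP or PEAR), and the sample feed is a cut-down version of Listing 1 with one deliberately incomplete item.

```php
<?php
// Hypothetical helper (not part of PHP or PEAR): returns a list of
// problems found in an RSS 2.0 document, or an empty array if none.
function checkRssStructure(string $xml): array
{
    $problems = [];
    $rss = simplexml_load_string($xml);
    if ($rss === false || !isset($rss->channel)) {
        return ["not an RSS document with a <channel>"];
    }

    // The channel must carry title, link, and description.
    foreach (["title", "link", "description"] as $name) {
        if (!isset($rss->channel->$name)) {
            $problems[] = "channel is missing <$name>";
        }
    }

    // Every item needs at least a title or a description.
    $position = 0;
    foreach ($rss->channel->item as $item) {
        $position++;
        if (!isset($item->title) && !isset($item->description)) {
            $problems[] = "item $position has neither <title> nor <description>";
        }
    }
    return $problems;
}

// Cut-down sample feed; the second item breaks the title-or-description rule.
$feed = '<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Liftoff News</title>
    <link>http://liftoff.msfc.nasa.gov/</link>
    <description>Liftoff to Space Exploration.</description>
    <item><title>Star City</title></item>
    <item><link>http://liftoff.msfc.nasa.gov/</link></item>
  </channel>
</rss>';

print_r(checkRssStructure($feed));
```

The check only covers the rules described above; it is not a full validator for the RSS 2.0 specification.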

This article assumes that you have some experience using PHP already and can use a function to pass a variable and return a result. PHP has many functions that make short work of XML both in and out of an application.
First you want to take information from a locally stored data source, a content management system, blog or any content which fits the format of an Internet document and put that out as a feed to your users. You will need to get this data, format it into an RSS object, and serve requests for it.
Portions of your site require additional content and rather than go out into the world seeking additional content documents for your site, you can take advantage of the multitudes of RSS feeds already prepared. You will use XML_RSS to get and handle these feeds for your site.
XML_RSS is a PEAR package that helps you get through the more complex tasks of interpreting an XML RSS file more easily. PEAR is an open source library of PHP code which is free for your use and under continual development. You might already have PEAR installed with your PHP installation, but you might need to install it for this article (see Resources for a link). XML_RSS is simply a class which, given the location of an RSS feed, will load the XML of the feed into arrays, ready for use in your PHP application. The elements of each array have named keys, associated with the elements and attributes of the RSS file read.

Now that you know what the RSS data format is, you can look at the data you want to hand out to the world, and put it in that format. Thankfully PHP has some powerful RSS and XML handling features to speed your development along. Like many of the common Web standards, PHP has a number of great functions ready for use in this application.

Getting the word out

You create a feed in order for others to read it, but how do you let people know it exists? You can tell Mozilla Firefox and Microsoft Internet Explorer, as well as other readers, about your feed by adding the following tag to the <head> section of your home page:
<link rel="alternate" type="application/rss+xml" 
href="URL_FOR_YOUR_FEED" title="FEED_TITLE" />


Make sure to update the tag with your URL and feed title.
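If your pages are generated by PHP, one way to avoid typos in this tag is to build it with a small helper. The following is only a sketch: feedLinkTag() is a hypothetical function name, and the URL and title in the usage line are placeholders. It escapes both values so that an ampersand in the feed URL or a quote in the title does not break the attribute.

```php
<?php
// Hypothetical helper: builds the autodiscovery <link> tag for a feed.
function feedLinkTag(string $feedUrl, string $feedTitle): string
{
    return sprintf(
        '<link rel="alternate" type="application/rss+xml" href="%s" title="%s" />',
        // Escape &, <, > and quotes so the values are safe inside attributes.
        htmlspecialchars($feedUrl, ENT_QUOTES, "UTF-8"),
        htmlspecialchars($feedTitle, ENT_QUOTES, "UTF-8")
    );
}

// Placeholder URL and title; replace them with your own.
echo feedLinkTag("http://www.example.com/RSS.php?lang=en&limit=15", "Our Demo RSS"), "\n";
```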
For this article, you will pull data out of a database and format it into an RSS feed. You will set it up so that whenever the RSS feed is requested, it looks for the most recent additions to your dataset and returns a fresh RSS document to the requester.
The feed can come from any data source on your site, but in the end you need to make sure there is enough data that the person receiving the RSS feed will be able to use it. At a minimum, a URL, a name, and a description are needed. Any data that is published on your site can be turned into a feed.
You will use PHP to connect to your Web application database, pull updated information out, and format it into an XML RSS document.
Assuming you have a database of choice, you will create a connection as normal, and generate a page displaying the XML laid out in a user readable fashion.
Now that you have the data all well formatted in your own code, you need to make sure you hand the data out properly so when someone inputs your URL into their reader, they will get the XML RSS feed they expect (see Listing 2).

Listing 2. The complete RSS.php
                
<?php

// Note: the mysql_* functions used here were removed in PHP 7;
// on current PHP versions, use the mysqli or PDO equivalents.
$database =  "nameofthedatabase";
$dbconnect = mysql_pconnect("localhost", "dbuser", "dbpassword");
mysql_select_db($database, $dbconnect);
$query = "select link, headline, description from `headlines` limit 15";
$result = mysql_query($query, $dbconnect);

// Copy each row of the result into a simple array.
$return = array();
while ($line = mysql_fetch_assoc($result))
        {
            $return[] = $line;
        }

// DATE_RSS produces the RFC 822 style date that RSS expects.
$now = date(DATE_RSS);

$output = "<?xml version=\"1.0\"?>
            <rss version=\"2.0\">
                <channel>
                    <title>Our Demo RSS</title>
                    <link>http://www.tracypeterson.com/RSS/RSS.php</link>
                    <description>A Test RSS</description>
                    <language>en-us</language>
                    <pubDate>$now</pubDate>
                    <lastBuildDate>$now</lastBuildDate>
                    <docs>http://someurl.com</docs>
                    <managingEditor>you@youremail.com</managingEditor>
                    <webMaster>you@youremail.com</webMaster>
            ";

// htmlspecialchars() escapes only the characters that are special in XML;
// the named entities that htmlentities() emits (such as &eacute;) are not
// defined in XML and would make the feed invalid.
foreach ($return as $line)
{
    $output .= "<item><title>".htmlspecialchars($line['headline'])."</title>
                    <link>".htmlspecialchars($line['link'])."</link>
                    <description>".htmlspecialchars(strip_tags($line['description']))."</description>
                </item>";
}
$output .= "</channel></rss>";
header("Content-Type: application/rss+xml");
echo $output;
?>

So let's go through this step by step. First, you set up a connection to a local database. In that database, you have a table with records containing headline, link, and description fields, which you will request to put into your XML response. You execute an SQL query against your table with mysql_query() and then use a while loop to walk through the result, copying each row into a new simple array.
With the new array ready, you start to build the XML file in the $output variable, appending a new item element for each row as you walk through the array with foreach, where $line holds the current row. This shouldn't take too long because back in your SQL statement you limited the responses to 15. To use this code fragment as a starter building block, you will need to replace the dummy links, database name, and login information to reflect your own environment.
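Building XML by string concatenation works, but every value has to be escaped by hand. As an alternative sketch (not the approach Listing 2 takes), PHP's XMLWriter extension can assemble the same structure and handle escaping itself. The function name buildFeed() and the sample row are assumptions made for illustration; the row stands in for what the database query would return, and only the three required channel elements are written to keep the example short.

```php
<?php
// Hypothetical helper: builds an RSS 2.0 document from an array of rows
// shaped like the ones fetched in Listing 2 (headline, link, description).
function buildFeed(array $rows): string
{
    $writer = new XMLWriter();
    $writer->openMemory();
    $writer->setIndent(true);
    $writer->startDocument("1.0", "UTF-8");

    $writer->startElement("rss");
    $writer->writeAttribute("version", "2.0");
    $writer->startElement("channel");
    $writer->writeElement("title", "Our Demo RSS");
    $writer->writeElement("link", "http://www.tracypeterson.com/RSS/RSS.php");
    $writer->writeElement("description", "A Test RSS");

    foreach ($rows as $row) {
        $writer->startElement("item");
        // writeElement() escapes &, < and > in the text content.
        $writer->writeElement("title", $row["headline"]);
        $writer->writeElement("link", $row["link"]);
        $writer->writeElement("description", strip_tags($row["description"]));
        $writer->endElement(); // item
    }

    $writer->endElement(); // channel
    $writer->endElement(); // rss
    $writer->endDocument();
    return $writer->outputMemory();
}

// Sample row standing in for the database result.
$rows = [
    [
        "headline"    => "Cameras & lenses",
        "link"        => "http://www.example.com/?id=1&view=full",
        "description" => "<p>Used, but works & looks fine.</p>",
    ],
];

echo buildFeed($rows);
```

To serve the result as a feed, you would send the Content-Type header shown in Listing 2 before echoing the string.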
When this script is executed, you get a nice clean RSS file output similar to that in Listing 3.

Listing 3. RSS.php output
                
<?xml version="1.0"?>
    <rss version="2.0">
        <channel>
            <title>Our Demo RSS</title>
            <link>http://www.tracypeterson.com/RSS/RSS.php</link>
            <description>A Test RSS</description>
            <language>en-us</language>
            <pubDate>Mon, 13 Nov 2006 22:46:06 -0800</pubDate>
            <lastBuildDate>Mon, 13 Nov 2006 22:46:06 -0800</lastBuildDate>
            <docs>http://someurl.com</docs>
            <managingEditor>you@youremail.com</managingEditor>
            <webMaster>you@youremail.com</webMaster>
        <item>
            <title>This is Tracy's Web Page!</title>
            <link>http://www.tracypeterson.com/</link>
            <description>This is a demonstration of how to get PHP to work for your RSS feed.</description>
        </item><item>
            <title>This is Tracy's site again!</title>
            <link>http://www.tracypeterson.com</link>
            <description>Again, this is a demonstration of the power of PHP coupled with RSS.</description>
        </item></channel></rss>

Anyone can now enter the URL to RSS.php and load up a fresh RSS file with all your great new content contained within!

You will use the XML_RSS class to get RSS feeds into your PHP scripts, ready for use like any other array. Just as with a query to a database, you end up with an array, ready to use as you see fit.
In this case, you will connect to RSS.php and load up a copy, displaying it in an unordered list (see Listing 4).

Listing 4. showfeed.php
                
<?php
require_once "XML/RSS.php";

// Create the parser for the feed. The older "=& new" form is a
// parse error in PHP 7 and later, so plain assignment is used.
$rss = new XML_RSS("http://www.tracypeterson.com/RSS/RSS.php");
$rss->parse();

echo "<h1>Headlines from <a href=\"http://www.tracypeterson.com/RSS/RSS.php\">Tracy Peterson's Site</a></h1>\n";
echo "<ul>\n";

// Print one list entry per item in the feed.
foreach ($rss->getItems() as $item) {
  echo "<li><a href=\"" . $item['link'] . "\">" . $item['title'] . "</a></li>\n";
}
echo "</ul>\n";
?>

The example shown in Listing 4 comes directly from the PEAR manual, and I used it because it is so concise. Let's go through it line by line and see that it really only uses a few of the methods available in the XML_RSS class: the constructor, parse(), and getItems(). The parse() method reads the feed and stores its contents in the arrays mentioned before.
First, you use require_once to load the RSS.php file from your PEAR installation. If PEAR is set up properly and XML_RSS is installed, PHP will find this include file and you will have the XML_RSS class ready for your use. Next, you create a new object called $rss by passing the URL of the feed to the XML_RSS constructor.
You then call the parse() method to read the values in the RSS feed. The first echo line begins to set up the basic HTML you use to make the RSS feed human readable. In this case you announce that the unordered list is a list of headlines from my site!
The foreach statement gets each item element from the parsed feed, using the getItems() method, as a new array $item. Each of the array's elements is keyed by the name of the XML tag that contained it. In this case you only use link and title; in a moment you will add description to explore this point. Each time the foreach loop runs, it moves to the next item until the entire RSS feed is laid out in this fashion.
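Values taken from someone else's feed can contain characters such as & or <, so it is worth escaping them before writing them into your page. The following sketch shows that idea with PHP's built-in SimpleXML extension rather than XML_RSS, so it runs without PEAR. The function name feedToHtmlList() and the inline sample feed are assumptions made for illustration; in practice the XML string would be the contents fetched from a feed URL.

```php
<?php
// Hypothetical helper: turns the items of an RSS 2.0 document into an
// HTML unordered list, escaping every value taken from the feed.
function feedToHtmlList(string $xml): string
{
    $rss = simplexml_load_string($xml);
    if ($rss === false || !isset($rss->channel)) {
        return "";
    }
    $html = "<ul>\n";
    foreach ($rss->channel->item as $item) {
        $link  = htmlspecialchars((string) $item->link, ENT_QUOTES, "UTF-8");
        $title = htmlspecialchars((string) $item->title, ENT_QUOTES, "UTF-8");
        $html .= "<li><a href=\"$link\">$title</a></li>\n";
    }
    return $html . "</ul>\n";
}

// Inline sample feed; the title contains characters that need escaping.
$feed = '<rss version="2.0"><channel>
  <title>Our Demo RSS</title>
  <link>http://www.tracypeterson.com/RSS/RSS.php</link>
  <description>A Test RSS</description>
  <item>
    <title>Tom &amp; Jerry &lt;live&gt;</title>
    <link>http://www.tracypeterson.com/</link>
  </item>
</channel></rss>';

echo feedToHtmlList($feed);
```

The same htmlspecialchars() calls can be wrapped around $item['link'] and $item['title'] in the XML_RSS loop.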
Now, add descriptions to each of your displayed results.
Inside the foreach loop, add the second echo statement shown in Listing 5.

Listing 5. Adding the description
                
foreach ($rss->getItems() as $item) {
  echo "<li><a href=\"" . $item['link'] . "\">" . $item['title'] . "</a></li><br>";
  echo $item['description'] . "<br><br>\n";
}

You simply add a line break and description line to the unordered list. Below is an example of the output of showfeed.php.

Figure 2. showfeed.php output