
Man Page for rssfeed.pl



NAME

 rssfeed.pl - Generate RSS feed from web pages


SYNOPSIS

option 1: rssfeed [<newsfile> [<rssfile>]]

  If a newsfile name is given, that file is read instead of the
  file named in the config section.
  If an rssfile name is given, the rss output is written to that file.

option 2: rssfeed [<newsfilename> [<rssfilename>]] < <newsfile>

  newsfilename is optional. If it is not present, the updated html goes
  to the newsfile named in the config section.
  If rssfilename is present, that file is used for the rss output;
  otherwise the rssfile named in the config section is used.

option 3: cat xx | rssfeed [<newsfilename> [<rssfilename>]]

  Like option 2, but the input comes from a pipe.
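
The three forms above differ only in where the html is read from and where the two outputs go. A minimal Python sketch of that decision follows. It is not taken from rssfeed.pl; the function name resolve_files, the stdin_is_piped flag, and the config keys are assumptions made for illustration.

```python
def resolve_files(args, stdin_is_piped, config):
    """Return (source, html_out, rss_out) for the three SYNOPSIS forms.

    args           -- positional arguments after the options (0, 1 or 2 names)
    stdin_is_piped -- True for options 2 and 3 (redirect or pipe)
    config         -- dict with hypothetical 'newsfile' and 'rssfile' keys
    """
    # First argument overrides the news file, second overrides the rss file.
    html_out = args[0] if len(args) > 0 else config["newsfile"]
    rss_out = args[1] if len(args) > 1 else config["rssfile"]
    # With piped input the html is read from stdin; otherwise the
    # news file is both the input and the file that gets updated.
    source = "<stdin>" if stdin_is_piped else html_out
    return source, html_out, rss_out
```

A caller could set stdin_is_piped from `not sys.stdin.isatty()`, although how the real script detects piped input is not stated here.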


OPTIONS

-h --help

   Display help

-m --man

   Display a full man page

-c configfile --config=configfile

   Use the named file as the configuration file. For example: rssfeed.pl --config=path/test.config

-n --noesc

   Do not escape < or >. If not set, < becomes &lt; and > becomes &gt;.

-r --resetdate

   Do not use the value in date="..." if it exists; use today's date and time instead.
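
The effect of --noesc and --resetdate can be sketched in Python as below. This is an illustration under assumptions, not the script's own code: the function names are hypothetical, and the date format is taken from the pubDate example in the DESCRIPTION section.

```python
from email.utils import formatdate


def escape_angles(text, noesc=False):
    """Escape < and > unless noesc is set (mirrors -n / --noesc)."""
    if noesc:
        return text
    return text.replace("<", "&lt;").replace(">", "&gt;")


def pub_date(date_attr=None, resetdate=False):
    """Pick the item date (mirrors -r / --resetdate).

    An existing date="..." value wins unless resetdate is set;
    otherwise the current time is used, formatted like
    'Sun, 26 Apr 2009 19:58:59 GMT'.
    """
    if date_attr and not resetdate:
        return date_attr
    return formatdate(usegmt=True)
```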


DESCRIPTION

This program reads a news (html) file and looks for <rssfeed> tags, which should be inside html comments like this:

   <!-- <rssfeed> --> some html <!-- </rssfeed> -->

Strictly speaking, the rssfeed tag does not need to be inside comments. You can also have:

   <!-- <rssfeed> then html which is inside the comment </rssfeed> -->

which lets you have code that does not appear on the web page.

This program uses the <h2> element as the <title> element of the rss item.

If there is an <a name=...> tag, the text of the name is appended to the link with a # so the link goes directly to the anchor.
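
The extraction described above can be sketched in Python as follows. The regular expressions and the function name extract_items are assumptions for illustration; the real script may match the tags differently, and attribute handling is left out here.

```python
import re

# Hypothetical patterns; the real script may match differently.
BLOCK_RE = re.compile(r"<rssfeed\b([^>]*)>(.*?)</rssfeed>", re.S | re.I)
H2_RE = re.compile(r"<h2[^>]*>(.*?)</h2>", re.S | re.I)
ANCHOR_RE = re.compile(r"<a\s+name\s*=\s*[\"']?([^\"'\s>]+)", re.I)


def extract_items(html, default_link):
    """Return one dict per <rssfeed> block found in html."""
    items = []
    for _attrs, body in BLOCK_RE.findall(html):
        # Drop the comment delimiters that hug the tags:
        # a leading '-->' and a trailing '<!--'.
        body = re.sub(r"^\s*-->|<!--\s*$", "", body).strip()
        h2 = H2_RE.search(body)
        anchor = ANCHOR_RE.search(body)
        link = default_link
        if anchor:
            # Append the anchor name so the link jumps to it.
            link += "#" + anchor.group(1)
        items.append({
            "title": h2.group(1).strip() if h2 else "",
            "link": link,
            "description": body,
        })
    return items
```

Here default_link stands in for the link set in the configuration section.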

The program creates a temp file, news.php.rss, in which each <rssfeed> tag is replaced with <rssfeed date='...'>, where the date is the time this program was run. If the news.php file already has the date='...' attribute on the rssfeed tag, then that date is used instead of the current date. When the program is done, it copies the news.php file to news.php.old and then moves the news.php.rss file to replace news.php.
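
The stamp, back up, and replace sequence above can be sketched in Python like this. It is a sketch under assumptions rather than the script's implementation: the function name stamp_and_rotate and the regular expression are hypothetical, and only the file naming (.rss and .old suffixes) comes from the description.

```python
import os
import re
import shutil
from email.utils import formatdate


def stamp_and_rotate(newsfile, now=None):
    """Add date='...' to undated <rssfeed> tags and rotate the files."""
    now = now or formatdate(usegmt=True)
    with open(newsfile, encoding="utf-8") as f:
        html = f.read()

    def add_date(match):
        attrs = match.group(1)
        # Keep a tag that already carries a date attribute.
        if re.search(r"\bdate\s*=", attrs):
            return match.group(0)
        return f"<rssfeed{attrs} date='{now}'>"

    # Matches opening tags only; '</rssfeed>' starts with '</'.
    stamped = re.sub(r"<rssfeed\b([^>]*)>", add_date, html)

    tmp = newsfile + ".rss"
    with open(tmp, "w", encoding="utf-8") as f:
        f.write(stamped)
    shutil.copyfile(newsfile, newsfile + ".old")  # backup of the original
    os.replace(tmp, newsfile)                     # temp file becomes the news file
    return stamped
```

Writing to a temp file first means the original is only replaced once the new content is fully on disk.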

The <rssfeed> tag can take several other attributes:

  date="..."     article date
  title="..."    article title
  url="..."      the base url of the target page
  page="..."     the page file name
  anchor="..."   the anchor name
  noesc          do not escape html codes

Each of these attributes takes the place of the corresponding item found between the <rssfeed> tags. For example:

  <rssfeed url="http://www.xyz.com" page="XYZ.php" anchor="this" title="XYZ test" date="Sun, 26 Apr 2009 19:58:59 GMT">
  <h2>Some text here</h2>
  <p>Some more text as a description</p>
  </rssfeed>

This section of code would produce the following <item> section in the rssfeed.xml file:

  <item>
    <title>XYZ test</title>
    <link>http://www.xyz.com/XYZ.php#this</link>
    <description><h2>Some text here</h2><p>Some more text as a description</p> </description>
    <pubDate>Sun, 26 Apr 2009 19:58:59 GMT</pubDate>
  </item>

The <h2> tag is ignored as a title if the title attribute is provided. The same goes for the other attributes. The url attribute takes the place of the default link set in the configuration section, which lets you have <rssfeed> tags in one file that reference another file or site.
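
The override rule can be sketched in Python as below. The names parse_attrs, resolve_item, default_url and default_page are hypothetical, and the way the link is joined (url, then /, then page, then # and anchor) is inferred from the example above rather than from the script itself.

```python
import re

# Matches name="value" or name='value' pairs inside the tag.
ATTR_RE = re.compile(r"""(\w+)\s*=\s*(?:"([^"]*)"|'([^']*)')""")


def parse_attrs(tag_attrs):
    """Turn the text inside <rssfeed ...> into a dict of attributes."""
    attrs = {}
    for name, double_quoted, single_quoted in ATTR_RE.findall(tag_attrs):
        attrs[name.lower()] = double_quoted or single_quoted
    return attrs


def resolve_item(attrs, h2_title, default_url, default_page):
    """Attributes win over values found in the body or the config."""
    url = attrs.get("url", default_url).rstrip("/")
    page = attrs.get("page", default_page)
    link = f"{url}/{page}"
    if attrs.get("anchor"):
        link += "#" + attrs["anchor"]
    return {
        "title": attrs.get("title", h2_title),
        "link": link,
        "pubDate": attrs.get("date"),  # None means: use the current time
    }
```

With the attributes from the example above, this yields the title "XYZ test" and the link http://www.xyz.com/XYZ.php#this, matching the <item> shown.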


EXAMPLES

rssfeed.pl

   The default behavior: the files named in the configuration file, or the defaults, are used.

rssfeed.pl def.html

  The file 'def.html' is read instead of the 'newsfile' named in the configuration file or default. The file 'def.html' is
  updated and 'def.html.old' is the backup. The rss feed goes into the file named in the configuration.

rssfeed.pl def.html abc.xml

  Like above but the rss feed goes into 'abc.xml'.

rssfeed.pl xyz.html < def.html

  The file to be parsed is 'def.html', the rss feed output goes to the file named in the configuration file or defaults,
  and the new html goes to 'xyz.html'.

rssfeed.pl xyz.html abc.xml < def.html

  The file to be parsed is 'def.html', the rss feed output goes to 'abc.xml', and the new html goes to 'xyz.html'.

wget -O - http://localhost/def.php | rssfeed.pl

  If you have rssfeed tags that are generated dynamically, you can pipe the output from the webpage to rssfeed.pl.
  Assuming the configuration file or defaults are set to 'newsfile=webpath/def.php' and 'rssfile=webpath/abc.xml',
  the rss output would go to webpath/abc.xml, the file webpath/def.php would be updated, and a backup file
  webpath/def.php.old would be created.


FILES

rssfeed.config

   The default configuration file. It should be in the same directory as rssfeed.pl. It can be created by cutting and pasting the default configuration from the script and changing the variables to fit your site.


SEE ALSO

http://www.bartonphillips.com


NOTES

The <rssfeed ...> tag can be split over several lines; however, no other html may follow the ending > on the same line. If the <rssfeed> tag is inside a comment, the end comment (-->) can be on the same line as the ending > of the tag.

This is OK:

   <!--
   <rssfeed
   title="Hi There">
   -->

This is NOT OK:

   <!-- <rssfeed title="Hi There"> --> <p>Some html on the same line</p>

I guess this could be thought of as a BUG but I like to think of it as a feature:)
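
A small Python check for this rule might look like the sketch below. It is not part of rssfeed.pl; the function name misplaced_tags and the pattern are assumptions, and it treats anything other than an optional end comment after the ending > as a violation.

```python
import re

# Opening tag (possibly spanning lines) plus the rest of the line it ends on.
OPEN_TAG_RE = re.compile(r"<rssfeed\b[^>]*>([^\n]*)", re.I)


def misplaced_tags(html):
    """Return the opening <rssfeed> tags that have html after them."""
    bad = []
    for match in OPEN_TAG_RE.finditer(html):
        rest_of_line = match.group(1)
        # Only whitespace and an optional '-->' may follow the tag.
        if not re.fullmatch(r"\s*(-->)?\s*", rest_of_line):
            bad.append(match.group(0))
    return bad
```

Running it over a news file before rssfeed.pl would list the tags that need a line break after them.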


BUGS

Probably. If you find any, please let me know at the email addresses below. Thanks.


AUTHOR

Barton Phillips


Last Modified May 02, 2010 15:44:14 MDT