How to Import and Export RSS Feeds

Wednesday 20 August 2008 12:14:00 pm

As explained earlier, cronjobs are a way to automatically run scripts on a server, and enable some of eZ Publish's most useful features such as RSS importing.

There are two parts to configuring these automated functions:

  • Configuring eZ Publish cronjob settings
  • Configuring your server's cronjob settings

These steps and the overall cronjob process are shown in the following diagram:

eZ Publish cronjob overview

Configuring eZ Publish cronjob settings

The core eZ Publish script for executing cronjob activities is runcronjobs.php in the root eZ Publish folder. The runcronjobs.php script can be run with parameters, in order to run different sets of eZ Publish cronjob activities, according to how often they should be run. For example, in the broader scope of eZ Publish, you may decide that workflows and notifications will be run every three hours (using a command such as runcronjobs.php frequent), and RSS importing once a day (using a command such as runcronjobs.php rssimport).

Different sets of cronjob activities are called "cronjob parts", and are set in cronjob.ini. Examples of cronjob parts are shown below:

[CronjobPart-infrequent]
Scripts[]=basket_cleanup.php
Scripts[]=linkcheck.php
 
[CronjobPart-frequent]
Scripts[]=notification.php
Scripts[]=workflow.php

To configure a cronjob part specifically for RSS importing, create a settings override file /settings/override/cronjob.ini.append.php with the following content:

[CronjobPart-rssimport]
Scripts[]=rssimport.php

This creates a cronjob part called "rssimport" that executes the script for importing RSS items (which we will configure shortly). With this done, we can give the runcronjobs.php script a parameter that tells it to only run the cronjobs defined under the “rssimport” cronjob part: runcronjobs.php rssimport.

Remember to clear your site's INI cache for the INI changes to take effect (see the Cache window on the right of your site's Administration Interface). For more information on runcronjobs.php parameters, see the eZ Publish documentation on the cronjob script.

Configuring cronjob settings on your web server

This section shows how to create a cronjob using cPanel. Cronjobs can also be created and configured from your server's command line (do a web search for “crontab”, or see the eZ Publish documentation on cronjobs).

Many web hosting providers offer cPanel (or a similar web-based interface) as a way to manage and configure the hosting environment.

The URL for cPanel may be something like http://www.(yourwebsitedomain).com/cpanel.

Once you are logged in, click on the Cron jobs item, which will bring up a portal page similar to the screenshot below.

cPanel cronjob page

We will walk you through the "Standard" method of setting up a cronjob. The details of a cronjob can then be specified as shown below.

cPanel cronjob details

There are two elements:

  1. The “Command to run”, which specifies what the server should do
  2. The frequency with which that command should be run – as specified in minutes, hours, days, and/or months

There will be one set of these entries for each cronjob that has been configured for your web hosting service. If there are any existing entries, it will probably be best to leave these and add a new one; check with an expert if you are unsure. You can also use multiple entries to run different eZ Publish cronjob tasks (if you have defined multiple cronjob parts) at different times.

The full text of the “Command to run” used here is shown below, with brief explanations of each part.

Cronjob command breakdown

Expect this command to be slightly different for your site’s server. Translated into human terms, it says “change to the root directory of the eZ Publish installation, and using PHP (located at the path specified), run the runcronjobs.php script (with some parameters)”.
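As a rough illustration of how the parts described above fit together, the sketch below assembles a command of that shape from its pieces and prints it, so the result can be pasted into the "Command to run" field. The installation path and PHP path are hypothetical placeholders, and the exact command used on your server may differ (for example, it may include extra parameters).

```shell
# Hypothetical values: replace them with the paths used by your hosting provider.
EZP_ROOT="/home/youruser/public_html"   # root folder of the eZ Publish installation (assumed)
PHP_BIN="/usr/local/bin/php"            # command-line PHP binary (assumed)
CRON_PART="rssimport"                   # cronjob part defined in cronjob.ini.append.php

# "Change to the eZ Publish root, then use PHP to run runcronjobs.php with a parameter."
CRON_COMMAND="cd ${EZP_ROOT} && ${PHP_BIN} runcronjobs.php ${CRON_PART}"

# Print the assembled command instead of running it.
echo "${CRON_COMMAND}"
```

Printing the command rather than executing it keeps the sketch safe to try on any machine; your hosting provider can usually tell you the correct PHP path.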

Next, configure the time and frequency information for the cronjob. As processing cronjobs places extra load on your server and can impact your site’s performance, give some thought to the details chosen. Check the site from which you are sourcing your RSS feed – it might have some information on how frequently its RSS feeds are updated. Note also that some RSS feeds have a <ttl> (time to live) element that can give information about how often it is useful to run an RSS import cronjob.
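To check whether a feed advertises a <ttl> value, something like the following sketch could be used. It reads the value (which RSS 2.0 expresses in minutes) from a small inline sample feed; the sample content is made up for illustration, and in practice you would first download the real feed to a file or variable.

```shell
# Made-up sample feed, used here so the sketch is self-contained.
feed='<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <ttl>60</ttl>
  </channel>
</rss>'

# Print the number inside the first <ttl> element, if there is one.
ttl_minutes=$(printf '%s\n' "$feed" | sed -n 's:.*<ttl>\([0-9][0-9]*\)</ttl>.*:\1:p' | head -n 1)

if [ -n "$ttl_minutes" ]; then
    echo "The feed suggests refreshing about every ${ttl_minutes} minutes"
else
    echo "The feed does not include a ttl element"
fi
```

A line-based sed match is only a quick check and assumes the element sits on a single line; it is not a full XML parser.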

It is logical to wonder what happens if you import RSS items too frequently: in addition to the load on your server, will you end up with multiple copies of each blog or news item? The answer is no, because eZ Publish skips items that already exist as imported eZ Publish objects.

When you have finished setting the frequency of the cronjob, click the Save Crontab button at the bottom of the page. The cronjob is now set up!
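For readers who manage cron from the server's command line rather than through cPanel, the same setup might look like the crontab entries sketched below. The paths are hypothetical placeholders, and the schedules simply follow the earlier example of running the "frequent" part every three hours and the "rssimport" part once a day; the 4 a.m. time is an arbitrary choice.

```
# minute  hour  day-of-month  month  day-of-week  command
0  */3  *  *  *  cd /home/youruser/public_html && /usr/local/bin/php runcronjobs.php frequent
0  4    *  *  *  cd /home/youruser/public_html && /usr/local/bin/php runcronjobs.php rssimport
```

Each line is one cronjob: five time fields followed by the command to run.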
