PhpBB3.1/RFC/Modular cron

Ticket: http://tracker.phpbb.com/browse/PHPBB3-9596

Description
The new cron architecture moves cron tasks into separate files, each task being represented by a single class. It is now possible to enable "System cron", which allows you to set up a true cron job instead of having users trigger cron tasks.

Motivation
The Olympus implementation is severely limited, as it requires several file modifications to add a new task. With the modular architecture, adding a cron task is as easy as dropping a file into a folder. This gives MOD writers the opportunity to handle parts of their MOD asynchronously and periodically, which improves the user experience because slow tasks can be offloaded to a background process.

Create a task
A cron task must implement the phpbb_cron_task interface. The phpbb_cron_task_base class is an abstract class that provides sensible defaults for the methods defined by the interface. To be loaded automatically, a task must be placed in includes/cron/task/<namespace>/<task>.php and use the class name phpbb_cron_task_<namespace>_<task>. The included cron tasks use the namespace 'core'.
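As a rough illustration of the naming convention, the following shell sketch derives the expected file path from a task class name. The helper name class_to_path is hypothetical. The sketch assumes the namespace is the first underscore-separated segment after the phpbb_cron_task_ prefix (so the namespace itself contains no underscore), which matches the task names used on this page but is not confirmed as a general rule.

```shell
#!/bin/sh
# Hypothetical helper: map a cron task class name to the file it should live in.
# Assumption: the namespace is the first segment after "phpbb_cron_task_".
class_to_path() {
    rest="${1#phpbb_cron_task_}"   # strip the fixed class prefix
    namespace="${rest%%_*}"        # text before the first underscore
    task="${rest#*_}"              # everything after the first underscore
    printf 'includes/cron/task/%s/%s.php\n' "$namespace" "$task"
}

class_to_path phpbb_cron_task_backuper_backup
class_to_path phpbb_cron_task_core_prune_forum
```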

An example cron task for periodically backing up the database could look like this. The filename is includes/cron/task/backuper/backup.php.

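class phpbb_cron_task_backuper_backup extends phpbb_cron_task_base
{
	// Performs the backup and records when it ran.
	public function run()
	{
		do_backup();
		set_config('last_backup_run', time());
	}

	// Returns true once the configured interval has elapsed since the last run.
	public function should_run()
	{
		global $config;
		return $config['last_backup_run'] < time() - $config['backup_interval_config'];
	}
}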

This class uses two config variables. last_backup_run specifies the time the last backup took place. backup_interval_config is the interval in which the task should run, in seconds, which the admin could set via the ACP. If the value is 60, it will run every minute.
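The interval check in should_run can be walked through with plain numbers. This shell sketch mirrors the comparison used by the example class; the function name and the sample timestamps are made up for illustration.

```shell
#!/bin/sh
# Mirrors the PHP check: last_backup_run < time() - backup_interval_config.
# Arguments: last run (Unix time), current time (Unix time), interval in seconds.
should_run() {
    [ "$1" -lt $(( $2 - $3 )) ]
}

# Illustrative values only: last run at t=1000, interval of 60 seconds.
if should_run 1000 1061 60; then echo "61s elapsed: run"; else echo "61s elapsed: skip"; fi
if should_run 1000 1060 60; then echo "60s elapsed: run"; else echo "60s elapsed: skip"; fi
```

Because the comparison is strict, the task becomes due only once more than the configured interval has passed.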

Create a parametrized task
It is possible to pass parameters to a cron task. They are passed to cron.php when the task is invoked through the board footer. It is not possible to pass these parameters when using the system cron, so a parametrized task will not work properly there as it stands. You therefore need to either make the task non-runnable when system cron is enabled (you can define an is_runnable method) or provide default values for the parameters. You may also want to provide a system-cron compatible fallback task (e.g. prune_all_forums).

Parametrized tasks must implement the phpbb_cron_task_parametrized interface, which, in addition to the methods of the phpbb_cron_task interface, adds the get_parameters() and parse_parameters(phpbb_request_interface $request) methods.

An example of where these are used is the prune_forum core task. The f parameter is passed into it, allowing the task to prune a specific forum. First the task is instantiated in viewforum.php, with the forum information passed into the constructor. Then get_parameters is used to generate the URL for cron.php. When cron.php is actually requested with those parameters, they are read back through parse_parameters.

You may have noticed that there is an is_runnable and a should_run method. is_runnable defines if the task is able to run at all, according to the board configuration. should_run checks if intervals have been met for the task to run.

One important difference with parametrized tasks is that you must set up the parameters yourself. To do this, call instantiate_task on the cron manager ($cron) with the name of the task and an argument. Here is the snippet from viewforum.php.

if (!$config['use_system_cron'])
{
	$task = $cron->instantiate_task('cron_task_core_prune_forum', $forum_data);
	if ($task && $task->is_ready())
	{
		$url = $task->get_url();
		$template->assign_var('RUN_CRON_TASK', '<img src="' . $url . '" width="1" height="1" alt="cron" />');
	}
}

Take a look at the core tasks in includes/cron/task/core to get a better idea of how tasks work, in particular the prune_forum task, as it is parametrized.

Install a task
Installing a task from a MOD is as easy as placing the files in includes/cron/task/. It will automatically be used.

Use system cron
In order to use the system cron you need to be able to set up cron jobs on your server. Note that cron is a Unix utility that schedules tasks to run at periodic intervals, so this discussion assumes that your forum is hosted on a Linux or Unix operating system where cron should be supported. If you have Windows hosting, similar functionality exists, but it is usually referred to as scheduled tasks rather than cron. Consequently, on Windows you will need to use a different interface and deal with different terminology and a different way of formatting paths. (Windows uses \ to separate path components.)

First of all, enable the 'Run periodic tasks from system cron' setting in the 'Server settings' tab of the ACP.

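Now you need to set up a periodic cron job. The details depend on your system, but you usually edit your crontab (for example with crontab -e) and add a job that runs every minute in the following format. If you edit the system-wide /etc/crontab instead, a user name field is also required before the command.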

* * * * * cd /path/to/board && ./bin/phpbbcli.php cron:run

When specifying /path/to/board, make sure it is the absolute path from the root of the server, not a relative path. On shared hosting, this may require a conversation with your web host, because the path shown in your web host's file manager is often not the real one. The real path often sits under a directory such as /home/ or /var/www.
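If you have shell access, one way to find the absolute path is to change into the board directory and print the resolved working directory. The sketch below wraps that in a hypothetical helper, board_abs_path, and uses a temporary directory as a stand-in for the board.

```shell
#!/bin/sh
# Hypothetical helper: print the absolute, symlink-resolved path of a directory.
# The subshell keeps the caller's working directory unchanged.
board_abs_path() {
    ( cd "$1" && pwd -P )
}

# Stand-in for the board directory (an assumption for this demo).
demo_dir="$(mktemp -d)"
board_abs_path "$demo_dir"
rmdir "$demo_dir"
```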

Note: cd (change directory) is needed because all paths within phpBB are relative. If the job does not work, ./bin/phpbbcli.php cron:run may need to be prefixed with php or php_cli. If php or php_cli is not found from the command line, you may need to prefix it with its full path. On most Unix systems, this command will usually show the path to php:

whereis php
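Putting the pieces together, the crontab line can be assembled once the interpreter path is known. This is a sketch only: build_cron_line is a hypothetical helper, /usr/bin/php is an example fallback that may not exist on your server, and command -v is used here as a POSIX alternative to whereis.

```shell
#!/bin/sh
# Hypothetical helper: print a crontab entry that runs phpBB's cron every minute.
# $1 = absolute path to the php binary, $2 = absolute path to the board.
# "&&" makes sure the script only runs if the cd succeeded.
build_cron_line() {
    printf '* * * * * cd %s && %s ./bin/phpbbcli.php cron:run\n' "$2" "$1"
}

# command -v prints the full path of php if it is on PATH; otherwise fall back
# to an example path (an assumption, adjust for your server).
php_bin="$(command -v php || echo /usr/bin/php)"
build_cron_line "$php_bin" /path/to/board
```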

Use pseudo-system cron
In shared hosting environments cron may be allowed, but chaining multiple commands in one cron job may not be. In that case you may be able to use curl or wget instead of a system cron. With this approach, do NOT enable the 'Run periodic tasks from system cron' setting. You will still need to create a cron job, however. Its purpose is to trigger phpBB's cron regularly, so that scheduled phpBB cron tasks are still executed when there is no board traffic. A cron command similar to the following may work:

* * * * * curl -A 'Mozilla/5.0' 'http://www.yourforum.com/forum/cron.php?cron_type=cron.task.cron_task'

Or, if your forum uses HTTPS and curl fails on the site's certificate, you may need to disable certificate checking with -k:

* * * * * curl -A 'Mozilla/5.0' -k 'https://www.yourforum.com/forum/cron.php?cron_type=cron.task.cron_task'

The -A argument with curl is recommended as some web hosts will reject web requests if they don't appear to come from a browser. If you use the -A argument, use a valid user agent string such as would appear in the HTTP headers for a browser.

With this approach some trial and error may be necessary. Have the output of the cron job sent to an email address that you can read, and examine it for troubleshooting. It may be necessary to use a different user agent or to prefix curl or wget with its path, which might be /usr/bin or /bin. Your web host can tell you which paths are needed. Once everything is working you may want to turn off the cron notifications.
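Many cron implementations (Vixie cron and cronie, for example) mail a job's output to the address given in a MAILTO line in the crontab; hosting control panels often expose the same thing as a form field instead. The following sketch prints such a crontab fragment. The helper print_crontab and the example address, curl path, and URL are placeholders, not values defined by phpBB.

```shell
#!/bin/sh
# Hypothetical helper: print a crontab fragment that mails job output to $1.
# $2 = full path to curl, $3 = URL of the board's cron.php.
print_crontab() {
    printf 'MAILTO=%s\n' "$1"
    printf "* * * * * %s -A 'Mozilla/5.0' '%s'\n" "$2" "$3"
}

print_crontab admin@example.com /usr/bin/curl \
    'http://www.yourforum.com/forum/cron.php?cron_type=cron.task.cron_task'
```

In those implementations, setting MAILTO="" turns the notifications off again.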

Handling emailing quotas
If you have an hourly or daily quota of emails allowed, you may need to change queue_interval, a phpBB configuration variable. It is set to 60, which means that even if your board has a lot of traffic, at least sixty seconds must elapse before board traffic triggers another attempt to send out emails or take any other phpBB "cron" actions. The value can be changed, but phpBB does not have an interface for doing so. To ensure you don't go over a quota on outgoing emails, you may have to change it.

For example, if your web host allows up to 200 emails per hour, you might want emails to be sent every five minutes. The cron would look something like this:

*/5 * * * * curl -A 'Mozilla/5.0' 'http://www.yourforum.com/forum/cron.php?cron_type=cron.task.cron_task'

That would mean every hour there would be 12 "events" for sending out emails, with the event triggered by the cron if there is no board traffic. In this case you could send up to 16 emails every five minutes to get near to, but not exceed, 200 outgoing emails per hour. (12 x 16 = 192).
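The same calculation can be repeated for other quotas and intervals. A small shell sketch, with function and variable names chosen here for illustration:

```shell
#!/bin/sh
# Work out an email package size that stays under an hourly quota.
# Assumes the interval divides an hour evenly, as 300 seconds does.
# Integer division rounds the package size down.
package_size() {
    quota_per_hour="$1"      # emails the host allows per hour
    interval_seconds="$2"    # seconds between cron runs
    runs_per_hour=$(( 3600 / interval_seconds ))
    echo $(( quota_per_hour / runs_per_hour ))
}

size="$(package_size 200 300)"
echo "package size: $size, emails per hour: $(( size * (3600 / 300) ))"
```

For a quota of 200 and an interval of 300 seconds this prints a package size of 16 and a total of 192 emails per hour.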

In this example, you would first change your email package size to 16. However, if your forum gets traffic more often than every five minutes, emails could still be sent more often than that and you might exceed 200 emails per hour. So you would also want to change queue_interval from 60 to 300 (300 seconds, or five minutes). Since phpBB has no interface for this, you need to change it in the database instead. Using a tool like phpMyAdmin, you could change the queue interval as follows (this assumes the $table_prefix variable in your config.php file is "phpbb_"):

UPDATE phpbb_config SET config_value = 300 WHERE config_name = 'queue_interval';

Afterward, you should flush the board's cache so the new setting will be used.

Using site monitoring services as crons
For shared hosting, if cron is not an option you might be able to use a site monitoring service. A site monitoring service is an internet company that periodically polls your server to see if it is "up" and notifies you via email if the site is down, so you can complain to your web host. Any public traffic that hits your forum should work, so use the URL of your forum's index. Each request will trigger a phpBB cron, which should send out any digests that need to go out, along with any forum or topic notifications. You must use an HTTP test rather than a ping test, because a ping never reaches phpBB.

Simply search for "site monitoring service" in a search engine. Some services cost money; others are free for limited use. Bear in mind that this approach is only as reliable as the monitoring service itself.

Implementation
Most of the new code is in includes/cron, with changes to the core being minimal, except for cron.php. There are two interfaces: phpbb_cron_task and phpbb_cron_task_parametrized. The class loader is used extensively.