Sphinx Fulltext Search

From phpBB Development Wiki

The Sphinx fulltext search backend lets phpBB 3.1 use the open source Sphinx search server for board search. Using Sphinx improves the performance of both searching and indexing, particularly on boards with large databases. Because the Sphinx server is both flexible and fast, it provides a better alternative as a search backend.

Minimum Requirements

Sphinx search server 2.0.1 or later, and a phpBB 3.1 board running on either a MySQL or a PostgreSQL database.

Installation Instructions

Sphinx Installation

Follow the Sphinx installation instructions to install Sphinx. Only the installation itself is required; there is no need to follow the "Sphinx Quick Usage Tour" for phpBB search.

Sphinx Configuration

The contents of the Sphinx configuration file can either be generated through the ACP and pasted into sphinx.conf, or phpBB/docs/sphinx.sample.conf can be edited by hand and used instead. The following directories and files need to be created and referenced in sphinx.conf:

  • Config directory, which holds sphinx.conf and stopwords.txt (if defined).
  • Data directory, which holds the binary and index files.
  • Log directory, a subdirectory of the data directory, which stores all logs of the Sphinx search server.

Creating Required Directories

  • Data Directory
mkdir -p {DATA_PATH}
  • Log Directory
mkdir -p {DATA_PATH}/log
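
The config directory can be created in the same step. A minimal sketch, assuming throwaway example locations under a temporary directory: SPHINX_ROOT is a hypothetical variable introduced only for this example, and CONFIG_PATH and DATA_PATH stand for the {CONFIG_PATH} and {DATA_PATH} placeholders used on this page.

```shell
# SPHINX_ROOT is a hypothetical base directory used only for this example.
SPHINX_ROOT="${SPHINX_ROOT:-$(mktemp -d)}"
CONFIG_PATH="$SPHINX_ROOT/config"
DATA_PATH="$SPHINX_ROOT/data"

# -p creates missing parent directories and succeeds if they already exist.
mkdir -p "$CONFIG_PATH" "$DATA_PATH/log"

# List the three directories to confirm they exist.
ls -d "$CONFIG_PATH" "$DATA_PATH" "$DATA_PATH/log"
```

On a real board, the two paths would be set to the locations referenced in sphinx.conf rather than derived from a temporary directory.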

Indexing

The board administrator needs to select Sphinx Fulltext Search as the search backend and create the search index through the ACP UI. This creates a SPHINX_TABLE in the database. The Sphinx indexer must then be run manually from the shell.

  • Index Main
indexer --config {CONFIG_PATH}/sphinx.conf index_phpbb_{SPHINX_ID}_main >> {DATA_PATH}/log/indexer.log 2>&1 &
  • Index Delta
indexer --config {CONFIG_PATH}/sphinx.conf index_phpbb_{SPHINX_ID}_delta >> {DATA_PATH}/log/indexer.log 2>&1 &
  • Re-Index
indexer --rotate --config {CONFIG_PATH}/sphinx.conf index_phpbb_{SPHINX_ID}_delta >> {DATA_PATH}/log/indexer.log 2>&1 &
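
The first two commands can be wrapped in a small helper that builds both indexes in sequence. This is only a sketch: run_indexer is a hypothetical helper name and the variable values are placeholders. Unlike the commands above, it runs in the foreground, so the delta index is built only after the main index has finished.

```shell
# Placeholder values (assumptions); substitute the real paths and ID.
CONFIG_PATH="${CONFIG_PATH:-/path/to/config}"
DATA_PATH="${DATA_PATH:-/path/to/data}"
SPHINX_ID="${SPHINX_ID:-your_sphinx_id}"

# run_indexer SUFFIX: build index_phpbb_{SPHINX_ID}_SUFFIX (main or delta).
run_indexer() {
    # Fail early with a clear message if the indexer binary is missing.
    if ! command -v indexer >/dev/null 2>&1; then
        echo "indexer not found on PATH" >&2
        return 127
    fi
    indexer --config "$CONFIG_PATH/sphinx.conf" \
        "index_phpbb_${SPHINX_ID}_$1" >> "$DATA_PATH/log/indexer.log" 2>&1
}

# Build the main index first, then the delta index:
# run_indexer main && run_indexer delta
```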

Test Sphinx

Test whether Sphinx is working. The following command returns the search results for the given search string.

search --config {CONFIG_PATH}/sphinx.conf search string

Incremental Updates

On most Unix systems, the crontab file can be edited with

crontab -e

Add this line to update the delta index every five minutes:

*/5 * * * * indexer --rotate --config {CONFIG_PATH}/sphinx.conf index_phpbb_{SPHINX_ID}_delta >> {DATA_PATH}/log/indexer.log 2>&1 &

Add this line to set up a cron job that rebuilds the full index once every night (at 03:00):

0 3 * * * indexer --rotate --config {CONFIG_PATH}/sphinx.conf index_phpbb_{SPHINX_ID}_main >> {DATA_PATH}/log/indexer.log 2>&1 &
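
Both entries follow the same pattern, so they can be generated from the placeholders instead of typed by hand. A minimal sketch, assuming placeholder paths and ID: cron_line is a hypothetical helper, and the trailing & from the lines above is left out on the assumption that cron already runs each job as its own process.

```shell
# Placeholder values (assumptions); substitute the real paths and ID.
CONFIG_PATH="/path/to/config"
DATA_PATH="/path/to/data"
SPHINX_ID="your_sphinx_id"

# cron_line SCHEDULE SUFFIX: print one crontab entry for the given index.
cron_line() {
    printf '%s indexer --rotate --config %s/sphinx.conf index_phpbb_%s_%s >> %s/log/indexer.log 2>&1\n' \
        "$1" "$CONFIG_PATH" "$SPHINX_ID" "$2" "$DATA_PATH"
}

# Delta index every five minutes, main index every night at 03:00.
CRON_ENTRIES="$(cron_line '*/5 * * * *' delta; cron_line '0 3 * * *' main)"
printf '%s\n' "$CRON_ENTRIES"

# To append the entries to the current user's crontab, uncomment:
# (crontab -l 2>/dev/null; printf '%s\n' "$CRON_ENTRIES") | crontab -
```

Printing the entries first makes it possible to review them before they are installed.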

Start Searchd

Start the Sphinx daemon:

searchd --config {CONFIG_PATH}/sphinx.conf >> {DATA_PATH}/log/searchd-startup.log 2>&1 &

Troubleshooting

Check the log files in the {DATA_PATH}/log/ directory for errors. See the Sphinx documentation for details.
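
A quick way to scan all logs at once is to search them for common problem keywords. This is a sketch only: scan_sphinx_logs is a hypothetical helper, and the keyword list is an assumption about what the Sphinx tools write to their logs, not an exhaustive filter.

```shell
# scan_sphinx_logs DATA_DIR: print log lines that mention a problem,
# each prefixed with its file name and line number.
scan_sphinx_logs() {
    # -H: always print the file name, -i: ignore case,
    # -n: print line numbers, -E: extended regular expression.
    grep -HinE 'error|warning|fatal' "$1"/log/*.log
}

# Usage, with the data directory as the only argument:
# scan_sphinx_logs /path/to/data
```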

Manual Configuration

A sample Sphinx config file for the phpBB Sphinx search backend is available at phpBB/docs/sphinx.sample.conf. Its options include the database details as well as the directories for the Sphinx data and config files.

Database Details

Connection details for the database that the board runs on and that Sphinx indexes.

  • type - database type, default mysql
  • sql_host - hostname, default localhost
  • sql_user
  • sql_pass
  • sql_port - database port, default 3306 for MySQL
  • db_name
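
For orientation, these options might appear in a source block of sphinx.conf roughly as printed by the snippet below. This is a sketch under assumptions: the block name, the sql_db key (standing in for the db_name entry above) and every value are illustrative placeholders, and the sample config shipped with phpBB remains the reference.

```shell
# Placeholder values (assumptions); take the real ones from the board setup.
SPHINX_ID="abc123"
DB_NAME="phpbb"

# Build a hypothetical source block from the placeholders.
SOURCE_BLOCK="$(cat <<EOF
source source_phpbb_${SPHINX_ID}_main
{
    type = mysql
    sql_host = localhost
    sql_user = username
    sql_pass = password
    sql_db = ${DB_NAME}
    sql_port = 3306
}
EOF
)"

printf '%s\n' "$SOURCE_BLOCK"
```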

Searchd Details

  • listen - IP address and port the Sphinx daemon listens on, default 127.0.0.1:3312
  • read_timeout - network client request read timeout in seconds, default 5
  • max_children - maximum number of children to fork (concurrent searches to run in parallel), default 30
  • max_matches - maximum number of matches the daemon keeps in memory per index and can return to the client, default 20000

Stopwords

The Sphinx config file provides an option for specifying a file that contains search stop words. Stop words are very common words, such as 'a' and 'the', that appear throughout the text and should be ignored when searching. Reasonably complete lists of English stop words are published online. These words can be copied into a text file, which is then added to sphinx.conf under the index_phpbb section as

stopwords = path/to/stopwords.txt
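
A stop word file can be created directly from the shell. A minimal sketch, assuming that a plain-text file with one word per line is accepted and using a deliberately tiny sample list: CONFIG_PATH is a placeholder that falls back to a temporary directory here.

```shell
# CONFIG_PATH is a placeholder; a temporary directory is used if it is unset.
CONFIG_PATH="${CONFIG_PATH:-$(mktemp -d)}"

# Write a few sample stop words, one per line (a real list would be longer).
printf '%s\n' a an and in of the to > "$CONFIG_PATH/stopwords.txt"

# Print the matching line for the index section of sphinx.conf.
printf 'stopwords = %s/stopwords.txt\n' "$CONFIG_PATH"
```

The index has to be rebuilt with the indexer before a changed stop word list takes effect.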