phpBB

Development Wiki

MySQL Fulltext Search

From phpBB Development Wiki

phpBB supports MySQL fulltext search backend for boards running on mysql databases. Full-text indexes can be used with MyISAM tables and InnoDB (only since MySQL 5.6.4), and can be created only for CHAR, VARCHAR, or TEXT columns. There are 3 types of fulltext searches:

  • Boolean
  • Natural Language
  • Query Expansion

phpBB MySQL search backend uses Boolean fulltext search. MySQL fulltext search docs

Boolean Search

A boolean search interprets the search string using the rules of a special query language. The string contains the words to search for. It can also contain operators that specify requirements such that a word must be present or absent in matching rows, or that it should be weighted higher or lower than usual. Common words such as “some” or “then” are stopwords and do not match if present in the search string.

Configuration

MySQL fulltext search can be selected via Search Settings in ACP. ACP configuration settings allow to modify maximum/minimum word length.

Minimum word length

Search ignores keywords with length shorter than the minimum word length while searching. Default Value: 4

Maximum word length

Search ignores keywords with length longer than the maximum word length while searching. Default Value:254

Indexing

The board administrator can create/delete indexes for MySQL full text search by choosing search index option in the maintenance tab. Index stats for MySQL indexes show just the total no of posts which were indexed. Indexes need to be rebuilt everytime the search backend configuration settings are changed in the ACP.

Search Keywords

The search keyword support the following operators:

  • + A leading plus sign indicates that this word must be present in each row that is returned.
  • - A leading minus sign indicates that this word must not be present in any of the rows that are returned. The - operator acts only to exclude rows that are otherwise matched by other search terms. Thus, a boolean-mode search that contains only terms preceded by - returns an empty result. It does not return “all rows except those containing any of the excluded terms.”
  • (no operator) By default (when neither + nor - is specified) the word is optional, but the rows that contain it are rated higher.
  • * The asterisk serves as the truncation (or wildcard) operator. Unlike the other operators, it should be appended to the word to be affected. Words match if they begin with the word preceding the * operator.

Phrase Search

A phrase that is enclosed within double quote (“"”) characters matches only rows that contain the phrase literally, as it was typed. The full-text engine splits the phrase into words and performs a search in the FULLTEXT index for the words. Phrase searching requires only that matches contain exactly the same words as the phrase and in the same order. For example, "test phrase" matches "test, phrase" in MySQL. If the phrase contains no words that are in the index, the result is empty. For example, if all words are either stopwords or shorter than the minimum length of indexed words, the result is empty.

MySQL Stopwords

MySQL has predefined stopwords which are excluded from search. A full list of MySQL stopwords can be found here