MySQL Natural Language Full-Text Search

Natural Language Full Text Search

In the previous tutorial, we have seen what is Boolean full-text search and how it works. In this tutorial, we will see what is natural language full-text search, how it is different from boolean full-text search, what is its syntax and how to implement it with practical examples. This is going to be a very interesting tutorial as we will see multiple examples of natural language search and its features too. So, let’s get started!

Also read: MySQL Boolean Full-Text Searches

Introduction to Natural Language Full-Text Search

The Natural language search is one of the techniques of full-text search in MySQL in which rows are returned that are relevant to natural language.

Unlike the boolean search method, where rows are returned based on matching words along with the boolean operators, the natural search method focuses on natural human language queries.

Relevance is an integer with a positive decimal value. When the relevance is 0, there is no resemblance. The number of words in the document, the number of unique words in the document, the total amount of words in the collection, and the number of documents (rows) that include a certain word are all characteristics that MySQL considers when determining relevance.

Syntax of Natural Language Full-Text Search

We can perform a natural-language full-text search using the IN NATURAL LANGUAGE MODE modifier with the AGAINST clause. The following syntax shows how to use the natural language full-text search.

SELECT col_name FROM table_name
WHERE MATCH(target_col_names) 
AGAINST(query IN NATURAL LANGUAGE MODE);Code language: SQL (Structured Query Language) (sql)

Where,

  • target_col_names – Names of the column that you want to search the keyword in.
  • query – Natural human language query or keywords.

Note that, the Natural language full-text search is the default mode for full-text search. So, even if you skip the “IN NATURAL LANGUAGE MODE” part in the AGAINST function, it will do the same work.

Before Proceeding to Natural Language Full-text Search

When particular indexes are used in MySQL, full-text searches are done, and such indexes have numerous unique characteristics, including the following:

  • The index must be of the FULLTEXT type in order to be classified as a full-text index.
  • Only tables using the InnoDB or MyISAM storage engines can use FULLTEXT indexes.
  • Only CHAR, VARCHAR, and TEXT columns can have FULLTEXT indexes.
  • When the MATCH() AGAINST() clause is used, FULLTEXT indexes are utilized

You can easily create a full-text index on a single column as well as on multiple columns. Here is the StackOverflow thread for creating a full-text index on a single column and multiple columns.

Examples of Natural Language Full-Text Search

For the demonstration of the natural language full-text search method, we will create a table and insert some data into it.

CREATE TABLE nlftSearch(
id INT AUTO_INCREMENT PRIMARY KEY,
title VARCHAR(100),
body VARCHAR(100)
);Code language: SQL (Structured Query Language) (sql)
INSERT INTO nlftSearch(title,body)
VALUES("mysql tutorial", "this is a mysql tutorial"),
("ml tutorial", "not a mysql tutorial"),
("android tut", "use firebase"),
("system programing", "8086 microprocessor"),
("datastructure & myDB", "mysql with DS"),
("advanced mysql", "data mining and graph database");Code language: SQL (Structured Query Language) (sql)

Let’s display the table data to check everything is inserted without an error.

SELECT * FROM nlftSearch;Code language: SQL (Structured Query Language) (sql)
Nlftsearch Table Data
Nlftsearch Table Data

Perfect! Now let’s write a simple query demonstrating the natural language search method.

SELECT * FROM nlftSearch
WHERE MATCH(body)
AGAINST('what is mysql tutorial' IN NATURAL LANGUAGE MODE);Code language: SQL (Structured Query Language) (sql)

Here, we are specifying a natural language query for searching the records. Let’s see what result we get.

Natural Language Search Example 1
Natural Language Search Example 1

As you can see, the output contains the expected result. If you see it carefully, the body column of the table contains the keyword “mysql” in each row. However, rows that contain “mysql tutorial” are put on higher ranking.

Let’s take another example where we will learn how to sort the result by relevance.

SELECT * FROM nlftSearch
WHERE MATCH(body)
AGAINST('mysql,tutorial' IN NATURAL LANGUAGE MODE);Code language: SQL (Structured Query Language) (sql)

Here, we have specified two keywords that will return all rows which contain either one of the mysql or tutorial or both.

Note that, the records which contain both keywords are ranked higher. All the records which contain only one keyword are ranked lower.

Natural Language Search Example 2
Natural Language Search Example 2

As you can see, the records that are ranked higher consist of both keywords. Whereas, the last records only contain a single keyword.

When utilizing the full-text search, there are a few things to keep in mind:

  • The minimal length of a search query in MySQL’s full-text search engine is four characters. It implies that if you search for a term with a length of fewer than four characters, such as man or sit, you will get no results.
  • Stop words go unheeded. In the MySQL source code distribution, MySQL specifies a list of stop words.

Conclusion

I hope you find this tutorial helpful. It is heavily recommended to practice the full-text search methods and implement those in the live projects so that you will understand the concept in depth. Check out the official documentation given below for more information.

References

MySQL official documentation on natural-language full-text search.