Search is a common requirement for almost any application, and providing users with an efficient, intuitive search experience is critical to user satisfaction. The ability to quickly find relevant information has become a cornerstone of successful applications.
In this article, let’s explore how we can leverage RedisSearch (officially spelled RediSearch), a full-text search engine available as a module for Redis. We'll use Python as our example language and take a hands-on approach to this powerful tool.
Throughout this article, we will build a simple search API that lets users search news from the News Category Dataset (210k news articles). To keep things simple, we use only about 10% of this dataset (roughly 20k articles).
There are many factors for building a good search interface. Among all of them, I want to emphasize these 2 key points:
Source: RedisConf 2018
To make searches fast and easy, we can implement autocomplete and full-text search, which will be introduced next.
The autocomplete pattern has revolutionized search interfaces. By providing users with suggestions as they’re typing, autocomplete makes searching easy and effortless.
There are 2 simple steps to implement this powerful feature: adding suggestions and querying results.
For each article in the dataset, we will add a suggestion to the auto-complete suggestion dictionary of RedisSearch.
To add a suggestion string to an auto-complete suggestion dictionary, we use the FT.SUGADD command.
💡 This tutorial uses the official Python client for Redis.
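As a minimal sketch of this step, the snippet below sends one FT.SUGADD per article title through redis-py's generic execute_command method. The key name, the helper names, and the fixed score of 1.0 are illustrative assumptions and are not taken from the demo code.

```python
from typing import Iterable, List

SUGGESTION_KEY = "autocomplete:articles"  # hypothetical key name


def sugadd_args(key: str, text: str, score: float = 1.0) -> List[str]:
    """Build the arguments of one FT.SUGADD call: key, string, score."""
    return ["FT.SUGADD", key, text, str(score)]


def add_suggestions(client, key: str, titles: Iterable[str]) -> int:
    """Add every title to the suggestion dictionary and return how many were sent.

    `client` is assumed to be a redis.Redis instance (anything exposing
    execute_command works) connected to a server with the search module loaded.
    """
    count = 0
    for title in titles:
        client.execute_command(*sugadd_args(key, title))
        count += 1
    return count
```

With a live connection, a call such as add_suggestions(redis.Redis(decode_responses=True), SUGGESTION_KEY, titles) would populate the dictionary. redis-py also offers higher-level helpers under client.ft() that wrap the same command.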
There are 2 main parameters in the command
With the autocomplete suggestion dictionary in place, we can now retrieve a list of suggestions for a given prefix using the FT.SUGGET command.
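A rough sketch of the lookup side, again going through execute_command. The helper names are hypothetical; FUZZY and MAX are optional arguments of FT.SUGGET, and the default of 5 results mirrors the command's own default.

```python
from typing import List


def sugget_args(key: str, prefix: str, fuzzy: bool = False,
                max_results: int = 5) -> List[str]:
    """Build the arguments of one FT.SUGGET call."""
    args = ["FT.SUGGET", key, prefix]
    if fuzzy:
        # FUZZY also returns suggestions that are close to the prefix
        args.append("FUZZY")
    args += ["MAX", str(max_results)]
    return args


def autocomplete(client, key: str, prefix: str, fuzzy: bool = False,
                 max_results: int = 5) -> List[str]:
    """Return the suggestions for a prefix, or an empty list if there are none."""
    reply = client.execute_command(*sugget_args(key, prefix, fuzzy, max_results))
    return list(reply) if reply else []
```

Creating the client with decode_responses=True makes the suggestions come back as str rather than bytes.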
There are 3 main parameters in the command
That's the basics! Pretty straightforward, isn't it? For more complex cases, read on to solve the challenge of handling keywords within the text.
The autocomplete feature above only matches prefixes. If users type keywords that appear in the middle or at the end of the text, autocomplete cannot suggest the expected results.
Let's say we have 5 article titles:
If we query “London” using autocomplete, it only returns the first 3 articles. In reality, we would expect users to be able to see all 5 results above.
Worse, if users misspell “London” as “Londin” in the query, they will not see any results, because “Londin” does not appear in any of those article titles.
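The gap between prefix matching and keyword matching can be illustrated in plain Python, without Redis. The five titles below are made up for this sketch (they are not the article's own examples), and the two functions only imitate the behaviour described above.

```python
from typing import List

# Hypothetical titles: three start with "London", two mention it later on.
TITLES = [
    "London weather warning issued for the weekend",
    "London marathon draws record crowds",
    "London museums extend opening hours",
    "Why tourists keep returning to London",
    "New rail link between Paris and London",
]


def prefix_matches(titles: List[str], query: str) -> List[str]:
    """Imitate autocomplete: keep titles that start with the query."""
    q = query.lower()
    return [t for t in titles if t.lower().startswith(q)]


def keyword_matches(titles: List[str], query: str) -> List[str]:
    """Imitate keyword search: keep titles containing the query as a word."""
    q = query.lower()
    return [t for t in titles if q in t.lower().split()]
```

Here prefix_matches(TITLES, "London") finds only the first three titles, keyword_matches finds all five, and both return nothing for the misspelled "Londin".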
Full-text search is the solution to those problems. It is a technique that allows searching for documents or data based on the presence of keywords or phrases within the document's entire text. Unlike traditional search methods, full-text search considers the context, synonyms, and word proximity to provide more relevant search results. This ensures a better user experience and increased user satisfaction.
Full-text search engines use algorithms to index the content of documents or data sources to make them searchable.
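To make the idea of indexing concrete, here is a toy inverted index in plain Python. It is only an illustration of the general technique (mapping each word to the documents that contain it) and makes no claim about how RedisSearch implements it internally; all names are hypothetical.

```python
import re
from collections import defaultdict
from typing import Dict, List, Set


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs: Dict[str, str]) -> Dict[str, Set[str]]:
    """Map every token to the set of document ids that contain it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def lookup(index: Dict[str, Set[str]], query: str) -> Set[str]:
    """Return ids of documents containing every token of the query."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result
```

Because the lookup goes from word to documents, a keyword is found wherever it sits in the text, not only at the start.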
We can implement and test a Full-text search through 3 steps:
First, we need to define the schema for the index. For example, let’s define a schema for news articles like below:
RedisSearch uses each field's weight when ranking the result set. By default, every field has a weight of 1, so a keyword match counts the same whether it occurs in the title or in the description.
However, we want to prioritize articles with the keyword in the title over those with the keyword only in the description, so we increase the weight of the title field.
Once the schema is defined, we create the index using the FT.CREATE command.
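A sketch of what the schema and index creation could look like as a raw FT.CREATE call. The index name, the key prefix, and the title weight of 5.0 are assumptions chosen for illustration; only the title, description, and category fields are mentioned in this article, and treating category as a TEXT field is also an assumption.

```python
from typing import List

INDEX_NAME = "idx:articles"  # hypothetical index name
KEY_PREFIX = "article:"      # hypothetical prefix of the hash keys to index


def create_index_args(index: str, prefix: str,
                      title_weight: float = 5.0) -> List[str]:
    """Build an FT.CREATE call that indexes hashes whose keys start with `prefix`."""
    return [
        "FT.CREATE", index,
        "ON", "HASH",
        "PREFIX", "1", prefix,
        "SCHEMA",
        # A higher weight makes title matches count more when ranking
        "title", "TEXT", "WEIGHT", str(title_weight),
        "description", "TEXT",
        "category", "TEXT",
    ]


def create_index(client, index: str = INDEX_NAME, prefix: str = KEY_PREFIX) -> None:
    """Send the FT.CREATE call; `client` is assumed to be a redis.Redis instance."""
    client.execute_command(*create_index_args(index, prefix))
```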
Next, we need to import all the articles to Redis using the command
In real-life applications, this step happens gradually as new data is appended to Redis; the data only needs to exist before it is searched. In this article, we import all the data at once so that we can run full-text searches right away.
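The import could be sketched as follows, assuming each article is stored as a Redis hash via redis-py's hset method, which is what an index created on hashes would pick up. The key prefix and the use of the row position as an id are hypothetical choices, not taken from the demo notebook.

```python
from typing import Dict, Iterable


def article_key(prefix: str, position: int) -> str:
    """Build the hash key for one article, e.g. 'article:42'."""
    return f"{prefix}{position}"


def import_articles(client, articles: Iterable[Dict[str, str]],
                    prefix: str = "article:") -> int:
    """Store each article as a hash and return the number of articles written.

    `client` is assumed to be a redis.Redis instance.
    """
    count = 0
    for position, article in enumerate(articles):
        # One hash per article; the index picks it up through the key prefix
        client.hset(article_key(prefix, position), mapping=article)
        count += 1
    return count
```

For a dataset of this size, batching the writes through client.pipeline() would likely cut down on round trips.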
For more details, you can check this notebook in the demo repo.
Finally, we can implement the search function.
Pretty simple. However, in some cases, users might want broader results. Luckily, RedisSearch also supports fuzzy matching: we simply wrap a term in ‘%’ characters to match it approximately.
Fuzzy matching or approximate string matching is the technique of finding strings that match a pattern approximately (rather than exactly).
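The sketch below shows the two halves of this idea. The levenshtein function is a plain edit-distance implementation that illustrates why “Londin” is a near match for “London”. The fuzzy_query helper (a hypothetical name) wraps each term in ‘%’ characters, assuming that one to three ‘%’ on each side correspond to an edit distance of one to three.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the edit distance (insertions, deletions, substitutions) between a and b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute ca with cb
        prev = curr
    return prev[-1]


def fuzzy_query(text: str, distance: int = 1) -> str:
    """Wrap every whitespace-separated term in '%' so it is matched approximately."""
    if not 1 <= distance <= 3:
        raise ValueError("distance must be between 1 and 3")
    marker = "%" * distance
    return " ".join(f"{marker}{term}{marker}" for term in text.split())
```

For example, levenshtein("londin", "london") is 1, so the query produced by fuzzy_query("Londin") can still reach titles containing “London”.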
After having the query, we can use the command FT.SEARCH to get the results.
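A hedged sketch of running the query through execute_command and unpacking the raw reply. It assumes the default (RESP2) reply shape of FT.SEARCH without extra options: the total count first, then alternating document ids and flat field/value lists. The dictionary layout returned here is this sketch's own choice, and redis-py's client.ft().search() helper offers a ready-made alternative.

```python
from typing import Any, Dict, List, Tuple


def parse_search_reply(reply: List[Any]) -> Tuple[int, List[Dict[str, Any]]]:
    """Turn [total, id1, [f1, v1, ...], id2, [...], ...] into (total, documents)."""
    total = int(reply[0])
    docs = []
    for i in range(1, len(reply), 2):
        doc_id, flat = reply[i], reply[i + 1]
        # Pair up the flat [field, value, field, value, ...] list
        fields = dict(zip(flat[0::2], flat[1::2]))
        docs.append({"id": doc_id, "fields": fields})
    return total, docs


def run_search(client, index: str, query: str) -> Tuple[int, List[Dict[str, Any]]]:
    """Run FT.SEARCH and parse the reply; `client` is assumed to be a redis.Redis instance."""
    reply = client.execute_command("FT.SEARCH", index, query)
    return parse_search_reply(reply)
```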
This command returns a list of documents. Each document has 2 main fields:
Besides full-text searching, FT.SEARCH also supports filtering by particular fields. The syntax is @field_name: value1 | value2. For example, if we want to return only articles in the BUSINESS and ENTERTAINMENT categories, we will add @category: BUSINESS | ENTERTAINMENT to our query.
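A small helper for composing such a filter could look like this. The function names are hypothetical. As an assumption of this sketch, the alternatives are grouped in parentheses so that the whole group is clearly scoped to the field, and the field is taken to be indexed as TEXT (a TAG field would use curly braces instead).

```python
from typing import Iterable


def field_filter(field: str, values: Iterable[str]) -> str:
    """Build '@field:(v1 | v2 | ...)' for a TEXT field."""
    joined = " | ".join(values)
    return f"@{field}:({joined})"


def with_filter(text_query: str, field: str, values: Iterable[str]) -> str:
    """Append a field filter to an existing full-text query."""
    return f"{text_query} {field_filter(field, values)}"
```

For instance, with_filter("london", "category", ["BUSINESS", "ENTERTAINMENT"]) yields a query that requires both the keyword and one of the two categories.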
There are multiple types of advanced filters:
You can find more details here.
The result set can be returned in order of any field using the SORTBY parameter. The order can be ASC or DESC.
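As a sketch, SORTBY can be appended to the FT.SEARCH arguments like this. The helper name and the date field used in the example are assumptions; the field to sort by depends on the schema, and sorting tends to be cheaper when that field is declared SORTABLE in the index.

```python
from typing import List, Optional


def search_args(index: str, query: str, sort_by: Optional[str] = None,
                ascending: bool = True) -> List[str]:
    """Build FT.SEARCH arguments, optionally ordered by one field."""
    args = ["FT.SEARCH", index, query]
    if sort_by is not None:
        args += ["SORTBY", sort_by, "ASC" if ascending else "DESC"]
    return args
```

A call such as search_args("idx:articles", "london", sort_by="date", ascending=False) would ask for the newest matches first, assuming a date field exists.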
What if we have a large amount of data and only want to return part of it? That is pagination.
Simply add the LIMIT parameter to our query. It takes 2 numbers: offset and limit. Note that the offset is zero-indexed. The default is 0 10, which returns 10 items starting from the first result.
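Since an API usually exposes page numbers rather than offsets, a small conversion helper is handy. This sketch assumes 1-based page numbers, and the function names are hypothetical.

```python
from typing import List, Tuple


def page_to_limit(page: int, page_size: int = 10) -> Tuple[int, int]:
    """Convert a 1-based page number into the (offset, limit) pair used by LIMIT."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    return (page - 1) * page_size, page_size


def limit_args(page: int, page_size: int = 10) -> List[str]:
    """Build the LIMIT part of an FT.SEARCH call for the given page."""
    offset, limit = page_to_limit(page, page_size)
    return ["LIMIT", str(offset), str(limit)]
```

Page 1 with the default size maps to LIMIT 0 10, which matches the default behaviour described above.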
In this tutorial, we've explored RedisSearch, a powerful Full-text Search engine integrated with Redis, using Python. We covered autocomplete suggestions, efficient full-text searches, data import, and advanced querying techniques.
By implementing these features, you can create a seamless and intuitive search experience for your users. RedisSearch helps make your implementation fast and easy, providing more value to your users.
For a hands-on experience, you can access the demo code on our GitHub repository.