Advanced search

Boox provides several advanced search options that let you fine-tune your search queries and improve the relevance of your results. Pass these options to the search and searchSync methods as part of the SearchOptions object.

Vector Space Model

By default, Boox uses TF-IDF scoring and inverted indexing to determine the relevance of documents to your search query. However, you can enable the Vector Space Model by setting the useQueryVector option to true:

const results = await boox.search('keyword', {
  useQueryVector: true
})

The Vector Space Model represents documents and queries as vectors in a high-dimensional space. The similarity between a document and a query is the cosine similarity of their vectors, that is, the cosine of the angle between them. This approach can provide better contextual precision than the default TF-IDF scoring, but it may be slightly slower.
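
To make the cosine similarity idea concrete, here is a minimal sketch that builds raw term-count vectors and compares them. It is an illustration only, not Boox's internal implementation: the helper names termVector and cosineSimilarity are hypothetical, and the weighting Boox actually applies to each term may differ.

```javascript
// Illustrative only: termVector and cosineSimilarity are hypothetical helpers,
// not part of the Boox API.

// Build a term-count vector (term -> count) from whitespace-separated text.
function termVector(text) {
  const vector = new Map()
  for (const term of text.toLowerCase().split(/\s+/).filter(Boolean)) {
    vector.set(term, (vector.get(term) ?? 0) + 1)
  }
  return vector
}

// Cosine similarity: dot(a, b) / (|a| * |b|), or 0 when either vector is empty.
function cosineSimilarity(a, b) {
  let dot = 0
  for (const [term, weight] of a) {
    dot += weight * (b.get(term) ?? 0)
  }
  const norm = (v) => Math.sqrt([...v.values()].reduce((sum, w) => sum + w * w, 0))
  const denominator = norm(a) * norm(b)
  return denominator === 0 ? 0 : dot / denominator
}

const query = termVector('red apple')
const docA = termVector('red apple pie')
const docB = termVector('green pear tart')

console.log(cosineSimilarity(query, docA)) // 2 / sqrt(6), roughly 0.816
console.log(cosineSimilarity(query, docB)) // 0, no shared terms
```

A document that shares more terms with the query points in a more similar direction and scores closer to 1, while a document with no terms in common scores 0.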

Be aware

The useQueryVector option is still experimental and may be subject to changes in future versions of Boox.

Query expansion

Query expansion adds synonyms or related terms to your search query to improve the chances of finding relevant documents. You can define a custom query expander function and pass it to the queryExpander option:

/**
 * @type {import('boox').SearchResultsAdvanceOptions['queryExpander']}
 */
function queryExpander(query) {
  // Your logic to expand the query (e.g., using a thesaurus or external API)
  const expandedQuery = query + ' synonyms'
  return expandedQuery
}

const results = await boox.search('keyword', {
  queryExpander: queryExpander
})

In this example, the queryExpander function appends the placeholder word "synonyms" to the original query. In a real application, replace it with your own logic that retrieves synonyms or related terms, for example from a thesaurus or an external API.
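
As a slightly more realistic sketch, the expander below looks up each query term in a small in-memory synonym table. The SYNONYMS table and its entries are hypothetical, and the sketch assumes, as in the example above, that the expander receives the query string and returns the expanded query string.

```javascript
// Hypothetical synonym table; in practice this could come from a thesaurus or an API.
const SYNONYMS = new Map([
  ['car', ['automobile', 'vehicle']],
  ['fast', ['quick', 'rapid']]
])

// Appends the known synonyms of each query term to the original terms.
// Note: this sketch lowercases the query before the lookup.
function queryExpander(query) {
  const terms = query.toLowerCase().split(/\s+/).filter(Boolean)
  const extra = terms.flatMap((term) => SYNONYMS.get(term) ?? [])
  // A Set drops duplicates, e.g. when a synonym is already in the query.
  return [...new Set([...terms, ...extra])].join(' ')
}

console.log(queryExpander('fast car'))
// fast car quick rapid automobile vehicle
```

Terms without an entry in the table pass through unchanged, so the expander never removes anything the user typed.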

Matching coefficient

The matchingCoefficient option allows you to define a custom function to calculate the similarity between a search query and a document's attributes. This function takes the search query and document attributes as input and returns a similarity score between 0 and 1. Here's an example using the Levenshtein distance to calculate similarity:

import * as levenshtein from 'fastest-levenshtein'

/**
 * @type {import('boox').SearchResultsAdvanceOptions['matchingCoefficient']}
 */
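function matchingCoefficient(query, attributes) {
  // Assumes every document has a string `title` attribute
  const title = attributes.title
  const maxLength = Math.max(query.length, title.length)
  // Two empty strings are identical; this also avoids dividing by zero
  if (maxLength === 0) return 1
  const distance = levenshtein.distance(query, title)
  return 1 - distance / maxLength
}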

const results = await boox.search('keyword', {
  matchingCoefficient: matchingCoefficient
})

In this example, the matchingCoefficient function calculates the Levenshtein distance between the query and the document's title. The similarity score is 1 minus the distance divided by the length of the longer of the two strings, so an exact match scores 1 and titles that are closer to the query receive higher scores.
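
To see the formula with concrete numbers, the sketch below computes the same score without any dependency. The levenshteinDistance and titleSimilarity helpers are hypothetical names used only for this illustration; the hand-written distance function stands in for fastest-levenshtein so the snippet runs on its own.

```javascript
// Plain dynamic-programming Levenshtein distance (two-row variant).
// Stands in for fastest-levenshtein so the example has no dependencies.
function levenshteinDistance(a, b) {
  let previous = Array.from({ length: b.length + 1 }, (_, j) => j)
  for (let i = 1; i <= a.length; i++) {
    const current = [i]
    for (let j = 1; j <= b.length; j++) {
      const substitution = previous[j - 1] + (a[i - 1] === b[j - 1] ? 0 : 1)
      current[j] = Math.min(previous[j] + 1, current[j - 1] + 1, substitution)
    }
    previous = current
  }
  return previous[b.length]
}

// Score = 1 - distance / maxLength, treating two empty strings as identical.
function titleSimilarity(query, title) {
  const maxLength = Math.max(query.length, title.length)
  if (maxLength === 0) return 1
  return 1 - levenshteinDistance(query, title) / maxLength
}

console.log(titleSimilarity('kitten', 'sitting')) // 1 - 3/7, roughly 0.571
console.log(titleSimilarity('boox', 'boox'))      // 1
console.log(titleSimilarity('abc', 'xyz'))        // 0
```

Because the distance never exceeds the length of the longer string, the score always stays between 0 and 1.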

Be aware

Using a custom matching coefficient function can significantly impact the search results. Make sure you understand the implications of your function and test it thoroughly before using it in production.

Limit

The limit option allows you to specify the maximum number of search results to return. This can be useful for improving performance and preventing the search from returning too many results.

const results = await boox.search('keyword', {
  limit: 10
})

In this example, the search will return at most 10 results.
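
Conceptually, a limit behaves like keeping only the first entries of a ranked list. The sketch below illustrates that idea on hypothetical data; the { id, score } shape and the topResults helper are assumptions made for the example and do not describe how Boox applies the limit internally.

```javascript
// Hypothetical scored hits; the { id, score } shape is an assumption for illustration.
const hits = [
  { id: 'a', score: 0.42 },
  { id: 'b', score: 0.91 },
  { id: 'c', score: 0.67 }
]

// Rank by descending score on a copy, then keep at most `limit` entries.
function topResults(results, limit) {
  return [...results].sort((x, y) => y.score - x.score).slice(0, limit)
}

console.log(topResults(hits, 2).map((hit) => hit.id)) // [ 'b', 'c' ]
```

When fewer results exist than the limit allows, all of them are returned.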