Solving the Cold Start Problem with Pre-Trained AI Algorithms

by Algolia, May 9th, 2023


Artificial intelligence (AI) has quickly moved from hot topic to everyday life. Now, eCommerce businesses are beginning to see clearly where AI can sharpen their competitive edge. With AI driving site search and discovery, sentiment analysis, chatbots, and more, users get more relevant results throughout their journeys, which leads to better engagement and higher conversions.


While AI is undoubtedly an impressive technological feat in its own right, it’s important to note that AI systems are only as effective as the data they’re trained on.

Put simply, machine learning (ML) allows machines to learn from data and improve their responses over time. This means that the quality and quantity of the data used to train an AI system have a significant impact on its effectiveness.


Data is particularly important for AI search. AI search systems are designed to help users find the information they’re looking for quickly and efficiently. This requires a large and diverse set of data that can be used to train the AI system to recognize patterns and make accurate predictions about what users are likely to be searching for.


One of the biggest challenges companies face when building their own AI language algorithms, whether for search or another application, is the time and resources required to develop a powerful AI platform, along with the large amount of data required to train its algorithms effectively. A company that has not yet collected that data has nothing to train on. This is the cold start problem.

Deep Learning Algorithms

So how can a company bridge the data abyss and realize value from its investment in AI? One practical solution is to consider using pre-trained algorithms.


A pre-trained algorithm is a type of AI model that has been trained on a large dataset and then made available for use by others. The model has already learned to recognize patterns in that data, so it can be used to make predictions or classify new data, often with a high degree of accuracy, even when the team using it has little data of its own.
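
To make the idea concrete, here is a minimal Python sketch of how a search feature might rank products using embeddings from a pre-trained model, with no click or purchase history at all. The small `PRETRAINED_VECTORS` table is a hypothetical stand-in for a real encoder; the vectors, product names, and helper functions are illustrative assumptions, not output from an actual model.

```python
import math

# Hypothetical stand-in for a pre-trained encoder. A real system would call
# a trained model here; these 3-dimensional word vectors are made up.
PRETRAINED_VECTORS = {
    "sneakers": [0.90, 0.10, 0.00],
    "running":  [0.80, 0.20, 0.10],
    "shoes":    [0.85, 0.15, 0.05],
    "trainers": [0.88, 0.12, 0.02],
    "kettle":   [0.00, 0.90, 0.30],
    "electric": [0.10, 0.80, 0.40],
    "tea":      [0.05, 0.70, 0.60],
}


def embed(text):
    """Average the vectors of the known words; unknown words are skipped."""
    vectors = [PRETRAINED_VECTORS[w] for w in text.lower().split()
               if w in PRETRAINED_VECTORS]
    if not vectors:
        return [0.0, 0.0, 0.0]
    return [sum(dim) / len(vectors) for dim in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank(query, catalog):
    """Order catalog items by similarity to the query, best match first."""
    query_vector = embed(query)
    return sorted(catalog,
                  key=lambda item: cosine(query_vector, embed(item)),
                  reverse=True)


CATALOG = ["running shoes", "electric kettle"]
```

In this toy setup, `rank("trainers", CATALOG)` puts "running shoes" first even though the word "trainers" never appears in the catalog, because the (assumed) pre-trained vectors already place the two concepts close together.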


Some popular pre-trained algorithms include BERT (Bidirectional Encoder Representations from Transformers) and the Universal Sentence Encoder. BERT is one of the most widely used and best-known NLP language models. Google open-sourced it in 2018 and uses it to improve its understanding of query context. BERT and its variants have since been adopted across a wide range of language applications.


There are many benefits to using pre-trained algorithms like BERT to get around the cold start problem, including:


  • Faster development time: Pre-trained search algorithms allow developers to skip the time-consuming and resource-intensive process of training a model from scratch. Instead, they can build on an existing model that has already been trained on large amounts of data, saving significant amounts of time and effort.
  • Improved accuracy: Pre-trained search algorithms have been trained on massive amounts of data that have already been cleaned, which often makes them more accurate and effective than models trained on smaller datasets.
  • Needs less data: Pre-trained search algorithms require less data to be effective. That efficiency makes them more practical for organizations with limited data resources. Without pre-training, these organizations would have to wait much longer to take advantage of AI-based search capabilities.
  • Flexibility: Pre-trained search algorithms can be fine-tuned to specific domains, making them adaptable to different types of data and applications. They are designed to be highly customizable so that developers can tailor their search experiences to the specific needs of their users, using their own data set as an additional layer.
  • Better user experience: By providing more relevant search results, pre-trained search algorithms improve the overall user experience. These algorithms permit companies to create personalized search experiences that are highly relevant to their users, even as their needs and preferences change over time.
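
The "additional layer" idea from the flexibility point can be sketched in a few lines of Python: the pre-trained encoder stays frozen, and only a small classification layer is trained on a handful of domain examples. Everything here is an assumption made for illustration. In particular, `frozen_features` is a hypothetical stand-in that returns two hand-made features so the example runs without downloading a model; a real encoder would return a dense embedding.

```python
import math

OUTDOOR_WORDS = {"tent", "hiking", "trail", "camping"}
KITCHEN_WORDS = {"pan", "kettle", "oven", "knife"}


def frozen_features(text):
    """Hypothetical frozen encoder: it is never updated during training."""
    words = text.lower().split()
    outdoor = sum(w in OUTDOOR_WORDS for w in words)
    kitchen = sum(w in KITCHEN_WORDS for w in words)
    return [float(outdoor), float(kitchen)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_head(examples, epochs=200, lr=0.5):
    """Fit a logistic-regression layer on top of the frozen features."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for text, label in examples:
            x = frozen_features(text)
            pred = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            error = pred - label  # gradient of the log loss w.r.t. the logit
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias


def predict(text, weights, bias):
    """Probability that the text belongs to the 'outdoor' category."""
    x = frozen_features(text)
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


# Four labeled examples from the company's own catalog: 1 = outdoor, 0 = kitchen.
DOMAIN_EXAMPLES = [
    ("two person camping tent", 1),
    ("hiking trail boots", 1),
    ("cast iron pan", 0),
    ("electric kettle", 0),
]

weights, bias = train_head(DOMAIN_EXAMPLES)
```

Only the two weights and the bias are learned here, which is why so few examples are enough. With a real pre-trained encoder the same pattern applies, just with more dimensions.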


There are scores of algorithms to choose from. I mentioned BERT and the Universal Sentence Encoder, but there are many others. The choice comes down to three things: the task at hand (i.e., the solution you’re building); the quality, size, and performance of the algorithm and of the data set it was trained on; and the computational resources available. Some algorithms are quite large and require expensive GPU computing.
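
As a rough back-of-the-envelope aid for the compute question, the Python sketch below estimates how much memory a model's weights alone occupy at different numeric precisions. The parameter counts are the approximate published figures for BERT-base and BERT-large; the function names and the budget check are illustrative assumptions, and real memory use is higher once activations and batching are included.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}

# Approximate published parameter counts.
MODEL_PARAMS = {
    "BERT-base": 110_000_000,
    "BERT-large": 340_000_000,
}


def weight_memory_mb(num_params, dtype="fp32"):
    """Memory for the weights alone, in megabytes (10**6 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e6


def fits_budget(num_params, budget_mb, dtype="fp32"):
    """True if the weights alone fit within the given memory budget."""
    return weight_memory_mb(num_params, dtype) <= budget_mb


for name, params in MODEL_PARAMS.items():
    print(name, weight_memory_mb(params, "fp32"), "MB at fp32")
```

By this estimate, BERT-base needs about 440 MB for its weights at 32-bit precision and BERT-large about 1,360 MB, which is one reason smaller models or lower precision are often chosen for latency-sensitive search.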

A Hot Start

Pre-trained search algorithms can be leveraged to speed up development, solve the cold start problem, and reduce risk. Because they have already been trained on a wide variety of data sources, they give search models a strong starting point in any industry, and those models can keep improving as a company’s own data accumulates. By leveraging these pre-built algorithms, companies get a head start on building AI applications such as AI search. That faster development lets them deliver more engaging and relevant search experiences to their customers, ultimately leading to increased engagement, sales, and customer loyalty.


If you’d like to see the result of one such application, check out Algolia NeuralSearch. We leveraged pre-trained models for both query processing (NLP/NLU) and retrieval, which enables our customers to build with AI search from day one without having to construct their own models. To surmount the cost and scale problems inherent in vector-based solutions, we also designed a proprietary neural hashing algorithm, which processes vector and full-text keyword search simultaneously. It enables developers to deploy search with natural language understanding (NLU) for any application, from web to voice search and much more.
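
For readers unfamiliar with the general idea of hashing vectors, the Python sketch below shows a well-known generic technique, random-hyperplane (sign) hashing, combined with a simple keyword-overlap score. It is not Algolia's proprietary algorithm, whose details are not described here; the function names, the 16-bit code length, and the equal weighting in `hybrid_score` are all assumptions chosen for illustration.

```python
import random


def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random hyperplanes (one Gaussian normal vector each)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def hash_vector(vector, hyperplanes):
    """One bit per hyperplane: 1 if the vector is on its non-negative side."""
    return [1 if sum(v * h for v, h in zip(vector, plane)) >= 0 else 0
            for plane in hyperplanes]


def hamming_similarity(code_a, code_b):
    """Fraction of bit positions where the two codes agree."""
    matches = sum(a == b for a, b in zip(code_a, code_b))
    return matches / len(code_a)


def keyword_score(query, text):
    """Fraction of query words that also appear in the text."""
    query_words = set(query.lower().split())
    text_words = set(text.lower().split())
    return len(query_words & text_words) / len(query_words) if query_words else 0.0


def hybrid_score(query, query_vector, text, doc_code, hyperplanes, alpha=0.5):
    """Blend semantic (hash-based) and keyword relevance; alpha is assumed."""
    query_code = hash_vector(query_vector, hyperplanes)
    semantic = hamming_similarity(query_code, doc_code)
    return alpha * semantic + (1 - alpha) * keyword_score(query, text)


planes = make_hyperplanes(dim=3, n_bits=16)
doc_vector = [0.9, 0.1, 0.0]  # made-up embedding for "red running shoes"
doc_code = hash_vector(doc_vector, planes)
```

The appeal of this kind of scheme is that a short binary code is much cheaper to store and compare than a full floating-point vector, at the cost of some precision in the similarity estimate.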


Reach out to our team of experts to learn more about how you can be among the first to try out our powerful new search technology.



Also published here.