Yandex announced an update to its search engine, named Vega. The announcement offers many details about how modern search engines work.
Major Improvements to Yandex
According to Yandex, the Vega update features 1,500 improvements. Of those, Yandex highlighted two that it said impact the search results in an important way.
The first change adds expert human feedback to algorithm training. The second is the ability to double the size of the search index without impacting search result speed.
Crowd Sourcing Search Result Raters
Google employs contractors trained with Google’s quality raters guidelines to judge its search results. Yandex relies on its crowd sourcing platform, Yandex.Toloka.
While that may seem a little less controlled than Google’s method, Yandex provides rater guidelines in order to improve the accuracy of the ratings.
“People, or “assessors,” have long helped train our machine learning platforms through our crowd sourcing platform, Yandex.Toloka.
Using our search result evaluation guidelines, the assessors in Yandex.Toloka complete tasks that help us find the most relevant results for specific queries.”
Human Input to Algorithm Training
We know that Google uses quality raters to test drive new algorithm changes. Yandex does the same. It calls its raters assessors because they assess web results.
The change Yandex has added is to employ experts in a given topic to review the work of the assessors. Because the training data given to the algorithm is verified and vouched for by topic experts, the algorithm will (presumably) be more accurate.
This is how Yandex explained it:
“We’ve updated the ranking algorithm with neural networks trained on data provided by real experts in several fields, providing users with even higher quality solutions to their searches.
The professionals appraising the assessors range from IT administrators for data queries to hydrologists for searches relating to rivers.
The expert assessors use over a hundred criteria to evaluate the work of the assessors…
By training our machine learning algorithms with expert assessments, our search engine learns to rank relevant information higher in results thanks to the work of a highly qualified group of individuals.”
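Yandex does not describe how the expert reviews are folded into the training data. As a rough illustration only, one common way to use such reviews is to estimate how reliable each assessor is and weight their labels accordingly. In the sketch below, the function names, the data shapes, and the weighting scheme are all assumptions made for the example, not Yandex's method.

```python
from collections import defaultdict


def assessor_accuracy(expert_reviews):
    """Estimate each assessor's accuracy from expert spot checks.

    expert_reviews: list of (assessor_id, was_correct) pairs, where
    was_correct is the expert's verdict on one of that assessor's labels.
    (Hypothetical data shape, chosen for this illustration.)
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for assessor_id, was_correct in expert_reviews:
        totals[assessor_id] += 1
        if was_correct:
            correct[assessor_id] += 1
    return {a: correct[a] / totals[a] for a in totals}


def weighted_label(judgments, accuracy, default_weight=0.5):
    """Pick the label with the highest accuracy-weighted vote.

    judgments: list of (assessor_id, label) pairs for one query/page pair.
    Assessors with no expert reviews get default_weight (an assumption).
    """
    scores = defaultdict(float)
    for assessor_id, label in judgments:
        scores[label] += accuracy.get(assessor_id, default_weight)
    return max(scores, key=scores.get)


reviews = [("a", True), ("a", True), ("b", False), ("b", True),
           ("c", False), ("c", False)]
accuracy = assessor_accuracy(reviews)
label = weighted_label(
    [("a", "relevant"), ("b", "irrelevant"), ("c", "irrelevant")],
    accuracy,
)
```

In this toy data, one reliable assessor outweighs two unreliable ones, which is the general idea behind letting expert checks improve the quality of training labels.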
Expanding Search Index With Clustering
Yandex has introduced a very interesting way of handling topically similar web pages. Instead of searching through the entire index for an answer, Yandex groups web pages into topical clusters. This is said to improve and speed up search results by allowing the search engine to select an answer from pages that are topically relevant.
“Our algorithms use neural networks to now group pages into clusters based on their similarity. When a user enters a query, it’s searched among the most relevant cluster of pages, rather than our entire index.”
The clustering technology allowed Yandex to double its search index to 200 billion pages without slowing down how quickly it selects a web page.
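The idea of searching only the most relevant cluster can be sketched in a few lines. This is a toy illustration under stated assumptions: Yandex says it uses neural networks to group pages, while the example below assumes the clusters are already given and uses simple word counts with cosine similarity as a stand-in. All function names are hypothetical.

```python
import math
from collections import Counter


def vectorize(text):
    """Bag-of-words vector (a stand-in for a learned page representation)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
        math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def build_centroids(clusters):
    """Precompute one summed vector per cluster (done once, at index time)."""
    centroids = {}
    for name, pages in clusters.items():
        total = Counter()
        for page in pages:
            total.update(vectorize(page))
        centroids[name] = total
    return centroids


def search(query, clusters, centroids):
    """Pick the closest cluster first, then rank only that cluster's pages."""
    q = vectorize(query)
    best_cluster = max(centroids, key=lambda name: cosine(q, centroids[name]))
    best_page = max(clusters[best_cluster], key=lambda p: cosine(q, vectorize(p)))
    return best_cluster, best_page


clusters = {
    "rivers": ["volga river length and basin", "river flood levels in spring"],
    "servers": ["linux server backup guide", "configure nginx server logs"],
}
centroids = build_centroids(clusters)
result = search("volga river length", clusters, centroids)
```

The query is compared against one centroid per cluster and then only against the pages in the winning cluster, so the cost of a lookup grows with the number of clusters and the size of one cluster rather than with the size of the whole index.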
This is very interesting because it sounds similar to link ranking algorithms that begin with seed sites as representatives of topics. Web pages that are more links away from the seeds are judged to be less relevant to the topic. Pages that are located closer to the seeds for the topic are judged to be more relevant.
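To make the seed comparison concrete, here is a minimal sketch of measuring link distance from a set of seed pages with a breadth-first search. It illustrates the general seed-distance idea only and is not something Yandex describes in its announcement. The graph format and function name are assumptions for the example.

```python
from collections import deque


def link_distance_from_seeds(graph, seeds):
    """Return the minimum number of links from any seed to each reachable page.

    graph: dict mapping a page to the list of pages it links to
    (hypothetical format). Pages with a smaller distance would be
    treated as more relevant to the seeds' topic.
    """
    distance = {seed: 0 for seed in seeds}
    queue = deque(seeds)
    while queue:
        page = queue.popleft()
        for target in graph.get(page, []):
            if target not in distance:
                distance[target] = distance[page] + 1
                queue.append(target)
    return distance


graph = {
    "seed": ["a", "b"],
    "a": ["c"],
    "b": ["c"],
    "c": ["d"],
    "unlinked": ["d"],
}
distances = link_distance_from_seeds(graph, ["seed"])
```

Pages that cannot be reached from any seed, such as "unlinked" here, receive no distance at all, which a ranking system could treat as the weakest signal of topical relevance.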
Predicting Search Queries and Results
An interesting update to Yandex is the use of algorithms to predict what the user will ask and to “pre-render” the results for that search query. While this was announced in the context of the Vega update, this was actually implemented in March 2019.
What makes this a good feature is that it shortens the time it takes to show users the search results they are looking for.
“Since March, Yandex mobile users on Android have been searching with pre-rendering technology, which predicts the user’s query and selects relevant results as the user is typing.”
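The announcement does not say how the prediction works. A minimal sketch of the general pattern, assuming a simple prefix match against past query counts and a cache of results fetched ahead of time, might look like this. The class, its methods, and the popularity-based prediction are all hypothetical.

```python
class PreRenderer:
    """Toy model of predicting a query while the user types and
    fetching its results before the query is submitted."""

    def __init__(self, query_counts, search_fn):
        self.query_counts = query_counts  # past query -> how often it was seen
        self.search_fn = search_fn        # function that runs a real search
        self.cache = {}                   # predicted query -> prepared results

    def predict(self, prefix):
        """Return the most frequent past query starting with prefix, if any."""
        candidates = [q for q in self.query_counts if q.startswith(prefix)]
        if not candidates:
            return None
        return max(candidates, key=lambda q: self.query_counts[q])

    def on_keystroke(self, prefix):
        """Predict the query and prepare its results ahead of time."""
        predicted = self.predict(prefix)
        if predicted is not None and predicted not in self.cache:
            self.cache[predicted] = self.search_fn(predicted)
        return predicted

    def submit(self, query):
        """Serve prepared results when the prediction was right."""
        if query in self.cache:
            return self.cache[query]
        return self.search_fn(query)


calls = []


def fake_search(query):
    # Stand-in for a real search backend; records each call.
    calls.append(query)
    return [query + " result"]


renderer = PreRenderer({"weather moscow": 10, "weather map": 3}, fake_search)
predicted = renderer.on_keystroke("wea")
results = renderer.submit("weather moscow")
```

When the prediction matches what the user finally submits, the results come straight from the cache and the backend is queried only once, during typing.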
State of the Art Information Retrieval
Yandex is a Russian search engine that uses neural networks and machine learning. I find it’s good to understand the technologies in use around the world because it keeps me up to date with what defines modern information retrieval (the business of search engines) today.
Read the official Yandex algorithm update announcement here: