This week, on August 22, Yandex launched a new version of its search built on the Korolev algorithm. It is based on a neural network, which lets it match the meaning of a query against the meaning of a web page and respond far more accurately to complex and ambiguous queries. The new version is trained on search statistics and on ratings from millions of people, so it is not only the developers but all users who contribute to the development of the system.
Symbolically, the presentation of Korolev took place in the Moscow Planetarium. The speakers on stage were Andrei Styskin, head of Yandex.Search, Alexander Safronov, head of relevance services at Yandex.Search, and Olga Megorskaya, head of the data processing department at Yandex.Search. Trashbox.ru is ready to share photos, videos and impressions.
From MatrixNet to Neural Networks
Search engines appeared in the mid-1990s, when the Internet was very small, only a few thousand sites. At first, search engines simply compiled a list of pages that contained the specified words, without bothering to rank them by how well they matched the query. The more often the query words occurred in a document, the better. Clearly, with the global network in its current state, that approach no longer works.
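The word-counting approach of the early search engines can be sketched in a few lines of Python. This is only an illustration under simple assumptions (whitespace tokenization, lowercase matching); the function names and sample documents are hypothetical and do not come from any real search engine.

```python
from collections import Counter


def term_frequency_score(query: str, document: str) -> int:
    """Count how many times the query's words occur in the document."""
    counts = Counter(document.lower().split())
    # Each distinct query word contributes its number of occurrences.
    return sum(counts[word] for word in set(query.lower().split()))


def rank(query: str, documents: list[str]) -> list[str]:
    """Order documents by raw word count, highest first."""
    return sorted(
        documents,
        key=lambda doc: term_frequency_score(query, doc),
        reverse=True,
    )


# Hypothetical mini-collection of pages.
docs = [
    "cat food and cat toys for every cat",
    "a painting of a starry sky",
    "the manul is a wild cat from mongolia",
]

ranked = rank("lazy cat", docs)
```

For the query "lazy cat", the page that repeats the word "cat" most often comes first, even though it has nothing to do with what the user probably meant. That is the weakness of relying on words alone.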
To process queries, Yandex developed MatrixNet, a machine learning method that was used to build its proprietary ranking formula. However, the search still relied on words. But what about queries that users phrase figuratively or by association? In that case the right web page does not have to contain every word from the query. But how do you explain this to a machine? It would have to understand us the way a person does...
In the end, scientists came up with something at the intersection of technology and biology: the artificial neural network (ANN). In Wikipedia's wording, it is “a mathematical model, as well as its software or hardware implementation, built on the principle of organization and functioning of biological neural networks - networks of nerve cells of a living organism”. Neural networks can process information much as we do and, most importantly, can learn and hone their skills like living beings. They are, in effect, the foundation of full-fledged artificial intelligence, whose arrival is a matter of time.
Last year, Yandex introduced the Palekh search algorithm, which is based on a neural network. Neural networks had already shown excellent results on tasks that used to be within reach only of people: they did a great job of recognizing speech and objects in images. Palekh learned to convert search queries and web page titles into groups of numbers called semantic vectors. Their important property is that the vectors can be compared with one another: the more similar they are, the closer the meaning of the query is to that of the title.
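The idea of comparing semantic vectors can be illustrated with a short sketch. The article does not say which similarity measure Palekh uses, so cosine similarity here is an assumption, and the three-dimensional vectors are made up for the example (a real model would produce them and they would be much longer).

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means the same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical vector for the query [lazy cat from Mongolia].
query_vec = [0.9, 0.1, 0.3]

# Hypothetical vectors for two page titles.
title_vecs = {
    "Manul, a wild cat": [0.8, 0.2, 0.4],
    "Washing machine repair": [0.1, 0.9, 0.2],
}

# Pick the title whose vector is closest in meaning to the query.
best_title = max(
    title_vecs,
    key=lambda title: cosine_similarity(query_vec, title_vecs[title]),
)
```

Note that the query and the best-matching title share no words at all; the match comes entirely from the vectors pointing in a similar direction.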
Korolev: the one that understands
The next step in the development of neural-network-based search is the Korolev algorithm, which analyzes not only the title but the entire page. The number of pages that the search compares with the query by meaning has grown from 150 documents to 200,000. Among other things, Korolev has begun to take into account the meaning of other queries that bring people to the same page.
A neural network learns like a child, and to master all this it needed a huge number of examples. In effect, every user of the service took part in training Korolev in one way or another: search statistics and ratings from millions of people were used. Yandex is gradually learning to recognize semantic connections more accurately, for example that [the picture where the sky twists] refers to a Van Gogh painting, and that [lazy cat from Mongolia] is the manul.
“Search is a very complex system. Thousands of engineers are working to make sure it understands people and helps them solve their problems. In Korolev, we combined machine intelligence with the efforts of millions of people. Our users improve the search together with us by asking questions and helping to train our algorithms.”
Andrei Styskin, head of Yandex.Search.
In addition to analyzing everyday usage, training the search engine requires evaluating the quality of its answers. The more complex the system, the more ratings are required. Search quality used to be evaluated by a relatively small group of assessors who were members of the Yandex team, but now the volume had to grow substantially. This is how the Yandex.Toloka service came about (a toloka is a form of mutual assistance once practiced by villagers). Any enthusiast who is interested in a small reward and, of course, in the feeling of being involved in something important can perform simple tasks there. There are now more than a million such "tolokers", and together they have submitted more than 2 billion ratings.
“Modern search is based on complex algorithms. The algorithms are invented by developers and taught by millions of Yandex users. Every query is an anonymous signal that helps the machine understand people better. So we will not be mistaken if we say that the new search is a search we built together,” reads the post on the Yandex blog.
Over the more than two-year history of Yandex.Toloka, its most productive and diligent participant has been identified: Ilya Mikhalenko from Chelyabinsk. He came to the Korolev presentation in Moscow to receive a well-deserved award from the search team in person.
The new search in action
What practical improvement does Yandex offer? You can now talk to it almost as you would to a clever, well-read friend. (Even by voice.) For example, what do you do if you need to recall the title of a film from which you remember only one scene, while the names of the actors and the director have slipped your mind? You can ask your friends or look for help on a specialist forum. Or you can ask Korolev!
Image search has improved significantly. It is usually something of a mess: the search engine either mindlessly returns every image whose name contains the query words, or relies on the text of the article that the picture illustrates. If you are looking for something to satisfy a vague longing of the soul, get ready to be disappointed. Korolev analyzes what is actually shown in the picture, so it can please you with a non-trivial approach.
As an example, the demonstration used a not entirely obvious query: [cat in space]. Dogs have been in orbit quite often, but the whiskered and striped never made disciplined space explorers. Only one attempt is known for certain: in 1963, the French launched the cat Felicette on a suborbital flight. Romantic, but short-sighted: as soon as the scientists opened the hatch of the landing capsule, the cat bolted. The ceremonial photo shoot never took place.
For this query, the search engine returns not only animals in spacesuits and surreal photo montages, but also a photo of a cat in a washing machine, whose door looks very much like the hatch of a spaceship, even though nothing in the image's description says so.
For the ceremonial launch of the new search, the entire Yandex.Search team came on stage. A short countdown and... "Let's go!" Now everyone can try out the insightful Korolev. The main thing is that its current capabilities are not fixed but are constantly developing.
To end the evening, the organizers had saved something completely unexpected: a live link with real cosmonauts in orbit. They personally answered some popular space-related queries from search users, as well as questions from the audience.