Yandex is the fourth-largest search engine in the world, with most of its market share in Russia. On January 27th, 2023, it suffered a data leak that ranks among the largest a tech company has faced in recent times. Worse, this is the second Yandex leak in less than a decade.
In 2015, a former Yandex employee attempted to sell the Yandex search engine code on the black market for $30,000. The initial leak that emerged in 2023 revealed 1,922 ranking factors, of which 64 percent were listed as deprecated or unused. Although that part of the leak was labelled “kernel,” more files were found that, in combination, contain about 17,800 ranking factors. When it comes to SEO for Yandex, most of it is applicable.
Yandex, like Google, has always been transparent about algorithm updates and changes, though in recent years it has adopted machine learning. A few of the notable updates from the last two to three years include
On a personal level, this Yandex leak feels like a second Christmas. Since January 2020, a hobby news site has covered Yandex SEO and search news in Russia with 600+ articles, and this leak is probably that site’s peak event. It is the right time to test how well Yandex’s public statements match the secrets in the codebase.
Though Yandex is primarily known for its presence in Russia, the search engine also has a presence in Georgia, Turkey, and Kazakhstan. The data leak is widely believed to have been politically motivated, and it comprises a number of code fragments from Yandex’s monolithic Arcadia repository. Close to 44 GB of data was leaked, covering Yandex products including apps, Mail, Search, and Disk, along with Cloud.
Yandex has publicly stated that as of January 31st, 2023,
How much of the code can be actively used is a different question altogether.
Yandex has revealed that during its audit and investigation, it found a number of errors that were in violation of its own principles. So, it is likely that the portions of the leaked code that are currently in use may change in the future.
Yandex classifies its ranking factors into three major categories. These are specified in Yandex’s public SEO documentation, and knowing them helps in understanding the ranking factor leak.
From the data obtained so far, we have been able to draw a number of learnings and affirmations. A lot of data exists in the leak, and we are likely to come across new findings and connections in the coming weeks. They include
MatrixNet is mentioned in a few of the ranking factors. It was announced in 2009 but was superseded in 2017 by CatBoost, which was rolled out across the Yandex product sphere. This further substantiates Yandex’s comments that the leaked code comes from an outdated repository.
MatrixNet was introduced as a new core algorithm that took numerous ranking factors into consideration and assigned weights based on the user’s location, the actual search query, and the perceived intent. It can be compared to an early version of Google’s RankBrain, though the two are in fact separate systems. MatrixNet has since been built upon, which is not surprising considering that it is 14 years old.
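To make the idea of intent-dependent weighting concrete, here is a minimal sketch. It is not how MatrixNet is implemented (MatrixNet is based on gradient-boosted decision trees, not a plain weighted sum), and the factor names, intents, and weights below are hypothetical values chosen only for illustration.

```python
# Hypothetical factor names and weights, for illustration only.
# Each perceived intent gets its own set of weights.
INTENT_WEIGHTS = {
    "news": {"text_relevance": 0.4, "freshness": 0.5, "link_quality": 0.1},
    "informational": {"text_relevance": 0.6, "freshness": 0.1, "link_quality": 0.3},
}


def score_document(factors, intent):
    """Return a weighted sum of factor values using the weights for `intent`."""
    weights = INTENT_WEIGHTS[intent]
    # Factors missing from the document count as 0.0.
    return sum(weight * factors.get(name, 0.0) for name, weight in weights.items())


# Two made-up documents with factor values between 0 and 1.
fresh_article = {"text_relevance": 0.5, "freshness": 0.9, "link_quality": 0.2}
evergreen_guide = {"text_relevance": 0.8, "freshness": 0.1, "link_quality": 0.7}

for intent in INTENT_WEIGHTS:
    ranked = sorted(
        [("fresh_article", fresh_article), ("evergreen_guide", evergreen_guide)],
        key=lambda item: score_document(item[1], intent),
        reverse=True,
    )
    print(intent, [name for name, _ in ranked])
```

With these made-up numbers, the same two documents swap places depending on the perceived intent, which is the behaviour the weighting is meant to produce.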
In 2016, Yandex introduced the Palekh algorithm, which used deep neural networks to better match documents and queries, even when a document shared few keywords with the query but still satisfied the user’s intent. Palekh was capable of processing 150 pages at a time.
The leak shows that Yandex specifically takes URL construction into consideration.
The age of a document and its last-updated date are also important, which makes sense. Apart from that, a number of data points relate to freshness, more so for news-related queries. Yandex formerly used timestamps for ranking purposes, but their use for reordering is now classified as “unused.”
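As a rough illustration of how document age could feed a freshness signal, the sketch below uses exponential decay with a half-life. The function name, the decay formula, and the half-life values are assumptions made for this example. The leak descriptions discussed here do not specify a formula.

```python
from datetime import date


def freshness_score(last_updated, today, half_life_days):
    """Return a value in (0, 1] that halves every `half_life_days` days of age."""
    # Clamp at zero so a future-dated document does not score above 1.
    age_days = max((today - last_updated).days, 0)
    return 0.5 ** (age_days / half_life_days)


today = date(2023, 1, 31)
updated = date(2023, 1, 27)  # four days old

# Hypothetical half-lives: freshness decays fast for news-related queries
# and slowly for general queries.
print(freshness_score(updated, today, half_life_days=2))
print(freshness_score(updated, today, half_life_days=365))
```

A short half-life makes a four-day-old page look stale for a news query, while the same page remains almost fully fresh under the longer half-life.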
Like Google, Yandex has algorithms to combat link manipulation, and it has had them since the Nepot filter in 2005. From a review of the backlink factors and some of the specifics mentioned in their descriptions, the assumptions for building links for Yandex can be summarized as follows:
Below is an affirmation of existing best practices.
But there are some link-related factors that are additional considerations when planning, monitoring, and interpreting backlinks.
The data leak also revealed that the link spam calculator has about 80 active factors, alongside a number of deprecated ones. This raises the question of how Yandex judges how bad a link is, and how it handles negative SEO. A negative SEO attack is likely to arrive as a short burst. Yandex uses machine learning models to identify PBN and sponsored links, and those models make the same assumption about link velocity and the time frame over which links are acquired.
Paid-for links, by contrast, are generated over a longer period of time, and countermeasures were introduced to combat these patterns.
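The contrast between a short burst and links acquired over a longer period can be sketched with a simple concentration measure. This is an illustrative heuristic, not Yandex’s link spam calculation. The function name, the 7-day window, and the sample dates are all assumptions.

```python
from datetime import date, timedelta


def busiest_window_share(link_dates, window_days=7):
    """Return the share of all links acquired in the busiest `window_days`-day span."""
    if not link_dates:
        return 0.0
    dates = sorted(link_dates)
    best = 0
    start = 0
    # Sliding window over the sorted dates: advance `start` until the
    # window [dates[start], dates[end]] spans fewer than `window_days` days.
    for end in range(len(dates)):
        while (dates[end] - dates[start]).days >= window_days:
            start += 1
        best = max(best, end - start + 1)
    return best / len(dates)


# Made-up profiles: one link every 30 days versus ten links within five days.
steady = [date(2022, 1, 1) + timedelta(days=30 * i) for i in range(12)]
burst = [date(2022, 6, 1), date(2022, 9, 1)] + [
    date(2023, 1, 1) + timedelta(days=i % 5) for i in range(10)
]

print(busiest_window_share(steady))
print(busiest_window_share(burst))
```

A share close to 1 means most links arrived within a single week, which is the short-burst shape described above. A low share means acquisition was spread out over time.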
When it comes to advertising on the page, there are a number of factors to consider, a few of which are deprecated. The descriptions do not reveal the thought process behind them, but the point to consider is how well Google would tolerate ads that obscure the main content of a page.
Among known Yandex mechanisms, the Proxima update also took into consideration the ratio of useful to advertising content on a page.
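One simple way to picture such a ratio is as the share of a page taken up by useful content rather than ads. The sketch below measures this by character count per block. That choice, the block labels, and the function name are assumptions made for illustration. How Proxima actually measures the ratio is not described here.

```python
def useful_content_share(blocks):
    """Return useful characters divided by useful plus advertising characters.

    `blocks` is a list of (kind, length) pairs, where kind is "content" or "ad".
    """
    useful = sum(length for kind, length in blocks if kind == "content")
    ads = sum(length for kind, length in blocks if kind == "ad")
    total = useful + ads
    # An empty page has no meaningful ratio, so report 0.0.
    return useful / total if total else 0.0


# A made-up page: two content blocks with one ad block between them.
page = [("content", 600), ("ad", 200), ("content", 200)]
print(useful_content_share(page))
```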
Google and Yandex are disparate search engines with a number of differences, despite the thousands of engineers who have worked for both companies. Because of this fight for talent, it is reasonable to infer that a few of those master builders and engineers have built things in a similar fashion.
Just like professionals worldwide, Russian SEO professionals have had their say on the leak across numerous Runet forums. A point to consider is that a number of conclusions and findings from this data match those of SEO professionals in the Western world.