I’m a search geek. I read patents that offer hints and glimpses behind the curtains of search engines as if they were novels.
I watch for patents from specific inventors the way some people keep their eyes open for news of a new Marvel movie.
Patents don’t always provide actionable insights, but they do suggest questions to ask, things to look out for, and ideas to test about how search engines may be working.
I found a patent this summer that reminded me of the concept of a sea change, and of how search results could transform and undergo one.
One of the inventors I watch out for is Trystan Upstill, at one point the Head of Core Web Ranking and Mobile Content Search at Google.
He has been involved in some of the more interesting patents and processes at Google, like one I wrote about on How Google May Rank Some Results Based on Categorical Quality.
If you read about that one, you may see some similarities to the patent I am writing about today.
He writes about things that we may never visibly notice, which happen behind the scenes (or curtains) and decide which pages may fill the search results we see in response to a query.
A newly granted (July 2, 2019) patent from Google has his name on it as one of the inventors, and it was filed when he was still the head of Core Web Ranking at Google back in 2015.
Adjusted Search Features
The patent starts out simply enough, by telling us:
“The search system ranks the resources based on their relevance to the query and importance and provides search results that link to the identified resources, and orders the search results according to the rank.”
The results shown are responsive to a query. When determining search scores for the resources that appear in SERPs, the search engine looks at features of the webpages the query terms may appear on, at other aspects of the query, and possibly at other information.
But most patents describe a problem, and that problem explains why the patent was written: it presents an invented process that might address the problem.
Sometimes a patent will also tell us about the state of the technology at the time the patent was written. Here is the problem, and the state of the technology, as described in the summary section of the patent:
“Typically the search operation implements a robust search algorithm that performs well over a wide variety of resources. However, sometimes particular features for a particular query and a particular set of resources may be quite important in determining the search scores for the resources, while for other queries the particular features may be much less important. For example, for a particular query with certain terms, the presence of those terms in the resources may have a very strong impact on the search scores for the resources; conversely, for another query with different terms, the relative importance of the resources in an authority graph may have a much stronger impact on the search scores than the presence of query terms in the resources.
However, the relative importance of particular features for particular queries and resources is often difficult, if not impossible, to predict a priori.”
What this may mean is that when the importance of the features a page is ranked upon changes, Google might sometimes adjust search features and rescore resources in response.
The process behind the patent can include:
- Receiving data that indicates resources identified by a search operation as responsive to a query and ranked according to a first order. Each resource has a corresponding search score by which it is ranked relative to the other responsive resources, and the search operation scores each resource based, in part, on features of the resource and the query.
- Selecting a set of the resources.
- Determining, from that set and for each of the features of the resources and the query, an impact measure that measures the impact of the feature on the ranking of the resources that belong to the set.
- Re-scoring the resources for the query based, in part, on the impact measures, and ranking the set of resources according to a second order that is different from the first order.
- Providing, to a searcher in response to the query, search results according to the second order, each search result identifying a corresponding resource.
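To make those steps a little more concrete, here is a minimal Python sketch of the flow. It is only an illustration under my own assumptions: the patent does not say how scores or impact measures are computed, so the weighted-sum `score`, the "share of total contribution" `impact_measures`, the `boost` adjustment, and all of the feature names and numbers below are hypothetical.

```python
from typing import Dict, List, Tuple

Features = Dict[str, float]

def score(features: Features, weights: Dict[str, float]) -> float:
    """Stand-in search operation: a weighted sum of feature values (assumption)."""
    return sum(weights[name] * value for name, value in features.items())

def rank(resources: Dict[str, Features], weights: Dict[str, float]) -> List[str]:
    """Order resource ids by descending score."""
    return sorted(resources, key=lambda rid: score(resources[rid], weights), reverse=True)

def impact_measures(subset: List[str], resources: Dict[str, Features],
                    weights: Dict[str, float]) -> Dict[str, float]:
    """Hypothetical impact measure: each feature's share of the total
    weighted score across the selected subset."""
    totals = {name: 0.0 for name in weights}
    for rid in subset:
        for name, value in resources[rid].items():
            totals[name] += weights[name] * value
    grand_total = sum(totals.values()) or 1.0
    return {name: total / grand_total for name, total in totals.items()}

def adjust_weights(weights: Dict[str, float], impacts: Dict[str, float],
                   boost: float) -> Dict[str, float]:
    """Hypothetical adjustment: scale each weight up in proportion to its impact."""
    return {name: w * (1.0 + boost * impacts[name]) for name, w in weights.items()}

def rerank(resources: Dict[str, Features], weights: Dict[str, float],
           top_n: int = 3, boost: float = 2.0) -> Tuple[List[str], List[str]]:
    first_order = rank(resources, weights)            # initial ranking
    subset = first_order[:top_n]                      # select the top N
    impacts = impact_measures(subset, resources, weights)
    adjusted = adjust_weights(weights, impacts, boost)
    subset_resources = {rid: resources[rid] for rid in subset}
    second_order = rank(subset_resources, adjusted)   # re-score and re-rank
    return first_order, second_order

# Made-up feature values for four pages.
RESOURCES = {
    "a": {"ir": 0.9, "authority": 0.2, "feedback": 0.3},
    "b": {"ir": 0.5, "authority": 0.8, "feedback": 0.2},
    "c": {"ir": 0.8, "authority": 0.1, "feedback": 0.3},
    "d": {"ir": 0.2, "authority": 0.3, "feedback": 0.1},
}
WEIGHTS = {"ir": 1.0, "authority": 1.0, "feedback": 1.0}

first_order, second_order = rerank(RESOURCES, WEIGHTS)
print(first_order)   # order before adjustment
print(second_order)  # order of the top N after adjustment
```

In this toy data the information retrieval feature carries most of the weight among the top three pages, so boosting it moves page "a" ahead of page "b" in the second order.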
Many patents include a section in their summary that lists what they refer to as “advantages” for using the process described in the patent. They are a forecast of what the expected outcome of the patent might be.
For this patent the expected advantages include:
- Search operations may be adjusted to compensate for emergent phenomena that affect resource scoring.
- Those adjustments may be determined at query time so that the foundational search operation need not be adjusted, and thus the foundational search operation can be built on known priors.
- This approach allows for the retention of the foundational search operation that performs well for most resources in a corpus given a set of known priors, but also provides flexibility to adjust the search operation on a per-query basis when particular features affect the ranking of resources in a way that departs from the expected effects.
- The re-ranking of resources resulting from scoring pursuant to the adjusted search operation tends to surface more prominent resources that are more likely to satisfy a user’s informational need, thereby increasing the quality of the overall user experience.
The ultimate goal is expressed there as providing resources that are “more likely to satisfy a user’s informational need, thereby increasing the quality of the overall user experience.”
This adjusted search features patent can be found at:
Search operation adjustment and re-scoring
Inventors: Trystan G. Upstill, Andre Duque Madeira, Wisam Dakka and Zhong Xiu
Assignee: Google LLC
US Patent: 10,339,144
Granted: July 2, 2019
Filed: May 21, 2015
“Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for receiving queries, and for each received query: receiving data indicating resources identified by a search operation as being responsive to the query, wherein the search operation scores each of the resources based, in part, on features of the resource and the query, selecting a subset of the resources, determining, from the subset of resources and for each of the features of the resources and the query, an impact measure that measures the impact of the feature on the ranking of the resources that belong to the subset, adjusting the search operation based on the respective impact measures, and initiating the search operation to re-score the resources in the subset of resources based, in part, on the adjustment and to rank the subset of resources according to a second order that is different from the first order.”
More on Adjusted Search Features That May Change Search Engine Scores
I mentioned search engine scores that may be created according to “multiple features of the resource and the query.” These features could be related to:
- Information retrieval, such as features related to recall and precision.
- The relative authority of a resource in a resource graph.
- The query terms.
- User feedback of the resource given a query and other queries.
The patent tells us that “these features may be modeled in the search engine as parameters, and various parameter values may be selected for each parameter.”
How these search features are valued may be part of what makes search engine scores work well. They give us an example:
“For example, with respect to a resource’s authority score, a parameter value may be a weight by which a feature value for the resource–the authority score–is multiplied or otherwise adjusted; with respect to resource terms and query terms, parameter values may include synonyms, related terms, and weights by which matches of terms and term counts are multiplied or otherwise adjusted; and so on.”
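As a rough illustration of what "parameters" and "parameter values" could look like, here is a small Python sketch. The function names, the synonym table, and the weights are all hypothetical choices of mine; the patent only says that weights and synonyms may be parameter values, not how they are applied.

```python
from typing import Dict, List, Set

def authority_contribution(authority_score: float, authority_weight: float) -> float:
    """The feature value (authority score) multiplied by a weight parameter."""
    return authority_weight * authority_score

def term_match_score(query_terms: List[str], doc_terms: List[str],
                     synonyms: Dict[str, Set[str]],
                     exact_weight: float = 1.0,
                     synonym_weight: float = 0.5) -> float:
    """Count exact and synonym matches, each multiplied by its own weight."""
    total = 0.0
    for term in query_terms:
        total += exact_weight * doc_terms.count(term)
        for synonym in synonyms.get(term, set()):
            total += synonym_weight * doc_terms.count(synonym)
    return total

# Made-up example: the synonym table itself is one of the parameter values.
SYNONYMS = {"cheap": {"affordable"}, "laptop": {"notebook"}}
doc = "affordable laptop deals and laptop reviews".split()
match_score = term_match_score(["cheap", "laptop"], doc, SYNONYMS)
print(match_score)
print(authority_contribution(0.8, 2.0))
```

Changing `synonym_weight`, or the contents of `SYNONYMS`, changes the score without touching the page or the query, which is the sense in which a parameter value can be tuned.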
So, according to this patent, search could be a very complex process that combines several kinds of scoring contributions, based upon a number of different parameters related to features of web resources and of the query.
The search operation, once built, tends to perform well over a wide variety of search queries and documents. This could present some issues that need to be overcome, and the patent describes those for us.
It tells us that:
- For some queries and resources, some features may exhibit much more influence on the scoring of the resources than they do for other queries and other resources.
- For some queries and resources, some features may exhibit much less influence on the scoring of the resources than they do for other queries and other resources.
When a subject is a fairly new one on the Web (which they are referring to as an “emergent subject”), some aspects of a score may have more impact than others:
“Furthermore, such influences may be evanescent; for example, for an emergent subject, an information retrieval score may be more influential for the first several weeks, and then, at a later time, authority scores and user feedback scores may tend to grow in influence. Thus, tuning a search operation to compensate for these features is difficult prior to their detection, if not impossible.”
So, the focus of this patent is on “when certain features exhibit greater or lesser impacts on the ranking of resources for a search operation for a query and then adjust a search operation based on the impacts.”
If you’ve ever ranked a page in a fairly new subject area, and then one day seen the search results it appears in suddenly shift around and change (undergoing a sea change), the next paragraph from the patent could explain why that might happen as search results get adjusted:
“The adjusted search operation is then re-run on the identified resources to re-rank the resources in a manner that takes into account the detected impacts. In some implementations, an initial search for a query is executed, and a proper subset of the ranked resources, e.g., the top N ranked resources, is processed to determine appropriate modifications to the search operation. The search operation, adjusted by the appropriate modifications, is then re-run to re-score and re-rank the resources.”
When I read the next paragraph in the patent, I was reminded of How Google Search Ranking Works – Darwinism in Search, a post that Jason Barnard wrote about ranking at Google, based upon information he had received from Gary Illyes, Webmaster Trends Analyst at Google:
“The search engine utilizes a search operation that generates search scores for the resources and ranks the resources based on search scores. The search operation quantifies the relevance of the resources to the query, and the quantification can be based on a variety of factors. Such factors include information retrieval (“IR”) scores, user feedback scores, and optionally a separate ranking of each resource relative to other resources (e.g., an authority score). The search results are ordered in a first order according to these search scores and provided to the user device according to the first order, or, in some situations, may be re-ranked by an adjusted search operation and provided to the user device as search results ranked according to a second order that is different from the first order.”
This patent also tells us about feedback scores based upon information from query logs and click logs:
“In some implementations, the queries submitted from user devices are stored in query logs. Click data for the queries and the web pages referenced by the search results are stored in click logs. The query logs and the click logs define search history data that include data from and related to previous search requests. The query logs and click logs can be used to map queries submitted by the user devices to web pages that were identified in search results and the actions taken by users. The click logs and query logs can thus be used by the search system to determine queries submitted by the user devices, the actions taken in response to the queries, and how often the queries are submitted. Such information can be stored as feedback scores for the queries and resources.”
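Here is a minimal Python sketch of how query logs and click logs might be turned into feedback scores. The patent does not define the formula, so the choice below (clicks on a page divided by how often the query was submitted) is my assumption, and the log entries and `.example` URLs are made up.

```python
from collections import Counter
from typing import Dict, List, Tuple

def feedback_scores(query_log: List[str],
                    click_log: List[Tuple[str, str]]) -> Dict[Tuple[str, str], float]:
    """Hypothetical feedback score per (query, url):
    clicks on the url for that query / times the query was submitted."""
    submissions = Counter(query_log)   # how often each query was submitted
    clicks = Counter(click_log)        # how often each (query, url) was clicked
    return {
        (query, url): count / submissions[query]
        for (query, url), count in clicks.items()
        if submissions[query]          # skip clicks with no logged query
    }

# Made-up logs.
QUERY_LOG = ["sea change"] * 4 + ["patent search"] * 2
CLICK_LOG = [
    ("sea change", "a.example"),
    ("sea change", "a.example"),
    ("sea change", "b.example"),
    ("patent search", "c.example"),
]

scores = feedback_scores(QUERY_LOG, CLICK_LOG)
for key, value in sorted(scores.items()):
    print(key, value)
```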
And Then There Is Reranking of Results, or Adjusted Search Features
The patent describes this adjustment as one that may modify search results when there are shifts in the values that those results were scored upon:
“…the re-ranking engine, for each query, processes resources identified by a search operation as being responsive to the query and ranked according to the first order, selects a proper subset of the resources, and determines, for each feature the search operation takes into account, an impact measure that measures the impact of the feature on the ranking of the resources. The re-ranking engine can then adjust the search operation based on the respective impact measures, and initiate a subsequent run of the search operation to re-score the resources based, in part, on the adjustment, resulting in the search results.”
Search Operation Adjustment & Re-Ranking Resources
When search results are ranked, the influence of each feature involved in ranking them is calculated, and changes in that influence may be measured as an impact.
If the impact doesn’t meet a threshold, then the re-ranking engine will not rerank the search results. If it does meet that threshold, then the results will be re-ranked.
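A threshold check like that could be sketched in Python as follows. The patent does not say what the impact is compared against, so comparing measured impacts to a set of expected ("prior") impacts, the `threshold` value, and the feature names are all assumptions on my part.

```python
from typing import Dict

def should_rerank(measured: Dict[str, float], expected: Dict[str, float],
                  threshold: float = 0.15) -> bool:
    """Hypothetical check: re-rank only if some feature's measured impact
    departs from its expected impact by at least the threshold."""
    return any(
        abs(measured[name] - expected.get(name, 0.0)) >= threshold
        for name in measured
    )

# Made-up expected and measured impacts.
EXPECTED = {"ir": 0.40, "authority": 0.40, "feedback": 0.20}
emergent_topic = {"ir": 0.60, "authority": 0.25, "feedback": 0.15}
settled_topic = {"ir": 0.45, "authority": 0.38, "feedback": 0.17}

print(should_rerank(emergent_topic, EXPECTED))
print(should_rerank(settled_topic, EXPECTED))
```

In this toy example the emergent topic leans on information retrieval far more than expected, so it would trigger a re-rank, while the settled topic stays within the threshold and keeps its first order.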
The patent provides this peek at how reranking might take place when Google decides to use adjusted search features:
“…then the process adjusts the search operation based on the impact measures (314). A variety of adjustments can be used. For example, depending on a category of the query, the search algorithm may be adjusted in different ways. By way of one example, if a query is categorized as being a “product” seeking query, then a relevance weight parameter value related to certain commercial content, such as reviews, pricing information, etc., may be increased; conversely, if a query is categorized as being an “informational” seeking query, then the relevance weight parameter value related to certain commercial content, such as reviews, pricing information, may be decreased, while a relevance weight parameter value related to anchor text linking to the resource may be increased, etc.”
And synonyms may play a role as well:
“…if an impact measure related to synonym matching terms is high, then the feature of query expansion may be adjusted such that a more aggressive form of query expansion is used.”
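The two adjustments quoted above could look something like this in Python. This is a sketch under my own assumptions: the parameter names (`commercial_content_weight`, `anchor_text_weight`), the `step` size, and the cutoff for a "high" synonym impact are hypothetical, since the patent gives no values.

```python
from typing import Dict

def adjust_for_category(params: Dict[str, float], category: str,
                        step: float = 0.25) -> Dict[str, float]:
    """Return a copy of the parameters adjusted for the query category."""
    adjusted = dict(params)
    if category == "product":
        # Product-seeking query: weigh reviews, pricing, etc. more heavily.
        adjusted["commercial_content_weight"] += step
    elif category == "informational":
        # Informational query: less commercial content, more anchor text.
        adjusted["commercial_content_weight"] = max(
            0.0, adjusted["commercial_content_weight"] - step)
        adjusted["anchor_text_weight"] += step
    return adjusted

def expansion_level(synonym_impact: float, high: float = 0.3) -> str:
    """Use more aggressive query expansion when synonym matches matter a lot."""
    return "aggressive" if synonym_impact >= high else "standard"

BASE_PARAMS = {"commercial_content_weight": 1.0, "anchor_text_weight": 1.0}
product_params = adjust_for_category(BASE_PARAMS, "product")
informational_params = adjust_for_category(BASE_PARAMS, "informational")

print(product_params)
print(informational_params)
print(expansion_level(0.45), expansion_level(0.10))
```

The adjustment returns a copy rather than changing `BASE_PARAMS`, which mirrors the patent's point that the foundational search operation can stay as it is while a per-query adjustment is applied on top.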
Adjusted Search Features Takeaways
The article that Barnard wrote names specific types of features that may be used to rank pages, such as topicality, quality, speed, RankBrain, entities, structured data, and freshness.
Those aren’t described or discussed in any detail in this patent, but they do seem like they could be among the features of resources or queries that the patent says could influence how a page may be ranked.
If you haven’t had a chance to read Barnard’s post, I would recommend it. I read it around the same time that I first saw this patent, and I highlighted the paragraph from this patent that tells us that pages may be ranked based upon a variety of factors.
While this patent doesn’t tell us the same factors that Barnard was told, the idea that multiple factors may be involved in ranking pages at Google is one worth exploring in more detail, if you can.
What this patent adds to what Barnard told us is that Google may, upon seeing the impact of the ranking signals it used to rank a page change beyond a certain threshold, adjust rankings by applying a reranking process.
So, you may have gotten used to the results for a particular query that you have been following, knowing that SERP well and who else occupies positions in it, and then suddenly see it shift around and change.
If so, it is possible that Google may have adjusted search features and changed those results because the impact of the ranking signals behind those features may have changed.
All screenshots taken by author, September 2019