Searching for Quotes Has Shifted at Google with an Updated Continuation Patent
In August of 2017, I wrote the post Google Searching Quotes of Entities, about a patent called Systems and methods for searching quotes of entities using a database.
I noticed that this patent was updated last year (February 2019) with a continuation patent. I like comparing the claims in older patents with the claims from newer continuation patents. A change in the claims is a message saying, “We used to do something one way, but we have changed how we do it now, and want to protect our intellectual property by updating the claims in this patent with a newer version of it.”
Reviewing the Patents on Quote Searching
It appears that Google is paying more attention to indexing audio, and that shows in the claims of this updated patent.
Here is a comparison of the claims from the patents.
The first claim from the 2017 version – Systems and methods for searching quotes of entities using a database:
1. A computerized system for searching and identifying quotes, the system comprising: a memory device that stores a set of instructions; and at least one processor that executes the set of instructions to: receive a search query for a quote from a user; parse the query to identify one or more key words; match the one or more key words to knowledge graph items associated with candidate subject entities in a knowledge graph stored in one or more databases, wherein the knowledge graph includes a plurality of items associated with a plurality of subject entities and a plurality of relationships between the plurality of items; determine, based on the matching knowledge graph items, a relevance score for each of the candidate subject entities; identify, from the candidate subject entities, one or more subject entities for the query based on the relevance scores associated with the candidate subject entities; identify a set of quotes corresponding to the one or more subject entities; determine quote scores for the identified quotes based on at least one of the relationship of each quote to the one or more subject entities, the recency of each quote, or the popularity of each quote; select quotes from the identified quotes based on the quote scores; and transmit information to a display device to display the selected quotes to the user.
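The steps in that 2017 claim can be illustrated with a small sketch. This is only a toy reading of the claim language, not Google's implementation: the knowledge graph, the quotes, the stop word list, the function names, and the scoring weights below are all made-up assumptions for illustration.

```python
from dataclasses import dataclass

# Hypothetical toy knowledge graph: entity -> set of associated items.
KNOWLEDGE_GRAPH = {
    "Abraham Lincoln": {"lincoln", "president", "gettysburg", "address"},
    "Lincoln, Nebraska": {"lincoln", "nebraska", "city", "capital"},
}


@dataclass
class Quote:
    text: str
    entity: str
    year: int
    popularity: float  # made-up value between 0 and 1


QUOTES = [
    Quote("Four score and seven years ago...", "Abraham Lincoln", 1863, 0.9),
    Quote("A house divided against itself cannot stand.", "Abraham Lincoln", 1858, 0.7),
]

# Assumed stop words; the claim does not say how key words are chosen.
STOP_WORDS = {"a", "an", "the", "of", "by", "from", "about", "quote", "quotes"}


def parse_keywords(query):
    """Parse the query into lowercase key words, dropping stop words."""
    return [w for w in query.lower().split() if w not in STOP_WORDS]


def entity_relevance(keywords):
    """Relevance score per candidate entity: count of key words matching its items."""
    return {
        entity: sum(1 for k in keywords if k in items)
        for entity, items in KNOWLEDGE_GRAPH.items()
    }


def quote_score(quote, now_year):
    """Blend popularity and recency; the equal weights are an arbitrary assumption."""
    recency = 1.0 / (1 + max(0, now_year - quote.year))
    return 0.5 * quote.popularity + 0.5 * recency


def search_quotes(query, now_year=2020, top_n=1):
    """Pick the best-matching subject entities, then rank their quotes."""
    scores = entity_relevance(parse_keywords(query))
    best = max(scores.values())
    if best == 0:
        return []  # no key word matched any knowledge graph item
    subjects = {e for e, s in scores.items() if s == best}
    candidates = [q for q in QUOTES if q.entity in subjects]
    ranked = sorted(candidates, key=lambda q: quote_score(q, now_year), reverse=True)
    return [q.text for q in ranked[:top_n]]
```

In this sketch, a query such as "quotes from the gettysburg address" matches the Abraham Lincoln entry first, and only then are that entity's quotes scored and selected. The point is the order of operations in the claim: entities are identified through the knowledge graph before any quote is considered.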
The first claim from the 2019 version – Systems and methods for searching quotes of entities using a database:
1. A method comprising the following operations performed by one or more processors: receiving audio content from a client device of a user; performing audio analysis on the audio content to identify a quote in the audio content; determining the user as an author of the audio content based on recognizing the user as the speaker of the audio content; identifying, based on words or phrases extracted from the quote, one or more subject entities associated with the quote; storing, in a database, the quote, and an association of the quote to the subject entities and to the user being the author; subsequent to storing the quote and the association: receiving, from the user, a search query; parsing the search query to identify that the search query requests one or more quotes by the user about one or more of the subject entities; identifying, from the database and responsive to the search query, a set of quotes by the user corresponding to the one or more of the subject entities, the set of quotes including the quote; selecting the quote from the quotes of the set based at least in part on the recency of each quote; and transmitting, in response to the search query, information for presenting the selected quote to the user via the client device or an additional client device of the user.
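The flow in that 2019 claim can also be sketched in a few lines. This is a hedged illustration only: it assumes the audio analysis and speaker recognition have already happened upstream, so the code starts from a transcript and a speaker name. The QuoteStore class, its method names, and the list of known entities are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass

PUNCTUATION = ".,!?\"'"


def words_in(text):
    """Lowercased words with surrounding punctuation stripped."""
    return {w.strip(PUNCTUATION).lower() for w in text.split()}


@dataclass
class StoredQuote:
    text: str
    author: str
    entities: frozenset
    timestamp: int  # any increasing number; larger means more recent


class QuoteStore:
    def __init__(self, known_entities):
        # Hypothetical entity list; the claim only says entities are
        # identified from words or phrases extracted from the quote.
        self.known_entities = {e.lower() for e in known_entities}
        self.quotes = []

    def add_transcript(self, transcript, speaker, timestamp):
        """Store a quote that was transcribed and attributed upstream (assumed)."""
        entities = frozenset(words_in(transcript) & self.known_entities)
        self.quotes.append(StoredQuote(transcript, speaker, entities, timestamp))
        return entities

    def search(self, user, query):
        """Return the user's most recent quote about an entity named in the query."""
        wanted = words_in(query) & self.known_entities
        matches = [
            q for q in self.quotes
            if q.author == user and q.entities & wanted
        ]
        if not matches:
            return None
        return max(matches, key=lambda q: q.timestamp).text


store = QuoteStore({"pizza", "paris"})
store.add_transcript("I think pizza is overrated.", "alice", 100)
store.add_transcript("Pizza is the best food ever!", "alice", 200)
store.add_transcript("Pizza again?", "bob", 300)
latest = store.search("alice", "what did I say about pizza")
```

Here the quotes are stored with their author and subject entities at the time the audio is processed, and a later search filters by author and entity and then selects by recency. No knowledge graph lookup appears anywhere in this flow, which mirrors the difference between the two claims.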
If you want to read about how this patent was originally intended to work, I detailed that process when I wrote about the original patent, which was granted in 2017. The continuation patent was filed in 2017 and was granted last spring. The first version tells us about finding quotes by looking at knowledge graph entries. The newer claim leaves out the phrase “knowledge graph,” and tells us instead that it is specifically looking at audio content, and performing analysis on that audio content to collect quotes from entities.
What this update tells me is that Google is going to rely less upon finding quote information in knowledge base sources, and work on collecting quote information by performing audio analysis. This seems to indicate a desire to build an infrastructure that doesn’t rely upon humans to update a knowledge graph, but instead relies upon automated programs that can crawl content on the web, analyze that information, and index it. This does look like an attempt to move towards an approach that can scale to the size of the web without relying upon people to record quotes from others.
I am seeing videos at the top of results when I search for quotes from movies, and for quotes that have been reported upon in the news, such as President Trump referring to a phone call he had with the leader of Ukraine as a “perfect phone call.”
Note that Google is showing videos as search results for that quote.
I tried a number of quotes that I am familiar with from history and from movies, and I am seeing videos with those quotes in them at or near the top of search results. That isn’t proof that Google is using audio from videos to identify the sources of those quotes, but it isn’t a surprise after seeing how this patent has changed.
Has Google gotten that much better at understanding what is said in videos and indexing such content? The updated patent may be telling us that they have more confidence in how they have indexed video content. I would still recommend making transcripts of any videos that you publish to the web, to be safe in making sure content from a video gets indexed correctly. But it is possible that Google has gotten better at understanding audio in videos.
Of course, this change may be one triggered by an understanding of the intent behind quote searches. It’s possible that when someone searches for a quote, they may be less interested in learning who said something, and more interested in watching or hearing them say it. This would be a motivation for making sure that a video ranks highly in search results.