• Veveo Pioneers "Siri on Steroids" Voice-Based Video Search

    Veveo, a provider of search solutions for connected devices, has debuted a new voice- and natural-language-based "conversational interface" technology for video search. Currently available for trial and slated for release in Q1 '13 as part of its Reveal 3.0 product, the new voice capability is targeted at pay-TV operators, connected-device manufacturers and set-top box providers eager to give users more flexibility in how they navigate the ever-increasing array of video choices.

    Sam Vasisht, Veveo's CMO, took me through a deep dive of the technology last week and explained how Veveo's approach goes beyond today's popular voice navigation options like Siri. Veveo defines conversational interfaces as those that let people find the videos they want using casual language (the way we actually talk to each other) rather than the stilted way we typically address today's voice-response systems.

    Sam discussed three key ingredients to making conversational interfaces effective: disambiguation, statefulness and personalization. Disambiguation means being able to identify a user's intent (e.g. distinguishing "Cruise" from "Cruz") or to ascribe the correct meaning to a word (e.g. "Eagles" as football team vs. as a rock band). Statefulness means either maintaining the context of the user's search or adapting quickly to a new one (e.g. from movies to sports). Personalization means retaining knowledge of a user's prior searches in order to inform subsequent ones.
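
    To make the three ingredients concrete, here is a toy sketch in Python. It is not Veveo's implementation; the CATALOG entries, the Session class and the resolve function are all hypothetical names invented for illustration, and the scoring rule is an assumption. It shows a term like "Eagles" being resolved using the current conversation context first (statefulness) and the user's past choices second (personalization).

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical catalog: each spoken term maps to candidate (entity, domain) pairs.
CATALOG = {
    "eagles": [("Philadelphia Eagles", "sports"), ("Eagles (band)", "music")],
    "cruise": [("Tom Cruise", "movies")],
}


@dataclass
class Session:
    # Statefulness: the domain the conversation is currently about.
    domain: Optional[str] = None
    # Personalization: how often this user has ended up in each domain.
    history: Counter = field(default_factory=Counter)


def resolve(term: str, session: Session, hint: Optional[str] = None) -> Optional[str]:
    """Pick the most plausible entity for `term` (assumed scoring rule)."""
    candidates = CATALOG.get(term.lower(), [])
    if not candidates:
        return None
    if hint is not None:
        # The user explicitly changed the subject, so adopt the new context.
        session.domain = hint

    def score(candidate):
        _, domain = candidate
        # Current context outranks long-term history; ties keep catalog order.
        return (domain == session.domain, session.history[domain])

    name, domain = max(candidates, key=score)
    # Remember the outcome so later queries can build on it.
    session.domain = domain
    session.history[domain] += 1
    return name


session = Session()
first = resolve("Eagles", session, hint="music")   # context says music
second = resolve("Eagles", session)                # context is retained
third = resolve("Cruise", session)                 # context shifts to movies
```

    A real system would also have to handle homophones such as "Cruise" and "Cruz" at the speech-recognition stage, which this sketch leaves out.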

    Note that these are all things we humans do when we interact, allowing us (for the most part) to communicate and exchange information easily. Of course, we take all this for granted!

    Underlying Veveo's conversational interface is its SmartRelevance Conversational Platform, based on 32 issued patents and 65 patent applications. The platform consists of a "Knowledge Graph" (a reference set of named entities, with algorithms to make use of them), a "Content Graph" (actual content assets mapped to the Knowledge Graph), a "Personal Graph" (a contextual learning engine for users' behaviors and interests) and a "Conversational Query Engine" (a front end that binds the other elements together and allows natural language processing).
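
    One way to picture how these four pieces might fit together is the minimal Python sketch below. Every class, field and method name here is an assumption made for illustration, not Veveo's actual design or API: entities are matched by simple substring lookup, assets are linked to entity ids, and results are ordered by a per-user watch count.

```python
from dataclasses import dataclass, field


@dataclass
class KnowledgeGraph:
    # Hypothetical: entity id -> set of lowercase names it can be called by.
    aliases: dict = field(default_factory=dict)

    def lookup(self, utterance: str) -> list:
        text = utterance.lower()
        return [eid for eid, names in self.aliases.items()
                if any(name in text for name in names)]


@dataclass
class ContentGraph:
    # Hypothetical: asset title -> set of entity ids the asset is linked to.
    assets: dict = field(default_factory=dict)

    def assets_for(self, entity_id: str) -> list:
        return [title for title, ents in self.assets.items() if entity_id in ents]


@dataclass
class PersonalGraph:
    # Hypothetical: asset title -> how many times this user watched it.
    watched: dict = field(default_factory=dict)

    def affinity(self, title: str) -> int:
        return self.watched.get(title, 0)


@dataclass
class QueryEngine:
    knowledge: KnowledgeGraph
    content: ContentGraph
    personal: PersonalGraph

    def search(self, utterance: str) -> list:
        titles = []
        # Utterance -> entities -> assets, without duplicates.
        for entity_id in self.knowledge.lookup(utterance):
            for title in self.content.assets_for(entity_id):
                if title not in titles:
                    titles.append(title)
        # Titles the user has watched most come first.
        return sorted(titles, key=self.personal.affinity, reverse=True)


engine = QueryEngine(
    KnowledgeGraph({"tom_cruise": {"tom cruise"}}),
    ContentGraph({"Top Gun": {"tom_cruise"}, "Jerry Maguire": {"tom_cruise"}}),
    PersonalGraph({"Jerry Maguire": 2}),
)
results = engine.search("Show me movies with Tom Cruise")
```

    The point of the separation is that the entity reference data, the catalog of assets and the per-user signals can each change independently while the query front end stays the same.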

    All of that sounds like pretty complicated stuff, but it is all needed in order to deliver voice-based search beyond today's relatively simple efforts. And Veveo isn't coming at this cold; it is well-versed in video search, as Reveal is already used on 100 million+ set-top boxes and mobile devices, serving pay-TV operators Comcast, Cablevision, Rogers, DirecTV and others. Reveal can be implemented in operators' networks, in the cloud, hosted by Veveo or as a mix of these.

    One of the key frustrations of users these days is simply finding the video they desire. And with more video becoming available daily, the problem is only going to get worse. The prospect of a conversational search alternative - a "Siri on steroids" where we casually engage with our devices and they bring back what we want - is an enticing vision that Veveo appears ready to deliver.

    The video below shows a nice demo of the conversational interface.