Answers supplied by Hiren Hindocha, CEO, Digital Nirvana
What does Big Data bring to this sector?
When we talk about big data in broadcast, we’re talking about the hundreds of terabytes or even petabytes of data that a system gathers during direct interaction with end users. Typically that happens when broadcasters make their content available through VOD or streaming options. Broadcasters can analyze this big data to understand customers’ preferences, which in turn helps them serve better content to viewers and serve the right demographic to advertisers.
Besides the massive amounts of data exchanged between broadcasters and end users, big data also refers to the many content feeds most broadcasters ingest continuously and simultaneously. For example, the volume of incoming video feeds for a news organization is huge: several hundred gigabytes to terabytes every day. If broadcasters can make sense of that big data, they can use it to help make content.
What are the possibilities of Artificial Intelligence in the broadcast industry?
Applying artificial intelligence across audio and video opens a world of possibilities for the broadcast industry. For example, speech-to-text technology has reached a point where it is better than humans at understanding specific domains. A well-trained speech-to-text engine can provide a very accurate transcript and captions of incoming content. At the same time, other well-trained engines can recognize faces, detect on-screen text, identify objects in the background, and more.
In the case of multiple and continuous incoming video feeds, artificial intelligence can help describe what is in the feed and make it very easy for editors to find what they’re looking for. AI capabilities can also generate metadata that makes the content easily searchable and retrievable, leading to easier content creation and better content publishing decisions.
How can all these technologies enrich the content consumer experience?
Once content providers become more familiar with user preferences, they can bubble up content within their archive that is better suited to those preferences. They can also use AI to quickly access data that will inform their content feeds, such as in social media channels. Take World Cup soccer, for example. Rightsholder Fox Sports could use AI technology to identify moments in the game that are worthy of viewing, and within a few minutes of the game ending, they can put up those highlights on YouTube. Before AI, this process would have taken a human many hours.
And of course, the more consumers watch content that is in tune with their preferences (action, drama, certain news topics), the better the system gets at predicting and serving similar content. That’s an example of tailoring the content for a better consumer experience.
How should traditional broadcasting adapt to these technologies to get the most out of them?
Broadcasters need a website or platform where users can search, find, and consume content. The smaller the chunks broadcasters produce, the greater the consumption, which means they need to be able to capture all of that consumer information and make it useful. (See the examples mentioned before.) To do that at scale, broadcasters have to adopt technologies that can process the information faster and better than an army of people could.
From what perspective does Digital Nirvana approach the use of technology associated with Big Data?
We don’t do anything with big data at this point.
From what perspective does Digital Nirvana approach the possibilities of Artificial Intelligence?
Digital Nirvana believes AI has great potential to accelerate media workflows and make life easier for our clients. To realize that potential, we’re always looking for new and better ways to help our clients use artificial intelligence tools like speech-to-text and facial or object recognition to describe what is in their audio and video.
Digital Nirvana is doing a lot of work on training speech-to-text engines to automatically recognize who is speaking and what they are saying — such as distinguishing one media personality from another and identifying different topics.
How does Digital Nirvana intend to take advantage of the confluence of both technologies?
Our focus right now is to leverage AI technologies in the audio, video, and natural language processing sectors. Natural language processing is the ability to understand what is being said in the content. Not only can we provide a verbatim transcript of what is being said, but we also use natural language processing to figure out who is doing the talking and what the topic and context are. For example, our Trance application uses multiple technologies, including automatic speech-to-text and an automated translation engine. Our goal is to make sure those engines keep getting better and better.
It has not yet become a mainstream technology, but there are already many developments and pilot projects. Which ones would you highlight as the most challenging and interesting?
One pilot project we’ve been working on with a major U.S. broadcaster is automatic segmentation and summarization of incoming news feeds. Suppose the programming in the feed lasts 60-90 minutes and contains multiple segments on different topics. Today in production, we are generating real-time text of that content, but in the future, we’ll automatically be able to figure out which people and places are being discussed in that feed, then provide a headline and summary of each of the segments. We’ll also be able to detect changes in topics and categorize accordingly. This is not an easy thing to do.
A similar use case relates to podcasts. Today, a well-designed podcast will have what we call chapter markers within an hour-long or 45-minute podcast. The chapter markers delineate the different segments, and there are show notes related to each chapter marker. Right now this process is done manually. We foresee technology that will listen to a podcast and automatically generate chapter markers along with a summary of each chapter.
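To make the idea of detecting topic changes concrete, here is a minimal sketch in Python. It is not Digital Nirvana's method. It assumes a transcript already split into sentences and uses a simple word-overlap heuristic (in the spirit of lexical-cohesion segmenters such as TextTiling), where a production system would likely rely on trained language models. The function names, the tiny stopword list, the window size, and the threshold are all hypothetical choices made for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (illustration only).
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is",
             "on", "for", "at", "by", "as", "after"}


def bag_of_words(text):
    """Count the non-stopword, lowercased words in a piece of text."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(w for w in words if w not in STOPWORDS)


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def segment(sentences, window=2, threshold=0.1):
    """Return each index i at which sentences[i] starts a new segment.

    A boundary is placed wherever the `window` sentences before position i
    share almost no vocabulary with the `window` sentences from i onward.
    """
    boundaries = []
    for i in range(1, len(sentences)):
        left = bag_of_words(" ".join(sentences[max(0, i - window):i]))
        right = bag_of_words(" ".join(sentences[i:i + window]))
        if cosine(left, right) < threshold:
            boundaries.append(i)
    return boundaries


# Made-up transcript: three sentences about weather, then three about sports.
transcript = [
    "The storm hit the coast overnight with heavy rain.",
    "Rain and wind from the storm closed coastal roads.",
    "Storm damage along the coast is still being assessed.",
    "The team won the championship game last night.",
    "Fans celebrated the championship win after the game.",
    "The coach praised the team after the game.",
]

chapter_starts = segment(transcript)
print(chapter_starts)  # [3]: the sports segment starts at the fourth sentence
```

Each returned index could serve as a candidate chapter marker or news-segment boundary, with a separate summarization step producing the headline or show notes for the text between markers.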
Finally, Digital Nirvana is developing an advertising intelligence capability for a large ad intelligence provider that needs to analyze advertisements at scale. This provider must process close to 20 million advertisements per year, and there is no way to do it manually. They have to use technology.
The technology we’re developing will look at an advertisement, whether it be outdoor creative, a six-second social media advertisement, or a 30-second broadcast commercial, and determine the product, the brand, and the category (e.g., alcohol ad, political ad, automobile commercial). That kind of analysis is a challenge, and being able to do it automatically will significantly improve this company’s workflow.
What future developments is Digital Nirvana involved in regarding the capabilities of this technology?
Digital Nirvana already processes media in a multitude of languages. Our goal is to keep evolving so that we not only improve the accuracy of our existing languages but continually add new ones.
Also, we are looking at ways to apply generative AI (AI that helps generate content) to the media and entertainment space.