M+E Daily

IBM Continues to Enhance Watson AI Solutions for Video

NEW YORK – IBM continues to enhance the various Watson artificial intelligence (AI) solutions it offers to improve customers’ video services, according to Tim MacGregor, IBM Watson Media worldwide technical sales consultant.

By providing accelerated video enrichment and captioning technologies through advanced AI, Watson can help media and entertainment (M&E) companies, among other things, build easily searchable metadata packages for every asset and automatically and quickly offer editable closed captions, he said July 24 during a session called “Inside the Mind of Watson” at the Smart Content Summit East event, part of the M&E Day at the Microsoft Conference Center.

The 2018 M&E Day also included Content Protection Summit East and Entertainment Production in the Cloud (EPIC) conference tracks, providing M&E technology teams valuable insights into the creation, production, distribution, security and analysis of content.

One goal of the presentation was to clearly explain what Watson Media is, as well as what the IBM Cloud is, what products IBM offers and what solutions it powers, because “I think there’s a huge education gap between what’s Watson and what divisions fall underneath it,” MacGregor told the summit.

Many people first started hearing about Watson when IBM’s advanced computer system competed against humans on the TV game show “Jeopardy!” in 2011, he noted. After that, “we started to build verticals around” some of the research and development IBM did with Watson, he said, adding: “We started in healthcare. Then we did finance. And now we’re starting to build a vertical around media and entertainment. And that’s where the Watson Media team comes in.”

There are various problems the Watson Media team is trying to solve, he pointed out, saying people often look at machine learning and cognitive solutions and they say, “I want to solve the world.” That includes the “45 problems that I have with my supply chain management, with my media delivery workflow, with my” over-the-top (OTT) subscriber churn — and “they’re all different things that you could use machine learning or cognitive solutions to try to solve,” he said.

But Watson Media wanted to start with solutions that would give its M&E clients the “best bang for their buck,” so it opted to start with solutions around content search and discovery, he explained.

Content enriched by Watson is inherently more searchable, while content discovery leads to increased usage, according to IBM. Other video solutions initially powered by Watson included recommendation uplift, because enriched data provides deeper insights into what really motivates viewers; highlight clipping; closed captioning; and compliance monitoring, which helps companies avoid potential violations while eliminating the cost of manual search and identification.

Automated highlight clipping is especially useful for sports broadcasters who need to create video highlights as quickly as possible because “faster usually means larger” return on investment, for one thing, MacGregor said. Watson automatically watches, identifies and clips highlights from sports broadcasts.

With closed captioning, Watson automates real-time, accurate and easily editable captions, saving companies time and money, according to IBM.

Watson Video Enrichment applies AI to analyze audio, textual and visual data within multimedia content. As part of the Watson Video Enrichment workflow, assets are acquired, ingested, processed and enriched using the Watson Media platform. The workflow includes semantic scene chaptering, which divides the media into meaningful chapters or chunks that can be more easily managed by the people handling editing or production, according to IBM. Also made possible are speech to text, natural language understanding and visual recognition.

Stressing that IBM plans to enhance the capabilities, MacGregor said: “We have facial recognition coming soon.” There are additional application programming interfaces (APIs) that “exist in the core Watson services that we will add to this product as we continue to integrate and grow the team,” he added.

IBM has tested the various initial Watson Media capabilities with broadcast sports events including the U.S. Open, Masters Tournament and, with Fox Sports, the 2018 FIFA World Cup, he noted.

With the closed captioning solution, which IBM started using with the U.S. Open, the company reached 84-90% “accuracy rate out of the box,” he said, adding: “We’re trying to make it faster than real-time. We are trying to make it self-learning.”

Watson Video Enrichment and Watson Captioning are handled via “primarily an MPEG 4 file with AAC audio,” he said during a Q&A at the end of the presentation. He added: “One of the things that we want to enhance is we want to expand it to 5.1 audio, so we can start to do individual tracks, which would allow us to do some better analysis. That is not in the product today but will be something we would add. Profanity detection is something that is fairly simple to do today. You can create vocab lists to train it. You can also create vocab lists to basically block [profanity] out in your closed caption file. Now, is it going to block it out on your video file? No. But it’s going to be able to block [it] out in your transcript.”
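
To illustrate the kind of transcript-level blocking MacGregor described, here is a minimal Python sketch. It is not IBM’s implementation and does not use any Watson API; the function name `mask_terms`, the placeholder word list and the sample caption line are all hypothetical. It assumes the caption text is available as a plain string and simply replaces listed words with asterisks, leaving the video and audio untouched.

```python
import re


def mask_terms(caption_text, blocked_terms, mask_char="*"):
    """Replace each blocked term in caption_text with mask characters of equal length.

    Hypothetical helper for illustration only; matching is case-insensitive
    and limited to whole words.
    """
    if not blocked_terms:
        return caption_text
    # Longest terms first so longer entries win over shorter ones they contain.
    ordered = sorted(blocked_terms, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(term) for term in ordered) + r")\b",
        re.IGNORECASE,
    )
    return pattern.sub(lambda m: mask_char * len(m.group(0)), caption_text)


# Placeholder vocabulary list and caption line (both invented for this sketch).
blocked = ["darn", "heck"]
line = "Well, darn it, what the HECK was that call?"
masked = mask_terms(line, blocked)
print(masked)
```

Because the match is on whole words, a term such as “darn” would not mask “darned” unless that form were added to the list, which is one reason a real vocabulary list would likely need tuning.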

Customers often ask about specific functions not yet offered, he also said, adding: “Translation is a big one that people come with because anybody that’s doing distribution on an international level — or they’re doing broadcasting on an international level — they’re starting to get to a point where like ‘OK, I can get an automated closed caption file. Great. Now I want to put it in 15 different languages and also now I want to do a voice over.’ So, we hear it a lot. It’s going to be an iterative process to get there on the machine learning and cognitive solutions side. But I think as we start this kind of journey those are the things that we’re going to try to identify: which problem do we solve next?”

The 2018 Media & Entertainment Day was presented by Microsoft, with sponsorship from IBM Watson Media, Amazon Web Services, IBM, LiveTiles, Microsoft Azure, NAGRA, NeuLion, Ooyala, EIDR, GrayMeta, MarkLogic, Qumulo, Avid, Cloudian, SoftServe and TiVo. The event was produced by the Media & Entertainment Services Alliance (MESA), the Content Delivery & Security Association (CDSA), the Hollywood IT Society (HITS) and the Smart Content Council.
