Smart Screen Exclusive

Logged In: Live Sports Metadata Still Needs the Human Touch

NEW YORK — Creating usable metadata from live sports broadcasts is a labor-intensive operation. Whether done in real time or after the fact, the process still involves teams of humans watching video and entering information into metadata fields manually.

Computer vision and other forms of artificial intelligence (AI) can help automate parts of the process. But according to speakers at the Sports Video Group’s Asset Management & Storage Forum here Wednesday, we’re a long way from no longer needing the human touch.

CBS Sports Network CTO Walter Raps described the network’s years-long efforts to automate metadata collection and the many pitfalls it encountered. “We tried it using voice recognition [from the soundtrack] but it led to disasters,” he recalled. “We had a case with a coach who had a very deep Southern accent who was talking about ‘rating’ the players as they came out of the showers but it didn’t come out as ‘rate.’”

CBS then tried to leverage the closed captioning feed to capture what was being said but found the accuracy too low. “We still haven’t found something better than a human logger doing the tagging,” Raps said.

Technology can help, though. Some important metadata is objective, such as the position of the camera being used to capture the video, the timing of cuts to another camera and other production details. “One of the things we’re looking at is how we can embed recognition of some of those objective aspects into the equipment so you capture it at ingest,” said Venu Vasudevan, ARRIS senior director of multiscreen media and systems. “Then we become a real force multiplier for a human logger.”

Computer vision can also help capture objective metadata elements. NerVve Technologies co-founder and CEO Thomas Slowe described his company’s origins in the U.S. intelligence community. “We started the company to figure out ways to scan through huge volumes of video to identify one particular object, like a particular car or truck, because that’s what the intelligence agencies needed,” he said. “It turns out that’s a very useful ability in sports media as well.”

The NerVve system can scan through video footage to determine whether, when and for how long the Nike logo is visible on the screen, for instance, or whether a player that a brand has a relationship with is visible — both vital issues to sports marketers. “The Nike logo doesn’t change,” Slowe said.

Some types of sports metadata may always require humans, however, particularly if it’s relevant in real time. “We have loggers who look at whether a player is limping coming off the field,” CBS’ Raps said. “Could we automate that? I don’t know. Could we do it after the game? Maybe. Would we still care about it after the game? I don’t know.”

The key, according to Dextro co-founder David Luan, is to find ways to integrate human and machine intelligence. “One of the most interesting areas in computer vision right now is tools that can work together with humans,” he said. “What you want to do is find ways to combine the objective parts of the content that can be fairly easily solved for by computers, with the more nuanced aspects where humans are still better than computers.”

Any metadata system for sports content also needs to be open-ended and extensible, Levels Beyond COO Christy King said, because the tools for data capture will only improve over time and the needs of metadata users will evolve. “One of the things I tell people about asset management that always terrifies them is that you’re never done,” King said. “You will be layering data onto video forever.”