M+E Daily

M&E Journal: Leveraging AI to Fight Streaming Piracy

The U.S. Defense Advanced Research Projects Agency (DARPA) informs us that we are currently experiencing the third wave of the evolution of artificial intelligence, which is characterized by the development of systems that can reason and learn in much the same way that humans do.

This wave is powered by several key technological advancements that create the perfect conditions needed for smarter, more contextually aware autonomous systems.

Although there is much debate as to exactly which advancements have played the greatest role, we can at least agree that we now have access to more sophisticated AI algorithms, significantly more powerful, inexpensive, and ubiquitous computing resources, and large amounts of data that can be used for training AI systems.

The third wave of AI is also having a significant impact on cybersecurity, primarily by enhancing the speed and accuracy of threat detection and response.

Cybersecurity solutions that combine rule-based and machine learning-based algorithms have the potential to greatly improve the effectiveness of anti-piracy efforts in the video streaming industry.

By automating repetitive tasks, providing new tools for identifying and tracking pirated content, and finding subtle patterns in complex data sets that indicate new forms of piracy, these technologies can help rights holders to better protect their content and ultimately benefit the entire streaming industry.

DRM LOG ANOMALY DETECTION

According to Grand View Research, the video streaming market is expected to expand at a compound annual growth rate (CAGR) of 21.3 percent through 2030. To secure content and protect valuable revenue, the industry regularly uses a variety of technologies such as digital rights management (DRM).

DRM is widely used within the pay media industry to ensure that video content is stored and transmitted in encrypted form, so that only authorized users and devices can play it back. When a user attempts to play back a protected video, the video player must first request a DRM license from a DRM server.

The interaction between the DRM agent in the video player and the DRM backend — the DRM protocol — makes it possible to collect a host of valuable data about how the platform is being used.

But DRM systems are not entirely foolproof. Pirates continuously find ways to circumvent security technologies to steal and illegally rebroadcast licensed content. This is where artificial intelligence (AI) comes in.

AI can be used to detect the difference between normal, legitimate use of a streaming platform and abnormal usage that could indicate piracy. Traditionally, detecting this type of activity would require resource-intensive and time-consuming manual analysis of the data, particularly for “big data” sets to which multiple billions of records are added each month.

Although rule-based detection methods can go some way towards automating this process, they are less effective at detecting all threats, especially threats that have not been encountered before or whose abuse pattern is very subtle.

The best solution is to complement the rule-based approach with an AI algorithm.
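As an illustration only, the complementary arrangement can be sketched in a few lines of Python. Everything named here is an assumption made for the example: the `distinct_ips` field, the rule and its limit of five IP addresses, and the `anomaly_score` callable that stands in for a trained model. A usage summary is flagged when either a rule fires or the model score exceeds a threshold.

```python
from typing import Callable, Dict, List

# A usage summary for one device over one day (hypothetical fields).
Summary = Dict[str, float]


def rule_too_many_ips(summary: Summary, max_ips: int = 5) -> bool:
    """Hypothetical rule: a device seen on more than max_ips IP addresses."""
    return summary["distinct_ips"] > max_ips


def combined_flags(
    summaries: List[Summary],
    rules: List[Callable[[Summary], bool]],
    anomaly_score: Callable[[Summary], float],
    score_threshold: float,
) -> List[dict]:
    """Flag a summary if any rule fires or the model score exceeds the threshold."""
    results = []
    for summary in summaries:
        by_rule = any(rule(summary) for rule in rules)
        by_model = anomaly_score(summary) > score_threshold
        results.append(
            {"rule": by_rule, "model": by_model, "flagged": by_rule or by_model}
        )
    return results
```

Recording which detector raised each flag keeps the report explainable: a rule hit points to a known abuse pattern, while a model-only hit marks behavior that no existing rule describes.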

The exact nature of the data may differ slightly from one pay media operator to the next.

But typically, it includes a timestamp of when the DRM interaction took place, the content that was being accessed, the data that identifies the device/player used to access the content, the IP address of the device, and often an identifier that can be traced back to the subscriber using information in other backend systems.
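A record of this kind might be modeled as follows. This is a sketch, not an operator's actual schema: the class name, field names, and sample values are all invented for illustration, and only the kinds of fields come from the description above.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class DrmLogRecord:
    """One DRM interaction; field names are illustrative assumptions."""

    timestamp: datetime  # when the DRM interaction took place
    content_id: str  # the content that was being accessed
    device_id: str  # identifies the device/player
    ip_address: str  # IP address of the device
    # Often, but not always, present; traceable to a subscriber
    # only through other backend systems.
    subscriber_ref: Optional[str] = None


# Made-up sample values (the IP is from a documentation-only range).
example = DrmLogRecord(
    timestamp=datetime(2022, 1, 15, 20, 30, tzinfo=timezone.utc),
    content_id="title-001",
    device_id="device-abc",
    ip_address="203.0.113.7",
)
```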

For an initial test of an AI solution, our team chose an auto-encoder architecture model, which makes semi-supervised learning possible even when there is not enough labeled data. The model is trained on cleansed data, with the goal that the encoder learns how to interpret the input and compress it to an internal representation.

This is done while the decoder attempts to recreate the original input from the output of the encoder.

The original input and the reconstructed input are compared through a distance measure. This allows a threshold to be set, whereby all inputs with a distance value higher than the set threshold are identified as anomalous data points.
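The thresholding step can be illustrated with a deliberately tiny Python sketch. It makes several assumptions that are not from the article: the encoder and decoder use fixed, hand-picked weights in place of a trained auto-encoder, the inputs have only two features, the distance measure is Euclidean, and the threshold of 0.5 is arbitrary.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def encode(x: Vector) -> float:
    """Toy encoder: compress a 2-D input to one number (its mean)."""
    return (x[0] + x[1]) / 2.0


def decode(z: float) -> List[float]:
    """Toy decoder: expand the internal representation back to 2-D."""
    return [z, z]


def reconstruction_distance(x: Vector) -> float:
    """Euclidean distance between the input and its reconstruction."""
    x_hat = decode(encode(x))
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, x_hat)))


def find_anomalies(inputs: List[Vector], threshold: float) -> List[bool]:
    """True where the reconstruction distance exceeds the threshold."""
    return [reconstruction_distance(x) > threshold for x in inputs]


# "Normal" points have two nearly equal features; the third does not.
inputs = [[1.0, 1.1], [2.0, 1.9], [0.5, 3.5]]
flags = find_anomalies(inputs, threshold=0.5)  # [False, False, True]
```

The first two inputs resemble the pattern the toy encoder can represent, so they are reconstructed almost exactly. The third cannot be compressed without loss, so its distance is large and it is flagged.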

The following workflow explains how the AI solution works in practice:

1. Data are pre-processed and summarized to ensure that training and inference operations are cost-effective and computationally efficient.

2. The model ingests the data that represents “normal” activity, which is identified through a set of rules. It produces a binary prediction (anomaly vs. not anomaly).

3. A report with detected anomalies is shared with the customer.

4. The customer validates the predicted anomalies and provides feedback. The model is re-trained based on the customer’s feedback (e.g., confirmed anomalies are removed from the training set) to ensure that the model’s quality is kept up to date.
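The first and last steps of this workflow lend themselves to a short Python sketch. The shape of the raw record, the per-device-per-day grouping, and the three summary features are assumptions chosen for illustration; the article does not specify how the data are summarized.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

# Simplified, hypothetical record: (day, device_id, ip_address, content_id)
Record = Tuple[str, str, str, str]
Key = Tuple[str, str]  # (day, device_id)


def summarize(records: Iterable[Record]) -> Dict[Key, Dict[str, int]]:
    """Pre-processing: collapse raw records into one summary per device per day."""
    requests: Dict[Key, int] = defaultdict(int)
    ips: Dict[Key, Set[str]] = defaultdict(set)
    titles: Dict[Key, Set[str]] = defaultdict(set)
    for day, device_id, ip_address, content_id in records:
        key = (day, device_id)
        requests[key] += 1
        ips[key].add(ip_address)
        titles[key].add(content_id)
    return {
        key: {
            "license_requests": requests[key],
            "distinct_ips": len(ips[key]),
            "distinct_titles": len(titles[key]),
        }
        for key in requests
    }


def apply_feedback(training_keys: List[Key], confirmed: Set[Key]) -> List[Key]:
    """Feedback: drop customer-confirmed anomalies before re-training."""
    return [key for key in training_keys if key not in confirmed]
```

Summarizing first means the model sees one compact row per device per day instead of every raw log line, which is what keeps training and inference affordable at billions of records per month.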

This approach allows for customer-specific AI models, where the model only learns from the usage patterns of the target customer’s platform.

This ensures that one customer’s data is not exposed to the models of other customers.

It is also important that differences in business models are correctly reflected.

The AI model can also be tailored to different DRM types, such as Widevine, PlayReady, and FairPlay, to detect DRM-specific anomalies.

The solution is currently being validated with targeted customers and has already yielded valuable results.

Interestingly, it also flagged anomalous behavior that, once investigated, proved to be a buggy DRM implementation. Combining rule-based and machine learning-based detection of anomalous behavior yields much better results than relying on rules alone.

Looking forward, the AI model must be continuously evaluated and improved as content piracy evolves.

We also plan to evolve the current periodic anomaly detector into a real-time alerting system to further mitigate piracy and fraud.

* By Werner Strydom, Head of Advanced Technology, Innovation; Rodrigo Fernandes, Product Director, OTT; and Jessica Alecci, Senior Data Scientist, Irdeto *

=============================================

This article was published in the Winter 2022 M&E Journal.