AWS Exec: Machine Learning’s Undergoing a ‘Renaissance’
NEW YORK — Machine learning is undergoing a “renaissance” now thanks to the increasing shift of data storage to the cloud, according to Matt Wood, Amazon Web Services (AWS) GM-artificial intelligence (AI).
That’s because “the cloud has enabled machine learning and customers to overcome the single largest point of friction, which is almost always around scale,” he told the AWS Summit on Aug. 14 during a keynote. AWS also used the keynote to introduce Amazon Macie, a new machine-learning-based security service, and to announce new cloud service customer wins, including Hulu.
“When you’re working with machine learning and training machine learning models, you need tons and tons of data — the more the merrier,” Wood said. The concept is simple. “The more data you put in, the more likely it is that your model is going to be accurate,” he said.
When you have all that data, you then “need to be able to train it at scale – typically using high-end” graphics processing units (GPUs), he said. “Once you train those models, you need to be able to perform predictions against them, also at scale, both in the cloud” and “at the edge through connected devices or on mobile apps,” he said.
AWS has been “addressing these challenges for customers for over a decade,” he went on to say, noting that its customers have been “aggressively migrating everything out of their data centers up to AWS as quickly as they can,” and nearly all the new data “has been generated in the cloud by default.”
Earlier in the keynote, Wood pointed out that “it’s never been cheaper, easier or more cost-effective for customers to be able to pull data from their program applications, their web applications, their IoT applications – even their data centers – and load it up onto AWS.”
Once that data is in the cloud, “customers typically want to be able to get some value out of that data,” he said, explaining: “They want to be able to analyze and they want to be able to compute against it. They want to be able to ask questions and get answers back in a reasonable time.”
Previously, “inside the constrained walls of the data center” on premises, that was “extremely challenging” because companies were “stuck with a fixed set of resources unless” they wanted to make large capital investments, he said. So, customers typically “ended up being crammed inside that same box,” he said.
However, he said: “In the cloud, those data center walls – they just disappear. And so, customers can start to collect the data that they need, aggregate it at the right level and ask the questions which are truly important to their data.”
Over the past 10 years, AWS has “built out probably the largest and broadest set of … capabilities available anywhere” that address customer demand, including Amazon Athena, Wood said, pointing to the pay-as-you-go, serverless interactive query service that his company introduced last year.
Athena makes it easy to analyze data directly in Amazon Simple Storage Service (Amazon S3) using standard Structured Query Language (SQL), according to AWS. With a few clicks in the AWS Management Console, customers can point Athena at their data stored in Amazon S3 and start using standard SQL to run queries and get results in just seconds.
With Athena, there are no clusters to manage or tune and no infrastructure to set up, and customers pay only for the queries they run. Athena “scales automatically – executing queries in parallel – so results are fast, even with large datasets and complex queries,” Amazon said when it announced the service last year.
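For readers who want to see what “pointing Athena at data in S3” can look like from code rather than the console, here is a minimal Python sketch, not an official AWS example. It assumes a client object shaped like the boto3 Athena client (the `start_query_execution`, `get_query_execution` and `get_query_results` calls); the function name `run_athena_query` and every database, table and bucket name used with it are hypothetical.

```python
import time

# States in which Athena has stopped working on a query.
TERMINAL_STATES = {"SUCCEEDED", "FAILED", "CANCELLED"}


def run_athena_query(client, sql, database, output_location, poll_seconds=1.0):
    """Submit a SQL query, wait for it to finish, and return the result rows.

    `client` is assumed to behave like boto3.client("athena").
    `output_location` is an S3 URI where Athena writes its result files.
    """
    started = client.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    query_id = started["QueryExecutionId"]

    # Queries run asynchronously, so poll until a terminal state is reached.
    while True:
        execution = client.get_query_execution(QueryExecutionId=query_id)
        state = execution["QueryExecution"]["Status"]["State"]
        if state in TERMINAL_STATES:
            break
        time.sleep(poll_seconds)

    if state != "SUCCEEDED":
        raise RuntimeError(f"Athena query {query_id} ended in state {state}")

    # Only the first page of results is fetched here; for SELECT queries
    # the first row usually holds the column names.
    results = client.get_query_results(QueryExecutionId=query_id)
    rows = results["ResultSet"]["Rows"]
    return [[col.get("VarCharValue") for col in row["Data"]] for row in rows]
```

With boto3 installed and AWS credentials configured, the function could be called as `run_athena_query(boto3.client("athena"), "SELECT COUNT(*) FROM logs", "my_database", "s3://my-bucket/athena-results/")`, where the table, database and bucket are placeholders. Passing the client in as an argument keeps the sketch free of any AWS dependency, so it can be tried against a stand-in object first.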
Regarding Amazon Macie, Wood told the summit that Netflix was among the AWS customers already using the new machine-learning-based security service.
Macie combines concepts developed by Harvest.ai, a San Diego-based startup Amazon bought late last year, with AWS’s existing machine learning technology. Harvest.ai’s MACIE Analytics provides “the confidence and dependability in protecting your business’s most important assets, protecting user accounts from compromise and preventing theft of data and intellectual property” (IP), according to the Harvest.ai website.
Its features include a patent-pending, AI-based ability to track intellectual property across an organization’s network, including emails and other content derived from IP, according to Harvest.ai.