Sprocket kicks video processing into high gear
- By Patrick Marshall
- Nov 09, 2018
Thanks to parallel processing, scientists now can compute the likely path of a hurricane in the blink of an eye. But processing even short digital video recordings still takes much longer than you'd think.
The reason? According to George Porter, associate professor in the computer science and engineering department at the University of California at San Diego, the problem is the way video files are compressed.
Instead of being a series of full-frame images, Porter said, “a single frame of video is actually encoded as differences from the previous frame. That's how video compression works.”
While that technique is great for reducing file sizes, it makes it difficult to slice up a video file for parallel processing. “Most systems will simply process whole videos at once,” he said. And that takes time.
This, said Porter, is where Sprocket comes in. Compressed video files save a full-frame image, called a “keyframe,” every couple of seconds. Sprocket analyzes video files and divides them into many smaller segments, using keyframes as markers. Each segment can then be processed individually.
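The idea of cutting at keyframes can be shown with a minimal Python sketch. It is an illustration only, not Sprocket's actual code: the function name `split_at_keyframes` and the list of true/false keyframe flags are assumptions made for this example, and a real tool would read those flags from the video container.

```python
from typing import List, Tuple


def split_at_keyframes(is_keyframe: List[bool]) -> List[Tuple[int, int]]:
    """Return (start, end) frame ranges, end exclusive, each beginning at a keyframe.

    Hypothetical helper for illustration; not part of Sprocket's API.
    """
    # Every keyframe index opens a new, independently decodable segment.
    starts = [i for i, key in enumerate(is_keyframe) if key]
    if not starts:
        return []
    # Each segment runs until the next keyframe, or to the end of the video.
    # Frames before the first keyframe cannot be decoded alone, so this
    # sketch leaves them out.
    ends = starts[1:] + [len(is_keyframe)]
    return list(zip(starts, ends))


# Eight frames with keyframes at positions 0, 3 and 5.
frames = [True, False, False, True, False, True, False, False]
print(split_at_keyframes(frames))  # [(0, 3), (3, 5), (5, 8)]
```

Because every range starts on a full-frame image, none of the segments depends on frames held by another segment, which is what lets them be handled separately.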
Sprocket also relies on an important development in cloud technology -- serverless frameworks like Amazon Web Services' Lambda platform, which can run code without the provisioning or managing of virtual machines.
“For many years you’ve been able to get virtual machines on demand from cloud providers,” said Porter. “But the ability to create those virtual machines has gotten so fast and so inexpensive that now I can get a virtual machine for a few seconds or for less than a second. Amazon now offers virtual machines for just hundreds of milliseconds.”
Thanks to Lambda, Sprocket can slice a video into, say, 5,000 pieces and send each piece for processing. The result is that two hours of video can be processed in 30 seconds instead of tens of minutes, at a cost as low as $1.
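The fan-out pattern described here can be approximated on a single machine. The sketch below is a stand-in under stated assumptions: it uses a local thread pool from Python's standard library where Sprocket would use Lambda invocations, and `process_in_parallel` and `fake_filter` are hypothetical names invented for the example.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def process_in_parallel(
    segments: List[bytes],
    worker: Callable[[bytes], bytes],
    max_workers: int = 8,
) -> List[bytes]:
    """Apply worker to every segment concurrently and keep the input order.

    A local thread pool stands in for serverless workers in this sketch.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map returns results in the order of the inputs,
        # so the pieces can be joined back together directly.
        return list(pool.map(worker, segments))


def fake_filter(segment: bytes) -> bytes:
    """Placeholder for real per-segment work such as a color filter."""
    return segment.upper()


pieces = [b"seg-a", b"seg-b", b"seg-c"]
reassembled = b"".join(process_in_parallel(pieces, fake_filter))
print(reassembled)  # b'SEG-ASEG-BSEG-C'
```

For scale, two hours of video is 7,200 seconds, so 5,000 equal pieces would each cover about 1.44 seconds of footage, roughly in line with a keyframe every couple of seconds.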
“It fundamentally changes the way you think of the cloud,” said Porter. “I can now have an application that is able to take a task that may have taken one or two hours on the laptop, and I can send it to the cloud and operate in this massive amount of parallelism for just a very short period of time and only pay for the time I use.”
Sprocket, a framework written in Python, has been released as open-source code. It allows developers to specify video inputs, apply filters and perform other operations, including interfacing with the computer vision algorithms offered by cloud service providers.
The UCSD team continues to develop Sprocket. According to Porter, one of the current limitations is that while Sprocket can slice up video files for parallel processing on Lambda, those pieces don’t communicate well with each other. As a result, once a video is processed for one task, say applying a color filter, it must be reassembled before it can be processed for another task such as image recognition.
“One of the things that we have been looking at is ways of improving the ability of all of those different, short-lived virtual machines to work together,” he said.
Patrick Marshall is a freelance technology writer for GCN.