Search goes to the video

SnapStream scours closed captions to help organizations find what they need

DIGITAL VIDEO is easy to record, manipulate, save and distribute. But one of the toughest remaining challenges to using it is finding what you want.

'Historically, TV has been difficult to search,' said Rakesh Agrawal, chief executive officer of SnapStream Media, which has come up with a networked video search appliance to address the challenge.

The appliance, SnapStream Enterprise, records video and searches the closed captioning and metadata embedded in most commercial programming to locate references and automatically alert users.

'Now you don't need constant monitoring' of TV systems, Agrawal said. 'You can automate the process.'

Because it relies on closed captioning, the system works with commercial TV rather than proprietary video security systems. It searches what is being said in the program and the captioned descriptions of activity, not the video itself. The capability is in demand among public affairs offices, law enforcement organizations and anyone else who wants to keep up with what is being said about a particular subject.

The Senate Republican Conference uses SnapStream Enterprise to keep track of TV coverage of the Senate and political affairs on commercial and public service networks in addition to the Senate's in-house video system for covering hearings.

'We saw it at FOSE' in 2007, said Nathaniel Green, the Senate Republican Conference's systems administrator. 'We had been looking for something that could record cable TV.'

The conference is a communications office that serves the party. It uses SnapStream primarily for recording audio and video, Green said. 'We use search some, if someone needs to find something in the middle of the clip,' he said.

Although digital video recording has been around for years, the SnapStream approach is not simple. 'It's pretty complicated,' Agrawal said. 'It's all networked. What makes it hard is managing the task of recording 10 shows at once on a single platform.'

The appliance is a rackmount server that can be accessed by PCs running the SnapStream client. 'You can have multiple people running the SnapStream client on their PCs and have a full DVR experience,' he said.

The client supports Active Directory for logging in, and each user can run searches, make and edit their own clips, and save them to disk or e-mail them. The system can record as many as 10 channels at once and can store as much as 10,000 hours of recording, a little more than 400 days of around-the-clock broadcasts.

Searchable data file

SnapStream took the easiest route to making the recordings searchable, first focusing on embedded text.

'Closed captioning effectively is mandated by [the Federal Communications Commission] for 95 percent of shows,' Agrawal said. Broadcasters that provide captions embed the data in the video signal where it originates.

SnapStream separates the data from the audio and video when it is recorded, adds a time code and puts it into a separate searchable data file.
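As a rough illustration of the idea, and not SnapStream's actual implementation, the sketch below stores each caption line with a time code and builds a simple word index over it. The `CaptionLine` type, the function names and the sample captions are all hypothetical.

```python
from dataclasses import dataclass

PUNCTUATION = ".,!?;:'\""


@dataclass
class CaptionLine:
    channel: str    # hypothetical channel label
    seconds: float  # time code: offset from the start of the recording
    text: str       # caption text separated from the audio and video


def build_index(lines):
    """Map each lowercased word to the caption lines that contain it."""
    index = {}
    for line in lines:
        words = {w.strip(PUNCTUATION).lower() for w in line.text.split()}
        for word in words:
            if word:
                index.setdefault(word, []).append(line)
    return index


def search(index, term):
    """Return the time-coded caption lines containing the exact word."""
    return index.get(term.lower(), [])


# Made-up sample data, for illustration only.
captions = [
    CaptionLine("C-SPAN2", 12.0, "The Senate will come to order."),
    CaptionLine("C-SPAN2", 95.5, "The senator from Texas is recognized."),
    CaptionLine("Local 11", 30.0, "Police responded to the scene."),
]
index = build_index(captions)
hits = search(index, "senate")
```

Because every hit carries a time code, a client could jump straight to that point in the recording instead of scanning the whole program.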

Although it is a text search, it is a little more complicated than doing a search in a Word document. The captioning is prone to error, so the software does spelling correction before searching. The search also uses linguistic techniques, such as recognizing abbreviations and acronyms (HPD for Houston Police Department, for example, or DOD for the Defense Department) without a specific request.
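The article does not say how SnapStream implements these steps. As a minimal sketch of the general idea, the code below expands a query into its acronym and long-form variants and tolerates small caption misspellings. Note that it substitutes approximate word matching (Python's standard `difflib`) for true spelling correction; the acronym table, the function names and the 0.8 similarity cutoff are assumptions made for illustration.

```python
import difflib

PUNCTUATION = ".,!?;:'\""

# Hypothetical acronym table, using the two examples from the article.
ACRONYMS = {
    "hpd": "houston police department",
    "dod": "defense department",
}


def expand_query(query):
    """Return the query plus its acronym or long-form counterpart."""
    q = query.lower().strip()
    variants = {q}
    for short, long_form in ACRONYMS.items():
        if q == short:
            variants.add(long_form)
        elif q == long_form:
            variants.add(short)
    return variants


def normalize(text):
    """Lowercase the text and strip punctuation from each word."""
    words = (w.strip(PUNCTUATION).lower() for w in text.split())
    return [w for w in words if w]


def fuzzy_contains(caption, phrase, cutoff=0.8):
    """True if the phrase appears in the caption, allowing small typos."""
    words = normalize(caption)
    target = phrase.split()
    n = len(target)
    for i in range(len(words) - n + 1):
        window = words[i:i + n]
        if all(
            difflib.SequenceMatcher(None, a, b).ratio() >= cutoff
            for a, b in zip(window, target)
        ):
            return True
    return False


def matches(caption, query):
    """Check a caption against every variant of the query."""
    return any(fuzzy_contains(caption, v) for v in expand_query(query))
```

With this sketch, a query for "HPD" would match a caption that reads "Houston Polce Department officers arrived," even though the caption contains a typo and never uses the acronym.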

'This is a space that has seen a lot of research, and there is a lot of existing work to draw on,' Agrawal said. 'But a lot remains to be done. We've only scratched the surface on the video search problem.'

Developers still need to improve optical character recognition to enable direct search of the video portion and phonetics to enable search of the audio.
