big data

What to do when data's too big to just transfer to the cloud

As government agencies consider moving their enterprise data to the cloud, their first question might be: How does it get there? In most cases, data can be transmitted via FTP or HTTP, but for some applications, such as life sciences, sensor networks and video surveillance, the data is just too big to fit through the pipe. What’s the best option?

Pack it up and ship it out.

Some major cloud vendors now offer a service whereby clients can ship physical media to the data center, where it can be uploaded, eliminating overly long data transfer times. Bulk imports are especially useful when data is first ported to the cloud or for backup and offsite storage. The fees for this service vary, and some cloud providers will also download data from the cloud and ship it via physical media. 
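To make the trade-off concrete, here is a rough sketch in Python of the point at which shipping media starts to beat the network. The 100 Mbps link, the three-day shipping window and the function name are illustrative assumptions, not figures from any vendor, and the calculation ignores protocol overhead and the time the provider needs to load the drive.

```python
def break_even_terabytes(link_mbps: float, shipping_days: float) -> float:
    """Return the dataset size (in decimal terabytes) that a link can move
    in the time a shipment would take. Larger datasets favor shipping."""
    seconds = shipping_days * 24 * 60 * 60
    bits = link_mbps * 1_000_000 * seconds
    return bits / 8 / 1e12


if __name__ == "__main__":
    # Hypothetical inputs: a 100 Mbps link fully dedicated to the transfer
    # and 3 days door to door for the shipment.
    size = break_even_terabytes(link_mbps=100, shipping_days=3)
    print(f"Break-even dataset size: {size:.2f} TB")
```

Under those assumptions, anything much larger than about 3 TB would arrive sooner on a shipped drive than over the wire.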

AWS Import/Export accelerates the transfer of large amounts of data between the AWS cloud and portable storage devices that clients ship to Amazon. To move data from physical media to Amazon S3, Amazon EBS or Amazon Glacier, it uses the company’s multimodal content delivery network, which can transmit terabytes of data faster than a T-3 leased line. Amazon charges $80 for each device handled; other costs depend on which Amazon storage service is used and on the time it takes Amazon to upload the data or decrypt the device. For more information, see the AWS Import/Export documentation.

Google Cloud Storage Offline Disk Import is an experimental feature that is currently available in the United States only. The service lets clients load data into Google Cloud Storage by sending Google physical hard drives, whose contents it loads into an empty Cloud Storage bucket. Google requires that the data be encrypted. Because the data is loaded directly into Google's network, this approach might be faster or less expensive than transferring data over the Internet. According to Google, import pricing is a flat fee of $80 per HDD, irrespective of the drive capacity or data size. After that, standard Google Cloud Storage fees apply for requests, bandwidth and storage related to the import and subsequent usage of the data.

HP Bulk Import Service is still in private beta, but it allows users to load their data into HP Cloud Block Storage or HP Cloud Object Storage. The new service, which is expected to be released in fall 2013, will let users send hard drives directly to HP’s data centers, where data can be rapidly uploaded and transferred.

Rackspace’s Bulk Import to Cloud Files is a service that lets clients send physical media to Rackspace’s data centers, where “migration specialists” connect the device to a workstation that has a direct link to Rackspace’s Cloud Files infrastructure. Rackspace will not decrypt data, though the company plans to offer that option in the future. Rackspace charges $90 per drive for bulk imports.

For cases where the data is consistently too large to transmit and access demands won’t allow the latency inherent in shipping media, Aspera offers its Fast Adaptive Secure Protocol (FASP) data transfer technology. According to the company’s website, FASP eliminates the shortcomings of TCP-based file transfer technologies such as FTP and HTTP. On a gigabit WAN, FASP can achieve transfers of 700 to 800 megabits/sec with high-end PCs and 400 to 500 megabits/sec with commodity PCs, the company said.
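For a sense of what those rates mean in practice, the short Python sketch below converts them into transfer times. The FASP figures are the ranges Aspera quotes; the 10 TB dataset and the helper name are assumptions chosen for illustration, and the T-3 entry uses that line's nominal 44.736 megabits/sec rate for comparison.

```python
def transfer_hours(size_tb: float, rate_mbps: float) -> float:
    """Hours needed to move size_tb decimal terabytes at a sustained rate."""
    bits = size_tb * 1e12 * 8
    return bits / (rate_mbps * 1e6) / 3600


# Sustained rates in megabits per second.
RATES_MBPS = {
    "T-3 leased line (nominal)": 44.736,
    "FASP, commodity PC (low end)": 400,
    "FASP, commodity PC (high end)": 500,
    "FASP, high-end PC (low end)": 700,
    "FASP, high-end PC (high end)": 800,
}

if __name__ == "__main__":
    dataset_tb = 10  # hypothetical dataset size
    for label, rate in RATES_MBPS.items():
        hours = transfer_hours(dataset_tb, rate)
        print(f"{label}: {hours:.1f} hours")
```

At 800 megabits/sec the 10 TB example takes roughly 28 hours; over a T-3 line, the same transfer would run for about three weeks.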

Aspera said its software is in use and accredited for SIPRnet, JWICS and FIPS 140-2, and it has been vetted by the intelligence community for large data transfers over military networks. It is also used in the 1000 Genomes Project that exchanges data between the National Center for Biotechnology Information and the European Bioinformatics Institute.

About the Author

Susan Miller is executive editor at GCN.

Over a career spent in tech media, Miller has worked in editorial, print production and online, starting on the copy desk at IDG’s ComputerWorld, moving to print production for Federal Computer Week and later helping launch websites and email newsletter delivery for FCW. After a turn at Virginia’s Center for Innovative Technology, where she worked to promote technology-based economic development, she rejoined what was to become 1105 Media in 2004, eventually managing content and production for all the company's government-focused websites. Miller shifted back to editorial in 2012, when she began working with GCN.

Miller has a BA and MA from West Chester University and did Ph.D. work in English at the University of Delaware.

