Pulse


Vs of big data

Big data: How to know it when you see it

Information from sensors, video, satellites, genetic codes, social media and web tracking is contributing to a big data soup that web-connected devices are serving up. The vast amount of data now available, not to mention the storage and analytics technology that makes analysis of that data possible, is making “big data” look like the answer to every question.    

But what is big data really?

According to the National Institute of Standards and Technology, big data consists of highly extensive datasets that require a scalable architecture for efficient storage, manipulation, and analysis. Commonly known as the ‘Vs’ of big data, the characteristics of data that require these new architectures include:

Volume – the size of the dataset at rest, referring to both the data object size and number of data objects. Although big data doesn’t specify a particular data quantity, the term is often used in discussing petabytes and exabytes of data.

Velocity – the data in motion, or rate of flow, referring to both the acquisition rate and the update rate of data from real-time sensors, streaming video or financial systems.

Variety – data at rest from multiple repositories, domains or types, from unstructured text or images to highly structured databases.

Variability – changes in the data’s rate of flow, such as applications that generate surges in the amount of data arriving in a given period of time.

Veracity – the completeness and accuracy of the data sources, the provenance of the data, its integrity and its governance.
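To make the first two characteristics concrete, here is a minimal, purely illustrative Python sketch that measures the volume (object count and size at rest) and velocity (arrival rate) of an incoming record stream; the function and field names are assumptions for illustration, not part of the NIST definition.

```python
# Illustrative sketch: measuring the "volume" and "velocity" of a record stream.
import sys
import time


def measure_stream(records):
    """Report dataset size at rest (volume) and arrival rate (velocity)."""
    start = time.monotonic()
    count = 0
    total_bytes = 0
    for rec in records:                 # rec could be a sensor reading, log line, etc.
        count += 1
        total_bytes += sys.getsizeof(rec)
    elapsed = max(time.monotonic() - start, 1e-9)
    return {
        "volume_objects": count,                       # number of data objects
        "volume_bytes": total_bytes,                   # data object size at rest
        "velocity_records_per_sec": count / elapsed,   # rate of flow
    }


if __name__ == "__main__":
    sample = (f"sensor-reading-{i}" for i in range(100_000))
    print(measure_stream(sample))
```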

Posted on Feb 06, 2015 at 12:43 PM


Food safety agency plans online services to streamline press, public notice distribution

The Food Safety and Inspection Service wants to partner with private-sector firms to disseminate English- and Spanish-language versions of its publications using a variety of web- and software-based publishing systems.

The agency’s Office of Public Affairs and Consumer Education (OPACE) noted in a request for information that while it is content with its current media software, it is interested in a commercial off-the-shelf dissemination tool for print and online publications.

Each year, OPACE distributes press releases to more than 100,000 media sources and contacts in metro areas across the country. OPACE has also significantly expanded its use of social media and other web-based channels.

To support its growing online publishing requirements, the food agency is looking for a web-based media relations software system that is easily accessible through a secure desktop interface. The system should accommodate multiple users who can distribute, track, evaluate and report messaging in an efficient, 508-compliant manner.

Services and capabilities OPACE is seeking include press release distribution, a media sources and contacts database, database management, tracking of media calls, news monitoring, evaluation of impressions, training and support. 

OPACE also requires effective methods of analyzing and evaluating system performance. 

Eventually, OPACE envisions a system or a combination of systems to provide user-friendly publication distribution and analysis services.  Interested parties have until Feb. 18 to respond.

Posted on Feb 05, 2015 at 10:11 AM


Organizations suffer from common data quality errors

U.S. organizations say a third of their data is bad

Agencies are relying on data aggregation and analytics to enhance citizen services and understand social, scientific and financial trends.  Given the meteoric rise in the uses of data aggregation, as well as a growing reliance on its methods, data accuracy is paramount.

Many organizations struggle with data inaccuracy, despite having an established data quality strategy. In a startling increase from last year, the 1,200 respondents to a global study believe 26 percent of their data is inaccurate, while U.S. respondents believe 32 percent of theirs is inaccurate.

The Experian Data Quality study noted three common data quality errors: incomplete or missing data, outdated information and inaccurate data. Most organizations cited duplicate data as a contributor to overall inaccuracies, while human error is believed to be the biggest factor behind the data spoilage. Lack of automation – and a consequent dependence on manual data input – has also contributed to the problem, the study suggested.

One way to address these concerns is to adopt data audit software, Experian suggested, noting that only 24 percent of the study’s respondents use such software. Organizations that do not deploy proactive software to detect errors not only waste resources and damage productivity, but they may also be unable to derive accurate insights from their data.

Besides auditing technology, organizations can use data profiling or matching and linking technology to detect errors.
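As a rough illustration of what such profiling looks like in practice, the following Python sketch flags incomplete, outdated and duplicate records in a small contact list. The field names and the two-year "outdated" threshold are assumptions for the example, not anything prescribed by the Experian study.

```python
# Hypothetical data-profiling sketch: count duplicate, incomplete and stale records.
from datetime import date, timedelta

contacts = [
    {"email": "ann@example.gov", "phone": "555-0100", "updated": date(2014, 6, 1)},
    {"email": "ann@example.gov", "phone": "555-0100", "updated": date(2014, 6, 1)},   # duplicate
    {"email": "",                "phone": "555-0111", "updated": date(2012, 1, 15)},  # missing + stale
]


def profile(records, today=date(2015, 2, 3), stale_after=timedelta(days=730)):
    seen = set()
    duplicates = incomplete = outdated = 0
    for rec in records:
        key = (rec["email"], rec["phone"])
        if key in seen:
            duplicates += 1           # simple exact-match duplicate detection
        seen.add(key)
        if not all(rec.values()):
            incomplete += 1           # any empty field counts as incomplete
        if today - rec["updated"] > stale_after:
            outdated += 1             # record not refreshed within the threshold
    return {"duplicates": duplicates, "incomplete": incomplete, "outdated": outdated}


print(profile(contacts))
```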

In order to make improvements, 89 percent of U.S. organizations will seek to invest in some type of data management solution, Experian said, warning that without a coherent data management strategy, these types of errors will continue to increase. 

Posted on Feb 03, 2015 at 2:05 PM


USGS releases open-source groundwater toolkit

Because the nation relies on groundwater for its drinking water, agriculture and industry, a robust monitoring network is needed to track water quality as well as reports of contaminated wells and fluctuating water levels.

The U.S. Geological Survey recently introduced a new open-source Groundwater Toolbox that estimates groundwater flow, including base flow (the groundwater-discharge component of streamflow), surface runoff and groundwater recharge from streamflow data, according to USGS.

The geographic information system-based toolbox brings together various analytical methods used by USGS and the Bureau of Reclamation and can automatically pull data from more than 26,000 streamgage sites in the National Water Information System.

The GW Toolbox can be run on any Microsoft Windows-compatible platform. A customizable interface includes data analysis programs and methods for estimating groundwater recharge. Users can also glean information about water availability and hydrologic trends related to changes in the environment.
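For readers who want to experiment with the underlying data themselves, here is a hedged Python sketch that pulls daily streamflow from the public NWIS daily-values web service and estimates base flow with a crude rolling-minimum filter. This is not the GW Toolbox's own method; the example site number, the parameter code (00060, discharge in cubic feet per second) and the JSON parsing reflect assumptions about the service's WaterML-JSON layout and may need adjusting.

```python
# Hedged sketch: fetch daily discharge from NWIS and roughly separate base flow.
import requests

URL = "https://waterservices.usgs.gov/nwis/dv/"
params = {
    "format": "json",
    "sites": "01646500",       # Potomac River near Washington, D.C. (example gage)
    "parameterCd": "00060",    # discharge, cubic feet per second
    "startDT": "2014-10-01",
    "endDT": "2014-12-31",
}

resp = requests.get(URL, params=params, timeout=30)
resp.raise_for_status()
# Assumed WaterML-JSON layout: value -> timeSeries[0] -> values[0] -> value (list of points)
series = resp.json()["value"]["timeSeries"][0]["values"][0]["value"]
flows = [float(pt["value"]) for pt in series]

# Crude base-flow estimate: minimum flow in a sliding 5-day window, a stand-in
# for the hydrograph-separation methods the toolbox actually implements.
window = 5
baseflow = [min(flows[max(0, i - window + 1): i + 1]) for i in range(len(flows))]
print(f"mean streamflow: {sum(flows) / len(flows):.1f} cfs, "
      f"mean estimated base flow: {sum(baseflow) / len(baseflow):.1f} cfs")
```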

The toolbox is free and available to the public. Engineers, members of the academic community and government agencies can use the open-source GW Toolbox for independent assessments and research.

Posted on Feb 02, 2015 at 9:11 AM


NIST issues final guidance for mobile app security

Today’s mobile-enabled workers have access to a variety of apps that are designed to improve productivity, but an employee who downloads an unsafe app may unwittingly expose an organization’s computer network to security and privacy risks.

The National Institute of Standards and Technology’s Vetting the Security of Mobile Applications (SP 800-163) aims to help organizations assess the security and privacy risks associated with mobile apps, whether developed in-house or downloaded from mobile app marketplaces.

It is the final version of the Technical Considerations for Vetting 3rd Party Mobile Applications guide, which was published for comment in August 2014.

The guide offers plans for implementing the vetting process, lays out considerations for developing app security requirements, and describes the types of app vulnerabilities and the testing methods used to detect them. The document also provides guidance on determining whether an app is acceptable for an organization to use.

The publication is a guide for developers seeking to understand the types of vulnerabilities that can be introduced during an app’s software development cycle.

Posted on Jan 27, 2015 at 1:02 PM