
Want to avoid software snafus? Here's a good place to start.

The National Institute of Standards and Technology has dramatically expanded its public dataset of software flaws to help developers and analyzers avoid weaknesses in their programs.

The Software Assurance Metrics and Tool Evaluation (SAMATE) Reference Dataset contains examples of errors in a number of popular programming languages that could leave software vulnerable to exploits by hackers and criminals.

Version 4.0 of SAMATE contains 175 broad categories of weaknesses with more than 60,000 specific cases. This release has more than doubled the number of categories and added 30 times the number of examples from the previous release.



“This is an enormous step toward bringing methodical science to the hard question of bugs in software,” said Paul E. Black, NIST computer scientist and SAMATE project leader. The dataset is used to build static analyzers that comb software for problems.

SAMATE, which began in 2004, is an umbrella project to improve software assurance by excluding known problems. The catalyst for the program was a Homeland Security Department project on software assurance tools, Black said.

“They wanted to understand what tools were available, measure their effectiveness and identify gaps,” he said. The tools analyze software, scanning it for known flaws and weaknesses. “We asked ourselves, does this tool catch all possible errors? We realized that to answer that we needed a list of all possible errors.”
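The question Black describes, whether a tool catches the known errors, can be pictured as a scoring exercise against a labeled set of test cases. The following is a minimal sketch only, not NIST's method or the dataset's actual format: the case IDs, category names, and the function `recall_by_category` are all hypothetical, and it assumes each test case carries exactly one weakness category.

```python
from collections import defaultdict


def recall_by_category(test_cases, reported):
    """Return the fraction of flawed test cases a tool flagged, per category.

    test_cases: dict mapping a case ID to its weakness category (hypothetical format).
    reported:   set of case IDs that the tool under evaluation flagged.
    """
    totals = defaultdict(int)
    caught = defaultdict(int)
    for case_id, category in test_cases.items():
        totals[category] += 1
        if case_id in reported:
            caught[category] += 1
    return {category: caught[category] / totals[category] for category in totals}


# Hypothetical reference cases, each known to contain one flaw.
test_cases = {
    "case-001": "buffer overflow",
    "case-002": "buffer overflow",
    "case-003": "null pointer dereference",
    "case-004": "integer overflow",
}

# Hypothetical output of a static analyzer run over those cases.
reported = {"case-001", "case-003"}

scores = recall_by_category(test_cases, reported)

# Categories where the tool found nothing are the "gaps" in its coverage.
gaps = sorted(category for category, score in scores.items() if score == 0.0)

print(scores)
print(gaps)
```

A score like this only makes sense if the list of known flaws is reasonably complete, which is the reason Black gives for building the dataset in the first place.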

NIST worked with DHS to establish a long-term program for creating such a list. The effort complements other programs, such as the Common Weakness Enumeration and the Common Vulnerabilities and Exposures databases maintained by Mitre Corp.

SAMATE contains specific examples of coding flaws in software written in Java, C and C++. Each case is about a page of computer code showing a problematic way of composing functions, loops or logic operations.

The current dataset is limited in the languages it includes and still does not cover all types of weaknesses. The Common Weakness Enumeration contains close to 500 types of weaknesses, Black said. “We’ve expanded enormously, but we could probably double our set again,” he said.

Industry is using SAMATE, Black said. Before the latest release, there had been 10,000 downloads of the dataset over a 10-month period.

About the Author

William Jackson is a senior writer of GCN and the author of the CyberEye column.
