NASA discovers an SOA for Earth data
- By Joab Jackson
- Feb 01, 2006
Agency's early foray into Web services technology nets a fully functional service-oriented architecture
GROUND CONTROL: 'We found that members of the Earth science community wanted to have more control over what goes into and comes out of their own research,' says NASA's Robin Pfister.
Service-oriented architecture wasn't on the minds of NASA personnel in 2000 when they started updating a directory of Earth science data. In fact, most hadn't even heard the abstract term. They only wanted to do a better job of addressing a widespread audience with diverse hardware requirements.
'With the emergence of the Web and new collaborative technologies, we found that members of the Earth science community wanted to have more control over what goes into and comes out of their own research,' said Robin Pfister, development manager for the NASA Earth Observing System Clearing House, also known as ECHO (www.echo.eos.nasa.gov).
Planned or not, the drive to unify resources brought NASA to the cutting edge of SOA. Originally an index of Web-based material, ECHO was expanded to offer a machine-readable directory of services that can manipulate data as well. The advanced services went live last September.
'We're letting different organizations make their wares available. It is sort of like a marketplace for science,' said Michael Burnett, senior software architect for Blueprint Technologies Inc. of Fairfax, Va., which did architectural design work for ECHO.
To carry out the project, the ECHO team set up a Universal Description, Discovery and Integration-based registry. UDDI is an Extensible Markup Language-based data structure that gives organizations a uniform way to describe and list Web services. It's also a cornerstone of SOA.
Beyond the buzz
A buzzword in CIO offices, SOA is a way of describing components in an IT infrastructure so they can exchange data, creating an environment where one application can execute a function and then pass the results to another.
SOA was barely a glint in some theoretician's eyes when NASA started prototyping ECHO. At the time, Pfister simply wanted to round up data resources from the Earth science community, both inside NASA and in academia. NASA once kept much of its data in central locations, but with the advent of the Web, researchers started hosting their own material. ECHO itself therefore wouldn't hold data; it would contain pointers to other locations.
ECHO has collected links to about 1,950 large data sets. Contributors submit new data sets by filling out a form on the Web site and using XML to format the metadata.
With data gathering under way, the ECHO team took the next step of providing a directory of services that could help researchers crunch the data they found. A service is simply a computer operation that can be invoked remotely. A subsetter is one popular service. It carves out a portion of useful data from a larger set, based on the user's preferences. It's particularly handy for working with data sets that are too big for a desktop computer.
But a subsetter could be located anywhere. All a remote user would need to know is where that service is and how to format the data. ECHO should provide a directory of such services, Pfister reasoned.
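To make the idea concrete, here is a minimal sketch of what a subsetter does, written in Python. It is an illustration only, not ECHO's code: the Observation record, the subset function and the sample values are all made up for this example, and a real service would run remotely against far larger archives.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class Observation:
    """One hypothetical data point: a location, a day and a measured value."""
    lat: float
    lon: float
    day: date
    value: float


def subset(
    observations: Iterable[Observation],
    lat_range: Tuple[float, float],
    lon_range: Tuple[float, float],
    start: date,
    end: date,
) -> List[Observation]:
    """Keep only the observations inside the requested box and date range."""
    lat_min, lat_max = lat_range
    lon_min, lon_max = lon_range
    return [
        o for o in observations
        if lat_min <= o.lat <= lat_max
        and lon_min <= o.lon <= lon_max
        and start <= o.day <= end
    ]


# Made-up sample data; a real service would read from a remote archive.
SAMPLE = [
    Observation(38.9, -77.0, date(2001, 6, 15), 0.42),
    Observation(38.9, -77.0, date(2001, 7, 15), 0.47),
    Observation(51.5, -0.1, date(2001, 6, 15), 0.39),
]

# The user's preferences: a latitude/longitude box and the month of June 2001.
mid_atlantic_june = subset(
    SAMPLE, (35.0, 40.0), (-80.0, -75.0), date(2001, 6, 1), date(2001, 6, 30)
)
```

The caller supplies only the preferences (a bounding box and a date range) and gets back the slice that matters, which is why the service can live on any machine that has access to the full data set.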
ECHO itself is housed in NASA's Goddard Space Flight Center in Greenbelt, Md.
It runs on Sun Microsystems SunFire V800s servers with a BEA Systems WebLogic application server and an Oracle9i database. In January 2003, NASA contracted with Global Science and Technology Inc. of Greenbelt to set up and maintain the ECHO service directory. GST brought in Blueprint to help with the work.
In ECHO, an owner of a service can submit its description using Web Services Description Language. UDDI can then keep track of a wide range of metadata, including service uptime, contact information and security requirements.
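The kind of record such a directory keeps can be sketched in a few lines of Python. This is a simplified in-memory stand-in, not the UDDI data model or ECHO's registry: the ServiceEntry and ServiceDirectory names, the field names and the example.org addresses are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ServiceEntry:
    """Hypothetical directory record for one remotely invokable service."""
    name: str
    category: str          # e.g. "subsetter" or "modeling"
    wsdl_url: str          # where the service's WSDL description lives
    contact: str
    uptime_percent: float
    requires_login: bool   # stand-in for richer security requirements


class ServiceDirectory:
    """Toy registry: owners register entries, researchers search them."""

    def __init__(self) -> None:
        self._entries: Dict[str, ServiceEntry] = {}

    def register(self, entry: ServiceEntry) -> None:
        # Re-registering under the same name replaces the older record.
        self._entries[entry.name] = entry

    def find(self, category: str, min_uptime: float = 0.0) -> List[ServiceEntry]:
        """Return matching services, sorted by name for stable output."""
        matches = (
            e for e in self._entries.values()
            if e.category == category and e.uptime_percent >= min_uptime
        )
        return sorted(matches, key=lambda e: e.name)


directory = ServiceDirectory()
directory.register(ServiceEntry(
    "grid-subsetter", "subsetter", "http://example.org/subsetter?wsdl",
    "ops@example.org", 99.5, False,
))
directory.register(ServiceEntry(
    "slow-subsetter", "subsetter", "http://example.org/slow?wsdl",
    "ops@example.org", 80.0, True,
))
directory.register(ServiceEntry(
    "veg-model", "modeling", "http://example.org/model?wsdl",
    "model@example.org", 97.0, True,
))

reliable_subsetters = directory.find("subsetter", min_uptime=95.0)
```

Because every entry carries the same fields, a program can pick a service by category and reliability without a person reading a Web page first.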
In addition to providing a directory that researchers can use to find new resources, UDDI will also help ECHO take the next step: letting users build workflows between different resources. The team hopes to have this feature ready early this year.
'You would build your own workflow and ECHO will facilitate the interactions between its partners,' Burnett said.
For instance, a researcher may want a report on the monthly average amount of vegetation growth from June 2001 to the present, displayed as a time-lapsed map. The vegetation data may be in one location, a subsetter in another location and a modeling tool in a third location. That person could compose a workflow at ECHO that would dispatch a query for the data, pipe that data to a subsetter and then route that output to the modeling service. Since the UDDI data is machine-readable, the researcher wouldn't need to know the specifics of how to route the data from one service to the next. Instead, ECHO would provide high-level graphical tools to cobble together the workflow.
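The vegetation example can be sketched as a chain of three steps in Python. This is only a rough illustration of the pattern, not ECHO's workflow tool, which was still in development: each function below is a local, hypothetical stand-in for a remote service, and the data values are invented.

```python
from collections import defaultdict
from datetime import date
from functools import reduce
from typing import Any, Callable, Dict, List, Sequence, Tuple

# (day, latitude, longitude, vegetation index); the layout is an assumption.
Record = Tuple[date, float, float, float]
Step = Callable[[Any], Any]


def fetch_vegetation(_: Any) -> List[Record]:
    """Stand-in for the data provider; returns invented sample records."""
    return [
        (date(2001, 6, 10), 38.9, -77.0, 0.40),
        (date(2001, 6, 20), 38.9, -77.0, 0.50),
        (date(2001, 7, 10), 38.9, -77.0, 0.60),
        (date(2001, 6, 10), 51.5, -0.1, 0.30),
    ]


def subset_region(records: List[Record]) -> List[Record]:
    """Stand-in for the subsetter; keeps one fixed latitude/longitude box."""
    return [r for r in records if 35.0 <= r[1] <= 40.0 and -80.0 <= r[2] <= -75.0]


def monthly_average(records: List[Record]) -> Dict[str, float]:
    """Stand-in for the modeling service; averages the index per month."""
    buckets: Dict[str, List[float]] = defaultdict(list)
    for day, _lat, _lon, value in records:
        buckets[day.strftime("%Y-%m")].append(value)
    return {month: sum(vals) / len(vals) for month, vals in sorted(buckets.items())}


def run_workflow(steps: Sequence[Step], initial: Any = None) -> Any:
    """Pipe the output of each step into the next one, in order."""
    return reduce(lambda data, step: step(data), steps, initial)


report = run_workflow([fetch_vegetation, subset_region, monthly_average])
```

The researcher only lists the steps in order; run_workflow handles the hand-offs. In the scenario described above, the machine-readable directory would supply the details each hand-off needs, so the steps could sit on three different servers.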
It's this sort of automation that's the ultimate promise of SOA. NASA is getting an early glimpse of that promise.
Joab Jackson is the senior technology editor for Government Computer News.