It Pays to be Persistent

 


The Energy and Defense departments are minimizing 'document not found' messages by using persistent identifiers.

'Document Not Found.' Is there a browser message more annoying?

With its Information Bridge program, the Energy Department's Office of Scientific and Technical Information is trying to do away with such messages. The agency is giving its research documents permanent addresses on the Web so they can always be found. Ascribing permanence in an online world is no easy feat, but it may go a long way toward minimizing 'Document Not Found' messages.

The first decade of the Web was a time of fluidity for government agencies. Early adopters posted agency material, only to have it shuffled around as new IT initiatives and enterprise architectures uprooted the order of documents. These days, when someone types in an older Web address for an agency page, chances are they'll see an error message. Equally problematic, as copies of documents proliferate across the Web, updates go unnoticed. And these sorts of problems will only grow worse over time.

To address these concerns, Energy and other agencies are embracing the idea of 'persistent identifiers,' which assign documents permanent online addresses. No matter how many changes the agency makes to its IT architecture, documents with persistent identifiers will always be accessible through the same address.

Technically, persistence is easy. It mostly involves recognizing that someone within the agency needs to manage the task. Energy maintains a set of servers that contain a list of all the assigned permanent addresses, written in the Persistent Uniform Resource Locator format. Those addresses are mapped to the current addresses where documents reside.

'In the old days, everybody relied on a report number to find a document. Now people use PURLs,' said Sharon Jordan, assistant director at OSTI in the Office of Program Integration.

So far, it has been mostly the scientific and library communities that have embraced permanent document identifiers.
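The mapping Energy's servers maintain, permanent addresses on one side and current locations on the other, can be pictured as a simple lookup table. What follows is only a minimal sketch in Python, not OSTI's software: the class name, the `/purl/12345` path and the `example.gov` hosts are all made up for illustration.

```python
class PurlRegistry:
    """Toy resolution table: permanent address -> current location.

    Hypothetical illustration only; names and URLs are invented.
    """

    def __init__(self):
        self._targets = {}

    def register(self, purl, target):
        # A permanent address is assigned once and never reassigned.
        if purl in self._targets:
            raise ValueError(f"{purl} is already assigned")
        self._targets[purl] = target

    def retarget(self, purl, new_target):
        # The document moved: change where the PURL points, not the PURL.
        if purl not in self._targets:
            raise KeyError(purl)
        self._targets[purl] = new_target

    def resolve(self, purl):
        # Return the document's current location.
        return self._targets[purl]


registry = PurlRegistry()
registry.register("/purl/12345", "http://lab-a.example.gov/reports/12345.pdf")
# The originating facility reorganizes its site; only the mapping changes.
registry.retarget("/purl/12345", "http://archive.example.gov/2005/12345.pdf")
print(registry.resolve("/purl/12345"))
```

The point of the design is that readers only ever see the left-hand side of the table, so the right-hand side can change as often as the agency's architecture does.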
Eventually, all agencies may have to grapple with them.

Energy's Information Bridge program gives users the ability to search Energy research and development documents, including papers published in scientific journals as well as 'gray literature,' or working papers, informal presentations and other unpublished but still pertinent material.

The material runs the gamut of Energy research, covering physics, chemistry, materials, biology, environmental sciences, energy technologies, engineering, computer and information science, renewable energy and more. Operational since 1997, Information Bridge holds material stretching back to 1995.

'When we began dealing with digital documents, we realized that we needed a locator or identifier,' Jordan said. Since OSTI houses only some of the documents, while others are hosted by the originating facilities themselves, the office risked setting up a system that would point to documents their custodians could later move.

To assign permanence to the documents, OSTI uses the PURL naming system developed by Online Computer Library Center Inc., a nonprofit research organization for the library community. A PURL looks like a Uniform Resource Locator, the format used for Web addresses. Unlike URLs, however, PURLs come with at least an implicit guarantee from the custodian of a document that the document will always be available at that address.

So far, the Information Bridge has attached PURLs to about 110,000 documents, said Jeff Given, OSTI IT services project manager for Information International Associates Inc. of Oak Ridge, Tenn. OSTI set up a dedicated server, called a resolution service, that keeps track of all PURL addresses, along with their current locations.

When a PURL-based request for a document comes in from the Internet, the software simply redirects that request to the server holding the document, said Stuart Weibel, a senior research scientist at the OCLC who created PURL.
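The redirect Weibel describes can be sketched with nothing more than Python's standard `http.server` module. This is an assumption-laden illustration, not the OCLC program: the table contents, the `lookup` and `serve` helpers, and the choice of a 302 status code are all ours.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical table; a real resolution service would keep this in a database.
PURL_TABLE = {
    "/purl/12345": "http://archive.example.gov/2005/12345.pdf",
}


def lookup(path, table=PURL_TABLE):
    """Return (status, headers) for a requested permanent address."""
    target = table.get(path)
    if target is None:
        return 404, {}
    # Redirect the browser to wherever the document lives today.
    return 302, {"Location": target}


class PurlHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, headers = lookup(self.path)
        self.send_response(status)
        for name, value in headers.items():
            self.send_header(name, value)
        self.end_headers()


def serve(port=8080):
    # Blocks forever; call serve() to try the sketch locally.
    HTTPServer(("", port), PurlHandler).serve_forever()
```

With `serve()` running, a request for `/purl/12345` would be answered with a `Location` header pointing at the document's current home, and the browser would follow it without the reader noticing.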
While someone can change what a PURL resolves to, they can't change the PURL itself.

Redirecting incoming requests for pages is nothing new. It's a standard feature of Web server software. But OCLC's open-source program makes the task of managing PURLs much easier, Weibel said. You can download the software at http://purl.oclc.org.

While the technology may be a snap, the key to running a permanent identifier system is administrative oversight. An agency needs to understand the importance of setting up a system that will keep track of where all its documents, old and new, are located. It also needs to organize its workflow so new documents are registered through the service.

OSTI's task was relatively easy, because Energy already had procedures in place to register scientific documents with the office. Each Energy office designates an individual who manages its documents. When a researcher creates a new artifact, they must alert their office, which in turn alerts OSTI to the new document.

'It used to be a paper process. In the 1990s, we just updated our procedures to do it all electronically,' Jordan said.

Like Energy, the Defense Department is also using permanent identifiers for its scientific literature, but it took a slightly different course.

The Defense Technical Information Center, based in Fort Belvoir, Va., serves as a centralized repository for Defense scientific and technical information. Like OSTI, DTIC has long had procedures in place for researchers to keep DTIC notified of their documents. To provide a public conduit to these documents, DTIC set up the Public Scientific and Technical Information Network, a Web portal through which users can search the 200,000 technical reports under DTIC's care (http://stinet.dtic.mil).

But DTIC does not use PURLs. It uses a technique called the Handle System to keep track of the documents, said James Erwin, director of information science and technology for the Defense Technical Information Center.
The Corporation for National Research Initiatives oversees the Handle System (www.handle.net), and offers software that can be integrated into browsers so they recognize handles.

Like PURLs, handles can be inserted into standard URLs. A search for the term 'silicon nitride' at the StiNet site will return a document with the address http://handle.dtic.mil/100.2/ADA428642. The first part of the address is a standard URL; the second part, 100.2/ADA428642, is the permanent handle.

Though they serve their immediate communities, these persistent-identifier networks are starting to be adopted by other groups as well. For instance, Defense Department librarians have incorporated DTIC's Handle System addresses into their own reference systems, Erwin said. In addition, DTIC is working with the Defense Department's Advanced Distributed Learning Office, which is building a repository of e-learning modules. The modules will be tagged with handles so they will be searchable through StiNet.

Information Bridge is also expanding. Earlier this year, Energy joined an academic linking service called CrossRef, run by a nonprofit industry coalition called the Publishers International Linking Association (www.crossref.org). CrossRef assigns permanent digital object identifiers to scientific reports, allowing them to be located even if their Web addresses change. OSTI will place CrossRef identifiers on about 90,000 Energy articles. This will allow CrossRef users, mostly academics who may not know about Information Bridge, to find Energy Department data.

Erwin and other persistent identifier gurus, however, are thinking more broadly. They want to see a global identifier system for the world's documents.

Organizations like OSTI and DTIC are addressing a problem that all agencies will soon face. Subsection 207(d) of the 2002 E-Government Act calls for 'the adoption of standards, which are open to the maximum extent feasible, to enable the organization and categorization of Government information ... in a way that is searchable electronically, including by searchable identifiers; and in ways that are interoperable across agencies.'

Erwin is co-chair of the Categorization of Government Information Working Group, part of the Interagency Committee on Government Information. This group created a set of recommendations for the Office of Management and Budget on how federal agencies could meet the E-Gov mandate. Persistent identifiers are a key component.

The group recommends that the federal government stick close to standards proposed by the Internet Engineering Task Force, which shepherded the protocols that led to the Internet's ubiquity. IETF's proposed identification scheme is called Uniform Resource Names. While today's identifier schemes, such as PURL and the Handle System, piggyback on Web URLs to identify a document's whereabouts, the URN system would create an entirely separate information space on the Internet. The URN creators assume that Web addresses will always be in turmoil, so they propose creating an entirely different naming scheme solely dedicated to permanently placed documents. Under the scheme, the address for a permanent document would start with urn: instead of the common http://.

A governmentwide group formerly called the Commerce, Energy, NASA, Defense Information Managers Group (now known simply as CENDI) is working on a prototype of how such a system would work, which it plans to demonstrate this month. VeriSign Inc. of Mountain View, Calif., will contribute an open-source browser plug-in that will identify URNs as hyperlinks.

And just as the Internet bound together many individual networks, URN can bind together different identifier schemes, including the Handle System and PURL.
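To see what binding an existing scheme into URN might look like, take the StiNet address quoted earlier. The Python sketch below splits it into the resolver's URL and the handle, then carries the handle, unchanged, inside a URN. The function names are invented, and the `hdl` namespace label is a hypothetical placeholder rather than a registered URN namespace.

```python
from urllib.parse import urlsplit


def split_handle_url(url):
    """Split a handle-bearing URL into (resolver, prefix, suffix)."""
    parts = urlsplit(url)
    handle = parts.path.lstrip("/")            # e.g. "100.2/ADA428642"
    prefix, _, suffix = handle.partition("/")  # naming authority / local name
    if not prefix or not suffix:
        raise ValueError(f"no handle found in {url!r}")
    return f"{parts.scheme}://{parts.netloc}", prefix, suffix


def wrap_as_urn(namespace, identifier):
    # The identifier is embedded as-is; "hdl" below is a hypothetical label.
    return f"urn:{namespace}:{identifier}"


resolver, prefix, suffix = split_handle_url("http://handle.dtic.mil/100.2/ADA428642")
print(resolver)                                  # http://handle.dtic.mil
print(wrap_as_urn("hdl", f"{prefix}/{suffix}"))  # urn:hdl:100.2/ADA428642
```

The handle itself passes through untouched; only the wrapper around it is new.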
This means that if wide-scale adoption of URN occurs, existing identifier systems will not have to change their addresses.

The URN identification will be a multipart address, said Michael Mealling, who wrote the IETF requests for comments that explain the workings of a URN resolution service, called the Dynamic Delegation Discovery System. Mealling is CEO of Refactored Networks LLC of Kennesaw, Ga.

The first part of the address after the urn: prefix will identify the type of naming system in place. The Internet Assigned Numbers Authority will manage these namespaces, as they are called.

'It is a system that allows opaque persistent identifiers to be used generically. You can have a URN contain an ISBN number just as easily as it contains a product code or a handle,' Mealling said. While the supply chain community could use URNs to keep track of RFID-tagged items, the library community could use the same URN system to track unique ISBN numbers for books.

After the namespace is identified, the resolution process would then hand off the request to the particular organization overseeing the document requested, much as the Domain Name System identifies only the top-level domain names, leaving organizations to manage their own Web spaces.

A global URN resolution system would 'not contain data about the document. It points you to the server that can then give you data about the document,' Mealling said.

If the system works and becomes widely adopted by agencies, it could represent a major step not only in e-government, but also in information sharing in general. In the meantime, experts say, IT shops should brush up on the subject of persistent identifiers. CENDI put out a white paper last year titled Persistent Identification: A Key Component of an E-Government Infrastructure. You can read it by visiting www.gcn.com and typing 474 in the GCN.com/box.
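For readers who want to picture the namespace hand-off Mealling describes, here is a rough Python sketch. It is not the Dynamic Delegation Discovery System itself, which works through DNS records; it only mimics the idea of reading the namespace first and delegating the rest. The resolver table, the `hdl` label and the `catalog.example.org` host are assumptions made for illustration.

```python
def parse_urn(urn):
    """Split 'urn:<namespace>:<name>' into (namespace, name)."""
    scheme, sep, rest = urn.partition(":")
    if scheme.lower() != "urn" or not sep:
        raise ValueError(f"not a URN: {urn!r}")
    namespace, sep, name = rest.partition(":")
    if not namespace or not sep or not name:
        raise ValueError(f"URN needs a namespace and a name: {urn!r}")
    return namespace.lower(), name


# Hypothetical hand-off table: each namespace is run by its own organization.
RESOLVERS = {
    "hdl": lambda name: f"http://handle.dtic.mil/{name}",
    "isbn": lambda name: f"http://catalog.example.org/isbn/{name}",
}


def resolve(urn):
    namespace, name = parse_urn(urn)
    hand_off = RESOLVERS.get(namespace)
    if hand_off is None:
        raise LookupError(f"no resolver registered for namespace {namespace!r}")
    # The top level knows nothing about the document; it only delegates.
    return hand_off(name)


print(resolve("urn:hdl:100.2/ADA428642"))  # http://handle.dtic.mil/100.2/ADA428642
print(resolve("urn:isbn:0451450523"))      # http://catalog.example.org/isbn/0451450523
```

As in Mealling's description, the top-level table holds no data about any document. It knows only which organization to ask.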

'Document not found' messages, such as these generated by government Web sites, often frustrate users. Persistent identifiers can help.














Forever Marked




















A handle on Defense docs














Coming soon to your agency

















Playing the name game






