Can your computer read a Web page without your help? Soon it might.

 


Tim Berners-Lee, the inventor of the Web, and the World Wide Web Consortium, the organization that maintains the Web's standards, have recently been promoting the idea of making the Web machine-readable, or a Web of data. What does that mean? After all, at least in one sense, the Web is already being read by a machine -- namely your own computer -- when you surf the Web.


At the International Semantic Web Conference, being held this week in Chantilly, Va., Dean Allemang, chief scientist at Semantic Web consulting firm TopQuadrant, offered a solid example of how a machine-readable Web would help us all, in theory anyway.

His example was work-related: booking hotels. Say you wanted to attend a conference at some out-of-town location. The conference site itself probably has a Web site.

You copy its physical address from its site, and go to an online hotel broker site, such as Hotels.com, to find a nearby hotel. You do a search on hotels, say, by entering that address into the search criteria to seek hotels within a certain radius. Or you just get a list of hotels and go to a third Web site, a mapping site such as MapQuest, and enter the hotel addresses and the conference center address to see if any hotel is close to the conference center.

In Allemang's view, this really is crazy. Why copy some information from one page and paste it to another, using the same computer? Why can't the computer itself do the work?

The trick would be to get all the sites to agree on how to represent an address, Allemang said. Then, the addresses can be passed from one site to the next through your browser, automatically, without you having to do anything. The mapping site could check your cache and list any addresses found there, offering you the option of mapping them.
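To make the idea of an agreed-upon address format concrete, here is a minimal Python sketch. The field names, the JSON encoding and the sample address are all assumptions made for illustration; Allemang did not describe a specific format.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class Address:
    # Hypothetical shared fields; a real agreement would define its own.
    street: str
    city: str
    state: str
    postal_code: str


def publish(address):
    """What a conference site might emit: the address as JSON text."""
    return json.dumps(asdict(address), sort_keys=True)


def consume(payload):
    """What a mapping site might do: rebuild the same Address from JSON."""
    return Address(**json.loads(payload))


# Made-up venue address, used only to show the round trip.
venue = Address("123 Example St", "Chantilly", "VA", "00000")
received = consume(publish(venue))
print(received == venue)
```

Because both sides share one definition of an address, nothing has to be copied and pasted by hand.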

Automating such a task (and the countless others we do by hand on our computers) is the point of creating a machine-readable Web. If computer programs can read the Web pages and carry out tasks, we won't have to.

Relational databases make the prospect feasible. With databases, you can structure data so each data element is slotted into a predictable location. You can query a database of personnel data to return the birth date of a particular person, because the row of data with that person's info has a column dedicated to the birth date.

This approach wouldn't work so well for data beyond a single database, however. "The problem is that everyone assumes you will need to build a huge data warehouse, where everything can be compared. This will never happen," Allemang said. Another factor: On the Web, data is not structured in such a way that it can be retrieved with any consistency, and the vast number of people who design and maintain Web sites would not all agree on the same format for structuring data.
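The personnel lookup can be sketched with Python's built-in sqlite3 module. The table name, column names and sample rows below are made up for illustration.

```python
import sqlite3

# In-memory database with a hypothetical personnel table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE personnel (name TEXT PRIMARY KEY, birth_date TEXT)")
conn.executemany(
    "INSERT INTO personnel (name, birth_date) VALUES (?, ?)",
    [("Ada Example", "1970-01-15"), ("Bob Example", "1982-06-30")],
)


def birth_date_of(name):
    """Return the birth date stored for one person, or None if absent."""
    row = conn.execute(
        "SELECT birth_date FROM personnel WHERE name = ?", (name,)
    ).fetchone()
    return row[0] if row else None


print(birth_date_of("Ada Example"))
```

The query works only because every row keeps the birth date in the same column, which is exactly the kind of agreement the open Web lacks.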

The answer the W3C has come up with takes the form of a set of interrelated standards that can be used to embed data on Web sites, as well as to interpret the data that is found there. One standard is the Resource Description Framework, or RDF. The other is the Web Ontology Language, or OWL.

RDF is a way of encoding data so that it is available to a wider audience and external IT systems can understand it. It is based on making associations. It describes data by breaking each data element into three nodes: a subject, a predicate and an object. For example, consider the fact that Yellowstone National Park offers camping. "Yellowstone" would be the subject, "offers" would be the predicate and "camping" would be the object. (All three elements get uniform resource identifiers, or globally recognized Internet addresses.)

A query against a triple store, which is what an RDF database is called, can link together disparate facts. If another triple, perhaps located in another triple store, contains the fact that Yellowstone contains the Mammoth Hot Springs, a single search across multiple triple stores can return both facts.

Additional standards can further refine the precision of the data definition. For instance, two parties can agree that the term "Yellowstone" refers to "Yellowstone National Park" by using a shared, controlled vocabulary, which can be referenced through the Resource Description Framework Schema, or RDFS. RDFS also allows inferencing. In RDFS, you can state that Yellowstone is a type of national park. So a search for national parks that offer camping would return Yellowstone.
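As a rough illustration, a triple can be modeled in plain Python as three URIs. The `http://example.org/parks/` namespace is a placeholder, not a real vocabulary, and this sketch does not use an actual RDF library.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Hypothetical namespace; real data would use URIs the publisher controls.
EX = "http://example.org/parks/"

yellowstone_camping = Triple(EX + "Yellowstone", EX + "offers", EX + "Camping")


def local_name(uri):
    """Return the last path segment of a URI, for readable display."""
    return uri.rsplit("/", 1)[-1]


print(" ".join(local_name(part) for part in yellowstone_camping))
```

Using full URIs rather than bare words is what lets two unrelated sites refer to the same thing without ambiguity.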
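A simplified sketch of that cross-store lookup follows, with each store modeled as a Python set of tuples. A real deployment would query remote stores; the short labels here stand in for full URIs, and the function name is an invention for this example.

```python
# Two independent stores, each holding one fact about Yellowstone.
store_a = {("Yellowstone", "offers", "Camping")}
store_b = {("Yellowstone", "contains", "Mammoth Hot Springs")}


def facts_about(subject, *stores):
    """Collect every (predicate, object) pair for a subject across stores."""
    results = set()
    for store in stores:
        for s, p, o in store:
            if s == subject:
                results.add((p, o))
    return results


facts = facts_about("Yellowstone", store_a, store_b)
for predicate, obj in sorted(facts):
    print("Yellowstone", predicate, obj)
```

Neither store knows about the other; the facts line up only because both use the same identifier for the subject.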
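The type-based search might look like this in a toy Python form. The `type` and `subClassOf` labels loosely mirror rdf:type and rdfs:subClassOf, "ExamplePark" is a made-up entry, and the extra "Park" class is an assumption added to show an inferred fact.

```python
triples = {
    ("Yellowstone", "type", "NationalPark"),
    ("Yellowstone", "offers", "Camping"),
    ("ExamplePark", "type", "NationalPark"),  # hypothetical park
    ("ExamplePark", "offers", "Hiking"),
    ("NationalPark", "subClassOf", "Park"),   # assumed class hierarchy
}


def types_of(entity, triples):
    """Return the stated types of an entity plus every inferred superclass."""
    found = {o for s, p, o in triples if s == entity and p == "type"}
    frontier = list(found)
    while frontier:
        cls = frontier.pop()
        for s, p, o in triples:
            if s == cls and p == "subClassOf" and o not in found:
                found.add(o)
                frontier.append(o)
    return found


def instances_offering(cls, service, triples):
    """Find entities of a given class that offer a given service."""
    subjects = {s for s, p, o in triples if p == "offers" and o == service}
    return sorted(s for s in subjects if cls in types_of(s, triples))


print(instances_offering("NationalPark", "Camping", triples))
print(instances_offering("Park", "Camping", triples))
```

The second search succeeds even though no triple says outright that Yellowstone is a "Park"; that fact is inferred from the class hierarchy.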

Of course, the Interior Department could build a list of all the national parks and include which services each park offers. But with the Semantic Web approach, such a single database would never be needed. The services for each park could maintain their own data, and the results could be compiled only when someone posts some piece of specific data, Allemang pointed out. In essence, with RDF, a user can build a set of data from various sources on the Web that may not have been brought together before.

How do you use these triples? One way is through the query language for RDF, called SPARQL (an abbreviation for the humorously recursive SPARQL Protocol and RDF Query Language). With Structured Query Language (SQL), you can query multiple database tables through the JOIN operation. With a SPARQL query, you specify all the triples you would need, and the query engine will filter down to the answers that fit all of your criteria.

For instance, say you are looking for a four-star hotel in New York. You write a query that looks for triples specifying four-star hotels, and for triples specifying hotels in New York. The query engine would find all the triples for hotels in New York, as well as all the triples for four-star hotels, and filter the set down to four-star hotels in New York.

Even more sophisticated interpretations of RDF triples can be done through OWL.

The logical chain of reasoning within an RDF triple is relatively static, and can vary according to who does the encoding. One triple may say that Yellowstone "offers" camping as a service, but another triple may state that camping "is offered by" Acadia National Park. While it may seem obvious to us that both Acadia and Yellowstone offer camping, it wouldn't be to the computer. A SPARQL query engine, perhaps one embedded in a Web application, could consult OWL and return both entries, though.
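The filtering step can be imitated with a small pattern matcher in Python. This is a toy stand-in for a SPARQL engine, not SPARQL itself; the hotel names and the `locatedIn` and `starRating` predicates are made up.

```python
triples = [
    ("HotelA", "locatedIn", "New York"),
    ("HotelA", "starRating", "4"),
    ("HotelB", "locatedIn", "New York"),
    ("HotelB", "starRating", "3"),
    ("HotelC", "locatedIn", "Boston"),
    ("HotelC", "starRating", "4"),
]


def match(patterns, triples, binding=None):
    """Yield variable bindings that satisfy every pattern (a basic join)."""
    binding = binding or {}
    if not patterns:
        yield dict(binding)
        return
    first, rest = patterns[0], patterns[1:]
    for triple in triples:
        candidate = dict(binding)
        ok = True
        for term, value in zip(first, triple):
            if term.startswith("?"):
                # A variable must keep the same value everywhere it appears.
                if candidate.setdefault(term, value) != value:
                    ok = False
                    break
            elif term != value:
                ok = False
                break
        if ok:
            yield from match(rest, triples, candidate)


# Terms starting with "?" are variables, as in SPARQL.
query = [("?hotel", "locatedIn", "New York"), ("?hotel", "starRating", "4")]
results = sorted(b["?hotel"] for b in match(query, triples))
print(results)
```

Reusing the `?hotel` variable in both patterns is what plays the role of the SQL JOIN: only a hotel that satisfies both triples survives.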
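One way OWL can bridge the two phrasings is by declaring one predicate the inverse of the other. The Python sketch below imitates that with a plain dictionary standing in for an owl:inverseOf declaration; the short labels and function names are assumptions for illustration.

```python
triples = {
    ("Yellowstone", "offers", "Camping"),
    ("Camping", "isOfferedBy", "Acadia"),
}

# Stands in for an OWL statement that isOfferedBy is the inverse of offers.
inverse_of = {"isOfferedBy": "offers"}


def with_inverses(triples, inverse_of):
    """Return the triples plus the flipped form of every inverse predicate."""
    expanded = set(triples)
    for s, p, o in triples:
        if p in inverse_of:
            expanded.add((o, inverse_of[p], s))
    return expanded


def who_offers(service, triples):
    """List every subject stated to offer the given service."""
    return sorted(s for s, p, o in triples if p == "offers" and o == service)


print(who_offers("Camping", triples))
print(who_offers("Camping", with_inverses(triples, inverse_of)))
```

Without the inverse rule the search finds only Yellowstone; with it, both parks come back.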

While the idea of a machine-readable Web sounds great, it still requires data holders to render their material in RDF, a tall order for already-overworked Web managers. But the benefits may be worth it: once online, data can be reused in ways that government managers may never have considered.
