Draft Proposed Standard for DTOC entries - July 8, 1997

Recap: In March, during a conference call, we decided to pursue development of a prototype data catalog based on free-text searching of site "Data Table of Contents" (DTOC). I was charged with pulling together a specification for the form of DTOC entries. Now that our NSF site review is complete, I'm turning my attention to that task. I'd like to get your feedback so that we can get a working prototype operational before the IM meeting in early August.

Please give me feedback not only on the proposed form of the DTOC entries from the viewpoint of the user, but also on how hard the DTOC entries would be to put together and update on your end.

There are two major options.

OPTION 1: a relatively simple form that consists of the data set title, linked to the metadata for that data set, followed by "--" and the site code.

The recommended HTML for this would be:

<LI><A HREF="http://www.vcrlter.virginia.edu/where_data_is">
The title of the dataset goes here
</A> --VCR
</LI>

The HREF entry should point to an HTML file of metadata for the dataset being described. Please use a FULL URL (starting with http://....), not just the local address, because we need to follow the link from the search server to get to your site.
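
As a rough illustration of the full-URL requirement, a site could check its HREF values before publishing a DTOC. This is only a sketch: the function name is_full_url is made up for this example, and accepting https as well as http is my assumption, not part of the proposal.

```python
from urllib.parse import urlparse

def is_full_url(href):
    """Return True if href has both a scheme and a host name.

    Hypothetical helper; accepting "https" in addition to "http"
    is an assumption made for this sketch.
    """
    parts = urlparse(href.strip())
    return parts.scheme in ("http", "https") and bool(parts.netloc)

# A full URL passes; a local (relative) address does not.
print(is_full_url("http://www.vcrlter.virginia.edu/where_data_is"))
print(is_full_url("/where_data_is"))
```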

The <LI> provides a bulleted list in HTML and makes it easy to separate out individual DTOC entries if it becomes necessary to split DTOCs into individual entries. The first link would be to the metadata from the site server. The "--" makes it easy to parse the DTOC entry into its data and site components. We could do parentheses or brackets if folks preferred. The second link would be to the overall site home page.
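
To give a feel for how the "--" separator could be used, here is a minimal sketch of parsing an entry in this form into its URL, title and site parts. The pattern and the function name parse_simple_entry are illustrative assumptions, not an agreed-upon parser; it only handles entries laid out like the example above.

```python
import re

# Pattern for an entry of the form:
#   <LI><A HREF="url">title</A> --SITE
# The "--" is only looked for after the closing </A>, so a "--"
# inside the title itself does not confuse the split.
ENTRY_RE = re.compile(
    r'<LI>\s*<A\s+HREF="(?P<url>[^"]+)"\s*>(?P<title>.*?)</A>'
    r'\s*--\s*(?P<site>[^<\s]+)',
    re.IGNORECASE | re.DOTALL,
)

def parse_simple_entry(html):
    """Split one DTOC entry into url, title and site (hypothetical helper)."""
    m = ENTRY_RE.search(html)
    if m is None:
        return None
    return {
        "url": m.group("url"),
        # Collapse the line breaks and extra spaces around the title.
        "title": " ".join(m.group("title").split()),
        "site": m.group("site"),
    }

sample = '''<LI><A HREF="http://www.vcrlter.virginia.edu/where_data_is">
The title of the dataset goes here
</A> --VCR
</LI>'''

print(parse_simple_entry(sample))
```

If we switched to parentheses or brackets around the site code, only the tail of the pattern would need to change.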

A review of existing site DTOC entries found that almost all contained the Data Set Identity (see Michener et al., 1997, Ecol. Applic. 7:330-342 for definition). Thus this should be very easy to implement at the site level.

However, despite the advantages of simplicity, option 1 has certain problems. It provides no way to search based on keywords that aren't in the data set identity, nor does it provide a place to list data authors or site-specific data set identification codes. We can add optional places in the DTOC for these things, but added complexity comes at some cost, especially if we want to keep each DTOC entry in a form that can be parsed into its components. This ability to parse the entries will be important if we (later) want to build more complex search engines based on the DTOC catalog (e.g., links from the personnel directory to data originators, or lists of keywords that are then linked to the data). Below I have used HTML comments to make the data parsable.

I therefore propose this as a second option for DTOC entries:

<LI><A HREF="http://www.vcrlter.virginia.edu/where_data_is">
The title of the dataset goes here
</A> -- VCR
<BR>--<!-- originators -->Comma separated list of investigators
<BR>--<!-- keywords -->Comma separated list of keywords
<BR>--<!-- datasetID -->Site-specific dataset ID
<BR>--<!-- other -->Other optional information sites want to add
</LI>
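
To show how the HTML comments could make the extra fields parsable, here is a minimal sketch that pulls the labeled fields out of one entry. The names FIELD_RE, LIST_FIELDS and parse_tagged_fields are made up for this example, the sample values are placeholders, and treating originators and keywords as comma-separated lists simply follows the field descriptions above.

```python
import re

# Matches "--<!-- label -->value", where the value runs up to the next tag.
FIELD_RE = re.compile(r'--\s*<!--\s*(?P<label>\w+)\s*-->(?P<value>[^<]*)')

# Fields described above as comma-separated lists.
LIST_FIELDS = {"originators", "keywords"}

def parse_tagged_fields(html):
    """Return a dict of the comment-labeled fields found in one entry."""
    fields = {}
    for m in FIELD_RE.finditer(html):
        label = m.group("label")
        value = m.group("value").strip()
        if label in LIST_FIELDS:
            fields[label] = [v.strip() for v in value.split(",") if v.strip()]
        else:
            fields[label] = value
    return fields

# Placeholder values, for illustration only.
sample = '''<LI><A HREF="http://www.vcrlter.virginia.edu/where_data_is">
The title of the dataset goes here
</A> -- VCR
<BR>--<!-- originators -->Investigator1, Investigator2
<BR>--<!-- keywords -->Keyword1, Keyword2
<BR>--<!-- datasetID -->Site-specific dataset ID
</LI>'''

print(parse_tagged_fields(sample))
```

Because each field carries its own label, the parser does not depend on the order of the lines, and a field that is not present simply does not show up in the result.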

Where information is not available, the whole line (the <BR>, the leading --, the comment and the data field) would be omitted. Thus an entry with the basic info and keywords (only) would be:

<LI><A HREF="http://www.vcrlter.virginia.edu/where_data_is">
The title of the dataset goes here
</A> -- VCR
<BR>--<!-- keywords -->Keyword1, Keyword2
</LI>

My preference is to go with the second form. It truncates down to the first form when additional information is not added beyond the data set identity. However, I'd appreciate your comments and suggestions.

I've set up a page at:

http://www.vcrlter.virginia.edu/nis/toclist.html

where you can see a sample DTOC that shows the visual impact of the proposed format.

Thanks for your help!

-John Porter

-- 
---------------+--------------------------------+----------------------------
John H. Porter |  Dept. Environmental Sciences  |  jhp7e@virginia.EDU
(804) 924-8999 |  Clark Hall, Univ. of Virginia |  jporter@lternet.EDU
(804) 924-7761 |  Charlottesville, VA 22903     |  (804) 982-2137(fax) 
---------------+--------------------------------+----------------------------
WWW page: http://www.vcrlter.virginia.edu/jhp7e.html