Vertebrate Palaeontology Searching

25th September 2003

1. Introduction
        1.1. Motivating Example
        1.2. Relevance Ranking
2. Implementation
        2.1. Thesauri
        2.2. Distributed Search
        2.3. Heuristic Semantic Analysis

1. Introduction

1.1. Motivating Example

Imagine trying to express a search like this:

I want to see papers written by Bakker in 1988 on Kimmeridgian stegosaur metacarpals found on the Isle of Wight and held in the Natural History Museum.

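If the exact paper you want doesn't exist, there are seven degrees of freedom (author, year of publication, stratigraphic age, taxon, anatomical element, locality and holding institution) that a clever search engine could slide along to find papers that would interest you.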

In the absence of better hits, such an engine might offer up a paper written by Greg Paul in 1989 on Tithonian ankylosaur manual phalanges from Dorset held in the OUMNH.

1.2. Relevance Ranking

### Rank by number of degrees of slippage?
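One way to make "number of degrees of slippage" concrete is to count the axes on which a candidate differs from the query. The sketch below is only an illustration: the `Record` class, its field names and the `near_miss` record are assumptions made for the example, not part of any existing system, and it treats every mismatch as equally bad.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Record:
    """One value per axis of the search; field names are illustrative."""
    author: str
    year: int
    age: str
    taxon: str
    element: str
    locality: str
    institution: str


def slippage(query: Record, candidate: Record) -> int:
    """Count the axes on which the candidate differs from the query."""
    return sum(
        getattr(query, f.name) != getattr(candidate, f.name)
        for f in fields(Record)
    )


def rank(query: Record, candidates: list) -> list:
    """Order candidates from least slipped to most slipped."""
    return sorted(candidates, key=lambda c: slippage(query, c))


wanted = Record("Bakker", 1988, "Kimmeridgian", "stegosaur",
                "metacarpal", "Isle of Wight", "Natural History Museum")
# The fallback from the motivating example: every axis has slipped.
fallback = Record("Paul", 1989, "Tithonian", "ankylosaur",
                  "manual phalanx", "Dorset", "OUMNH")
# A hypothetical record that differs only in the holding institution.
near_miss = Record("Bakker", 1988, "Kimmeridgian", "stegosaur",
                   "metacarpal", "Isle of Wight", "OUMNH")

print(slippage(wanted, fallback))   # 7
print(slippage(wanted, near_miss))  # 1
print(rank(wanted, [fallback, near_miss])[0] is near_miss)  # True
```

A plain count ignores how far each axis has slipped (1989 is a near miss for 1988; 1889 is not), so it is best read as a first approximation.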

### Allow users to specify which axes are most/least significant.
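User-specified significance could be modelled as a per-axis weight, so that slipping on an axis the user cares about costs more than slipping on one they don't. This is a minimal sketch under that assumption; the axis names, the default weight of 1.0 and the sample records are all invented for illustration.

```python
AXES = ("author", "year", "age", "taxon", "element", "locality", "institution")


def weighted_slippage(query: dict, candidate: dict, weights: dict) -> float:
    """Sum the weights of the axes on which the candidate differs.

    Axes missing from `weights` get an assumed default weight of 1.0.
    """
    return sum(
        weights.get(axis, 1.0)
        for axis in AXES
        if query[axis] != candidate[axis]
    )


query = {
    "author": "Bakker", "year": 1988, "age": "Kimmeridgian",
    "taxon": "stegosaur", "element": "metacarpal",
    "locality": "Isle of Wight", "institution": "Natural History Museum",
}

# This user cares a great deal about taxon and very little about institution.
weights = {"taxon": 4.0, "institution": 0.5}

# Hypothetical candidates: one slips on taxon only, the other on two
# axes that matter less to this user.
wrong_taxon = dict(query, taxon="ankylosaur")
wrong_place = dict(query, locality="Dorset", institution="OUMNH")

print(weighted_slippage(query, wrong_taxon, weights))  # 4.0
print(weighted_slippage(query, wrong_place, weights))  # 1.5
```

Here the record that slips on two axes still outranks the one that slips on a single, heavily weighted axis, which is the behaviour a user who ranks taxon as most significant would presumably want.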

### View and rotate a 3D slice of the slippage space to see which areas are best represented (and which areas, because they're sparsely populated, will make good research subjects).

2. Implementation

2.1. Thesauri

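To make this work, the searching system would need to have six "thesauri" (in the most general sense of structured collections of authority records), one for each axis other than year of publication: author, stratigraphic age, taxon, anatomical element, locality and holding institution.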

(Slipping year of publication is trivial, of course: we don't need a structured authority file to tell us that 1986 is two years earlier than 1988.)

These thesauri would need to be provided by experts in the field. Experience shows that building them is usually more work than people expect, and is in any case an inexact science. That's OK: even a vague, imprecise and error-strewn thesaurus will yield useful results.
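To give a feel for how even a crude thesaurus supports slippage, here is a toy sketch in which each term points at one broader term, and the distance between two terms is the number of steps up to their nearest shared ancestor. The `BROADER` table is a deliberately tiny, informal fragment written for this example, and the function names are assumptions; a real thesaurus would be supplied by experts and would be far richer.

```python
# Toy "broader term" links; an illustrative fragment, not an authority file.
BROADER = {
    "stegosaur": "thyreophoran",
    "ankylosaur": "thyreophoran",
    "thyreophoran": "ornithischian",
    "ornithischian": "dinosaur",
    "sauropod": "saurischian",
    "saurischian": "dinosaur",
}


def ancestors(term: str) -> list:
    """Return the term followed by each successively broader term."""
    chain = [term]
    while chain[-1] in BROADER:
        chain.append(BROADER[chain[-1]])
    return chain


def distance(a: str, b: str):
    """Steps up from a plus steps up from b to their nearest shared term.

    Returns None when the thesaurus knows no term that covers both.
    """
    up_a = ancestors(a)
    up_b = ancestors(b)
    for steps_a, term in enumerate(up_a):
        if term in up_b:
            return steps_a + up_b.index(term)
    return None


print(distance("stegosaur", "ankylosaur"))  # 2
print(distance("stegosaur", "sauropod"))    # 5
print(distance("stegosaur", "trilobite"))   # None
```

With something like this in place, an ankylosaur paper counts as a closer slip from "stegosaur" than a sauropod paper does, even though the table is incomplete.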

2.2. Distributed Search

### New sites can "nuzzle up to" the network.

2.3. Heuristic Semantic Analysis

### Guess which bits of title/abstract are author, taxon, etc.
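A very rough sketch of what such guessing might look like: match words in a title against small per-axis vocabularies (which could be drawn from the thesauri) and pick out anything that looks like a year. The `VOCAB` contents, the year pattern and the naive plural-stripping are all assumptions made for illustration; recognising authors, localities and institutions would need proper authority lists and is not attempted here.

```python
import re

# Tiny illustrative vocabularies; real ones would come from the thesauri.
VOCAB = {
    "taxon": {"stegosaur", "ankylosaur"},
    "age": {"kimmeridgian", "tithonian"},
    "element": {"metacarpal", "phalanx"},
}

# Assumed pattern: a four-digit year from 1800 to 2099.
YEAR = re.compile(r"\b(1[89]\d\d|20\d\d)\b")


def guess_fields(text: str) -> dict:
    """Guess which axis each recognisable word in the text belongs to."""
    guesses = {}
    match = YEAR.search(text)
    if match:
        guesses["year"] = int(match.group(1))
    for word in re.findall(r"[a-z]+", text.lower()):
        # Naive singularisation: drop one trailing "s".
        stem = word[:-1] if word.endswith("s") else word
        for field, terms in VOCAB.items():
            if stem in terms and field not in guesses:
                guesses[field] = stem
    return guesses


print(guess_fields("Kimmeridgian stegosaur metacarpals, 1988"))
```

Anything the vocabularies don't recognise is simply left unassigned, so the quality of the guesses depends directly on the coverage of the thesauri.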

Feedback to <mike@miketaylor.org.uk> is welcome!