Northrop runs your queries against a textbase of 4,373 records from the Dialog databases, all dealing with AIDS or AIDS-related topics. It returns the titles of texts that contain all the words in your query.
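As a rough illustration of this kind of all-words matching, here is a minimal Python sketch. It is not Northrop's actual code: the record layout (a dict with "title" and "text" keys), the tokenizer, and the function names are assumptions made for the example.

```python
import re

def tokenize(text):
    """Lowercase a text and split it into whole words (assumed tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())

def search_all_words(records, query):
    """Return the titles of records whose text contains every query word."""
    wanted = set(tokenize(query))
    # A record matches only if the query words are a subset of its words.
    return [r["title"] for r in records if wanted <= set(tokenize(r["text"]))]

# Hypothetical sample records, invented for illustration only.
records = [
    {"title": "Needle exchange programs",
     "text": "Needle sharing among drug users spreads HIV."},
    {"title": "Vaccine research update",
     "text": "Researchers report on a vaccine trial; sharing of data is planned."},
]

matches = search_all_words(records, "needle sharing")
```

Because matching is on whole words, a query for "needles" would not match a text that contains only "needle".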
You can have the results sorted in either of two ways. A relevance sort orders the texts strictly by the number of occurrences of the search terms. A genre sort first groups the records by textual genre (using the set of ten genres shown in the checkboxes below), then sorts the texts in each genre according to relevance.
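The two sort orders can be sketched as follows. This is only an illustration under stated assumptions: the record layout, the genre labels "editorial" and "reportage", and the alphabetical ordering of genre groups are all hypothetical, and relevance is taken to be the total count of query-word occurrences as described above.

```python
import re
from collections import Counter

def relevance(text, query_words):
    """Total number of occurrences of the query words in the text."""
    counts = Counter(re.findall(r"[a-z0-9]+", text.lower()))
    return sum(counts[w] for w in query_words)

def sort_by_relevance(records, query_words):
    """Order records from most to fewest query-word occurrences."""
    return sorted(records,
                  key=lambda r: relevance(r["text"], query_words),
                  reverse=True)

def sort_by_genre(records, query_words):
    """Group records by genre, keeping relevance order inside each group."""
    ranked = sort_by_relevance(records, query_words)
    # Python's sort is stable, so the relevance order survives within a genre.
    return sorted(ranked, key=lambda r: r["genre"])

# Hypothetical records and genre labels, invented for illustration only.
records = [
    {"title": "A", "genre": "reportage",
     "text": "needle sharing and needle exchange"},
    {"title": "B", "genre": "editorial",
     "text": "needle sharing"},
    {"title": "C", "genre": "reportage",
     "text": "needle needle needle sharing sharing"},
]
query_words = ["needle", "sharing"]

by_relevance = [r["title"] for r in sort_by_relevance(records, query_words)]
by_genre = [r["title"] for r in sort_by_genre(records, query_words)]
```

Here the relevance sort puts record C first regardless of genre, while the genre sort lists the single "editorial" record before the two "reportage" records, which stay in relevance order.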
Northrop does this genre grouping by looking at low-level cues within the text itself -- for example the frequency of punctuation marks and the length of sentences. It does not refer to the metadata labels that are present in some of the records. Its accuracy at genre identification of previously unseen texts is about 92% (note that even humans do not agree with each other all the time when trying to identify genres).
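To give a sense of what such low-level cues might look like, the Python sketch below computes two of them: punctuation frequency and average sentence length. The function name, the exact definitions of the two measures, and the crude sentence splitter are assumptions for the example. The classifier that turns cues into a genre decision is not shown, and Northrop's real cue set is not reproduced here.

```python
import re
import string

def surface_cues(text):
    """Compute two simple surface cues from raw text (assumed definitions)."""
    words = text.split()
    # Crude sentence split on runs of ., ! or ? (an assumption, not a parser).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    n_punct = sum(ch in string.punctuation for ch in text)
    return {
        # Share of characters that are punctuation marks.
        "punct_per_char": n_punct / len(text) if text else 0.0,
        # Average number of whitespace-separated words per sentence.
        "words_per_sentence": len(words) / len(sentences) if sentences else 0.0,
    }

cues = surface_cues("Is it safe? Yes, it is.")
```

Cues like these depend only on the text itself, which is why no metadata labels are needed to compute them.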
To get a good idea of how Northrop's genre identification helps one organize voluminous search results, we recommend that you start by doing a traditional search that ignores genre: push Sort by Relevance. When the results come back, arranged strictly by topic relevance, you will have a chance to reorganize the results by genre. Of course, at any time you have the option of typing in your own search terms (whole words only), for example "needle sharing" or "Luc Montagnier," or of limiting the search to records of a particular genre.
Please note that the data used for this demo were derived from records provided courtesy of Knight-Ridder Information Services, Inc. Since our agreement with Knight-Ridder has expired, we cannot show these records anymore.
Web page author: Brett Kessler / Sep. 25, 1997