Interactive Information Services Using World-Wide Web Hypertext
A Folk Music Database
The Digital Tradition Folk Song Server is a World-Wide Web application
which provides an interactive interface to a database of 4000 songs.
It provides full text search of lyrics, category keyword search, and
audio retrieval of melodies
[Putz2].
The original Digital Tradition database is distributed as a single user
standalone application for the IBM-PC. For the WWW version, the song
lyrics and melodies were converted to a simple ASCII format,
and a Plexus HTTP server module was created using the perl scripting language [Wall1].
Each song has a title, lyrics, and usually one or more category
keywords such as "ballad" or "sailor". Most of the songs also have at
least one melody stored in the database, represented in a simple ASCII
music notation. The lyrics and melodies are stored in two large files,
each with a companion table of contents file. Since the total amount
of text is less than 6 million characters, a typical full text search
can be performed in about four seconds, assuming the text is stored on
a local hard disk. Title and keyword searches can be performed in a
fraction of a second, since the server keeps the list of titles and
keywords in memory. While this kind of linear scan would be too slow
for much larger amounts of data, there are many applications for which
it is well suited. A more sophisticated inverted word index would be
more appropriate for a much larger database.
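The trade-off between a linear scan and an inverted word index can be illustrated with a minimal sketch. It assumes a hypothetical layout in which all lyrics are concatenated into one string and the table of contents holds (title, start, end) offsets; the sample songs, the names `build_store`, `linear_scan`, and `build_inverted_index`, and the layout itself are illustrative and are not taken from the actual server.

```python
import re

# Hypothetical sample data; the real database holds about 4000 songs.
SONGS = [
    ("Spanish Ladies", "Farewell and adieu to you, Spanish ladies"),
    ("Drunken Sailor", "What shall we do with a drunken sailor"),
    ("The Water Is Wide", "The water is wide, I cannot get o'er"),
]

def build_store(songs):
    """Concatenate all lyrics into one string plus a (title, start, end) table."""
    parts, toc, pos = [], [], 0
    for title, lyrics in songs:
        chunk = lyrics + "\n"
        toc.append((title, pos, pos + len(chunk)))
        parts.append(chunk)
        pos += len(chunk)
    return "".join(parts), toc

def linear_scan(text, toc, pattern):
    """Apply one regular expression to every song in turn (cost grows with text size)."""
    regex = re.compile(pattern, re.IGNORECASE)
    return [title for title, start, end in toc if regex.search(text, start, end)]

def build_inverted_index(songs):
    """Map each word to the set of titles containing it (built once, looked up often)."""
    index = {}
    for title, lyrics in songs:
        for word in re.findall(r"[a-z0-9']+", lyrics.lower()):
            index.setdefault(word, set()).add(title)
    return index

TEXT, TOC = build_store(SONGS)
INDEX = build_inverted_index(SONGS)
```

With this layout, `linear_scan(TEXT, TOC, r"\bsailor\b")` reads every song, while `INDEX["sailor"]` is a single dictionary lookup; the index costs extra memory and build time, which is why it pays off mainly for larger collections.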
The Digital Tradition server provides different kinds of information
displays depending on what information is being presented. Each
display is presented as a formatted HTML document with hypertext links.
The two main display formats are a search query and results page
and a song lyrics display.
There are also lists of category keywords, song titles and song
tunes which allow browsing the database without performing a search.
This application provides a word-based text search capability.
Although the underlying implementation uses regular expressions to find
occurrences of search terms, a much simpler query syntax is provided.
While regular expressions can be very powerful for specifying search
patterns, most end users are unprepared to deal with the confusing and
arcane syntax of regular expressions. And the vast majority of users'
searches are for simple word or phrase combinations anyway.
The server accepts queries using the standard HTTP convention of ending a URL with a question mark ("?") character followed by the query terms.
Adjacent words in a query are interpreted as a phrase (i.e. the words must occur together in the specified order for a match to occur).
When the special character "&" is used between query terms (indicating an AND operation), the terms may occur separated and in any order.
When the special character "|" is used between terms (indicating an OR operation), only one of the terms need occur for a match.
Normally each word in a query must exactly match an entire word in the text (ignoring punctuation and case).
However, the special character "*" will match any sequence of letters or digits.
This is most useful for matching variations of a word, such as singular or plural endings.
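The query rules above could be mapped onto regular expressions roughly as follows. This is a sketch under stated assumptions, not the server's actual code: the function names are hypothetical, the host `example.org` is a placeholder, and the precedence of "|" over "&" (an OR of ANDs) is an assumption, since the text does not specify how the two operators combine.

```python
import re
from urllib.parse import urlsplit, unquote_plus

def query_from_url(url):
    """Extract and decode the query terms that follow '?' in a URL."""
    return unquote_plus(urlsplit(url).query)

def phrase_to_regex(phrase):
    """Turn a phrase such as 'drunken sailor*' into a whole-word regex."""
    words = []
    for word in phrase.split():
        # Escape each literal piece, then let '*' match letters or digits.
        pieces = [re.escape(p) for p in word.split("*")]
        words.append(r"[A-Za-z0-9]*".join(pieces))
    # Adjacent words may be separated only by whitespace or punctuation.
    body = r"[^A-Za-z0-9]+".join(words)
    return re.compile(
        r"(?<![A-Za-z0-9])" + body + r"(?![A-Za-z0-9])", re.IGNORECASE
    )

def matches(query, text):
    """Assumed semantics: '|' separates alternatives, '&' separates required terms."""
    for alternative in query.split("|"):
        terms = [t for t in alternative.split("&") if t.strip()]
        if terms and all(phrase_to_regex(t).search(text) for t in terms):
            return True
    return False
```

For example, `matches("sail*", "a drunken sailor")` succeeds because "*" extends the word, while `matches("sail", "a drunken sailor")` fails because a plain term must match an entire word.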
Most songs in the database are categorized with one or more keywords
used to group songs by topic or genre. All lyrics and title searches
also check each song's list of keywords. It is also possible to search
explicitly for keywords by using an "@" character as a prefix (e.g.
"@sailor"). This is useful for keywords (such as "father") which occur
frequently in lyrics on other topics. This categorization scheme and
the use of "@" to mark category keywords was inherited from the
original PC-based version of the database, and it works well with the
regular expression searches used by the WWW server.
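One way to see why the "@" convention fits regular expression searching is the following sketch. It assumes, hypothetically, that each song record stores its keywords as "@"-prefixed words on a line after the lyrics; the sample records and the function `find_songs` are invented for illustration.

```python
import re

# Hypothetical records: lyrics followed by a line of "@"-prefixed keywords.
SONGS = {
    "Song A": "Father, come and see the fair\n@market @dance",
    "Song B": "The old man called his sons around\n@father @farewell",
    "Song C": "Heave away, my jolly boys\n@sailor @work",
}

def find_songs(term, songs):
    """Return titles matching a plain word or an explicit '@keyword'."""
    if term.startswith("@"):
        # Explicit keyword search: the '@' must be present in the text.
        pattern = "@" + re.escape(term[1:])
    else:
        # Plain search: '@' is not a letter or digit, so '@father' also matches.
        pattern = r"(?<![A-Za-z0-9])" + re.escape(term)
    regex = re.compile(pattern + r"(?![A-Za-z0-9])", re.IGNORECASE)
    return sorted(title for title, text in songs.items() if regex.search(text))
```

Under these assumptions a plain search for "father" finds both the song that mentions the word in its lyrics and the song categorized under it, while "@father" returns only the latter.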
As an alternative to the full text search capability, searches may
optionally be performed on just the song titles and keywords. The same
query syntax is used as for full text searches. The user can set a display option to specify whether each song's category keywords are listed under the titles.
Using HTML for presentation of search query results provides a great
deal of flexibility. Using the presentation markup capabilities of
HTML, the Digital Tradition server shows lines of text that match a
search query below each song title with the search terms highlighted.
When the full text is displayed, the search terms are highlighted there
as well.
See the search results
and song lyrics examples.
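Highlighting of this kind can be sketched in a few lines. The example below is an illustration only: the choice of the `<b>` element and the function name `highlight` are assumptions, since the text does not say which HTML markup the server emits.

```python
import html
import re

def highlight(line, words):
    """Escape a line for HTML and wrap whole-word matches in <b> tags."""
    if not words:
        return html.escape(line)
    alternation = "|".join(re.escape(w) for w in words)
    regex = re.compile(
        r"(?<![A-Za-z0-9])(" + alternation + r")(?![A-Za-z0-9])",
        re.IGNORECASE,
    )
    out, last = [], 0
    for match in regex.finditer(line):
        # Escape the text between matches so '<' and '&' stay literal.
        out.append(html.escape(line[last:match.start()]))
        out.append("<b>" + html.escape(match.group(0)) + "</b>")
        last = match.end()
    out.append(html.escape(line[last:]))
    return "".join(out)
```

Escaping the unmatched text separately from the matches keeps characters such as "&" in the lyrics from being interpreted as markup.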
Currently, the only widely supported format for delivering music via the World-Wide Web is telephone-quality sampled digital audio (8-bit u-law). When a client requests a melody in audio format (by selecting a link), the server converts the stored note list into audio samples using a public domain music software package called Csound
[Verc1].
This is a grossly inefficient method of sending simple tunes across a
wide area network. A possible improvement is to provide the melodies to
clients in Standard MIDI Format, as software MIDI emulators are now
becoming available.
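The size penalty of sampled audio can be made concrete with a small sketch. It does not use Csound; it substitutes a plain sine oscillator and the standard G.711 u-law companding step, and it assumes a hypothetical note list of (frequency in Hz, duration in seconds) pairs and an 8000 Hz sample rate, none of which are specified in the text.

```python
import math

SAMPLE_RATE = 8000   # assumed telephone-quality rate, samples per second
BIAS = 0x84          # G.711 u-law bias added before segment lookup
CLIP = 32635         # largest magnitude that fits after adding the bias

def linear_to_ulaw(sample):
    """Encode one signed 16-bit sample as an 8-bit u-law byte."""
    sign = 0x80 if sample < 0 else 0x00
    magnitude = min(abs(sample), CLIP) + BIAS
    # Find the segment (exponent) from the position of the highest set bit.
    exponent, mask = 7, 0x4000
    while exponent > 0 and not (magnitude & mask):
        exponent -= 1
        mask >>= 1
    mantissa = (magnitude >> (exponent + 3)) & 0x0F
    return ~(sign | (exponent << 4) | mantissa) & 0xFF

def render(notes, rate=SAMPLE_RATE):
    """Render a hypothetical (frequency_hz, seconds) note list to u-law bytes."""
    out = bytearray()
    for freq, seconds in notes:
        for n in range(int(rate * seconds)):
            value = int(0.5 * 32767 * math.sin(2 * math.pi * freq * n / rate))
            out.append(linear_to_ulaw(value))
    return bytes(out)

# Two notes lasting 0.75 seconds in total.
AUDIO = render([(440.0, 0.5), (494.0, 0.25)])
```

Under these assumptions, two notes that could be described in a handful of characters expand to 6000 bytes of audio, which illustrates why a note-based format such as Standard MIDI would be far more compact.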
Authored hypertext documents can be combined with the interactive
service in useful ways. Links from the interface point to a variety of
pages which use HTML hypertext for documentation about the
application. Some of the documentation pages in turn include examples
with links that perform sample database queries.