Software Used in this project
Several items of software are used in this project. This document answers
questions about the software and its origins.
Some of the software consists of standard packages and some is bespoke.
Data entry cycle
The main form of data entry for this project is on a small laptop computer
which runs DOS. This enables data entry to be performed during research,
while at various locations, including libraries. I target the lowest common
denominator: I do not assume the existence of Windows or much
machine memory.
The software selected for this is Brother's Keeper. The current version of
Brother's Keeper is designed for Windows, but a DOS-only version is still
available. I keep the versions of BK that I use on my server, but be warned
that these may not be the latest.
After data entry I generate a GEDCOM file, and I find that the Windows
version of BK performs a good job of this function.
I keep a version of the GEDCOM specification on this server, but again
it is not the latest, nor necessarily the version that BK uses for download.
The GED2HTML converter
The DOS-generated GEDCOM is shipped to a Unix server which hosts the
web database. There it is passed through a bespoke GED2HTML converter.
This converter is based upon the converter of Vic Abell of Purdue
University Computer Centre, which in turn was based on the converter of
Frode Kvam of Norway.
Converter evolution
The evolution of the converter started with a simple demonstrator program
(by Frode Kvam)
that took the basic elements of the GEDCOM file and manufactured plain
HTML files from them. This made one file per person, which for a large
genealogy creates a significant number of files. On many systems the creation
of large numbers of files causes difficulties for the file system. Even so, the
software was an excellent demonstrator of the concept of using the Web to
display and exchange genealogy information.
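As a rough illustration of the one-file-per-person idea, the sketch below reads a few basic GEDCOM elements and produces one HTML page per individual. It is not the original demonstrator program. The sample data, the function names (parse_individuals, person_page, build_pages) and the page layout are all assumptions made for this example, and only the NAME tag and the birth DATE are handled.

```python
import html

# Hypothetical sample data, invented for this sketch.
SAMPLE_GEDCOM = """\
0 HEAD
0 @I1@ INDI
1 NAME John /Smith/
1 BIRT
2 DATE 1 JAN 1850
0 @I2@ INDI
1 NAME Mary /Jones/
0 TRLR
"""


def parse_individuals(text):
    """Return {gedcom_id: {"name": ..., "birth": ...}} for each INDI record."""
    people = {}
    current = None   # record being filled in, or None outside an INDI record
    context = None   # most recent level-1 tag, e.g. "BIRT"
    for line in text.splitlines():
        parts = line.split(" ", 2)
        if len(parts) < 2:
            continue
        level = parts[0]
        if level == "0":
            # A level-0 line starts a new record, e.g. "0 @I1@ INDI".
            current = None
            context = None
            if len(parts) == 3 and parts[2] == "INDI":
                current = people.setdefault(parts[1].strip("@"), {})
        elif current is not None:
            tag = parts[1]
            value = parts[2] if len(parts) == 3 else ""
            if level == "1":
                context = tag
                if tag == "NAME":
                    # GEDCOM marks the surname with slashes; drop them.
                    current["name"] = value.replace("/", "").strip()
            elif level == "2" and tag == "DATE" and context == "BIRT":
                current["birth"] = value
    return people


def person_page(person):
    """Render one individual as a plain HTML page."""
    name = html.escape(person.get("name", "Unknown"))
    birth = html.escape(person.get("birth", "unknown"))
    return (
        f"<html><head><title>{name}</title></head>"
        f"<body><h1>{name}</h1><p>Born: {birth}</p></body></html>\n"
    )


def build_pages(text):
    """Map a file name (one per person) to that person's HTML page."""
    return {
        f"{gedcom_id}.html": person_page(person)
        for gedcom_id, person in parse_individuals(text).items()
    }


pages = build_pages(SAMPLE_GEDCOM)
for filename in sorted(pages):
    print(filename)
```

With two individuals this yields two files. A genealogy of tens of thousands of people would yield tens of thousands of files, which is the file system burden described above.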
The next step, made by Vic Abell, was to use the CGI script facility of a
web server to extract the desired record from a database. This method trades
a little performance for file space.
The GEDCOM is converted into a tractable database format which can be read
quickly by the CGI scripts to prepare the necessary HTML page on the fly.
Access to the desired record is made by a key which is the byte offset of
the desired record in the database.
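The byte-offset lookup can be sketched as follows. The real database format is not described here, so this example assumes a hypothetical layout of one record per line with fields separated by "|". The function names build_database and fetch_record are likewise invented for the illustration.

```python
import io


def build_database(records):
    """Write records one per line; return (database_bytes, offsets).

    offsets[i] is the byte position at which records[i] starts.
    """
    buf = io.BytesIO()
    offsets = []
    for record in records:
        offsets.append(buf.tell())
        buf.write(record.encode("latin-1") + b"\n")
    return buf.getvalue(), offsets


def fetch_record(db, offset):
    """Seek straight to the byte offset and read a single record."""
    db.seek(offset)
    return db.readline().rstrip(b"\n").decode("latin-1")


# Hypothetical records; the key handed out in a URL would be the offset.
records = ["I1|John Smith|1850", "I2|Mary Jones|1855"]
db_bytes, offsets = build_database(records)
db = io.BytesIO(db_bytes)
print(fetch_record(db, offsets[1]))  # prints "I2|Mary Jones|1855"
```

The attraction of this scheme is that no searching is needed: a single seek and one read retrieve the record, however large the database is.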
The problem with this approach is that as the data is updated the URLs
for individual records become stale, because the byte offsets change.
This prompted a major rewrite of the converter to include the following
features:
- Convert GEDCOMs using DOS or ISO accented characters into HTML
- Use the GEDCOM IDs as the WWW URL to eliminate stale links
- Generate indexes sorted by forename, surname or both
- Permit data access by name rather than record ID
- Cache certain types of search results for access speed
- Ability to run as set-uid to protect the data
- Ability to protect the data of living persons in a database
- Ability to handle multiple databases on one server with a tree
structured hierarchical naming system
- Backward compatibility to old byte offset URLs to support any old
URLs still in use
- Ability to have multimedia inserts and hyperlinks in GEDCOM notes
properly supported in HTML
- Ability to have illustrated records for individuals
- Ability to cross link databases with separately stored hyper-links
in the style of Hyper-G.
- Symbolic naming of cross links allowing easy renaming of whole datasets
- Allows controlled access to closed databases or records of living persons
- Allows identification of accesses to records not made via the indexes
- Permits access to be restricted to specific sites
- Limits machine resources used for searching
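To illustrate why keying URLs on GEDCOM IDs removes the stale link problem, the sketch below rebuilds an index from ID to byte offset each time the database is written. This is an assumption about how such a scheme might work, not a description of the converter's actual code. The record layout and the names write_database, read_at and lookup are hypothetical.

```python
import io


def write_database(records):
    """records is a list of (gedcom_id, text) pairs.

    Returns (db_bytes, index), where index maps each GEDCOM ID to the
    byte offset of its record in this particular build of the database.
    """
    buf = io.BytesIO()
    index = {}
    for gedcom_id, text in records:
        index[gedcom_id] = buf.tell()
        buf.write(f"{gedcom_id}|{text}\n".encode("latin-1"))
    return buf.getvalue(), index


def read_at(db_bytes, offset):
    """Read the record that starts at the given byte offset."""
    db = io.BytesIO(db_bytes)
    db.seek(offset)
    return db.readline().decode("latin-1").rstrip("\n")


def lookup(db_bytes, index, gedcom_id):
    """Resolve a GEDCOM ID through the index, then read the record."""
    return read_at(db_bytes, index[gedcom_id])


# First build of the database, then a rebuild with one record added.
old_db, old_index = write_database(
    [("I1", "John Smith"), ("I2", "Mary Jones")]
)
new_db, new_index = write_database(
    [("I1", "John Smith"), ("I3", "Ann Smith"), ("I2", "Mary Jones")]
)

# An old URL that carried the raw offset of I2 now lands on another record.
print(read_at(new_db, old_index["I2"]))   # prints "I3|Ann Smith"

# A URL that carries the ID still finds the right record after the rebuild.
print(lookup(new_db, new_index, "I2"))    # prints "I2|Mary Jones"
```

The cost is one extra index lookup per request. In exchange, a link stays valid for as long as the GEDCOM ID of the individual is unchanged.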
Limitations of the Software
The software has several limitations:
- It is not supported
- I change it when I feel like it
- It runs on Unix (only)
- Not much attention has been paid to its portability (yet)
Future enhancements
- Allowing more portability. I run this on two different servers at present
- Allowing non-Unix use. My home web server is a Windows-95 486.
- Support a compressed database
- Support more efficient searching
- Support better searching, such as forms-based queries
Downloading the Software
If you need any help in understanding archive file formats or compression
programs you should read the
help text on dealing with compressed and archived files.
I keep a local copy of a DOS decompressor for Unix-compressed .Z files here,
as well as the GNU Zipper for .gz files.
Installation Instructions
The software is available via FTP from here.
The instructions given in the 00README file may be of some help, but they are
grossly out of date, and if you're anything but a Perl and C hacker you'll
probably be out of your depth!
Brian Tompsett
Department of Computer Science
University of Hull
Hull, UK, HU6 7RX
B.C.Tompsett@dcs.hull.ac.uk