
  Project HOSPEXerver
=====================
          written by:
          Ian Feldman <ianf@random.se>
          Sept-Oct 94


  Abstract
----------
  HOSPEX is a database of individuals (from all parts of the world)
  who have pledged to host one another in their own homes during
  tourist stays.  Anyone may join by mailing a filled-in form to the
  database owner/ maintainer, who then validates it and appends it
  to a log file.  Information about new applicants is disseminated
  manually to all members via a related HOSPEX-L mailing list.
  However, currently the only way for members to search the database
  and extract records is by fetching the entire accumulated logfile
  via FTP, then searching it manually (grep etc.) for entries of
  hosts in countries/ regions/ areas of interest, i.e. a highly
  tedious and restrictive procedure.


  Objective
-----------
  To widen the appeal of the HOSPEX service by linking it to the
  WorldWideWeb, which should simplify access to it for members and
  non-members alike.  It would allow realtime browsing and, for
  members, retrieval of individual records.  An additional target
  is to ease the administration of the service by providing the
  HOSPEX administrator with a front-end capable of automating most
  of the administrative chores (record validation, posting,
  updating).


  Problem description
---------------------

* Technically HOSPEX is a database of page-size ASCII records
  containing a number of free-length fields indexed by plaintext
  named labels.  A label is delimited from the content of its field
  by a trailing colon.  There are no explicit field separators, only
  implicit ones. 

* Because access to the database is restricted to its members, any
  linking of it to the WWW will by necessity require _selective_
  suppression (or other transformation) of parts of the output. 
  Else there would be no incentive for non-members to join up. 

* Any changes **must** still allow for non-anonymous FTP access
  using the simplest of tools: a TTY ftp client and automatic
  ftpmail operations.

* Although an HTTP server can be configured to require full user
  authentication, that only allows or denies access to a service
  as a whole.  Our solution must instead be able to distinguish
  between privileged and non-privileged accesses.  On detection of
  the latter the database must still be readable, but perhaps with
  certain key member-identifying fields suppressed or encrypted.
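
  The record format described in the first point above can be read
  with a very small parser.  What follows is only a sketch in
  Python, under the assumption that a label is an uppercase word at
  the start of a line and that any unlabelled line continues the
  previous field (the "implicit separator").  The sample record and
  the name parse_record are made up for illustration.

```python
import re

# Assumption: a label is an uppercase word at the start of a line,
# delimited from its field content by a trailing colon.
LABEL = re.compile(r"^([A-Z][A-Z0-9 _-]*):\s*(.*)$")


def parse_record(text):
    """Split one plaintext record into a {label: content} dict."""
    fields = {}
    current = None
    for line in text.splitlines():
        match = LABEL.match(line)
        if match:
            # A new label implicitly ends the previous field.
            current = match.group(1).strip()
            fields[current] = match.group(2).strip()
        elif current is not None and line.strip():
            # No label on this line: it continues the current field.
            fields[current] = (fields[current] + " " + line.strip()).strip()
    return fields


# Invented sample data, not taken from the real database.
SAMPLE = """\
NAME: Jane Example
ADDRESS: 1 Example Street,
  Exampletown
EMAIL: jane@example.org
COUNTRY: Sweden
TOWN: Stockholm
"""

if __name__ == "__main__":
    for label, content in parse_record(SAMPLE).items():
        print(label, "=>", content)
```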


  The proposed solution
-----------------------

* The records will continue to be plaintext files, one record/
  member, not marked-up beyond what's already in place (this takes
  care of simple-ftp accesses). 

* Instead of the current flat-file organization, existing records
  will be stored in individual files (= nodes in the future record
  tree), and new records _automatically_ added to the relevant
  directories.  Such files, one per member, will uniformly be named
  '[nn].txt' where nn = a two-digit (leading 0 where so required)
  sequence number of the record within its directory.

* Files will be arranged in a nested, hierarchical directory
  structure reflecting the geographical and regional divisions in
  the real world.  The organization will allow parallel control/ 
  navigation files, so browsing can be done along the lines of 
  'Continent/ Country/ Town' as well as by 'Continent/ Region/ 
  Country/ State/ Town/' where so required.

* The navigation/ browsing structure will be stored in separate
  documents marked up as HTML and updated/ regenerated automatically
  by the HOSPEXerver each time the database has been changed.  In
  this manner the database will always stay up to date.  The entire
  _primary_ structure will be made up ONLY of text/plain member
  records named '[nn].txt' and of 'index.html' documents at _all_
  levels of it.

* Some indexes, eg. those for geographic regions like
  'Scandinavia' in Europe and 'EastCoast' in the USA, will not
  EVER be changed/ updated.  Such documents can be considered
  'static', and should therefore stay locked to prevent their
  accidental regeneration when the whole control structure is
  being updated.

* When the database is accessed by a member, the HOSPEXerver will
  in realtime transform the requested text/plain record into a
  fully-qualified HTML data stream, with the EMAIL: field in the
  record made into a clickable HTML mailto: anchor pointing to the
  member's email address.

* When the database is accessed by a non-member, the HOSPEXerver
  will in realtime selectively suppress the content of the NAME:,
  ADDRESS: and PHONE: fields, while changing the EMAIL: field into
  a clickable HTML mailto: anchor pointing to the hospex-request or
  equivalent address.  See the samples (ALL leading-dot-files are
  simulations of the relevant [nn].txt and index.html files).
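
  The two realtime transformations in the last two points can be
  sketched as one function with a member/non-member switch.  This is
  an illustration in Python, not the HOSPEXerver itself: the HTML
  layout, the "(members only)" placeholder text and the address
  hospex-request@example.org are assumptions, and the record is
  taken to be already split into labelled fields.

```python
from html import escape

# Fields whose content is withheld from non-members (from the text above).
SUPPRESSED = ("NAME", "ADDRESS", "PHONE")
# Hypothetical placeholder; the real hospex-request address is not given here.
REQUEST_ADDRESS = "hospex-request@example.org"


def render_record(fields, is_member):
    """Turn a {label: content} record into an HTML data stream."""
    rows = []
    for label, value in fields.items():
        if label == "EMAIL":
            # Members get the host's address, others the request address.
            target = value if is_member else REQUEST_ADDRESS
            shown = '<a href="mailto:%s">%s</a>' % (escape(target), escape(target))
        elif label in SUPPRESSED and not is_member:
            shown = "(members only)"
        else:
            shown = escape(value)
        rows.append("<dt>%s:</dt><dd>%s</dd>" % (escape(label), shown))
    return "<html><body><dl>\n%s\n</dl></body></html>" % "\n".join(rows)


if __name__ == "__main__":
    # Invented sample data.
    record = {"NAME": "Jane Example", "EMAIL": "jane@example.org",
              "COUNTRY": "Sweden"}
    print(render_record(record, is_member=False))
```

  Suppression is done per field rather than per record, so a
  non-member still sees country, town and any other descriptive
  fields, which is what keeps the database browsable.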


  Proposed directory structure
------------------------------

    Continent1/
    Continent2/
    ContinentN/
              index.html
              Country1/
              Country2/
              Country3/
              RegionA.html
              RegionB/
              |      index.html
              CountryN/
              |       index.html
              |       Town1/
              |       Town2/
              |       RegionA/
              |       |      index.html
              |       RegionB.html
              |       TownN/
              |       |    index.html
              |       |    01.txt
              |       |    02.txt
              |       |    03.txt
              |       |    nn.txt 
 

* Continents: Africa, AU-NZ, Asia, Europe, NAmerica, SAmerica

* Country-Level Regions: eg. Scand{inavia} in Europe; E{ast}Coast
  in NAmerica; FarEast in Asia. (more)

* Town-Level Regions: contain mainly indexes pointing to countries
  one level up (eg. American states), and towns on the same level.
  (more).

* Observe that there may be both Region directories with a single
  index.html document in them AND RegionN.html indexes in their
  own right.
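
  To show how a record could be mapped onto this tree, here is a
  small Python sketch that builds the 'Continent/ Country/ Town'
  directory for one record.  The continent names are the ones listed
  above; the country-to-continent table, the removal of spaces from
  town names and the function name record_directory are assumptions
  made for the example only.

```python
from pathlib import PurePosixPath

# Top-level directories, as listed above.
CONTINENTS = ("Africa", "AU-NZ", "Asia", "Europe", "NAmerica", "SAmerica")

# Hypothetical lookup table; a real one would cover every country.
COUNTRY_TO_CONTINENT = {
    "Sweden": "Europe",
    "Japan": "Asia",
    "Canada": "NAmerica",
}


def record_directory(country, town):
    """Return the relative directory in which a member record belongs."""
    continent = COUNTRY_TO_CONTINENT.get(country)
    if continent not in CONTINENTS:
        raise LookupError("Don't know where to place %s/%s" % (country, town))
    # Assumption: directory names carry no spaces (cf. 'EastCoast').
    return PurePosixPath(continent, country, town.replace(" ", ""))


if __name__ == "__main__":
    print(record_directory("Sweden", "Stockholm"))
```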


  Maintainer usage scenario
---------------------------

( 1) A new HOSPEX member application arrives via mail.  The
     maintainer ('M') saves it in a separate file, here called
     file0. 

( 2) Using a dedicated shell script or command M submits file0
     to the HOSPEXerver for validation, cleaning up and
     reformatting.  Because records will later be partly enhanced/
     transformed to HTML in realtime, all files must be reformatted
     according to the supplied (or other) samples.  Output is
     'file1'.

( 3) As part of the validation process the HOSPEXerver extracts a
     few keywords with which to construct the suitable path to the
     file-record.  If the database does not contain the necessary
     directories it asks the maintainer whether to create them.
     Else it checks the number of [nn].txt files in the target
     directory, increments that number by 1, stores file1
     automatically as [nn+1].txt and sets up a flag for update of
     the relevant 'index.html' file or files along the path.

( 4) Periodically M may issue an 'update [dir]' command to rewrite
     the index file(s), which will then reset the flag and clear
     any associated memory registers etc. Alternatively, on quit
     the HOSPEXerver checks the status of the flag and asks
     whether to update/ rewrite the indicated file(s).

( 5) After file1 has been incorporated in the database the 
     HOSPEXerver automatically posts it to the HOSPEX-L mailing 
     list.

( 6) Occasionally M will need to remove members, for which there
     will also be a special command.
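
  Step ( 3) can be sketched as follows in Python.  Two things
  differ from the description and are assumptions of the sketch:
  missing directories are created without asking the maintainer,
  and the next number is taken from the highest existing [nn].txt
  rather than from a plain file count, so that a deleted record does
  not lead to a name collision.  The flag file '.needs-update' and
  both function names are hypothetical.

```python
import re
from pathlib import Path

RECORD_NAME = re.compile(r"^(\d{2})\.txt$")


def next_record_name(directory):
    """Return the next free two-digit record name in a directory."""
    numbers = [int(m.group(1)) for p in Path(directory).iterdir()
               if (m := RECORD_NAME.match(p.name))]
    return "%02d.txt" % (max(numbers, default=0) + 1)


def store_record(file1, directory):
    """Copy a formatted record into place and flag the index as stale."""
    target_dir = Path(directory)
    # Simplification: create the path silently instead of asking M.
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / next_record_name(target_dir)
    target.write_text(Path(file1).read_text())
    # Hypothetical flag: an empty marker file next to index.html.
    (target_dir / ".needs-update").touch()
    return target


if __name__ == "__main__":
    import tempfile
    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "file1"
        source.write_text("NAME: Jane Example\n")
        print(store_record(source, Path(tmp) / "Europe/Sweden/Stockholm"))
```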


  End-usage scenario
--------------------

* Members are sent instructions on how to configure their WWW
  or other front-ends to access the database in privileged
  fashion. More specifically, they are told how to construct
  a base URL with their password already inserted, and how
  to ensure that any {Mosaic | browser} .global-browse-history
  files or equivalent, normally readable by all, are made
  accessible only to themselves.

* Non-members are provided with the URL to the top-level index.

* The HTTP server (not HOSPEXerver!) listens to both the default
  HTTP and FTP ports and notes on which of them a request arrived
  (non-member or member, here referred to as http- and ftp-type
  accesses respectively).  As long as the URL of the pointed-to file
  does not end in '.txt', it provides normal http-server services
  (navigation and delivery of 'index' and other .html documents).
  Once the URL is that of a .txt-suffixed file it calls the
  HOSPEXerver with the argument for the current type of access and
  the URL.

* The HOSPEXerver extracts the requested .txt file and transforms
  it into an HTML datastream according to the supplied rules
  (samples), which it then returns to the HTTP server for
  forwarding to the WWW client.
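
  The routing decision made by the HTTP server in the scenario
  above is small enough to sketch in Python.  The command name
  'enhance' and its '-http' / '-ftp' arguments are the ones proposed
  in this document; the function name dispatch and the stripping of
  the leading slash to form a partial URL are assumptions.

```python
def dispatch(url_path, access_type):
    """Return the HOSPEXerver command for a request, or None.

    None means the HTTP server handles the request itself
    (index.html and other non-record documents).
    """
    if access_type not in ("http", "ftp"):
        raise ValueError("unknown access type: %r" % access_type)
    if not url_path.endswith(".txt"):
        return None
    # Assumption: the partial URL is the path without its leading slash.
    return ["enhance", "-" + access_type, url_path.lstrip("/")]


if __name__ == "__main__":
    print(dispatch("/Europe/Sweden/index.html", "http"))
    print(dispatch("/Europe/Sweden/Stockholm/01.txt", "ftp"))
```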

  

  HOSPEXerver scripts and commands
----------------------------------

REALTIME cmds issued by the  !
HTTP server in response to   !
request from HTTP / FTP port ! 

% enhance -http path-to-file # on detection of a request for a HOSPEX file
                             # incoming from the HTTP port (default '80'),
                             # ie a common, non-privileged access attempt,
                             # the HTTP server calls the HOSPEXerver with
                             # the 'http' argument and the partial-URL

% enhance -ftp path-to-file  # on detection of a request for a file coming
                             # from the FTP port (default '21'), ie access
                             # from a HOSPEX member, the HTTP server issues
                             # a call for acceptable MIME types, and checks
                             # if 'text/html' is returned. If not then it
                             # assumes that the request came from an FTP
                             # client, calls the ftp daemon and delivers the
                             # file as plaintext. No transformations are 
                             # attempted. Else it calls the HOSPEXerver with 
                             # the 'ftp' argument and the partial-URL.
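
The MIME-type check described for the ftp-type access could look
roughly like the Python sketch below.  It assumes that the list of
acceptable types reaches the server as a comma-separated string in
the style of an HTTP Accept header, and it looks only for a literal
'text/html' as the description says (wildcards are not honoured).
The function names are hypothetical.

```python
def accepts_html(accept_header):
    """True if 'text/html' is among the client's acceptable MIME types."""
    for item in accept_header.split(","):
        # Drop any parameters such as ';q=0.8' before comparing.
        media_type = item.split(";", 1)[0].strip().lower()
        if media_type == "text/html":
            return True
    return False


def member_delivery(accept_header):
    """Choose how a member request for a record is answered."""
    # No text/html on offer: assume a plain FTP client, send the file as is.
    return "html" if accepts_html(accept_header) else "plain"


if __name__ == "__main__":
    print(member_delivery("text/plain, text/html;q=0.9"))
    print(member_delivery(""))
```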


HOSPEXerver BATCH Commands   ! Raw source file is assumed to be a single RFC
suitable for assembly into   ! 822 mail-message with a HOSPEX member form.   
an executable shell script   ! 
for web-administration duty  ! ____________________________ Unix file syntax 

% formatRecord file          # verify/ clean-up/ modify a raw source text;
                             # save in work directory (name doesn't matter).
                             # Cleaning up means stripping off unwanted
                             # mailheaders and reformatting the text acc.
                             # to sample. Output file is all-plaintext,
                             # not HTML.

% addRecord file [ dir ]     # if dir not present parse the content of the 
                             # COUNTRY:  and TOWN: items in <file>, extract 
                             # and construct a path in the form of 
                             # /hospex/country/town/, then attempt to store 
                             # file as "[nn].txt", where nn = one more than
                             # the number of records already in that
                             # directory; automatically create new
                             # subdirectories as needed, named after each
                             # (new) country/ region/ town; if unable to
                             # determine the destination return a verbose
                             # error message "Don't know where to place
                             # country/town"

% deleteRecord file [ dir ]  # parse the content of the Reply-To: or, if
                             # absent, the From: header in <file>, extract
                             # the address and search the database for it;
                             # if found, delete the record in question and
                             # set up a flag for later regeneration of
                             # index.html in the directory where it was
                             # found. If unable to find it, exit with a
                             # verbose error message "Member <address> not
                             # found in database"

% update [ dir ]             # if unlocked, regenerate the index.html for
                             # the given dir; formatting rules differ
                             # depending on the level at which the update
                             # is being attempted.

% update all                 # regenerate the entire control hierarchy
                             # of the database, but replace only the
                             # unlocked index.html documents. Differing
                             # formatting rules depending on the level
                             # on which update is being attempted.

% lock [ file ] [ dir ]      # toggle lock of (any index) <file> in current
                             # or indicated directory; return a verbose
                             # message "country/town/index.html NOW locked"
                             # or "country/town/index.html NOW unlocked"

% announce file              # sends <file> to the HOSPEX-L mailing list 

% quit                       # check status of the regenerate-structure
                             # flag prior to exiting. If set, ask whether
                             # to update/ regenerate the control structure 
                             # of the database (unlocked documents only)
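
Since the raw source file is assumed to be a single RFC 822 mail
message, two of the batch commands above start with the same kind
of header handling.  The Python sketch below uses the standard
email package to show the address lookup that deleteRecord needs
(Reply-To:, else From:) and the header stripping that formatRecord
begins with.  The function names and the sample message are made up,
and the reformatting of the text according to the samples is not
attempted here.

```python
from email import message_from_string
from email.utils import parseaddr


def member_address(raw_message):
    """Return the address in Reply-To:, or in From: if that is absent."""
    msg = message_from_string(raw_message)
    header = msg.get("Reply-To") or msg.get("From") or ""
    # parseaddr returns (display name, address); keep the address only.
    return parseaddr(header)[1]


def strip_mail_headers(raw_message):
    """Return the message body only, as plaintext (first step of cleanup)."""
    return message_from_string(raw_message).get_payload().strip() + "\n"


# Invented sample message.
RAW = (
    "From: Jane Example <jane@example.org>\n"
    "Subject: HOSPEX form\n"
    "\n"
    "NAME: Jane Example\n"
    "EMAIL: jane@example.org\n"
)

if __name__ == "__main__":
    print(member_address(RAW))
    print(strip_mail_headers(RAW), end="")
```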


