DOCUMENTATION for CALFIT, CALCAT, CALMRG

                                 ABSTRACT

This program set forms an environment for uniform fitting of rotation-
vibration spectra to molecular parameters and for predicting the full
spectrum based on these parameters. The programs were developed for
performing calculations for the JPL Submillimeter, Millimeter and Microwave
Spectral Line Catalog, but are useful for a variety of related tasks in
molecular spectroscopy. These programs require a set of subroutines for a
specific problem. The two subroutine sets included are (1) for doublet Pi
linear molecules with one nuclear spin and (2) for asymmetric rotors with
up to 9 interacting vibrational states and 5 spins. These programs are
written in FORTRAN and have been designed to be easily ported to a variety
of different computers. They have been tested on ten different computer
systems ranging in capability from an IBM PC-XT to a Cray YMP. The program
can fit up to 32767 experimental lines to 180 parameters. CALFIT fits the
experimental lines, and CALCAT generates the catalog predictions. The
catalog predictions from CALCAT are in pseudo-random order and contain only
predictions. To form a full catalog file in which the lines are sorted and
the experimental line parameters are merged, use a sorting program followed
by CALMRG.

1. Description of Problem: High resolution molecular spectroscopy generates
frequency measurements with accuracy of 1 part in 10,000,000 which can be
fit to a molecular Hamiltonian with relatively few parameters. These
parameters can then be used to generate highly accurate predictions of
molecular absorption for literally thousands of lines. The JPL
Submillimeter, Millimeter and Microwave Spectral Line Catalog was created
to provide a uniform computer-accessible compendium of these predictions
for use by the scientific community.

2. Method of solution: This program set was developed as a general suite
of programs that provides a uniform format for the variety of molecular
species covered by the catalog. The computation is divided into distinct
programs so that the operations of fitting and predicting are performed by
separate executable units. Subroutine sets are
included (1) for doublet Pi linear molecules with one nuclear spin and (2)
for asymmetric rotors with up to 9 interacting vibrational states and 5
spins. The fitting programs are called DPFIT and SPFIT, respectively. The
predicting or cataloging programs are called DPCAT and SPCAT.

3. Program Language: FORTRAN-77. System-specific calls are segregated into
a separate subroutine file to ease conversion to new systems.

4. Machine Requirements: Subroutine sets included for generic system, IBM
PC (Microsoft FORTRAN), HP1000A, PRIME, VAX-VMS, SUN-UNIX, ALLIANT-UNIX,
CDC, and CRAY-UNICOS.

5. User Operating Instructions: The input and output data files are
distinguished by unique file extensions. In some computer implementations
the files can be specified on the command line, or if the command line is
blank, interactively from the standard input device/file. In other
implementations the files are specified from the standard input only. In
either case, the first input is a generic name used to supply a default
file name with the standard file extensions. This first input need not have
a file extension. Subsequent entries provide alternate names for specific
files using the first 3 characters of the supplied file extension to match
the standard definitions. In interactive mode, requests for files are
terminated by entering no file name in response to the prompt. File names
must be entered with a lower case extension. The standard output device/file
provides information on the progress of the calculation. On the HP1000, a
file extension of .log is used to redirect the standard output to a file. On
many systems, entry of ^C (c with the control key depressed) will cause a
graceful conclusion of the programs. 

In the following list 'file' can be any name which is legal to the file
system and can include a path designation.

     file.par  is the parameter input/output file for CALFIT
     file.lin  is the input list of experimental lines for CALFIT & CALMRG
     file.fit  is the printable output file for CALFIT
     file.bak  is produced by CALFIT. It is a backup of file.par
     file.var  is a CALFIT output file and CALCAT input file containing
                  parameters and variances in file.par format
     file.unf  is a CALFIT output file and CALCAT input file containing
                  parameters and variances in binary for extra precision
     file.int  is the intensity input file for CALCAT
     file.out  is the printable output file for CALCAT
     file.cat  is the catalog output file for CALCAT and input for CALMRG
     file.egy  is an energy, parameter derivative, and eigenvector output
                  file for CALCAT
     file.str  is a transition dipole output file from CALCAT
     file.mrg  is the merged output for CALMRG
     file.bad  is an output file from CALMRG in file.lin format
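
The file naming rules above can be illustrated with a short Python sketch.
It is not part of the program set. The function name resolve_files and the
table STANDARD_EXTENSIONS are hypothetical, and the sketch assumes that any
extension on the first (generic) input is simply dropped.

```python
import os

# Standard extensions taken from the list above, grouped by program.
STANDARD_EXTENSIONS = {
    "CALFIT": ("par", "lin", "fit", "bak", "var", "unf"),
    "CALCAT": ("var", "unf", "int", "out", "cat", "egy", "str"),
    "CALMRG": ("lin", "cat", "mrg", "bad"),
}


def resolve_files(program, generic, overrides=()):
    """Map each standard extension to a file name for one program run."""
    # Assumption: an extension on the generic name, if present, is ignored.
    base, _ = os.path.splitext(generic)
    files = {ext: base + "." + ext for ext in STANDARD_EXTENSIONS[program]}
    for name in overrides:
        # Match on the first 3 characters of the supplied extension.
        ext = os.path.splitext(name)[1][1:4]
        if ext in files:
            files[ext] = name
    return files


if __name__ == "__main__":
    for ext, name in resolve_files("CALFIT", "hco", ["run2.lin"]).items():
        print(ext, name)
```

Here every CALFIT file takes the default name hco.xxx except the line list,
which is read from run2.lin.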

FORMAT of file.par and file.var : 
 
line 1: title
line 2 [freeform]: NPAR, NLINE, NITR, NXPAR, THRESH, ERRTST, FRAC, CAL

     NPAR   = maximum number of parameters
     NLINE  = maximum number of lines
     NITR   = maximum number of iterations
     NXPAR  = number of parameters to exclude from end of list when fitting 
              special lines (see notes)
     THRESH = matrix singularity test for fitting (minimum magnitude of diagonal
                 element of upper triangular fit matrix after column normalized
                 to maximum value)
     ERRTST = maximum [(obs-calc)/error]
     FRAC   = fractional importance of variance
     CAL    = scaling for infrared line frequencies

     (CALCAT uses only NPAR from this line.)

line 3-m: option information [freeform] ( see DPI.DOC and SPINV.DOC)
                                        ( usually  1 line )
line (m+1)-n [freeform]: IDPAR,PAR,ERRPAR

     IDPAR  = parameter identifier (see DPI.DOC,SPINV.DOC)
     PAR    = parameter value
     ERRPAR = a priori error in parameter

line (n+1)-end [8F10.6]: ( ( V(i,j),j=i,NPAR ) ,i=1,NPAR )
     V = Choleski decomposition of the correlation matrix, optional for file.par
  
notes: In the freeform input, the variables are all preset to reasonable
       default values. The input numbers can be separated by any character
       not usually found in an E or F formatted number. A space or comma is
       recommended. Two successive commas indicate that the default value is
       to be used for that variable. At the end of the line or when a '/'
       character is encountered, all unspecified variables remain set to
       their default values. PAR defaults to zero. ERRPAR defaults to a very
       large number for CALFIT and to zero for CALCAT.
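
       The freeform rules above can be sketched in Python as follows. This
       is an illustration only, not the reader used by the programs. The
       name read_freeform is hypothetical, and the sketch assumes that only
       blanks and commas are used as separators (the recommended usage).

```python
import re


def read_freeform(line, defaults):
    """Return the default values, overridden by the numbers found in line."""
    values = list(defaults)
    # Everything after a '/' is ignored; remaining variables keep defaults.
    text = line.split("/", 1)[0].strip()
    # A comma with optional surrounding blanks counts as one separator.
    text = re.sub(r"\s*,\s*", ",", text)
    # Any remaining run of blanks is also a separator.
    text = re.sub(r"\s+", ",", text)
    if not text:
        return values
    for i, field in enumerate(text.split(",")):
        if i >= len(values):
            break
        if field:  # an empty field (two successive commas) keeps the default
            # Accept FORTRAN 'D' exponents as well as 'E'.
            values[i] = float(field.replace("D", "E").replace("d", "e"))
    return values


if __name__ == "__main__":
    print(read_freeform("10, 20,,5 /99", [1.0, 2.0, 3.0, 4.0, 5.0]))
```

       In this example the third variable keeps its default because of the
       two successive commas, and the fifth keeps its default because the
       '/' ends the input.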

       If an end-of-file or error is encountered before the parameters
       are read in, NPAR is set to the number read to that point. If an
       end-of-file or error is encountered before V is completely read
       in, V is set to a unit matrix. CALCAT will attempt to get V from
       file.unf if it exists.

       Special lines to which NXPAR applies are lines in which the F
       quantum number is negative.  In the quantum number assignment
       process in the program, the line is flagged and F is set to an
       appropriate value. When derivatives are accumulated, the last NXPAR 
       derivatives are ignored, and the energies are corrected by subtracting
       the first order contribution of these parameters. If F < -1, the 
       absolute value of F is used in the energy calculation. If F = -1,
       the F used is as close to the previous spin quantum number as angular
       momentum addition rules allow.  The value selected will be shown in
       the fit file line listing in place of the -1.

       The file.par is copied to file.bak by CALFIT.  The file.par file is
       over-written with new parameters. (In VAX VMS the new file.par file
       has a new version number, and the old version is not over-written.)
 
       If IDPAR is less than zero the magnitude is taken. In CALFIT, the 
       parameter value will be constrained to be a constant ratio of the 
       preceding parameter value. In this way linear combinations of
       parameters can be fit as a unit.
 
  
FORMAT of file.lin:
line 1-NLINE [12I3,freeform]: QN,FREQ,ERR,WT
  
     QN   = 12 integer field of quantum numbers. Interpreted in a multiple I3 format
             as the quantum numbers for the line (upper quanta first, followed 
             immediately by lower quanta). Unused fields can be used for annotation.
             The entire field is printed in file.fit
     FREQ = frequency in MHz or wavenumbers
     ERR  = experimental error. Minus sign means that the frequency and error are in
            units of wavenumbers. FREQ and ERR will be converted internally to units
            of MHz.
     WT   = relative weight of line within a blend (normalized to unity by program)
  
notes: If an end-of-file is encountered before all the lines are read
       in, NLINE is set to the number read to that point.  If successive
       lines have the same frequency, the lines will be treated as a
       blend and derivatives will be averaged using WT/ERR. Any lines
       with format errors will be ignored.

       The freeform input begins in column 37 and extends to the end of the
       line. See the file.par notes for more on the freeform input.
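
       A minimal Python sketch of reading one file.lin line is given below.
       It is not taken from CALFIT. The name parse_lin_line is hypothetical,
       the default of 1.0 for WT is an assumption, and the sketch ignores
       the '/' and double-comma rules of the freeform input.

```python
import re

# Conversion factor: speed of light expressed in MHz per wavenumber (cm-1).
MHZ_PER_WAVENUMBER = 29979.2458


def parse_lin_line(line):
    """Return (QN, FREQ in MHz, ERR in MHz, WT) for one file.lin line."""
    qn_text = line[:36].ljust(36)          # 12I3 field, columns 1-36
    qn = []
    for k in range(12):
        cell = qn_text[3 * k:3 * k + 3].strip()
        try:
            qn.append(int(cell))
        except ValueError:
            qn.append(None)                # blank field or annotation text
    # Freeform part starts in column 37.
    fields = [f for f in re.split(r"[,\s]+", line[36:].strip()) if f]
    freq = float(fields[0])
    err = float(fields[1])
    wt = float(fields[2]) if len(fields) > 2 else 1.0   # assumed default
    if err < 0:
        # Minus sign on ERR: FREQ and ERR were given in wavenumbers.
        freq *= MHZ_PER_WAVENUMBER
        err = -err * MHZ_PER_WAVENUMBER
    return qn, freq, err, wt


if __name__ == "__main__":
    sample = "  2  1  1" + " " * 27 + " 100.0 -0.001 0.5"
    print(parse_lin_line(sample))
```

       A line with too few numbers raises an exception in this sketch; a
       caller could catch it and skip the line, in the spirit of the rule
       that lines with format errors are ignored.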
  
FORMAT of file.int:
line 1: title
line 2 [freeform]: FLAGS,TAG,QROT,FBGN,FEND,STR0,STR1,FQLIM,TEMP
 
     FLAGS  = IRFLG*1000+OUTFLG*100+STRFLG*10+EGYFLG
          IRFLG  = 1 if constants are in wavenumbers
          IRFLG  = 0 if constants are in MHz
          OUTFLG = 0 for short form file.out
          STRFLG = 1 to enable file.str output
          EGYFLG > 0   to enable file.egy energy listing
                 = 2,4 to enable file.egy derivative listing
                 = 3,4 to enable file.egy eigenvector listing
                 > 4 to dump Hamiltonian with no diagonalization
     TAG    = catalog species tag (integer)
     QROT   = partition function for TEMP
     FBGN   = beginning integer F quantum (round up)
     FEND   = ending integer F quantum (round up)
     STR0,STR1 = log strength cutoffs
     FQLIM  = frequency limit in GHz
     TEMP   = temperature for intensity calculation in degrees K (default is 300K)
  
line 3-end [freeform]: IDIP,DIPOLE
  
     IDIP   = dipole identifier (see DPI.DOC,SPINV.DOC)
     DIPOLE = dipole value

notes: The freeform input is defined above in the notes for file.par.

     The maximum log of the line strength output to file.cat from CALCAT
     must be greater than STR0+STR1*(frequency/300GHz)**2. Both STR0 and
     STR1 default to -100.
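
     The packing of FLAGS and the strength cutoff expression above can be
     written out as a short Python sketch. The function names decode_flags
     and strength_cutoff are hypothetical and are not part of CALCAT.

```python
def decode_flags(flags):
    """Split FLAGS = IRFLG*1000 + OUTFLG*100 + STRFLG*10 + EGYFLG."""
    irflg, rest = divmod(flags, 1000)
    outflg, rest = divmod(rest, 100)
    strflg, egyflg = divmod(rest, 10)
    return {"IRFLG": irflg, "OUTFLG": outflg,
            "STRFLG": strflg, "EGYFLG": egyflg}


def strength_cutoff(freq_ghz, str0=-100.0, str1=-100.0):
    """Evaluate STR0 + STR1*(frequency/300GHz)**2 for a frequency in GHz."""
    return str0 + str1 * (freq_ghz / 300.0) ** 2


if __name__ == "__main__":
    print(decode_flags(1011))
    print(strength_cutoff(300.0, -8.0, -7.0))
```

     For example, FLAGS = 1011 means constants in wavenumbers, short form
     file.out, file.str output enabled, and the energy listing enabled.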

FORMAT of file.cat and file.mrg:
[F13.4,2F8.4,I2,F10.4,I3,I7,I4,12I2]: FREQ,ERR,LGINT,DR,ELO,GUP,TAG,QNFMT,QN

     FREQ  = Frequency of the line
     ERR   = Estimated or experimental error (999.9999 indicates error is larger)
     LGINT = Base 10 logarithm of the integrated intensity in units of nm**2 MHz
     DR    = Degrees of freedom in the rotational partition function
             (0 for atoms, 2 for linear molecules, and 3 for nonlinear molecules)
     ELO   = Lower state energy in wavenumbers
     GUP   = Upper state degeneracy
     TAG   = Species tag or molecular identifier. A negative value flags that the
             line frequency has been measured in the laboratory.  The absolute value 
             of TAG is then the species tag (as given in line 2 of file.int above)
             and ERR is the reported experimental error. 
     QNFMT = Identifies the format of the quantum numbers given in the field QN.
             (see DPI.DOC, SPINV.DOC)
     QN(12)= Quantum numbers coded according to QNFMT. Upper state quanta start in
             element 1. Lower state quanta start in element 7. Unused quanta are blank,
             quanta whose magnitude is larger than 99 are shown as **. Quanta between
             -10 and -19 are shown as a0 through a9. Similarly, -20 is b0, etc., up to
             -99, which is shown as i9.

note: Further discussion of these fields is given in R. L. Poynter and
      H. M. Pickett, "Submillimeter, millimeter and microwave spectral line
      catalog," Applied Optics 24, 2235-2240 (1985).
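
      As an illustration of the fixed-column layout above, the following
      Python sketch reads one line of file.cat or file.mrg. It is not part
      of the program set. The names decode_qn, parse_cat_line, and OVERFLOW
      are hypothetical, and only the quantum number codes described above
      are handled.

```python
OVERFLOW = "**"                       # marker for magnitudes above 99
CAT_WIDTHS = (13, 8, 8, 2, 10, 3, 7, 4)   # F13.4,2F8.4,I2,F10.4,I3,I7,I4


def decode_qn(cell):
    """Decode one 2-character quantum number field."""
    cell = cell.strip()
    if not cell:
        return None                   # unused quantum number
    if cell == OVERFLOW:
        return OVERFLOW
    if cell[0].isalpha():
        # a0..a9 = -10..-19, b0 = -20, ..., i9 = -99
        tens = ord(cell[0]) - ord("a") + 1
        return -(10 * tens + int(cell[1]))
    return int(cell)


def parse_cat_line(line):
    """Return a dictionary of the fields in one catalog line."""
    line = line.rstrip("\n").ljust(79)
    cells, pos = [], 0
    for width in CAT_WIDTHS:
        cells.append(line[pos:pos + width])
        pos += width
    tag = int(cells[6])
    qn = [decode_qn(line[pos + 2 * k:pos + 2 * k + 2]) for k in range(12)]
    return {
        "FREQ": float(cells[0]), "ERR": float(cells[1]),
        "LGINT": float(cells[2]), "DR": int(cells[3]),
        "ELO": float(cells[4]), "GUP": int(cells[5]),
        "TAG": abs(tag), "MEASURED": tag < 0,   # negative TAG: lab line
        "QNFMT": int(cells[7]),
        "UPPER": qn[:6], "LOWER": qn[6:],
    }


if __name__ == "__main__":
    sample = "%13.4f%8.4f%8.4f%2d%10.4f%3d%7d%4d" % (
        88631.6022, 0.0001, -2.5, 2, 0.0, 3, -27001, 101)
    sample += " 1" + " " * 10 + " 0"
    print(parse_cat_line(sample))
```

      The sample line is synthetic and is built only to exercise the column
      positions; it is not a catalog entry.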

Format of file.str:
[F15.4,E15.6,I5,1X,12A2]: FREQ,DIPOLE,QNFMT,QN
     FREQ  = Frequency of the line
     DIPOLE= Reduced matrix element of the transition dipole
     QNFMT = Identifies the format of the quantum numbers given in the field QN.
             (see DPI.DOC, SPINV.DOC)
     QN(12)= Quantum numbers coded according to QNFMT. Upper state quanta start in
             element 1. Lower state quanta start in element 7. Unused quanta are blank,
             quanta whose magnitude is larger than 99 are shown as **. Quanta between
             -10 and -19 are shown as a0 through a9. Similarly, -20 is b0, etc., up to
             -99, which is shown as i9.

Format of file.egy:
energy output [2I5,2F18.6,6I3]: IBLK,INDX,EGY,ERR,QN
     
     IBLK = Internal Hamiltonian block number
     INDX = Internal index within the Hamiltonian block
     EGY  = Energy in wavenumbers
     ERR  = Expected error of the energy in wavenumbers
     QN(6)= Quantum numbers for the state

derivative output(follows energy output) : integer is the index of the parameter 
       (see file.fit or file.out); value is the derivative of the energy with 
       respect to the parameter.
eigenvector output (printed after derivative, if present): integer is the index of
       the basis function; value is the eigenvector with a phase such that the largest
       value is positive.
Hamiltonian dump output: integer is the index of the basis function

The formats of file.fit and file.out are intended for printing, and should be
labeled well enough to be self-explanatory. The format of file.unf is FORTRAN
unformatted binary.

 
6. Installation Instructions: System specific instructions are included in
the subroutine files which start with the letters SLIB.  
     CALCAT.FOR, CALFIT.FOR, and CALMRG.FOR are the main programs.
     SUBFIT.FOR is supplementary to CALFIT.
     ULIB.FOR & BLAS.FOR are generic libraries.
     SLIBVAX.FOR, SLIB.FOR, SLIBSUN.F, SLIBPC.FOR, etc. are system 
          dependent libraries, which include compilation instructions.
     SPINV.FOR is a specific library for spins and multiple vibrations.
          The executables using this library and CALFIT or CALCAT are 
          called SPFIT and SPCAT respectively.
     DPI.FOR  is a specific library for doublet pi with a nuclear spin
          The executables using this library and CALFIT or CALCAT are 
          called DPFIT and DPCAT respectively.
     *.NAM are parameter name files for GETLBL in CALFIT (currently they 
          can be on the current directory or in a default directory ).
          They are only used to label the output from CALFIT.
     BLAS.FOR contains needed LINPACK double precision
          Basic Linear Algebra Subroutines (these may be available on some 
          systems in a machine coded and/or vector processor form).
     HCO.* are sample input/output for SPFIT and SPCAT.
     NO.*  are sample input/output for DPFIT and DPCAT.
     CALPGM.DOC is this documentation. 
     SPINV.DOC is the specific documentation for SPFIT and SPCAT.
     DPI.DOC is the specific documentation for DPFIT and DPCAT.
     TECH.DOC gives technical information on some of the subroutine
          interfaces in the program suite.

System dependent routines are all grouped in the files which start with SLIB.
Comments at the beginning of the SLIB files have directions for compiling.

The SLIB subroutine FILGET includes provision for reading file names. The PC,
UNIX, and HP1000 versions will take command line arguments. The CDC version
avoids the absence of .EXT file naming by using the convention name*c,
where c is the first character of the extension.
The SLIB function RQBRK provides for some type of 'hot key' to end the
programs gracefully. For the PC, VAX, and UNIX versions, ^C will end SPFIT or
DPFIT on the last iteration and will stop SPCAT or DPCAT at the end of the
present J calculation. On the HP1000, BR has the same effect.
The SLIB subroutine OPENLU names an alternate path for the parameter name
files (.NAM) which can be changed to something more convenient. 

Currently, only the PC, UNIX, and VAX versions are routinely checked as
upgrades are made. The other versions have not been tested recently, but will
probably work with little or no modification.

If all else fails use the SLIB.FOR generic file. Other UNIX systems should
work with SLIBSUN. The SLIB function NDBLE will need to be modified for
machines where REAL*8 is NOT four times longer than INTEGER*2 (e.g. CRAY). 


Very large arrays are in blank common. The size can be changed in CALFIT and
CALCAT by changing the size of the parameter HEAPLN. On some systems it may
be possible to invoke some sort of dynamic memory allocation. However, an
effort was made to keep the data packed for efficient use of virtual memory.
CALFIT also uses an explicit scratch file for experimental line information.
Some care has been taken to make sure the programs work with floating point
formats which have a 7 or 8 bit exponent (e.g. HP1000 or VAX D_FLOATING)
which limits dynamic range from 1.2E-38 to 1.0E+38. INTEGER*2 vectors are
assigned for integers which can be compactly stored as 16 bit quantities.
INTEGER*4 is used where the values may be larger than 16 bits. Otherwise
generic INTEGER type is used, and can be made INTEGER*2 or INTEGER*4 by
compiler option. All floating point numbers are typed REAL*8 and all
variables are explicitly typed. The cft77 compiler used on the CRAY under
UNICOS automatically converted INTEGER*2 and INTEGER*4 to INTEGER(48 bit) and
REAL*8 to REAL(64 bit). The warning messages can be ignored. 

8. Program Timing
Execution times for the HCO test data with SPFIT:

     IBM-PC-XT                    1978 sec
     IBM-PC-AT                    1088
     IBM-PC-486/25 (16 bit)         47
     IBM-PC-486/25 (32 bit)         29 
     IBM-PC-586/90 (32 bit)          3.2
     HP1000-A700                   529
     DEC-microVAX                  204
     Sun-3(UNIX)                    72
     Sun-4 SPARC 4/330 (UNIX)       14.0
     Sun-4 SPARC 10  (UNIX)          3.6
     Sun-4 SPARC 20  (UNIX)          2.0
     PRIME                          70
     Alliant                        18.3
     CRAY-2                          4.2
     CRAY-XMP                        3.3
     CRAY-YMP                        2.3