 
      SUBROUTINE AID(MM, M, N, A, CLAB, RLAB, JP, TH, K, DMIWRK, IWORK,
     *               DMWRK1, WORK1, WORK2, OUNIT)
C
C<><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>
C
C   PURPOSE
C   -------
C
C     FORMS A TREE OF CLUSTERS BY SPLITTING CASES ON VALUES OF
C     INDIVIDUAL VARIABLES TO MINIMIZE THE SUM OF THE SQUARED
C     DEVIATIONS FROM THE CLUSTER MEANS.  THE TREE CAN BE USED TO
C     PREDICT THE VALUE OF A DIFFERENT VARIABLE FOR A NEW CASE.
C
C   DESCRIPTION
C   -----------
C
C   1.  THE ROUTINE STARTS WITH ONE CLUSTER OF ALL CASES, AND AT EACH
C       ITERATION SPLITS A CLUSTER INTO TWO CLUSTERS OF CASES ACCORDING
C       AS A(I,J) > C OR A(I,J) <= C, WHERE A IS THE DATA MATRIX AND I
C       ARE THE CASES.  THE VARIABLE J AND THE CONSTANT C ARE CHOSEN
C       WHICH MINIMIZE THE SUM OF THE SQUARED WITHIN-CLUSTER DEVIATIONS
C       OF THE VARIABLE TO BE PREDICTED.  IF THE REDUCTION IN THE SUM
C       OF SQUARES FOR THE CLUSTER WHICH WAS SPLIT IS GREATER THAN THE
C       THRESHOLD, THE SPLIT IS PERFORMED, OTHERWISE IT IS IGNORED.
C       THIS ROUTINE STOPS WHEN NO CLUSTER CAN BE SPLIT SUCH THAT THE
C       SUM OF SQUARES IS REDUCED BY MORE THAN THE THRESHOLD.
C
C   2.  THE CASES ARE SORTED SUCH THAT EACH CLUSTER IS CONTIGUOUS IN
C       THE ORDER.  THE OUTPUT IS WRITTEN ON FORTRAN UNIT OUNIT AND
C       BEGINS WITH THE PRINTING OF THE SORTED LIST OF THE CASES.
C       THEN, A TREE OF CLUSTERS ARE PRINTED OUT ALONG WITH THEIR
C       SUMMARY STATISTICS.  THE CASES BELONGING TO EACH CLUSTER ARE
C       THE CASES IN THE SORTED ORDER THAT ARE BETWEEN THE FIRST AND
C       LAST CASE.  THE VARIABLE THAT PRODUCES THE OPTIMUM SPLIT OF THE
C       CLUSTER, THE SPLIT POINT, AND THE REDUCTION IN THE SUM OF
C       SQUARES BY THE SPLIT ARE PRINTED.  IF THE REDUCTION IS
C       NEGATIVE, THE SPLIT WAS PERFORMED; OTHERWISE, THE SPLIT WAS
C       IGNORED.  THE MEAN AND THE VARIANCE OF THE PREDICTED VARIABLE
C       AND THE NUMBER OF ELEMENTS FOR EACH CLUSTER ARE ALSO DISPLAYED.
C
C   3.  TO PRODUCE THE TREE, USE THE FIRST CLUSTER AS THE ROOT.  IF THE
C       SUM OF SQUARES REDUCTION IS NEGATIVE, THIS CLUSTER WAS SPLIT ON
C       THE GIVEN SPLIT VARIABLE AT THE SPLIT POINT.  LOOK FOR THE NEXT
C       CLUSTER WITH THE SAME FIRST CASE AND MAKE THAT CLUSTER THE LEFT
C       SON OF THE ROOT.  THIS BRANCH CORRESPONDS TO CASES WHOSE SPLIT
C       VARIABLE HAS A VALUE LESS THAN OR EQUAL TO THE SPLIT POINT.
C       THE RIGHT SON IS THE FIRST CLUSTER IN THE LIST WHOSE LAST CASE
C       IS THE SAME AS THE LAST CASE OF THE ROOT CLUSTER, AND HAS THE
C       VALUE OF THE SPLIT VARIABLE GREATER THAN THE SPLIT POINT.
C       CONTINUE FOR EACH CLUSTER UNTIL NO CLUSTERS ARE SPLIT.  THE
C       NODES AT THE BOTTOM OF THE TREE ARE THE FINAL CLUSTERS.
C
C   4.  FOR AN EXISTING CASE, FIND THE FINAL CLUSTER IT BELONGS TO BY
C       MOVING DOWN THE TREE DETERMINED BY THE SPLITS USING THE KNOWN
C       VALUES.  THEN, THE VALUE FOR THE VARIABLE TO BE PREDICTED FOR
C       THIS CASE IS THE MEAN OF THAT VARIABLE FOR THE CASES IN THAT
C       FINAL CLUSTER.  THIS MEAN VALUE AND ITS VARIANCE IS PRINTED IN
C       THE LAST TWO COLUMNS OF THE OUTPUT.
C
C   INPUT PARAMETERS
C   ----------------
C
C   MM    INTEGER SCALAR (UNCHANGED ON OUTPUT).
C         THE LEADING DIMENSION OF MATRIX A.  MUST BE AT LEAST M.
C
C   M     INTEGER SCALAR (UNCHANGED ON OUTPUT).
C         THE NUMBER OF CASES.
C
C   N     INTEGER SCALAR (UNCHANGED ON OUTPUT).
C         THE NUMBER OF VARIABLES.
C
C   A     REAL MATRIX WHOSE FIRST DIMENSION MUST BE MM AND SECOND
C            DIMENSION MUST BE AT LEAST M (CHANGED ON OUTPUT).
C         THE DATA MATRIX.
C
C         A(I,J) IS THE VALUE FOR THE J-TH VARIABLE FOR THE I-TH CASE.
C
C   CLAB  VECTOR OF 4-CHARACTER VARIABLES DIMENSIONED AT LEAST N
C            (CHANGED ON OUTPUT).
C         ORDERED LABELS OF THE COLUMNS.
C
C   RLAB  VECTOR OF 4-CHARACTER VARIABLES DIMENSIONED AT LEAST M
C            (CHANGED ON OUTPUT).
C         ORDERED LABELS OF THE ROWS.
C
C   JP    INTEGER SCALAR (UNCHANGED ON OUTPUT).
C         NUMBER OF VARIABLE TO BE PREDICTED.  MUST BE BETWEEN 1 AND N.
C
C   TH    REAL SCALAR (UNCHANGED ON OUTPUT).
C         THRESHOLD VARIANCE FOR SUM OF SQUARES REDUCTION.
C
C   K     INTEGER SCALAR (UNCHANGED ON OUTPUT).
C         MAXIMUM NUMBER OF CLUSTERS TO BE FORMED.  SHOULD BE LESS
C            THAN 2*M.
C
C   DMIWRK INTEGER SCALAR (UNCHANGED ON OUTPUT).
C         THE LEADING DIMENSION OF MATRIX IWORK.  MUST BE AT LEAST 2.
C
C   IWORK INTEGER MATRIX WHOSE FIRST DIMENSION MUST BE DMIWRK AND SECOND
C            DIMENSION MUST BE AT LEAST K.
C         WORK MATRIX.
C
C   DMWRK1 INTEGER SCALAR (UNCHANGED ON OUTPUT).
C         THE LEADING DIMENSION OF MATRIX WORK1.  MUST BE AT LEAST 6.
C
C   WORK1 REAL MATRIX WHOSE FIRST DIMENSION MUST BE DMWRK1 AND SECOND
C            DIMENSION MUST BE AT LEAST K.
C         WORK MATRIX.
C
C   WORK2 REAL VECTOR DIMENSIONED AT LEAST 2*M.
C         WORK VECTOR.
C
C   OUNIT INTEGER SCALAR (UNCHANGED ON OUTPUT).
C         UNIT NUMBER FOR OUTPUT.
C
C   REFERENCES
C   ----------
C
C     HARTIGAN, J. A. (1975).  CLUSTERING ALGORITHMS, JOHN WILEY &
C        SONS, INC., NEW YORK.  PAGES 330-346.
C
C     SONQUIST, J. A. (1971).  "MULTIVARIATE MODEL BUILDING: THE
C        VALIDATION OF A SEARCH STRATEGY."  INSTITUTE FOR SOCIAL
C        RESEARCH, THE UNIVERSITY OF MICHIGAN, ANN ARBOR, MICH.
C
C<><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>
C
 
 
