1.0 PURPOSE

PENNZYME is a program for fitting trial rate laws to experimental data from steady state initial velocity experiments. It determines the set of parameter values for a given rate law which produces the closest fit to the experimental data, and provides information as to how good that fit is.

2.0 ABSTRACT

This program finds the set of parameter values in a user supplied rate law which gives the best fit to the initial velocity data. The program operates interactively and provides two algorithms for locating a least-squares fit to the experimental data. The algorithms have different convergence properties and are best applied independently or in sequence with independent stopping criteria. The two algorithms were selected to work together so their capabilities complement each other.

The program allows the user to interactively alter the initial parameter values, the set of parameters actually being varied in an optimization, the stopping criteria, and the order of application of the fitting algorithms, and to delete points which the program has identified as outliers. In addition the user can request that any of a set of reports pertaining to the input data, and to intermediate calculations, be typed on his terminal. The user may also obtain graphs on a suitable graphics terminal or with a plotter, if the graphics routines described below are included in the program and used as directed. These operations may be performed in any order and as many times as desired in a particular session.

PENNZYME user's document Page 2 BEGINNER'S GUIDE TO PENNZYME

3.0 BEGINNER'S GUIDE TO PENNZYME

The purpose of this section is to give the user who has no prior experience with computers enough information to use the program on an elementary level. PENNZYME is a program which fits the parameters in trial rate laws representing possible enzymatic mechanisms to sets of experimental initial velocity data.
It has considerable capability for inputting initial guesses, following the state of the computation, etc., which the sophisticated user can employ to efficiently obtain results interactively. At the same time it is able to do without all of this and give the unsophisticated user a result with minimal assistance from him, although the computation may then be less efficient. Nonlinear optimization procedures generally require good initial guesses for the parameter values. This program combines two such procedures so that the program effectively generates its own initial guesses if necessary (in which case many user inputs are set to default values). A more detailed description is given in the DETAILED DESCRIPTION section below. Technical terms are defined in the TECHNICAL GLOSSARY.

To use the program two input files and a FORTRAN function subroutine representing the trial rate law must be provided. One of the two input files contains the experimental velocity data. The other file contains information which depends upon the rate law and the model to be tested and includes the initial values of the parameters and the selection indices and convergence criteria for the fitting algorithms. The experimental velocity data file contains only the chemical names and concentrations and the observed enzymatic velocities. Therefore, the file need not be changed with each new trial model.

The user supplied rate law subroutine must be compiled and linked to the PENNZYME package prior to operation of the program. A description of this subroutine and of the linking operation is given below under "RATE LAW SUBROUTINE". The parameter file and experimental data file described below must be on FORTRAN logical devices 1 and 20, respectively. The detailed handling of the rate law file may be computer-dependent.

The following discussion refers to the sample input files for the measurements and for the parameters which are shown in Figures 3.1 and 3.2.
These files are actually read by the program in this order, and may come from the same physical device, usually a disk file. In this and the following discussion, input is described as being composed of files made up of lines, which have presumably been typed in at a computer terminal. (In previous versions of this program these were referred to as cards in a deck.) It should be noted that columns in the data input file may be separated by tabs, but that this cannot be done in the parameter file, where ten spaces should be typed between columns. An optional name for the velocity may be inserted in the data file on the line preceding the numerical values, but this will not carry over to any graphic output.

                  Experimental Data Input File

3,10
SUBSTR 1
SUBSTR 2
INHIBITR
1.2      5.0      3.5      1.11     .058
1.2      5.0      35.0     .710     .034
1.2      500.0    105.0    2.24     .115
1.2      500.0    350.0    .766     .0426
12.0     50.0     105.0    11.8     .6
12.0     50.0     350.0    5.03     .258
36.0     50.0     3.5      38.5     1.94
36.0     50.0     35.0     36.0     1.64
120.0    150.0    35.0     61.5     3.18
120.0    150.0    105.0    57.9     2.76

                          FIGURE 3.1

The first line of the sample data file specifies the number of chemicals which have been measured in the experiment (and which appear in the rate law subroutine), and the number of steady state velocity measurements, in that order. In this example the concentrations of three chemicals are measured for each of ten velocity observations.

The next three lines contain the names of the three chemicals of our example in the order in which they will be encountered in the velocity observation lines that follow. This order must be the same as the order assumed in the FORTRAN rate law which will be used. The programmer supplying the rate law is responsible for supplying the documentation needed to determine this order. In our example the chemical names are:

SUBSTR 1
SUBSTR 2
INHIBITR

The next ten lines, in our example, each contain the data for one velocity observation. Each of these lines contains the following information in the same order:

1. Concentrations of the measured chemicals in the same order as that in which their names were specified on the name lines.
2. Observed reaction velocity.
3. Experimental error of the velocity measurement.

These numbers may appear with or without decimal points and exponents, and must be separated by either blanks or commas. The following rules apply for input:

1. All numbers are space delimited and may contain decimal points and exponents.
2. Any names which are followed in the record by other input must be padded with spaces to match the required character count.
3. Records must be placed in the input file in the required order, and are not numbered or otherwise identified.

Optionally the position normally occupied by the first experimental point may be occupied by a set delimiter line, in which case the first velocity data line will follow the last contiguous delimiter line.

If the data cannot fit on one line, it may be placed on following contiguous records provided that each line but the last has exactly "n" values (where "n" is the "maximum values per record" value optionally included in the first record of the file). The last line must have at most "n" values but may have as few as one.

Lines in the experimental file may be divided into sets. Set delimiter lines may be all spaces or begin with non-numeric characters other than tab, decimal point, or space. (These lines may also include comments for identification.) The program numbers the data sets so defined. They may then be manipulated independently.

The numbers on the first velocity observation line are interpreted as follows:

line image:   1.2   5   3.5   1.113   .0580

Number   Interpretation
______   ______________
1.2      Concentration of "SUBSTR 1".
5        Concentration of "SUBSTR 2".
3.5      Concentration of "INHIBITR".
1.113    Observed reaction velocity.
0.058    Experimental error of this measurement.
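The layout of the data file can be illustrated by a short reading loop. The program below is only a sketch and is not PENNZYME's own input routine: the program name, array names and array limits are hypothetical, and it ignores set delimiter lines, the optional velocity name and the "maximum values per record" field. So that it runs by itself it first writes a two-observation file to a scratch unit; in actual use PENNZYME expects the data file on logical device 20.

```fortran
      PROGRAM RDDEMO
C     ILLUSTRATION ONLY; NOT PENNZYME'S OWN INPUT ROUTINE.
C     ALL NAMES AND ARRAY LIMITS HERE ARE HYPOTHETICAL.
      INTEGER MAXCH, MAXOB
      PARAMETER (MAXCH = 10, MAXOB = 100)
      INTEGER NCHEM, NOBS, I, J
      CHARACTER*8 CNAME(MAXCH)
      REAL CONC(MAXCH,MAXOB), VOBS(MAXOB), VERR(MAXOB)
C     BUILD A SMALL FILE ON A SCRATCH UNIT SO THE SKETCH RUNS ALONE.
      OPEN (20, STATUS='SCRATCH')
      WRITE (20,'(A)') '3,2'
      WRITE (20,'(A)') 'SUBSTR 1'
      WRITE (20,'(A)') 'SUBSTR 2'
      WRITE (20,'(A)') 'INHIBITR'
      WRITE (20,'(A)') '1.2 5.0 3.5 1.11 .058'
      WRITE (20,'(A)') '1.2 5.0 35.0 .710 .034'
      REWIND (20)
C     LINE 1: NUMBER OF CHEMICALS, THEN NUMBER OF OBSERVATIONS.
      READ (20,*) NCHEM, NOBS
C     NEXT NCHEM LINES: ONE CHEMICAL NAME PER LINE.
      DO 10 I = 1, NCHEM
         READ (20,'(A8)') CNAME(I)
   10 CONTINUE
C     ONE LINE PER OBSERVATION: CONCENTRATIONS, VELOCITY, ERROR.
      DO 20 J = 1, NOBS
         READ (20,*) (CONC(I,J), I = 1, NCHEM), VOBS(J), VERR(J)
   20 CONTINUE
      WRITE (*,*) (CNAME(I), I = 1, NCHEM)
      DO 30 J = 1, NOBS
         WRITE (*,*) (CONC(I,J), I = 1, NCHEM), VOBS(J), VERR(J)
   30 CONTINUE
      CLOSE (20)
      END
```

List-directed input is used here because it accepts blanks or commas as separators, which is consistent with the "3,10" header line of the sample file.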
3.1 Rate Law Subroutine

Prior to operation of the PENNZYME program a user supplied rate law subroutine must be compiled and linked with the PENNZYME package. This subroutine must be a FORTRAN "REAL FUNCTION" subroutine and must be named "RATE". Two argument vectors are passed to the function subroutine and contain the current concentration values and the current parameter values. It is recommended that the argument vector names be the same for all user written rate law subroutines. A sample rate law subroutine is provided in figure 5.1.

      REAL FUNCTION RATE(PCONC,PARAM)
C     DUMMY VARIABLES:
C        PCONC   CONCENTRATION ARRAY.
C        PARAM   PARAMETER ARRAY.
      REAL PCONC,PARAM
      DIMENSION PCONC(1),PARAM(1)
      DATA QSMALL/1E-35/
      IF(PCONC(1).LE.QSMALL.OR.PCONC(2).LE.QSMALL.OR.PARAM(3)
     1   .LE.QSMALL) GO TO 100
C---- ELSE, CALCULATE THE REACTION RATE.
      QSUB1 = (PARAM(1)/PCONC(1))
      QSUB2 = (PARAM(2)/PCONC(2))
      QACTV = (PARAM(4)/PCONC(1))
      QINHIB = (PCONC(3)/PARAM(3)) + 1.0
      RATE = PARAM(5) / (QSUB1*QINHIB
     1   + QSUB2*(QACTV*QINHIB+1) + 1)
      GO TO 110
C---- THEN, THE RATE WOULD HAVE BEEN ZERO.
  100 RATE = 0.0
  110 CONTINUE
      RETURN
      END

                          FIGURE 5.1

Notice that the adjustable parameters and the chemical concentrations are represented by the elements of the vectors "PARAM" and "PCONC" respectively. The order in which they appear in these vectors must correspond to the order in which they are specified in the input files. Comparing the input files of figures 3.1 and 3.2 with the rate law subroutine of figure 5.1 we note the following correspondences:

Vector element   User label
______ _______   ____ _____
PCONC(1)         SUBSTR 1
PCONC(2)         SUBSTR 2
PCONC(3)         INHIBITR
PARAM(1)         KM1
PARAM(2)         KM2
PARAM(3)         KI
PARAM(4)         KA
PARAM(5)         VMAX

The first line of the routine declares the program to be a "REAL FUNCTION" subroutine named "RATE", as required, with two arguments to be identified by the names "PCONC" and "PARAM".
The next non-comment statement (i.e. statement without a "C" in column 1) is the "REAL" statement which declares that the two arguments are of variable type "REAL" (i.e. single precision floating point). The next statement declares that the two argument names each refer to a vector of values. The first argument of the subroutine always refers to the vector of concentration values at which the rate of the reaction is to be calculated. The second argument refers to the vector of parameter values which, together with the rate law subroutine, complete our model of the enzymatic mechanism. It is these values that the optimization algorithms must determine.

The rate law itself is written in algebraic equation form with the name of the function subroutine ("RATE") on the left-hand side of the equation. PCONC(K), in our example, would refer to the concentration of the K'th chemical. PARAM(K) would refer to the value of the K'th rate law parameter. The order of the chemicals and parameters may be defined arbitrarily when the rate law subroutine is written, but if n chemicals are required they must be referenced as the 1st through n'th elements of the concentration vector. The same applies to the use of the parameter values vector.

The user should note that the concentration and parameter names and values supplied in the input data files must appear there in the same order in which they are referenced in the argument vectors of the rate law subroutine (i.e. the index numbers of the array elements correspond to the order of the input data).

It should be remembered that this routine will be called very many times in the course of the optimization calculation and should be as efficient as possible. In our example, additional statements are included to check for small values of concentrations and parameters which, being used as divisors, would cause an overflow or a division by zero in the calculation of the reaction rate. This check is optional.
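As a smaller illustration of the same conventions, the following sketch shows a one-substrate Michaelis-Menten rate law written as a "RATE" function, together with a short driver that calls it once. This is not one of the rate laws supplied with PENNZYME; the parameter order (PARAM(1) = KM, PARAM(2) = VMAX), the driver program and the test values are assumptions made for the example.

```fortran
      PROGRAM TRYRAT
C     DRIVER FOR THE HYPOTHETICAL RATE LAW BELOW; NOT PART OF PENNZYME.
      REAL RATE, CONC(1), PAR(2)
      EXTERNAL RATE
C     PAR(1) = KM, PAR(2) = VMAX, CONC(1) = SUBSTRATE CONCENTRATION.
      PAR(1) = 10.0
      PAR(2) = 100.0
      CONC(1) = 10.0
C     WHEN THE SUBSTRATE CONCENTRATION EQUALS KM THE RATE IS VMAX/2.
      WRITE (*,*) 'RATE =', RATE(CONC,PAR)
      END

      REAL FUNCTION RATE(PCONC,PARAM)
C     ONE-SUBSTRATE MICHAELIS-MENTEN RATE LAW (ILLUSTRATION ONLY).
C        PCONC(1)   SUBSTRATE CONCENTRATION.
C        PARAM(1)   KM.
C        PARAM(2)   VMAX.
      REAL PCONC(*), PARAM(*)
      DATA QSMALL/1E-35/
C     NO SUBSTRATE MEANS NO REACTION; THIS ALSO GUARDS THE DIVISION.
      RATE = 0.0
      IF (PCONC(1).LE.QSMALL) RETURN
      RATE = PARAM(2)*PCONC(1)/(PARAM(1) + PCONC(1))
      RETURN
      END
```

With this rate law the data file would name one chemical and the parameter file would list KM first and VMAX second, since the order in the files must match the order of the array elements.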
For easier use of PENNZYME the user should have access to a library of these rate law subroutines, each with documentation explaining the mechanism it represents and the order in which the concentrations and parameters appear in their respective arrays. Cleland has published a large collection of rate laws together with definitions of their kinetic constants (Biochim. Biophys. Acta 67 (1963) 104-137). A library of FORTRAN implementations of a number of these rate laws, in the format described above, is supplied with PENNZYME. At the present time there are 28 rate laws in this library. A valuable tool for deriving rate laws is the KINAL program of Cornish-Bowden, Biochem. J. 165 (1977) 55-59.

The Parameter File

                  Initial Estimates Input File

THIS IS THE TITLE line.
5
KM1
KM2          30.0
KI
KA
VMAX         90.0

                          FIGURE 3.2

The parameter file starts with a title line, followed by a line indicating the number of adjustable parameters in the rate law model whose optimal values we wish to determine. In our example there are five of these parameters. The next five lines, one for each parameter, each contain the name and estimated value of a parameter. The order of these lines must be the same as the order required by the rate law subroutine being used. The parameter name must be in the first eight columns of the line. The estimated value may start in any column from the ninth on. The value is represented by a number with or without a decimal point and exponent. If no estimate is supplied by the user all columns past the eighth must be blank. In our example the parameter names and values are:

Name     Value
____     _____
"KM1"    0
"KM2"    30
"KI"     0
"KA"     0
"VMAX"   90

A sample rate law subroutine for this data appears in figure 5.1 (see also "RATE LAW SUBROUTINE" above).

3.1.1 Detailed Description Of Parameter Input File

1. Title Record: String of up to 60 characters which will appear in output report headers. Must be the first record in the input file and should be centered within the first 60 character positions of the record.

2. Parameter Count Record: Number of adjustable parameters in the rate law. Must be the second record of the file and may begin in any character position of the record.

3. Parameter Data Records: Contain the names and initial values of the adjustable parameters of the rate law. The first of these records must occupy the third record position. The rest must follow in contiguous record positions in the order in which they appear in the parameter array referenced in the rate law function subprogram. One must be present for each adjustable parameter as declared in record 2. Format of the record is as follows:

   1. Parameter Name: Occupies record positions 1 through 8.
   2. Parameter Value: Occupies any record positions past 8.

3.2 Program Operation

3.2.1 MENU

After processing the input data the program outputs a message indicating the current menu level and a prompt to enter the desired option. Entering the option "MENU" will output the following:

GRAPH      ENTER GRAPHICS MODE
DEFN       ENTER PROBLEM DEFINITION MODE
  CONV     ALTER CONVERGENCE CRITERIA
  SMPITER  SET MAX SIMPLEX ITERATIONS
  DEVPAR   SET MAX RELATIVE DEV OF PARAMS FOR SIMPLEX
  DEVERR   SET MAX REL. DEV OF LEAST-SQUARE-ERROR FOR SIMPLEX
  STEP     SET INITIAL SIMPLEX STEP SIZE
  FPITER   SET MAX FP ITERATIONS
  ERREST   SET ESTIMATE OF EXPECTED LEAST SQUARE ERROR FOR FP
  RELERR   SET MAX RELATIVE ERROR BETWEEN ITERATIONS FOR FP
  PARAM    ALTER PARAMETER VALUES
  DELOUT   DELETE OUTLIERS
  TRACE    SET FORMAT FOR TYPING TRACE OF OPTIM
  SMPMODE  SET OUTPUT DEVICE FOR SIMPLEX REPORT
  SMPPER   SET PERIOD BETWEEN TRACE OUTPUTS FOR SIMPLEX
  FPMODE   SET OUTPUT DEVICE FOR FP REPORT
  FPPER    SET PERIOD BETWEEN TRACE OUTPUTS FOR FP
  RPTMODE  SET OUTPUT DEVICE FOR FINAL REPORTS
  FREEZE   FREE OR PIN PARAMETER VALUES
OPTIM      ENTER PROBLEM OPTIMIZATION MODE
  SIMPLEX  PERFORM A SIMPLEX OPTIMIZATION
  FP       PERFORM A FLETCHER-POWELL OPTIM.
  SMPRPT   REPORT RESULTS OF SIMPLEX OPTIM
  FPRPT    REPORT RESULTS OF FLETCHER-POWELL OPTIM
  OUTLIER  DETERMINE AND OUTPUT OUTLIER DATA POINTS
REPORT     ENTER REPORTS MENU
  DATA     LIST INPUT DATA
  RESID    OUTPUT RESIDUALS REPORT
  STATS    DETERMINE AND OUTPUT STATISTICAL REPORT
  PLIST    LIST PARAMETER VALUES
  VARCOV   OUTPUT VARIANCE-COVARIANCE MATRIX

The first word of every line is entered by the user to specify the desired action. Entering an undefined option will generate an error message followed by

Menu Desired? (If so type "yes".)

Responding with a yes will give a partial list of the options related to the current level. This is followed by the message:

Do you want menu of other levels?

A yes answer will obtain a list of the options that are at the same level as the last option entered. A valid option may be entered in response to either of the two questions and will then be executed. Examples of individual options follow, in order of increasing sophistication.
3.3 Program Output

3.3.1 REPORT Submenu

3.3.1.1 DATA Option (list Input Data)

A sample report follows:

**********************************************************************
*           TEST OF PENNZYME: PENNSYLVANIA ENZYME PROGRAM            *
*                             INPUT DATA                             *
**********************************************************************

NUMBER OF PARAMETERS   NUMBER OF CHEMICALS   NUMBER OF EXPERIMENTS
          5                     3                     32

                                              OBSERVED    STANDARD
              CHEMICAL CONCENTRATIONS         VELOCITY   DEVIATION
POINT           1           2           3
SET # 1
    1      1.2000      5.0000      3.5000      1.0800      0.0930
    2      1.2000      5.0000     35.0000      0.7300      0.0540
    3      1.2000      5.0000    110.0000      0.3320      0.0270
    4      1.2000      5.0000    350.0000      0.1330      0.0100
    5      1.2000     50.0000      3.5000      5.8100      0.4100
    6      1.2000     50.0000     35.0000      2.4400      0.2400
    7      1.2000     50.0000    110.0000      1.5900      0.1200
    8      1.2000     50.0000    350.0000      0.6300      0.0450
    9      1.2000    150.0000      3.5000      7.7900      0.5500
   10      1.2000    150.0000     35.0000      3.7800      0.3200
   11      1.2000    150.0000    110.0000      1.9100      0.1600
   12      1.2000    150.0000    350.0000      0.8190      0.0600
   13      1.2000    500.0000      3.5000      7.4000      0.6300
   14      1.2000    500.0000     35.0000      4.3200      0.3600
   15      1.2000    500.0000    110.0000      2.1300      0.1800
   16      1.2000    500.0000    350.0000      0.7150      0.0680
SET # 2
   17     12.0000      5.0000      3.5000      5.5500      0.4300
   18     12.0000      5.0000     35.0000      4.1200      0.3200
   19     12.0000      5.0000    110.0000      2.7400      0.2000
   20     12.0000      5.0000    350.0000      1.2600      0.0930
   21     12.0000     50.0000      3.5000     23.9000      2.1000
   22     12.0000     50.0000     35.0000     18.5000      1.5000
   23     12.0000     50.0000    110.0000     11.3000      0.9300
   24     12.0000     50.0000    350.0000      4.9600      0.4100
   25     12.0000    150.0000      3.5000     38.8000      3.0000
   26     12.0000    150.0000     35.0000     26.3000      2.1000
   27     12.0000    150.0000    110.0000     15.8000      1.3000
   28     12.0000    150.0000    350.0000      6.6500      0.5500
   29     12.0000    500.0000      3.5000     44.5000      3.5000
   30     12.0000    500.0000     35.0000     28.9000      2.5000
   31     12.0000    500.0000    110.0000     19.3000      1.5000
   32     12.0000    500.0000    350.0000      7.9300      0.6300
**********************************************************************
*           TEST OF PENNZYME: PENNSYLVANIA ENZYME PROGRAM            *
*                          STARTING VALUES                           *
**********************************************************************

CURRENT LEAST-SQUARES ERROR: 6.12428E-01

INITIAL PARAMETER VALUES WERE:
KM1     1.00000E+01
KM2     1.00000E+01
KI      1.00000E+01
KA      1.00000E+01
VMAX    1.00000E+01

3.3.2 OPTIM Submenu

3.3.2.1 SIMPLEX Option (perform A Simplex Optimization)

The Simplex method is a "robust" method and does not require a good initial estimate to locate a minimum. Its convergence properties are less sensitive than those of gradient search methods to curvature of the least-squares error surface. In addition, if there are several local minima the Simplex method is more likely to find the global minimum rather than a local one, provided that the initial step size is chosen sufficiently large.

3.3.2.2 SMPRPT Option (report Results Of Simplex Optimization)

This report will be produced only if the algorithm has been used at least once in the session. A sample report follows:

----------------------------------------------------------------------
FITTING BY SIMPLEX METHOD.
CONVERGENCE ACHIEVED AFTER 218 ITERATIONS.
-ELAPSED CPU TIME: 0 MIN 5.1 SEC

RMS RELATIVE DEVIATION         TARGET         ACTUAL
ABOUT THE CENTROID OF -
-KINETIC PARAMETERS:        2.50000E-02    1.78445E-02
-LEAST-SQUARES ERROR:       1.00000E-03    9.09281E-04

LEAST-SQUARES ERROR:
START 6.12428E-01   END 4.85735E-02   PERCENT REDUCTION: 92.07%

OPTIMAL PARAMETER VALUES WERE:
KM1     1.17673E+01
KM2     4.75639E+01
KI      3.46474E+01
KA      6.76959E+00
VMAX    9.65884E+01

This operation causes a least-squares fit of the velocity data to be performed by the Simplex method using the current estimate of the optimal parameter values and the current Simplex option values.
If the option values indicated that a trace report was to appear on the user's terminal, it will be typed at this point.

3.3.2.3 FP Option (perform A Fletcher-Powell Optimization)

The Fletcher-Powell method requires a good initial guess of the parameter values but converges rapidly when such an estimate is supplied. This initial estimate is automatically supplied to the Fletcher-Powell algorithm when the Simplex method is used first. Since it is a gradient search method it can detect the case where the Simplex method has converged but the least-squares error surface still has a calculable gradient allowing the fit to be improved. Trying to do optimizations with the Fletcher-Powell method alone can be quite expensive in computer usage.

The recommended procedure is the last one: locate an initial estimate by the Simplex method, then verify and refine this estimate by the Fletcher-Powell method.

3.3.2.4 FPRPT Option (report Results Of Fletcher-Powell Optimization)

This report will be produced only if the algorithm has been used at least once during the session. A sample report follows:

----------------------------------------------------------------------
FITTING BY FLETCHER-POWELL METHOD.
CONVERGENCE ACHIEVED AFTER 8 ITERATIONS.
-ELAPSED CPU TIME: 0 MIN 3.8 SEC

LEAST-SQUARES ERROR:
START 4.85735E-02   END 4.84583E-02   PERCENT REDUCTION: 0.24%

OPTIMAL PARAMETER VALUES WERE:
KM1     1.17010E+01
KM2     4.90328E+01
KI      3.46888E+01
KA      6.51458E+00
VMAX    9.61637E+01

Fine Points in Using Fletcher-Powell Optimization

Starting Too Close to the Minimum

A warning message may be output by the Fletcher-Powell optimization routine:

CONVERGENCE ACHIEVED IN TOO FEW ITERATIONS TO ACHIEVE AN ACCURATE COVARIANCE MATRIX

This means that the Fletcher-Powell optimization started at or near the minimum and converged in fewer iterations than the number of parameters. In this situation the H matrix will either be calculated inaccurately or may never be calculated at all.
A tactic to get around this difficulty is to change at least two and preferably all the parameter values by 10% and repeat the Fletcher-Powell optimization. This tactic may also be useful if the Fletcher-Powell optimization appears to be having difficulty finding the minimum, which may happen because it started out there.

Optimizing Error Model

Input to the PENNZYME program includes estimates of the standard deviation of the velocity measurements. These are not merely values obtained from a few replicate measurements, but are the standard deviations expected from hundreds of replicate measurements. PENNZYME weights each squared residual by the normalized inverse variance. Clearly, such values are available only if a particular enzymatic assay, using a particular instrumentation, has been previously calibrated. If the standard deviations are set to zero, owing to a lack of information, all the weights are set to unity and PENNZYME cannot use weighting to account for the variation in the reliability of the experimental measurements.

The extended least squares option permits the user to estimate the unknown variance simultaneously with identifying the optimal values of the kinetic parameters. Just as the user must supply a function RATE which calculates the expected enzymatic velocity at a given set of ligand concentrations, to use extended least squares he must also supply a function VARIAN which calculates the expected variance of a given experimental velocity measurement. PENNZYME calls VARIAN once for each data point. The function arguments are:

      FUNCTION VARIAN (CONC,PARAM,VCALC)

where

CONC  = array of ligand concentrations for current data point
PARAM = array of error model parameters to be optimized
VCALC = expected (i.e. calculated) enzymatic velocity for current data point.
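A minimal sketch of a user-supplied VARIAN follows, assuming an error model in which the variance is a power law in the calculated velocity. The driver program, the parameter values and the guard against non-positive velocities are assumptions made for this illustration and are not taken from PENNZYME.

```fortran
      PROGRAM TRYVAR
C     DRIVER FOR THE HYPOTHETICAL ERROR MODEL BELOW; NOT PENNZYME CODE.
      REAL VARIAN, CONC(1), EPAR(2)
      EXTERNAL VARIAN
      CONC(1) = 1.2
C     HYPOTHETICAL ERROR MODEL PARAMETERS: SCALE AND EXPONENT.
C     WITH EXPONENT 2 THE STANDARD DEVIATION IS PROPORTIONAL TO THE
C     VELOCITY; A SCALE OF 0.0025 CORRESPONDS TO A 5 PERCENT ERROR.
      EPAR(1) = 0.0025
      EPAR(2) = 2.0
      WRITE (*,*) 'VARIANCE =', VARIAN(CONC,EPAR,10.0)
      END

      FUNCTION VARIAN(CONC,PARAM,VCALC)
C     POWER LAW ERROR MODEL (ILLUSTRATION ONLY).
C        CONC    LIGAND CONCENTRATIONS (NOT USED BY THIS MODEL).
C        PARAM   PARAM(1) = SCALE, PARAM(2) = EXPONENT.
C        VCALC   CALCULATED VELOCITY FOR THE CURRENT DATA POINT.
      REAL CONC(*), PARAM(*), VCALC
C     A REAL POWER OF A NON-POSITIVE BASE IS UNDEFINED, SO GUARD IT.
      VARIAN = 0.0
      IF (VCALC.GT.0.0) VARIAN = PARAM(1)*VCALC**PARAM(2)
      RETURN
      END
```

How PENNZYME treats a returned variance of zero is not stated in this document, so a production version of such a function may need a different guard.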
VARIAN may be any function of concentrations and/or calculated velocity, but usually it is a polynomial or a power law function of the expected velocity. The user should make some attempt at obtaining a qualitative description of the error structure of his data.

The error model described by VARIAN will include adjustable parameters just as RATE does. The parameter input file (on logical unit 1) must have initial estimates of these parameters as well as the kinetic parameters. For example, if the error model is a power law in the expected velocity:

      VARIAN = PARAM(1) * VCALC**PARAM(2)

then initial estimates of error model parameters PARAM(1) and PARAM(2) in addition to those for the kinetic parameters are necessary. The format for the parameter input file is:

title
kinetic parameters, error parameters
kinetic parameter name
   .  .
   .  .
   .  .
error model parameter name
   .  .
   .  .

For the above example the input would resemble:

title
3,2
KM
VMAX
KI
CONST
EXPON

VARIAN is called only if the extended least squares option is set by including a nonzero number of error model parameters on the parameters input file. PENNZYME has a default error model:

      VARIAN = CONST**2 + VCALC**EXPON

where CONST and EXPON are adjustable parameters. A user-supplied function overrides this default model.

TO ENTER THE GRAPH MODE, TYPE GRAPH. A DETAILED MANUAL FOR THE GRAPHICS COMMAND LANGUAGE FOLLOWS THE GLOSSARY BELOW.

The REPORTS level submenu

4.0 STATISTICS

4.0.0.1 STATS Option (determine And Output Statistical Report)

The Statistics Report contains statistical measures relevant to the fitting calculation. The form of this report depends upon which fitting algorithms were used, in what order they were used, and whether or not the calculations converged. A discussion of the statistical measures which appear in this report is given under "STATISTICS" below.
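Several entries of the Statistics Report are simple functions of other entries: the relative error is the residual error divided by the average observed velocity, the excess variance is the difference of the squared residual and pure errors, the measure of bias is the ratio of those squares, and the t-statistic is a parameter value divided by its standard deviation. The sketch below, which is an illustration and not PENNZYME code, recomputes these from numbers printed in the sample statistics report of this section. The program and variable names are hypothetical.

```fortran
      PROGRAM STDEMO
C     ILLUSTRATION ONLY; VARIABLE NAMES ARE HYPOTHETICAL.
      REAL PURE, RESID, VAVG, RELERR, EXCESS, BIAS
      REAL PVAL, PSD, TSTAT
C     VALUES COPIED FROM THE SAMPLE STATISTICS REPORT.
      PURE  = 2.78787E-01
      RESID = 4.84583E-02
      VAVG  = 9.44122E+00
C     RELATIVE ERROR = RESIDUAL ERROR / AVERAGE OBSERVED VELOCITY.
      RELERR = RESID/VAVG
C     EXCESS VARIANCE = (RESIDUAL ERROR)**2 - (PURE ERROR)**2.
      EXCESS = RESID**2 - PURE**2
C     MEASURE OF BIAS = (RESIDUAL ERROR)**2 / (PURE ERROR)**2.
      BIAS = RESID**2/PURE**2
C     T-STATISTIC OF KM1 = PARAMETER VALUE / STANDARD DEVIATION.
      PVAL = 1.17010E+01
      PSD  = 1.866E+00
      TSTAT = PVAL/PSD
      WRITE (*,100) 'RELATIVE ERROR  ', RELERR
      WRITE (*,100) 'EXCESS VARIANCE ', EXCESS
      WRITE (*,100) 'MEASURE OF BIAS ', BIAS
      WRITE (*,100) 'T-STATISTIC KM1 ', TSTAT
  100 FORMAT (1X, A, 1PE13.5)
      END
```

The results should agree with the sample report to within rounding. The t-statistic agrees only to about three figures because the standard deviation is printed with four significant digits.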
PENNZYME computes the values of several statistics which serve as measures of the reliability of the optimal parameter values and as indicators of the presence of systematic errors (bias) in the kinetic model.

Model Reliability
_____ ___________

Residual error         A measure of the model's fit to the kinetic data.

Pure error             The residual error expected if only experimental errors of velocity measurements determine the residuals.

Excess variance        (Residual error)**2 - (Pure error)**2. A measure of bias. The ratio of these two squared terms may be used as an "F-test" to determine if the excess variance is significant.

Parameter Reliability
_________ ___________

Standard deviation     A measure of the curvature of the error surface in the vicinity of the minimum. Small values indicate high curvature and a well defined minimum.

95% confidence limit   The value which when subtracted from or added to the optimal parameter value defines the range of values in which there is a 95% probability of finding the true parameter value.

T-statistic            The ratio (parameter value / standard deviation). If this ratio is less than a certain value, the corresponding parameter value is not significantly different from zero, and the parameter is redundant. PENNZYME performs this test automatically.

**********************************************************************
*           TEST OF PENNZYME: PENNSYLVANIA ENZYME PROGRAM            *
*                         STATISTICS REPORT                          *
**********************************************************************

PURE ERROR:                                          2.78787E-01
RESIDUAL ERROR:                                      4.84583E-02
AVERAGE OBSERVED VELOCITY:                           9.44122E+00
(RES. ERR.)/(AV. VEL.) = RELATIVE ERROR:             5.13264E-03
(RES. ERR.)**2 - (PURE ERR.)**2 = EXCESS VARIANCE:  -7.53738E-02
(RES. ERR.)**2 / (PURE ERR.)**2 = MEASURE OF BIAS:   3.02130E-02

PARAMETER
NAME      VALUE         STD. DEV.   CNFD. LIM.   T-STATISTIC   < T(.95)
KM1       1.17010E+01   1.866E+00   3.830E+00    6.26950E+00
KM2       4.90328E+01   9.786E+00   2.008E+01    5.01032E+00
KI        3.46888E+01   2.196E+00   4.506E+00    1.57980E+01
KA        6.51458E+00   1.057E+00   2.169E+00    6.16224E+00
VMAX      9.61637E+01   1.202E+01   2.466E+01    8.00187E+00

An arrow ("<---") will appear in the column directly beneath the header "< T(.95)" for any parameter whose t-statistic is less than the critical value t(.95). If the arrow appears for a parameter it is likely that the parameter is redundant for the data set used.

4.0.0.2 RESID Option (output Residuals Report)

A sample report follows:

**********************************************************************
*      RAPID-EQUILIBRIUM BINARY REACTION WITH HILL COEFFICIENTS      *
*                         RESIDUALS REPORT.                          *
**********************************************************************

 OBS    V(CALC)      V(OBS)      V(CALC)      WEIGHT      WEIGHTED
                                 -V(OBS)                  RESIDUAL
SET # 1
   1   1.501E-01   1.504E-01   -2.946E-04   1.730E+00   -5.098E-04
   2   1.869E-01   1.905E-01   -3.608E-03   1.551E+00   -5.597E-03
   3   2.473E-01   2.451E-01    2.221E-03   1.173E+00    2.604E-03
   4   3.658E-01   3.690E-01   -3.236E-03   6.120E-01   -1.981E-03
   5   5.708E-01   6.250E-01   -5.423E-02   2.469E-01   -1.339E-02
SET # 2
   6   1.672E-01   1.702E-01   -3.008E-03   1.656E+00   -4.980E-03
   7   2.082E-01   2.051E-01    3.066E-03   1.422E+00    4.361E-03
   8   2.755E-01   2.667E-01    8.774E-03   1.006E+00    8.822E-03
   9   4.074E-01   3.883E-01    1.910E-02   4.954E-01    9.462E-03
  10   6.357E-01   6.250E-01    1.075E-02   1.965E-01    2.112E-03
SET # 3
  11   1.870E-01   1.923E-01   -5.282E-03   1.550E+00   -8.190E-03
  12   2.328E-01   2.235E-01    9.350E-03   1.264E+00    1.182E-02
  13   3.081E-01   3.008E-01    7.339E-03   8.375E-01    6.146E-03
  14   4.557E-01   4.255E-01    3.021E-02   3.948E-01    1.193E-02
  15   7.111E-01   7.519E-01   -4.077E-02   1.548E-01   -6.313E-03
SET # 4
  16   2.094E-01   2.128E-01   -3.384E-03   1.414E+00   -4.787E-03
  17   2.607E-01   2.516E-01    9.137E-03   1.091E+00    9.967E-03
  18   3.450E-01   3.419E-01    3.143E-03   6.835E-01    2.148E-03
  19   5.103E-01   4.819E-01    2.839E-02   3.123E-01    8.866E-03
  20   7.963E-01   8.696E-01   -7.330E-02   1.216E-01   -8.915E-03
SET # 5
  21   2.263E-01   2.439E-01   -1.757E-02   1.306E+00   -2.294E-02 <--
  22   2.818E-01   2.721E-01    9.701E-03   9.706E-01    9.416E-03
  23   3.729E-01   3.883E-01   -1.538E-02   5.896E-01   -9.070E-03
  24   5.515E-01   5.714E-01   -1.989E-02   2.654E-01   -5.279E-03
  25   8.606E-01   1.020E+00   -1.598E-01   1.030E-01   -1.646E-02

CURRENT LEAST-SQUARES ERROR: 1.06250E-02

CURRENT PARAMETER VALUES WERE:
KMPEP   2.97865E+01
KMADP   1.83807E+02
VMAX    1.07870E+00
HCPEP   1.20112E+00
SIG     3.96878E-02
ALPHA   4.29188E+00

The markers " <--- " at the right margin flag those points identified as outliers by the program (see index 8 above). The message "DELETED" appears in the same position for any points which were removed from the fitting calculation by the user, using the DELOUT option which is described below. (When the marker " <--- " appears in any report involving parameters, it indicates a parameter which is not significantly different from zero in a statistics report, and a parameter which is pinned in any other report involving parameters.)

4.0.0.3 OUTLIER Option (determine And Output Outlier Data Points)

This operation compares the absolute value of the weighted residuals to twice the standard error. Any observations (data points) whose weighted residuals exceed this limit are numbered, inserted in an internal "outlier table" and printed on the user's terminal. There is room in the table for ten outliers. If more than this are found, a warning message and the number of the velocity observation are printed for each outlier for which there is no room in the table.
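The outlier test described above can be sketched in a few lines. The program below is an illustration and not PENNZYME's own routine. It assumes that the "standard error" used in the test is the current least-squares error printed in the residuals report, and it applies the test to the weighted residuals of points 21 to 25 of the sample residuals report. All names are hypothetical.

```fortran
      PROGRAM OUTDEM
C     ILLUSTRATION ONLY; NOT PENNZYME'S OWN OUTLIER ROUTINE.
      INTEGER N, MAXOUT, IOFF
      PARAMETER (N = 5, MAXOUT = 10, IOFF = 20)
      INTEGER I, NOUT, IOUT(MAXOUT)
      REAL WRES(N), SERR
C     WEIGHTED RESIDUALS OF POINTS 21 TO 25 AND THE CURRENT
C     LEAST-SQUARES ERROR, COPIED FROM THE SAMPLE RESIDUALS REPORT.
      DATA WRES /-2.294E-02, 9.416E-03, -9.070E-03,
     1           -5.279E-03, -1.646E-02/
      DATA SERR /1.06250E-02/
      NOUT = 0
      DO 10 I = 1, N
C        A POINT IS AN OUTLIER IF ABS(WEIGHTED RESIDUAL) > 2*SERR.
         IF (ABS(WRES(I)) .LE. 2.0*SERR) GO TO 10
         IF (NOUT .LT. MAXOUT) THEN
            NOUT = NOUT + 1
            IOUT(NOUT) = I + IOFF
         ELSE
C           THE TABLE HOLDS ONLY MAXOUT ENTRIES; WARN ABOUT THE REST.
            WRITE (*,*) 'NO ROOM IN TABLE FOR POINT', I + IOFF
         END IF
   10 CONTINUE
      DO 20 I = 1, NOUT
         WRITE (*,*) 'OUTLIER AT POINT', IOUT(I)
   20 CONTINUE
      END
```

Under this assumption the limit is 2.125E-02, so only point 21 is flagged, which is the point marked with an arrow in the sample residuals report.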
A sample report follows:

ENTER DESIRED OPTION > OUTLIER

OUTLIER   DATA POINT   V(OBS)        RESIDUAL
   1           6       2.63900E+00   3.37986E-01
   2          56       2.42100E+01   2.77115E+00

ENTER DESIRED OPTION >

4.0.0.4 PLIST Option (list Parameter Values)

This operation produces the following report:

ENTER DESIRED OPTION > PLIST

CURRENT PARAMETER VALUES WERE:
KM1     1.24335E+01
KM2     5.06060E+01
KI      3.54142E+01
KA      6.91388E+00
VMAX    1.01837E+02

(The marker " <--- " indicates a pinned parameter.)

ENTER DESIRED OPTION > VARCOV

This option prints out the variance-covariance matrix.

****************************************************************
*   RAPID EQUILIBRIUM BINARY REACTION WITH HILL COEFFICIENTS   *
*                       VAR-COVAR MATRIX                       *
****************************************************************

   KMPEP         KMADP         VMAX          HCPEP
 1.2454E+01    1.2915E+00    1.8991E-01   -2.2180E-01    KMPEP
               7.2989E+01    2.6871E-01   -4.0922E-03    KMADP
                             6.6961E-03   -1.0788E-02    VMAX
                                           2.3825E-02    HCPEP

****************************************************************
*   RAPID-EQUILIBRIUM BINARY REACTION WITH HILL COEFFICIENTS   *
*                    NORM. VAR-COVAR MATRIX                    *
****************************************************************

   KMPEP         KMADP         VMAX          HCPEP
 1.0000E+00    4.2836E-02    6.5762E-01   -4.0720E-01    KMPEP
               1.0000E+00    3.8436E-01   -3.1032E-03    KMADP
                             1.0000E+00   -8.5410E-01    VMAX
                                           1.0000E+00    HCPEP

4.0.1 DEFN (PROBLEM DEFINITION) Submenu

4.0.1.1 CONV Option (alter Convergence Criteria)

This operation allows the user to alter the values of variables which were initialized from data in the "method specific" input file. These variables are described in the section "INPUT DATA". The following is a sample of the dialogue which the user will encounter:

Notice that the user may examine a variable's value without changing the value by entering "n" in response to the "new value" prompt. Also note that the variable types are mixed.
The user need not know the type of a particular variable (although it is hoped that he would) since the type conversion of user supplied data is done automatically.

4.0.1.1.1 Index 3 - "Delete An Outlier" -

"Outlier deletion" is an operation which allows the user to remove from the fitting calculation points whose weighted residuals exceed twice the least-squares error. This is done by removing the weights of these points and renormalizing the remaining weights. Points which have been deleted may be returned to the calculation by the reverse operation "Reinsert a deleted point". Both of these operations are provided in a control subsection initiated by entry of index value 3 in response to the options prompt. A sample dialogue follows:

ENTER DESIRED OPTION >
MANIPULATE OUTLIERS.

PIN (describe here pinning parameters)

PARAM option (alter parameter values)

This operation is provided to allow the user to adjust his estimate of the optimal parameter values prior to or between fitting attempts. Sample dialogue follows:

ENTER DESIRED OPTION > PARAM
ALTER PARAMETERS.
PLEASE ENTER PARAMETR INDEX(OR ):2
PARAMETR KM2 IS 10.00000
ENTER NEW VALUE (OR N, FOR NO CHANGE):50
PLEASE ENTER PARAMETR INDEX(OR ):4
PARAMETR KA IS 10.00000
ENTER NEW VALUE (OR N, FOR NO CHANGE):N
PLEASE ENTER PARAMETR INDEX(OR ):97
INDEX 97 IS OUT OF BOUNDS, TYPE FOR A LIST OF OPTIONS.
PLEASE ENTER PARAMETR INDEX(OR ):5
PARAMETR VMAX IS 10.00000
ENTER NEW VALUE (OR N, FOR NO CHANGE):97
PLEASE ENTER PARAMETR INDEX(OR ):
(ENTER "-1" IF NO MORE CHANGES)
INDEX   PARAMETER NAME   VALUE
  1     KM1              1.00000E+01
  2     KM2              5.00000E+01
  3     KI               3.00000E+01
  4     KA               1.00000E+01
  5     VMAX             9.70000E+01
PLEASE ENTER PARAMETR INDEX(OR ):-1
'BYE!
PLEASE ENTER CONTROL INDEX(OR ):

Notice the mistake made by the user in selecting the parameter index. The number first entered as an index ("97") was meant to be the new value of parameter #5 rather than a parameter index.
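The outlier deletion described earlier in this section (removing the weights of the selected points and renormalizing the remaining weights) can be sketched as follows. This Python fragment is an illustration only and is not taken from PENNZYME. The name `delete_points` is hypothetical, and the sketch assumes that "renormalizing" means rescaling the remaining weights so that their sum is unchanged; the manual does not state which normalization the program uses.

```python
def delete_points(weights, deleted):
    """Zero the weights of the deleted points and rescale the rest.

    weights -- original weights, one per observation
    deleted -- set of 1-based observation numbers to remove
    Assumption: the rescaling keeps the total weight equal to its
    original value.
    """
    total = sum(weights)
    kept = [0.0 if number in deleted else weight
            for number, weight in enumerate(weights, start=1)]
    remaining = sum(kept)
    if remaining == 0.0:
        raise ValueError("cannot delete every weighted point")
    scale = total / remaining
    return [weight * scale for weight in kept]
```

Under this assumption, reinserting a deleted point amounts to calling the function again with the original weights and a smaller set of deleted points.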
4.0.1.2 TRACE Option (set Format For Typing Trace Of Optim) -

If the option values indicated that a trace report was to appear on the user's terminal it will be typed at this point. See the sample session dialogue for a typical trace report. Upon return to interactive mode the parameter values will be the optimal values as determined by the fitting operation. A sample trace report follows:

**********************************************************************
*           TEST OF PENNZYME: PENNSYLVANIA ENZYME PROGRAM            *
*                       SIMPLEX TRACE REPORT.                        *
**********************************************************************

ITERATION 25
RMS RELATIVE DEVIATION OF KINETIC PARAMETERS 1.29966E-01
CURRENT LEAST-SQUARES ERROR: 1.32720E-01
CURRENT PARAMETER VALUES WERE:
KM1     2.53438E+00
KM2     7.25742E+00
KI      2.79929E+01
KA      1.79910E+01
VMAX    3.24603E+01

ITERATION 50
RMS RELATIVE DEVIATION OF KINETIC PARAMETERS 3.44595E-02
CURRENT LEAST-SQUARES ERROR: 1.11128E-01
CURRENT PARAMETER VALUES WERE:
KM1     4.09943E+00
KM2     6.96557E+00
KI      3.77969E+01
KA      2.00883E+01
VMAX    3.63544E+01

A least-squares fit is found by the Fletcher-Powell gradient search method, using the current estimate of the optimal parameter values as a starting estimate, and using the current values of the F-P option selection indices. A trace report is produced on the user's terminal if selected through the F-P option values. A sample trace report follows:

**********************************************************************
*           TEST OF PENNZYME: PENNSYLVANIA ENZYME PROGRAM            *
*                   FLETCHER-POWELL TRACE REPORT.                    *
**********************************************************************

ITERATION 3
CURRENT LEAST-SQUARES ERROR: 4.45696E-02
CURRENT PARAMETER VALUES WERE:
KM1     1.17266E+01
KM2     4.77989E+01
KI      3.48247E+01
KA      6.77243E+00
VMAX    9.62696E+01

ITERATION 6
CURRENT LEAST-SQUARES ERROR: 4.45120E-02
CURRENT PARAMETER VALUES WERE:
KM1     1.17035E+01
KM2     4.90169E+01
KI      3.46901E+01
KA      6.51486E+00
VMAX    9.61774E+01

4. Program Mode Record: Contains data which indicates the initial optimization strategy and the report generation mode. This record must follow the last parameter data record and has the following internal format:

   1. Simplex Selector: Must be first value in record. If value is greater than or equal to zero then a Simplex mode record is expected.

   2. Fletcher-Powell Selector: Must be second value in record. If value is greater than or equal to zero then a Fletcher-Powell mode record is expected, and the Fletcher-Powell mode record must follow the:

      1. Program Mode Record if the Simplex selector value is less than zero.
      2. Simplex Mode Record if the Simplex selector value is greater than or equal to zero.

   3. Report Mode Indicator: Must be third number in record. If its value is "2" the residuals report will be included in the final report.

5. Simplex Mode Record: Contains input relevant to use of the Simplex algorithm. All of the following numbers must be present in the indicated order. A value of zero for any number will cause the indicated default value to be used.

   1. Maximum iterations of algorithm allowed.
      Default: 50 + 40 * number-of-adjustable-parameters

   2. Initial Simplex Step Size. Should be large enough to assure that the minimum will be reached before contraction is completed. It is expressed as a fraction of the initial parameter values.
      Default: 1.0

   3. Convergence Mode.
      = 1, Test for convergence will be on the RMS relative deviation of the parameters of the Simplex from those defining the centroid.
      = 2, Test for convergence will be on the RMS relative deviation of the least-squares errors of the vertices of the simplex about that of the centroid.
      = 3, Both of the above must be satisfied.
      Default: 1

   4. Convergence criterion "a". Convergence criterion for test #1.
      Default: 0.05

   5. Convergence criterion "b". Convergence criterion for test #2.
      Default: 0.05

   6. Trace mode.
      = -1, no trace report.
      =  0, same as 1 (default).
      =  1, trace report on user's console.
      =  2, trace report on printer.
      =  3, trace report in both locations.
      Default: 0

   7. Trace period. Number of iterations to be performed between trace printouts.
      Default: 25

6. Fletcher-Powell Mode Record: Contains input relevant to use of the Fletcher-Powell algorithm.

   1. Maximum iterations of algorithm allowed.
      Default: 10 * number-of-adjustable-parameters

   2. Estimate of minimum sum of squared errors obtainable.
      Default: 0

   3. Expected relative error of the parameters.
      Default: 1 * 10**(-6)

   4. Trace mode. Same options as for Simplex.
      Default: 0

   5. Trace period. Number of iterations to be performed between trace printouts.
      Default: 5

4.0.1.3 Simplex And Fletcher-Powell Mode Record Contents -

Simplex Mode Record Contents:

   1. Max. iterations of algorithm allowed,
   2. initial Simplex step size,
   3. convergence mode,
   4. convergence criterion "a",
   5. convergence criterion "b",
   6. trace mode,
   7. number of iterations between trace printouts.

Fletcher-Powell Mode Record Contents:

   1. Max. iterations of algorithm allowed,
   2. estimate of the minimum sum of squared errors obtainable,
   3. expected relative error of the parameters,
   4. trace mode,
   5. number of iterations between trace printouts.

4.0.1.4 Description Of Simplex And Fletcher-Powell "Mode Record" Variables: -

Variable                 Description
________                 ___________

Simplex specific:
_______ _________

Max. iterations          A good rule of thumb for this algorithm is 50
                         iterations plus 40 for each parameter to be fit.

Initial step size        This should be large enough to enclose the minimum
                         error point in parameter space. It is expressed as
                         a fraction of the initial parameter estimate.

Convergence mode         = 1, test for convergence will be on the RMS
                              relative deviation of the parameters of the
                              Simplex from those of the centroid.
                         = 2, test for convergence will be on the RMS
                              relative deviation of the least-squares errors
                              of vertices of the Simplex about that of the
                              centroid.
                         = 3, both of the above must be satisfied.

Criterion "a"            Maximum allowable RMS deviation of the parameter
                         values of the simplex vertices and centroid for
                         convergence under convergence modes 1 and 3.

Criterion "b"            Maximum allowable RMS deviation of least-squares
                         errors of the simplex vertices and centroid for
                         convergence under convergence modes 2 and 3.

Trace mode               = -1, no trace report.
                         =  0, same as 1 (default).
                         =  1, trace report on user's console.
                         =  2, trace report on printer.
                         =  3, trace report at both locations.

Trace period             Number of iterations to be performed between trace
                         printouts.

Fletcher-Powell specific:
_______________ _________

Max. iterations          A good rule of thumb is 10 iterations for each
                         parameter.

Est. of sum of squares   If this is not known use "0".

Est. of error            Not less than 10**(-6).

Trace mode               See Simplex specific.

Trace period             See Simplex specific.

5.0 TECHNICAL GLOSSARY

Centroid                 The centroid of all the vertices of the simplex but
                         the one with the largest least-squares error.

Convergence criteria     Conditions which must be fulfilled for an iterative
                         calculation to be considered to have converged.

Gradient search method   An optimization method which uses the gradient of a
                         function to locate the extrema of the function.
Parameter                Rate law constant to be calculated by PENNZYME which
                         will result in the best fit of the experimental data
                         by the rate law type.

Parameter space          The n-dimensional space defined by the n different
                         rate law parameter variables.

Record                   A collection of related items of data treated as a
                         unit.

Residual                 Difference between calculated and observed velocity
                         values.

Run-time                 The interval during which the program is executing.

Simplex                  The n+1 points in parameter space (where n is the
                         number of parameters) which are used by the Simplex
                         algorithm to locate the error function minimum.

Vector                   A variable type consisting of an ordered set of
                         values.

Vertices                 The individual points defining a simplex.

1.0 GRAPHICS COMMANDS

There are three types of commands currently implemented in the graphics system. They are the Setup, Graph-Type and Data-Selection commands. All command types use the same general input format to minimize the number of special cases the user must remember.

The Setup commands define the general physical appearance of the graph. They are used to specify the size of the graph, axis numbering and labelling, etc.

The data is organized as a table, each line containing the values of the variables to be graphed. For PENNZYME the data is organized as lines of data, the first columns being the various chemical concentrations. The next column contains the observed velocities at the chemical concentrations; the last column contains the standard deviations associated with the observed velocities.

The Graph-Type commands specify the variables (columns) to be graphed. Unless restricted by the Data-Selection commands, all the lines of data will be used when graphing. The Graph-Type commands thus specify the variables to be graphed and also the type of representation that is to be used. These include displaying data as discrete or connected points, with or without error bars and displaying several related curves at once as in the case of PENNZYME specific graphics.
The Data-Selection commands allow the user to define the range of data points to be graphed for each individual curve. The data points can be selected by input line using expressions of the following type: all points up to line 24, points from line 2 through line 32. They can also be selected by specifying a data set number. This selects one or several of the predefined data subsets such as those defined in PENNZYME. One can also select every line for which the value of a variable matches a specified value. For example, include all lines for which Mgconc = 3.0.

The complexity of the commands varies from Setup commands, usually a simple assignment statement (set x-axis length to 8 inches), to the Graph-Type commands, where the user must include variable names for the x, y and error components, separated by operators, as well as any optional switches. In between these two extremes lie the Data-Selection commands, where the user enters an expression, usually a series of line numbers separated by commas.

In all of the commands, the first word must be the command specifier. There are currently 15 of these including those implemented for PENNZYME. All of the commands, variables, operators and switches can be entered by the user as unambiguous abbreviations of their full representation. They are all separated by blank characters except for the variable names, which allow embedded blanks and therefore are terminated by an operator, a switch, etc. Command specifiers and switches can optionally be terminated by an equals sign. This makes for a more sentence-like command line when using the Setup commands and some switches where a value is being assigned ('XLENGTH = 8').

Variable names are those defined by the user either as specified to the stand-alone graphics or as specified by a host program. Additional names can be defined for quantities previously unnamed in the host system but graphed in the graphics system.
This is described later in the description of the interfacing routine. As an example, in PENNZYME some quantities were not assigned names by the user, since only chemical names are required. These were the observed and calculated velocities, the associated standard deviations and the residuals. They were then assigned the names 'VELOCITY', 'VCALC', 'ERROR' and 'RESIDUAL' in the interfacing routines.

Operators are used to separate variable names and to indicate the relationship between variables. For example, the '.VS.' operator indicates that the first variable is to be the ordinate and the second one the abscissa. The '.ERR.' operator uses the next variable to generate error bars. Operators must be enclosed by periods to indicate that they are not part of the variable names.

A switch is text entered by the user at the end of a command which modifies the value of a graph parameter. The parameters are initially set to default values and reset to these values when the switch is turned off. The effect of a switch is limited to the command to which it is appended. If it is appended to a setup command, it will stay active until turned off. When appended to a graph-type command, it is only active for that command and does not affect succeeding similar graph-type commands.

The general format of a switch is a slash (/) followed by the switch name, which is of the same format as the command specifiers. If required, this is followed by a value to be assigned to the switch. One or more switches may be used in a command, each being separated by a slash.

Switches are meant to be used to alter a convenient default value. If the default values are often inappropriate then they should be modified within the program. Defaults for setup commands, for example, are set once for the entire program whereas the default symbols for the graph-type commands are rotated through the eight special symbols.
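The input format described above (abbreviated command specifiers, variable names separated by operators enclosed in periods, and switches introduced by slashes) can be illustrated with a short parser. The following Python sketch is not part of the graphics system and makes several assumptions: the names `COMMANDS`, `resolve` and `parse_graph_command` are hypothetical, the command list contains only the specifiers named in this document, only the '.VS.' and '.ERR.' operators are recognized and must be typed in full, and a slash inside a quoted switch value is not handled.

```python
import re

# Command specifiers mentioned in this document (the full set may differ).
COMMANDS = ["XLABEL", "YLABEL", "XORIGIN", "YORIGIN", "XLENGTH", "YLENGTH",
            "DRAW", "POINT", "FITPLOT", "RESIDPLT", "CALC", "SELECT",
            "PLOT", "QUIT"]


def resolve(word, names):
    """Return the single name that `word` abbreviates, or raise ValueError."""
    word = word.upper()
    if word in names:
        return word
    matches = [name for name in names if name.startswith(word)]
    if len(matches) != 1:
        raise ValueError(f"{word!r} is unknown or ambiguous: {matches}")
    return matches[0]


def parse_graph_command(line):
    """Split a graph-type command into (command, variables, switches)."""
    body, *switch_parts = line.split("/")
    switches = {}
    for part in switch_parts:
        name, _, value = part.partition("=")
        switches[name.strip().upper()] = value.strip() or None
    specifier, _, rest = body.strip().partition(" ")
    # Variable names may contain blanks; they end at an operator such as .VS.
    pieces = re.split(r"\s*\.(VS|ERR)\.\s*", rest.strip(), flags=re.IGNORECASE)
    variables = {"Y": pieces[0].strip()}
    for operator, name in zip(pieces[1::2], pieces[2::2]):
        variables["X" if operator.upper() == "VS" else "ERR"] = name.strip()
    return resolve(specifier, COMMANDS), variables, switches
```

For instance, `parse_graph_command("FITPL VEL .VS. CHEMCON/SYM = 'X'")` resolves the abbreviation FITPL to FITPLOT, while an abbreviation such as P is rejected because it matches both POINT and PLOT.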
1.1 Overview Of Process Of Generating A Graph

The graphics were designed for use with a storage tube graphics display. This requires that the user enter all the commands and that they be processed before the graph is initiated; otherwise, partial erasure of the screen would be required, which is not possible on storage tubes.

In a typical graph, the user would not change the defaults for the axis lengths and origins. It might be desired, however, to label the axes, and this would be done using the label commands followed by a literal string containing the text of the label. All of the setup commands can be entered at any time prior to the actual graphing since they only take effect after all the other commands have been entered.

At this point the command to define a curve would be entered. In the simplest of cases this consists of the DRAW command:

DRAW Y .VS. X

Entering the 'PLOT' command will then display a graph using the entire range of points and connecting them with straight line segments. The scaling and numbering of the axes would be calculated to ensure the largest graph possible while maintaining reasonable axis numbering values. The appropriate axis labels would also be displayed.

The data selection command was designed to allow the user to define the range of data to be graphed and also to make specification of multiple curve graphs quicker. The typical 'SELECT' command would be of the type:

SELECT 1-15,16-18      Select rows 1 through 15 and 16 through 18 for display.

The select command must be entered after the graph-type command it is to modify. If several curves of the same variables were to be graphed then one could alternate between the graph-type and the select commands for each curve. For a family of 4 curves this would require 8 command lines. To reduce this, the graph-type command can be entered only once and each subsequent 'SELECT' command will refer to that graph-type command.
This reduces the number of commands to 5 lines, one for the graph-type followed by 4 'SELECT' commands. When this feature is used for displaying families of points, the symbol defaults are automatically rotated through the eight available special symbols for each set of points.

1.2 Description Of Graphic Commands And Their Effects

The Setup commands consist of three basic types: LABEL, ORIGIN, and LENGTH. Each of these must be prefixed by an 'X' or a 'Y' to specify the axis being set (e.g., 'XLABEL'). These are input in response to the prompt [] from the program.

LABEL command

The 'LABEL' command draws up to 50 characters (including blanks) of text centered below or alongside the designated axis. If no label command is entered, no label will be drawn. Once it is defined, it will appear on every succeeding graph until removed by a new label command.

[] XLABEL = 'X-AXIS'
[] YLAB = carriage return      (Remove YLABEL)

ORIGIN command

The 'ORIGIN' command permits the user to place the lower left corner of the graph at any distance in inches from the lower left corner of the physical screen. This is useful for moving the graph around to ensure best placement of the graph within the physical limits of the screen. This is required when using different terminals which have different characteristics. The 'ORIGIN' command does not affect the values at the origin of the variables graphed.

[] XORIGIN = 0.5
[] YOR = carriage return       (Reset to default value)

LENGTH command

The 'LENGTH' command specifies the length of the axis in inches from the point defined by default or by the 'ORIGIN' command. The switch 'NUMBER' may be used to ensure or inhibit ('NO NUMBER') the output of axis numbering.

[] XLENGTH = 7
[] YLEN = 6/NO NUMB

GRAPH-TYPE commands

These consist of the two basic commands, 'DRAW' and 'POINT', plus those defined specifically for use with PENNZYME: 'FITPLOT', 'RESIDPLT' and 'CALC'.
DRAW and POINT commands

The 'DRAW' command will generate a graph of y vs. x, using the data values of the first variable for y and those of the second variable for x. The two variables are separated by the '.VS.' operator. The 'DRAW' command connects each point by a line segment without representing the data point by a symbol. The 'POINT' command displays the data as unconnected symbols. If error bars are desired, then a third variable must be specified by the '.ERR.' operator.

[] DRAW YVAR .VS. XVAR
[] DRAW YVAR .VS. XVAR .ERR. STDEV
[] POINT YVAR .VS. XVAR

FITPLOT command

The 'FITPLOT' command is the one most often used by the PENNZYME user. It generates a graph like that specified by the 'POINT' command and then superimposes a curve of calculated velocities interpolated over the range of the chemical concentration variable specified. The velocities are calculated using the rate law function linked to PENNZYME. This graph only makes sense if the velocity is the ordinate and a chemical concentration the abscissa. All other chemical concentrations should be fixed over the range that is being graphed. No checking is performed by the program for nonsensical graphs. However, a warning message is issued when more than one chemical concentration is varied. The user then has the option of completing the graph or starting over.

[] FITPLOT VEL .VS. CHEMCON
[] FITPLOT VEL .VS. CHEMCON .ERR. STDEV

RESIDPLT command

The 'RESIDPLT' command is used to display the systematic error of a model and the systematic and random error of the experimental data. A graph is generated with the abscissa centered vertically and the data points displayed as symbols. The 'RESIDPLT' command should always have the residuals as the ordinate, the abscissa being any of the chemical concentrations or the observed or calculated velocity. Dashed lines are also graphed at plus or minus the 95% confidence t-statistic times the residual error.
This flags outlying data points.

[] RESIDPLT RESID .VS. CHEMCON
[] RESIDPLT RES .VS. VELOC     (The observed velocity)
[] RESID RES .VS. VCALC        (The calculated velocity)

CALC command

The 'CALC' command is similar to the 'FITPLOT' command and is used when more than one chemical concentration is varying at once. 'CALC' does not interpolate over the chemical concentrations but rather displays the calculated velocity at the experimental chemical concentrations and then connects these with line segments.

[] CALC VEL .VS. CHEMCON
[] CALC VEL .VS. CHEMCON .ERR. STDEV

1.3 Switches For The Graph-Type Commands

SYMBOL switch

Instead of representing data values as discontinuities in a curve, this switch allows the user to represent the data points as symbols. Any keyboard character or one of the eight special symbols can be used. All the points of the set or curve will have the same symbol. This switch is also useful for modifying the default symbol value of the 'POINT' command. A keyboard character will be used if it is enclosed by single quotes. The special symbols (square, diamond, etc.) are selected by using an integer in the range 1 through 8. These may not be available for all installations.

[] DRAW VEL .VS. CHEMCON/SYMB = '*'      use an asterisk
[] POINT YVAR .VS. XVAR/SYMB = 2         use the second special symbol

SIZE switch

The 'SIZE' switch allows the user to specify the size of the symbols in inches if these are present. This is useful when displaying error bars since error bars are only drawn when their vertical dimension exceeds the vertical dimension of the symbol. Altering the size of the symbol allows control over the error bars that are to be shown.

[] FITPL VEL .VS. CHEMCON/SYM = 'X'/SIZE = 0.2

SPACE switch

The 'SPACE' switch allows the user to specify the amount of space left around a symbol. This prevents a connecting line segment from traversing the symbol.
This does not stop unrelated line segments from traversing the symbol. The space is given in inches. This switch does not apply to commands which do not connect the data values.

[] DRAW YVAR .VS. XVAR/SYM=2/SPACE=0.2

1.4 Data-Selection Commands

There are currently three ways to select data to be graphed. All three use the command specifier 'SELECT' and vary in the type of arguments expected.

The 'SELECT' command followed by the symbol '#' indicates that data input is to be selected by line number. The lines are selected by expressions involving line numbers separated by commas, a range of lines specified by bounds, and a range of lines specified by an inequality operator.

[] SELECT #1-5               Selects lines 1 through 5
[] SELECT #>6                Selects lines 7 through end of data
[] SELECT #<20               Selects lines 1 through 19
[] SELECT #1,2               Selects lines 1 and 2
[] SELECT #1,2,6-9,11,>17    (The various modes can be mixed when separated by commas.)

The 'SELECT' command followed by a variable name indicates that data will be selected if it satisfies a constraint. The only constraint currently allowed is that the data equal a value specified. The current implementation will only select the first contiguous set of data lines satisfying the constraint.

[] SEL CHEMCON .EQ. 52       Select the line(s) where the variable CHEMCON has the value 52.

The 'SELECT' command followed by the symbol '$' indicates that data ranges will be selected according to predefined data subsets. These data sets cannot be defined within the graphics. They must be defined by a host system, such as PENNZYME, which allows the user to do this. Several of the data subset indices can be used in the same SELECT command as long as they are separated by commas.

[] SEL $2                    Select Set number 2
[] SEL $2,5                  Select Sets 2 and 5

PLOT command

The 'PLOT' command is the last command entered and is used to initiate the display of the graphics.
This should be entered once all the desired graph parameters have been set for the current graph. Output can be sent to a hard copy plotter by following the 'PLOT' command with the word 'PLOTTER'.

Example
[] PLOT = PLOTTER

Entering a carriage return will erase the graph and allow the user to define a new graph. The previous graph may be redisplayed simply by entering the 'PLOT' command again, if no Graph-type or Data-selection commands have been entered since. The Setup commands may be used to vary the layout of the graph. Similarly, when the graphics is used as an option within a host program, upon reentering the graphics the last graph can be regenerated by entering the 'PLOT' command. The new graph will then use the new data values if these have been modified by the host program. With PENNZYME, one can display a quality of fit graph, exit the graphics and optimize the kinetic parameters, enter the graphics and display the new quality of fit graph using the optimum kinetic parameter values.

QUIT command

The QUIT command is used to exit from the graphics program.

2.0 EXAMPLES OF GRAPHICS COMMANDS AND RESULTING GRAPHS

[] DRAW YVAR .VS. XVAR
[] PLOT

[Graph: the data points connected by straight line segments.]

[] DRAW YVAR .VS. XVAR/SYM = O
[] PLOT

[Graph: the same curve with each data point marked by the symbol O.]
Use symbol O to denote point.

[] DRAW YVAR .VS. XVAR
[] POINT UVAR .VS. TVAR
[] PLOT

[Graph: one connected curve and one set of unconnected x symbols.]
Two different sets of variables using different representations may be graphed simultaneously.

[] DRAW YVAR .VS. XVAR/SYM = O
[] SELECT 1-3,5,6
[] PLOT

[Graph: a connected curve through the selected points, each marked O.]
The select command is used to include or exclude data values from the curve defined by the preceding graph-type command.

[] DRAW YVAR .VS. XVAR
[] SELECT 1-3,5,6
[] POINT UVAR .VS. TVAR
[] PLOT

[Graph: a connected curve through the selected points and a full set of unconnected x symbols.]
The select command does not affect graph-type commands which follow it.

[] DRAW CVAR .VS. DVAR
[] SELECT 1-5
[] SELECT 6-10
[] PLOT

[Graph: two distinct connected curves.]
The select command when used repeatedly generates distinct curves based on the values set by the preceding graph-type command. This is used to display families of curves if the data is organized in that fashion.

2.1 Graphics Commands Specific To PENNZYME

[] FITPLOT VEL .VS. MGADP
[] SELECT 1-5
[] PLOT

[Graph: observed points marked O with a superimposed calculated curve.]
Display observed velocities and superimposed curve of calculated velocities at interpolated concentrations. This only makes sense if the first variable is the velocity variable.

[] FITPLOT VEL .VS. MGADP .ERR. STDEV
[] SELECT 1-5
[] PLOT

[Graph: observed points marked O with error bars and a superimposed calculated curve.]
Displays error bars. These are drawn only if they exceed the size of the symbol. Vertical lines are missing from error bars because they cannot be represented by print symbols. (The points do not fit the line so there is room for the error bars.)

[] FITPLOT VEL .VS. MGADP
[] SELECT 1-5
[] SELECT 6-10
[] PLOT

[Graph: two sets of observed points, marked O and x, each with its own calculated curve.]
Multiple curves.

[] CALC VEL .VS. MGADP
[] SELECT 1-5
[] PLOT

[Graph: observed points marked O with calculated velocities joined by line segments.]
Displays calculated velocities not interpolated over concentrations.

[] RESIDPLT RESID .VS. MGADP
[] PLOT

[Graph: residuals marked O about a horizontal axis, with dashed limit lines above and below.]
Uses one symbol to denote data values of entire data. This only makes sense if the first variable is the Resid variable.

[] RESIDPLT RESID .VS. MGADP
[] SELECT 1-5
[] SELECT 6-10
[] PLOT

[Graph: residuals marked O and x about a horizontal axis, with dashed limit lines above and below.]
Uses a different symbol for each data subset.
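The line-number form of the 'SELECT' command described in section 1.4 can be illustrated with a small Python sketch. It is not taken from the graphics system: the function name `select_lines` is hypothetical, and clipping the selection to the number of data lines available is an assumption. The treatment of '>' and '<' as exclusive bounds follows the examples given in section 1.4 ('#>6' selects lines 7 through the end of data, '#<20' selects lines 1 through 19).

```python
def select_lines(expression, n_lines):
    """Expand an expression such as '#1,2,6-9,11,>17' into line numbers.

    expression -- the argument of SELECT, starting with '#'
    n_lines    -- number of data lines available (1-based numbering)
    Returns a sorted list of the selected line numbers.
    """
    if not expression.startswith("#"):
        raise ValueError("line selection must start with '#'")
    chosen = set()
    for term in expression[1:].split(","):
        term = term.strip()
        if term.startswith(">"):          # strictly after the given line
            first, last = int(term[1:]) + 1, n_lines
        elif term.startswith("<"):        # strictly before the given line
            first, last = 1, int(term[1:]) - 1
        elif "-" in term:                 # inclusive range low-high
            low, high = term.split("-")
            first, last = int(low), int(high)
        else:                             # a single line number
            first = last = int(term)
        chosen.update(range(max(first, 1), min(last, n_lines) + 1))
    return sorted(chosen)
```

Because the terms are collected in a set, overlapping terms such as '#1-5,3-8' select each line only once.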