RT-11 MULTIPROCESSOR  USER'S GUIDE  (Version  7-Nov-85 NKGAZG)  PAGE






            M       M   PPPPPPP                 1         1      
            MM     MM   P      P               11        11      
            M M   M M   P      P              1 1       1 1      
            M  M M  M   P      P                1         1      
            M   M   M   PPPPPPP                 1         1      
            M       M   P                       1         1      
            M       M   P                       1         1      
            M       M   P                       1         1      
            M       M   P                     11111     11111    







			*************************
			* MULTIPROCESSOR RT-11  *
			*			*
			*     USER'S GUIDE	*
			*************************




		1.  Introduction

		2.  Multiprocessor Components

		3.  Multiprocessor Utilities Overview

		4.  Device allocation

		5.  Handlers and Pseudo-Handlers

		6.  Getting started

		7.  Bootstrapping

		8.  Error messages and trouble shooting



1. INTRODUCTION.
----------------

This manual is a short guide to using the RT-11 MULTIPROCESSOR package
developed at the University Hospital Groningen. The first version (V3)
is described in full detail in:


  "A MODULAR DATA COMMUNICATION PACKAGE PROVIDING A MULTIUSER ENVIRONMENT
   AND PARALLEL PROCESSING",
  Proceedings DECUS EUROPE, Coventry U.K., Sept. 1982.

The package provides a datacommunication facility between RT-11 (V4 and
higher) systems (LSI or PDP-11) and allows transparent use of remote
devices such as disks, lineprinters etc.

A second public version (V5), requiring RT-11 V5.0 or higher, is described in:

  "MULTIPROCESSING AND HIGH SPEED DATACOMMUNICATION UNDER RT-11"
  Proceedings DECUS U.S.A., New Orleans, May 1985.



An RT-11 DISK DATA CACHE is very useful in a multiprocessor environment.
Details and application are described in:

  "DISK USAGE ANALYSIS and DISK DATA CACHING under RT-11"
  Proceedings DECUS EUROPE, Zuerich, August/Sept. 1983. 

and

  "THE DISK DATA CACHE UNDER RT-11"
  Proceedings DECUS U.S.A., New Orleans, May 1985. 


NOTE:
-----

  In the remainder of this manual DC stands for datacommunication. The
  manual always describes the latest version of MP-11.

  RSP stands for the Radial Serial Protocol, which DEC uses for
  communicating with the TU58 device. The protocol used for MP-11 is
  derived from RSP.


2.  MULTIPROCESSOR COMPONENTS 
-----------------------------

  
$
  The following hardware is currently supported:

	1. Parallel words: DRV-11, DR-11 C, DR-11 K
	2. Serial,  words: WB-11, WBV-11
	3. DMA	   blocks: Qnector (Westvries Systems b.v., NL)

  Suppose we have two systems coupled by DR(V)-11 hardware:

@+
  /----------------\				/-----------------\
  !	RT-11 /----!				!	RT-11	  !
  !	      !JOB !--\	   DR(V)-11 link     /--!		  !
  ! DCJOB     !Hnd !  !======================!	!  DR handler	  !
  !	      \----!--/		Data	     \--!		  !
  !"service job"   !	 <---------------->	! "DC handler"	  !
  !    site        !	      Commands		!	  site	  !
  \----------------/	 <----------------	\-----------------/
@-

Note that the DR-handler site issues the commands for I/O transfers.
The following files should then reside on the system device of each site:

$
   DCJOB site
   ==========

   DCJOB      : job (there may be one other version, e.g. DCJOB.SPD
                for Special Dir. support)

   DJ.SYS     : job handler, drives the DC hardware at the job site

   DJBOOT.SYS : boot program with password access

   JBINFO.DAT : job data file and mailbox

   JOBDEF.SAV : defines the list of remote available devices, their
                default access status and type, number of jobs
                running simultaneously

   MAIL.SAV   : communicates with mailbox and job data,
                message facility

   JSHOW.SAV  : disk/device overview, nr. I/O's, error count,
                read/write access


   DC Handler site
   ===============

   DR.SYS     : data communication handler (behaves like a disk)

   Pseudo-handlers, e.g. LP.SYS, HL.SYS for HELLO

   HELLO.SAV  : shows remote available devices, device
                characteristics, access status, message facility;
                checks remote/local date

   JBDATE.SAV : fetches remote date & time

   WATCH.REL  : scans the remote mailbox at regular time intervals
                for changes in contents
$

In general the job site and the DC handler site have separate system
disks. In the case of a remote system disk, both disk units will be
physically located at the job site. However, it is also possible for both
systems to use the same system disk by using a disk cache at the DC
handler site.

The JSHOW utility shows the list of physical devices, their sizes and
characteristics, and the read/write access. The read/write access to the
devices may be changed. The access status is valid for the user at the
"DC handler site", not for the user at the job site! However, it is very
important that only ONE site has WRITE access to a given disk unit. If
both sites write more or less simultaneously to the same disk unit,
directory corruption may occur!
At the job site a disk unit can be protected against writes by the local
user by assigning the disk unit to the job that needs write access to it,
e.g. :

   .LOAD DL2:=DCJOB0

  or, in case of logical disks, by using the SET command, e.g.

   .SET LDx NOWRITE  (x=0-7)



$
3.  MULTIPROCESSOR UTILITIES OVERVIEW
-------------------------------------

JSHOW
	Shows the DC jobs running (max. 7), the number of I/O's received and
	an error report, the devices available for remote access and their
	Read/Write access status. The R/W status can be changed.

JBHOLD  FORTRAN callable functions with which a BG program can block DCJOB
        activity. Useful when e.g. a BG program performs high speed A/D
        acquisition.

  	IERR=JBHOLD ; Initializes job holding
		    ; --> Should be called only ONCE before calling
		    ;     the following routines:

  	IERR=JBSPND ; Suspends job running in the Foreground (F)
  	IERR=JBRSUM ; Resumes  job running in the Foreground (F)
		    ; --> For each call to JBSPND a matching call to JBRSUM
		    ;     should be executed!
	(NOTE:  as of Sept. 85 there is still a serious bug in DEC's ABORT
		I/O code of the resident monitor!)
MAIL
	Puts messages for DC jobs and general news in mailbox.
$
	Attention:
	----------
	Message size is restricted to 480 characters.

HELLO
	To be run at the DC handler site. Displays the remote available
	devices, their characteristics and the R/W access. The device names
	are the physical names, so you can "see through" logical assignments
	made at the remote configuration! Reads the news and messages in the
	mailbox, and sends a message to the service jobs, where it is put in
	the mailbox and printed on the console terminal. The pseudo
	device handler HL should be installed and have the appropriate
	settings (RSP vector, RSP unit no.). Compares remote and local
	DATE&TIME and reports differences (TIME difference should be
	more than about 20 min.).
WATCH
	May be run at the DC handler site. Every 10 sec. it scans the status
	of the DC job and the contents of the mailbox for changes. Uses HL:
	for I/O (see HELLO).

JOBDEF	Defines, by interactive query, the remote available devices and
	their default access status. Should be run whenever the device
	list or default access status is changed or when a bootstrap
	program is added for a memory-only system.

JBDATE
	Fetches the remote DATE&TIME and sets them locally. Very useful in a
	startup command file if a remote DATE&TIME is normally present at
	startup.


General purpose programs:

DEVICE  Prints device characteristics in local system.
FREE    Prints sizes, no. files & free blocks of all random access
	devices in the local system.

$
4.  DEVICE ALLOCATION
---------------------

When a job starts running it attempts to open an I/O channel to every
device specified in the device list. This means that, when a handler is
not loaded, the I/O channel will not be opened and the device will not be
available to the remote site(s).

 IMPORTANT: handlers to which an I/O channel has been opened may not
	    be unloaded as long as the DC job(s) run!

Devices in the job's device list can be renamed using the logical
assignment procedure.

 NOTE:	    logical assignments only affect the device allocation
	    when they are made BEFORE startup of the job(s)!

The JSHOW utility also examines whether the devices in the list can be
allocated within the running system. During this examination it also
presents the identifier and characteristics (such as special function
support, variable size etc.) of each device. From these data it can be
seen whether a logical assignment is in effect (and thus was made before
the jobs were started). In addition to the job's device allocation scheme,
JSHOW also examines, for random access devices (disks) only, whether they
really exist by issuing "dummy" read requests to these devices. Device
units that have no drive, or whose drive is not ready (think of
removable disks), are marked with an additional question mark ('?') in
the R & W access overview. At the remote site this mark is presented by
HELLO.

 NOTE:	    the additional random access device examination is only
	    done when JSHOW is run! So JSHOW should be run each time
	    a drive is turned off, and each time a drive that was off
	    is turned on.

$
5.  HANDLERS and PSEUDO-HANDLERS
--------------------------------

The datacommunication handlers and pseudo-handlers are simple to use, as
they are used in the same way as handlers for local devices!

All handlers know the SET XX SHOW command, which shows you the values
of the conditionals (octal!) in the handler and other useful
information. The SET commands in the pseudo handlers are:

	.SET XX RSPVEC=nnn

	.SET XX UNITS=n

	.SET XX UNIT0=n

UNITS is the number of device units the handler supports. E.g. if UNITS
is set to 2, the handler only supports requests for units 0 and 1.
Requests to other device units will return immediately with a hard error.

UNIT0 is the RSP unit start number of the handler. So handler unit
number 0 corresponds to the UNIT0 setting, handler unit 1 corresponds
to UNIT0+1, etc. In this way you can change access to a remote device
by changing this number. E.g. when the following remote devices (list
as displayed by JSHOW and HELLO) are available:

$$$
	Central device        DCJOB1
	-----------------------------
	 0 RK0: System/fixed  R   -
	 1 RK1: (Fixed RK05)  R   W
	 2 RK2: (Removable)   R ? -
	 3 RK3:                 N
	 4 RK4:                 N
	 5 LD1:               R ? -
	 6 LD2:                 N
	 7 DM: (RK07)         R   -
	 8 JBINFO (Mailbox)   R   W
	 9 HELLO                N
	10 SP: (LP: SPOOLER)  -   W   
	11 SP1:(LP: wide )    -   W
	12 SP2:(LP: quality)  -   W
	13 SP3:( Plotter )    -   W
	14 MT: (Magtape)        N
	15 JOB - handler      R   W
	-----------------------------
	(R = Read, W = Write access)

Then a pseudo LP handler should be set:

	.SET LP UNITS=4
	.SET LP UNIT0=10

in order to access the printing/plotting devices no. 10 - 13 !
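
The UNITS/UNIT0 mapping described above can be sketched as follows. This
is only an illustration in Python, not part of the package: the function
name rsp_unit is hypothetical, and returning None merely stands in for the
hard error that the handler reports.

```python
def rsp_unit(handler_unit, units, unit0):
    """Map a pseudo-handler unit number to a remote RSP unit number.

    units  -- value given with SET XX UNITS=n (number of units supported)
    unit0  -- value given with SET XX UNIT0=n (RSP unit of handler unit 0)
    Returns None where the real handler would return a hard error.
    """
    if not 0 <= handler_unit < units:
        return None
    return unit0 + handler_unit


# Settings from the example: .SET LP UNITS=4 and .SET LP UNIT0=10
for lp_unit in range(5):
    print("LP%d: ->" % lp_unit, rsp_unit(lp_unit, units=4, unit0=10))
```

With these settings LP0: to LP3: reach the remote devices no. 10 - 13,
while a request to LP4: falls outside the supported range.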

RSPVEC is the vector of the DC handler which drives the DC hardware using
the RSP protocol and to which the pseudo handler belongs. E.g. when we
have a system communicating with two separate remote systems, one, an
LSI-11/23, accessed by a DC handler QN: having vector 170 and the other, a
PDP-11/34, by DR: having vector 310, then by setting:

	.SET HL RSPVEC=170	! HL: is pseudo handler used by HELLO

the LSI-11/23 is accessed by the HELLO program and after:

	.SET HL RSPVEC=310

the PDP-11/34 is accessed! With RT-11 V5.2 the following useful commands
could be defined in this case:

	HELL23 :== SET HL RSPVEC=170\HELLO\\
	HELL34 :== SET HL RSPVEC=310\HELLO\\

The DC handlers driving the hardware, the so-called DC/RSP handlers,
recognize the same SET commands except for UNIT0 and RSPVEC. The UNIT0
value is set to 0 by default but may be changed by editing the handler
source. Hardware I/O page addresses and vectors are also defined in the
sources. All DC handlers are defined as random access with VARIABLE SIZE.
This ensures that the size of a (disk) volume is always correct, even when
volumes of different sizes are exchanged while DC jobs are running!
  
Use the FREE utility to inspect the sizes of mounted volumes.

When the DC-handlers are generated within a system with "device time-out"
support, they also recognize the option:

	.SET XX TIME=t

Specify the time-out value t in decimal, in units of 0.1 sec. (50 Hz.).
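
As a worked example of these units: a SET XX TIME=t value stands for t/10
seconds, which on a 50 Hz line clock corresponds to 5*t clock ticks. The
helper below is a hypothetical Python illustration of that arithmetic only;
how the handlers count time internally is not described in this manual.

```python
CLOCK_HZ = 50  # line-clock frequency quoted in the text


def timeout_from_setting(t):
    """Return (seconds, clock ticks) for a SET XX TIME=t value."""
    seconds = t / 10.0            # t is given in units of 0.1 sec.
    ticks = t * CLOCK_HZ // 10    # 0.1 sec. is 5 ticks at 50 Hz
    return seconds, ticks


# TIME=30 corresponds to a time-out of 3 seconds
print(timeout_from_setting(30))
```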

Note that you can only use a pseudo handler when the DC-handler which
drives the hardware is loaded! Otherwise a hard error will be returned
on each call to the handler. 

  Attention: Pseudo handlers CANNOT be used for bootstrapping!
	     (The bootstrap cannot load the appropriate DC handler!)
	     Therefore the RSP units, and hence the device units, that
	     can be booted are restricted to the range 0 - 7.



6.  GETTING STARTED
-------------------

The software is simple to use and control. If you want to run only one
DC job you should use the Foreground/Background (FB) monitor.
To run 2-7 DC jobs, or to run e.g. QUEUE or other foreground/system jobs
simultaneously with one DC job, you need the FB monitor with system job
support. A monitor with the so-called "device time-out" support is
recommended; without it, time-out will not be supported.

Before a DC job is started, ensure that its handler is installed. Then
type the SET XX SHOW command, where XX is the name of the job handler.
Now you can verify several settings and hardware addresses. All DC jobs
are numbered with a number between 0 and 7. The number is stored in the
job handler (device identifier = jobnumber + 300). The first job to run
is set to number 0 with the command SET XX JOBNUM=0, the second to 1 with
SET YY JOBNUM=1, etc. These numbers have to be set correctly, as the
utilities use them to find the correct job settings in the job info file
SY:JBINFO.DAT. Once these "job-numbers" have been set, they should not be
changed!
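
The relation between job number and device identifier can be illustrated
as below. The sketch assumes that the value 300 is octal, as is usual for
RT-11 device identifiers; the manual does not state the radix, and the
function name is hypothetical.

```python
ID_BASE = 0o300  # assumption: the "300" in the text is an octal value


def job_device_id(jobnum):
    """Return the device identifier for a DC job number (0-7)."""
    if not 0 <= jobnum <= 7:
        raise ValueError("DC job numbers run from 0 to 7")
    return ID_BASE + jobnum


for n in (0, 1, 7):
    print("JOBNUM=%d -> identifier %o (octal)" % (n, job_device_id(n)))
```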

Now load the job handler and assign it to the logical name JOB:

	.LOAD DJ
	.ASS  DJ JOB

The DC job can be started:

	.FRUN/BUFFER:nnnn DCJOB     (or SRUN/BUFFER:nnnn)

The number nnnn is the extra buffer space to be allocated to the job.
When /BUFFER:nnnn is omitted only a fixed buffer space of 256 words
(1 disk block) is used.
When a second DC job has to be started the procedure is repeated. Assume
that the second job uses a job handler with name DI. Then:

	.LOA DI
	.ASS DI JOB
	.SRUN/BUFFER:nnnn/NAME:DCJOB1 DCJOB

Note that although the same DC job program is used, it must be given
another logical name!
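
The start-up sequence for several jobs follows a fixed pattern, sketched
below as a small Python generator of monitor command lines. It is an
illustration only: the function name, the handler list and the buffer size
are hypothetical, and the job name DCJOBn for the later jobs is an
assumption extrapolated from the DCJOB1 example above.

```python
def startup_commands(handlers, buffer_words):
    """Build the monitor commands that start one DC job per job handler.

    handlers      -- job handler names in job-number order, e.g. ["DJ", "DI"]
    buffer_words  -- value for /BUFFER:nnnn (hypothetical, site dependent)
    """
    lines = []
    for jobnum, name in enumerate(handlers):
        # The job number is set in the handler before it is loaded.
        lines.append("SET %s JOBNUM=%d" % (name, jobnum))
        lines.append("LOAD %s" % name)
        lines.append("ASSIGN %s JOB" % name)
        if jobnum == 0:
            lines.append("FRUN/BUFFER:%d DCJOB" % buffer_words)
        else:
            # Later jobs reuse the DCJOB program under another job name.
            lines.append("SRUN/BUFFER:%d/NAME:DCJOB%d DCJOB"
                         % (buffer_words, jobnum))
    return lines


print("\n".join(startup_commands(["DJ", "DI"], 512)))
```

Generating the lines from one list keeps the job numbers, the handler
names and the job names consistent with each other.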

Now run the utility JSHOW to check the devices allocated to the job(s)
and the R/W protection. As a rule, JSHOW should always be run after
startup of the job(s), as this utility updates the device lists in
JBINFO.DAT (devices and/or device characteristics may have changed, e.g.
due to new logical assignments).

7.  BOOTSTRAPPING
-----------------
  
When you have a running RT-11 system, you can bootstrap via a DC handler
with the command BOOT DC:.  Before doing so, make the disk bootable with
the command COPY/BOOT DC:RT11FB DC: or, at the site where the DC job runs,
COPY/BOOT:DC DSK:RT11FB DSK:, where DSK: is the handler name for the disk.
If you do not have a running RT-11 system (e.g. a "memory only" system)
with a DC link, then check that the remote disk is bootable for the DC
handler and that the remote job has BOOT support (normally the default).

  Booting:

	1. Activate the BOOT PROM at its correct start address, if you
	   have no automatic boot on power-on. If you do not yet have a
	   suitable PROM, use a toggle-in boot (see System Manual).

	2. In the case of a boot program with password access, enter the
	   correct password for your processor and thereafter the RSP unit
	   number of the bootable remote disk (preferably 0).


8.  ERROR MESSAGES AND TROUBLE SHOOTING
---------------------------------------

  8.1 ERROR MESSAGES:

  When a DC job prints "No JBINFO.DAT" or "No JOB:" at startup, this
  means respectively that there is no file SY:JBINFO.DAT (check with
  DIR SY:.DAT) or that no handler has the logical assignment JOB:.

  When a job is running, messages printed have the following general
  format:   ab-c-DCJOBx 
  where

	x     	is the number of the job which printed the message.

	c	is the condition under which the error occurred:

		C  = while receiving a command packet
		R  = while receiving a data packet
		S  = while sending a data packet
		E  = while sending an end packet
		F  = while processing special function request.
		I  = no error but a message:
		    e.g. ST-I-DCJOBx, job start message

	ab	is detailed error/message specification:

		NU = No USR available (special direc. devices only)
		CK = Checksum error
		HD = Protocol error
		IN = Initialize job command received
		BT = Bootstrap request received
		BF = Bootstrap request failed
		NB = Not enough Buffer space for executing a Special
		     Function request received. You can avoid this
		     error by increasing the job's buffer space by
		     restarting with FRUN/BUFFER:yyyy DCJOBx

		SP = Special directory device (e.g. Magtape) request
		     received while the job has no code for processing it.
		     (use another DCJOB program which has the code)

		ST = Start of DC job
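
The message format above is regular enough to decode mechanically. The
following Python sketch is an illustration only: the function and table
names are hypothetical, it is not part of the package, and it assumes that
the job number x is a single digit 0-7 directly after "DCJOB".

```python
import re

CONDITIONS = {
    "C": "while receiving a command packet",
    "R": "while receiving a data packet",
    "S": "while sending a data packet",
    "E": "while sending an end packet",
    "F": "while processing special function request",
    "I": "no error but a message",
}

SPECIFICATIONS = {
    "NU": "No USR available",
    "CK": "Checksum error",
    "HD": "Protocol error",
    "IN": "Initialize job command received",
    "BT": "Bootstrap request received",
    "BF": "Bootstrap request failed",
    "NB": "Not enough Buffer space",
    "SP": "Special directory device request, no code for it",
    "ST": "Start of DC job",
}

# General format: ab-c-DCJOBx
MESSAGE = re.compile(r"^([A-Z]{2})-([A-Z])-DCJOB([0-7])$")


def decode(message):
    """Return (job number, specification, condition), or None if the
    text does not have the ab-c-DCJOBx format."""
    match = MESSAGE.match(message.strip())
    if match is None:
        return None
    spec, cond, job = match.groups()
    return (int(job),
            SPECIFICATIONS.get(spec, "unknown specification"),
            CONDITIONS.get(cond, "unknown condition"))


print(decode("ST-I-DCJOB0"))
print(decode("CK-R-DCJOB1"))
```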

  Normally an error is not fatal, as the DC job returns a hard error to
  the requesting site. However, in some more serious cases, e.g. protocol
  errors, the job tries to recover by generating an internal reset. As a
  result synchronisation with the DC handler may be lost. In this case the
  next request of the handler will probably result in a hard error, and
  the request should therefore be repeated. It should be noted that
  protocol errors occur only very rarely.


  8.2 TROUBLE SHOOTING:

	1. Is the remote job running? Use the SHOW JOBS monitor command and
	   the program JSHOW as a check. Does the remote job have the
	   required features (BOOT and SPDIR support are optional)? All
	   devices which are not special directory devices (such as
	   MAGTAPE) and to which access is needed during the runtime of
	   the job should be loaded before the job is started (FRUN or
	   SRUN jobname).

	2. Are you aware of the appropriate device names and RSP unit
	   numbers? Use HELLO or JSHOW for an overview. Is the access
	   status correct?

	3. Check the RSP units which a device XX supports by executing
	   SET XX SHOW. The printout also tells you whether XX is a
	   pseudo-handler, as only pseudo handlers print: RSPvec=yyy. You
	   can only do successful I/O transfers if the DC/RSP handler to
	   which the pseudo handler belongs is LOADED. Otherwise the
	   pseudo handler reports hard I/O errors.

	4. Are JBINFO.DAT (job data file & mailbox) and xxBOOT.SYS
	   (for booting with password) present on the system disk of the
	   job site? Here xx stands for the name of the job handler driving
	   the DC hardware (e.g. DJ: --> SY:DJBOOT.SYS).
	   Is the JBINFO file initialized? This is done when the program
	   JOBDEF has been run at least once. JOBDEF should also be run
	   when a new boot program (xxBOOT.SYS) is placed on the system
	   disk. Does the HL pseudo handler unit (UNIT0), which
	   communicates with the remote mailbox, equal the message unit
	   number defined by the program JOBDEF and displayed by JSHOW?
 
	5. For EACH I/O access to a special directory device, it is
	   required that at the job site SET USR NOSWAP is in effect.

	6. AGAIN: Pseudo-handlers can only operate when the DC-handler to
	 	  which they belong is LOADED. Otherwise they return a
		  normal hard error.

	7. Have tests executed to check the hardware data link.

			*********************************
                                   