
                        MAY Version 2.10
                        ================
          A Distributed Processing Package for Easing the 
      Developemnt of Algorithms with Coarse-Grained Parallelism 
      ---------------------------------------------------------
                       Linas Vepstas, 1993
                       -------------------


Programmer's Reference
======================

mayCreate
=========
Starts a remote process 

Syntax:
-------
struct mayProcId * mayCreate (requested_host_name, 
                              requested_program_name
                              requested_connection_type);
char 	*requested_host_name;
char	*requested_program_name;
int	requested_connection_type;

Arguments:
----------
-- requested_host_name 
   pointer to a NULL terminated character string.  It can be a NULL
   pointer; this indicates that the process is to be started on the 
   local machine.

   The host name is assumed to be a valid internet hostname.  Standard
   name resolution techniques are used to locate the host (gethostbyname(),
   which queries nameservers, examines /etc/hosts, etc.)

-- requested_program_name
   pointer to a NULL terminated character string.  This must be a valid,
   full file pathname to the program to be executed, including
   possible arguments and flags. For example: /u/johniac/myprog -ac
   or /bin/rm -r *. (execv() is used to run remote program.)

-- requested_connection_type
   takes one of two values: MAY_CREATE_TCP or MAY_CREATE_UDP. Depending 
   on which value is passed, the child proccess is created with the 
   indicated  connection type.

Returns:
--------
Upon return, mayCreate returns a pointer to the "address" of the 
child process.  This address can be used with the maySend command 
to send messages to the child.  The memory area for this address is
allocated by MAY and is unique (you do not need to copy it to use it).
You should not modify its contents or free it.

Description:
------------
The mayCreate routine starts the indicated process on the indicated
machine.

A maydaemon must be running on the indicated machine.  The file path
name must indicate a valid file in the remote machine's file system.
This file must have execute priviledges (chmod ugo+x <filename>).

If a maydaemon on the remote machine is not responding, mayCreate will
return with NULL after a twenty second timeout.

Error Returns:
--------------
NULL is returned if the create failed.  An error message is printed on
stderr.  

Ideas for Future Enhancements:
------------------------------
1) In the current implementation, mayCreate() does not handle
   dotted-decimal notation IP addresses.  Such a capability should be 
   added in case a network nameserver is unable to located a named
   machine.

2) Error messages are currently printed to stderr.  A more useful and
   elegant error handling policy would use the errno mechanism provided
   by ANSI C (or a similar service), so that users of MAY could design
   more robust error recovery proceedures.

3) Currently, mayCreate() only offers a synchronous creation service.
   That is, mayCreate does not return until the indicated program has
   started on the remote machine, and an acknowledgement has been
   returned.  This can introduce excessive overhead into the system if
   multiple mayCreates need to be performed.

   An asynchronous interface could be implemented by providing
   mayCreateRequest() and mayCreateReply() routines.  This could, in
   fact, be done quite esily by taking the existing mayCreate() code and
   splitting it up into two pieces. mayCreate() itself could then be
   trivially implemented as a mayCreateRequest() followed by a
   mayCreateReply().

4) Find, document, and provide an interface that would allow the
   creation timeout to be set by a MAY user.



mayCreateFarm
=============
Initializes a processor farm

Syntax:
-------
void mayCreateFarm (num_machines, 
                    machine_names, 
                    file_names,
                    connection_type)

int     num_machines;      /* number of available processors */
char    *machine_names[];  /* array containing mamachine names */
char    *file_names[];     /* array containing file paths */
int     connection_type;   /* MAY_CREATE_UDP or MAY_CREATE_TCP */

Arguments:
----------
-- num_machines
   The number of machine and file names in the arrays machine_names,
   file_names.

-- machine_names 
   a pointer to an array of strings naming the machines on which
   processes are to be started.  Machine names must be valid internet
   machine names.

-- file_names
   a pointer to an array of strings specifying full file pathnames on
   the indicated machines. The files must be executable. The strings may
   include command-line flags.

-- connection_type
   takes one of two values: MAY_CREATE_TCP or MAY_CREATE_UDP. Depending 
   on which value is passed, the child proccesses are created with the 
   indicated  connection type.


Description:
------------
This routine performs initialization for the mayProcFarm routine.

Error Returns:
--------------
None.  Failure to connect to remote daemons is indicated by error
messages printed to stderr.

Internal Usage:
---------------
Although this routine is intended for use with the mayProcFarm routine,
it can be used as a convenience for performing multiple mayCreates.
This usage is discouraged for the casual user, but can be useful for 
anyone intending to enhance the MAY system.  A linked list ring of 
the resultant creates are hung off of the global variable current_farm.  
This variable points to the following structures:

extern struct mayFarm *current_farm;

struct mayChild {
   struct mayChild *next;       /* pointer to next child */
   char         *program_name;  /* name of child program, incl.  arguments */
   struct mayProcId *rpaddr;    /* network address of child */
   int          alive;          /* TRUE if machine is responding */
   };

struct mayFarm {
   int num_workers;             /* number of workers in the ring */
   int num_active_workers;      /* number of active workers in the ring */
   struct mayChild *workers;    /* pointer to ring */
   };


Ideas for Future Enhancements:
------------------------------
1) The current implementation of mayCreateFarm() is written on top of
   mayCreate().  Since mayCreate() does not return until either an
   acknowledgement of the remote create has been received, or a timeout has
   occured, a considerable amount of time can be lost in this routine.
   In particular, timeouts are very expensive.

   Rather than performing serial creates, a parallel version could be
   developed that would first send out a whole series of create requests,
   and then gather up the create replies.  Such a change would have to
   understand and use the internal CreateRequest and CreateReply messages.

   Perhaps the most elegant way of implementing this would be to add a
   mayCreateRequest() and a mayCreateReply() routine to the MAY interface,
   and build mayCreateFarm on top of these two routines.  Thus, all
   users could obtain the benefits of an asynchronous create mechanism.

2) the num_machines argument is not really required.  It could be
   eliminated by having mayCreateFarm() look for a NULL pointer as 
   the last entry in the array machine_names.



mayProcFarm
===========
Dispatch work requests to remote processes and gather up completed work

Syntax:
-------
int mayProcFarm (work, 
                num_pieces,
		piecesize, 
		stride, 
		callback) 

char * work;		/* pointer to array of pieces to be done */
int num_pieces;		/* number of pieces to be done */
int piecesize;		/* size (in bytes) of each piece */
int stride;		/* spacing (in bytes) between subsequent pieces */
int ((*callback)());	/* routine called back every time a piece is done */

Arguments:
----------
-- work
   pointer to an array of work items to be distributed between processors
   participating in the processor farm.  These will be sent as messages
   to each member of the farm.

-- num_pieces
   The number of work pieces in the array.

-- piecesize
   The size of each piece of work, in bytes.

-- stride
   The spacing, in bytes, between pieces of work.  This allows work
   pieces ot be embedded in a larger array; thus stride need not be equal
   to piecesize.

-- callback
   Pointer to a subroutine that will be called back as each pice of work
   is completed. This routine will be called back with a pointer to the
   returned data as an argument.

Description:
-----------
The mayProcFarm() routine implements a processor farm for distributing
work for parallel execution across multiple processors.  A "work item"
consists of two parts: a "work request", which is a binary message that
will be transmitted to remote worker processes, and a "work reply",
which is the reply message that a worker sends back upon completion of a
work request.  This routine accepts a pointer to an array of work
requests, and automatically dispatches these requests to remote workers.
It also automatically gathers up replies, calling a callback routine,
allowing each reply to be processed. 

This routine attempts to minimize the actual, elapsed time to finish
processing all work requests. It does so by implementing several 
algorithms for load balancing, for fault-tolerant distribution of the 
work, and a fast-feed mode for exceptionally fast processors (or 
exceptionally small jobs).  These algorithms assume that work items 
can be processed in arbitrary order, that work items are roughly equal 
in size (within an order of magnitude, or so), and that remote workers
are roughly equivalent in turn-around time (to within an order of 
magnitude, or so).  In the attempt to minimize the overall elapsed time,
the same work request may be dispatched to several machines, psossibly 
resulting in work duplication.  However, the callback routine will be
called only once once for each work item (duplicate work replies will
be discarded).


These restrictions on equivalence of work items 
and speed of workers can be loosened when more than several thousand 
work items are to be processed.

This routine assumes that "worker" processes have already been started 
on remote machines, and that these are waiting for work items to be 
processed.  Use the mayCreateFarm() routine to start up remore workers.

This routine runs most efficiently when there are at least five times as
many work requests as there are workers.  Efficiency also increases as 
the number of cpu cycles needed to process each work item outweigh the 
overhead of dispatching/transmitting/receiving work items.  Currently,
it is suggested that work items take at least one-hundred thousand CPU
cycles to process, in order to minimze network overhead.  Efficiency is
also increased by keeping the ratio of message lengths to work-item
processing times low.

It can easily happen that a large collection of fast workers can outpace
the ability of the master process to collect and process completed work.
Care should be taken to minimize the amount of time spent in the
callback routine when large farms are being implemented.  The
mayProcFarm() routine itself introduces relatively small amounts of
overhead compared to the inherent overhead of maySend(), mayReceive(),
and the underlying IP support.



Information on the algorithms employed by this routine can be found
elsewhere in this document, and the algotihms themselves are detailed at
length in the source code.

When using the UDP-based processor farm, work request and work reply
message lengths are limited by the underlying transmit/receive mechanism.
Please refer to the maySend() and mayReceive() routines to determine the
largest supported UDP message length.  There are not message length
limitations for a TCP-based farm.


Ideas for Future Enhancements:
------------------------------
1) In the current implementation, mayProcFarm() uses mayReceive() to 
   gather back up completed work.  mayReceive() returns messages in the
   order in which they were actually received.  Since farm FIFO lengths
   are buried in the reply messages, this leads to somewhat stale FIFO data
   being gathered, and thus could lead to idle machines if the work
   dispatches are extremely small.  Providing an alternative mayReceive
   that returned the last message first would help keep FIFO data most
   current.  However, care would then be required so that earlier
   messages containing earlier FIFO data would not over-write the current
   FIFO data with stale data.  This could be implemented by adding
   time-stamp info to the internal reply.

Ideas for Future Research:
--------------------------
1) Characterize the network overhead of the processor farm.  In
   particular, measure the number of cycles spent in TCP/UDP/IP code, the
   number of cycles spent in maySend() and mayReceive(), and the overhead
   introduced by mayProcFarm().  From this data, construct a theoretical
   model of the processor farm.  Using this model, predict how well the
   dispatch mechanism responds to large variations in work size and worker
   speed.  Compare theoretical predictions to measured data.  Construct
   a chart showing the region of near-linear speedup for this system.
   Discuss methods and algorithms that can be used to extend the region
   of near-linear speedup.




mayGetParentTcp  
===============
Returns the address of the parent

Syntax:
-------
struct mayProcId * mayGetParentTcp ();

Description:
------------
This call returns the address of the parent that spawned this child.
This routine should only be called by child processes to find thier
parents.  The returned address can be used with maySend() to send
messages to the parent.

This routine should only be used by children whose parent created them
with a TCP connection -- i.e. by calling mayCreate(,,MAY_CREATE_TCP)

This routine mallocs the space for the returned structure. The returned
structure should not fiddled with or freed.

Error Returns:
--------------
NULL if error occured. Error text printed to stderr.


mayGetParentUdp
===============
This call returns the address of the parent.  This address can be used
in the maySend command to send messages to the parent.

Syntax:
-------
struct mayProcId * mayGetParentUdp ();

Description:
------------
This call returns the address of the parent that spawned this child.
This routine should only be called by child processes to find thier
parents.  The returned address can be used with maySend() to send
messages to the parent.

This routine should only be used by children whose parent created them
with a UDP connection -- i.e. by calling mayCreate(,,MAY_CREATE_UDP)

This routine mallocs the space for the returned structure. The returned
structure should not fiddled with or freed.

Error Returns:
--------------
NULL if error occured. Error text printed to stderr.


mayInitTcp
==========
Initialization for a TCP system.
Must be called before any other MAY routines. 

Syntax:
-------
void mayInitTcp ()

Description:
------------
Initializes a process for use in a TCP based communications system.

Error Returns:
--------------
None.



mayInitUdp
==========
Initialization for a UDP system.
Must be called before any other MAY routines. 

Syntax:
-------
struct mayProcId * mayInitUdp ();

Description:
------------
Opens a communications port for this process.
Returns a pointer to the "address" of this process; any processes
wishing to send mail to this process must send them to this address.

This routine mallocs the space for the returned structure. The returned
structure should not fiddled with or freed.

Error Returns:
--------------
NULL if error occured. Error text printed to stderr.


mayReceive
==========
This routine allows a user to receive a message.

Syntax:
-------
int mayReceive (sprocid, messtype, subtype, timeout, header, messlen);
struct 	mayProcId 	*sprocid;
long 	messtype;
short 	subtype;
int	timeout;
struct 	msghead 	*header;
int 	*messlen;

Arguments:
----------
-- sprocid 
   pointer to structure identifying connection from which to receive
   message.  If NULL is specified, all connections are checked for 
   messages; the first message with a matching messtype and subtype 
   is returned.

-- messtype 
   indicates the type of message to search for. If type 0 (zero) is
   specified, the first message with a matching sprocid and subtype 
   is returned.

-- subtype 
   indicates the subtype of message to search for. If subtype 0 (zero) is
   specified, the first message with a matching sprocid and messtype 
   is returned.

-- timeout
   The maximum number of seconds to wait for a message to arrive, if
   a message matching the sprocid, messtype and subtype is not already
   present.

-- header
   Must point to a struct msghead structure.  This structure will be 
   filled out with values to indicate the type of message that was
   actually returned (if any).  

-- messlen
   pointer to an int that specifies the size (in bytes) of the data area
   into which the returned message will be copied.  Upon return, this
   value will be changed to the actual length (in bytes) of the received
   message.

Description:
------------
This routine returns received messages.  An attempt is made to find a
message that matches the indicated sprocid, messtype and subtype.  
If any of these fields are NULL or zero, then the respective fields are
not used to determine a match.

If no matching message is found, then this routine will block (not
return), and wait up to a timeout period of seconds to match the 
message. (See also "Known Bugs" below).   If no matching message is
received within the timeout peroid, this routine will return with a
value of MAY_TIMED_OUT (see "may.h").  If timeout is set to zero, 
mayReceive() will return immediately either with a message (if there 
was one) or with MAY_TIMED_OUT. 

The structure pointed at by *header will be filled out with info about
the message.  The pointer *header must point at the following structure
(see "may.h"):

struct msghead {
	struct 	mayProcId sproc_id;   /* sending process id */
	long	msgtype;
	short	subtype;
	long	msglen;
	void	*msgptr;
        long	queue_len;
	};

Note that header->msgptr must point at a data area sufficiently large to
contain the received message.  The received message will be copied into
the data area pointed at by header->msgptr.  mayReceive() does NOT
allocate this area -- you must do so.  You can make this pointer point
at any structure or array desired; you can reuse this are from message
to message, if desired.  The size of the available area must be pointed
at by the argument messlen.  If this area is overflowed, the message
will be truncated, and the truncated portions discarded. The value of
messlen will be unchanged, and a value of MAY_TRUNCATED_MSG will be 
returned to indicate that truncation has occured.

The queue_len field will be filled out with the number of cached 
(pending, unread) messages that the sender process has.  This field 
is useful for implementing FIFO pacing mechanisms.

Notes:
------
For the IBM RT, it is suggested that the receiving process be compilesd
with the -a option, esp. if many and/or large messages are to be
received.

UDP messages limited to xxx bytes ..... where xxx is system dependent,
but generally less than 1500 bytes.


Returns:
--------
If no error occured, then the number of pending, unreceived messages (of
all types, from all sources) is returned.  If the returned value is
negative, then one of the following errors has occured:

MAY_GENERAL_ERROR	unidentified error condition 
MAY_TRUNCATED_MSG 	incomplete message recv'd
MAY_NO_MEM	 	out of memory error
MAY_CLOSE_CONNECTION 	socket died
MAY_TIMED_OUT		timeout expired
MAY_NO_OPEN_SOCKS	all sockets are dead 

In certain cases, an error message will be printed to stderr.


Known Bugs:
-----------
In the current implementation, if a message arrives during the timout
period, but it does not match the indicated sprocid, messtype and
subtype, then the timout period will be reset.  Thus, a continuous stream
of incoming messages that fail to match the desired fields can cause
this routine to wait for an arbitrarily long period of time, and give
the appearence of being "hung".


Ideas for Future Enhancements:
------------------------------
1) In the current implementation, mayReceive() performs numerous mallocs
   of several different sizes of data structures.  This could lead to
   severe memory fragmentation problems.  Memory fragmentation and
   performance could be significantly improved by modifying MAY to use
   bulk-malloc techniques.  

2) In the current implementation, mayReceive() stores pending messages
   internally in a long linked list.  Search time for stored messages could
   be improved significantly by converting this to a hash table.
   However, for users who are specifying message ID's of NULL, it would be
   freindlier if messages were still returned in the order in which they
   were received.  This could be preserved by time stamping messages in
   the order in which they were received.

3) In some cases, it could be useful to a user of MAY to know when a
   received message was transmitted (rather than when it was received).
   In particular, such information could be useful for alternate designs
   of the processor farm.  Thus, time-stamp data could be added to
   the internal message headers.  A simple routine should be provided to
   pull the time-stamp data out of the message headers.

4) Fix the bug indicated above.  This can be done by fetching the system
   time when the first attempt to timout is made, and fetching it again
   after the select() routine returns. If the elapsed time is greater
   than the timeout peroid, then a timeout error should be returned.

5) Design another routine, similar to mayReceive(), that does not require
   the message header and message body to have been previously allocated
   by the user.  Such a routine is interesting because it could save the
   CPU cycles/overhead of copying the received data from internal
   structures into the user sturctures.



maySend
=======
Sends a message to an address.

Syntax:
-------
int maySend (process_id, message_type, subtype, message, messlen);
struct 	mayProcId 	*process_id;
long 	message_type;
short 	subtype;
void    *message;
int 	messlen;

Arguments:
----------
-- process_id
   location to which a message is to be sent.

-- message_type
   The sent message will be identified as being of this type.

-- subtype
   The sent message will be identified as being of this subtype.

-- message
   pointer to message to be sent.

-- messlen
   length of message to be sent, in bytes.

Description:
------------
This routine sends a message of the indicated type and subtype to the
indicated process.

The message_type and subtype
can be set to arbitrary values; they provide a means for the
user to help sort out his messages. Note that message_type
of zero is reserved.

NULL messages can be sent.
(at the receiving end, a header is received, and the message field
is set to NULL).

Currently, UDP messages are limited to be 1800 or fewer bytes long. 
(This depends on the size of udp datagram packets and is a limitation
that aught eventually be removed).  There is no limit for TCP messages.

Like the udp protocol, messages are not gaurenteed to be reliably
delivered or to be non-duplicated. This may be eventually fixed.

This routine returns the length of the message actually sent.

Error Returns:
--------------
This routine returns a negative number if an error has occured.
The following values are returned:
MAY_GENERAL_ERROR  -- error has occured in writing to socket.  This
                      error usually occurs when the remote end has 
                      disconnected.
MAY_TRUNCATED_MSG -- sent message was truncated in the sending.


Ideas for Future Enhancements:
------------------------------
1) In the current implementation, messages are treated as unstructured
   byte streams.  A more useful messaging interface could be provided by
   supporting an XDR (or similar) type interface and protocol.  (XDR is
   the interface provided to RPC's.)

2) If the process_id is NULL, it is assumed that the message is intended
   for the parent of this process.  (not implemented).


mayTerminate
============
Terminates a MAY session

Syntax:
-------
void mayTerminate ();

Description:
------------
This function deallocates internally used storage and closes all
connections.  In the current implementation, an attempt is made to free
most memory, but this routine has not been thoroughly tested/debugged.


-------------------------- END OF FILE ------------------------------
