Lightweight Bartering Grid batch experiments manager
  Copyright (c) 2005-2008, Cyril Briquet, parts Xavier Dalem
  Contact:  C.Briquet@ulg.ac.be

================================================================

A batch experiments manager is provided to run a batch of experiments,
each of which consists of:
* deploying a Grid with specific parameters,
* submitting one or more Bags of Tasks to a given Peer,
* collecting runtime and cache hit statistics,
* shutting down the Grid.

This makes it easy to:
* test multiple Grid configurations under a given load,
* test a given Grid configuration under multiple loads.

================================================================

(1) How to prepare a batch of experiments?

Preparing a batch of experiments involves 3 steps:
* modifying the experiment_manager.sh file to reflect your setup,
* writing experiment description files,
* writing Grid nodes configuration files.

================================================================

(2) How to adapt the experiments manager to your setup?

Important variables include:
* FTP_DATA_DIR:
  path to User data
  (on a disk partition accessible from the consumer Peer machine)
* FTP_CMD:
  the exact command to launch the FTP server storing the User data,
  launched on the consumer Peer machine
  (required by the automatic Task submitter to submit Bags of Tasks)
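
As an illustration, the two variables might be set as follows in
experiment_manager.sh. Both values are hypothetical placeholders, not
defaults shipped with the tool: the data directory is arbitrary, and
"my_ftp_server" with its option stands in for whatever FTP server is
installed on the consumer Peer machine.

```shell
# Hypothetical values: adapt both to your own setup.

# Directory holding the User data; it must be on a disk partition
# accessible from the consumer Peer machine.
FTP_DATA_DIR="$HOME/lbg-data"

# Exact command that launches the FTP server on the consumer Peer machine.
# "my_ftp_server" and "--config" are placeholders for a real server command.
FTP_CMD="my_ftp_server --config $HOME/ftp/server.conf"
```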

================================================================

(3) How to write an experiment description file?

Each experiment is conducted through a single User,
which is an automatic Task submitter.
It must be configured in the experiment description file.

An experiment description file is composed of:
* a list of "key : value" pairs,
* a single line acting as a delimiter and consisting
  only of the string "clusters :",
* one or more lines containing the description of clusters,
  each of which is a list of space-separated user@machine strings
  describing the machines where the Grid will be deployed
  (user must have ssh access to machine).

The following keys must all be defined (in arbitrary order)
in the list of provided "key : value" pairs:
* experiment_id		experiment ID (arbitrary string)
* experiment_dir	directory where the config files can be found,
			and where the logs will be stored,
                        relative to the SRC_CONFIG_DIR environment variable
                        (if blank value, defaults to SRC_CONFIG_DIR)
* jobs			number of Jobs (i.e. Bags of Tasks)
                        for this experiment
* tasks_per_job		number of Tasks per Job
* ddr			Data Diversity Ratio
                        (DDR, commonly referred to as SDR),
                        expressed as an integer percentage (0 < DDR <= 100)
* data_size		data size, in MB
* data_per_task		number of input data files for each Task
* search_engine		search engine network address
* consumer_peer		consumer Peer network address
  			(the User FTP server will also be deployed there)
* consumer_user		User (i.e. automatic Task submitter) network address
* user_port		listening port of the automatic Task submitter
* ftp_port		listening port of the User FTP server.
			That FTP server is deployed automatically (see the
			FTP_CMD variable), but its own configuration file must
			set the same port: this value duplicates, and must match,
			the port configured for whatever FTP server you use.

Values can be defined using bash-like syntax.
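
The layout described above can be sketched as follows. This is only an
illustration: every value is a placeholder, the host names, user name and
port numbers are invented, and the use of bare host names as network
addresses is an assumption (the address format is not specified here).

```shell
# Write a sample experiment description file (all values are placeholders).
cat > my_exp1.edf <<'EOF'
experiment_id : ddr50_small
experiment_dir : my_experiments
jobs : 2
tasks_per_job : 10
ddr : 50
data_size : 10
data_per_task : 1
search_engine : node01.example.org
consumer_peer : node02.example.org
consumer_user : node02.example.org
user_port : 9000
ftp_port : 2121
clusters :
alice@node02.example.org alice@node03.example.org
alice@node04.example.org alice@node05.example.org
EOF
```

The twelve "key : value" pairs come first, then the "clusters :"
delimiter on a line of its own, then one line per cluster.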

================================================================

(4) How to write a Grid node configuration file?

Peer nodes and Resource nodes must be configured
in two dedicated node configuration files.

All Peers involved in a given experiment are currently
configured identically, as are all Resources
(the experiments manager can certainly be extended to enable
different configurations for the nodes of a given type).

The single User node is configured with the experiment description file.

A node configuration file is composed of a list of "key = value" pairs.
Standard values (see examples in the config directory) can be assigned.

The Peer node configuration file must be named peerconfig.properties,
and the Resource node configuration file must be named resourceconfig.properties.
They must both be stored in the directory designated by experiment_dir.

================================================================

(5) Quick guide to writing experiment description and node configuration files

+---------------------------+--------------+----------------------------+
| data transfer protocol    | Peer config. | DATA_HOST_POLICY =         |
|                           |              | { TORRENT | FTP | blank }  |
+---------------------------+--------------+----------------------------+
| Resource selection        | Peer config. | PEER_STORAGE_AFFINITY =    |
|                           |              | { true | false }           |
+---------------------------+--------------+----------------------------+
| Task selection            | Peer config. | PEER_TTG_POLICY =          |
|                           |              | { true | false }           |
+---------------------------+--------------+----------------------------+
| cache size                | Res config.  | SUPPLYING_CACHE_CAPACITY = |
|                           |              | { > 0 }                    |
+---------------------------+--------------+----------------------------+
| Job count                 | edf          | jobs = { > 0 }             |
+---------------------------+--------------+----------------------------+
| Tasks per Job             | edf          | tasks_per_job = { > 0 }    |
+---------------------------+--------------+----------------------------+
| input data files per Task | edf          | data_per_task = { > 0 }    |
+---------------------------+--------------+----------------------------+
| DDR                       | edf          | ddr = { [1..100] }         |
+---------------------------+--------------+----------------------------+
| file size                 | edf          | data_size = { > 0 }        |
+---------------------------+--------------+----------------------------+
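
The node configuration entries of the table can be sketched as the
fragments below. The directory name and the chosen values are illustrative
assumptions, and the unit of SUPPLYING_CACHE_CAPACITY is not stated here,
so 100 is an arbitrary positive number. Complete files probably need more
keys than these; the examples in the config directory are the better
starting point.

```shell
# Hypothetical experiment directory, under the default SRC_CONFIG_DIR (config).
EXP_DIR="config/my_experiments"
mkdir -p "$EXP_DIR"

# Peer settings listed in the table; the values are illustrative choices.
cat > "$EXP_DIR/peerconfig.properties" <<'EOF'
DATA_HOST_POLICY = TORRENT
PEER_STORAGE_AFFINITY = true
PEER_TTG_POLICY = false
EOF

# Resource setting listed in the table; 100 is an arbitrary positive value.
cat > "$EXP_DIR/resourceconfig.properties" <<'EOF'
SUPPLYING_CACHE_CAPACITY = 100
EOF
```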

================================================================

(6) How to run a batch of experiments?

./experiment_manager.sh my_exp1.edf my_exp2.edf my_exp3.edf my_exp4.edf

Note:
The batch experiments manager transparently relies
on the deployment scripts described in README-DEPLOY,
which means that there is no need to explicitly deploy or shut down the Grid.
End of note.
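
Since a batch may run for a long time, it can help to check the
experiment description files before launching it. The function below is
a minimal sketch of such a check, not part of the experiments manager:
check_edf is a hypothetical helper that only verifies that the twelve
required keys and the "clusters :" delimiter are present.

```shell
# Keys that every experiment description file must define.
REQUIRED_KEYS="experiment_id experiment_dir jobs tasks_per_job ddr data_size
data_per_task search_engine consumer_peer consumer_user user_port ftp_port"

# check_edf FILE: return 0 if FILE defines every required key and
# contains the "clusters :" delimiter line, non-zero otherwise.
check_edf() {
    local file="$1" key status=0
    if [ ! -r "$file" ]; then
        echo "$file: not readable" >&2
        return 1
    fi
    for key in $REQUIRED_KEYS; do
        if ! grep -Eq "^[[:space:]]*${key}[[:space:]]*:" "$file"; then
            echo "$file: missing key $key" >&2
            status=1
        fi
    done
    if ! grep -Eq '^clusters :[[:space:]]*$' "$file"; then
        echo "$file: missing \"clusters :\" delimiter" >&2
        status=1
    fi
    return $status
}

# Possible usage: validate every file, then launch the batch.
# for f in my_exp1.edf my_exp2.edf; do check_edf "$f" || exit 1; done
# ./experiment_manager.sh my_exp1.edf my_exp2.edf
```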

================================================================

(7) Where are the logs and collected statistics?

The logs and statistics collected for one given experiment
are stored in the logs and logs/stats subdirectories
of the experiment_dir specified in the experiment description file,
itself a subdirectory of the experiments base directory
specified by the SRC_CONFIG_DIR environment variable
defined in experiment_manager.sh (by default: config).
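
The path rule above can be sketched as a small helper. log_dirs is a
hypothetical function, not part of the experiments manager, and the
fallback to "config" simply mirrors the documented default.

```shell
# Base directory of experiments; "config" is the documented default.
SRC_CONFIG_DIR="${SRC_CONFIG_DIR:-config}"

# log_dirs EXPERIMENT_DIR: print the logs and statistics directories for
# the given experiment_dir value (empty means SRC_CONFIG_DIR itself).
log_dirs() {
    local base="$SRC_CONFIG_DIR"
    if [ -n "$1" ]; then
        base="$SRC_CONFIG_DIR/$1"
    fi
    echo "$base/logs"
    echo "$base/logs/stats"
}

# Example call for an experiment whose experiment_dir is "my_experiments".
log_dirs "my_experiments"
```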

================================================================

