INSTALLATION INSTRUCTIONS FOR MOBYLE
************************************

APOLOGIES
We have not yet had time to provide a proper setup script for the Mobyle system, which is rather complex to install (see below). We have scheduled the release of a more automated setup procedure along with version 1.0.

1- Requirements:
-Any machine running a Unix-like OS.
-An Apache server with mod_cgi loaded.
-Python, >= 2.4
-The following Python libraries:
	+ simpletal, >= 4.1
	+ 4suite, >= 1.0.2
	+ simplejson, >= 1.7.1
	+ Python Imaging Library, >= 1.1.5
	+ PyCAPTCHA ( http://pypi.python.org/pypi/PyCAPTCHA/ )
-A biological sequence/alignment format converter, squizz or jreadseq ( squizz strongly recommended )
-Optional:
	+ a batch system, such as SGE (http://gridengine.sunsource.net/) or Torque (http://www.clusterresources.com/pages/products/torque-resource-manager.php).
	+ dnspython >= 1.5.0, helpful to check the validity of the domain of user e-mail addresses.
	+ golden (ftp://ftp.pasteur.fr/pub/GenSoft/unix/db_soft/golden/), helpful to load biological sequences from databanks directly into the web portal.
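
The list above can be checked quickly from Python. The following is only a sketch: the top-level module names (Ft for 4suite, PIL for the Python Imaging Library, Captcha for PyCAPTCHA, dns for dnspython) are assumptions about how these packages are usually imported, and the script does not verify version numbers.

```python
# Top-level module names are assumptions; adjust them to your installation.
REQUIRED = ["simpletal", "Ft", "simplejson", "PIL", "Captcha"]
OPTIONAL = ["dns"]


def missing_modules(names):
    """Return the names from `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            __import__(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    print("missing required modules: %s" % missing_modules(REQUIRED))
    print("missing optional modules: %s" % missing_modules(OPTIONAL))
```

Run it with the same Python interpreter that Apache will use for the CGIs, since a library installed for another interpreter will not be found.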

2- Technical overview:

A "Mobyle" server does not run any specific daemon (apart from Apache). When a user launches a job, they are actually running a CGI that runs a bioinformatics program in a subprocess. If the subprocess runs for more than a certain time, it detaches itself and continues monitoring the execution until its end. The "apache" user is therefore the one that runs every request, and permissions should be set so that it can access and run all the data, programs, and parameters of the Mobyle configuration.
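
To make the "wait for a while, then hand over" idea concrete, here is a minimal sketch. It is not Mobyle's actual code: the function name run_with_grace_period and its parameters are hypothetical, and the real system delegates the long-running part to its own child process rather than simply returning.

```python
import subprocess
import sys
import time


def run_with_grace_period(cmd, grace_seconds=5.0, poll_interval=0.1):
    """Start cmd and wait up to grace_seconds for it to finish.

    Returns ("finished", proc) if the program ended in time, or
    ("running", proc) if it is still running and has to be
    monitored in the background by the caller.
    """
    proc = subprocess.Popen(cmd)
    deadline = time.time() + grace_seconds
    while True:
        if proc.poll() is not None:
            return "finished", proc
        if time.time() >= deadline:
            return "running", proc
        time.sleep(poll_interval)


if __name__ == "__main__":
    # Hypothetical short job: a Python one-liner run by the same interpreter.
    status, proc = run_with_grace_period([sys.executable, "-c", "print('job done')"])
    print(status, proc.returncode)
```

Whatever the exact mechanism, the subprocess is started by the CGI and therefore runs with the rights of the apache user, which is why the permissions described above matter.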

3- Important elements in the Mobyle archive

Local => configuration and local parameters or code for the Mobyle system. 

Programs => Mobyle XML wrappers (distributed separately) 

Src => Mobyle source code: 
	* the Mobyle folder contains the "core" code for the Mobyle Server,
	* the Portal folder contains the code for the web portal that provides access to the system.

Tools =>A few utilities and scripts.
      =>The setsid binary, used only if jobs are executed using the "Sys" job manager.

Utils =>Siginterrupt source
      =>setsid source

4- Installation steps

4.1 - Make sure every required dependency is present.

4.2 - Create the Mobyle tree structure
All the Mobyle code and configuration are located in a given directory, whose path is stored in the MOBYLEHOME environment variable.
=>Extract the contents of the archive into the chosen directory, which has to be accessible by the apache user. The RunnerChild script (located in ${MOBYLEHOME}/Src/Mobyle/RunnerChild.py) has to be executable by apache.
The data (jobs, sessions, etc.) as well as the web portal code have to be accessible by apache, but should also be published on the web.
=> Copy the portal into the www directory: the MobylePortal folder located in Portal/htdocs goes to "DocumentRoot", and the MobylePortal folder located in Portal/cgi-bin goes to the "cgi-bin" folder.
=> Create a folder in DocumentRoot to store Mobyle data and meta-data: for instance, a "Mobyle" folder in DocumentRoot. In this new folder, create the following sub-folders:
    -jobs and jobs/ADMINDIR: contain the job data.
    -sessions: will contain the user sessions (e-mail addresses, job lists, etc.).
=> Create a folder somewhere accessible by Apache that will store the uploaded data before their format is checked.
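
The data sub-folders listed above can also be created with a few lines of Python. This is a sketch only: the function name create_data_tree is hypothetical, and the data root you pass in is whatever folder you chose inside DocumentRoot. Ownership and permissions for the apache user still have to be set separately.

```python
import os
import tempfile


def create_data_tree(data_root):
    """Create jobs, jobs/ADMINDIR and sessions under data_root.

    Existing folders are left untouched. Returns the list of paths.
    """
    paths = []
    for sub in ("jobs", os.path.join("jobs", "ADMINDIR"), "sessions"):
        path = os.path.join(data_root, sub)
        if not os.path.isdir(path):
            os.makedirs(path)
        paths.append(path)
    return paths


if __name__ == "__main__":
    # Demonstration in a throwaway directory instead of a real DocumentRoot.
    demo_root = tempfile.mkdtemp()
    for created in create_data_tree(demo_root):
        print(created)
```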

The log folder will contain all the Mobyle logs.
=> Create a log folder (accessible by apache), for instance /var/log/mobyle. It will contain the following files:
    -error_log: tracks errors.
    -access_log: tracks launched jobs.
    -account_log: optional, tracks the execution time of each job.
    -session_log: tracks the session mechanism.
    -debug: if level 3 job debug is set in the configuration, stores the child process standard output.
    -build: if level 1 job debug is set in the configuration, tracks the construction of the job launch command line.

4.3 - Compile Siginterrupt (and setsid if required)
=>Compile Siginterrupt: go to the ${MOBYLEHOME}/Utils folder, then type:
%python setup.py build
%sudo python setup.py install

=>Compile setsid
setsid is mandatory only if Mobyle is set up to run jobs in "Sys" mode, i.e., without any particular batch management system.
cd ${MOBYLEHOME}/Utils
gcc setsid.c -o setsid
mv setsid ../Tools

4.4 - Configure Mobyle
=>Copy the ${MOBYLEHOME}/Exemple/Local/Config/Config.template.py file to ${MOBYLEHOME}/Local/Config/Config.py and edit it. Here are the main configuration variables:
ROOT_URL = the base URL of the server, including hostname and port (e.g., http://mymobylemachine.myplace.com:80).
RESULTS_PATH = the path to the "jobs" folder.
USER_SESSIONS_PATH = the path to the "sessions" folder.
FORMAT_DETECTOR_CACHE_PATH = the path to the format detector cache folder.
MOBYLEROOT_HTDOCS_URL = the path from the website root to the Mobyle htdocs folder.
MOBYLEROOT_CGI_URL = the path from the website root to the Mobyle cgi folder.
LOGDIR = the log directory path.

DATABANKS_CONFIG = lists the bio-banks from which data can be loaded into the web portal. This list should remain empty unless the golden program is compiled and set up.

MAINTAINER = the server administrator's e-mail address, used to send critical error messages.
HELP = the e-mail address of the person who receives user help requests. This is also the sender address of the job notification e-mails sent to users.
MAILHOST = the mail server used to send the above-mentioned messages.

OPT_EMAIL = defines whether entering an e-mail address is mandatory before running a job.
PARTICULAR_OPT_EMAIL = overrides the above directive on a program-specific basis.

ANONYMOUS_SESSION = "captcha"|"no"|"yes" authorizes or forbids job submission with anonymous sessions (or requires solving a CAPTCHA to stop bot submissions).
AUTHENTICATED_SESSION = "email"|"no"|"yes" authorizes or forbids the creation of authenticated sessions, optionally with an e-mail address confirmation system.

BATCH = the batch submission system. Set it to Sys if you do not have such a system available.
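
Some of these settings are easy to get wrong, so a small sanity check can help before starting the server. The sketch below is not part of Mobyle: the function check_settings is hypothetical, and the values are passed as a plain dictionary for illustration rather than read from Config.py. It only checks what the descriptions above state, namely the allowed session values and the existence of the folders.

```python
import os

# Allowed values as given in the variable descriptions above.
ALLOWED = {
    "ANONYMOUS_SESSION": ("captcha", "no", "yes"),
    "AUTHENTICATED_SESSION": ("email", "no", "yes"),
}
# Settings that must point to existing folders.
PATH_SETTINGS = ("RESULTS_PATH", "USER_SESSIONS_PATH", "LOGDIR")


def check_settings(settings):
    """Return a list of human-readable problems found in `settings`."""
    problems = []
    for name, allowed in sorted(ALLOWED.items()):
        if settings.get(name) not in allowed:
            problems.append("%s must be one of: %s" % (name, ", ".join(allowed)))
    for name in PATH_SETTINGS:
        path = settings.get(name)
        if not path or not os.path.isdir(path):
            problems.append("%s is not an existing directory: %r" % (name, path))
    return problems


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    example = {
        "ANONYMOUS_SESSION": "captcha",
        "AUTHENTICATED_SESSION": "email",
        "RESULTS_PATH": "/var/www/html/Mobyle/jobs",
        "USER_SESSIONS_PATH": "/var/www/html/Mobyle/sessions",
        "LOGDIR": "/var/log/mobyle",
    }
    for problem in check_settings(example):
        print(problem)
```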

4.5 - Configure Apache
=>Set up MOBYLEHOME
The MOBYLEHOME environment variable should be set for the CGIs. Ex:
        ScriptAlias /cgi-bin/ /var/www/cgi-bin/
        <Directory "/var/www/cgi-bin">
                AllowOverride None
                Options FollowSymLinks
                Order allow,deny
                Allow from all
                SetEnv MOBYLEHOME /home/hmenager/cvs/Mobyle
        </Directory>
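
A quick way to confirm that the SetEnv directive reaches the CGIs is to read the variable from a small test script. The sketch below is an assumption about how one might check it, not Mobyle code; the function name mobyle_home is hypothetical.

```python
import os


def mobyle_home(environ=None):
    """Return the value of MOBYLEHOME, or raise if it is missing."""
    if environ is None:
        environ = os.environ
    home = environ.get("MOBYLEHOME")
    if not home:
        raise RuntimeError("MOBYLEHOME is not set; check the SetEnv directive")
    return home


if __name__ == "__main__":
    # Minimal CGI response: a header, a blank line, then the value.
    print("Content-Type: text/plain")
    print("")
    print(os.environ.get("MOBYLEHOME", "MOBYLEHOME is not set"))
```
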
=>specific MIME types
In order to be able to upload/visualize some specific data types, such as PDB files, you need to override their MIME type. The default is chemical/x-pdb in /etc/mime.types, but this MIME type forces most browsers to open such data with an external tool. Therefore, if you use such data, you can, for instance, add this directive to the apache configuration file /etc/apache2/mods-available/mime.conf: "AddType text/plain .pdb".
=>download button
The "save" button that is available in job results will automatically open a "save as" prompt, provided that you add this little trick to the apache server configuration (it requires the mod_rewrite and mod_headers modules):
RewriteEngine on
# replace /MobyleData/jobs with the URL path of your own Mobyle RESULTS_PATH
RewriteCond %{REQUEST_URI} ^/MobyleData/jobs(.*)
RewriteCond %{QUERY_STRING} ^save$
RewriteRule (.*)/([^/]+)$ $1/$2 [E=SAVEDFILENAME:$2]
Header set Content-Disposition "attachment; filename=\"%{SAVEDFILENAME}e\"" env=SAVEDFILENAME
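
To see what these directives are meant to achieve, the same logic can be mimicked in a few lines of Python. This is only an illustration of the intended behaviour, not a substitute for the Apache configuration; the function name content_disposition is hypothetical and /MobyleData/jobs is the example prefix used above.

```python
import re

JOBS_PREFIX = re.compile(r"^/MobyleData/jobs")  # example prefix, adapt to your site
LAST_SEGMENT = re.compile(r"(.*)/([^/]+)$")     # same pattern as the RewriteRule


def content_disposition(request_uri, query_string):
    """Return the Content-Disposition value for a request, or None.

    A value is returned only for job result URLs requested with the
    exact query string "save"; the file name is the last path segment.
    """
    if not JOBS_PREFIX.search(request_uri):
        return None
    if not re.search(r"^save$", query_string):
        return None
    match = LAST_SEGMENT.search(request_uri)
    if match is None:
        return None
    return 'attachment; filename="%s"' % match.group(2)


if __name__ == "__main__":
    # Hypothetical result URL, for illustration only.
    print(content_disposition("/MobyleData/jobs/someprogram/somejob/result.txt", "save"))
```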

4.6 - Setup and configure GOLDEN and SQUIZZ 

4.6.1 - Golden (optional)
Golden is a program that retrieves sequence entries from bio-banks. It is used within the Mobyle portal to load data directly from these banks before analyzing them.
=>Installation
$tar -xzf golden-1.1a.tar.gz
$cd golden-1.1a/
$./configure
$make
$make install
=>configuration
In ${MOBYLEHOME}/Local/Config/Config.py, set the GOLDEN_PATH variable to the path of the golden binary.
DATABANKS_CONFIG = description of the available bio-banks.
e.g.:
DATABANKS_CONFIG = [
    {'id':'embl', 'dataType':'Sequence', 'bioTypes':['Nucleotide'], 'label': 'EMBL Nucleotide Sequence Database'},
    {'id':'enzyme', 'dataType':'Sequence', 'bioTypes':['Protein'], 'label': 'Enzyme nomenclature database'},
    {'id':'uniprot', 'dataType':'Sequence', 'bioTypes':['Protein'], 'label': 'Universal Protein Resource = SwissProt + TrEMBL + PIR'},
]
id is the identifier of the bank, as listed by golden: 
$golden -l
For detailed instructions about the golden program setup, please refer to the golden distribution.
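
Since DATABANKS_CONFIG is edited by hand, a small structural check can catch typos early. The sketch below is not part of Mobyle: the function check_databanks is hypothetical, and the set of required keys is inferred only from the example above. It does not verify the ids against the output of golden -l.

```python
# Keys inferred from the example entries above (an assumption).
REQUIRED_KEYS = ("id", "dataType", "bioTypes", "label")


def check_databanks(databanks):
    """Return a list of problems: missing keys or duplicate ids."""
    problems = []
    seen = set()
    for index, bank in enumerate(databanks):
        for key in REQUIRED_KEYS:
            if key not in bank:
                problems.append("entry %d lacks the %r key" % (index, key))
        bank_id = bank.get("id")
        if bank_id is not None:
            if bank_id in seen:
                problems.append("duplicate id %r" % (bank_id,))
            seen.add(bank_id)
    return problems


if __name__ == "__main__":
    example = [
        {"id": "embl", "dataType": "Sequence", "bioTypes": ["Nucleotide"],
         "label": "EMBL Nucleotide Sequence Database"},
        {"id": "enzyme", "dataType": "Sequence", "bioTypes": ["Protein"],
         "label": "Enzyme nomenclature database"},
    ]
    print(check_databanks(example))
```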

4.6.2 - Squizz (Strongly recommended)
It is mandatory to set up a sequence/alignment format detector/converter program. Squizz is strongly recommended, but you can also use the java version of readseq (http://iubio.bio.indiana.edu/soft/molbio/readseq/), although it is far too permissive in our opinion.
=>Setup
tar -xzf squizz-0.99.tar.gz
cd squizz-0.99/
./configure
make
make install
=>Configuration
SEQCONVERTER = paths to the installed format detection programs (the only ones currently supported are SQUIZZ and READSEQ)
e.g.:
SEQCONVERTER= {
    'SQUIZZ': '/usr/local/bin/squizz',
#    'READSEQ': '/home/hmenager/cvs/Mobyle/Tools/jreadseq'
}
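
A wrong path in SEQCONVERTER is a common source of trouble, so it can be worth checking that each configured program exists and is executable. The sketch below is an assumption about how to do so, not Mobyle code; the function name unusable_converters is hypothetical. Run it as the apache user to get a meaningful answer about permissions.

```python
import os
import sys


def unusable_converters(seqconverter):
    """Return the names whose path is not an executable file."""
    bad = []
    for name in sorted(seqconverter):
        path = seqconverter[name]
        if not (os.path.isfile(path) and os.access(path, os.X_OK)):
            bad.append(name)
    return bad


if __name__ == "__main__":
    # Same shape as the example above; the path is the one shown there.
    example = {"SQUIZZ": "/usr/local/bin/squizz"}
    print("unusable converters: %s" % unusable_converters(example))
```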
