INSTALLATION INSTRUCTIONS FOR MOBYLE
************************************

================= !!!!!!!!!    UPDATE WARNING     !!!!!!!!! =========================
    Please be aware that the configuration and data storage have changed
    dramatically since versions 0.9 and 0.95. If you are upgrading from one of
    these versions, please refer to the UPDATE file.
=====================================================================================

1 - Requirements:

- Any machine running a Unix-like operating system.
- An Apache server with mod_cgi loaded and, optionally, the rewrite engine (mod_rewrite).
- Python, >=2.5
- The following Python libraries:
        + simpletal, >= 4.1
        + 4suite, >= 1.0.2
        + simplejson, >= 1.7.1
        + python imaging library (with libjpeg support), >= 1.1.5
        + PyCAPTCHA (http://releases.navi.cx/pycaptcha/pycaptcha-0.4.tar.bz2)
        + libxml2, >= 2.6.17
        + siginterrupt (ftp://ftp.pasteur.fr/pub/gensoft/projects/mobyle/)
- The following tools:
        + a biological sequence/alignment format converter: squizz or
          readseq (the java version). squizz is strongly recommended
          (ftp://ftp.pasteur.fr/pub/gensoft/projects/squizz)
- Optional:
        + a batch system, such as SGE (http://gridengine.sunsource.net/) or
          Torque (http://www.clusterresources.com/pages/products/torque-resource-manager.php).
        + dnspython >=1.5.0 is helpful for checking the validity of users' email domains.
        + golden (ftp://ftp.pasteur.fr/pub/gensoft/projects/golden/)
          is helpful to directly load biological sequences from
          databanks into the web portal.
        + the xml program definitions (ftp://ftp.pasteur.fr/pub/gensoft/projects/mobyle/Programs-xxx.tgz),
          a collection of program definitions (emboss, phylip, blast, ...) ready to use with mobyle.
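
One quick way to check the Python side of these requirements is to try
importing each library. The sketch below is only an illustration: the
import names in REQUIRED_MODULES and OPTIONAL_MODULES (for instance `Ft'
for 4suite, `Captcha' for PyCAPTCHA, `dns' for dnspython) are
assumptions that are not taken from this document, so adjust them to
your installation.

```python
# Import names are assumptions; adjust them to match your installation.
REQUIRED_MODULES = ["simpletal", "Ft", "simplejson", "PIL", "Captcha", "libxml2"]
OPTIONAL_MODULES = ["dns"]  # dnspython


def missing_modules(names):
    """Return the names from `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            __import__(name)
        except ImportError:
            missing.append(name)
    return missing
```

Calling missing_modules(REQUIRED_MODULES) with the Python interpreter
used by the web server returns the list of libraries that still have to
be installed. It does not check version numbers.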
          
2 - Technical overview:

A "Mobyle" server does not run any specific daemon (apart from
apache). When a user launches a job, the web server actually runs a cgi
that runs a bioinformatics program in a subprocess. If the subprocess
runs for more than a certain time, the cgi detaches itself and continues
monitoring the execution until its end. The "apache" user is therefore
the one that runs every request, and permissions should be set so that
this user can access every data file and parameter, and run every
program, of the Mobyle configuration.
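
The "wait for a while, then keep monitoring in the background" pattern
described above can be sketched as follows. This is not Mobyle's actual
code: the function name run_with_grace_period and its parameters are
hypothetical, and the real cgi also has to detach from the web server,
which is not shown here.

```python
import subprocess
import sys
import time


def run_with_grace_period(argv, grace_seconds, poll_interval=0.05):
    """Start `argv` and wait at most `grace_seconds` for it to finish.

    Returns ("finished", exit_code, process) if the program ended in time,
    or ("running", None, process) if it is still running, in which case the
    caller would answer the user and keep monitoring the process.
    """
    proc = subprocess.Popen(argv)
    deadline = time.time() + grace_seconds
    while time.time() < deadline:
        code = proc.poll()  # None while the subprocess is still alive
        if code is not None:
            return "finished", code, proc
        time.sleep(poll_interval)
    return "running", None, proc
```

A short job is reported as finished right away, while a long job makes
the function return early with the "running" status so that the portal
can display a "job in progress" page.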

3 - The Mobyle archive tree:

Example => A few sample files.

Local => Configuration, local parameters and code for the Mobyle
         system.

Src => Mobyle source code:
        * the Mobyle folder contains the "core" code for the Mobyle
          Server.
        * the Portal folder contains the code for the web portal that
          provides access to the system.

Tools => A few utilities and scripts.


4 - Installation steps:

4.1 - Make sure every required dependency/software is present.

4.2 - Do the installation.

To install mobyle, you need to provide three different paths, where all
files will be installed:

    python setup.py install \
      --install-core=/path/where/to/install/core/files \
      --install-cgis=/path/where/to/install/cgis/files \
      --install-htdocs=/path/where/to/install/html/files

- The `--install-core' option specifies the prefix under which the mobyle
  core files (code, tools, example, documentation, ...) are installed.

- The `--install-cgis' option specifies the installation path for the
  mobyle portal cgis to be executed by the web server.

- The `--install-htdocs' option specifies the mobyle document root,
  which will hold the portal html files (in the 'portal' subfolder),
  and the user sessions, the jobs and the program definitions (in the
  'data' subfolder). Make sure this subtree is readable by the web
  server user. Furthermore, the permissions on the 'sessions' and
  'jobs' subfolders must allow the web server user to write to them.
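
The permission requirements above can be checked with a few lines of
Python. This is a minimal sketch: the function name check_htdocs_tree is
hypothetical, and the default sub-paths 'data/sessions' and 'data/jobs'
are assumptions about the layout, so pass your own if they differ.

```python
import os


def check_htdocs_tree(htdocs, writable_subdirs=("data/sessions", "data/jobs")):
    """Return a list of problems found under `htdocs` for the current user.

    The default sub-paths are assumptions; adjust them to your layout.
    """
    problems = []
    if not os.access(htdocs, os.R_OK | os.X_OK):
        problems.append("%s is not readable" % htdocs)
    for sub in writable_subdirs:
        path = os.path.join(htdocs, sub)
        if not os.path.isdir(path):
            problems.append("%s does not exist" % path)
        elif not os.access(path, os.W_OK | os.X_OK):
            problems.append("%s is not writable" % path)
    return problems
```

os.access tests the permissions of the user running the script, so the
check is only meaningful when it is run as the web server user.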

4.3 - Configure Mobyle.

Go to your freshly installed mobyle core (from now on, you can remove the sources).

=> Copy the Example/Local/Config/Config.template.py file to
Local/Config/Config.py and edit it to suit your needs:

    cp Example/Local/Config/Config.template.py Local/Config/Config.py

Here are the main configuration variables:
          
- ROOT_URL
  This variable holds the web server name, including the port if it is
  not the default value of 80.
      `http://mymobylemachine.myplace.com:80'

- CGI_PREFIX
  This variable holds the path which, appended to the web server name
  (ROOT_URL), gives the location of the mobyle cgis folder.
      `cgi-bin/mobyle'

- HTDOCS_PREFIX
  This variable holds the path which, appended to the web server name
  (ROOT_URL), gives the location of the mobyle htdocs folder.
      `mobyle'

- MAINTAINER
  This is the list of the server administrators' email addresses. If a
  critical server error is encountered, a message will be sent to all
  the specified email addresses.
      `[ 'adm1@mydomain.fr', 'adm2@mydomain.fr', ... ]'

- HELP
  This is the email address to which user support requests are sent.
  It is also used for sending mail from the server to users (job
  notifications, ...).
      `help@mydomain.fr'

- MAILHOST
  Specify here the mail server to use for sending all email messages.
      `smtp.mydomain.fr'

- BATCH
  Job submission system to be used. Currently allowed values are
  `SYS' (local execution), `SGE' (Sun Gridengine) and `PBS' (Portable
  Batch System). The default is `SYS'.

- LOGDIR
  This variable indicates the path where all mobyle log files are
  located. Note that this directory must be created first, with
  permissions that allow the web server to write to it; otherwise
  nothing will be logged.

- SEQCONVERTER
  This is a dictionary which lists all the `Sequence/Alignment'
  converters. For now only `squizz' and `readseq' (java version) are
  supported, and at least one of them must be specified.
      `{ 'SQUIZZ' : '/path/to/squizz', 'READSEQ' : '/path/to/jreadseq' }'

- BINARY_PATH
  List of directories which will be added to the existing PATH
  environment variable to locate programs.

- DATABANKS_CONFIG
  Dictionary which describes the locally available databanks to fetch
  entries from, using the various utilities.
      `{ 'WGS' : { 'dataType' : 'Sequence', 
                   'bioTypes' : [ 'Nucleic' ],
                   'label' : 'Genbank - Whole Genome Shotgun',
                   'command' : [ 'golden', '%(db)s:%(id)s' ] },
         'PDB' : { 'dataType' : '3DStructure', 
                   'bioTypes' : [ 'Protein' ],
                   'label' : 'Protein Data Bank',
                   'command' : [ 'PDBGet.py', '%(id)s' ] } }'
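
Put together, a minimal Local/Config/Config.py could look like the
sketch below. The values are the examples given above; the LOGDIR and
BINARY_PATH values are hypothetical placeholders, and the template file
may define more variables than the ones shown here.

```python
# Sketch of Local/Config/Config.py; adapt every value to your site.
ROOT_URL = 'http://mymobylemachine.myplace.com:80'
CGI_PREFIX = 'cgi-bin/mobyle'
HTDOCS_PREFIX = 'mobyle'

MAINTAINER = ['adm1@mydomain.fr', 'adm2@mydomain.fr']
HELP = 'help@mydomain.fr'
MAILHOST = 'smtp.mydomain.fr'

BATCH = 'SYS'                     # or 'SGE' / 'PBS'
LOGDIR = '/var/log/mobyle'        # hypothetical path; must exist and be
                                  # writable by the web server user

SEQCONVERTER = {'SQUIZZ': '/path/to/squizz'}
BINARY_PATH = ['/usr/local/bin']  # hypothetical list of directories

DATABANKS_CONFIG = {
    'PDB': {'dataType': '3DStructure',
            'bioTypes': ['Protein'],
            'label': 'Protein Data Bank',
            'command': ['PDBGet.py', '%(id)s']},
}
```

The '%(id)s' placeholder uses Python's dictionary-based string
formatting, so '%(id)s' % {'id': '1abc'} evaluates to '1abc'.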

4.4 - Configure Apache.

=> Specific MIME types

In order to be able to upload/visualize some specific data types, such
as PDB files, you need to override their mime type. The default is
chemical/x-pdb in /etc/mime.types, but this mime type forces most
browsers to open such data with an external tool. Therefore, if you
use such data, you can, for instance, add this directive to the apache
configuration file /etc/apache2/mods-available/mime.conf:
	"AddType text/plain .pdb".

=> Download button

The "save" button that is available in job results will automatically
open a "save as" prompt, provided that you add the following rules to
the apache server configuration. They require the rewrite engine and
the Header directive (mod_headers).

You can add these few lines to an .htaccess file (if your general apache
configuration permits it) in the directory where the jobs will
be stored (--install-htdocs value + '/data/jobs' by default).
You must substitute HTDOCS_PREFIX/data/jobs with its real value.

RewriteEngine on
RewriteCond    %{REQUEST_URI}	 ^HTDOCS_PREFIX/data/jobs(\.*)
RewriteCond %{QUERY_STRING}	^save$
RewriteRule   (.*)/([^/]+)$    $1/$2 [E=SAVEDFILENAME:$2]
Header set Content-Disposition "attachment; filename=\"%{SAVEDFILENAME}e\"" env=SAVEDFILENAME
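
To reason about what these rules should return for a given request, the
filename extraction can be mirrored in Python. This is only an
illustration written for Python 3: the function name
expected_content_disposition is hypothetical, the job path used in the
usage note is invented, and the REQUEST_URI condition is deliberately
not reproduced.

```python
import re
from urllib.parse import urlsplit


def expected_content_disposition(url):
    """Return the header value the rules should produce, or None."""
    parts = urlsplit(url)
    # Mirrors: RewriteCond %{QUERY_STRING} ^save$
    if parts.query != "save":
        return None
    # Mirrors the RewriteRule pattern: the last path segment is the filename.
    match = re.search(r"(.*)/([^/]+)$", parts.path)
    if match is None:
        return None
    return 'attachment; filename="%s"' % match.group(2)
```

For a hypothetical result file requested as
.../data/jobs/prog/K0001/result.txt?save, the function returns
attachment; filename="result.txt", which is the value to look for in
the Content-Disposition header of the real response.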

=> Directory indexes

As the jobs are stored in web-accessible subtrees, data confidentiality
relies on the unique key used as job identifier. Thus it is strongly
recommended to forbid directory indexes for the
"--install-htdocs/data/jobs" subtree. You can do that in your general
apache configuration or in .htaccess files with the directive:

    Options -Indexes

=> Hidden files

In job directories, Mobyle uses some "hidden files" for administration purposes:
- the generated unix command ( .command )
- some information about the job ( .admin )
- some information internal to the mobyle execution ( .forChild.dump or ADMIN subtree)

These files are not intended to be viewed by users, so it is a good idea
to mask them. You can do that with rewrite rules placed in a .htaccess
file in the --install-htdocs/data/jobs subtree.
  
# Do not show hidden files content
RewriteCond     %{REQUEST_URI}  /\. [OR]
RewriteCond     %{REQUEST_URI}  ADMINDIR
RewriteRule     .*              - [F,L]
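
The two conditions above can be mirrored in Python to see which request
paths would be refused. This is a sketch for illustration only: the
function name is_hidden is hypothetical, the example paths in the usage
note are invented, and the default admin_dir value 'ADMIN' is an
assumption about what ADMINDIR is replaced with.

```python
import re


def is_hidden(request_uri, admin_dir="ADMIN"):
    """Return True if the rules above would forbid `request_uri`.

    First condition: a path component starting with a dot.
    Second condition: the admin directory name anywhere in the URI.
    """
    has_dot_file = re.search(r"/\.", request_uri) is not None
    has_admin_dir = re.search(re.escape(admin_dir), request_uri) is not None
    return has_dot_file or has_admin_dir
```

With these rules, a request for .../K0001/.command or
.../K0001/ADMIN/status is refused, while .../K0001/result.txt is still
served.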



4.5 - Programs descriptions deployment.

We provide a set of program descriptions, which is available at
ftp://ftp.pasteur.fr/pub/gensoft/projects/mobyle/Programs-xxx.tgz

Download it and expand the archive in the Programs subfolder. Then,
configure Mobyle according to the bioinformatics software installed on
your platform, with the following variables:

- LOCAL_DEPLOY_ORDER
  The order in which the INCLUDE and EXCLUDE directives are evaluated.

- LOCAL_DEPLOY_INCLUDE
  The list of program descriptions to install.

- LOCAL_DEPLOY_EXCLUDE
  The list of program descriptions not to install.

For the INCLUDE and EXCLUDE directives, shell wildcards can be used. For
example, 'dna*' refers to all program descriptions whose names begin with 'dna'.
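
Shell-style patterns such as 'dna*' behave like those of Python's
fnmatch module. The sketch below shows one possible evaluation (include
first, then exclude); it is not the mobdeploy implementation, the
function name select_descriptions is hypothetical, and the real
evaluation order is the one set by LOCAL_DEPLOY_ORDER.

```python
from fnmatch import fnmatchcase


def select_descriptions(available, include, exclude):
    """Keep names matching an include pattern, then drop excluded ones."""
    included = [name for name in available
                if any(fnmatchcase(name, pattern) for pattern in include)]
    return [name for name in included
            if not any(fnmatchcase(name, pattern) for pattern in exclude)]
```

For instance, with the descriptions dnadist, dnapars and blastall, the
include list ['dna*'] and the exclude list ['dnapars'], only dnadist is
selected.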

Use the mobdeploy script, which is located in the Tools subfolder, to
deploy the program descriptions. Once the program descriptions are
deployed, make sure that they are readable by the web server. The
mobdeploy script does not install the programs; it only publishes their
xml descriptions. For these to be useful, you must separately install
the bioinformatics software corresponding to the descriptions. For more
details about program deployment, see Tools/README.


5 - Tests:

Try to connect to your portal. The url of the portal is:
    ROOT_URL/CGI_PREFIX/portal.py

You should see the welcome page in the center and, on the left, the
available programs presented in a hierarchical tree.
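
The portal url is simply built from two of the configuration variables
of section 4.3. As a small sketch (the helper name portal_url is
hypothetical):

```python
def portal_url(root_url, cgi_prefix):
    """Join ROOT_URL and CGI_PREFIX into the portal address."""
    return "%s/%s/portal.py" % (root_url.rstrip("/"), cgi_prefix.strip("/"))
```

With the example values of section 4.3, this gives
http://mymobylemachine.myplace.com:80/cgi-bin/mobyle/portal.py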

5.1 - Troubleshooting.

If you have any trouble, it is very useful to check:
 - the apache error log, if you get a "500 Internal Server Error"
 - the mobyle error_log (located in the directory defined by the LOGDIR
   variable), if something seems to go wrong but there is no error 500.
 
- Instead of the welcome page I see 'Internal Server Error'.
    -> Make sure the web server user has write permission on the
       'sessions' subfolder of the --install-htdocs tree.

- I see "internal server error" in the programs section (on the left).
    -> Have you installed the program descriptions (see section 4.5)?
    -> Make sure the files in data/programs/ are readable by the apache user.

- I get an internal server error when loading a program page.
    -> Make sure the installed program descriptions are readable by the
       web server.
    
- After filling in the captcha, the portal stays "disabled" indefinitely.
    -> Check the apache error log. If it contains "IOError: decoder jpeg
       not available", the PIL package lacks jpeg support. libjpeg must
       be installed before PIL is installed. Check that PIL detects your
       libjpeg: PIL shows a summary after building
       (python setup.py build_ext -i).
  

- After launching a job, I get "Mobyle Internal server error".
    -> Check the mobyle error_log. If it contains
         AsynchronRunner : CRITICAL ... __init__: exec child caught an error ...
       check that the python module Src/Mobyle/RunnerChild.py is
       executable by the apache user.
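
For the captcha problem listed above, jpeg support can also be tested
directly from Python instead of reading the build summary. The sketch
below is a suggestion, not part of Mobyle: the function name
jpeg_support is hypothetical, and it assumes the imaging library can be
imported as the `PIL' package, which may not be the case for old
installations.

```python
from io import BytesIO


def jpeg_support():
    """Return 'no-pil', 'no-jpeg' or 'ok' for the current interpreter."""
    try:
        from PIL import Image  # assumption: importable as the PIL package
    except ImportError:
        return "no-pil"
    buf = BytesIO()
    try:
        # Writing needs the jpeg encoder, reading back needs the decoder.
        Image.new("RGB", (8, 8)).save(buf, "JPEG")
        buf.seek(0)
        Image.open(buf).load()
    except (IOError, KeyError):
        return "no-jpeg"
    return "ok"
```

Run it with the Python interpreter used by the web server. A result of
'no-jpeg' points to the same cause as the "decoder jpeg not available"
message.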
 
If your problem is not among those covered above then please contact
us at mobyle-support@pasteur.fr


6 - Mailing list:

There is a mailing list intended for mobyle administrators, where we
discuss mobyle, new releases, etc. (this is a moderated, low-traffic
list).

You can subscribe to mobyle-users at:
    http://sympa.pasteur.fr/wws/subrequest/mobyle-users
