The University of Arizona
SyMAP Installation  

SyMAP v2.0 Installation Guide

Last update: 5/16/07
The installation instructions for v2.0 are the same as for v1.0.
The only change to this guide is that a Troubleshooting Guide was added.
Please send bug reports and suggestions to symap@agcol.arizona.edu.

1 Overview

SyMAP aligns an FPC map either to a pseudomolecule (sequenced chromosome) or to another FPC map.
  • Data for the FPC-to-Pseudo mapping consists of sequenced markers positioned on the map and/or BESs associated with clones on the FPC map. BLAT [1] is used for the alignments.
  • Data for FPC-to-FPC alignments consists of shared markers and fingerprint overlaps.

For details on the algorithm see:
        C. Soderlund,  W. Nelson, A. Shoemaker and A. Paterson (2006)
        SyMAP: A system for discovering and viewing syntenic regions of FPC maps 
        Genome Research 16:1159-1168.

This guide provides instructions for creating the demo SyMAP. Do this first to ensure that the database connection works. The demo files were created from the maize FPC to rice pseudomolecule synteny alignment (see www.agcol.arizona.edu/symap).

1.1 Server System Requirements

The SyMAP server software runs under Linux, Unix, and Mac OS X. It should also work on Windows XP, although this is untested.

To install and run SyMAP with the demonstration data requires a minimum of 100MB RAM and 500MB disk space. The RAM and disk space requirements for your data sets depend on the size of your data; typically you want to have free disk space at least twice the size of your data set.

Installing and running SyMAP are computationally intensive, so a fast machine with at least 1GB of RAM is recommended.

For best results, use the following:
  • Apache Web Server
  • MySQL v5 or later (SyMAP may run with other database systems such as Oracle, but this is untested; if you get it running on another database system, please let us know!)
  • Perl with the following CPAN modules:
       DBI
       Cwd
       Data::Page
       Storable
       File::Temp
       CGI
       CGI::Carp
       GD
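
Before running the installer it can help to confirm that Perl can load each module listed above. The following is a minimal sketch, not part of the SyMAP package: it assumes `perl` is on the PATH and relies on the standard `perl -M<module> -e 1` idiom, which exits non-zero when a module cannot be loaded. The function names are hypothetical.

```python
import subprocess

# Module list copied from the requirements above.
REQUIRED_MODULES = [
    "DBI", "Cwd", "Data::Page", "Storable",
    "File::Temp", "CGI", "CGI::Carp", "GD",
]


def perl_can_load(module, perl="perl"):
    """Return True if `perl -M<module> -e 1` exits cleanly."""
    try:
        result = subprocess.run(
            [perl, "-M" + module, "-e", "1"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except FileNotFoundError:
        # The perl interpreter itself was not found.
        return False
    return result.returncode == 0


def missing_modules(modules=REQUIRED_MODULES, can_load=perl_can_load):
    """Return the modules that could not be loaded, in the order given."""
    return [m for m in modules if not can_load(m)]
```

Calling `missing_modules()` returns the list of modules still to be installed from CPAN; an empty list means all of them loaded.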
    

    1.2 Client System Requirements

    SyMAP runs in a web browser as a Java Applet on the client system. This requires Firefox or Internet Explorer with the Java Runtime Environment (JRE).

    1.3 SyMAP Components

    The SyMAP system consists of four components:
    • Admin and processing scripts, written in Perl
    • HTML and CGI display pages
    • Java-language display applet
    • MySQL database
    In addition, a standalone interface to the Java applet is supplied, as well as SyTry, an independent tool for testing and adjustment of parameters.

    SyMAP files are located in three directories:

    • Admin - where the symap.tar is untarred, and the data is located.
    • HTML - contains the html files and java jar files.
    • CGI - contains the cgi files.

    1.4 Package Structure

    The tar file contains the directory symap/ with the following contents:
      cgi/    data/         LICENSE             scripts/
      html/   params        symap_install.html  symap_troubleshoot.html
      java/   release.html  symap.pm            sytry/
    
    The data/ directory contains all project data, in subdirectories:
      fpc/   fpc_pseudo/  fpc_fpc/  pseudo/
    

    Data specific to the FPC or pseudomolecule projects are stored under the "fpc" or "pseudo" directories, respectively.
    Raw anchor data for pair alignments is stored under the "fpc_pseudo" or "fpc_fpc" directories, depending on the pair type.


    2 Installing SyMAP and Creating the Demo

    If you encounter installation problems, please see the Troubleshooting Guide.

    1. First make sure that you have a MySQL database running on a server which can also host the SyMAP HTML and CGI pages.
       Important: Java applet security requires that the database and web server be on the same machine.
       Create a new database for symap data. MySQL must be installed and running with accessible admin/user accounts -- see your trusty System Administrator :).
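         > mysql -u <db_adminuser> -p
         mysql> create database symap;
       Enter <db_adminpasswd> when prompted. Do not put the password after "-p " with a space; mysql would read it as a database name.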

    2. Decide where you want the Admin directory to go, and untar the package in that location.

         > tar -xzvf symap_2_0.tar.gz
    
       The untar will create a "symap" directory with all the SyMAP files underneath it.

    3. Go to the symap directory and edit the file "params".
       The params file contains information for the database as well as the locations for web files in the local filesystem and the URLs for those files on the internet. Edit the required parameters listed below in the params file for your system configuration.

    Parameter         Description
    db_name           The name of the database created in Step 1.
    db_server         The name of the MySQL database server, e.g. myserver.myschool.edu.
    db_adminuser      The name of the "admin" database user with table create privileges.
    db_adminpasswd    The password for the "admin" user.
    db_clientuser     The name of the "client" database user; only requires read access.
    db_clientpasswd   The password for the "client" user.
    html_path         The filesystem path to install the SyMAP HTML files.
    cgi_path          The filesystem path to install the SyMAP CGI files.
    html_url          The URL path to the html_path directory.
    cgi_url           The URL path to the cgi_path directory.

    Example:

        db_name         = symap
        db_server       = myserver.myschool.edu
        db_adminuser    = admin
        db_adminpasswd  =
        db_clientuser   = client
        db_clientpasswd =
        html_path       = /web/htdocs/symap
        cgi_path        = /web/cgi-bin/symap
        html_url        = http://myserver.myschool.edu/symap
        cgi_url         = http://myserver.myschool.edu/cgi-bin/symap
        logfile         = symap.log
        site_logo       = symap_logo.gif
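
Since the installer depends on these values, a quick check that every required parameter is filled in can catch mistakes early. The sketch below is not part of SyMAP. It assumes the params file holds one `key = value` pair per line, as in the example above, and it does not attempt to handle comment lines because the guide does not describe a comment syntax. The function names are hypothetical.

```python
# Required parameters, as listed in the table in Step 3.
REQUIRED_KEYS = [
    "db_name", "db_server", "db_adminuser", "db_adminpasswd",
    "db_clientuser", "db_clientpasswd",
    "html_path", "cgi_path", "html_url", "cgi_url",
]


def parse_params(text):
    """Parse 'key = value' lines into a dict; values may be empty."""
    params = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue  # skip blank lines and anything that is not key = value
        # Split on the first '=' only, so values may contain '='.
        key, value = line.split("=", 1)
        params[key.strip()] = value.strip()
    return params


def unset_keys(params, required=REQUIRED_KEYS):
    """Return required keys that are absent or have an empty value."""
    return [k for k in required if not params.get(k)]
```

Applied to the example above, `unset_keys` would report `db_adminpasswd` and `db_clientpasswd`, which are left blank there.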
    

    4. Run the installation script:

          perl scripts/install.pl
       and follow the instructions.
       It will ask whether you want to create the demo; answer Yes.
       A project can be removed with the script
          perl scripts/remove_project.pl
       (see Section 3.1 for its arguments) or by re-running the install.

    NOTE: You may have to exit all browser windows and restart the browser to see correct results.

    When the install is complete, the web pages will be available at the web location specified as "html_url" in the params file. The installed pages should match those seen here. Note that the top page redirects immediately to the CGI script "projects.cgi".

    Important: The installation script must be re-run every time the "params" file is changed.


    3 Creating or Updating a Project

    3.1 Creating a New Project

    To create an FPC-to-FPC or FPC-to-Pseudo project, perform the following steps:

    Step 1: Create the FPC project
    Create and populate a directory tree under data/fpc (see FPC Project under Data Organization).
    Copy fpc/demo_fpc/params to the new directory and make edits.
    Load the data into the database with the following command:

         perl scripts/fpc.pl <fpc_name>
    Note that <fpc_name> is the name of the directory under data/fpc.

    Step 2: For FPC-to-Pseudo project
    a. Pseudomolecules

    Create and populate a directory tree under data/pseudo (see Pseudomolecule under Data Organization).
    Copy pseudo/demo_seq/params to this directory and make edits.
    Load the data into the database with the following command:

         perl scripts/pseudo.pl <pseudo_name>
    Note that <pseudo_name> is the name of the directory under data/pseudo.

    b. FPC/Pseudo pair
    Create and populate a directory tree under data/fpc_pseudo (see FPC/Pseudo under Data Organization).
    Copy fpc_pseudo/demo_fpc_to_demo_seq/params to the new directory and make edits,
    if desired (default parameters are usually satisfactory).
    Load the data into the database with the following commands:

         perl scripts/anchors.pl pseudo <fpc_name> <pseudo_name>
         perl scripts/synteny.pl pseudo <fpc_name> <pseudo_name>
    

    Step 3: For FPC-to-FPC project
    No setup is needed for this case, unless you have fingerprint overlaps to include, or you wish to use parameters other than the defaults. If one of the above does hold, then:
    Create and populate a directory tree under data/fpc_fpc (see FPC/FPC under Data Organization). Copy fpc_fpc/demo_fpc_to_demo_fpc/params to the new directory and make edits, if desired. Load the data from this directory into the database with the following commands:

         perl scripts/anchors.pl fpc <fpc_name1> <fpc_name2>
    
    Regardless of whether you did the preceding steps, compute the synteny using
         perl scripts/synteny.pl fpc <fpc_name1> <fpc_name2>
    
    Notes: A params file must be present for each individual fpc/pseudo project and should be edited with the appropriate values, particularly the display name and whether numbered chromosomes or lettered linkage groups are used (for FPC projects, the second setting only matters if there is contig anchoring). The params file may be omitted from the pair directories (fpc_pseudo, fpc_fpc), in which case the defaults will be used.

    For example, the commands to add an FPC-to-Pseudo project with FPC project xx and pseudomolecule project yy are:

          perl scripts/fpc.pl xx
          perl scripts/pseudo.pl yy
          perl scripts/annotation.pl yy  (if there is any pseudomolecule annotation)
          perl scripts/anchors.pl pseudo xx yy
          perl scripts/synteny.pl pseudo xx yy
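
The load order matters: the individual projects are loaded first, then the anchors, then the synteny. As an illustration only, the sequence can be generated by a small helper. The script names and arguments are those shown above; the helper name `project_commands` and the `has_annotation` flag are hypothetical.

```python
def project_commands(fpc_name, pseudo_name, has_annotation=False):
    """Return the load commands for an FPC-to-Pseudo project, in run order."""
    commands = [
        ["perl", "scripts/fpc.pl", fpc_name],
        ["perl", "scripts/pseudo.pl", pseudo_name],
    ]
    if has_annotation:
        # Only needed when the pseudomolecule project has annotation data.
        commands.append(["perl", "scripts/annotation.pl", pseudo_name])
    commands.append(
        ["perl", "scripts/anchors.pl", "pseudo", fpc_name, pseudo_name])
    commands.append(
        ["perl", "scripts/synteny.pl", "pseudo", fpc_name, pseudo_name])
    return commands
```

Each returned list can be printed with `" ".join(cmd)` for review, or passed to `subprocess.run(cmd, check=True)` from the symap directory so that a failing step stops the sequence.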
    
    The installed web page calls projects.cgi, which reads the database and lists all SyMAP projects, so you do not have to change this script to add more projects.

    To remove an fpc or pseudomolecule project, type

        perl scripts/remove_project.pl fpc xx
            - or -
        perl scripts/remove_project.pl pseudo yy
    
    When an fpc or pseudomolecule project is removed, all anchor and synteny data which depend on that project are automatically removed.

    3.2 Updating a project

    Put the new data into the appropriate directory and run the Perl scripts as was done for creating the project. If only data under fpc_pseudo or fpc_fpc was modified, then only anchors.pl and synteny.pl need to be re-run. If fpc or pseudo data was modified, then fpc.pl or pseudo.pl must be re-run, followed by anchors.pl and synteny.pl.
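
The update rule can be written out as a small decision function. This is a sketch of the rule as stated in this section, not a SyMAP tool; the function and argument names are hypothetical.

```python
def scripts_to_rerun(fpc_changed=False, pseudo_changed=False,
                     pair_changed=False):
    """Return the scripts to re-run, in order, for the given changes."""
    scripts = []
    if fpc_changed:
        scripts.append("fpc.pl")
    if pseudo_changed:
        scripts.append("pseudo.pl")
    # Anchors and synteny depend on both the project data and the pair data,
    # so any change at all means re-running them.
    if fpc_changed or pseudo_changed or pair_changed:
        scripts += ["anchors.pl", "synteny.pl"]
    return scripts
```

For example, `scripts_to_rerun(pair_changed=True)` yields only anchors.pl and synteny.pl, while `scripts_to_rerun(fpc_changed=True)` puts fpc.pl in front of them.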


    4 Data organization

    To add a new project, the project data must be arranged in the data/ directory using the structure below that corresponds to the project type. The demo data serves as an example for FPC-to-FPC and FPC-to-Pseudo project types.

    4.1 FPC Data
    data/fpc/
        <fpc_name>/
            name.fpc   
            params         
            sequence/
                bes/
                    bes.seq   (may have any number of bes sequence files)
                    ...       (bes names MUST match FPC clone names plus suffix r or f)
                              (e.g. if b0003A21 is a clone, bes could be b0003A21f)
                mrk/
                    mrk.seq   (any number of marker sequences; must match FPC marker names)
                    ...
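
The BES naming rule above (clone name plus a trailing r or f) is easy to check before loading. The sketch below assumes the BES and clone names have already been extracted into Python lists, and that the comparison is case-sensitive; both are assumptions, as are the function names.

```python
def clone_for_bes(bes_name, clone_names):
    """Return the clone a BES name belongs to, or None if it does not match."""
    # A valid BES name is a clone name followed by a single 'r' or 'f'.
    if len(bes_name) < 2 or bes_name[-1] not in ("r", "f"):
        return None
    clone = bes_name[:-1]
    return clone if clone in clone_names else None


def unmatched_bes(bes_names, clone_names):
    """Return the BES names that do not correspond to any known clone."""
    clones = set(clone_names)
    return [b for b in bes_names if clone_for_bes(b, clones) is None]
```

With the clone b0003A21 from the example, `b0003A21f` and `b0003A21r` are accepted, and any other suffix or unknown clone is reported by `unmatched_bes`.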
    

    4.2 Pseudomolecule Data
    data/pseudo/
        <pseudo_name>/
            params 
            sequence/
                pseudo/
                    chr01.seq    (The chrNN name scheme must be followed in the file name
                    chr02.seq       as well as the fasta headers)
                    etc..        (These must be the exact sequences used to make the blat
                                     alignment data)
            annotation/
                centromere.gff  (GFF-format annotation data; see samples in demo_seq)
                etc..
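
A mismatch between a file name and its FASTA header is an easy mistake to make. The following sketch checks the chrNN scheme under several assumptions that the guide does not spell out: that NN is a run of digits, that files use the .seq extension shown above, and that each header begins with `>chrNN` matching the file name. The function names are hypothetical.

```python
import re

CHR_NAME = re.compile(r"chr\d+")  # assumption: NN is a run of digits


def chr_name_from_file(file_name):
    """Return 'chrNN' for a file named chrNN.seq, else None."""
    stem, dot, ext = file_name.rpartition(".")
    if dot and ext == "seq" and CHR_NAME.fullmatch(stem):
        return stem
    return None


def chr_name_from_header(header_line):
    """Return 'chrNN' if a FASTA header starts with >chrNN, else None."""
    match = re.match(r">(chr\d+)\b", header_line)
    return match.group(1) if match else None


def names_agree(file_name, header_line):
    """True if the file name and the FASTA header name the same chromosome."""
    name = chr_name_from_file(file_name)
    return name is not None and name == chr_name_from_header(header_line)
```

`names_agree` would be called with each file name and the first line of that file; a False result points at a file to rename or a header to fix.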
    
    

    4.3 FPC/Pseudo Data
    data/fpc_pseudo/
        <fpc_name_to_pseudo_name>/
            params (optional; if absent, parameters will be defaulted)
            blat_params (only needed if blats are done by doblats.pl) 
            blats/
                bes_pseudo/
                    file1.blat    (blat files with query=bes, target=pseudomolecules)
                    file2.blat    (file name irrelevant but query/target names must match
                    etc..          those from individual fpc/pseudo directories)
                mrk_pseudo/
                    file1.blat    (blat files with query=marker, target=pseudomolecules)
                    etc..
    
    

    4.4 FPC/FPC Data

    This data is optional and only needed if you wish to have an fp_comp.txt for
    fingerprint overlaps or to alter one or more parameters.

    data/fpc_fpc/
        <fpc_name1_to_fpc_name2>/    (names must come in alphabetical order)
            params          (see sample in demo_fpc_to_demo_fpc)
            fp_comp.txt     (fingerprint overlaps; see sample)
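
Because the two names in the pair directory must be in alphabetical order, it can be convenient to derive the directory name rather than type it. A minimal sketch, assuming plain case-sensitive string ordering and the `_to_` separator seen in demo_fpc_to_demo_fpc; the function name and the project names in the usage note are hypothetical.

```python
def fpc_pair_dir(name_a, name_b):
    """Return the pair directory name with the two names sorted."""
    first, second = sorted([name_a, name_b])
    return first + "_to_" + second
```

For two projects named rice_fpc and maize_fpc, `fpc_pair_dir("rice_fpc", "maize_fpc")` gives `maize_fpc_to_rice_fpc` regardless of the argument order.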
    
    


    5 Running BLATs

    The script doblats.pl automates the running of BLAT alignments for marker and BES sequences against pseudomolecules. It is not necessary to do your BLAT searches using this tool but it is provided for convenience.

    Generally better results are obtained with less stringent BLAT searches, allowing the anchor filter to prune them afterwards.

    Usage of doblats.pl:

    perl scripts/doblats.pl <fpc_name> <pseudo_name>
    The blats are controlled by a blat_params file under the appropriate pair subdirectory (see example under demo_fpc_to_demo_seq).

    The fields are as follows:

    type: bes_pseudo or mrk_pseudo
    pat1: pattern to match in the first name (the bes/mrk file name)
    pat2: pattern to match in the second name (pseudomol file name)
    use: y or n; do or don't do blats for matching files
    args: blat argument list

    The patterns allow different subsets of markers or BES to be run with different BLAT parameters. Note that the patterns are restricted to be very simple, just substrings or "." to match anything.
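
The pattern rule described above is simple enough to sketch. The code below is an illustration of the matching rule only, not of doblats.pl itself: it assumes the blat_params entries have already been read into dictionaries keyed by the field names above, and that the first enabled matching entry wins, which the guide does not state. The function names and the sample argument string are hypothetical.

```python
def pattern_matches(pattern, file_name):
    """'.' matches anything; any other pattern is a plain substring test."""
    return pattern == "." or pattern in file_name


def blat_args_for(rules, blat_type, query_file, target_file):
    """Return the args of the first enabled rule that matches, else None."""
    for rule in rules:
        if (rule["type"] == blat_type
                and rule["use"] == "y"
                and pattern_matches(rule["pat1"], query_file)
                and pattern_matches(rule["pat2"], target_file)):
            return rule["args"]
    return None


# Hypothetical rules: BLAT only the BES files whose name contains "set1".
rules = [
    {"type": "bes_pseudo", "pat1": "set1", "pat2": ".",
     "use": "y", "args": "-minScore=30"},
    {"type": "bes_pseudo", "pat1": ".", "pat2": ".",
     "use": "n", "args": ""},
]
```

Here `blat_args_for(rules, "bes_pseudo", "set1_bes.seq", "chr01.seq")` returns the first rule's argument string, while a BES file without "set1" in its name matches no enabled rule and yields None.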


    6 SyTry

    The directory "sytry" contains the SyTry tool, developed for algorithm testing and parameter adjustment. It is currently released as beta software; for further details on usage, see the README in the sytry directory, and the program's own help function.

    Also in the SyTry directory is rdp.sh, a script which will launch the Java applets of SyMAP in a standalone mode.



    [1] Kent, J. (2002) BLAT--the BLAST-like alignment tool, Genome Research 12:656-64.
  • Last Modified Tuesday November 10, 2009 09:21 AM and 48 seconds