This guide provides detailed instructions for the SyMAP user interface. For detailed
information on installation, requirements, and capabilities of SyMAP, see
the System Guide. For a quick visual
introduction, see the Tour.
The first sections below describe features of the Java application, while
the final section describes the web-only displays.
The 2D display is activated from the Explorer by clicking the 2D button,
or by selecting a block in the Whole Genome Dot Plot.
Below is a screenshot depicting an FPC-to-sequence comparison for Maize FPC
(chromosome 3) to Rice (chromosome 1) to Sorghum (chromosome 3). Note that rice chr1 is the "reference"
chromosome, hence is placed in between the others.
The alignment window displays "tracks" (drawn as rectangles) of sequence information, showing synteny "hits" (drawn as lines) between tracks.
There are three types of tracks: Sequence, Block, and Contig.
Left-clicking in a track shows the most commonly used filters and features (e.g.
flipping the region). They are also available from the filter menu above each track
(e.g. Sequence Filter). See the Filter descriptions below.
The control panel provides buttons that allow for navigation between views, scaling, color customization, and help.
History (Home, <, >) SyMAP retains a record of the prior views (like a web browser).
The history navigation back (<) and
forward (>) buttons allow you to
move back and forth through the previous views.
The Home button returns to the initial view.
Zoom Buttons (-, +)
The zoom buttons allow for quickly expanding and narrowing the view region.
The minus (-) button shrinks the view region by 50%, keeping
the same center, while the plus (+) button doubles the view region,
again keeping the same center.
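The arithmetic behind the two zoom buttons can be sketched as follows. This is only an illustration and not SyMAP's own code: the function names, the use of integer base-pair coordinates, and the clamping to the sequence length are all assumptions.

```python
def rescale_region(start, end, factor):
    """Resize the region [start, end] by `factor`, keeping its center fixed."""
    center = (start + end) / 2
    half = (end - start) * factor / 2
    return int(center - half), int(center + half)


def clamp(region, seq_len):
    """Keep a region inside the bounds of a sequence of length seq_len."""
    start, end = region
    return max(0, start), min(seq_len, end)


# A factor of 0.5 halves the region, a factor of 2 doubles it.
halved = rescale_region(1000, 3000, 0.5)
doubled = clamp(rescale_region(1000, 3000, 2), seq_len=3500)
```

Here `halved` keeps the center at 2000, and `doubled` is trimmed because the doubled region would run past the end of the hypothetical 3500 bp sequence.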
The Scale button
(right of +) resizes the tracks so that they are in the same scale
(base pairs per pixel) as the reference sequence track in the view.
Mouse Function Selector
This drop-down selector assigns a function to the mouse's left button
click and drag actions:
- Base View - open the base alignment view for the selected region.
- Zoom - zoom to the selected region.
- Zoom All - zoom the track to the selected region and all other tracks with hits in the region.
Opens a menu for customizing colors.
Help Opens this manual in a web browser.
Drag bottom of track: Position the mouse at the bottom of the track (a resize cursor appears), hold down the left mouse button, and move the mouse.
Mouse wheel: Position the mouse over the track and use the mouse wheel.
Right mouse button: Position the mouse over a track or the white space between tracks, and click the right mouse button.
The main panel is divided into "tracks",
where each track represents an FPC map or sequence. Each track
is aligned to the tracks to its immediate left or right, and the
individual pieces of evidence for synteny (referred to as anchors or "hits") are drawn as
lines between the two tracks, color-coded by type. Above each track
is a "filter" button which allows configuration of the content displayed for
that track (e.g., annotations), and above the space between each pair of tracks
is a "hit filter" button that allows selection of the types of hits displayed
between the tracks on either side (e.g., only show syntenic hits).
A Sequence track represents a contiguous piece of sequenced DNA,
typically for a sequenced chromosome. The sequence length
and the coordinates of the displayed region are indicated above and below the track.
Annotation data of the following types may be loaded into SyMAP and displayed.
(Note that display colors can be changed.)
- Gaps: red band across the chromosome
- Centromere: blue "X" across the chromosome
- Predicted genes and exons: annotation strip in center of chromosome
- Genetically anchored ("Framework") markers (FPC maps only): see images below
The sequence information displayed can be configured via the
Sequence Filter by clicking the
Sequence Filter button above the sequence track, or by right-clicking the
mouse over the sequence track.
The example below shows a sequence-to-sequence alignment.
- The black lines down the center of the Brachypodium chromosome indicate predicted
genes; this annotation file did not include exons.
- The black lines with heavier blue bars interspersed down the center of the rice chromosome indicate exon/intron
predictions; the annotation file included both genes and exons.
- The alternating brown/gray lines indicate anchor clusters. The solid brown lines indicate
a single anchor (or multiple anchors very close together), while the gray lines are spaces with no alignment.
- The ruler on the right side of the sequence rectangle
shows the relative BP position along the sequence.
- Additional information about each
annotation may be seen by choosing "Show Description for Annotations" from the Sequence Filter.
The Sequence Filter allows the user to select the type of information shown for the
sequence. It is accessed via the Sequence Filter
button above the sequence track,
or by right-clicking the mouse over the sequence track.
Start and End
The positions of the sequence display can be set via the
corresponding text boxes.
The units of the values entered can be selected from the accompanying drop down menus
(BP, KB, MB, GB).
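As a small illustration of how a Start or End value combines with its unit, the sketch below converts an entered value to base pairs. It assumes decimal multiples (1 KB = 1,000 bp); the guide does not state the multiplier SyMAP uses, and the function name is hypothetical.

```python
# Assumed decimal multipliers for the four units offered in the drop-down.
UNIT_FACTORS = {"BP": 1, "KB": 1_000, "MB": 1_000_000, "GB": 1_000_000_000}


def to_base_pairs(value, unit):
    """Convert a text-box value and its unit to a whole number of base pairs."""
    return int(round(float(value) * UNIT_FACTORS[unit.upper()]))


start_bp = to_base_pairs("1.5", "MB")
end_bp = to_base_pairs(2, "MB")
```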
Full Sequence Sets the start and end positions of
the sequence display to encompass the whole chromosome.
Flip Reverses the orientation of the sequence track.
Enables/disables the display of gene/exon annotations
along the sequence.
Show Framework Markers Typically genetic markers or radiation hybrid markers.
This option enables/disables the display of framework markers (drawn as solid green
rectangles) along the sequence.
Show Gaps Enables/disables the display of sequence gaps (drawn as solid red
rectangles) along the sequence.
Show Centromere Enables/disables the display of the centromere (drawn as a cyan "X") on the sequence.
Show Ruler Enables/disables the display of the sequence ruler along the right side
of the sequence.
Show Description for Annotations Enables/disables the display of the annotation descriptions along the
right side of the sequence. NOTE: this only works if you are zoomed in close enough that they can clearly be seen.
Show Hit Score Line Enables/disables the display of the score line next to each hit along the
sequence. The length of the line represents the magnitude of the % Identity value for the hit.
Show Hit Score Value Enables/disables the display of the score value next to each hit along the
sequence. The score value corresponds to the % Identity value for the hit.
Show Hit Length Enables/disables the display of the hit length line next to each hit along the
sequence. The hit length line denotes the start and end points of the hits relative
to the sequence (may not be visible for tiny hits).
A Block track shows a block of FPC contigs, which is simply
a list of contigs arranged end-to-end. This track
type allows alignments to be displayed which stretch across more than one
FPC contig. Individual clones are not shown in a block track, but markers
are shown, and hits from both clones and markers to neighboring tracks are shown.
Clicking on one of the contig rectangles of a block track
brings up the contig track display for that particular contig. The filter options
that are shared with the Contig track are maintained.
A Block track may simply be a group of contigs, in which case the block title is the
project name (e.g., "Maize"). Or a block track may contain a synteny block,
identified by SyMAP software, in which
case the Block Title includes the Synteny Block Number. The Synteny Block Number
consists of three fields:
In the screenshot above, the Synteny Block Number "6.1.1" refers to Maize Chromosome 6
mapped to Rice Chromosome 1, synteny block #1 of this pairing.
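Going by the "6.1.1" example above, a Synteny Block Number can be split into its three fields as sketched below. The field names `chr1`, `chr2` and `block` are hypothetical labels chosen for this illustration. Chromosome names are kept as strings on the assumption that they are not always numeric.

```python
from typing import NamedTuple


class SyntenyBlockId(NamedTuple):
    chr1: str   # chromosome of the first project (Maize chr 6 in the example)
    chr2: str   # chromosome of the second project (Rice chr 1 in the example)
    block: int  # block number within this chromosome pairing


def parse_block_id(text):
    """Split a block number such as '6.1.1' into its three fields."""
    chr1, chr2, block = text.strip().split(".")
    return SyntenyBlockId(chr1, chr2, int(block))


block_id = parse_block_id("6.1.1")
```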
The Block Filter allows the user to select the type of information shown for the
block. It is accessed via the Block Filter
button above the block track,
or by right-clicking the mouse over the block track.
See the Markers section for more information on markers.
Marker Name Allows control over which marker names/hits to display.
A marker is specified by entering a search string for the marker name and
selecting the appropriate setting in the show/hide drop-down. A wild-card
character "*" may be used at either end, e.g. "*SSR*".
Show No Marker Names Hide all marker names.
Show Only Marker Names With Hits Only show the names of markers which have a
hit somewhere along the chromosome currently being viewed. These hits
may not be visible if the view has been zoomed.
Show Only Marker Names With Visible Hits Only show the names of markers which have hits visible in the current view.
Show All Marker Names Display the names of all markers on the block, regardless of hits.
Contig Set Allows the user to define which contigs to display.
Multiple contigs and ranges may be specified, for example
"1,2,3,5-20,31,47" will display contigs 1,2,3,31,47, and all contigs between 5 and 20.
The contigs are displayed in the order given.
Flip Flip the entire block, reversing the
display order of the contigs.
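The Contig Set syntax described above (comma-separated contig numbers and inclusive ranges, shown in the order given) can be expanded as in the following sketch. It illustrates the syntax only; the function name is hypothetical and SyMAP's own parsing may differ in how it treats whitespace or malformed input.

```python
def parse_contig_set(spec):
    """Expand a Contig Set string such as '1,2,3,5-20,31,47' into a list
    of contig numbers, preserving the order in which they were given."""
    contigs = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if "-" in part:
            low, high = (int(x) for x in part.split("-", 1))
            contigs.extend(range(low, high + 1))  # ranges are inclusive
        else:
            contigs.append(int(part))
    return contigs


contigs = parse_contig_set("1,2,3,5-20,31,47")
```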
A Contig track shows a detailed view of the clones
that make up the selected contig. Hits from
clones to neighboring tracks are shown as lines going to the clones. If the
hit is from a BES on the clone, the line is drawn to one end of the clone, while
if the hit is from a marker attached to that clone, the hit is drawn to the middle
of the clone. Because a marker may be attached to many clones, a hit from
a particular marker is drawn from a red "marker join dot" to the side of the
contig, and the join dot is then connected to each of the clones to which
the marker is attached.
Since a contig track is usually accessed by choosing a contig from a block
track, a Block View link has been provided at the top of the contig
track which will return to the previous block view. The filter options that
are shared with the Block track are maintained when using this link.
The contig information displayed can be controlled via the Contig Filter.
Clones are represented as short vertical lines. Dots on the ends of the
line represent a BES.
- open: the BES has no hit
- light purple: the BES has a hit
- black: the BES has a hit within a block and has no hits showing
- dark purple: the BES has a hit within a block and has a hit showing
A green line in the middle of the clone going towards
another track is a marker hit.
Gold-colored clones are clones having both BES hits
with the same orientation. This can indicate an inversion breakpoint within the clone.
Moving the mouse over a clone will give a description in the status bar and highlight
the clone and its hits. Moving the mouse over a clone's BES gives more information
on that BES's hits.
The clone information displayed can be configured via the
Contig Filter by clicking the
Contig Filter button above the contig track, or by right-clicking the
mouse over the contig track.
Markers are shown next to Block tracks and Contig tracks. Moving the mouse over
a marker name will highlight the corresponding hits. Clicking on a marker's name will
keep the marker highlighted until another marker is clicked or the same marker is
clicked again.
Markers are colored based on their hit information and whether they are shared.
Left mouse button: Position the mouse over the marker and click the left mouse button.
Mouse over: Position the mouse over the marker/clone.
BES Hit is signified by a purple line between two tracks.
Marker Hit is signified by a green line between tracks. If a
marker of the same name can be found on another track in the view, the marker hit
will be blue. When viewing a contig in the Contig view, the hits
from the same marker on different clones are joined together at a red
dot. Moving the mouse over the red dot will highlight the marker's name.
Clone Fingerprint Hit is signified by a black line between two tracks.
The Contig Filter allows the user to select the type of information shown for the
currently visible contig. It augments the Block Filter with options specific to clones.
It is accessed via the Contig Filter button above the contig track,
or by right-clicking the mouse over the contig track.
Change Contig A quick way to change the currently viewed contig.
Clone Name: The Clone names and hits can be shown, hidden, or highlighted through the Clone
Name text box. A clone name can be searched on by entering a search string
for the clone name and choosing to show, hide, or highlight those clones that match it.
A wildcard "*" may be used at the beginning or end.
Clone with Remarks: Clones can be filtered or highlighted based on the remarks attached to them in the FPC file.
For convenience, all remarks present in the contig are listed in the selection box,
and one or more may be selected for filtering (to select multiple, hold down the
Control key while clicking). Clones containing at least one of the remarks selected
are filtered or highlighted based on the option selected above the remarks selection box.
Show Clone Names The user can choose to show clone names by selecting the Show Clone Names
check box. The width of the contig can then be adjusted to
give sufficient space for the names.
Show Only Clones with current BES paired hits Show only those clones for which both BES have a hit showing in the current view.
Show Only Clones with BES paired hits Show only those clones for which both BES have hits (filled-in circles).
Start and End The portion of the contig to be displayed can be adjusted through the start and end text boxes.
The desired start and end points are entered in CB units.
Width of Contig The width of the contig can be adjusted using the slider.
Flip the Contig Reverse the order in which the clones are displayed.
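The "*" wildcard used in the marker-name and clone-name searches can be pictured with the sketch below. It makes several assumptions that the guide does not confirm: matching is case-sensitive, a pattern without a wildcard must match the whole name, and "*" is the only special character. The marker names are made up for the example.

```python
import re


def wildcard_to_regex(pattern):
    """Turn a search string with '*' wildcards into a compiled regex.
    Every other character is matched literally."""
    parts = [re.escape(piece) for piece in pattern.split("*")]
    return re.compile(".*".join(parts))


def matching_names(names, pattern):
    """Return the names that match the wildcard pattern in full."""
    regex = wildcard_to_regex(pattern)
    return [name for name in names if regex.fullmatch(name)]


markers = ["umcSSR12", "SSR7", "bnlg100"]  # hypothetical marker names
ssr_markers = matching_names(markers, "*SSR*")
```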
The Hit Filter menu allows the user to select which types of hits are displayed,
and what stringency of alignment is required:
There are several types of hits which may be shown or hidden:
Synteny hits, i.e., those found by
SyMAP analysis to be part of a synteny block.
Gene hits, i.e., hits to an annotated gene region.
The types of filter are as follows:
E-Value Filter Hit types can be filtered by E-Value using the corresponding E-Value slider.
When the slider is all the way to the left, all hits are shown.
Percent Identity Filter Hit types can be filtered by percent identity using the corresponding
% Identity slider.
Show Only Shared Marker Hits This option applies to a 3-track (FPC/Pseudo/FPC) view. When selected, only marker hits
for markers that are shared between the two FPC tracks are shown.
Show Marker Join Dot The join dot is the red dot connecting all of the clone hits of a marker.
This option allows the user to show the marker hits without the join dot.
When this option is selected, marker hits will be drawn directly from the
centers of the clones to hit locations on the sequence.
This can make synteny more visible, but it also adds
some redundant lines for markers which are attached to many clones.
Show Only Synteny Hits Shows only hits that are part of a synteny block.
Show Only Hits to Genes Show only hits which intersect a gene annotation.
Show Only Non-Gene Hits Show only hits which do not intersect a gene annotation.
Show All Hits Shows all known hits.
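How the hit-type options combine with the two sliders can be sketched as a simple filter. Everything named here is hypothetical: the `Hit` fields, the mode names, and the threshold parameters. The guide does not say how slider positions map to numeric cut-offs, so the thresholds are passed in directly.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    name: str
    evalue: float
    pct_identity: float
    in_block: bool  # part of a synteny block
    in_gene: bool   # intersects a gene annotation


def filter_hits(hits, mode="all", max_evalue=None, min_identity=0.0):
    """Keep hits of the chosen type that also pass both thresholds."""
    type_test = {
        "all": lambda h: True,
        "synteny": lambda h: h.in_block,
        "gene": lambda h: h.in_gene,
        "non_gene": lambda h: not h.in_gene,
    }[mode]
    return [
        h for h in hits
        if type_test(h)
        and (max_evalue is None or h.evalue <= max_evalue)
        and h.pct_identity >= min_identity
    ]


hits = [
    Hit("h1", 1e-50, 95.0, True, True),
    Hit("h2", 1e-5, 70.0, False, True),
    Hit("h3", 1e-20, 88.0, True, False),
]
synteny_only = filter_hits(hits, mode="synteny")
```

Leaving `max_evalue` as `None` corresponds to the slider position at which all hits are shown.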
The base view displays a detailed diagram of the selected hit.
A base view of the hits along a sequence track can be brought up by dragging the
mouse along the sequence and releasing when the desired range is highlighted. The range
selected is increased if necessary to show the full length of the markers and BESs
involved. If the user selected range contains blank areas (no marker or BES hits) the
range is reduced to include only the genes with hits. There is a maximum range of 50Kb
that can be selected for the base view, so zooming in on an area first may be necessary.
If there are multiple hits in the selected region, then the base view of those hits
appears in a new dialog. This view consists of a ruler along the top showing the area
of the sequence covered, the hits, and the genes.
Hits are displayed as lines with an arrow on one end showing the direction of the hit.
Marker hits are green, while BES hits are purple. Clicking on a hit brings up the hit's
BLAST view in the bottom of the dialog. The hit presently shown in the BLAST view is
shown in gray. The full length of the marker/BES is shown as a gray dotted line above
the hits. A vertical red line along a hit represents a mismatch. A vertical green line
along a hit represents a deletion. An arrow pointing down (i.e. 'v') along a hit
represents an insertion.
Annotated genes are displayed below the hits. Exons are represented by a blue box, and
the leftmost or rightmost exon box will have an arrow tip indicating whether the gene
is on the + strand (right-pointing) or - strand (left-pointing) relative to the
sequence. The system attempts to expand the view to show the full gene that has
hits. Genes that overlap this gene will be partially shown.
The SyMAP Query interface has been greatly expanded in v4.0. This interface has two basic functions:
A. Locate homologous (or paralogous) regions based on annotation and location
B. Create putative gene families spanning multiple species, and apply family-based filters
Some sample queries which are possible through the interface:
• Find un-annotated regions on one genome which are homologous to regions on another genome
which have already been annotated.
• Find putative gene families which are present in one lineage but absent in another.
• Identify the likely orthologs on genome A of a given gene on genome B, by using synteny blocks
or exactly collinear pairs to filter out probable paralogs.
• Find orphan genes which are specific to one genome.
To open the query interface, first select two or more sequence projects in the Project Manager (note, the
queries are not supported for FPC maps). Then press the "SyMAP Queries" button to open the interface:
The Overview window (above) lists the projects which were selected for querying. To set up a query, open the
"Query Setup" window by clicking on its title in the left panel. This opens the window below:
Here you set up the query and then press "Do Search" to execute it.
The query occurs in three stages, corresponding to the section numbers
on the Query Page. Each stage has its own filters, as follows:
1. Filter hits
The first stage is to retrieve hits (anchors) from the database, based on
filters such as chromosome and annotation string. Note that each anchor
connects two species and hence represents a pair of
regions on the two species. Filters are as follows:
Annotation String Search
Enter annotation search terms. Hits will be returned which overlap a matching annotation
on either side.
Each selected species has a row in which you can select to search either all
its sequences, or a particular sequence or basepair region.
Only synteny hits
Return only hits which are part of a syntenic block. This
helps to screen false positive hits, but can also conceal true hits that are not
part of a detectable syntenic block. (Note that SyMAP hits are already filtered
during loading using a reciprocal-top-2 filter.)
Only hits in a collinear gene pair
Return only hits which are part of a pair of aligning genes having no intervening
non-aligning genes. This relates to the RunSize column, which is one of the optional
columns of the Query Result table. RunSize shows the size of the collinear chain that
contains a hit, hence this checkbox is equivalent to requiring a RunSize of at least 2.
Note that a collinear chain is not the same as a SyMAP synteny block, because blocks
require at least 7 anchors, and are allowed to have intervening genes which do not align.
Show orphan genes
If checked, then the results will show all genes which do not overlap any of the hits
matching the search criteria (i.e., the hits which would be shown in the table if this
checkbox were not checked). Note that the genes must also meet the search criteria,
e.g. matching the annotation query string and the chromosome location requirements.
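The orphan-gene rule (a gene is reported if it overlaps none of the hits matching the search) can be sketched as below. The tuple layouts, function names and gene names are assumptions made for this illustration, and coordinates are treated as inclusive.

```python
def overlaps(start_a, end_a, start_b, end_b):
    """True if two inclusive intervals share at least one position."""
    return start_a <= end_b and start_b <= end_a


def find_orphans(genes, hit_regions):
    """genes: (name, chrom, start, end); hit_regions: (chrom, start, end).
    Returns the names of genes that overlap no hit region."""
    orphans = []
    for name, chrom, start, end in genes:
        has_hit = any(
            chrom == hit_chrom and overlaps(start, end, hit_start, hit_end)
            for hit_chrom, hit_start, hit_end in hit_regions
        )
        if not has_hit:
            orphans.append(name)
    return orphans


genes = [("g1", "chr1", 100, 500), ("g2", "chr1", 900, 1200), ("g3", "chr2", 100, 500)]
hit_regions = [("chr1", 450, 600)]
orphans = find_orphans(genes, hit_regions)
```

In this example `g3` is an orphan even though its coordinates match the hit, because the hit lies on a different chromosome.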
2. Filter putative gene families (PgeneFs)
Using the hits that pass the Stage 1 filters, SyMAP constructs putative gene families (PgeneFs)
spanning the selected species. This is done by grouping hits whose hit regions overlap on
at least one genome. Note, if you have more than 6 species selected, this stage can take
an hour or more.
Each PgeneF is given a number, which is shown in the Query Results table (column name "PgeneF").
The size of the PgeneF is also shown (column "PgFSize").
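One way to picture the grouping step is with a union-find structure, sketched below. This is an illustrative technique and not necessarily how SyMAP implements it. Each hit is assumed to carry two regions of the form (species, chromosome, start, end), and hits are merged whenever their regions overlap on the same sequence.

```python
from collections import defaultdict


def group_hits(hits):
    """hits: list of region pairs; a region is (species, chrom, start, end).
    Returns one family number (starting at 1) per hit."""
    parent = list(range(len(hits)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    # Collect every hit region per sequence, remembering which hit owns it.
    by_sequence = defaultdict(list)
    for index, hit in enumerate(hits):
        for species, chrom, start, end in hit:
            by_sequence[(species, chrom)].append((start, end, index))

    # Sweep each sequence left to right, merging overlapping regions.
    for regions in by_sequence.values():
        regions.sort()
        current_end, current_index = regions[0][1], regions[0][2]
        for start, end, index in regions[1:]:
            if start <= current_end:
                parent[find(index)] = find(current_index)
                current_end = max(current_end, end)
            else:
                current_end, current_index = end, index

    numbers, result = {}, []
    for i in range(len(hits)):
        root = find(i)
        numbers.setdefault(root, len(numbers) + 1)
        result.append(numbers[root])
    return result


hits = [
    (("A", "1", 100, 200), ("B", "1", 500, 600)),
    (("A", "1", 150, 250), ("C", "2", 10, 90)),
    (("B", "1", 900, 1000), ("C", "2", 50, 120)),
    (("A", "2", 100, 200), ("B", "3", 1, 50)),
]
families = group_hits(hits)
```

The first three hits end up in one family because the first two overlap on species A and the second and third overlap on species C. The fourth hit overlaps nothing and forms its own family.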
Filters at this stage apply to the PgeneF as a whole:
These filters permit searching for gene families shared by one group of species but not
present in another.
If a species is checked to include, then the PgeneF will only be retained if it includes
at least one hit to that species.
If a species is checked to exclude, then the PgeneF will be discarded if any of its
hits are to that species.
No annotation to included species
Find PgeneFs which are not yet annotated. A PgeneF will be discarded if it is annotated on any of
the species which are checked in the Include line.
Complete linkage of included species
Require the PgeneF to be fully linked, i.e. for each pair of species A and B in the group,
there must be a hit linking A to B.
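The include, exclude and complete-linkage rules can be expressed as a single check on one PgeneF, as in the sketch below. The representation of a family as a list of species pairs (one pair per hit) and the function name are assumptions for illustration only.

```python
from itertools import combinations


def keep_family(hit_pairs, include=(), exclude=(), complete_linkage=False):
    """hit_pairs: (species_a, species_b) for each hit in one PgeneF.
    Returns True if the family passes the Stage 2 species filters."""
    pairs = {frozenset(pair) for pair in hit_pairs}
    species = set().union(*pairs) if pairs else set()
    # Every included species must be hit at least once.
    if any(s not in species for s in include):
        return False
    # No hit may touch an excluded species.
    if any(s in species for s in exclude):
        return False
    # Complete linkage: every pair of included species is directly linked.
    if complete_linkage:
        for a, b in combinations(sorted(include), 2):
            if frozenset((a, b)) not in pairs:
                return False
    return True


family = [("A", "B"), ("B", "C")]
passes = keep_family(family, include=("A", "B", "C"))
```

With complete linkage switched on, the same family would be rejected because no hit links A directly to C.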
3. Filter displayed hits
The query returns hits between all pairs of the selected species; however, you may
only be interested in seeing those which hit certain species. To achieve this,
select those species in the Include row of Filter Stage 2, and also check
the box below, Show only hits to the included species. Only those hits will be shown,
although the PgeneF numbers will reflect groupings created using all hits.
SyMAP Query Results
When the query is complete, the Query Results page opens showing the table of results:
The table contains all the hits (anchors) resulting from the query. Each hit connects
two species and you can see the respective chromosomes and start/end locations of
the hits, as well as gene annotations overlapped by the hits.
Note that the table contains columns for all of the selected
species, but each hit only connects two species, and the other species columns are empty.
If the query specified orphan genes, then each row represents one gene and shows data only
for one species.
Note that entries in the table may be selected. To select more than one
entry, use "ctrl-click".
Brings up a SyMAP 2-track view for each selected entry, so the hits
can be seen in their full chromosomal context. The hits are initially padded to each side
by the margin amount indicated (default 50kb), but you can easily zoom further out in the 2-track views.
Saves the selected hits using the selected set of columns to a CSV format suitable for import into Excel.
Save for Reload
Saves the selected hits in a CSV format which can be reloaded back into SyMAP
later (click "Results" on the left-hand pane,
and use the "Load Saved Query" button). The saved table includes all possible columns, not just
those currently selected for display.
Save as Fasta
Sequences from the selected hits are written to a Fasta file. Both sides of each hit
are written, using the start/end coordinates shown in the table.
Sequences for the selected hits are written out and a multiple alignment is created
using MUSCLE (Edgar 2004 NAR:32). If no selection is made then the whole table is used. Note that
alignments of more than 10 sequences can take considerable time.
You can sort the columns by clicking on them, and rearrange them by dragging the header
boxes. You can add/remove columns using the "Select Columns" button at the bottom. This
opens up a column-selection section, shown here:
Here are the meanings of the columns:
Row number within the table
PgeneF number. All hits in a given PgeneF have the same number. Note that
the number is generated during the search and won't necessarily be the same
in a different search.
Size of the PgeneF which contains this hit.
Database index of the hit. This is used in the "Save for Reload" function.
Synteny block containing this hit (if any). The format is chr1.chr2.N, where N
runs from one to the number of blocks between those two chromosomes.
The number of anchors which comprise the synteny block.
If the hit is contained in an uninterrupted collinear sequence of shared
genes, RunSize shows the length of this sequence. If you have chosen
the option "Only hits in a collinear gene pair", then RunSize will be two
or greater for all returned hits.
Location: one row for each species
Chromosome (or draft sequence number) where the hit is aligned
Start/end of the hit region (note that hits are clustered during anchor loading, so hit regions
are not necessarily single MUMmer alignments).
Shows the number of distinct regions on this species which are included in the
PgeneF containing this hit. For example, if a gene from Species A expanded by
tandem duplication to 5 genes on Species B, then the PgeneF will have PgFSize=5,
and #RGN will be one for Species A and 5 for species B.
Annotation: one row for each species
The key/value pairs from the annotation gff attributes sections are shown here.
They can be different for each species, e.g. one may have "ID" while another uses "Name" for the
gene names. "All_anno" shows all of the attributes joined together.
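To make the key/value idea concrete, the sketch below parses a GFF3-style attributes field into a dictionary and builds a joined string in the spirit of "All_anno". The "; " separator, the function names and the sample attribute text are assumptions; the guide does not specify how SyMAP joins the attributes.

```python
from urllib.parse import unquote


def parse_gff_attributes(field):
    """Parse a GFF3 attributes field ('key=value;key=value') into a dict.
    GFF3 percent-encodes reserved characters, so values are decoded."""
    attributes = {}
    for item in field.strip().split(";"):
        if not item.strip():
            continue  # tolerate a trailing semicolon
        key, _, value = item.partition("=")
        attributes[key.strip()] = unquote(value.strip())
    return attributes


def all_anno(attributes):
    """Join all attributes into one string (assumed separator: '; ')."""
    return "; ".join(f"{key}={value}" for key, value in attributes.items())


attrs = parse_gff_attributes("ID=gene01;Name=abc1;Note=protein%20kinase;")
joined = all_anno(attrs)
```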
This section shows overall statistics for the query results, for each species.
The meaning of the terms is as follows:
Number of hits involving that species.
Number of distinct regions covered on that species (see the definition of the "#RGN" column above).
Out of the number of distinct regions, how many have annotation.
Number of orphan genes returned for this species.
Number of chromosomes (or draft sequences) this species has, for reference.
Prior to v4.2, the web-display system was partly separate from the standalone system and contained
some slightly different displays. Starting in v4.2 these have been unified so now the web displays
are the same as those available in standalone, which are described in the other sections of this document.
As of v4.2, there are two ways to download data for SyMAP synteny blocks, individual anchors, and annotations:
- Through the Explorer. Select the species of interest,
open the Explorer, and the download button is at the lower left.
This exports a table of all the computed synteny block co-ordinates for all the
selected species, including their self-alignments, if those were computed.
- Through the Query interface. For example,
choosing two species and executing a query with the default parameters
will show a table of all the anchors found between the two species, plus
their annotations and synteny block membership, if any.