The University of Arizona
MTP tutorial  

Written by Fred Engler     Aug 2003
Updated by Will Nelson     June 2004
Updated by Martin Pokorny, Jingmei Yang, Will Nelson & Cari Soderlund   April 2006
Updated by Jingmei Yang, Will Nelson & Cari Soderlund   Aug 2006
Updated by Will Nelson & Cari Soderlund   Aug 2007

Contents

A. Introduction
B. MTP using Fingerprints
    1. Finding overlapping clone pairs using fingerprints (Step1)
    2. Viewing overlapping clone pairs in the contig display (Step3)
    3. Picking MTP clones (Step2)
    4. Viewing MTP clones in the contig display
C. Adding overlap data from draft sequence alignments to BES
D. Saving MTP results
E. Mandatory clones
F. Split BSS Contigs
G. HICF Contigs
Also see: MTP simulation results

A. Introduction

Selecting an MTP (Minimal Tiling Path) is the task of picking a set of minimally overlapping clones that span an entire contig. Due to inexact coordinates in the CB (Consensus Bands) map, one cannot pick overlapping clones based solely on their position on the map.

There are two methods, using two different input sources, for picking MTP clones: 1) the fingerprint method, in which overlaps are determined by looking at the clone fingerprints and their map position, and 2) the BSS-draft method, in which sequence comparison between draft sequence and BESs (BAC End Sequences) via BSS is used. The first method involves analyzing the fingerprints of a pair of overlapping clones for shared restriction fragments (bands), and verifying the integrity of the fingerprints of the potentially overlapping pair by matching bands with a spanning clone and two flanking clones. In the second method, map overlap is confirmed by a draft sequence contig matching two BESs of the overlapping clone pair. The first method uses information that is already present in the physical map, but the overlaps can be inexact. The second method requires sequence information, but gives very exact overlaps.

The 'Select MTP' function can use fingerprints, BES draft sequence comparison results, or both as input for picking MTP clones. When both are used, precedence is given to overlaps verified by the BSS-draft method. There are two steps in automatically picking MTP clones: 1) finding a set of overlapping clone pairs, and 2) picking a contiguous path of overlapping clone pairs through a contig.

The following sections guide you through an example of using the automatic MTP picking function of FPC. Before you begin, please download the files used in this demo by clicking here. Uncompress the files:
tar xzvf mtpdemo.tar.gz
Change directory to the mtpdemo directory:
cd mtpdemo
Next, start FPC with the demo file by typing
fpc mtpdemo
on the command line. Open the window for selecting MTP clones by clicking on the 'MTP' button from the Main Menu. The following window appears:



B. MTP using Fingerprints

1. Finding overlapping clone pairs using fingerprints

This step locates all pairs of overlapping clones that satisfy the criteria given in the top part of the window. No sequence information is used in this step. The fingerprint bands and the map positions of clones are used in the analysis.
Make sure the 'Use fingerprints' options are turned 'on' (the circle is filled). Leave all parameters at their default value. An explanation of the parameters is given in the online help, which may be read by clicking on the 'Help' button at the top of the window.
Click on the 'Find overlapping pairs' button. This starts the computation process. You will see on the standard output the progress being made through the contigs, as shown below:

********** Find overlapping pairs ***********
Read 114098 bands from .cor file. Band range(720 3298).
Find Fingerprints Pairs
Clone pairs for ctg1 (clones 314)...3837 pairs
Clone pairs for ctg2 (clones 221)...2880 pairs
Clone pairs for ctg3 (clones 92)...1016 pairs

.
.
.
Clone pairs for ctg40 (clones 5)...1 pairs
Clone pairs for ctg41 (clones 2)...0 pairs
Contigs with zero pairs 2
Identified 33456 fingerprint pairs

// All contigs  Min FPC overlap 0  Max FPC overlap 20
// Use Fingerprints: Min Shared Bands 6  
********** Finish overlapping pairs ***********


When the computation is complete, the button will turn gray.
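
If you redirect the standard output to a file, the per-contig lines can be tallied with a short script. The following is a minimal Python sketch and is not part of FPC. It assumes the line layout shown in the sample output above; the names `LOG` and `parse_pair_counts` are made up for this illustration.

```python
import re

# Sample lines copied from the tutorial's standard output (elided lines omitted).
LOG = """\
Clone pairs for ctg1 (clones 314)...3837 pairs
Clone pairs for ctg2 (clones 221)...2880 pairs
Clone pairs for ctg3 (clones 92)...1016 pairs
Clone pairs for ctg40 (clones 5)...1 pairs
Clone pairs for ctg41 (clones 2)...0 pairs
"""

# Assumed layout: "Clone pairs for <contig> (clones <n>)...<m> pairs"
LINE_RE = re.compile(r"Clone pairs for (ctg\d+) \(clones (\d+)\)\.\.\.(\d+) pairs")


def parse_pair_counts(text):
    """Return {contig: (clone_count, pair_count)} for every matching line."""
    counts = {}
    for match in LINE_RE.finditer(text):
        contig, clones, pairs = match.groups()
        counts[contig] = (int(clones), int(pairs))
    return counts


counts = parse_pair_counts(LOG)

# Contigs that produced no overlapping pairs deserve a closer look.
zero_pair_contigs = [name for name, (_, pairs) in counts.items() if pairs == 0]
print(counts)
print("zero-pair contigs:", zero_pair_contigs)
```

Only the lines quoted in this tutorial are parsed here, so the totals will not match the "Identified 33456 fingerprint pairs" summary, which covers all 41 contigs.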

2. Viewing overlapping clone pairs in the contig display

We will come back to STEP 2, but first we will look at the results from STEP 1 by going to STEP 3 (which shows results from both steps). To see the overlapping pairs in the contig display, select a contig via the 'Contig' text box, and click the 'Next' button beside 'Step through pairs'. The selected contig will open, and the first pair, along with the spanning and flanking clones, will be highlighted. As you continue to click on 'Next', each pair will in turn be highlighted:



If the 'Show fingerprints (fp only)' option is turned on, the Fingerprint window will also open, showing a series of fingerprints:



A total of five fingerprints will be shown, and five clones are highlighted. In the contig display, the clones highlighted in blue indicate the clone pair. The pale blue clone spanning the overlap of the pair (called the spanner) verifies the shared bands of the pair. Extending to the left and right of the pair are two clones highlighted in gray. These clones confirm bands in the pair that are not confirmed by the spanning clone. The Fingerprint window is used to show how bands are shared. The following color scheme is used:
Cyan -- band is shared by both clones in the pair and the spanning clone.
Green -- band is shared only by the left clone in the pair and spanning clone.
Blue -- band is shared only by the right clone in the pair and spanning clone.
Violet -- band is shared by a clone in the pair and its flanking clone, but not by the spanner.
Red -- band in a pair or spanning clone that is unconfirmed; a mismatch.
In the standard output, information on the shared bands and unmatched bands, along with the length of the pair clones, is given:

Fingerprint pair:
L-flank     Left        Spanner     Right       R-flank
z2598       z2597       z2602       z2612       z2611
            122880                  167936                  (length)
6           10          12          20          9           (shared)
-           2           1           0           -           (mismatch)
The numbers displayed on the terminal for each pair have the following correspondence to the colors of bands in the Fingerprint window.
  • The number of cyan bands is the number in the "Spanner" column and "(shared)" row (e.g. 12).
  • The number of green bands is the number in the "Left" column and "(shared)" row (e.g. 10).
  • The number of blue bands is the number in the "Right" column and "(shared)" row (20).
  • The number of violet bands in the fingerprint of the left clone of the pair is the number in the "L-flank" column and the "(shared)" row (e.g. 6).
  • The number of violet bands in the fingerprint of the right clone of the pair is the number in the "R-flank" column and the "(shared)" row (e.g. 9).
  • The number of red bands is the sum of the numbers in the "(mismatch)" row (e.g. 2, 1, 0 for Left, Spanner, Right, respectively).
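
The correspondence in the list above can be expressed as a small lookup. This is a hedged Python sketch that reads the sample report text exactly as printed in this section; it is not FPC code, and the function name `color_counts` and the dictionary keys are invented for the example.

```python
# Report block copied from the tutorial's standard output.
REPORT = """\
L-flank     Left        Spanner     Right       R-flank
z2598       z2597       z2602       z2612       z2611
            122880                  167936                  (length)
6           10          12          20          9           (shared)
-           2           1           0           -           (mismatch)
"""


def color_counts(report):
    """Map the (shared) and (mismatch) rows to band colors in the Fingerprint window."""
    lines = report.splitlines()
    columns = lines[0].split()  # L-flank, Left, Spanner, Right, R-flank
    rows = {}
    for line in lines[1:]:
        fields = line.split()
        if fields and fields[-1] in ("(shared)", "(mismatch)"):
            rows[fields[-1]] = dict(zip(columns, fields[:-1]))
    shared = rows["(shared)"]
    mismatch = rows["(mismatch)"]
    return {
        "cyan": int(shared["Spanner"]),
        "green": int(shared["Left"]),
        "blue": int(shared["Right"]),
        "violet_left": int(shared["L-flank"]),
        "violet_right": int(shared["R-flank"]),
        # "-" means the row does not apply to that column (the flanking clones).
        "red": sum(int(v) for v in mismatch.values() if v != "-"),
    }


colors = color_counts(REPORT)
print(colors)
```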

You may get output as follows:


Fingerprint pair: olap 49152
L-flank     Left        Spanner     Right       R-flank
z2598       z2597       z2602       z2612       z2611      
            122880                  167936                  (length)
6           10          12          20          9           (shared)
-           2           1           0           -           (mismatch)
which indicates that no valid spanner and flankers could be found, though the pair does qualify based on the user input.
  • You can step through the pairs by repeatedly clicking on the 'Next' and 'Previous' buttons beside 'Step through pairs'.
  • You may select the clone from which to begin stepping through the pairs by clicking on 'Pick start', followed by clicking on the clone of interest.
  • You may wish to hide everything except the step buttons. To do this, click on the 'Mini' button. Only those options essential in stepping through pairs are shown:



To revert to the full-sized window, click on the 'Full' button.

3. Picking MTP clones

Now we will go back to STEP2 to automatically select the MTP. Using the shortest paths algorithm, a minimal path of clones is picked through a contig based on the amount of clone overlap (shared bands) and clone size. To run this, click on the 'Pick MTP clones' button in STEP 2. The following text will be displayed:

************ Starting PickMTP ************
Building graphs completed.         
Finding MTP completed.         
Average MTP clone size: 143959
Contig totals: (in CB units)
Contig    Ctg len   # MTP  overlap   # of gaps   gap length   %covered
-------   -------   -----  -------   ---------   ----------   --------
ctg1          493      14      102           0            0        94%
ctg2          303       8       62           0            0        94%
ctg3          159       4       23           0            0        92%
ctg4          137       3       16           0            0        79%
ctg5          185       5       30           0            0        94%
ctg6          538      17      133           0            0        97%
ctg7          298      11       76           0            0        94%
ctg8          872      26      184           0            0        97%
ctg9          257       7       48           0            0        89%
ctg10         100       3       16           0            0        95%
...
ctg40          37       1        0           0            0        64%
ctg41          41       1        0           0            0        70%

Clone overlap (base pairs):
Positive:
    20000- 30000- 40000- 50000- 60000- 70000- 80000-
    29999  39999  49999  59999  69999  79999  89999 
       80     31     21      7      5      2      3 
   Total positive overlap: 5177344
   Average positive overlap: 34747
   Number of positive clone overlaps: 149

Clones picked:190
BSS pairs: 0 (0%)
Fingerprint pairs: 149 (98%)
Single MTP clones: 3 
Mandatory clone pairs: 0 (0%)
Expressway junctions: 0 (0%)

Number of clones in MTP: 190
Number of mandatory clones: 0

Total gap span: 0 kb
Total MTP span: 24019 kb
Percent of map covered: 92%

// All contigs  Prefer large    Mandatory: 
************ Finished PickMTP ************

The 'Pick MTP clones' button will turn gray once the process completes.
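
To make the shortest-paths idea behind this step concrete, here is a minimal Python sketch using Dijkstra's algorithm on a toy graph. It is an illustration only: the clone names, the edge costs, and the choice of "shared bands" as the cost are assumptions, and FPC's actual weighting of overlap and clone size is not described in this tutorial.

```python
import heapq


def shortest_path(edges, start, goal):
    """Dijkstra over a directed graph given as {node: [(neighbor, cost), ...]}."""
    best = {start: 0}
    previous = {}
    queue = [(0, start)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == goal:
            break
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbor, step in edges.get(node, []):
            new_cost = cost + step
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                previous[neighbor] = node
                heapq.heappush(queue, (new_cost, neighbor))
    if goal not in best:
        return None  # no contiguous path of overlapping pairs
    path = [goal]
    while path[-1] != start:
        path.append(previous[path[-1]])
    return path[::-1]


# Hypothetical clones A..E ordered left to right on the map. Each edge is an
# overlapping pair; its cost is a made-up overlap (shared bands), so a cheaper
# path wastes less sequence on overlap.
pairs = {
    "A": [("B", 20), ("C", 8)],
    "B": [("C", 15), ("D", 7)],
    "C": [("D", 18), ("E", 6)],
    "D": [("E", 9)],
}

path = shortest_path(pairs, "A", "E")
print(path)
```

In this toy graph the path through C has the smallest total overlap. When no path connects the two ends, the function returns `None`, which loosely corresponds to the situation where the MTP breaks into more than one expressway.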

In the table displayed on the standard output, the "Ctg len" column displays the length of the contig, and the "overlap" column displays the total overlap of the clones in the MTP of the given contig. Both lengths are given in FPC units (labeled "CB units" in the report header).

Gaps in the MTP are counted wherever there is a break in the MTP. Pairs with a negative overlap, which can happen with BSS pairs, are not counted as gaps, and neither are any uncovered segments of the contig beyond the ends of the MTP. Thus "# of gaps" displays the number of such breaks in the MTP, and "gap length" is the total length of those gaps in FPC units. Gaps should usually be 0 if fingerprint data is used, because there is almost always at least one viable path through the contig; however, if only BSS data is used, or if the parameters are set very stringently, then gaps may appear. We emphasize that these gaps are relative to the FPC contig; typically, additional gaps will be found when sequencing is performed, because the FPC contig embodies only partial information about the underlying sequence.

Finally, "%covered" displays the fraction of the contig between the ends of the MTP on that contig (excluding gaps).

Note that all of the lengths given in the MTP report to the standard output are based on the numbers of bands and the average band size in the FPC project, and are therefore only approximations of the actual overlaps, gaps, etc.
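
As a worked illustration of this approximation, the sketch below converts band counts to base pairs. The average band size of 4096 bp is an assumption inferred from this tutorial's sample output, where the printed clone lengths (122880 and 167936) are exact multiples of 4096; it is not a documented FPC value, and your project's average band size will differ.

```python
# Assumption: inferred from the sample output in this tutorial, not from FPC docs.
AVG_BAND_SIZE = 4096


def bands_to_bp(bands, avg_band_size=AVG_BAND_SIZE):
    """Approximate a length in base pairs from a band count."""
    return bands * avg_band_size


# Clone lengths printed for the sample fingerprint pair (Left and Right clones).
sample_lengths = [122880, 167936]

# Every printed length divides evenly by the assumed band size.
all_exact = all(length % AVG_BAND_SIZE == 0 for length in sample_lengths)
implied_bands = [length // AVG_BAND_SIZE for length in sample_lengths]

print(all_exact, implied_bands)
print(bands_to_bp(12))
```

Under the same assumption, 12 bands come to 49152 bp, which matches the "olap 49152" figure printed for a pair whose Spanner column shows 12 shared bands. Whether FPC derives that figure in exactly this way is not stated here.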

Note: If you change the pair parameters, you will need to first rerun the 'Find overlapping pairs', and then rerun the 'Pick MTP clones' function.

4. Viewing MTP clones in the contig display

The MTP clones are viewed in the contig display the same way as the pairs, using the 'Next' and 'Previous' buttons beside 'Step through MTP'. Sometimes a complete path cannot be found through a contig. The contiguous paths are called "expressways". Look at the information in the standard output to see how the clones make up the expressways. For example,

Clones (1, 2) in expressway of 14:

Fingerprint pair:
L-flank     Left        Spanner     Right       R-flank
z2598       z2602       z2610       z2628       z2629
            176128                  126976                  (length)
8           28          7           6           18          (shared)
-           0           0           0           -           (mismatch)
tells us that we are looking at the 1st and 2nd clones in a contiguous path of 14 clones. Click on the 'All' button to see all picked clones highlighted in blue. Look at the text output to see where junctions occur.

Whenever a path or expressway does not span the entire contig, a new expressway must start near the last clone of the previous expressway. The process of choosing a clone to begin the new expressway can only be based on overlaps of clones on the CB map, and should therefore be verified by a person. FPC will display messages to the standard output signifying such regions of "weak overlap" whenever they occur in the MTP (see the section on picking an MTP using BSS pairs, below, for an example).
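
One way to picture expressways is as maximal runs of MTP clones in which each consecutive pair is a verified overlap. The Python sketch below illustrates that idea only; the clone names and the `verified_pairs` set are hypothetical, and this is not how FPC stores or computes expressways internally.

```python
def split_into_expressways(ordered_clones, verified_pairs):
    """Group left-to-right MTP clones into runs joined by verified overlaps."""
    expressways = []
    for clone in ordered_clones:
        if expressways and (expressways[-1][-1], clone) in verified_pairs:
            expressways[-1].append(clone)  # extends the current expressway
        else:
            expressways.append([clone])    # junction: start a new expressway
    return expressways


# Hypothetical MTP for one contig, ordered by map position.
mtp = ["c1", "c2", "c3", "c4", "c5"]

# Hypothetical verified overlaps; (c3, c4) is missing, so a junction falls there.
verified_pairs = {("c1", "c2"), ("c2", "c3"), ("c4", "c5")}

expressways = split_into_expressways(mtp, verified_pairs)
print(expressways)
```

The boundary between two runs is the kind of "weak overlap" region that should be checked by a person.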

C. Adding overlap data from draft sequence alignments to BES

A much more accurate estimate of the overlap of two clones can be obtained if a BES from each clone hits a particular draft sequence contig. If your species has both draft sequence and BES, it is recommended to use this data in addition to the fingerprint overlap data.

To use the draft information, you must first use the BSS tool to make the BSS file of alignments. This is very easy (see the BSS documentation). For this demo, a BSS file "Dseq.Dbes.bss" has been produced by performing a BSS search with query Seq/DSeq.seq and database BES/DBes.bes. The query file DSeq.seq simulates Whole Genome Shotgun Sequence contigs, and the database DBes.bes contains actual BAC End Sequences. MegaBLAST was used as the search engine, with an E-value cutoff of 1e-100.

NB: The MTP function was developed with very short draft sequences in mind, for example the result of a 1x sequence survey project. Beginning in FPC V9.1, it is also possible to use long reference sequences, such as sequenced chromosomes from a closely related species. If your project does involve small survey sequences, you may wish to change the Multiple Contig Ratio parameter back to its former default value of 3. This parameter is on the Advanced Settings dialog.

Finding overlapping clone pairs using fingerprints and BSS: To incorporate the BSS alignments, turn on the 'Use BSS results' option. Click on the 'Load...' button, and select the "Dseq.Dbes.bss" on the right. Click 'OK' (or double-click "Dseq.Dbes.bss"). Click on the 'Find overlapping pairs' button, and say yes to removing the old pairs. This starts the process of finding overlapping pairs based on the sequence comparison results between draft genomic sequence and BAC End Sequences. The following is printed to the standard output:


********** Find overlapping pairs ***********
Hit rejections:
  0     singleton
  2564  min ID and min score
  24327 only min ID
  2778  only min score
  0     not in best contig
  0     seqCtg hit too many ctgs (0 seqCtgs)

Total good hits: 10213

Pair rejections:
  13808 same orientation
  0     too much sequence overlap
  6845  below minimum fpc overlap
  453   above maximum fpc overlap
  34124 different contigs

Total good pairs: 2197

write /home/will/demo/mtpdemo/BSS_results/mtp_pairs.bss
Find Fingerprints Pairs
Clone pairs for ctg1 (clones 314)...3958 pairs
Clone pairs for ctg2 (clones 221)...2951 pairs
Clone pairs for ctg3 (clones 92)...1042 pairs
....
Clone pairs for ctg40 (clones 5)...1 pairs
Clone pairs for ctg41 (clones 2)...0 pairs
Contigs with zero pairs 2
Identified 34669 fingerprint pairs

// All contigs  Min FPC overlap 0  Max FPC overlap 20
// Use Fingerprints: Min Shared Bands 6  
// Use BSS: Score 400  Identity 97  File /home/will/demo/mtpdemo/BSS_results/Dseq.Dbes.bss
// Advanced:  Max Seq overlap 50000  Mult contig ratio 0  Allow neg overlaps  Allow mult BES hits
********** Finish overlapping pairs ***********

You can step through these pairs in the same way as with the fingerprint-based pairs. Only the pair will be highlighted, as spanners and flankers are not used for this process.
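
The first three "Hit rejections" categories in the output above can be read as a two-threshold filter. The sketch below is a hedged Python illustration using the Score 400 and Identity 97 values printed in the parameter summary. The hit values are invented, and whether FPC treats a hit exactly at a threshold as passing is an assumption.

```python
from collections import Counter

MIN_SCORE = 400     # "Score 400" from the parameter summary
MIN_IDENTITY = 97   # "Identity 97" from the parameter summary


def rejection_reason(score, identity, min_score=MIN_SCORE, min_identity=MIN_IDENTITY):
    """Return the rejection category for a hit, or None if it passes both tests.

    Assumption: a hit exactly at a threshold is accepted.
    """
    low_identity = identity < min_identity
    low_score = score < min_score
    if low_identity and low_score:
        return "min ID and min score"
    if low_identity:
        return "only min ID"
    if low_score:
        return "only min score"
    return None


# Hypothetical (score, percent identity) values for four BES hits.
hits = [(450, 98), (350, 98), (450, 95), (300, 90)]

reasons = [rejection_reason(score, identity) for score, identity in hits]
tally = Counter(reason for reason in reasons if reason is not None)
good_hits = reasons.count(None)

print(tally)
print("good hits:", good_hits)
```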

If you want to view the BSS alignment data for just those overlaps used for the MTP, select BSS from the Main Menu. You will see a file called mtp_pairs.bss, which was written during the MTP "find overlapping pairs" process; these are the BSS results that were selected for input to the MTP algorithm.

Picking MTP clones with fingerprints and BES overlaps: To see the effect of the more-precise overlap data, click again on the 'Pick MTP clones' button, to create a new MTP including the new data.

The MTP data is again printed to the console, as follows:


************ Starting PickMTP ************
Building graphs completed.         
Finding MTP completed.         
Average MTP clone size: 142157
Contig totals: (in CB units)
Contig    Ctg len   # MTP  overlap   # of gaps   gap length   %covered
-------   -------   -----  -------   ---------   ----------   --------
ctg1          493      14       41           0            0        95%
ctg2          303       8       20           0            0        97%
ctg3          159       4       10           0            0        94%
ctg4          137       4        9           0            0        97%
ctg5          185       5        9           0            0        98%
ctg6          538      17       74           0            0        98%
ctg7          298      10       21           0            0        94%
ctg8          872      27       82           0            0        97%
ctg9          257       7       29           0            0        91%
ctg10         100       3       12           0            0        97%
..
ctg40          37       1        0           0            0        64%
ctg41          41       1        0           0            0        70%

Clone overlap (base pairs):
Positive:
        0- 10000- 20000- 30000- 40000- 50000- 60000- 70000- 80000-
     9999  19999  29999  39999  49999  59999  69999  79999  89999 
       48      4     18     16     13      3      7      3      3 
   Total positive overlap: 2890786
   Average positive overlap: 25137
   Number of positive clone overlaps: 115

Negative:
     0-9999        10000-19999     
        34              1               
   Total negative overlap: 44125
   Average negative overlap: 1260
   Number of negative clone overlaps (spanned by draft): 35

Clones picked:191
BSS pairs: 87 (56%)
Fingerprint pairs: 63 (41%)
Single MTP clones: 3 
Mandatory clone pairs: 0 (0%)
Expressway junctions: 0 (0%)

Number of clones in MTP: 191
Number of mandatory clones: 0

Total gap span: 0 kb
Total MTP span: 24274 kb
Percent of map covered: 93%

// All contigs  Prefer large    Mandatory: 
************ Finished PickMTP ************

To look at just the overall statistics:

   Total positive overlap: 2890786
   Average positive overlap: 25137
   Number of positive clone overlaps: 115
Comparing with the previous run, we see that total overlaps have dropped from 5.2 Mb to 2.9 Mb, a reduction of 44%. The average positive overlap between clones has been reduced from 35kb to 25kb, a major improvement. These basepair figures are also more accurate now, since some overlaps are known exactly from the draft alignments, while before they all were estimated from the fingerprint band overlaps.
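
The figures in this comparison can be checked directly from the two reports. The short Python sketch below only redoes the arithmetic from the totals printed above.

```python
# Totals and pair counts copied from the two PickMTP reports in this tutorial.
before_total, before_pairs = 5177344, 149   # fingerprints only
after_total, after_pairs = 2890786, 115     # fingerprints plus BSS

# Relative drop in total positive overlap.
reduction_pct = 100 * (before_total - after_total) / before_total

# Average positive overlap per pair, truncated to whole base pairs.
before_avg = before_total // before_pairs
after_avg = after_total // after_pairs

print(f"reduction: {reduction_pct:.1f}%")
print("average overlap before:", before_avg)
print("average overlap after:", after_avg)
```

The truncated averages agree with the "Average positive overlap" lines in the two reports.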

Notice also that the printout now has a "negative overlap" section which was not present before:


   Total negative overlap: 44125
   Average negative overlap: 1260
   Number of negative clone overlaps (spanned by draft): 35
Negative overlaps arise because the draft sequence may overlap the ends of two clones that do not themselves overlap. Since these clones must be very close together, you may want them to be used in the MTP. If you do not, set 'Only positive overlaps' to on in the Advanced Settings dialog.

If you now step through the MTP pairs as before, you will see the draft-based pairs distinguished from the fingerprint pairs, e.g.:


Clones (1, 2) in expressway of 2:

BSS-draft based pair:
Left        Right       Seq Olap    FPC Olap
z2603       z2627       2824        0
180224      106496                               (length)

D. Saving MTP results

You may wish to save the results of running MTP, for which purpose FPC provides two options. The first option, "Set MTP clone status to TILE", will change the status of all clones in the MTP to TILE. This option may be useful to indicate in FPC those clones that are part of the MTP. They should be highlighted in red on the contig display. If they are not, pull down in white space on the contig display and select 'Edit track properties'; make a filter of Status=Tile, and set Color to blue; see Ctgdemo for more information.

Having done this, your contig displays will show the MTP clones, as well as clone remarks with prefix "MTP:" indicating the expressway start and stop locations and the overlap between each clone and the previous clone in its expressway:

The second option, Save for the "File of MTP clones", will produce a text file of the clones in the MTP. The clones in this file are grouped by contig, and each clone name is listed together with the clone length and the (estimated) overlap of each pair. This option may be useful when you need a list of the MTP clones outside of FPC.

You may also Save the "File of Pairs" and later load them using the "Or use existing Pairs File:" option in STEP 1.

E. Mandatory clones

If some clones have already been sequenced or are in the pipeline for sequencing, these clones should be included in the MTP; hence they are called "Mandatory clones" in STEP 2. Select the "Mandatory clone" button and a menu will appear that lists: Tile, Sent, Ready, Shotgun, Finished, SD. As shown in this tutorial, you can manually set a clone to one of these statuses. The status of Tile implies the clone has been selected for sequencing but not sent. The status of SD represents a Simulated Digest clone created from the sequence; these can be created with our FSD/ESD package. All the other statuses can be used as desired by your laboratory. Generally, the original clone will have a sequencing status other than SD. You do not want to have both the original and the SD clone in the MTP, as that would be redundant. We provide the choice of selecting zero or more types to be included.

By default, all of the statuses are unchecked, and you should check those for which you have clones which you want included in the MTP.

Important: If your intention is to pick an entirely new MTP, make sure that you are not including mandatory clones. The best way to do this is to unselect all the options in the Mandatory Clone dialog. You can verify that no clones were mandatory by saving the MTP to a file (as just described) and checking that no mandatory clones are indicated in the file.

F. Split BSS Contigs

In the BSS, there is an option to "Split BSS output by contig". This can be useful when draft sequence is being blasted against the BESs, because there may be a tremendous amount of output, and you may wish to study it per contig without loading a very large BSS result file. If you have generated a split-contig BSS output, then for the MTP you can either load just a single contig file, or you can enter the directory that contains the contig files, in which case all contigs will be loaded.

If you want to try this, first bring up the BSS window and make a split-contig version of the Dseq.Dbes.bss result file we have been working with:

  • For Query, select the Seq directory (click Browse, double click Seq, select OK).
  • For Database, select the BES directory (click Browse, double click BES, select OK).
  • Enter the subdirectory 'test'. Select "Split BSS output by contig".
  • Select 'Start search'.
  • You will see the word 'test' in the BSS results window when it is done.
Then in the MTP window, select Load for the BSS file, select test and then Dseq.Dbes.bss (this is a directory). Run STEP 1 and STEP 2 as before.

G. HICF Contigs

If you are using HICF fingerprints instead of Agarose:
  • Select Configure on the main menu.
  • Select HICF.
  • Save .fpc
You will note that there are different default values for HICF. Also, the option to "Use Sizes" is gone, as it is not relevant for HICF. Everything else is the same.


Email Comments To: fpc@agcol.arizona.edu

 

 

 

Last Modified Wednesday October 29, 2008 13:14 PM and 45 seconds