Table of contents - Feb 2008


Overview

If you have not read through the FPC tutorial and Contig display tutorial, you should do so, as that will answer many of your questions. They are both available from www.agcol.arizona.edu.

To see the Main Menu Help, select Help with the right mouse button and choose from the pull-down menu.

All windows use the same conventions. They are:
Click once on a box or any bold string to highlight it. Click a second time to bring up the window associated with the box/string. For example, next to the word 'Project' is the name of the project. Click it once to highlight it; click it again and the project window is displayed. Note that the timing of the double click does not matter, i.e. the second click can be at any time after the box or text is highlighted. The one exception to this rule is on the BSS windows, where a fast double-click is required to view results.

On any window, click in white space (i.e. anywhere there is not a button or text) with the right mouse button, and a pull-down of options will appear; continue to hold the mouse button down and drag the cursor over the desired option.
NOTE: Most windows have a 'close' button. And for all windows, on the pull-down in white space, there is a 'close'. Use one of these to close a window (do NOT use the Unix-supplied close on upper left of the window).

A button with a '...' indicates you use the right mouse button to see a set of functions. In some cases, you can click with the left button for the default option or a window of options (e.g. select Help with the right button and a pull-down appears. Select it with the left button, and this help appears.)

FPC definitions

Buried clones

Buried clones: Say clone 1 has N bands and clone 2 has M bands (N less than or equal to M). If all N bands of clone 1 are exactly the same as N of the M bands of clone 2, then clone 1 can be 'buried' in clone 2. Clone 1 is often referred to as the child, and clone 2 as the parent. This is an 'exact' bury. If no more than a user-defined percentage (default 0.1) of the bands do not match, the clone will be buried as 'approximate'.

Clones are buried automatically by the assembly routines (Main Analysis -- Build Contigs and Incremental Build, Contig Analysis -- Compute CBmaps). This can also be done by the user from the Contig Analysis -- Semi-auto edits. The automatic routines sometimes bury clones as 'pseudo'; this is when the child does not qualify as an exact or approximate. See the documentation on these routines to understand the situations in which this happens.

Buried clones are not displayed on the contig display unless requested. All children and parents have suffixes:
A '*' after a clone name indicates it is a parent clone.
A '=' after a clone name indicates it is an exact match to its parent clone.
A '~' after a clone name indicates it is an approximate match to its parent clone.
A '+' after a clone name indicates it is a pseudo parent clone.
A '#' after a clone name indicates it is a pseudo match to its parent clone.
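
The exact/approximate rule above can be sketched in a few lines of Python. This is only an illustration, not FPC's code. It assumes bands match only when they are equal (FPC compares bands within the Tolerance parameter), and it assumes the bury value is a fraction of the child's bands. The names classify_bury and SUFFIX are hypothetical.

```python
from collections import Counter

# Suffixes shown after clone names on the contig display (from the list above).
SUFFIX = {"parent": "*", "exact": "=", "approximate": "~",
          "pseudo parent": "+", "pseudo": "#"}


def classify_bury(child_bands, parent_bands, bury=0.1):
    """Return 'exact', 'approximate', or None for a child/parent candidate.

    Assumptions (not taken from FPC): bands match only when equal, and
    `bury` is the fraction of the child's bands allowed to be unmatched.
    """
    if len(child_bands) > len(parent_bands):
        return None  # the child may not have more bands than the parent
    remaining = Counter(parent_bands)
    unmatched = 0
    for band in child_bands:
        if remaining[band] > 0:
            remaining[band] -= 1  # each parent band is used at most once
        else:
            unmatched += 1
    if unmatched == 0:
        return "exact"
    if unmatched <= bury * len(child_bands):
        return "approximate"
    return None
```

A pair that returns None here could still be buried as 'pseudo' by the automatic routines, which use criteria not described in this section.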

CB Units

The metric in FPC is the CB unit. The assembly algorithm builds an approximate restriction map, which it calls a consensus band (CB) map. The clones are aligned to it for their position. This is also approximate alignment; note that since there is error and uncertainty in the data, a perfect alignment is never possible. The length of the clone in the Contig display is equal to the number of its bands. The length of the contig is equal to the approximate number of consensus bands.

Q clones

When aligning the bands of a given clone to the CBmap, FPC keeps track of the number of bands that do not align, and the number of gaps in the alignment. If approximately 50% of the bands do not align, it is called a Q (questionable) clone. If there are many Q clones in a contig, this often indicates that there is a false positive join.

Anchors (frameworks & placements)

A framework file has markers that have been located on a chromosome or linkage group. Each marker has a chr/lg assignment, and a global position that is relative to the start of the chr/lg. Frameworks are also often called anchors. They are generally from a genetic map or radiation hybrid map. A framework can have type F or P, where the F stands for framework and is well ordered. The P stands for placement, and is not well-ordered but is known to be located between the two flanking frameworks. Note the ambiguity -- the set is often referred to as the framework markers even though it consists of both frameworks and placements. The set is also referred to as anchors. For the markers to be shown in FPC, they need to be located on at least one clone in FPC. As of V7, the anchors have both a chromosome (or linkage group) assignment and an anchor position relative to the chromosome.

Other Documentation

All other documentation is at:
www.agcol.arizona.edu/software/fpc

1. Demo - get demo.tar and download FPC Tutorial. The Word document takes the user through a demonstration of FPC using the demo.tar files. Also, under the /files directory in this demo are examples of all possible files that can be loaded by an FPC function under File.... There is a README explaining them, and the format is easily discernible from looking at the appropriate file.

2. Contig demo - get ctgdemo and select Contig Display Tutorial. This is a web based tutorial that takes you through all the options of the Contig Display. It is very versatile and very easy to learn to use. You can add tracks, delete tracks, filter what is shown, and color entities based on type of substring.

3. BSS demo - get bssdemo and select BSS Tutorial. This is a web-based tutorial that takes you through using the BSS, a tool which allows you to blast your sequence against all the sequence associated with the FPC map (BESs and the sequence of simulated digest clones).

4. MTP demo - get mtpdemo and select MTP Tutorial. This is a web based tutorial that takes you through selecting an MTP (minimal tiling path) of clones with FPC.

5. HICF demo - get hicf.tar and select HICF Tutorial. This is a web based tutorial that illustrates the differences between using FPC with Agarose versus HICF.

6. Automerge demo - get auto.tar and select Automerge Tutorial. This is a web based tutorial that illustrates the automatic contig merge function of FPC.

There is an FPC V4 Manual that has not been updated since release V4. All changes since then are listed in the Release Notes. These two are probably not necessary.

See the reference section of www.agcol.arizona.edu/software/fpc for a list of papers published on FPC.

Email fpc@agcol.arizona.edu with your questions and suggestions.

Main Menu Functions

Double click the Project name, and a list of the contigs along with the number of clones, etc is shown. This is referred to as the Project window.

Search

There are three classes of search: contigs, clones and markers. Click on one of these buttons to make it the current one. Click a second time and it will bring up a window referred to as the 'keyset', which contains all the entities in the class. When a class is the current one, you may search for a subset of the entities. Click on Search commands with the right button, and you will see a drop-down list with the most common of the search options. Alternatively, select with the left button and a complete menu of search options will appear. Contigs do not have any search types except Name; see the Search menu on the Project window for contig searches.

A search is always performed on what is in the keyset. If there is no keyset, all entities in the highlighted class are searched. For example, if you search on substring (see Name below), and it displays all clones that start with an 'a', you can then search this keyset for all clones that were added after a given date, say 1/1/01. The resulting keyset will be all clones starting with 'a' that were added after 1/1/01.

Name: For contigs, enter a contig number. For clones or markers, enter a name. In any case, you may see all the entities with a given substring, as follows: A '*' at the beginning or end of the string says to show all entities with that substring in the name, e.g. if the clone class is highlighted, and 'a*' is entered, all clones starting with an 'a' are shown. Note that a convention in genomics has been established as follows: if there is no match, the software will automatically put a '*' at the beginning and end of the string, and search again.
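
The '*' rule and the automatic retry described above can be sketched as follows. This is an illustration only: the function name name_search is hypothetical, and case-sensitive matching is an assumption.

```python
def name_search(names, pattern):
    """Return the names matching a pattern with an optional '*' at either end."""

    def matches(name, pat):
        lead, trail = pat.startswith("*"), pat.endswith("*")
        core = pat.strip("*")
        if lead and trail:
            return core in name           # '*abc*': substring anywhere
        if trail:
            return name.startswith(core)  # 'abc*': name starts with abc
        if lead:
            return name.endswith(core)    # '*abc': name ends with abc
        return name == core               # no wildcard: whole name

    hits = [n for n in names if matches(n, pattern)]
    if not hits:
        # No match: retry with '*' added at both ends, as described above.
        hits = [n for n in names if matches(n, "*" + pattern + "*")]
    return hits
```

Because a search runs on the current keyset, narrowing corresponds to feeding one result into the next search, e.g. name_search(name_search(clones, 'a*'), '*1').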

Date: For markers and clones, you can search for the last modified date, or for the creation date. NOTE: the date must be entered in European format, e.g. January 2, 2000 is entered 2/1/00. That is, the day comes before the month.
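
For scripts that prepare dates for this field, the day-before-month order can be checked with Python's standard library. The helper name parse_fpc_date is hypothetical, and the exact set of formats FPC accepts is not stated here; the sketch only covers the day/month/two-digit-year form used in the example above.

```python
from datetime import datetime


def parse_fpc_date(text):
    """Parse a day/month/two-digit-year string such as '2/1/00'."""
    return datetime.strptime(text, "%d/%m/%y").date()
```

With this helper, '2/1/00' is read as January 2, 2000, not February 1.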

Starting the search: After entering a string in the text box for a search, either hit carriage return or select the class name.

Clear: clears what is in the text box.

Reset: resets the keyset to all entities. Closing the keyset window also resets the keyset to all entities.

File....

The markers and frameworks are entered from external files from this drop-down menu. See the Help under Ctg->Chr for more information about frameworks. Marker, clone and contig remarks can be added from an external file. Example files are provided with the demo.tar in the /files subdirectory, along with a README explaining what is what.

External files are also written from this menu. For example, you can save an fpc file under a different name with 'Save FPC as..'.

The project can be locked, so that if you are editing the file, no one else can edit it at the same time. Others can still browse the FPC file, but if they try to write, FPC will tell them "Locked by User X", where X is the user who has the lock on the file.

The only interesting function on Check .cor is "Clean .cor": if you have removed clones from your fpc file, this will remove the corresponding bands from the .cor file. Though FPC will work without cleaning the .cor file, cleaning saves space if you have removed many clones.
The other occasionally useful function: if you enter a clone name in the text box, it will print the clone's bands to the terminal.

Clean Up

Closes all open windows.

Contig Display Help

See the Main Menu Help/Overview for definitions of buried clones, CB units, and anchors.

Tracks in FPC

The contig display in FPC is a track-based viewer. Filters can be applied on a per-track basis, and tracks can be added, removed, resized, and rearranged as desired. There are currently five types of tracks:

Markers track:

Shows any markers attached to clones in the contig. A marker is usually attached to several clones, reflecting hybridizations to clones performed in the lab or via BSS.

Clones track:

This is the most important FPC track, showing the ordering of clones based on their fingerprints. Note: the ordering is approximate, and can easily be off by a few CB units.

Remarks track:

Remarks can be added to markers and clones. These remarks can be viewed in a remarks track. Several remarks can be attached to a single clone or marker, but a single remark cannot be attached to several clones or markers. Remarks are centered under the clone or marker they are attached to. Remarks are added to clones or markers via the Clone Edit or Marker Edit windows. You can also add a file of remarks by pulling down on 'File...' from the Main Menu, and selecting 'Merge clone remarks' or 'Merge marker remarks' (the demo.tar has an example file).

Anchors track:

This track shows the ordering of anchors attached to clones in the contig. Anchors are special markers that have a chromosome position assigned to them. Note: the positions of the anchors in the display reflect the positions of the corresponding markers in the contig, not the genetically mapped positions along the chromosome. A unique feature of this track type is that all data is displayed within the horizontal window boundaries; no horizontal scrolling occurs. The shaded part of the track represents areas not currently visible. Clicking on an anchor centers the display on the marker and highlights it.

Sequence track:

Displays sequences anchored to clones, primarily by alignments to BES. Sequence contigs are shown as small rectangles along the sequence line. Clicking on the sequence once highlights it and all of the clones which it hits, and causes all of the sequence contigs which have hits to turn red. Clicking again brings up the information window for that sequence. Right-clicking brings up a menu with a choice to toggle the "Cycle" feature on and off. If this setting is on, then additional clicks cycle through the sequence contigs, turning them red and showing the clones hit by each in turn.
The "edit track properties" box for a sequence track contains a filter for reversed sequences, i.e., sequences whose contigs appear to be in the reverse order relative to the FPC contig. They may be identified for further analysis.

Scrollbars

There are three types of scrollbars used for navigating through contigs:

Horizontal contig scrollbar:

This is the horizontal scrollbar along the bottom; it is used to horizontally scroll all tracks simultaneously.

Vertical contig scrollbar:

If there are more tracks than can be shown in the window, this scrollbar will be visible along the right side of the window, and will scroll through the tracks.

Vertical track scrollbar:

This is the vertical scrollbar along the left of every track. If not all items are shown in the viewable area of the track, use this scrollbar to see the other items.

Ruler

The ruler above the horizontal contig scrollbar measures the coordinates of the contig in CB (Consensus Bands) units. Each unit corresponds to one restriction fragment, which when using a typical agarose restriction enzyme such as EcoRI, corresponds to approximately 4096 bases. There are two position tick marks on the ruler. These can be dragged around when they are clicked, and are used to define a region to perform operations on (See Select Clones from the Edit menu).
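
The figure of 4096 is 4^6, the expected spacing of a 6-base recognition site such as EcoRI's when the four bases are equally likely. A rough conversion from ruler coordinates to bases can be sketched as below; the names are hypothetical and the result is only an estimate, since real fragment sizes vary widely.

```python
BASES_PER_CB_UNIT = 4 ** 6  # 4096: expected spacing of a 6-base recognition site


def cb_to_bases(cb_units, bases_per_unit=BASES_PER_CB_UNIT):
    """Estimate the length in bases of a span measured in CB units."""
    return cb_units * bases_per_unit
```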

Resizing a track

The viewable area of a track can be changed by moving the mouse over the bottom edge of the track. When the cursor changes, click the mouse button and drag it up/down to decrease/increase the viewable area of the track. The track is resized as soon as the mouse button is released.

Editing a track

There are many ways to define what items are shown in a track, and how they are shown. This information is edited by right-clicking on an empty part of the track and selecting `Edit track properties' from the pulldown menu that appears.

Removing a track

A track can be removed by right-clicking on an empty part of the track and selecting `Remove track' from the pulldown menu that appears. This removes the track from view; however, no data is deleted.

Zoom

Zooming is used to change the horizontal scale of all tracks. Click `Whole' to select the zoom that fits all items in the current window.

Show buried clones

See the Main Help/Overview for a description of buried clones. This function has 2 states:

Yes:

All clones, including exact, approximate and pseudo matches are shown.

No:

Clones that have exact, approximate, or pseudo matches to other clones in the contig are hidden from view.

Search

Performs a substring search on all items in all tracks, and highlights them if there is a match. The contig is centered over the matches found.

CB Unit Range

If there are many clones in a contig, the display has to be partitioned into sections; otherwise the display is too slow (or worse, you run out of memory). If arrows are shown beside the text box, it means your display has been sectioned. Only the frameworks contained in the section are shown. Only clones and markers within the CB unit range are shown. You can scroll around and zoom as usual. To move to the next section, use the arrows or change the CB coordinates. You can change the CB coordinates within the text boxes; but beware, if you try to show too much, you may run out of memory.

If a contig is over 3000 CB units, it is automatically sectioned; to change the default to something other than 3000, go to the Configure menu on the main menu. Using the CB units section is also useful for printing a specific range of clones; see 'Print' for details.
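
As an illustration of sectioning, the sketch below splits a contig into consecutive ranges of at most 3000 CB units. It assumes sections start at 0 and do not overlap; how FPC actually chooses section boundaries is not described here, and the function name cb_sections is hypothetical.

```python
def cb_sections(contig_length, max_section=3000):
    """Split the CB range 0..contig_length into consecutive sections."""
    if contig_length <= max_section:
        return [(0, contig_length)]  # short contigs are shown whole
    return [(start, min(start + max_section, contig_length))
            for start in range(0, contig_length, max_section)]
```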

Contig remarks

The text box below the Zoom section shows the contig remarks. Two types of contig remarks are shown:

Chr Remark:

Gives information regarding the anchoring of the contig to a chromosome. This remark is typically created by the `Ctg->Chr' function off the Main Menu.

User Remark:

Any comments the user wishes to make regarding the contig. Contig remarks can be edited via the Contig/Edit/Contig Remarks function, or from the Project Window by clicking on the contig to highlight it, and then right-clicking and choosing Edit Highlighted Contig.

Menu items

This is a brief overview of the items available from the menu bar at the top of the contig display.

File menu

Print...

Opens a dialog for either printing to a printer, or saving to a file. Exactly what is printed/saved depends on several options:

Print Area: If 'Region' is selected, everything between the position tick marks on the Ruler will be printed. If 'Entire Display' is selected, everything between the 'CB units section' coordinates will be printed. This is usually the entire contig.
Note, for the latter case, it may print more than what is on the screen, i.e. it will print everything in the scrollable region. In this case, the anchors will be off to the left on the printout. You can do the following: remove the anchor track, add a marker track to the bottom, and filter everything but the anchors.

Orientation: Determines the orientation of the contig on the printed page.

Page size: Determines the dimensions of the printed page. Everything is sized so that it fits on one page of the selected page size.

Close

Closes the contig display.

Edit menu

Perform editing functions on the current contig.

Contig remarks...

Edit the contig remarks (Chromosome remark, User remark, Trace remark). You can also change the status of the contig or manually assign it a chromosome and position.

Select clones...

Select clones based on various criteria, and perform operations on the selected set. A region is the area between the little blue and black arrows found in the ruler at the bottom of the display; operations can be performed on the clones selected in a region, or the whole contig, or individually selected clones.

Edit clone coordinates...

Change clone coordinates on selected/highlighted clones.

Merge contigs...

Merge two contigs. The contig you are merging with will be placed to the right of the current contig. Flip contigs as necessary to join the correct ends.

Analysis menu

Routines used to analyze the current contig.

Evaluate...

Perform various analysis functions on clones in the contig.

Compute CBmaps...

Calculate new CBmaps for the current contig, or a selected clone set.

Semi-auto edits...

More analysis functions, plus actions for burying clones, etc.

BSS...

Brings up the BSS (Blast Some Sequence) window. This is also available from the Main Menu.

Highlight menu

These items are used to manipulate how clones are highlighted.

Clear all

Clears all highlights.

Trail

If this toggle is on, entities will remain highlighted as you click from one to the other. This is a handy way to select a group of clones when used in conjunction with 'Select colored'.

Select colored

Selects all highlighted clones (clones with a background color).

Select keyset

Selects all clones in the current keyset.

Show Additions

Clones within contigs that have been added by the IBC (Incremental Build Contig) or Keyset->FPC with Auto on, will be shown in blue.

Add track menu

These items are used to add additional tracks to the display. Each will create a track of the given type (marker, clone, remark or anchor) and append it to the bottom of the current display. This means you can have more than one track of a given type.

Layout menu

These items are used to save the currently defined layout of tracks. These settings are written to a file called [project].fpp whenever the project is saved (Main Menu/Save .fpc). One can apply layouts in three ways:

Store layout for this contig

Saves the layout for the current contig only.

Store layout for all contigs

Saves the layout so that it is applied to all contigs, overwriting any previously defined layouts.

Store layout as default

Saves the layout so that it is applied to all contigs that do not have a layout defined for them, and any new contigs.

Size Options

The contig display takes up most of the screen. You can reduce the size by selecting "No contig stats", which removes the right part of the screen, or "Tracks Only", which removes everything but the top menu and the contig display to the right of the Contig Stats. You can do almost everything as with a full screen, with the main omission being the zoom, which is also available from this pull-down.

Contig Display Legend

Special marks

A '*' after a clone name indicates it is a parent clone.
A '=' after a clone name indicates it is an exact match to its parent clone.
A '~' after a clone name indicates it is an approximate match to its parent clone.
A '+' after a clone name indicates it is a pseudo parent clone.
A '#' after a clone name indicates it is a pseudo match to its parent clone.

Pseudo buried clones refer to clones that are buried even though they do not satisfy the criteria of being an exact or approximate bury.

Entity highlight colors

The highlight color refers to the background of an entity, not the color of the text, which is set via the Edit Track Properties.

General

These highlight colors apply to all entities (clones, markers, and remarks).
Cyan: The current entity
Green: Friend of current entity
Dark Green: Parent of hidden friend of current entity

Clone specific

These highlight colors only apply to clone tracks.
Mid Blue: Selected clone

Clones highlighted via Analysis/Evaluate...:

Pale Blue: Clone1 or Clone2
Red: Run CtgCheck, snCtgCheck, NoOlap and the bad clone(s) are shown in red.
Light Red: Using the Step function, one clone is shown in Red, and the corresponding clone is shown in Light Red, as follows: Best match but not nearest (CtgCheck), below cutoff but no overlap (NoOlap)
For BadOlap, all pairs are shown in light red (these are pairs where the number of shared bands and the amount of overlap differ by over [15], where the 15 can be changed).
Purple: Match by prob and share marker (Ctg CpM, Ctg Markers)
Violet: Match by prob only (Ctg CpM), match by marker only (Ctg Markers), and for matches (Ctg-->Ends, Sel-->Ends)

Clones highlighted via Analysis/Compute CBmaps...:

Purple: the clone that joined neighboring CBmap (CBmap Okall). There could be more than one that qualified to join the CBmaps, only the best is shown. If the clone is buried, its parent will be highlighted in light purple.

Clones highlighted via Analysis/Semi-auto edits...:

Blue: For new additions (Show Additions)
Purple: For possible buries (Find Canonicals), for matches (Compare Keyset)
Pale Blue: Possible parent clone (Find Canonicals)

Marker specific

These highlight colors only apply to marker tracks.

Markers highlighted via Analysis/Evaluate...:

Light Red: Marker Split, MultCtg, OneClone

Edit Track Properties Help

Explanation of filters:

Filters determine which items are shown and/or their color. Filters are applied as soon as the `Apply' or `Apply & Close' button is clicked. Clicking `Close' leaves the track unchanged.
Use the `Store contig layout' function on the contig window to save any changes to the track properties. If you want the change permanent, "Save .fpc".
The available filter criteria vary depending on the type of track. Track-specific information is given below.

Colors:

The Color column determines the color of the items meeting all the filter criteria. `Invisible' is a special color; it hides all items meeting the criteria.

Filter precedence:

Many filters can be used simultaneously. The order from top to bottom determines the precedence of the filters (the lowest on the list has the highest precedence -- picture the markers dropping through the filters from the top). The order can be manipulated by dragging/dropping the filters in the list. For example, to only show markers of type STS, use one filter and set the 'Type' field to 'STS' and the 'Color' field to 'Black'. To show all markers, but color the STS markers Blue, use two filters: the first should have all text fields blank or set to '*', all option menus set to '[Any]', and the color menu set to 'Black'. The second filter should have the 'Type' field set to 'STS' and the 'Color' field set to 'Blue'.
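
The precedence rule can be pictured with a short sketch in which the last matching filter in the list decides the color. This is an illustration only: resolve_color is a hypothetical name, criteria are simplified to exact field comparisons, and returning None for an item that matches no filter (treated as not shown, as in the STS-only example) is an assumption.

```python
def resolve_color(item, filters):
    """Return the color for an item, or None if no filter matches.

    `filters` is ordered top to bottom; a later (lower) matching filter
    overrides an earlier one.
    """
    color = None
    for criteria, filter_color in filters:
        # Blank, '*' and '[Any]' criteria match everything.
        if all(value in ("", "*", "[Any]") or item.get(field) == value
               for field, value in criteria.items()):
            color = filter_color
    return color
```

With the two filters of the example, [({'type': '[Any]'}, 'Black'), ({'type': 'STS'}, 'Blue')], an STS marker resolves to Blue and every other marker to Black.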

Adding a filter:

1) Click on the row containing the text `[Select to add filter]'.
2) Modify the criteria fields in the `Selected Filter Edit' box as desired. Remember: all criteria must be met for the filter to take effect. An empty text box or a `*' in the text box, or the `[Any]' option in a pulldown menu will match everything.
3) Click `Apply' for changes to take effect.

Modifying a filter:

1) Select the filter you wish to modify by clicking on its row.
2) Modify the criteria fields in the `Selected Filter Edit' box as desired.
3) Click `Apply' for changes to take effect.

Removing a filter:

1) Select the filter you wish to remove by clicking on its row.
2) Click on the `Remove Filter' button in the `Selected Filter Edit' box.
3) Click `Apply' for changes to take effect.

Changing track position:

To change the track position with respect to the other tracks in the contig, use the spin button labeled `Track position' in the top right corner. You can change the position by entering a number directly, or by using the up/down arrows. Changes are applied once an `Apply' button is clicked.

Row policy:

Three methods of arranging data in the track are possible:

Automatic:

Computes the minimum number of rows necessary to show everything without writing items on top of one another. The track's vertical scrollbar is used to scroll through the data if the items do not all fit in the viewable area.

Fit data to viewable area:

Forces all items to be drawn in the currently viewable part of the track. This effectively disables the track's vertical scrollbar. This policy may "squash" items, or it may spread them out more than necessary. However, all items will always be visible in the track's viewable area.

Limit number of rows to:

Items will always be arranged using the selected number of rows, regardless of the viewable area, or the number of rows really needed.
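
The 'Automatic' policy amounts to packing intervals into the fewest rows. One standard way to do this is a greedy pass over the items sorted by left coordinate, sketched below. This is not FPC's code: assign_rows is a hypothetical name, and the sketch ignores the width of the text labels, which a real display must also account for.

```python
import heapq


def assign_rows(items):
    """Place (name, left, right) items in rows so that none overlap.

    Returns ({name: row}, number_of_rows). Items are handled in order of
    their left coordinate and reuse the row that became free earliest.
    """
    rows = {}
    in_use = []  # heap of (right_end, row) for rows already holding an item
    n_rows = 0
    for name, left, right in sorted(items, key=lambda it: it[1]):
        if in_use and in_use[0][0] < left:
            _, row = heapq.heappop(in_use)  # that row's last item has ended
        else:
            row = n_rows                    # every row is occupied: add one
            n_rows += 1
        rows[name] = row
        heapq.heappush(in_use, (right, row))
    return rows, n_rows
```

The 'Limit number of rows to' policy would instead fix the row count in advance, so items may be drawn on top of one another when more rows are really needed.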

Marker track filter criteria

These filters apply to all marker tracks.

Marker Substring:

Filters based on the name of the marker. All markers containing the search string in their name will match. `*' wildcards are allowed at the beginning and end of the search string.

Remark Substring:

Filters based on any remarks attached to the marker. All markers with a remark containing the search string will match. `*' wildcards are allowed at the beginning and end of the search string.

Type:

Filters based on the marker type. Currently available marker types are: BAC, cDNA, Clone, Cosmid, eBAC, eMrk, End, Fosmid, Locus, Overgo, PAC, PCR, Probe, Repeat, RFLP, SNP, SSR, STS, TC, and YAC.

Attachment:

Filters based on how clones are attached to the marker. Currently available is One Clone (for markers that have only one clone attached in this contig) and Mult Ctg (for markers that have clone attachments in several contigs.)

Clone track filter criteria

These filters apply to all clone tracks.

Clone Substring:

Filters based on the clone name. All clones containing the search string in their name will match. `*' wildcards are allowed at the beginning and end of the search string.

Remark Substring:

Filters based on any remarks attached to the clone. All clones with a remark containing the search string will match. `*' wildcards are allowed at the beginning and end of the search string.

Type:

Filters based on the clone type. There are two variations on type. The first refers to the way the clone was produced in the lab (BAC, PAC, etc.). The second refers to how the clone matches with other clones in the contig (Parent, Pseudo Parent, etc.). Some of the latter types may apply only when 'Show buried clones' is set to `Yes'.

Status:

Filters based on the sequencing status of the clone.

Remark track filter criteria

These filters apply to all remark tracks.

Remark Substring:

Filters based on the remark name. All remarks containing the search string will match. `*' wildcards are allowed at the beginning and end of the search string.

Type:

Filters based on the type of remark. Currently available remark types are: Clone Remark, Clone Fp Remark, and Marker Remark. Note that the marker remarks are mixed in with the Clone remarks by default, but you can Add Track of type Remark and have one Remark Track show only marker remarks and have the other Remark Track show the Clone Remarks. Don't forget to "Store contig layout" and Save .fpc if you want this to be a permanent change.

Anchor track filter criteria

These filters apply to all anchor tracks.

Anchor Substring:

Filters based on the anchor name. All anchors containing the search string will match. `*' wildcards are allowed at the beginning and end of the search string.

Type:

Filters based on the type of anchor. Currently available types are Anchor and Placement. See help in Ctg-Chr window for an explanation.

Chromosome assignment:

Filters based on how the chromosome assignment of the anchor compares to the chromosome assignment of the contig.

Parameters

Tolerance

- how different two bands can be to be called the same. Used for the comparison of clones.

Cutoff

- when two clones are compared, their number of shared bands is counted, and the probability that the shared bands are a coincidence is computed. If the coincidence score is less than this value, the two clones are said to overlap. E.g. 1e-15 is a more stringent cutoff than 1e-12.
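
This help text does not give the formula for the coincidence score. The sketch below uses a binomial-tail form often associated with the Sulston score, and that form is an assumption here, as are the gel_length parameter and the default values shown. It is meant only to show how the score relates to the Cutoff; FPC's own computation may differ. The names coincidence_score and clones_overlap are hypothetical.

```python
from math import comb


def coincidence_score(shared, n_low, n_high, tolerance=7, gel_length=3300):
    """Probability that at least `shared` of n_low bands match by chance.

    Assumed form (not given in this help text):
        b = 2 * tolerance / gel_length   chance that two given bands match
        p = (1 - b) ** n_high            chance a band matches none of n_high
        score = sum over m = shared..n_low of
                C(n_low, m) * (1 - p)**m * p**(n_low - m)
    """
    b = 2 * tolerance / gel_length
    p = (1 - b) ** n_high
    return sum(comb(n_low, m) * (1 - p) ** m * p ** (n_low - m)
               for m in range(shared, n_low + 1))


def clones_overlap(score, cutoff=1e-12):
    """Two clones are said to overlap when the score is below the cutoff."""
    return score < cutoff
```

A pair scoring 1e-13 overlaps at a cutoff of 1e-12 but not at 1e-15, which is the sense in which 1e-15 is the more stringent cutoff.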

Bury

- one clone may be buried in another if only N percent of bands are different, where N is the value of Bury. This reduces the number of clones on the initial contig display. See the Main Menu Help Overview for more information on Buried clones.

Precompute

- setting this to 'on' makes the computation of the coincidence score faster if there will be many comparisons (e.g. a complete Build) and the clones have more than 60 bands on average. It also uses more memory. This can only be set from the Main Analysis window, though it works for all functions that compute coincidence scores.

Use CpM

- two clones may overlap if their coincidence score is below the cutoff, AND they have shared markers. See the CpM table for the cutoffs.

Log

- Results of various searches will be written to a file whose name is the project name followed by the userid. This was used when there was a lot of manual editing for reordering clones. It has fallen into some disuse, so we have not been keeping it entirely current. Please email fpc@agcol.arizona.edu if you would like it updated for some use.

Stdout

- Results will be written to the terminal.

Contig/Analysis/Evaluate

Clone 1

enter a clone name by either (1) typing it in, (2) clicking on a clone in the contig display, then clicking the "Clone 1" box, or (3) drag and drop from the terminal window. The clone will be used in the following five functions.

Eval

compares Clone 1 with all the clones that it overlaps based on contig coordinates.

FPC

compares Clone 1 against all clones in FPC and prints the ones that it overlaps based on the cutoff.

Ctg Prob

compares Clone 1 against all clones in the contig and prints the ones that it overlaps based on the cutoff (and CpM table if used).

Ctg Markers

compares Clone 1 against all clones in contig and prints the ones that share one or more markers with it.

Clone 2

enter a clone name, as specified for Clone 1. Click "--> Clone 2" and the two clones will be compared.

Ctg->Ends

compares the end clones to the ends of all other contigs, where the end clones are the ones that are within N CB units from the end of the contig, where N is specified in "FromEnds".

Sel->Ends

the same as the above function, but the selected set is compared to the ends of all other contigs.

CtgCheck

This function finds the clones where the clone immediately to the right is not the one with the best Sulston score.
Step: will step through the clone pairs that failed the above check.

NoOlap

This will show all pairs that have a good overlap based on the Sulston score, but do not have overlapping contig coordinates.
Step: will step through the clone pairs that failed the above check.

BadOlap

For each pair of clones that overlap based on the Sulston score, this function determines the number of shared bands and the number of overlapping bands based on the contig coordinates. If ABS(shared bands - overlapping coordinates) is greater than the value in the box (default 15), the clone pair is marked as bad. Note that it will be rare for the number of shared bands and the overlap to agree, as (1) not all shared bands are really the same, and (2) the ordering is approximate. But, these values should not be wildly different.

snCtgCheck

Locates clones whose two nearest neighbors in the CB map are not their best and second-best overlapping clones. Not ordinarily used.

Marker Split

highlights the markers within the contig that hit clones that do not overlap.

MultCtg

highlights the markers within the contig that hit other contigs besides the current one (note, these markers can be made invisible or colored from the marker track filter).

OneClone

highlights the markers that only hit one clone (note, these markers can be made invisible or colored from the marker track filter).

Contig/Analysis/Compute CBmap

Calc

This function will compute the CBmap on the current contig. This is the exact same function as used by the Build Contigs from the Main Analysis window. If you first "Unbury All", and the parameters (Tolerance, Cutoff, Bury~) are the same as they were for the Build, you will get the same solution. The solution is shown in a window that shows the actual consensus band map and the alignment of the clones to the map. If the CBmap has many o's (gaps) and extras (row of numbers below the clone name), you may want to change the Cutoff and re-Calc (do not change the Tolerance). If there is a better solution with the re-Calc, you can instantiate it with the OK.

Fp Order

Construct a CB map using the order of the clones in the FP window, or if no FP window is active, using the order of the clones within the contig. Useful for manual ordering of clones, but this is not recommended.

Gel Order

Same as Fp Order but uses the clone order within the Gel window.

Selected

Builds the CBmap of the selected clones.

Build

This brings up an empty CB window. As you pick a clone on the Contig window, it will be added to the CBmap. Note that this is not doing a global analysis so the alignment may become very sub-optimal, but it allows you to see the fingerprint alignment for a few clones.

Unbury All

This only exists here so you can first unbury all clones before running Calc, which runs the CB algorithm. The algorithm generally will perform better if all clones are initially unburied.

/Compute CBmap/CBmap display

Along the left edge is a counter of the consensus bands, and next to this are the consensus band values. A tick marker by a consensus band delineates a partially ordered group.
The clone names are across the top using 4 rows. Beneath the clone names are the number of extra bands for each clone, where an extra band is one that does not align to the CBmap. Beneath the extra bands is a vertical column of {+, o, x} where a '+' indicates the clone contains the band, a 'o' indicates the clone does not contain the band, and a 'x' indicates it contains the band using twice the given tolerance.
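
The {+, o, x} column can be pictured with the following sketch. It assumes a band "aligns" when its value is within the tolerance of the consensus band value; the function name and the use of <= are assumptions, not taken from FPC.

```python
def band_symbol(clone_bands, consensus_band, tolerance):
    """Return '+', 'x' or 'o' for one clone against one consensus band."""
    # Distance from the consensus band to the closest band of the clone.
    nearest = min((abs(b - consensus_band) for b in clone_bands), default=None)
    if nearest is None:
        return "o"              # clone has no bands at all
    if nearest <= tolerance:
        return "+"              # clone contains the band
    if nearest <= 2 * tolerance:
        return "x"              # contains it only at twice the tolerance
    return "o"                  # clone does not contain the band
```

With a tolerance of 7, a clone band of 1000 gives '+' against a consensus band of 1003, 'x' against 1012, and 'o' against 1020.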

Whole

The entire consensus band map is shown.

In

Zoom in on the consensus band map.

Out

Zoom out on the consensus band map.

Find High

If there are multiple CBmaps, you can highlight a clone in the contig then select this button, and it will find the highlighted clone.

Extra

This lists the values of the extra bands (probably not very useful).

Band

This lists the consensus band values (probably not very useful).

??...

Show Qs: highlights in purple all the Q clones.
Show Friends: highlights in purple all the clones that overlap the blue clone based on the Sulston score.
Clear All: clears all highlighted.
Show ?: shows all the extra bands that can be placed somewhere on the map.

Select

This is useful if there are multiple CBmaps. By clicking this, the clones in the current CBmap will be selected on the contig display.

Align

obsolete.

Next

If there are multiple CBmaps, this shows the next one.

Last

If there are multiple CBmaps, this shows the last one.

Again

Calculates a new solution for the current CBmap.

Left...

Left: the clones are ordered according to the left coordinate in the CBmap.
Contig: the clones are ordered according to the left coordinate in the contig.
Build: the clones are ordered according to how they were added to the CBmap by the algorithm.

OK

Bring up the menu to instantiate the clones in the contig, that is, they take on the coordinates computed by the current set of CBmaps (see CB units in the Main Menu Overview Help).

/Compute CBmap/CBmap display/OK

When a contig's CBmap is calculated, and if it is done at the original cutoff, then it will only create one CBmap, and none of these parameters apply. You can just click "OK All" if you want to use the new coordinates. That is, all clones will get new contig coordinates based on the alignment to the CBmap. A clone's coordinates are computed by selecting the midpoint of its position on the CBmap, then adding -(number of bands)/2 for the left coordinate and +(number of bands)/2 for the right coordinate. Hence, the length of every clone is equal to its number of bands.
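
The coordinate rule can be written as a small sketch. The function name is hypothetical, and any rounding that FPC applies to odd band counts is not shown.

```python
def clone_coordinates(midpoint, n_bands):
    """Left and right contig coordinates from a clone's CBmap midpoint."""
    half = n_bands / 2
    # The clone spans exactly n_bands CB units, centred on its midpoint.
    return midpoint - half, midpoint + half
```

For example, a clone with 20 bands whose midpoint is at 50 gets the coordinates 40 and 60.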

When a contig's CBmap is calculated, and if it is done at a more stringent cutoff than the original contig was created at, it may break up into multiple CBmaps and some clones may become Singles, that is, they no longer match with any other clone. It will try to join CBmaps based on their end clones overlapping at a less stringent cutoff. The following parameters instruct the software on how to handle the Singles and multiple CBmaps.

Bury Singles as Pseudo

If you want to keep these in the contig, have them buried as Pseudo. They cannot be Exact or Approximate as that would require a good overlap with their parent. Hence, the best overlapping clone is found for the single clone, and it is buried as a Pseudo.

Move Singles to Ctg0

Move these clones to Ctg0 (Singletons).

Distance FromEnd to qualify as end clone

This specifies how far from the end of the CBmap the end of the clone needs to be to qualify as an end clone.

Cutoff for matching end clones

If two end clones from two different CBmaps overlap based on this cutoff, the CBmaps may be joined (the algorithm checks all possible joins and uses the best).

Overlap between adjacent CBmaps

If two CBmaps overlap based on their end clones, they will be placed in the same contig with the end clones overlapping. This parameter tells the software how many CB units to have them overlap.

Move unconnected to new contigs

After doing the end joining, there may still be disconnected CBmaps. If this flag is not on, then all CBmaps will be put in the same contig with a line between disconnected contigs. If this is on, disconnected CBmaps will be moved to new contigs. For example, say there are 5 CBmaps and they can be joined as (1,3), (2,5) and 4. The CBmaps (1,3) will remain in the current contig, CBmaps (2,5) will be moved to a new contig, and CBmap 4 will be moved to yet another new contig.
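
The example with 5 CBmaps can be reproduced with the sketch below. It only illustrates the grouping; how FPC chooses which group stays in the current contig and which contig numbers it assigns are assumptions here, as are all names.

```python
def group_cbmaps(n_maps, joins):
    """Group CBmaps 1..n_maps into connected sets given pairwise joins."""
    parent = list(range(n_maps + 1))

    def find(x):
        # Follow parent links to the representative of x's group.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in joins:
        parent[find(b)] = find(a)

    groups = {}
    for m in range(1, n_maps + 1):
        groups.setdefault(find(m), []).append(m)
    return sorted(groups.values())


def assign_contigs(groups, current_ctg, next_free_ctg, move_unconnected=True):
    """Map contig numbers to lists of CBmaps (assumed numbering scheme)."""
    if not move_unconnected:
        # Flag off: everything stays in the current contig.
        return {current_ctg: [m for g in groups for m in g]}
    result = {current_ctg: groups[0]}
    for offset, group in enumerate(groups[1:]):
        result[next_free_ctg + offset] = group
    return result
```

group_cbmaps(5, [(1, 3), (2, 5)]) gives the groups (1,3), (2,5) and 4. With the flag on, the first group stays in the current contig and the other two are given new contig numbers.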

OK All

Performs the end joining and places the clones in the current or new contigs.

Reject

Do not do anything and close the window.

Report to Stdout

Reports the analysis of joining CBmaps (this is probably useless except for debugging).

Starting coordinate

You will only see this parameter if you have executed Fp Order, Gel Order or Selected, and the set of clones is not equal to the number of clones in the contig. This allows you to place a subset of clones within the given contig at a starting coordinate of your choice.

Contig/Analysis/Semi-auto edits

SelectColored

All highlighted clones will have their state set to selected. An example of when this is useful: say you want to make a keyset of all the clones in the contig hit by a given marker. Select the marker and it highlights its clones. Use this function to set all the highlighted to Selected. On the main menu, select Clone, then Search Commands, then Selected from the Search Command menu.

ShowAdditions

Each clone has a field called "Oldctg". After any type of merge (Contig/Edit/Merge, or Main Analysis IBC or Ends-->Ends with Auto on), the OldCtg is set to the value of the contig before the merge. This will highlight the merged clones.

UndoAdditions

This will undo the additions. The merged clones will go back to their original contig based on the OldCtg values. Note that they may not go back to the original contig number. Also, the order will be recalculated when they are moved to another contig.

ClearAll

Same as the Contig/Highlight/Clear all. It clears all highlighting. It is here for convenience.

OldCtg

Enter a contig number, and any clones with that value for the OldCtg field will be highlighted. This is useful on the Ends-Ends when multiple contigs have been merged, you can see exactly where one particular contig landed.

Orphans

Highlights all clones that do not match any clone in the contig below the cutoff.

Remaining functions

These are fairly obsolete as detailed hand editing is rarely done anymore.

Single clones can be unburied or buried. All clones can be unburied. Select a set of clones (select a clone and pull down with the right mouse and drag to Select, or go to the Contig/Edit/Select Clone window), and then select Bury selected to have these clones automatically buried.

Select a clone in the contig and then select Remove from Ctg and it will be removed.

The Find Canonicals finds all clones that may be buried based on the Bury parameter, and can be added with the Next/Last/Bury Purple.

The Compare Keyset compares all the clones in a keyset and can be added with the Next/Last/Add/Add&bury.

Main Analysis/Clone


Enter a clone name in the yellow box. This may be selected from a terminal window, that is, drag and drop works.

Select -->FPC and the clone will be compared against all of the clones in FPC.

Select -->Key and it will be compared against the keyset if one exists. If there is no keyset, then it will check for a clone window and compare against this clone. That is, if you want to compare two clones, put the name of one in the yellow box, and bring up the Clone box for the other (and make sure there is no Keyset).

Main Analysis/Ends->Ends,KeySet->FPC

Auto Merge/Add

When this is on, the result of each search will be automatically performed, as follows:
For Ends-->Ends, the joins are made. The coordinates of the end clones are set to overlap by 10. The CB algorithm, which reassembles all clones, is NOT executed.
For Keyset-->Fpc, it automatically adds each clone to a contig and aligns it with the best hitting clone.

FromEnd

Specifies how close the end of the clone must be to the end of the contig in CB units in order for it to be considered an End clone.
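
A minimal sketch of the FromEnd test follows. It assumes the left end of the clone is measured against the left end of the contig and the right end against the right end; the function name and the L/R/B return values (borrowed from the Ends-->Ends report) are illustrative.

```python
def end_side(clone_left, clone_right, ctg_left, ctg_right, from_end):
    """Return 'L', 'R', 'B' or None for a clone, all values in CB units."""
    near_left = clone_left - ctg_left <= from_end
    near_right = ctg_right - clone_right <= from_end
    if near_left and near_right:
        return "B"      # small contig: the clone is near both ends
    if near_left:
        return "L"
    if near_right:
        return "R"
    return None         # not an End clone
```

With FromEnd set to 50 in a contig running from 0 to 200, a clone at 0-30 is a left End clone, a clone at 160-200 is a right End clone, and a clone at 80-120 is not an End clone.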

Ends-->Ends

Reports pairs of contigs that have clones on the end that match according to the Cutoff or CpM table (if on). The 'Match' parameter sets the minimum number of distinct clones on each contig which must match in order for the contig pair to be reported. Results are reported as R, L, or B, depending whether the clones come from the right, left, or both sides of a contig (the last applies to small contigs). Uses FromEnd and Auto. If Auto is checked, then the merges found will be automatically carried out. The merges are done in order of the cutoff at which they first occur, and each end of each contig is only allowed to be used once. Merges are recorded in contig remarks, and clone Fp remarks with keyword 'End-merge' are added to the strongest-overlapping clone pair at each merge site.
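
The rule that merges are taken in order and each contig end is used only once can be sketched as a greedy selection. This assumes "in order of the cutoff" means best (lowest) score first; the data layout and names are hypothetical.

```python
def choose_merges(candidates):
    """Pick merges greedily.

    candidates: list of (score, (ctg_a, side_a), (ctg_b, side_b)),
    where side is 'L' or 'R'. Lower scores are taken first.
    """
    used = set()
    accepted = []
    for score, end_a, end_b in sorted(candidates, key=lambda c: c[0]):
        if end_a in used or end_b in used:
            continue    # one of the two ends already took part in a merge
        used.update((end_a, end_b))
        accepted.append((end_a, end_b))
    return accepted
```

If the right end of ctg1 matches the left end of ctg3 at 1e-20 and the left end of ctg2 at 1e-14, only the ctg1-ctg3 merge is made, because the right end of ctg1 is then used up.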

In order to view the merged contigs, go to the Contig display/Highlight, and select ShowAdditions. The added clones will be shown in violet. If multiple contigs were merged, enter the Oldctg number for one of them to see how a single contig was merged.

KeySet-->FPC

Reports all matches below the cutoff (or according to CpM table).
If the clone is a singleton and Auto is on, it will be added to the contig for which it has the best score. It is positioned using the mid-point of the clone for which it has the best score.
NB There used to be a Single->Fpc, but that can be done by making a keyset of singletons.

Ends Only: when this is on, it only adds singletons that have a good match to a clone within FromEnd of the end of the contig.

Include Ctg0: by default, clones that are singletons (i.e. in Ctg0) are ignored so that clones will be added to contigs whenever possible, e.g. if clone A hits clone B in ctg0 with a better score than clone C in ctg1, and both scores are below the cutoff, it will not be added to ctg1 if this flag is on.

Main Analysis Build Contigs

Also see the FPC Tutorial for more details.

Build Contigs: best contig of

This parameter is used for Build Contigs and Incremental build. The algorithm to build contigs is called the CB (Consensus Band) algorithm. The CB algorithm is greedy, that is, it can come up with a different answer by starting with a different pair of clones. By default, it tries starting with 10 different pairs, resulting in 10 different solutions. It uses the best of the 10. You can change the 10 to a higher or lower number.

Build Contigs

First it kills all existing contigs as described below, i.e. it uses the Kill parameters. Second, it runs the CB algorithm on all singletons; the CB algorithm builds the contigs.

Kill

All contigs with fewer than N clones will be removed, that is, all of their clones will be made singletons. By default, N=max, where max is the maximum number of clones in a contig, so all contigs will be killed. It is often undesirable to kill contigs that have clones selected for sequencing, i.e., clones with a sequence status other than none. If Kill Seq Ctgs is off, they will not be killed; if it is on, they will be killed. Note: contigs with status of Dead or Avoid are also killed.

Incremental Build

This function takes all the clones added after the Last Build date, and if the Use CpM button is on, all clones that have had new markers added to them since the Last Build date, and does the following:
(1) Compares each clone against all contigs and joins all contigs which it hits. The clone is added to the contig.
(2) Compares each clone with all old singletons. If the new single has been added to a contig and it matches an old single, the old single is added to the contig. Otherwise, a new contig is made of the two singletons.
(3) For all contigs with new clones: If the contig is new, the CB algorithm is run to reorder the clones. For existing contigs: if NoCB is off and the contig status is not NoCB (see below), the CB algorithm is run to reorder the clones; otherwise, the new clones are given the position of their best hit (i.e. the clone for which they have the best Sulston match); and merged contigs are added to the right end and the user must reorder the contigs.
(4) The contig window displays all contigs with added and merged contigs. If log is on, they are also written to the log file.

Comparing Incremental Build with Build Contigs:

If Build Contigs is run on all clones in the database using a given cutoff, and new clones are added using Update .cor, and then the Incremental Build Contigs is executed, the results will be exactly as if the Build Contigs was run on all clones at once. This does not hold true if you change the cutoff or change the contigs between Builds. If you split a contig, it will remain split on the Incremental Build. Likewise if you join contigs they stay joined. The one caveat is that if a new gel is added for an existing clone, the clone is not re-evaluated in the Incremental build.

Contig status:

A contig status is as follows:
+ Ok - run the CB algorithm on all modified contigs.
+ NoCB - On Incremental Build, adds and merges are done, but no reordering.
+ Dead or Avoid - do not touch this contig with either Build.

The contig status can be changed via the project window with the Edit Highlighted Contig (pull down in white space), or from the Contig display window, select "Contig remarks" on the Edit menu.
Alternatively, all contigs with at least one sequenced clone can be set to NoCB as follows: pull down in white space on the Main Analysis window; select "The Attic", then select "Set all sequenced contigs to NoCB".

On the Contig display-Highlight menu, the "Show Additions" shows singletons added in Purple and merged contigs in light purple. Under Analysis/Semi-auto edits, the "UndoAdditions" undoes all IBC changes to the contig.

CpM table:

For either Build, the CpM table will be used if Use CpM is on. For IBC, the CpM values for the Last build are saved in the fpc file. To change the values, change the Current CpM table, execute Incremental Build, and answer No to the first question and Yes to the second question.

Main Analysis ReBuild Contigs

The DQer

Look on the Project window (double click the project name on the Main window). There is a column labelled Qs. When the clones in a contig are ordered by the CB algorithm, it tries to order the bands. If 50 percent of a clone's bands cannot be ordered, it is called a Q clone. If there are many Q clones in a contig, it is likely to be chimeric.

if >=N Q's: The DQer reanalyzes all contigs that have over N Qs, where N=5 by default. The threshold can also be set to a percentage of the number of clones in the contig. To use it this way, enter the number as a percentage, e.g. "15%".
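
The two ways of entering the threshold can be sketched as follows. The function name is hypothetical, and how FPC rounds a percentage-based threshold is not known here, so the value is left unrounded.

```python
def q_threshold(spec, n_clones):
    """Turn the entered value ('5' or '15%') into a number of Q clones."""
    spec = spec.strip()
    if spec.endswith("%"):
        # Percentage of the number of clones in the contig.
        return float(spec[:-1]) * n_clones / 100.0
    return float(spec)
```

For a contig of 200 clones, "15%" corresponds to a threshold of 30 Q clones, while "5" is a threshold of 5 regardless of contig size.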

Step N: It reruns the CB algorithm up to 3 times, using successively more stringent cutoffs until there are less than N Qs. If after 3 tries, a contig still has more than N Qs, it is not further analyzed. The successive cutoffs are determined by the step value, which defaults to 1. For example, if the cutoff is 1e-12 and the step value is 3, the DQer will reanalyze the Q contigs at 1e-15. If that does not reduce the Qs, it tries 1e-18, then 1e-21. If a contig splits into multiple CBmaps at the more stringent cutoff, each CBmap is moved to a new contig. Note: for agarose based FPCs, a stepsize of 1 is typical; for hicf data, a larger step size is often necessary.
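
The sequence of cutoffs in the example above can be generated as in the sketch below. Cutoffs are handled as exponents (12 stands for 1e-12) to avoid floating point noise; count_qs_at is a hypothetical callback standing in for a rerun of the CB algorithm, and all names are assumptions.

```python
def dqer_exponents(start_exponent, step, tries=3):
    """Exponents of the successively more stringent cutoffs."""
    return [start_exponent + step * i for i in range(1, tries + 1)]


def run_dqer(start_exponent, step, max_qs, count_qs_at, tries=3):
    """Return the first exponent giving fewer than max_qs Qs, else None."""
    for exponent in dqer_exponents(start_exponent, step, tries):
        if count_qs_at(exponent) < max_qs:
            return exponent
    return None     # still too many Qs after all tries; contig is left alone
```

dqer_exponents(12, 3) gives 15, 18 and 21, i.e. the cutoffs 1e-15, 1e-18 and 1e-21 of the example; with the default step of 1 it gives 13, 14 and 15.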

If a clone becomes a singleton (Ignored) at the more stringent cutoff, it is moved to Ctg0.

No merge CBmaps: the CB algorithm often results in more than one CBmap per contig when assembled at a more stringent cutoff. If this option is not on, it will try to merge CBmaps based on less stringent cutoff for the end clones (this is the same idea as used with the End-->End merge). If this flag is on, all but the first CBmap will be moved to new contigs; you may want to have them all moved, as then you can use the automatic End merger to join (it allows you to specify how many end clones should match).

"DQer Split ctg172 1e-13 Map>1 Qs2." This is an example Result message that could be displayed on the Project window. This says that more than one CBmap was created at the 1e-13. The algorithm compared all clones at the ends of CBmaps at a lower cutoff (in this case, it would be a 1e-11), and joined CBmaps when possible. In the current contig, some were joined, and some were moved to ctg172 (where multiple CBmaps may also have been joined). You will see that the number of Qs is "~ 2", this is because it adds up the number of Qs from the CBmaps it has joined, hence, it is not the number of Qs of the complete assembly of all clones in the contig into one CBmap.
If you look at the output that is written to the terminal and the Results in the project window, it should be clear what has happened.

ReBuild Contigs with Q eq - or Q eq ~

If you look on the project window, contigs that have been manually edited by moving or removing clones have a Q eq '-'. If contigs (CBmaps) have been merged based on end clones, or new clones have been added, these contigs will have a '~' before the number of Qs. If the "Qs eq -" is selected, these contigs will be rebuilt. If the "Qs eq ~" is selected, the approximate Q contigs will be rebuilt.

All such contigs will have the CB algorithm run on them to reorder the clones using the given cutoff (or CpM). If the contig splits into multiple CBmaps, they will remain in the same contig with a line between them. If singletons are created, they are buried as Pseudo clones.

Project Window

Order contigs (top-right button)

You may select with the left mouse button to go to the next ordering option, or pull down with the right mouse button to see all options and select the one you want. The following discusses the different options.

By ctg...

The contigs are ordered by contig number. The date is the last date modified. The status is explained on the Help page for the Edit Contig option; briefly, it is set by the user, and tells the system whether to use the contigs in merges and re-assemblies. A '+' after the status indicates that it was changed during the last execution of the IBC (Incremental Build Contig). The Qs is an indication of how good the assembly is; many Qs often indicate a false join (see Main Analysis). The remark is described below.

By size...

The contigs are ordered according to the number of clones. The rest of the fields are as described in "By ctg".

By length...

The length of the contig is approximately the number of restriction fragments, which is referred to as CB (consensus band) units. For agarose fingerprints, the average band is 4096 bases. The length of the contig is the band size times the length of the contig in CB units. The contigs are listed in order by this length. Note: the average band size can be changed from the Main Window, Configure window.
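
The length calculation is a single multiplication, sketched here with the agarose default of 4096 bases per band. The function name is hypothetical.

```python
def contig_length_bases(cb_units, avg_band_size=4096):
    """Approximate contig length in bases from its length in CB units."""
    return cb_units * avg_band_size
```

A contig that is 250 CB units long is therefore estimated at 250 * 4096 = 1,024,000 bases.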

Counts...

The contigs are listed by number of clones. It lists the number of buried clones in each contig. It then considers that a clone starting with a different letter is a different type; it lists the number of clones for each type. For example, say a contig has clones starting with 'a' or 'b'. It may say "a10 b50" indicating that 10 clones start with 'a' and 50 clones start with 'b'.
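
The per-type count can be sketched as below. The clone names are made up, and the alphabetical ordering of the types is an assumption.

```python
from collections import Counter


def type_counts(clone_names):
    """Summarise clones by first letter, e.g. 'a10 b50'."""
    counts = Counter(name[0] for name in clone_names if name)
    return " ".join(f"{letter}{n}" for letter, n in sorted(counts.items()))
```

For 10 clones starting with 'a' and 50 starting with 'b', this returns "a10 b50".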

Sequence...

This lists the contigs with sequenced clones first. From the Clone Edit window, the user can select a shotgun status of {SD, Tile, Sent, Ready, Shotgun, Finish} and a type of {Auto, Full, Half, and Gap}. Also, the type and status can be set by reading in a file from the File... menu, Replace sequence status option.
The SD clones are from Simulated Digest; we have a program called FSD that will run a simulated digest on a sequenced clone and convert the sizes to migration rates.
The other shotgun statuses are from the Sanger sequencing pipeline. Clones would be selected for sequencing by setting the status to Tile, and then the clones were entered into the sequencing pipeline. The sequence status was updated nightly by reading in a file.
The Auto is a new shotgun type. It is set when clones are selected automatically (see Main Window, MTP). The Full indicates the clone is selected for complete finishing, the Half means it is selected for draft. The Gap indicates it is selected to fill a gap between sequenced clones.

Keyset...

Given a keyset of clones or markers, the contigs will be searched for the items in the keyset. They will then be ordered according to how many items from the keyset they contain.

Position...

This is only relevant if the Ctg->Chr routine has been executed. It is described in the help on the Ctg->Chr window.

Results...

When an analysis is run from the Main Analysis window, the contigs with results are ordered first in this window, along with messages. If you close this window and open it back up, the results will remain for the last function executed.

Low Score

The CB algorithm is greedy, hence, it may not find the best solution. Therefore, it is run N times, and the best solution is taken. The contig is assigned High, Avg and Low scores based on the scores from the N runs. This display sorts them by the low score. Contigs with no Qs (Qs eq -) or approximate Qs (Qs eq ~ N) will be shown first because they do not have a High, Avg or Low score, as the entire set of clones must assemble into one CBmap in order to have these scores.

Frameworks...

The frameworks are listed in order. An 'F' indicates it is a well-ordered framework; no character indicates it is a placement, which is a not-well-ordered framework. Next, the Chr or Lg is listed. The Diff column is the difference in consecutive anchors. The Seq indicates whether a clone has been picked for sequencing from the contig(s) it is on. The position is its contig position; a '*' beside it indicates that a new contig has started. The positions within a contig should ideally be consecutive. Last is each contig number that the marker is found in, followed by the number of clones it hits.

Other orderings

Other orderings: On the Search window, there are other ways to order the contigs. For example, listing all the contigs that have Q clones, or anchors (frameworks) more than N cM apart. You can also have contigs listed first that have a given substring in the user remark, chr remark, or trace remark.

Chr_Remark, User_Remark and Trace_Remarks

The button cycles through these three types of remarks, displaying the one shown. Select the button with the left mouse button to select the next option. Pull down with right mouse button to see all options. For any given type of remark, they can be cleared from the Search window, select Cleanup Menu.

Chr_remark

This will show "Lg_remark" if your framework file specifies linkage groups. For simplicity, we usually just say chromosome remark. Contigs are assigned to chromosomes either by (1) the Ctg->Chr routine from the main window, or (2) manually editing this remark. The following discusses the format (by example):

Chr1 [60 ch4/20 Fw40 Pm15 Seq25]
The contig has 55 anchors, 40 of them are frameworks and 15 of them are placements. The contig has 25 SD (simulated digest) clones that have remarks placing them on a chromosome. From this 40+15+25 pieces of chromosome evidence, 60 were on chr1 and 20 on chr4. The contig was assigned to Chr1.

Chr4 {...1 [60 ch4/20 Fw40 Pm15 Seq25]}
The contig was manually edited to assign it to chr4. The automatic analysis is put into curly brackets, and a '...' replaces the word 'Chr'.

* [ch3/1 Seq1]
A '*' indicates there is one sequenced clone.

- [ch3/1 Pm1]
A '-' indicates there is one placement hitting one clone.

+ [ch3/2 ch3/2 Fw4]
A '+' indicates there was ambiguous evidence.

& [ch2/1 Fw1]
A '&' indicates that the framework hits multiple clones in multiple contigs.

Chr3 {& [ch2/1 Fw1]} indicates the contig has been manually assigned to a chromosome. The automatic assignment is put into curly braces. Any of the above situations may be shown in curly braces.
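
If you need to read these remarks in a script, the automatic form can be taken apart as in the sketch below. It is based only on the examples shown here, not on a formal grammar, and it deliberately returns None for the manually edited form in curly braces. All names are hypothetical.

```python
import re

REMARK = re.compile(r"^(?P<assigned>\S+)\s+\[(?P<body>[^\]]*)\]$")


def parse_chr_remark(remark):
    """Parse e.g. 'Chr1 [60 ch4/20 Fw40 Pm15 Seq25]' into its parts."""
    m = REMARK.match(remark.strip())
    if not m:
        return None
    result = {"assigned": m.group("assigned"), "primary_hits": None,
              "other_hits": [], "evidence": {}}
    for token in m.group("body").split():
        if token.isdigit():
            # Bare number: pieces of evidence on the assigned chromosome.
            result["primary_hits"] = int(token)
        elif "/" in token:
            # 'ch4/20': 20 pieces of evidence on chromosome 4.
            name, count = token.split("/")
            result["other_hits"].append((name, int(count)))
        else:
            # 'Fw40', 'Pm15', 'Seq25': counts per evidence type.
            km = re.fullmatch(r"([A-Za-z]+)(\d+)", token)
            if km:
                result["evidence"][km.group(1)] = int(km.group(2))
    return result
```

For the first example, the evidence counts (40 + 15 + 25) add up to the same total as the chromosome hits (60 + 20).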

User_Remark

The user may enter whatever remark they want. This is done from the Contig Edit window (as described below). The remark will show in the contig display. You may search on user remark from the Search window.

Trace Remark

All routines that merge, add or remove clones write a remark indicating the action. The last action always goes at the beginning. The remark is 80 characters long, so old actions are removed from the end. It can be useful to clear all the trace remarks (select Search, then Cleanup Menu, then Clear all Trace remarks) before doing a new set of edits, as it makes it easier to see what has changed.
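
The behaviour of the trace remark (newest action first, oldest text dropped at 80 characters) can be sketched as follows. The separator between actions and the example action text are assumptions.

```python
def add_trace(existing, action, max_len=80):
    """Prepend the latest action and drop whatever falls off the end."""
    combined = f"{action} {existing}".strip()
    return combined[:max_len]
```

Each new action pushes the older ones to the right, so the oldest actions are the first to be truncated.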

Pull down in white space

Edit Highlighted Contig: Select a contig in the Project window, then select this item and a window will appear in which you may edit the status, the Chromosome remark, the contig position, or the user remark. See the help on that Edit Contig section for more information.

Goto Current Contig: if a contig is displayed, selecting this will jump to the contig entry in the window.

Goto Top: jumps to the top of the window.

Print to file: prints the contents of the window to a file.

Project/Search

Search by Word

Enter a substring into one of the three remark types, and hit Enter. The contigs containing that substring in the respective remark will be displayed first.

Display contigs by:

If you select "Contig Chr Remarks", all contigs that contain a Chr_Remark will be displayed first in the Project Window. Likewise for "Contig User Remarks" and "Contig Trace Remarks".

The "Q contigs" will display all contigs with 1 or more Qs.

The "Anchors > N distance apart" will display all the contigs that have anchors that are N cM (or whatever metric your anchors are in) apart.

The "Contigs with anchors & print" will display all the contigs with anchors, and ask you what file you want this information printed to.

The "Markers from keyset & print" is like the "By keyset" option on the Project window, where it will list all contigs that have a marker in the keyset, but this also lists the markers, and lets you print the results to a file.

Re-number contigs:

The "Move contigs up" causes all contig numbers to be sequential. The next option allows you to swap two contigs (this probably isn't useful, unless you are hand-ordering your contigs).

Misc:

File of pairs of clones to compare

This allows you to input a file, where there are two clone names per row. The two clone names will be compared based on the Sulston score and the result printed. This has been useful for checking re-arrayed clones.

Find clones with mult gels < (or >) cutoff

FPC can handle multiple gels per clone. We regularly test our reproducibility by redoing a plate and re-entering it. The clone names should be the same, but the gel names should be different. This function will compare all clones that have duplicate gels, and print out the ones that are above (or below) the cutoff (using the cutoff on the Analysis window). It also tells the number of good and bad clones based on the cutoff, and shows how many bands are within 0 tol, 1 tol, etc.

Cleanup Menu

This allows you to clear all contig Chr, User or Trace remarks.
It allows you to change all contig statuses to OK.
You can set the value of the Qs for all contigs to '-' (this is useful to ReBuild them all without recomputing what clones are in what contigs).
You can turn off the Auto Update for setting the Chr/Lg or Pos (i.e. so that Ctg->Chr will run on all contigs).
You can clear all clone remarks.
You can clear the value of all Oldctg. This is quite useful to do before making a bunch of new edits, as you will be sure that the "Show Additions" on the Contig/Highlight menu only shows the recent additions.
Whenever you add new markers, it sets all the new ones to have a status of 'New'. This is used by the IBC. When your project is done, you might as well clear all these.
The last two functions are probably obsolete.

Ctg-->Chr

The following provides information on SD clones, framework file, and the three functions on this window. It also covers manually setting the chromosome or linkage group for a contig, and manually setting the position.

SD clones

SD clones are simulated digest clones created from downloads from Genbank. See the ESD/FSD on www.agcol.arizona.edu/software/fpc. FSD expects the chromosome of the clone to be in the Genbank definition line, e.g.

DEFINITION Oryza sativa (japonica cultivar-group) chromosome 10 clone OSJNBb0015J03, complete sequence.
It also parses out the clone name and tries to match it with a name in FPC, e.g. b0015J03 is a substring of OSJNBb0015J03 so it is considered the same. Finally, it parses out the author name and makes a clone remark like:

b0081F12, Chr10 - Wing

This remark is used in 'Process sd clones' and 'Assign Ctg->Chr'.
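
The parsing described above can be sketched roughly as follows. This is only an illustration, not the FSD code: the function names and the regular expressions are assumptions, and the author name is passed in rather than parsed.

```python
import re

def parse_definition(line):
    """Return (chromosome, clone_name) from a Genbank DEFINITION line.

    Either value is None if it cannot be found. The patterns are
    assumptions made for this sketch.
    """
    chrom = re.search(r"\bchromosome\s+(\w+)", line)
    clone = re.search(r"\bclone\s+([\w.\-]+)", line)
    return (chrom.group(1) if chrom else None,
            clone.group(1) if clone else None)

def find_fpc_clone(genbank_name, fpc_names):
    """Return the first FPC clone name that is a substring of the Genbank name."""
    for name in fpc_names:
        if name in genbank_name:
            return name
    return None

def make_remark(fpc_clone, chrom, author):
    """Build a clone remark in the form shown above, e.g. 'b0081F12, Chr10 - Wing'."""
    return f"{fpc_clone}, Chr{chrom} - {author}"

definition = ("DEFINITION Oryza sativa (japonica cultivar-group) "
              "chromosome 10 clone OSJNBb0015J03, complete sequence.")
chrom, clone = parse_definition(definition)
fpc_clone = find_fpc_clone(clone, ["a0001A12", "b0015J03"])
```

Here b0015J03 is found because it is a substring of OSJNBb0015J03, which mirrors the matching rule described above.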

Framework file

See the Main Menu/Help/Overview for a definition of anchors, frameworks, and placement.

When "Replace markers (fw & seq)" is executed on the "File..." menu, a file called "framework" is automatically read. If a file called "framework" does not exist, it looks for a file called merge.fw where the file of markers to be read is named merge.ace (replace the word "merge" with whatever name makes sense). Alternatively, "Replace framework" can be executed on the "File..." menu, and a framework file (with suffix .fw) can be selected. The following gives examples of the 6 possible formats, where they all have a global position for each marker, and F stands for framework (a well-ordered marker) and P stands for Placement (in the right area, but maybe not the exact position):

Map "Demo"
Label "Chromosome"
Abbrev "Chr"
A01 2 0.1 F
A07 1 0.7 P
A09 X 2.7 F

Map "Demo"
Label "Linkage Groups"
Abbrev "Lg"
A01 C 0.1 F
A07 A 0.7 F

Map "Demo"
Label "Chromosome"
Abbrev "Chr"
A01 2.1 0.1 F
A07 Y.1 0.7 P

Map "Demo"
Label "Chromosome"
Abbrev "Chr"
A01 2q1 0.1 F
A07 Xp1 0.7 P

Map "Demo"
Label "Chromosome"
Abbrev "Chr"
A01 Xq1.1 0.1 F
A07 1p1.1 0.7 P
A08 Yp1.1 0.7 P

Map "Demo"
Label "Linkage Groups"
Abbrev "Lg"
A01 C.1 0.1 F
A07 A.1 0.7 F

For chromosomes, the software expects a number or the letters "X" or "Y". For linkage groups, the software expects single letters A, B.... WebFPC V2.2 and WebChrom V2 support these formats also.
A generic option has been added to allow use of chromosome names which do not fit these patterns. To enable this option, add the string "generic_grpnames" at the end of the "Map" line, e.g.
Map "Demo" generic_grpnames
If this is found, then the chromosome names are taken as they are, with no interpretation of the characters. They are sorted field by field, where the field breaks are defined by letter/number changes, or non-alphanumeric characters. The names may be up to 9 characters long.
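
As an illustration of the file layout and of field-by-field sorting, here is a minimal Python sketch. It is not FPC's parser: the function names are hypothetical, and placing numeric fields before letter fields when sorting is an assumption of this sketch.

```python
import re

def parse_framework(text):
    """Parse framework-file text into (header dict, list of marker tuples)."""
    header, markers = {}, []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        fields = line.split()
        if fields[0] in ("Map", "Label", "Abbrev"):
            quoted = re.search(r'"([^"]*)"', line)
            header[fields[0]] = quoted.group(1) if quoted else ""
            if fields[0] == "Map":
                # The generic option is a trailing keyword on the Map line.
                header["generic_grpnames"] = line.endswith("generic_grpnames")
            continue
        # Marker rows: name, chromosome/linkage group, position, F or P.
        name, group, position, kind = fields
        markers.append((name, group, float(position), kind))
    return header, markers

def group_sort_key(group):
    """Split a group name into letter and number fields for sorting.

    Field breaks fall at letter/number changes and at non-alphanumeric
    characters. Numbers sort before letters (an assumption).
    """
    key = []
    for field in re.findall(r"[A-Za-z]+|\d+", group):
        key.append((0, int(field), "") if field.isdigit() else (1, 0, field))
    return key

sample = '''Map "Demo"
Label "Chromosome"
Abbrev "Chr"
A01 2 0.1 F
A07 1 0.7 P
A09 X 2.7 F
'''
header, markers = parse_framework(sample)
ordered = sorted(["2q1", "Xp1", "10", "2.1"], key=group_sort_key)
```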

Process sd clones

For all sd clones:
1. Sets their sequence status to FINISHED.
2. If the original clone is in the FPC database based on the remark as described above, its status is set to TILE.
3. A file called "genbank.webfpc.ref" is created that can be used with WebFPC to link to the genbank entries.

Assign Ctg-->Chr (or Lg group):

This function assigns a contig to a chromosome or linkage group by entering a contig remark. It also assigns a position to each anchored contig, which is used by the "Order Ctgs based on Chr Assignment" function. There are quite a few options; the defaults are what we used for the maize fpc project.

Evidence weight

Each framework marker for a chromosome counts M points, and each placement marker for a chromosome counts P points. If a SD clone contains a chromosome number in its remark, it counts N points. By default, M=2, P=1, N=2. The contig is assigned to the chromosome with the most points. For example, a contig remark:

Chr1 [46 ch2/1 Fw27 Pm1 Seq19]

indicates a total of 47 pieces of evidence, 46 of them assigned to Chr1. There are 26 frameworks, 1 placement and 19 sequence clones on chromosome 1, so the score is (26x2 + 1x1 + 19x2). Chromosome 2 has one piece of evidence which is a framework, so its score is 2.
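
The arithmetic in this example can be checked with a short Python sketch. The function names and the input layout are made up for illustration; the weights are the defaults given above (M=2, P=1, N=2).

```python
def chromosome_score(frameworks, placements, sd_clones, m=2, p=1, n=2):
    """Weighted evidence score for one chromosome."""
    return frameworks * m + placements * p + sd_clones * n

def assign_chromosome(evidence, m=2, p=1, n=2):
    """evidence maps chromosome -> (frameworks, placements, sd_clones).

    Returns the chromosome with the most points and all scores.
    The ignore rules described below are not applied here.
    """
    scores = {chrom: chromosome_score(*counts, m, p, n)
              for chrom, counts in evidence.items()}
    best = max(scores, key=scores.get)
    return best, scores

# Evidence from the remark: Chr1 [46 ch2/1 Fw27 Pm1 Seq19]
evidence = {"1": (26, 1, 19), "2": (1, 0, 0)}
best, scores = assign_chromosome(evidence)
```

With these inputs, chromosome 1 scores 26x2 + 1x1 + 19x2 = 91 and chromosome 2 scores 2.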

If a contig has only one sequenced clone and no anchors that agree, it is not assigned to a chromosome. The contig remark will be, e.g.

* [ ch3/1 Seq1]

If there are multiple sequenced clones from the same chromosome, it is assigned a chromosome, but not a position.

Ignore if one anchor and it:

If a contig only has one piece of evidence and if the corresponding bullet is selected, it will remain unanchored.
* it is a Placement marker
* it has one clone hit
In either case, the corresponding remark will be, e.g.

- [ ch3/1 Pm1]
- [ ch3/1 Fw1]

Note, the number of clone hits is not shown, so in the first case, the placement marker may have hit multiple clones, but it was ignored since it was a placement. In the second case, we know the framework hit only one clone since it had to be the second rule that disqualified it.
If multiple anchors hit one clone and only one clone is hit, this is treated like the one anchor, one clone case.

The third option is:
* it anchors multiple contigs
If a contig only has one anchor, and it has multiple clone hits in at least one other contig, it is not anchored. The remark will look like,

& [ ch3/1]

Ignore if multiple Chr/Lg and either

One of the following two must be selected. But by making the number a 1, you make the option ineffective.

* #score is less than [2]x all other score.
The chromosome that has the greatest score must have a score at least [2]x the cumulative score of all other chromosomes. The [2]x means 2 times, and the number can be replaced within the [].

* #clones is less than [2]x all other clone hits.
If the number of clone hits to the chromosome with the highest score is less than [2]x the number of clone hits to all other chromosomes, a chromosome is not assigned to the contig. For example,

+ [ ch1-1 ch6-1 Fw2]

is the chromosome remark. It is not assigned a chromosome because:

Chr1 total 1 Fw 1 Seq 0 clones 6
Chr6 total 1 Fw 1 Seq 0 clones 5

The '+' indicates this ambiguous situation.
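
Both rules have the same shape, which can be sketched in Python as below. The function name is hypothetical, and the sketch assumes that "all other" means the sum over the other chromosomes.

```python
def fails_ratio_rule(best_value, other_values, factor=2):
    """True if the best chromosome's value is less than factor x the sum
    of the values for all other chromosomes (so no chromosome is assigned).

    The same test serves for scores and for clone hits.
    """
    return best_value < factor * sum(other_values)

# Clone-hit rule on the example above: Chr1 has 6 clone hits, Chr6 has 5.
ambiguous_by_clones = fails_ratio_rule(6, [5])

# Score rule on the earlier Chr1 example: scores 91 versus 2.
ambiguous_by_score = fails_ratio_rule(91, [2])
```

In the first case 6 is less than 2x5, so the contig gets the '+' remark; in the second case 91 is well above 2x2, so the assignment stands.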

Maximum between anchors

First, the algorithm calculates which chromosome a contig belongs on by the above rules. It then evaluates the anchors for that chromosome; if all of them are farther apart than the specified distance, the contig is not assigned. For example,

! [ ch1-2 ch2-3 Fw3]

It would have been assigned to Chr1, except the two anchors were too far apart.

Contig Position:

Basically, it ignores all anchors further than N cM distance apart, and then takes the average of the remaining cluster (where all anchors in a cluster are within N cM distance). If there are multiple clusters, it uses the cluster with framework markers (if the other clusters are placement only). If there are multiple clusters with frameworks, it does not make an assignment. If there is more than one framework and all the frameworks are further than N cM apart, it does not make an assignment.
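
One possible reading of these rules is sketched below in Python. It is not the FPC implementation: the greedy clustering, the function names, and the choice to return no position when several clusters exist and none has a framework are all assumptions of this sketch.

```python
def cluster_anchors(anchors, max_cm):
    """Group (position, kind) anchors so that each cluster spans at most max_cm.

    kind is "F" (framework) or "P" (placement). Clusters are built
    greedily from left to right (an assumption).
    """
    clusters = []
    for pos, kind in sorted(anchors):
        if clusters and pos - clusters[-1][0][0] <= max_cm:
            clusters[-1].append((pos, kind))
        else:
            clusters.append([(pos, kind)])
    return clusters

def contig_position(anchors, max_cm):
    """Return the average position of the chosen cluster, or None."""
    clusters = cluster_anchors(anchors, max_cm)
    if len(clusters) > 1:
        # Prefer clusters that contain at least one framework marker.
        clusters = [c for c in clusters if any(kind == "F" for _, kind in c)]
    if len(clusters) != 1:
        # No anchors, several framework clusters, or none with a framework.
        return None
    cluster = clusters[0]
    return sum(pos for pos, _ in cluster) / len(cluster)

position = contig_position([(10.0, "F"), (12.0, "P"), (80.0, "P")], max_cm=5)
no_position = contig_position([(10.0, "F"), (80.0, "F")], max_cm=5)
```

In the first call the distant placement marker at 80.0 is ignored and the position is the average of 10.0 and 12.0. In the second call the two frameworks are too far apart, so no position is assigned.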

Manually setting the chromosome or contig position

Contig remarks are shown on the contig display, and on many of the project windows. They can be altered by the user, in which case, the edited contig will not be changed by this function. See "Edit Highlight Contig" on the project window (pull-down in white space), or the "Contig Remarks" on the "Edit" menu of the Contig display.

On the Project window, select Show Position: it will show the contig position. A C indicates that the user has requested that the chromosome remark not be changed; a P indicates that the position has been set by the user.

If the Assign Ctg-->Chr function is run after manually editing a chromosome assignment, the function will still calculate what it would assign, and put the results in braces. For example,

Chr2 {...1 [10 chr2/2 Fw12]}

indicates that the user set the chromosome to Chr2, but the automatic function would have set it to Chr1.

Order Ctgs based on Chr assignment

The function reorders the contigs based on chromosome (linkage group) assignment and ctg-chr position assigned by the Ctg->Chr function. Therefore, ctg1 will be the first contig on chromosome 1 (or linkage group A), etc.

Edit Contig Remarks & Status

Status

OK - this is the default.
NoCB - when the IBC (Main Analysis) is run, clones will be added, and contigs will be merged, but the CB algorithm will not be run. This is done if manual editing has changed the order of the clones. Generally, manual editing of the clone coordinates is no longer done, so this is obsolete.
Avoid - This is never killed on the Build Contigs, and is ignored by IBC.
Dead - like Avoid, but it is also not on the Summary statistics.

Chr Remark

The Ctg->Chr routine (Assign Ctg-->Chr) should be run before manually editing any chromosome or positions. The routine will not always get it right, so this allows you to over-write the assignment. Once you set "No Auto update", you can run the Ctg->Chr routine again, and it will not make an assignment. If a contig has no good assignment, just select "No Auto update" and type the word "none". Otherwise, make sure you make a valid assignment (e.g. 1, 1.1, 1p1, A, A.1 are all valid, no other form is valid).
The "Pos:" field allows you to assign a position to the contig, which is used by the "Order Ctgs based on Chr Assignment", which orders the contigs by this position within chromosomes.

User and Trace Remark

The User Remark is whatever you want to write.
The Trace Remark is entered by the analysis routines, but it may be modified by the user. The analysis routines will continue to modify it.

All remarks (Chr, User and Trace) can be cleared for all contigs on the Project Window/Menu/Cleanup Menu window.

Edit Clone Record

Rename or Cancel Clone

Rename Clone renames the clone, but the original name is kept as it is used to index into the .size and .gel file (for agarose clones, the sizes and gels are accessed in the Gel Window). The FpName field is set to the old clone name. For example, say X is the new clone name and Y is the old clone name, and no FpName exists. When you click Accept, Y becomes the FpName and X becomes the new clone name. If a FpName already exists, it will not be replaced (as it is the index into the size and gel file). This will become permanent when you click "Accept Edit" and then "Save .fpc" from the main window. If you click "Reject Edit," no changes will be made anywhere.

Cancel Clone will effectively cancel the clone. In doing this, an exclamation point ("!") will be placed in front of the old clone name (e.g., "!a0001A12"), all markers will be unbound from the clone, all buried clones will be removed from the clone yet will remain in the contig, and the parent will be removed. The exclamation point ("!") will make sure that the clone sticks around in the FPC data file, only with the "!" to show that the clone has been deleted.
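
The naming rules for Rename and Cancel can be summarized with a small Python sketch. The record layout (a dictionary with "name" and "fpname" keys) and the function names are hypothetical; only the naming behavior is taken from the description above.

```python
def rename_clone(record, new_name):
    """Return a renamed copy of a clone record.

    The old name is stored as the FpName only if no FpName exists yet,
    because the FpName is the index into the size and gel files.
    """
    renamed = dict(record)
    if not renamed.get("fpname"):
        renamed["fpname"] = renamed["name"]
    renamed["name"] = new_name
    return renamed

def cancelled_name(name):
    """Return the name of a cancelled clone: the old name with '!' in front."""
    return "!" + name

first = rename_clone({"name": "Y", "fpname": None}, "X")
second = rename_clone(first, "Z")
```

After the second rename, the FpName is still Y, so the size and gel files can still be found.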

General Information

After the label Clone, the clone's name appears.

After the label Contig, the clone's contig number appears with arrows to go from a lower number to a higher number. If a clone is assigned to contig 0, it is effectively not in a contig and is referred to as a singleton. If you wish to assign a clone to a new contig, move the arrow up to the maximum number indicated. (The maximum number is always one more than the last contig used.) If a clone is assigned to a contig (other than 0), two numbers will appear after the clone number. These indicate the beginning and ending bands where the clone is placed in the contig.

If the clone has an FpName, it appears by the label FpName. This will not appear unless the clone has been renamed - see above.

Gel indicates the Gel which the clone was taken from.

Bands is followed by two numbers. The first indicates the offset into the COR file of the clone. The second indicates the number of bands that the clone has.

Remarks and FP Remarks

Remarks are primary remarks and FP Remarks are secondary. FP Remarks are generally not viewed, and meant to be remarks to help in editing FPC.

The Remarks box indicates the remarks that the clone has. If you would like to add a new remark, enter one on a new, separate line. Each line will contain one remark (and note that lines may indeed wrap around, often indicated by an arrow); blank lines will not be recognized as valid Remarks.

FP Remarks indicates the FP remarks that the clone has. This box follows the same rules as the Remarks box, as described above.

Markers and Buried Clones

The Markers box contains the names of the markers assigned to a clone. These can be typed in, one per line. Or you can do the following: you can click "Pick from Contig Display" and open up a contig display window; then, click on a marker to add it to the list of markers. To turn off the click-'n-add feature, click the "Pick from Contig Display" toggle again.

The Buried Clones box contains the names of the clones buried in the clone. This feature works similarly to the Markers box. The "Pick from Contig Display" button allows a clone to be buried in the current clone. Just click a clone from the contig display. The clone must be in the same contig as the clone being edited, and it must not be buried in another clone or be a parent itself.

Parent and Classifications

The Parent box contains the name of the parent of the clone. The parent's name must be a clone that does not itself have a parent.

The classifications boxes contain the Clone Type, the Shotgun Type, and the Status of the clone. If either the Shotgun Type or the Status is changed to "NONE", the other is changed to "NONE" as well. If both are "NONE" and one is changed to another value, the other is changed to the first value in its list.

Accept or Reject Edit

Accept Edit accepts the current clone's entry information.

Reject Edit rejects the current clone's entry information. No changes will be made in FPC's data, despite whatever has been done in the window.

BSS Help

BSS (Blast Some Sequence) is a tool to simplify running sequence searches against sequences that are derived from clones on the FPC map. For a short manual and tutorial of the BSS, see www.agcol.arizona.edu/software/fpc; it is HIGHLY recommended that you take a few minutes to read these.

Query and Database (Target)

These specify the query and target files for the searches. The query may be any fasta file of sequences. The targets must be either BAC-end sequences (BES) or sequenced clones. In both cases, the clone name must be part of the name of the BES or sequence name, specified on its fasta description line.

NOTE: the BSS expects all BESs to have the same length prefix and suffix, e.g. ZMBa000B01.f matches clone a000B01 with a prefix of length 3 and a suffix of length 1. It will still work if this is not the case, but it will execute much more slowly.

For draft sequence, the description line for each sequenced contig must be the sequence name followed by a "." and contig number (e.g. a0003b12.1). (See the Manual for information on SD clones created with the FSD program for agarose projects).
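
The two naming conventions above can be illustrated with the following Python sketch. It is not the BSS code: the function names are hypothetical, and it assumes the "." before the suffix is a separator that is not counted in the suffix length.

```python
def clone_from_bes(bes_name, prefix_len, suffix_len):
    """Recover the clone name from a BES name with fixed-length prefix and suffix.

    Returns None if the name has no '.' or the suffix length does not match.
    """
    stem, sep, suffix = bes_name.rpartition(".")
    if not sep or len(suffix) != suffix_len:
        return None
    return stem[prefix_len:]

def split_draft_name(description):
    """Split a draft contig name such as 'a0003b12.1' into (sequence name, contig number)."""
    name, sep, number = description.rpartition(".")
    if not sep or not number.isdigit():
        return None
    return name, int(number)

clone = clone_from_bes("ZMBa000B01.f", prefix_len=3, suffix_len=1)
draft = split_draft_name("a0003b12.1")
```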

The query and database files may have any suffix. When specifying files, the "*" character is used to specify all files in a given directory. To specify all files with a given suffix, for example ".seq", use "*.seq". A single file name may also be used to specify only one file.

Search tool

Choose BLAST, MegaBLAST, or BLAT. The executable (blastall, megablast, or blat) must be in your path.

Search parameters

The E-value, or Score if BLAT is used, sets a minimum strength for the hits to be returned. The "Other" parameters can be any parameters the search program accepts on its command line. Click "Options.." and the search program will print its options to the terminal window. (Options causing a change in the output format of the search program should not be used, since BSS will be unable to read the output.)

BSS output options

By default, all BSS result files (and directories) are put into a subdirectory of the FPC project directory named "BSS_results". If you want to organize your BSS files further, the "Subdirectory" option allows you to specify a subdirectory of "BSS_results" into which to place the results of a new search.

The option "Split BSS output by contig" may be selected for BLAST or MegaBLAST searches only. Selecting this option causes the hits recorded in the BSS output to be split into separate files according to the contig in which the hits occur. This option is especially useful to avoid computer resource limitations when your BSS search results in very many hits (as well as allowing you to avoid looking at all the hits in a single BSS file). One consequence of selecting this option is that a BSS directory (instead of a single BSS file) will be created for every query file. In each BSS directory, a BSS file is created for each contig in which hits were found.

Start Search

Start the search by clicking the "Start Search" button at the bottom. When the search is complete, the file name (which is the query name followed by the database name) will appear in the BSS result window. Double click it to see the results.

BSS Results

This window lists all BSS results in the "BSS_results" directory. Directories are differentiated from files in the display by a trailing slash character ("/") in their names (and a "File size" field value of "-"). Double-clicking on a file opens a results file, while double-clicking on a directory updates the list with the files and subdirectories in that directory. Note that, after descending into a directory, an entry ending in "../" appears in the list, which allows you to go back up the directory tree.

BSS Results Display

The results window has three panels with various information. You can click a contig for it to be displayed. You can click a hit to see the actual alignment on the terminal window. You can click a column heading to sort a column. The File, Analysis and Columns are described in the BSS manual and tutorial; that is, they are not further described here. But we think they are self-explanatory -- try them!

File

Under the File pull-down at the top of the window are the options to delete all the files in the BSS result window, or delete a selected file in the BSS result window.

BSS Results Help

This window details the hits from one particular BSS results file.

Query Table

This is the table in the upper right. The columns are:
- Sequence - Query Name
- Hits/#Ctgs is the total number of hits and the number of contigs with hits (unless only one contig has hits).
- Best Ctg/#Hits is the contig with the most hits and the number of hits to the contig (unless it's the only contig with hits).
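
The values behind these columns could be computed along the following lines. This is a sketch with a hypothetical input layout (one contig number per hit for a single query), not the BSS code, and it does not reproduce the display formatting.

```python
from collections import Counter

def summarize_query(contigs_hit):
    """Summarize the hits of one query sequence.

    contigs_hit holds one contig number per hit. Returns None when
    there are no hits.
    """
    if not contigs_hit:
        return None
    per_contig = Counter(contigs_hit)
    best_ctg, best_hits = per_contig.most_common(1)[0]
    return {
        "hits": len(contigs_hit),      # total number of hits
        "n_ctgs": len(per_contig),     # number of contigs with hits
        "best_ctg": best_ctg,          # contig with the most hits
        "best_hits": best_hits,        # number of hits to that contig
    }

summary = summarize_query([12, 12, 40, 12, 7])
```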

Contig Table

This is the table in the middle right. The columns are:
- Contig (FPC)
- CloneHits is the number of clone hits for the contig.
When the "Split BSS output by contig" option is used, this will only show the current contig.

Double-clicking on a line brings up the contig.

Hit Table

This is the large table at the bottom. It lists all of the hits found in the search. Double-clicking on a hit causes the alignment to print to the terminal window.

The hits can be sorted by column. Click on any column heading and all hits are ordered by that column. Ctrl-Shift-click on the same column heading and results are ordered in reverse.

Menus

File

Save BSS - if you have filtered the Hit Table (see Analysis) and want to save the results, this will overwrite the current file.

Save BSS as - same as previous but you name the file.

Save as spreadsheet - this will save the Hit Table in a tab delimited file. It saves exactly what is shown, e.g. if you have removed columns, they will be removed from the file.

Analysis

Filter hits: brings up a window of filters that can be applied. You can apply consecutive filters and then undo them consecutively.

View keyset of hit clones: This will bring up a keyset of all clones in the "Clone" column of the hits table. Use as follows: bring up a contig, select Highlight on the contig window, select "Select Keyset" and all the clones with a hit will be highlighted in the contig.

Add hits to FPC: brings up a window that lets you add all Sequences in the Hit Table to FPC as markers or remarks. Note that only the Sequences shown in the Hit Table will be added. Therefore, you can filter out the hits you do not want first, e.g. if you only want to add the sequences that did not hit more than N contigs, use the filter "Max ctg hit".
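
The effect of a filter such as "Max ctg hit" can be pictured with the Python sketch below. The hit layout (sequence name, contig number) and the function name are hypothetical, and the exact behavior of the real filter may differ.

```python
def filter_max_ctg_hit(hits, max_ctgs):
    """Keep only hits of sequences that hit at most max_ctgs distinct contigs.

    hits is a list of (sequence_name, contig_number) pairs.
    """
    contigs_by_seq = {}
    for seq, contig in hits:
        contigs_by_seq.setdefault(seq, set()).add(contig)
    return [(seq, contig) for seq, contig in hits
            if len(contigs_by_seq[seq]) <= max_ctgs]

hits = [("seqA", 3), ("seqA", 3), ("seqB", 3), ("seqB", 8), ("seqB", 21)]
kept = filter_max_ctg_hit(hits, max_ctgs=2)
```

Here seqB hits three contigs and is dropped, so only the seqA hits would be added to FPC.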

Columns

There are a lot of columns on the Hit table. You can select what columns to remove.

Select MTP Help

Selecting a Minimal Tiling Path (MTP) means picking a set of minimally overlapping clones that span an entire contig. The MTP functions automatically select the MTP. The following sections give an overview of the parameters and the 4 steps. For a full discussion on the algorithm, see Engler et al. (2003) Genome Research 13:2152-2163. Also, see the MTP tutorial at www.agcol.arizona.edu/software/fpc and the simulation results at www.agcol.arizona.edu/software/fpc/sim/mtp.

There are two primary choices, depending on what data is available, as follows:

Fingerprint: Two clones that are near each other are analyzed based on their fingerprints. The overlap is evaluated using the shared bands between the two clones and by using a spanning clone and two flanking clones to confirm the shared bands.

BSS (BES-draft) method: BES sequences are aligned to draft sequence contigs using BSS, and potential overlaps are identified when the BES from two clones hit the same sequence contig. This method allows for very precise identification of overlaps, but it does require extra sequence information.

The MTP algorithm can use one or both of these methods. If both are used, precedence is given to the BES-draft overlaps. Regardless of the method, two steps are employed: 1) minimally overlapping clone pairs are identified, and 2) a contiguous path through the contig is assembled from these pairs.
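
The idea behind the BES-draft method, that two clones become a potential overlap when their BES hit the same sequence contig, can be sketched as follows. The input layout and function name are hypothetical, and the real algorithm applies further checks (see the parameters below).

```python
from itertools import combinations

def bes_draft_candidates(bes_hits):
    """Return clone pairs whose BES hit the same sequence contig.

    bes_hits is an iterable of (clone_name, sequence_contig) pairs.
    """
    clones_by_seq = {}
    for clone, seqctg in bes_hits:
        clones_by_seq.setdefault(seqctg, set()).add(clone)
    pairs = set()
    for clones in clones_by_seq.values():
        pairs.update(combinations(sorted(clones), 2))
    return sorted(pairs)

bes_hits = [("a0001A01", "s1.1"), ("a0002B02", "s1.1"), ("a0003C03", "s1.2")]
candidates = bes_draft_candidates(bes_hits)
```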

Execute step 1 and step 2 to compute the MTP.
STEP 1: Find overlapping pairs.
This will be run on fingerprints and/or BSS results, depending on whether you have selected Use fingerprints and/or Use BSS results. For the BSS results, a BSS file (or directory) must be listed in the "BSS File:" text box.
STEP 2: Pick MTP clones.
STEP 3: View results (if desired).
STEP 4: Save results.

STEP 1: Find overlapping pairs

Finds overlapping clone pairs, using the fingerprints, BSS results, or both. Select 'Use fingerprints' and/or 'Use BSS results'. Set the parameters as described for each. Then select the "Find overlapping pairs" button. It will turn gray when the function has finished.

Min FPC Overlap

Minimum overlap of the FPC coordinates in the contig for a candidate pair.

Max FPC Overlap

Maximum overlap of the FPC coordinates in the contig for a candidate pair.

FromEnd

The MTP algorithm starts and finishes with clones at the end of the contig. Since the coordinates are not exact, we identify the 'end clones' as those that have a coordinate within "FromEnd" of the end (e.g. 15 would allow any clone on the left that starts with a coordinate between 0 and 15 to be used as an end clone).
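
The three coordinate parameters above can be illustrated with a short Python sketch. The clone layout (name mapped to left and right coordinates), the function names, and the assumption that a contig starts at coordinate 0 are all choices made for this example, not details of the MTP code.

```python
def fpc_overlap(a, b):
    """Overlap in FPC coordinates of two (left, right) extents; negative means a gap."""
    return min(a[1], b[1]) - max(a[0], b[0])

def is_candidate_pair(a, b, min_overlap, max_overlap):
    """True if the overlap lies between Min FPC Overlap and Max FPC Overlap."""
    return min_overlap <= fpc_overlap(a, b) <= max_overlap

def end_clones(clones, contig_end, from_end):
    """Return (left end clones, right end clones) for a contig.

    clones maps name -> (left, right). The contig is assumed to start at 0.
    """
    left = sorted(name for name, (lo, hi) in clones.items() if lo <= from_end)
    right = sorted(name for name, (lo, hi) in clones.items()
                   if hi >= contig_end - from_end)
    return left, right

clones = {"c1": (0, 40), "c2": (12, 55), "c3": (30, 90), "c4": (70, 100)}
left, right = end_clones(clones, contig_end=100, from_end=15)
pair_ok = is_candidate_pair(clones["c1"], clones["c3"], min_overlap=5, max_overlap=20)
```

With FromEnd set to 15, both c1 and c2 qualify as left end clones because they start between 0 and 15.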

Contig

Under STEP 2, there is an option to pick the MTP clones from just one contig. That setting also applies to the 'find overlapping pairs'.

Use fingerprints

These are used for finding overlaps based on clone fingerprints. To use fingerprints in picking pairs, make sure the 'Use fingerprints' radio button is on. The choice of the "agarose/hicf" configuration parameter (in the Configure window) will set appropriate default values for fingerprint pair parameters.

Min Shared Bands

Minimum number of shared bands verified by the spanner of a candidate pair.

Weight

This number controls the tradeoff between minimum overlap and the risk of false positive overlaps. The default value has been set based on many simulations; hence, you should generally leave this value alone. One reason to increase it is if you do not care if you have large overlaps and do not want false positives.

Use Sizes

If Agarose, then you have a choice as to whether it uses the cumulative sizes for determining overlap, or the number of bands. If you select "Use Sizes", there must be a /Sizes directory with the size files in it.

Use BSS results

These are used for finding overlaps based on a BSS draft to BES sequence comparison. To use BSS results in picking pairs, make sure the 'Use BSS results' radio button is on. Also, make sure the 'BSS File' is a valid BSS file or directory.

BSS File

The file (or directory) of results from a BSS 'Sequence to BES' comparison. This search must be run separately prior to picking MTP clones, via the 'BSS' function from the 'Main Menu'.

Filter BLAST Score

Discard all hits with a BLAST score below this cutoff.

Identity

Discard all hits with a BLAST %id below this cutoff.

Advanced settings

Parameters for fine-tuning the algorithm. See additional help on Advanced settings window.

mtp_pairs.bss

After running the 'Find overlapping pairs', you will see a file called mtp_pairs.bss in the BSS_results directory. These are the pairs that are selected by the 'Find overlapping pairs' function for candidates for the MTP algorithm.

Or use existing Pairs File

You can save your pairs after running the 'Find overlapping pairs' function (see STEP 4, "File of Pairs"). You may load the file of pairs which will then be used for the 'Pick MTP clones'.

STEP 2: Pick MTP clones

Pick the minimally overlapping path of clones based on all overlapping pairs. If there isn't a contiguous path through a contig, the contiguous portions are numbered as expressways. Once all MTP clones have been identified the button turns gray.

All contigs or Contig N

To find MTP clones for all contigs, make sure the "All contigs" option is turned on. To find MTP clones for only one contig, enter a contig number in the text field beside 'Contig', and turn on that option.

Give preference to large clones

The clones with the longer lengths are given priority to being selected for the MTP. The length is based on the number of bands if "Use Sizes" is not selected.

Use Sizes

This applies when giving preference to the longest clone. It only works when there is a /Size directory (output from Image, contains the size of each fragment instead of the migration rate).

Mandatory clones

This button can be used to designate those clones that must be included in an MTP. See additional help on Mandatory Clones window.

STEP 3: View results

These buttons are used to step through and view the pairs of picked clones in the contig display.

Mini Window

As the MTP window is large, selecting the Mini Window reduces the size to only show the step functions, as that provides more screen room to view the fingerprint window, contig display, and the output to the terminal.

Contig

The contig to step through.

Step through pairs

Steps through all pairs found at the current cutoffs in the selected contig. The clone pair is highlighted blue. For fingerprints only, the spanner is highlighted pale blue, and the flanking clones are highlighted pale gray. Overlap information is printed to the terminal window.

Step through MTP clones

Steps through all clones picked to be in the MTP for the selected contig. Highlight colors are the same as for stepping through pairs. Expressway and overlap information is printed to the terminal window.

Show fingerprints (Fp only)

If on, the fingerprints of the current clone pair along with the spanner and flanking clones are shown in the fingerprint window. The following color key indicates how bands are shared:
Cyan -- band is shared by both clones in the pair and the spanning clone.
Green -- band is shared only by the left clone in the pair and spanning clone.
Blue -- band is shared only by the right clone in the pair and spanning clone.
Violet -- band is shared by a clone in the pair and its flanking clone, but not by the spanner.
Red -- band in a pair or spanning clone that is unconfirmed; a mismatch.

Set MTP clone status to TILE for contig

Sets the sequence status of all picked MTP clones in the selected contig to 'TILE'. A clone remark will also be added to each picked MTP clone of the form "MTP: [left clone] [overlap]", where "[left clone]" is the overlapping clone in the MTP to the left of the given clone, and "[overlap]" is the (estimated) overlap in base pairs with the left overlapping clone.

STEP 4: Save results for all contigs

Set MTP clone status to TILE

Sets the sequence status of all picked MTP clones in all contigs to 'TILE'. A clone remark will also be added to each picked MTP clone of the form "MTP: [right clone] [overlap]", where "[right clone]" is replaced by the overlapping clone in the MTP to the right of the given clone, and "[overlap]" is replaced by the (estimated) overlap in base pairs with the right overlapping clone.

Clear TILE

Any clone with a sequence status of TILE will be cleared, i.e. it will no longer have a sequencing status. It also removes all clone remarks starting with "MTP:", so make sure not to use that prefix on your remarks.

File of Pairs

Once the 'Find overlapping pairs' function has been run, you may save all the pairs that were found by clicking on the 'Save' button and entering a file name. These pairs can later be loaded with the 'Or use existing Pairs file' so the 'Find overlapping pairs' does not have to be re-run (good for really big datasets).

File of MTP clones

Writes a file of all MTP clones when you click 'Save'. Mandatory picked clones are indicated with a '*'.

Advanced BSS Pairs MTP Settings Help

These parameters can be used to fine-tune the performance of the BSS-draft based selection method. See http://www.agcol.arizona.edu/software/fpc/gr2003_supplemental/ for details.

Maximum sequence overlap

With large draft sequence contigs, it is possible to identify a clone pair that overlaps more than desirable. To avoid this situation, clones may not overlap more than the maximum sequence overlap, as calculated from the sequence comparison results.

Multiple contig ratio

If a sequence contig hits in several FPC contigs, the ratio of the number of hits in the best contig (these hits are kept) to the number of hits in all other contigs (these hits are discarded) must be at least this big. Otherwise, all hits to that sequence contig are discarded.
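
A rough Python sketch of this ratio test is given below. The function name and input layout are hypothetical, and treating a sequence contig that hits only one FPC contig as always passing is an assumption of the sketch.

```python
from collections import Counter

def apply_contig_ratio(hit_contigs, min_ratio):
    """Return the FPC contig whose hits are kept, or None if all are discarded.

    hit_contigs holds one FPC contig number per hit of a single
    sequence contig.
    """
    counts = Counter(hit_contigs)
    if not counts:
        return None
    best, best_hits = counts.most_common(1)[0]
    other_hits = sum(counts.values()) - best_hits
    # With no hits elsewhere there is nothing to compare against (assumption).
    if other_hits == 0 or best_hits / other_hits >= min_ratio:
        return best
    return None

kept = apply_contig_ratio([5, 5, 5, 5, 9], min_ratio=3)
discarded = apply_contig_ratio([5, 5, 9, 9, 9], min_ratio=3)
```

In the first call the ratio is 4 to 1, so the hits in contig 5 are kept; in the second it is 3 to 2, so every hit to that sequence contig is discarded.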

Only positive overlaps

If this option is on, pairs must have a positive overlap as calculated from the sequence comparison results.

Only single BES hits

If this option is on and more than one sequence contig hits a single BES, all hits to that BES are discarded. This should be on when the sequence contigs are thought to be unique.

Mandatory Clones Help

These settings can be used to designate those clones that must be included in an MTP. The designation of these "mandatory clones" is based on the sequence status of the clones. The radio buttons in this window allow you to choose a set of sequence statuses such that clones having a sequence status in the chosen set will be mandatory clones.

Loading Sequences

There are two ways to load sequence information: either directly from BSS alignment output, or using a specially constructed sequence information file.

The sequence information file describes supercontigs (i.e., scaffolds) and their contained sequence contigs, for example:

>DQ002408 95814
DQ002408.6 80814 5000
DQ002408.5 65814 5000
DQ002408.4 50814 5000
DQ002408.3 40342 472
DQ002408.2 25342 5000
DQ002408.1 15000 342
DQ002408.0 0 5000

Supercontig sections start with ">" and have format

>supercontig_name length

Followed by lines specifying the sequence contigs (if any):

seqctg1 location length
seqctg2 location length
...

The sequence contig location is its starting position in the supercontig. The sequence contig names don't have to have any relation to the supercontig name.
All lengths are in bp.
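
A file in this layout could be read with a sketch like the one below. It is not FPC's loader: the function name and the returned structure are hypothetical, and the column order for sequence contigs is taken from the DQ002408 example above (name, starting position, length), which is an assumption.

```python
def parse_sequence_info(text):
    """Parse a sequence information file.

    Returns {supercontig: {"length": int, "contigs": [(name, start, length), ...]}}.
    Column order follows the DQ002408 example (an assumption).
    """
    supercontigs, current = {}, None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        fields = line.split()
        if fields[0].startswith(">"):
            current = fields[0][1:]
            supercontigs[current] = {"length": int(fields[1]), "contigs": []}
        elif current is not None:
            name, start, length = fields[0], int(fields[1]), int(fields[2])
            supercontigs[current]["contigs"].append((name, start, length))
    return supercontigs

sample = """>DQ002408 95814
DQ002408.6 80814 5000
DQ002408.5 65814 5000
DQ002408.4 50814 5000
DQ002408.3 40342 472
DQ002408.2 25342 5000
DQ002408.1 15000 342
DQ002408.0 0 5000
"""
info = parse_sequence_info(sample)
```

Read this way, every sequence contig in the example ends inside the 95814 bp supercontig.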

Typically, draft sequence will incorporate BAC-end sequences, if available, removing the need to align the BES separately to the draft sequence. Instead, you can prepare a BES location file giving the location of each BES within its scaffold. This file is loaded using the "BES" button and has format

>sequence1
BES1 location size
BES2 location size
....
>sequence2
...

Note that the scaffolds need to be defined using the sequence information file previously described, which should be loaded first.
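
The BES location file can be read in much the same way. Again this is only a sketch: the function name, the returned structure, and the sample names and numbers are made up for illustration.

```python
def parse_bes_locations(text):
    """Parse a BES location file into {sequence: [(bes, location, size), ...]}."""
    locations, current = {}, None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            current = line[1:].split()[0]
            locations[current] = []
        elif current is not None:
            bes, location, size = line.split()
            locations[current].append((bes, int(location), int(size)))
    return locations

# Hypothetical sample data, not taken from a real project.
sample = """>scaffold1
a0001A01.f 1200 650
a0001A01.r 98000 700
>scaffold2
a0002B02.f 300 640
"""
bes_locations = parse_bes_locations(sample)
```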

Alternately, if you have sequences which do not incorporate BES, you may align them to the BES using the BSS function of FPC. The BSS output file may then be loaded using the "BSS" button on the sequence window. In this case it is not necessary to load a separate sequence information file, unless the sequences which appear in the BSS output are contigs within larger scaffolds. In this case, the sequence information file is necessary to define the scaffolds, and the sequence contig names in the sequence information file must match those used for the BSS search.

The locations of the sequence information and BSS files are saved when the FPC project is saved. The files will be reloaded automatically, assuming they have not been moved. (If they are in the FPC directory, the whole directory may be moved and the files will still be located.)

Sequence Placement

Parameters for placement of sequences onto contigs.

Window size: size, in kb, of the search window used for the alignment. (Note that the corresponding distance along the FPC contig is inferred using the "Band size" parameter from the Configuration page.)

Min clone hits: minimum number of distinct BES which must be hit in a search window

Top N: keep the top N placements (N=0 keeps all)
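
The effect of these three parameters can be pictured with the Python sketch below. It is not FPC's implementation. It assumes that "Band size" is the average number of bp per band, so that a window in kb converts to contig units by simple division, and that placements are ranked by their number of distinct BES hits. The function names and the dictionary layout are invented for the example.

```python
def window_kb_to_cb(window_kb, band_size_bp):
    """Convert a window size in kb to CB units (assumed: bp / band size)."""
    return window_kb * 1000 / band_size_bp


def select_placements(placements, min_clone_hits, top_n):
    """Drop placements with too few distinct BES hits, then keep the top N.

    Each placement is a dict with a 'bes_hits' list (hypothetical layout).
    Ranking by distinct hit count is an assumption; top_n == 0 keeps all.
    """
    def distinct_hits(p):
        return len(set(p["bes_hits"]))

    kept = [p for p in placements if distinct_hits(p) >= min_clone_hits]
    kept.sort(key=distinct_hits, reverse=True)
    return kept if top_n == 0 else kept[:top_n]


placements = [
    {"ctg": "ctg12", "bes_hits": ["a.f", "b.r", "c.f"]},
    {"ctg": "ctg40", "bes_hits": ["d.f", "d.f"]},  # one distinct BES only
    {"ctg": "ctg7", "bes_hits": ["e.f", "g.r", "h.f", "i.r"]},
]

best = select_placements(placements, min_clone_hits=2, top_n=1)
everything = select_placements(placements, min_clone_hits=2, top_n=0)
```

Note that the same BES hit twice counts once, because the parameter refers to distinct BES.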

Searching Sequences

Parameters for building a keyset of sequences (supercontigs). (For a description of keysets and related functionality, see the clone/marker search functions.)

Name: name search pattern, with same conventions as marker/clone search

Placed: show only supercontigs which have an FPC contig placement.

Multiple placements: show only supercontigs with more than one placement.

Search: do the search.

Clear Keyset: Empty the current keyset in preparation for a new search.

Analysis Functions


These functions help extract the important information from the alignments of sequence to FPC contigs.

Seq Joins: Lists sequence pairs for which merges are suggested.

Ctg Joins: Lists contig pairs for which merges are suggested.

Misassemblies: Lists sequence/contig alignments which are inconsistent, indicating a misassembly of one or the other (further examination is needed to decide which). The inconsistencies are of two types: A) an alignment terminates in the middle on each side, when it should extend if the assemblies are correct; and B) a single sequence has multiple alignment regions to a single FPC contig.

The analysis algorithms make use of two parameters which define when an alignment has reached the end of the FPC contig or sequence:

Ctg_FromEnd (cb units): this is the identical parameter appearing on the Main Analysis window

Seq_FromEnd (kb): how close to the sequence endpoint the alignment must reach to be considered as reaching the end.
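
The end test that these two parameters describe can be sketched as below. This is an illustration, not FPC's code: reaches_ends is a made-up name, and the alignment coordinates are invented. The same comparison applies to either parameter, provided the coordinates and the tolerance are in the same units (bp for the sequence after converting Seq_FromEnd from kb, CB units for the contig).

```python
def reaches_ends(align_start, align_end, total_length, from_end):
    """Return (reaches_left_end, reaches_right_end) for one alignment.

    All four arguments must be in the same units. An end counts as reached
    when the alignment comes within 'from_end' of it.
    """
    reaches_left = align_start <= from_end
    reaches_right = (total_length - align_end) <= from_end
    return reaches_left, reaches_right


seq_from_end_kb = 5
tolerance_bp = seq_from_end_kb * 1000

# Invented alignment spanning 2000..93000 bp on a 95814 bp sequence.
full = reaches_ends(2000, 93000, 95814, tolerance_bp)

# Invented alignment that starts 20 kb in: the left end is not reached.
partial = reaches_ends(20000, 93000, 95814, tolerance_bp)
```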

Additional analysis functions:


Unaligned FPC Contigs: List contigs having no sequence alignment. The Min Clones parameter restricts the list to contigs of at least that size. For each contig, the output table also lists the BES in that contig which are incorporated into draft contigs.
NOTE: for this extra information to be included, a list of all available BES must be loaded through the "Seq Info" function (top of the Sequence window). In other words, make a file with format
--------------------
>a0001A02.r 770
>a0001A03.f 770
>a0001A03.r 759
--------------------
where the list includes all BES which exist, and the numbers are their lengths (they don't have to be right for this purpose). When you load them through "Seq Info" they will be added to the set of "draft sequences", allowing the analysis functions to be aware of their existence.
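
If the BES are available as a FASTA file, a list in this format can be generated with a few lines of Python. The sketch below is a suggestion rather than an FPC tool; bes_list_from_fasta is a made-up name, and it assumes the BES name is the first word of each FASTA header.

```python
def bes_list_from_fasta(fasta_text):
    """Build '>name length' lines, one per FASTA record.

    Assumption: the BES name is the first whitespace-separated word
    of the header line.
    """
    out = []
    name, length = None, 0
    for raw in fasta_text.splitlines():
        line = raw.strip()
        if line.startswith(">"):
            if name is not None:
                out.append(f">{name} {length}")
            name, length = line[1:].split()[0], 0
        elif name is not None:
            # Sequence may be wrapped over several lines; sum them.
            length += len(line)
    if name is not None:
        out.append(f">{name} {length}")
    return "".join(entry + "\n" for entry in out)


fasta = """>a0001A02.r
ACGTAC
GGT
>a0001A03.f some description
ACGT
"""

listing = bes_list_from_fasta(fasta)
```

The resulting text can be saved to a file and loaded through "Seq Info".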

The output file lists each unaligned contig, along with all of its BES which are incorporated into a draft sequence.

Unaligned draft contigs: List sequence contigs (or segments of sequence contigs) having no FPC alignment. The segments are restricted by the "Min Length" parameter. A keyset of the sequences is generated. The output file also lists the BES embedded in the unaligned segments, along with the FPC contig the BES belong to and the number of bands in the clone fingerprint. This information helps determine where the segment should align and whether the segment comes from a region having unusually few or many restriction sites. In addition, the clones containing these BES are given an FP remark of the form "unaligned:sequence_name". By using the clone search functions on these remarks, you can get a keyset of the unaligned BES in any of the sequences.

Unincorporated BES: List BES which should be incorporated into draft sequence (based on alignment location) but which are either not incorporated or are incorporated into a different draft sequence. In the latter case, the second sequence and the FPC contig (if any) to which it aligns are also listed.
In addition, the clones for the misplaced BES are remarked with an FP remark of the form "misplaced:<BES name>", plus additional information (if any) as just described.
NOTE: for this function to work, a list of all available BES must be loaded as described for the "Unaligned FPC Contigs" function.

Other Sequence Functions


Print to file: generates a list of the sequence placements

Remove sequences: clear all loaded sequences.