JAVA GBrowse

Introduction

Java GBrowse is a Java applet-based implementation of the original CGI-based GBrowse web application developed by GMOD. Although similar to the CGI-based application, Java GBrowse adds some functionality, including more search options and a modified navigation interface. With a few exceptions, Java GBrowse can be used with the same configuration and database as the original CGI-based GBrowse. Only a subset of the configuration options offered in the original GBrowse is currently supported, but we hope this subset is rich enough to meet the needs of many configurations. We may continue to expand this project and welcome any feedback about what could be added or changed to make it more useful. Please direct comments to: www@agcol.arizona.edu.

Before reading this document you may wish to skim the Java GBrowse Help file to get a feel for the basic terminology used. If you are an experienced user of GBrowse, read the following section to see which options are currently supported and what has changed. New users should skip ahead to the section on importing a GFF file to create the Java GBrowse database. For a sample configuration of Java GBrowse, see the Maize Sequenced BAC Browser, which is part of the HMPR-MSLL project.

Changes between GBrowse and Java GBrowse

In Java GBrowse, options computed with Perl subroutines cannot be used. To partly compensate, we added the ability to set various track options based on the entity's method and source (see below). Other major areas of functionality that are not currently supported are semantic zooming and the display of DNA sequences for applicable glyphs at high magnification. The following options for the general section of the configuration file are not implemented.
For track section options, the link option does not currently support the $class variable, but all other variables are supported. In addition, the following track options are not supported.
Also, we currently offer only a subset of the glyphs from the original GBrowse (see Glyph Types below). As for new functionality, searches can be narrowed by track in addition to class name and reference sequence (see Search Options below). The user interface of the browser has also been modified. Most notably, a drop-down menu for changing the reference sequence has been added. Because the Java browser's interface cannot be customized using HTML, the following options were added to the general section of the configuration file to allow some customization:
Java GBrowse works with the same configuration files that are used by the GMOD CGI-based GBrowse. For options that it does not support, Java GBrowse shows warnings in the configuration error dialog window. Setting the suppress_warnings option to 1 in the general section of the configuration file prevents the configuration error dialog from being displayed, in which case the warnings are sent to the Java error console. Because Java GBrowse does not support the Perl subroutines defined in the configuration file, it handles individual settings for multiple features in a track differently. See the Creating the Configuration File section for details.

In addition, "limit" was removed from the Track Options dialog, and some options were added for searching ("filter by reference sequence" and "filter by track") and for customizing the detail view ("horizontal scroll ratio" and "vertical limit"). See the help file for more information.

Creating MySQL database and Importing the GFF File

Java GBrowse currently supports the MySQL database. Download MySQL from http://www.mysql.com and install it on your system. Then follow these steps to set up the Java GBrowse MySQL database for your organism:

1. Log into the MySQL database as the root user.
2. Create the database using the following command:
    create database <database name>;
3. Grant permissions on the database using the following commands:
    grant all privileges on <database name>.* to <username>@localhost;
    grant select on <database name>.* to nobody@localhost;
4. Quit the MySQL database.

The steps above create the MySQL database for the organism. Now import the GFF file (GFF, the Gene Finding Format, is a text file layout developed at the Sanger Centre) into the MySQL database using the bp_bulk_load_gff.pl script distributed with the Bioperl package:

    bp_bulk_load_gff.pl -c -d <database name> input_file.gff

For more detail on setting up the GBrowse MySQL database, refer to the GMOD GBrowse tutorial.

Creating the Configuration File

A configuration file for Java GBrowse consists of a general section followed by multiple track sections. In addition, a track defaults section can be used to set default values for any of the track settings. Each section is introduced with a header in square brackets. After the header, a section consists of multiple option settings with the format of: option name = option value. Comments can be entered at any location by beginning the line with a pound symbol (#). The following sample sets the important options in the general section (values that should be changed are in bold).

# This is a sample GENERAL section for a Java GBrowse configuration
# file.
[GENERAL]
# To suppress the warnings for the configuration file errors
suppress_warnings = 1
# The "description" option can be used to give a descriptive name to
# the browser:
description = RICE FPC (Chromosome)
# The "ref_seq descriptor" option can be used to give a more descriptive
# name to the reference sequence. The default is "sequence" if not set.
ref_seq descriptor = Chromosome
# The options "db_args", "user", and "pass" must always be set to tell
# Java GBrowse how to access the Genome database. The options "user"
# and "pass" give the user name and password for the database. The "db_args"
# option specifies arguments for accessing the database and can span
# multiple lines. The "-adaptor" argument specifies the Java class for
# accessing the database. The "-dsn" argument specifies a URL for the
# database with the format of "jdbc:database_protocol:url_to_database".
db_args = -adaptor com.mysql.jdbc.Driver
-dsn jdbc:mysql://url_to_db/db_name
user = the_db_user_name
pass = the_db_password
# The "default features" option gives a list of what detail tracks should
# be visible by default. The names should match the names in the section
# headers of the desired tracks.
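
As a rough illustration of the file layout described above (bracketed section headers, "option name = option value" lines, # comments, and values that span multiple lines), the sketch below reads a shortened version of the sample with Python's standard configparser module. This is not how Java GBrowse itself parses its configuration. The choice of configparser, the indentation of the db_args continuation line, and the parse_conf helper are all assumptions made for this example.

```python
import configparser

# Shortened copy of the sample general section plus one track section.
# Assumption: the continuation line of db_args is indented.
SAMPLE = """\
# This is a sample GENERAL section
[GENERAL]
description = RICE FPC (Chromosome)
ref_seq descriptor = Chromosome
db_args = -adaptor com.mysql.jdbc.Driver
          -dsn jdbc:mysql://url_to_db/db_name
user = the_db_user_name

[contig:overview]
feature = contig:FPC
glyph = generic
"""


def parse_conf(text):
    """Return {section name: {option name: value}} for a config string."""
    # Only "=" separates names from values, so "contig:FPC" stays intact;
    # interpolation is disabled so "%" in URLs would be left alone.
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.read_string(text)
    return {name: dict(parser[name]) for name in parser.sections()}


conf = parse_conf(SAMPLE)
print(conf["GENERAL"]["ref_seq descriptor"])
print(conf["GENERAL"]["db_args"].split())
```

Splitting db_args on whitespace yields the "-adaptor" and "-dsn" flags followed by their values, which is one simple way to pair each argument with its setting.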
Then a track defaults section can be added to specify default values for any of the track settings. Any of these settings that are not defined for a given track will take on the value specified here. (The meaning of the options is described below.)

[TRACK DEFAULTS]

After defining the general and track defaults sections, add the track sections that define what is displayed in each row, or track, of the overview and detail view. The order of the tracks in the browser follows the order in which they are declared here, although the end user can re-order the detail view tracks. Each track section's options specify the subset of the genome data to be displayed in the track and how it should be displayed. The desired data entities for the track are identified using source and method names that reference the original GFF file. The feature option associates one or more method/source pairs with a track in a space-separated list with the format of:

feature = method1:source1 method2:source2

The glyph type is the main specification of how data is displayed. Additional options set the glyph height, foreground and background color, and so on. See the Track Options section for details.

The following options accept a space-separated list of values, one for each feature listed in the feature option:

bgcolor = color_for_feature1 color_for_feature2
fgcolor = color_for_feature1 color_for_feature2
fontcolor = color_for_feature1 color_for_feature2
font2color = color_for_feature1 color_for_feature2
link = "http://xyz.com/name="$name "http://abc.com/name="$name
Note: In CGI-based GBrowse the above functionality is achieved with a Perl subroutine.
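
The pairing of per-feature values with features can be sketched as follows. This is only an illustration of the rule, not Java GBrowse's code: the per_feature_values helper is hypothetical, it handles only simple whitespace-separated values, and raising an error when the counts do not match is an assumption, since this document does not say how a mismatch is handled. A single value is applied to every feature.

```python
def per_feature_values(features, setting):
    """Map each method:source pair to its value for one track option."""
    names = features.split()
    values = setting.split()
    if len(values) == 1:
        # One value listed: it applies to all features of the track.
        return {name: values[0] for name in names}
    if len(values) != len(names):
        # Assumption: a count mismatch is treated as a configuration error.
        raise ValueError("expected 1 or %d values, got %d"
                         % (len(names), len(values)))
    # Otherwise values are matched to features by position.
    return dict(zip(names, values))


features = "marker:FPC frameworkmarker:FPC electronicmarker:FPC"
print(per_feature_values(features, "limegreen red yellow"))
print(per_feature_values(features, "black"))
```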
If only one setting is listed, it will be applied to all of the features in the track. Here is a sample that sets up several overview and detail tracks.

# Placing ":overview" at the end of the track name (in the section header)
# specifies that it should appear in the overview. All other tracks appear in
# the detail view. The "key" option gives the descriptive name for the track that
# will be displayed to the end user. The "link" field allows the glyphs of the
# track to be linked to a URL (the optional $name parameter will be replaced with
# the clicked entity's name).

###### Overview Tracks ######

[contig:overview]
feature = contig:FPC
glyph = generic
fgcolor = black
bgcolor = yellow
height = 4
key = Contigs

[frameworkmarker:overview]
feature = frameworkmarker:FPC
bgcolor = red
glyph = dot
fgcolor = black
height = 8
key = Framework Marker

###### Detail View Tracks ######

[contig]
feature = contig:FPC
glyph = generic
fgcolor = black
bgcolor = yellow
height = 4
link = "http://url_for_link?source=ricefpcctg&name="$name
key = Contigs

# Note: with the Markers track, an individual foreground color is configured for each
# feature.
[Markers]
feature = marker:FPC frameworkmarker:FPC electronicmarker:FPC
bgcolor = black
glyph = dot
fgcolor = limegreen red yellow
height = 5
link = "http://url_for_link?db=rice_fpc_chromosome&marker="$name
key = Markers

[Sequence_Clones]
feature = sequenced:FPC
fgcolor = black
bgcolor = blue
glyph = generic

For a complete list of options for the general and track sections, see Options for General Section and Options for Track Section below. For a complete list of supported glyphs, see Supported Glyph Types.

Adding the Java GBrowse Applet to a Web Page

To add the applet to a web page, the .jar file (GenomeBrowser.jar), the configuration file, and the folder containing the help files (GBrowseHelp) need to be copied into the web page's directory tree.
(By default, Java GBrowse expects the help folder to be named GBrowseHelp and to be located in the same parent folder as GenomeBrowser.jar. Otherwise, the URL for the help file can be set explicitly using the instructions option in the configuration file.) Then, within the body of the page that will display the browser, the applet must be declared with HTML code like the following. All of the values that need to be changed are in bold. Also, notice that the applet's width and height are both set to 1 so that the applet will not actually be visible on the page.

<applet code="GBStubApplet.class" archive="GenomeBrowser.jar" MAYSCRIPT
width="1" height="1" name="rice_fpc_chr">
<param name="CONFIGFILE" value="ricefpcchr.conf">
<param name="APPLET_NAME" value="rice_fpc_chr">
</applet>

In the HTML, set the value of the archive attribute to the correct relative path for the .jar file. The CONFIGFILE parameter should contain the URL of the desired configuration file. If a full path is not specified, the location is relative to the .jar file (using "../" in the path is not supported). The final value to set is the name of the applet, which must be the same in the name attribute and the APPLET_NAME parameter.

Note: the security restrictions placed on applets only allow them to open sockets to a single IP address, that of their host. As a result, the configuration file, database, and .jar file must all be accessed from the same host.

The browser itself is not displayed until the showBrowser method is called on the applet; when it is called, Java GBrowse opens in a separate window. One way of calling showBrowser is with a link that runs a JavaScript function. The applet can be accessed through the JavaScript document object using the name it was given in its HTML declaration. The first parameter to showBrowser should always be document.cookie. The second parameter can be used to make the browser search for a specific reference sequence or named entity (see Search section below). The following sample HTML demonstrates calling showBrowser on the applet that was declared previously.

<SCRIPT LANGUAGE="JavaScript">
<!--
function showRiceFPCChr ()
{
if ( document.browserLoaded )
document.rice_fpc_chr.showBrowser ( document.cookie, "" );
}
// -->
</SCRIPT>
<BODY onLoad="document.browserLoaded = false;">
<li><small> <font face="Helvetica, Arial, sans-serif">
<a href="javascript:showRiceFPCChr ()">Rice FPC Chromosome</a>
</font></small></li>

When showBrowser is called, if there are any errors in the configuration file, a dialog is displayed listing the problems. This dialog can be suppressed for non-fatal errors by setting the suppress_warnings option.

(Note: the document.browserLoaded variable used in the BODY tag and the JavaScript function was added to prevent a lock-up that we observed when showBrowser was called too soon. The problem caused Java GBrowse to stop functioning until the web browser was closed and reopened. It appeared that the loading of the .jar file was being stalled indefinitely by our network issuing a redundant password request. With this configuration, the web page was visible for a fairly long time before the second password request, and clicking on the link at any point before the password was entered would cause the lock-up.)

Search Options with the "showBrowser" Method

The second parameter of the showBrowser method specifies an optional search string that can be used to direct Java GBrowse to a specific reference sequence or named entity. This method can also be called after the browser is already open to direct it to a new area of the genome. The search string should use one of the following formats.
Formats 1 and 2 cause the browser to display the input reference sequence. Format 2 additionally specifies the region (in base pairs) of the overview that should be expanded in the detail view. With format 3, the browser will search for and center around the named entity (a BAC clone in this example) if the name is unambiguous. Often it will be necessary to qualify the entity name with a class name, track name, and/or a reference sequence, as in formats 6 through 8, to make it unambiguous. The class name refers to the class given for the entity in the group field of the original GFF file. The track name should match the name of the desired track in the configuration file and can be either the internal name (inside the brackets, e.g. [Track name]) or the display name (the value of key for the track). One way to test a search string is to use the search option from the toolbar of Java GBrowse, since it calls the same subroutines as the showBrowser method.

Aggregators

Currently, aggregators are primarily used with the segments and transcript glyphs. An aggregator must be specified for these glyphs to indicate the method names of the sub-entities that form each composite entity. Otherwise, entities in the track will not be grouped together. To use an aggregator, declare it in the aggregators option in the general section of the configuration file and then reference it in the feature option for the track. The following predefined aggregators are provided in Java GBrowse. (The descriptions in the table were taken from the tutorial for the original GBrowse.)
In addition, custom aggregators can be defined using the format of:

aggregator_name { sub_method1, sub_method2 / composite_method }

Each sub-method name gives the method name for a desired subset of entities (their method names in the original GFF file). To use the aggregator, reference its name in the track's feature option as though it were an ordinary method. If the aggregator is qualified with a source name, queries will be restricted to that source for all sub-methods. Here is a sample feature line using the predefined transcript aggregator qualified with the source FGENESH.

feature = transcript:FGENESH

Colors in the Configuration File

All color options should be set in the configuration file using one of the predefined colors in the table below or by entering a hexadecimal RGB value using the format of: #RRGGBB (e.g. #86A0B3).
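
To make the #RRGGBB format concrete, the sketch below splits such a value into its red, green, and blue components. It is written in Python purely for illustration and says nothing about how Java GBrowse reads colors internally. The parse_hex_color helper is hypothetical, and the predefined color names are not handled.

```python
import re

# Exactly six hexadecimal digits after a leading "#".
HEX_COLOR = re.compile(r"#([0-9A-Fa-f]{2})([0-9A-Fa-f]{2})([0-9A-Fa-f]{2})")


def parse_hex_color(value):
    """Return (red, green, blue) as integers in 0..255 for '#RRGGBB'."""
    match = HEX_COLOR.fullmatch(value.strip())
    if match is None:
        raise ValueError("not a #RRGGBB color: %r" % value)
    return tuple(int(part, 16) for part in match.groups())


print(parse_hex_color("#86A0B3"))  # prints (134, 160, 179)
```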
Options for General Section of the Configuration File

aggregators -- a space-separated list of one or more aggregators that are used by the tracks defined in the configuration file. The option may span multiple lines and should list the names of one or more predefined or custom aggregators (see the Aggregators section).
bump density -- a threshold that, combined with label density, determines how collisions between entities of a track are handled when "format" is set to "auto" for the track in the Track Options dialog. If the number of visible entities in the track exceeds this value, the placement algorithm for avoiding collisions is disabled and the glyphs for all entities in the track are drawn in the same row.
db_args -- the arguments that Java GBrowse uses to access the genome database. (The option can span multiple lines.) The "-adaptor" argument specifies the Java class for accessing the database. The "-dsn" argument specifies the URL for the genome database with the format of "jdbc:database_protocol:url_to_database". For example:
db_args = -adaptor com.mysql.jdbc.Driver
-dsn jdbc:mysql://url_to_db/db_name
default features -- a list of the tracks that should be visible by default. This list should consist of one or more track names (from the section headers) separated by spaces or line feeds.
default segment -- the default zoom level in base pairs (the default width of the region of the overview that is displayed in the detail view).
detailed bgcolor, detailed bgcolor2 -- the colors for the background of the detail view and its guide lines, respectively.
description -- a description for the browser that will be displayed above the overview.
key bgcolor -- the background color for the entire browser.
keyword search max -- the maximum number of results that can be returned by an ambiguous search. The default is 1000.
instructions -- can be used to override the default location of the help file, which is "http://parent_folder_of_jar/GBrowseHelp/gbrowse_help.htm". (Note: it appears that relative URLs do not work on all platforms, so it is recommended that a full URL with host name is used.)
label density -- a threshold that is used in conjunction with bump density to determine whether labels should be omitted when the track's "format" is set to "auto" in the Track Options dialog. If the track's format is auto and more entities are visible than the threshold, labels are omitted when drawing glyphs.
max segment -- the maximum width in base pairs for the region expanded in the detail view.
pass -- the password for connecting to the genome database.
overview bgcolor, overview bgcolor2 -- the colors for the background of the overview and its guide lines, respectively.
ref_seq descriptor -- a more specific name for the reference sequence (e.g. contig) to display in the browser; defaults to "sequence" if not set.
reference class -- the class name under which the reference sequences are defined in the original GFF file. The default is "sequence".
suppress_warnings -- this option can be set to 1 to prevent the configuration errors dialog from being displayed, in which case the warnings are sent to the Java error console.
user -- the user name for connecting to the genome database.
zoom levels -- this option should always give a space-separated list of the zoom levels that will be offered in the Zoom Level drop-down.

Supported Glyph Types

anchored_arrow -- entities are indicated with an arrow at their position in the reference sequence.
arrow -- similar to anchored_arrow, but glyphs have a slightly different shape.
dot -- entities are indicated with a dot at their position in the reference sequence.
generic -- entities are displayed as a rectangle covering their base pair range of the reference sequence.
line -- entities are displayed as a line covering their base pair range.
segments -- groups together entities with the same name into an aggregate glyph for similarity alignments and spliced transcripts. For this glyph to work correctly, the track's feature option must be set using an aggregator (see Aggregators section).
transcript -- similar to the segments glyph, but entities are connected using a solid line. For this glyph to work correctly, the track's feature option must be set using an aggregator (see Aggregators section).
xyplot -- the data in the track is displayed in a graph.

Options for Track Section of the Configuration File

bgcolor -- the background color for the track's glyphs.
bump -- setting this property to "0" disables collision control for the track (all entities will be drawn in a single row regardless of overlap).
connector -- this property can be used to cause GBrowse to group together related entities and draw connection lines between them. Currently supported connector types are dashed and solid. (The hat option from the original GBrowse is not currently supported.) By default, entities are considered related if their names match. More complex relationships can be defined using the group_pattern option.
description -- a boolean value (1 or 0) indicating whether the description field should be displayed (below entities) when available.
feature -- a space-separated list of features for the track, which are identified using source and method names that reference the original GFF file. (An aggregator name can be substituted in place of the method name; see the Aggregators section.) This option should be set using the format of: feature = method1:source1 method2:source2
fgcolor -- the foreground color for the track's glyphs (used for their outlines).
fontcolor -- the color for rendering entity labels.
font2color -- the color for rendering entity descriptions.
group_pattern -- allows for the grouping of entities based on portions of their name (the connector property can be used to specify the type of connection between entities in a group). The following table (taken from the tutorial for the original GBrowse) gives some basic regular expressions for grouping. For more information on entering regular expressions, see Sun's Java documentation.
glyph -- the desired glyph for all entities in the track (see Glyph Types section).
height -- the height in pixels for the glyph. (Not applicable for some glyphs, e.g. line.)
key -- the name for the track that will be displayed to the end user.
label -- a boolean value (1 or 0) indicating whether the label field should be displayed (above entities) when available.
link -- a URL for an external web page to be opened when entities in the track are clicked. If the following variables are placed in the URL, they will be replaced with the appropriate data from the clicked entity.
For example: link = http://url_for_link?source=ricefpcctg&name=$name
strand_arrow -- this option can be set to 1 to cause the generic glyph to be rendered with an arrow at either end indicating the direction of the entity's sequence. The default value is 0 (no arrow).
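
The $name substitution performed for the link option can be pictured with the short Python sketch below. It is a guess at the behavior, not Java GBrowse's implementation. The expand_link helper and the entity name "ctg12" are hypothetical. Dropping the double quotes used in some of the sample link values and URL-encoding the entity name are both assumptions, and only the $name variable is handled.

```python
from urllib.parse import quote


def expand_link(template, name):
    """Build the URL for a clicked entity from a link option value."""
    # Assumption: double quotes in the configured value are not part of the URL.
    url = template.replace('"', "")
    # Assumption: the entity name is URL-encoded before being inserted.
    return url.replace("$name", quote(name, safe=""))


template = '"http://url_for_link?source=ricefpcctg&name="$name'
print(expand_link(template, "ctg12"))
```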

Email Comments To: www@agcol.arizona.edu