|
This simulation compares different coverages and error rates. It also
compares HICF and agarose fingerprints. The results show that HICF is
better than agarose even with error. The results also show the benefit
of reducing your error.
Type     Set  Cutoff | Coverage | Error | #Ctgs | Score | #Qs (#ctgs w/Qs)
---------------------+----------+-------+-------+-------+-----------------
HICF     1    1e-40  | 10x      | 6%    | 24    | 0.892 | 0(0)
                     | 10x      | 12.5% | 46    | 0.802 | 39(11)
                     | 20x      | 6%    | 2     | 0.874 | 0(0)
                     | 20x      | 12.5% | 8     | 0.710 | 328(7)
HICF     2    1e-50  | 10x      | 6%    | 43    | 0.886 | 0
                     | 10x      | 12.5% | 79    | 0.797 | 13(9)
                     | 20x      | 6%    | 9     | 0.874 | 1(1)
                     | 20x      | 12.5% | 17    | 0.724 | 106(10)
Agarose  2    1e-12  | 10x      | 6%    | 163   | 0.959 | 0(0)
                     | 10x      | 12.5% | 211   | 0.921 | 8(8)
                     | 20x      | 6%    | 67    | 0.928 | 0(0)
                     | 20x      | 12.5% | 147   | 0.872 | 55(33)
The same clones were used within a given set, except that the 20x runs
include an additional 10x of clones. So the same clones are used for
HICF Set 2 and Agarose Set 2. Though the number of contigs for the
20x/12.5% error is lower than for the 10x/6% error, the number of Qs is
much higher, implying more problems within the map. These problems mean
more time spent manually editing the map and more difficulty selecting
an MTP (see MTP simulation). The take-home message from this simulation
is that if you reduce your error, you can reduce your coverage.
Moreover, the amount of overlap between clones (i.e. the endpoint
coordinates) will be closer to correct. Note that they can never be
exact, since there is not enough information in the bands, but the less
error in the fingerprints, the less error in the overlap. Reduce your
error!
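The trade-off described above can be checked directly against the table. The following is a minimal sketch in Python, not part of FPC; the names `RESULTS` and `compare_runs` are made up for illustration, and the numbers are copied from the table. For each set it compares the 20x/12.5% run with the 10x/6% run.

```python
# Results copied from the simulation table.
# Key: (type, set, coverage in x, error in %)
# Value: (contigs, score, Qs, contigs with Qs)
RESULTS = {
    ("HICF", 1, 10, 6.0): (24, 0.892, 0, 0),
    ("HICF", 1, 10, 12.5): (46, 0.802, 39, 11),
    ("HICF", 1, 20, 6.0): (2, 0.874, 0, 0),
    ("HICF", 1, 20, 12.5): (8, 0.710, 328, 7),
    # The table lists only "0" for this row; zero Qs implies zero contigs with Qs.
    ("HICF", 2, 10, 6.0): (43, 0.886, 0, 0),
    ("HICF", 2, 10, 12.5): (79, 0.797, 13, 9),
    ("HICF", 2, 20, 6.0): (9, 0.874, 1, 1),
    ("HICF", 2, 20, 12.5): (17, 0.724, 106, 10),
    ("Agarose", 2, 10, 6.0): (163, 0.959, 0, 0),
    ("Agarose", 2, 10, 12.5): (211, 0.921, 8, 8),
    ("Agarose", 2, 20, 6.0): (67, 0.928, 0, 0),
    ("Agarose", 2, 20, 12.5): (147, 0.872, 55, 33),
}


def compare_runs(kind, set_id):
    """Compare the 20x/12.5% run against the 10x/6% run for one set.

    Returns the change in contig count and in Q count
    (high coverage/high error minus low coverage/low error).
    """
    low = RESULTS[(kind, set_id, 10, 6.0)]
    high = RESULTS[(kind, set_id, 20, 12.5)]
    return {
        "contig_change": high[0] - low[0],
        "q_change": high[2] - low[2],
    }


if __name__ == "__main__":
    for kind, set_id in [("HICF", 1), ("HICF", 2), ("Agarose", 2)]:
        diff = compare_runs(kind, set_id)
        print(kind, "Set", set_id, diff)
```

In all three sets the contig change is negative while the Q change is positive, which is the pattern the text describes: doubling coverage hides the extra error in the contig count but not in the Qs.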
For HICF, we typically get 12.5% error per clone. For agarose, it
really depends on how good the band-calling is. We had exceptionally
good band-callers for the maize agarose fingerprints and averaged <6%
error per clone. To test your error, do the following: (1) refingerprint
a plate of clones; (2) for the second plate, give the clones the same
names but a different gel name, and add them to your FPC database
(Update .cor); (3) in the Project Window, select Search, then select
"Find clones with mult gels > cutoff". It will list all the clones
where the two gels do not match below the given cutoff. At the end of
the output, the final statistics are displayed, as in the following
example:
0    51689   17.839
1    79004   27.265
2    60871   21.007
3    40494   13.975
4    25598    8.834
5    15881    5.481
6     9895    3.415
7     6328    2.184
IBM Bad 2354 Good 12394 Fp 10.49 (7,1e-10, min 0, max 32767)
This says that 17.8% of the bands in this dataset are exactly the same,
27% differ by 1, etc. The list goes up to 7 since that is the
tolerance. Any band that does not match another within the tolerance is
considered an Fp (false positive). There are 10.49 Fp in this set,
which is 5.25% error per clone. Of all the clones with multiple gels,
2354 were below the cutoff and the rest (12394) were good matches.
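The percentages in the example output can be reproduced from the counts. The sketch below is Python, not FPC code, and the names `band_diff_counts` and `percent_by_diff` are illustrative. It also halves the Fp value to get the per-clone figure. That halving is an assumption: the text does not say how 5.25% is derived, but 10.49 / 2 = 5.245 agrees with it, presumably because each comparison involves two gels.

```python
# Counts of matched bands by size difference, copied from the example output.
band_diff_counts = {
    0: 51689,
    1: 79004,
    2: 60871,
    3: 40494,
    4: 25598,
    5: 15881,
    6: 9895,
    7: 6328,
}

# Total number of bands that matched within the tolerance of 7.
total_matched = sum(band_diff_counts.values())

# Percentage of matched bands at each difference (the third output column).
percent_by_diff = {
    diff: 100.0 * count / total_matched
    for diff, count in band_diff_counts.items()
}

# Fp value reported on the summary line.
fp_reported = 10.49

# ASSUMPTION: the per-clone error is half of Fp, since two gels are compared.
error_per_clone = fp_reported / 2

if __name__ == "__main__":
    for diff, pct in percent_by_diff.items():
        print(f"{diff} {band_diff_counts[diff]:6d} {pct:7.3f}")
    print(f"error per clone (assumed Fp / 2): {error_per_clone:.3f}")
```

Note that the percentages are relative to the matched bands only; the unmatched bands are reported separately through the Fp value.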
The above simulations were performed with one Build. On a real dataset,
we then perform
automatic end joining, which further reduces the number of contigs
without manual merging. See the Automerge
demo.
|