How to use wclique for selecting genetic or Radiation Hybrid (RH) markers for framework mapping.
Introduction
wclique is a C++ program that helps select genetic or RH markers for
framework mapping. wclique is copyrighted material, but may be freely
distributed under the terms of the GNU General Public License.
Synopsis
% wclique [-r] [-b {breaks}] [-e {exact}] [-i {iters}] {input file}
- -r
- toggles optional weighting of markers by retention frequency
instead of the default informativity. May be of interest when selecting
RH markers.
- -b
- followed by an integer specifies the minimum number of breaks
allowed between selected markers. The default value is 1.
- -e
- followed by an integer specifies the maximum number of test
markers to exhaustively search. The default is 50. Increasing this
number slows execution, but may find better cliques. Setting this
value to the number of markers in the input file guarantees finding
the optimal solution but is prohibitively slow for more than 150
markers.
- -i
- followed by an integer specifies the number of random trials to
run when searching for cliques. If the -e parameter is set
to the number of markers in the file, this parameter may be reduced to
1. The default value is 100.
Input file
wclique requires an input file that contains an integer M, specifying
the number of markers to be analyzed, followed by M lines of marker
names, then N, an integer specifying the number of chromosomes to be
analyzed, followed by N lines of M characters each, one character per
marker, specifying whether each marker is phase "P", "M", or "U".
Here's a (small) sample file with 4 markers and 5 chromosomes.
4
NAME1
NAME2
NAME3
NAME4
5
PMUP
UPPM
PUUM
MMPM
MUPU
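The layout described above can be read with a few lines of code. The following Python sketch is not part of the wclique distribution; the function name parse_wclique_input is hypothetical, and the checks it performs (blank lines ignored, only the letters P, M and U accepted) are assumptions based on the description and the sample file.

```python
def parse_wclique_input(text):
    """Parse wclique input text into (marker_names, chromosome_rows).

    Hypothetical helper: assumes the layout described in this document,
    i.e. M, then M marker names, then N, then N lines of M characters.
    """
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    m = int(lines[0])                 # number of markers
    names = lines[1:1 + m]            # M marker names
    n = int(lines[1 + m])             # number of chromosomes
    rows = lines[2 + m:2 + m + n]     # N phase strings
    if len(names) != m or len(rows) != n:
        raise ValueError("file is shorter than its M or N header claims")
    for row in rows:
        # Each chromosome line needs one phase letter per marker.
        if len(row) != m or set(row) - set("PMU"):
            raise ValueError(f"bad chromosome line: {row!r}")
    return names, rows


SAMPLE = """4
NAME1
NAME2
NAME3
NAME4
5
PMUP
UPPM
PUUM
MMPM
MUPU
"""

names, rows = parse_wclique_input(SAMPLE)
print(names)
print(rows)
```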
For genetic mapping we use origins, from the BPE package, to infer the
grandparental origins of a set of markers in a three-generation
pedigree, then filter the labels file with a Perl script
(mwc-prep.perl). Here is how we typically use the programs:
% origins -p pedin.dat -d datain.dat -r /dev/null -l labels.dat
% mwc-prep labels.dat > wclique.in
% wclique wclique.in
For RH mapping we also have Perl scripts to convert radmap input files
into wclique format.
Given a .mat and a .l file from radmap, you can produce the input file
with a Perl script (rh-prep.perl). Here is how we typically use the
programs:
% rh-prep map.mat map.l > rhclique.in
% wclique -r rhclique.in
Output format
wclique produces a series of cliques on its standard output. Each is
a maximal weight clique at the time the partial enumeration algorithm
discovers it. The last clique printed is a maximal weight clique,
although there may be others before it of equal weight:
Minimum number of breaks = 1
Found clique of size 4 , weight 14
1 2 3 4
The first line states the minimum number of breaks allowed between any
two markers in the clique. Subsequent pairs of lines give the clique
size and weight, and list indices for the clique markers (i.e.,
"1 2 3" means the first, second, and third markers in the file form
part of a clique).
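For scripting, the output can be collected into a list of cliques. The Python sketch below is only an illustration built from the sample output shown here; the function name parse_wclique_output and the regular expression are assumptions, and real output may contain lines this sketch does not anticipate (it simply skips anything it does not recognize).

```python
import re

# Pattern inferred from the sample line "Found clique of size 4 , weight 14".
CLIQUE_RE = re.compile(r"Found clique of size\s+(\d+)\s*,\s*weight\s+(\d+)")


def parse_wclique_output(text):
    """Return (min_breaks, [(size, weight, indices), ...]) from wclique output."""
    min_breaks = None
    cliques = []
    lines = text.splitlines()
    i = 0
    while i < len(lines):
        line = lines[i].strip()
        if line.startswith("Minimum number of breaks"):
            min_breaks = int(line.split("=")[1])
        else:
            match = CLIQUE_RE.match(line)
            if match and i + 1 < len(lines):
                # The line after a "Found clique" line lists the marker indices.
                indices = [int(tok) for tok in lines[i + 1].split()]
                cliques.append((int(match.group(1)), int(match.group(2)), indices))
                i += 1  # skip the index line we just consumed
        i += 1
    return min_breaks, cliques


OUTPUT = """Minimum number of breaks = 1
Found clique of size 4 , weight 14
1 2 3 4
"""

min_breaks, cliques = parse_wclique_output(OUTPUT)
size, weight, indices = cliques[-1]   # the last clique printed is the final answer
print(min_breaks, size, weight, indices)
```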
The output is similar when run with the -r option:
Using retained weight function
Minimum number of breaks = 1
Found clique of size 4 , weight 7
1 2 3 4
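The default result above (size 4, weight 14) can be cross-checked by brute force on the sample input file. The Python sketch below rests on two assumptions that are consistent with the sample outputs in this document but are not stated by it: a "break" between two markers is a chromosome where both phases are known and differ, and a marker's informativity is its number of non-"U" entries. It enumerates every subset of markers, which is a plain exhaustive search and not wclique's partial enumeration algorithm. With -r the sample reports 7, which equals both the number of "P" entries and the number of "M" entries in this file, so the sample alone does not show which is counted; the sketch covers only the default weighting.

```python
from itertools import combinations

ROWS = ["PMUP", "UPPM", "PUUM", "MMPM", "MUPU"]  # sample chromosomes


def informativity(rows, j):
    """Assumed weight: number of chromosomes where marker j (0-based) is not 'U'."""
    return sum(1 for row in rows if row[j] != "U")


def count_breaks(rows, a, b):
    """Assumed definition: both phases known and different (a, b are 0-based)."""
    return sum(1 for row in rows
               if row[a] != "U" and row[b] != "U" and row[a] != row[b])


def best_clique(rows, min_breaks):
    """Exhaustively find a maximum weight set of pairwise compatible markers."""
    m = len(rows[0])
    weights = [informativity(rows, j) for j in range(m)]
    best = (0, [])
    for size in range(1, m + 1):
        for subset in combinations(range(m), size):
            ok = all(count_breaks(rows, a, b) >= min_breaks
                     for a, b in combinations(subset, 2))
            if ok:
                w = sum(weights[j] for j in subset)
                if w > best[0]:
                    best = (w, [j + 1 for j in subset])  # report 1-based indices
    return best


print(best_clique(ROWS, 1))
```

This is exponential in the number of markers and only practical for toy inputs like the sample.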
We have two other Perl tools. The first, getnames.perl, reads an input
file in wclique format and a list of clique marker indices, and
outputs the marker name for each marker in the list.
% cat test.set
1 2 3
% getnames wclique.in test.set
NAME1
NAME2
NAME3
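The lookup is simple enough to reproduce in other languages. Here is a minimal Python sketch of the same idea; it is not a port of getnames.perl, and the function name marker_names is hypothetical. It assumes the indices are 1-based and separated by whitespace, as in the example above.

```python
def marker_names(input_text, index_text):
    """Map 1-based marker indices to the names in a wclique input file."""
    lines = [ln.strip() for ln in input_text.splitlines() if ln.strip()]
    m = int(lines[0])          # first line is the marker count
    names = lines[1:1 + m]     # followed by M marker names
    return [names[int(tok) - 1] for tok in index_text.split()]


INPUT = "4\nNAME1\nNAME2\nNAME3\nNAME4\n5\nPMUP\nUPPM\nPUUM\nMMPM\nMUPU\n"

for name in marker_names(INPUT, "1 2 3"):
    print(name)
```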
The second, verify.perl, reads an input file, a list of marker
indices, and a minimum number of breaks, and checks that every pair of
markers in the list does in fact have the required number of breaks:
% verify wclique.in test.set 3
Verifying that all markers have at least 3 breaks.
Markers = 4
Chromosomes = 5
Marker 1 has only 1 breaks with marker 2.
Marker 1 has only 2 breaks with marker 3.
Marker 2 has only 1 breaks with marker 3.
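The counts in this report can be reproduced from the sample input file if a break between two markers is taken to be a chromosome where both phases are known ("P" or "M") and differ. That definition is an assumption inferred from the numbers above, not something this document states, and the Python sketch below is an illustration rather than a port of verify.perl.

```python
from itertools import combinations

ROWS = ["PMUP", "UPPM", "PUUM", "MMPM", "MUPU"]  # sample chromosomes


def count_breaks(rows, a, b):
    """Count breaks between markers a and b (1-based), assuming a break is
    a chromosome where both phases are known and different."""
    return sum(1 for row in rows
               if row[a - 1] != "U" and row[b - 1] != "U"
               and row[a - 1] != row[b - 1])


def verify(rows, indices, min_breaks):
    """Return (a, b, breaks) for every pair with fewer than min_breaks breaks."""
    problems = []
    for a, b in combinations(indices, 2):
        k = count_breaks(rows, a, b)
        if k < min_breaks:
            problems.append((a, b, k))
    return problems


for a, b, k in verify(ROWS, [1, 2, 3], 3):
    print(f"Marker {a} has only {k} breaks with marker {b}.")
```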
Notes
- We recommend using between 3 and 6 breaks minimum when building a
framework map for an entire chromosome.
- You can reasonably run wclique on up to about 100 markers
exhaustively (-e 100).
- You can run wclique -e 50 -i 100 (the defaults) for up
to 300 markers. Larger marker sets will require adjusting -e
and -i.
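These rules of thumb can be encoded in a small wrapper script. The Python sketch below is one possible reading of the notes, with a hypothetical function name (suggest_flags); the thresholds come from the notes above, and the choice of -i 1 for exhaustive runs follows the description of the -i option. For more than 300 markers this document gives no specific values, so the sketch refuses to guess.

```python
def suggest_flags(n_markers):
    """Suggest wclique flags for a given marker count, per the notes above."""
    if n_markers <= 100:
        # Exhaustive search; one random trial is enough in that case.
        return ["-e", str(n_markers), "-i", "1"]
    if n_markers <= 300:
        # The defaults (-e 50 -i 100) are said to work up to 300 markers.
        return []
    raise ValueError("more than 300 markers: tune -e and -i by hand")


print(["wclique"] + suggest_flags(80) + ["wclique.in"])
```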
Related Links
Humberto Ortiz Zuazaga
humberto@momo.uthscsa.edu
$Date: 1996/11/05 20:57:50 $