trans.anno
Class AnnotateRegionsWithGeneList

java.lang.Object
  extended by trans.anno.AnnotateRegionsWithGeneList

public class AnnotateRegionsWithGeneList
extends java.lang.Object

Annotates a picks file (chr, start, stop) with gene information for a particular list of CG names. Specific to the dmel (Drosophila melanogaster) release 4.0 GFF3 annotation.

See Also:
AnnotateRegions

Constructor Summary
AnnotateRegionsWithGeneList(java.lang.String[] args)
           
 
Method Summary
static void compareBindingRegionsVsGeneGrps(GeneGroup[] geneGroups, BindingRegion[] bindingRegions)
          Does a complete scan, could be optimized.
static int[] countGenes(BindingRegion[] br)
          Returns: the number of genes with one or more binding regions within the neighborhood.
static int countNumberBindingRegionsWithNeighbors(BindingRegion[] br)
           
static int countNumberNeighbors(BindingRegion[] br)
           
static java.util.ArrayList extractCGNames(java.util.ArrayList geneGroups)
          Extracts the names of each gene group returning an ArrayList of Strings.
 int findDistanceToATG(BindingRegion br, GeneGroup gp)
          Finds the distance to a conservative estimate of an ATG; returns 0 if the region overlaps.
 int findDistToClosestATG(BindingRegion br)
          Finds the distance to the closest ATG translation start site.
 int findDistToClosestATG(BindingRegion br, java.util.ArrayList geneGroups)
           
 int findDistToClosestTranscript(BindingRegion br)
          Finds the distance to the closest transcript start.
 int findDistToClosestTranscript(BindingRegion br, java.util.ArrayList geneGroups)
           
 int findDistToClosestTranscript(BindingRegion br, GeneGroup gp)
          Finds the distance to a conservative estimate of the start of the first exon; returns 0 if the region overlaps.
static void main(java.lang.String[] args)
           
static BindingRegion[] makeRandomBindingRegions(BindingRegion[] br, java.util.HashMap chromLengths, int sizeNeighborhood)
          For each binding region this will make another binding region from the same chromosome with the same length, yet at a random location.
static boolean overlap(java.util.ArrayList ints, int startRegion, int endRegion)
          Tests whether any startEnd int[] in the ArrayList of ints overlaps the region defined by startRegion and endRegion.
static BindingRegion[] parseIntervalFile(java.io.File intervalFile, int sizeNeighborhood)
          Attempts to fetch a serialized Interval[], then sorts and ranks the intervals by the median ratio of the sub window.
static BindingRegion[] parsePicksFile(java.io.File picksFile, int sizeNeighborhood)
           
 void printDistToClosestATGAndTranscript(BindingRegion[] br)
          Prints rank, chrom, start, stop, distance to closest ATG, to closest transcript start.
static void printDocs()
           
 void processArgs(java.lang.String[] args)
          Processes each argument and assigns the corresponding variables.
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

AnnotateRegionsWithGeneList

public AnnotateRegionsWithGeneList(java.lang.String[] args)
Method Detail

makeRandomBindingRegions

public static BindingRegion[] makeRandomBindingRegions(BindingRegion[] br,
                                                       java.util.HashMap chromLengths,
                                                       int sizeNeighborhood)
For each binding region this will make another binding region from the same chromosome with the same length, yet at a random location.
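
The randomization described above can be illustrated with a minimal sketch. It is not the class's actual implementation: `Region` is a hypothetical stand-in for `BindingRegion`, coordinates are assumed to be 0-based and inclusive, the `sizeNeighborhood` parameter is left out, and a `Random` is passed in so that runs can be reproduced.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Random;

class RandomRegionSketch {
    /** Hypothetical stand-in for BindingRegion: chromosome plus inclusive start/end. */
    static final class Region {
        final String chrom;
        final int start;
        final int end;
        Region(String chrom, int start, int end) {
            this.chrom = chrom;
            this.start = start;
            this.end = end;
        }
    }

    /** For each region, builds one of the same span on the same chromosome at a random start. */
    static Region[] randomize(Region[] regions, Map<String, Integer> chromLengths, Random rng) {
        Region[] out = new Region[regions.length];
        for (int i = 0; i < regions.length; i++) {
            Region r = regions[i];
            int span = r.end - r.start; // distance from start to end, preserved below
            Integer chromLength = chromLengths.get(r.chrom);
            if (chromLength == null || chromLength <= span) {
                throw new IllegalArgumentException("No usable length for " + r.chrom);
            }
            // Pick a start such that start + span still lies on the chromosome.
            int newStart = rng.nextInt(chromLength - span);
            out[i] = new Region(r.chrom, newStart, newStart + span);
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Integer> lengths = new HashMap<>();
        lengths.put("2L", 1_000_000); // illustrative length, not the real chromosome size
        Region[] picks = { new Region("2L", 5_000, 5_600) };
        Region shuffled = randomize(picks, lengths, new Random(42))[0];
        System.out.println(shuffled.chrom + "\t" + shuffled.start + "\t" + shuffled.end);
    }
}
```

Drawing the start from `chromLength - span` candidates keeps the randomized region from running off the end of the chromosome.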


printDistToClosestATGAndTranscript

public void printDistToClosestATGAndTranscript(BindingRegion[] br)
Prints rank, chrom, start, stop, distance to closest ATG, to closest transcript start.


findDistToClosestATG

public int findDistToClosestATG(BindingRegion br)
Finds the distance to the closest ATG translation start site.


findDistToClosestATG

public int findDistToClosestATG(BindingRegion br,
                                java.util.ArrayList geneGroups)

findDistToClosestTranscript

public int findDistToClosestTranscript(BindingRegion br)
Finds the distance to the closest transcript start.


findDistToClosestTranscript

public int findDistToClosestTranscript(BindingRegion br,
                                       java.util.ArrayList geneGroups)

findDistanceToATG

public int findDistanceToATG(BindingRegion br,
                             GeneGroup gp)
Finds the distance to a conservative estimate of an ATG; returns 0 if the region overlaps.
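
As a rough illustration of the "0 if it overlaps, otherwise the gap" behavior, the sketch below measures the distance from an inclusive region to a single coordinate and takes the minimum over several candidates. The names `distance` and `closest` are hypothetical, plain `int` coordinates replace `BindingRegion` and `GeneGroup`, and the sketch ignores strand and does not attempt the conservative ATG estimate the method refers to.

```java
class AtgDistanceSketch {
    /** Distance from the inclusive region [regionStart, regionEnd] to one position; 0 if inside. */
    static int distance(int regionStart, int regionEnd, int position) {
        if (position < regionStart) return regionStart - position; // position lies upstream of the region
        if (position > regionEnd) return position - regionEnd;     // position lies downstream of the region
        return 0;                                                  // position falls within the region
    }

    /** Smallest distance from the region to any of the candidate positions. */
    static int closest(int regionStart, int regionEnd, int[] positions) {
        int best = Integer.MAX_VALUE;
        for (int position : positions) {
            best = Math.min(best, distance(regionStart, regionEnd, position));
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(distance(1000, 1500, 1200));                     // prints 0
        System.out.println(distance(1000, 1500, 900));                      // prints 100
        System.out.println(closest(1000, 1500, new int[]{200, 1750, 4000})); // prints 250
    }
}
```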


findDistToClosestTranscript

public int findDistToClosestTranscript(BindingRegion br,
                                       GeneGroup gp)
Finds the distance to a conservative estimate of the start of the first exon; returns 0 if the region overlaps.


countNumberNeighbors

public static int countNumberNeighbors(BindingRegion[] br)

countNumberBindingRegionsWithNeighbors

public static int countNumberBindingRegionsWithNeighbors(BindingRegion[] br)

countGenes

public static int[] countGenes(BindingRegion[] br)
Tallies the following: the number of genes with one or more binding regions within the neighborhood; the number of genes where the binding region is on the 5' end and the number where it is on the 3' end of the respective gene; the number of genes that overlap a binding region on their 5' end and on their 3' end; the number of regions entirely contained by a gene; the number of binding regions with neighbors; the number of regions with no neighbors, as defined by the neighborhood; the number of regions in non-coding DNA; the number of regions in coding DNA; and the number of regions that overlap both coding and non-coding DNA.

Returns:
int[9] {num 5', num 3', overlap 5', overlap 3', contained, no neighbors, non coding, coding, overlap coding and non coding}
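
Because the result is a bare `int[9]`, callers have to remember which slot holds which count. The sketch below pairs each index with the label given in the Returns line; it assumes the array order matches that line exactly, and `describe` is a hypothetical helper, not part of this class.

```java
class CountGenesSketch {
    /** Labels copied from the Returns line, in the assumed index order 0..8. */
    static final String[] LABELS = {
        "num 5'", "num 3'", "overlap 5'", "overlap 3'", "contained",
        "no neighbors", "non coding", "coding", "overlap coding and non coding"
    };

    /** Formats the counts as one tab-separated "label, value" line per slot. */
    static String describe(int[] counts) {
        if (counts.length != LABELS.length) {
            throw new IllegalArgumentException("Expected " + LABELS.length + " counts");
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < counts.length; i++) {
            sb.append(LABELS[i]).append('\t').append(counts[i]).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Made-up numbers standing in for a countGenes(...) result.
        int[] counts = {12, 7, 3, 2, 5, 40, 31, 9, 4};
        System.out.print(describe(counts));
    }
}
```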

overlap

public static boolean overlap(java.util.ArrayList ints,
                              int startRegion,
                              int endRegion)
Tests whether any startEnd int[] in the ArrayList of ints overlaps the region defined by startRegion and endRegion. Assumes start is always <= end.
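
A minimal sketch of such an overlap test is shown here. It uses a typed `List<int[]>` in place of the raw `ArrayList` and treats both ends of every interval as inclusive; the inclusive convention is an assumption, since the description does not state it.

```java
import java.util.ArrayList;
import java.util.List;

class OverlapSketch {
    /**
     * Returns true if any {start, end} pair in ints overlaps [startRegion, endRegion].
     * Assumes start <= end for every pair and inclusive coordinates.
     */
    static boolean overlap(List<int[]> ints, int startRegion, int endRegion) {
        for (int[] startEnd : ints) {
            // Two closed intervals overlap unless one ends before the other begins.
            if (startEnd[0] <= endRegion && startEnd[1] >= startRegion) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<int[]> exons = new ArrayList<>();
        exons.add(new int[]{100, 200});
        exons.add(new int[]{500, 650});
        System.out.println(overlap(exons, 150, 300)); // prints true
        System.out.println(overlap(exons, 201, 499)); // prints false
    }
}
```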


extractCGNames

public static java.util.ArrayList extractCGNames(java.util.ArrayList geneGroups)
Extracts the names of each gene group returning an ArrayList of Strings.


compareBindingRegionsVsGeneGrps

public static void compareBindingRegionsVsGeneGrps(GeneGroup[] geneGroups,
                                                   BindingRegion[] bindingRegions)
Does a complete scan, could be optimized.


parsePicksFile

public static BindingRegion[] parsePicksFile(java.io.File picksFile,
                                             int sizeNeighborhood)

parseIntervalFile

public static BindingRegion[] parseIntervalFile(java.io.File intervalFile,
                                                int sizeNeighborhood)
Attempts to fetch a serialized Interval[], then sorts and ranks the intervals by the median ratio of the sub window. The ranked intervals are then used to build an array of BindingRegion. Returns null if an Interval[] cannot be fetched.


printDocs

public static void printDocs()

processArgs

public void processArgs(java.lang.String[] args)
Processes each argument and assigns the corresponding variables.


main

public static void main(java.lang.String[] args)