Workhorse function that calculates overlaps between userSets, and then uses a fisher's exact test rank them by significance of the overlap.
runLOLA(userSets, userUniverse, regionDB, minOverlap = 1, cores = 1, redefineUserSets = FALSE, direction = "enrichment")
userSets | Regions of interest |
---|---|
userUniverse | Regions tested for inclusion in userSets |
regionDB | Region DB to check for overlap, from loadRegionDB() |
minOverlap | (Default:1) Minimum bases required to count an overlap |
cores | Number of processors |
redefineUserSets | run redefineUserSets() on your userSets? |
direction | Defaults to "enrichment", but may also accept "depletion", which will swap the direction of the fisher test (use 'greater' or less' value passed to the 'alternative' option of fisher.test) |
Data.table with enrichment results. Rows correspond to individual pairwise fisher's tests comparing a single userSet with a single databaseSet. The columns in this data.table are: userSet and dbSet: index into their respective input region sets. pvalueLog: -log10(pvalue) from the fisher's exact result; oddsRatio: result from the fisher's exact test; support: number of regions in userSet overlapping databaseSet; rnkPV, rnkOR, rnkSup: rank in this table of p-value, oddsRatio, and Support respectively. The --value is the negative natural log of the p-value returned from a one-sided fisher's exact test. maxRnk, meanRnk: max and mean of the 3 previous ranks, providing a combined ranking system. b, c, d: 3 other values completing the 2x2 contingency table (with support). The remaining columns describe the dbSet for the row.
If you have the qvalue package installed from bioconductor, runLOLA will add a q-value transformation to provide FDR scores automatically.
#>#>#>#>#>#>data("sample_universe", package="LOLA") data("sample_input", package="LOLA") getRegionSet(regionDB, collections="ucsc_example", filenames="vistaEnhancers.bed")#> GRangesList object of length 1: #> [[1]] #> GRanges object with 1339 ranges and 0 metadata columns: #> seqnames ranges strand #> <Rle> <IRanges> <Rle> #> 1 chr1 [ 3190582, 3191428] * #> 2 chr1 [ 8130440, 8131887] * #> 3 chr1 [10593124, 10594209] * #> 4 chr1 [10732071, 10733118] * #> 5 chr1 [10757665, 10758631] * #> ... ... ... ... #> 1335 chrX [139380917, 139382199] * #> 1336 chrX [139593503, 139594774] * #> 1337 chrX [139674500, 139675403] * #> 1338 chrX [147829017, 147830159] * #> 1339 chrX [150407693, 150409052] * #> #> ------- #> seqinfo: 69 sequences from an unspecified genome; no seqlengths#>#>#>#>#>#>#> GRangesList object of length 1: #> [[1]] #> GRanges object with 1339 ranges and 0 metadata columns: #> seqnames ranges strand #> <Rle> <IRanges> <Rle> #> 1 chr1 [ 3190582, 3191428] * #> 2 chr1 [ 8130440, 8131887] * #> 3 chr1 [10593124, 10594209] * #> 4 chr1 [10732071, 10733118] * #> 5 chr1 [10757665, 10758631] * #> ... ... ... ... #> 1335 chrX [139380917, 139382199] * #> 1336 chrX [139593503, 139594774] * #> 1337 chrX [139674500, 139675403] * #> 1338 chrX [147829017, 147830159] * #> 1339 chrX [150407693, 150409052] * #> #> ------- #> seqinfo: 23 sequences from an unspecified genome; no seqlengths#>#>#>#>#> [1] "/sfs/lustre/scratch/ns5bc/code/LOLA/inst/extdata/hg19/ucsc_example/regions/vistaEnhancers.bed"res = runLOLA(userSets, userUniverse, regionDB, cores=1)#>#>#>#> GRanges object with 632 ranges and 0 metadata columns: #> seqnames ranges strand #> <Rle> <IRanges> <Rle> #> [1] chr1 [18229570, 19207602] * #> [2] chr1 [35350878, 35351854] * #> [3] chr1 [38065507, 38258622] * #> [4] chr1 [38499473, 39306315] * #> [5] chr1 [42611485, 42611691] * #> ... ... ... ... #> [628] chrX [125299245, 125300436] * #> [629] chrX [136032577, 138821238] * #> [630] chrX [139018365, 148549454] * #> [631] chrX [154066672, 154251301] * #> [632] chrY [ 2880166, 7112793] * #> ------- #> seqinfo: 69 sequences from an unspecified genome; no seqlengths#>userSetsRedefined = redefineUserSets(userSets, userUniverse) resRedefined = runLOLA(userSetsRedefined, userUniverse, regionDB, cores=1)#>#>#>