Workhorse function that calculates overlaps between userSets, and then uses a fisher's exact test rank them by significance of the overlap.

runLOLA(userSets, userUniverse, regionDB, minOverlap = 1, cores = 1,
  redefineUserSets = FALSE, direction = "enrichment")

Arguments

userSets

Regions of interest

userUniverse

Regions tested for inclusion in userSets

regionDB

Region DB to check for overlap, from loadRegionDB()

minOverlap

(Default:1) Minimum bases required to count an overlap

cores

Number of processors

redefineUserSets

run redefineUserSets() on your userSets?

direction

Defaults to "enrichment", but may also accept "depletion", which will swap the direction of the fisher test (use 'greater' or less' value passed to the 'alternative' option of fisher.test)

Value

Data.table with enrichment results. Rows correspond to individual pairwise fisher's tests comparing a single userSet with a single databaseSet. The columns in this data.table are: userSet and dbSet: index into their respective input region sets. pvalueLog: -log10(pvalue) from the fisher's exact result; oddsRatio: result from the fisher's exact test; support: number of regions in userSet overlapping databaseSet; rnkPV, rnkOR, rnkSup: rank in this table of p-value, oddsRatio, and Support respectively. The --value is the negative natural log of the p-value returned from a one-sided fisher's exact test. maxRnk, meanRnk: max and mean of the 3 previous ranks, providing a combined ranking system. b, c, d: 3 other values completing the 2x2 contingency table (with support). The remaining columns describe the dbSet for the row.

If you have the qvalue package installed from bioconductor, runLOLA will add a q-value transformation to provide FDR scores automatically.

Examples

dbPath = system.file("extdata", "hg19", package="LOLA") regionDB = loadRegionDB(dbLocation=dbPath)
#> Reading collection annotations:
#> ucsc_example: found collection annotation:/sfs/lustre/scratch/ns5bc/code/LOLA/inst/extdata/hg19/ucsc_example/collection.txt
#> Reading region annotations...
#> ::Loading cache:: /sfs/lustre/scratch/ns5bc/code/LOLA/inst/extdata/hg19/ucsc_example//ucsc_example_files.RData
#> ucsc_example
#> ::Loading cache:: /sfs/lustre/scratch/ns5bc/code/LOLA/inst/extdata/hg19/ucsc_example/ucsc_example.RData
data("sample_universe", package="LOLA") data("sample_input", package="LOLA") getRegionSet(regionDB, collections="ucsc_example", filenames="vistaEnhancers.bed")
#> GRangesList object of length 1: #> [[1]] #> GRanges object with 1339 ranges and 0 metadata columns: #> seqnames ranges strand #> <Rle> <IRanges> <Rle> #> 1 chr1 [ 3190582, 3191428] * #> 2 chr1 [ 8130440, 8131887] * #> 3 chr1 [10593124, 10594209] * #> 4 chr1 [10732071, 10733118] * #> 5 chr1 [10757665, 10758631] * #> ... ... ... ... #> 1335 chrX [139380917, 139382199] * #> 1336 chrX [139593503, 139594774] * #> 1337 chrX [139674500, 139675403] * #> 1338 chrX [147829017, 147830159] * #> 1339 chrX [150407693, 150409052] * #> #> ------- #> seqinfo: 69 sequences from an unspecified genome; no seqlengths
getRegionSet(dbPath, collections="ucsc_example", filenames="vistaEnhancers.bed")
#> Reading collection annotations: ucsc_example
#> ucsc_example: found collection annotation:/sfs/lustre/scratch/ns5bc/code/LOLA/inst/extdata/hg19/ucsc_example/collection.txt
#> Reading region annotations...
#> ::Loading cache:: /sfs/lustre/scratch/ns5bc/code/LOLA/inst/extdata/hg19/ucsc_example//ucsc_example_files.RData
#> Reading 1 files...
#> 1: /sfs/lustre/scratch/ns5bc/code/LOLA/inst/extdata/hg19/ucsc_example/regions/vistaEnhancers.bed
#> GRangesList object of length 1: #> [[1]] #> GRanges object with 1339 ranges and 0 metadata columns: #> seqnames ranges strand #> <Rle> <IRanges> <Rle> #> 1 chr1 [ 3190582, 3191428] * #> 2 chr1 [ 8130440, 8131887] * #> 3 chr1 [10593124, 10594209] * #> 4 chr1 [10732071, 10733118] * #> 5 chr1 [10757665, 10758631] * #> ... ... ... ... #> 1335 chrX [139380917, 139382199] * #> 1336 chrX [139593503, 139594774] * #> 1337 chrX [139674500, 139675403] * #> 1338 chrX [147829017, 147830159] * #> 1339 chrX [150407693, 150409052] * #> #> ------- #> seqinfo: 23 sequences from an unspecified genome; no seqlengths
getRegionFile(dbPath, collections="ucsc_example", filenames="vistaEnhancers.bed")
#> Reading collection annotations: ucsc_example
#> ucsc_example: found collection annotation:/sfs/lustre/scratch/ns5bc/code/LOLA/inst/extdata/hg19/ucsc_example/collection.txt
#> Reading region annotations...
#> ::Loading cache:: /sfs/lustre/scratch/ns5bc/code/LOLA/inst/extdata/hg19/ucsc_example//ucsc_example_files.RData
#> [1] "/sfs/lustre/scratch/ns5bc/code/LOLA/inst/extdata/hg19/ucsc_example/regions/vistaEnhancers.bed"
res = runLOLA(userSets, userUniverse, regionDB, cores=1)
#> Calculating unit set overlaps...
#> Calculating universe set overlaps...
#> Calculating Fisher scores...
locResult = res[2,] extractEnrichmentOverlaps(locResult, userSets, regionDB)
#> GRanges object with 632 ranges and 0 metadata columns: #> seqnames ranges strand #> <Rle> <IRanges> <Rle> #> [1] chr1 [18229570, 19207602] * #> [2] chr1 [35350878, 35351854] * #> [3] chr1 [38065507, 38258622] * #> [4] chr1 [38499473, 39306315] * #> [5] chr1 [42611485, 42611691] * #> ... ... ... ... #> [628] chrX [125299245, 125300436] * #> [629] chrX [136032577, 138821238] * #> [630] chrX [139018365, 148549454] * #> [631] chrX [154066672, 154251301] * #> [632] chrY [ 2880166, 7112793] * #> ------- #> seqinfo: 69 sequences from an unspecified genome; no seqlengths
writeCombinedEnrichment(locResult, "temp_outfolder")
#> Overwriting temp_outfolder/allEnrichments.tsv...
userSetsRedefined = redefineUserSets(userSets, userUniverse) resRedefined = runLOLA(userSetsRedefined, userUniverse, regionDB, cores=1)
#> Calculating unit set overlaps...
#> Calculating universe set overlaps...
#> Calculating Fisher scores...
g = plotTopLOLAEnrichments(resRedefined)