This function will take the user sets, overlap with the universe, and redefine the user sets as the set of regions in the user universe that overlap at least one region in user sets. this makes for a more appropriate statistical enrichment comparison, as the user sets are actually exactly the same regions found in the universe otherwise, you can get some weird artifacts from the many-to-many relationship between user set regions and universe regions.

redefineUserSets(userSets, userUniverse, cores = 1)

Arguments

userSets

Regions of interest

userUniverse

Regions tested for inclusion in userSets

cores

Number of processors

Value

userSets redefined in terms of userUniverse

Examples

dbPath = system.file("extdata", "hg19", package="LOLA") regionDB = loadRegionDB(dbLocation=dbPath)
#> Reading collection annotations:
#> ucsc_example: found collection annotation:/sfs/lustre/scratch/ns5bc/code/LOLA/inst/extdata/hg19/ucsc_example/collection.txt
#> Reading region annotations...
#> ::Loading cache:: /sfs/lustre/scratch/ns5bc/code/LOLA/inst/extdata/hg19/ucsc_example//ucsc_example_files.RData
#> ucsc_example
#> ::Loading cache:: /sfs/lustre/scratch/ns5bc/code/LOLA/inst/extdata/hg19/ucsc_example/ucsc_example.RData
data("sample_universe", package="LOLA") data("sample_input", package="LOLA") getRegionSet(regionDB, collections="ucsc_example", filenames="vistaEnhancers.bed")
#> GRangesList object of length 1: #> [[1]] #> GRanges object with 1339 ranges and 0 metadata columns: #> seqnames ranges strand #> <Rle> <IRanges> <Rle> #> 1 chr1 [ 3190582, 3191428] * #> 2 chr1 [ 8130440, 8131887] * #> 3 chr1 [10593124, 10594209] * #> 4 chr1 [10732071, 10733118] * #> 5 chr1 [10757665, 10758631] * #> ... ... ... ... #> 1335 chrX [139380917, 139382199] * #> 1336 chrX [139593503, 139594774] * #> 1337 chrX [139674500, 139675403] * #> 1338 chrX [147829017, 147830159] * #> 1339 chrX [150407693, 150409052] * #> #> ------- #> seqinfo: 69 sequences from an unspecified genome; no seqlengths
getRegionSet(dbPath, collections="ucsc_example", filenames="vistaEnhancers.bed")
#> Reading collection annotations: ucsc_example
#> ucsc_example: found collection annotation:/sfs/lustre/scratch/ns5bc/code/LOLA/inst/extdata/hg19/ucsc_example/collection.txt
#> Reading region annotations...
#> ::Loading cache:: /sfs/lustre/scratch/ns5bc/code/LOLA/inst/extdata/hg19/ucsc_example//ucsc_example_files.RData
#> Reading 1 files...
#> 1: /sfs/lustre/scratch/ns5bc/code/LOLA/inst/extdata/hg19/ucsc_example/regions/vistaEnhancers.bed
#> GRangesList object of length 1: #> [[1]] #> GRanges object with 1339 ranges and 0 metadata columns: #> seqnames ranges strand #> <Rle> <IRanges> <Rle> #> 1 chr1 [ 3190582, 3191428] * #> 2 chr1 [ 8130440, 8131887] * #> 3 chr1 [10593124, 10594209] * #> 4 chr1 [10732071, 10733118] * #> 5 chr1 [10757665, 10758631] * #> ... ... ... ... #> 1335 chrX [139380917, 139382199] * #> 1336 chrX [139593503, 139594774] * #> 1337 chrX [139674500, 139675403] * #> 1338 chrX [147829017, 147830159] * #> 1339 chrX [150407693, 150409052] * #> #> ------- #> seqinfo: 23 sequences from an unspecified genome; no seqlengths
getRegionFile(dbPath, collections="ucsc_example", filenames="vistaEnhancers.bed")
#> Reading collection annotations: ucsc_example
#> ucsc_example: found collection annotation:/sfs/lustre/scratch/ns5bc/code/LOLA/inst/extdata/hg19/ucsc_example/collection.txt
#> Reading region annotations...
#> ::Loading cache:: /sfs/lustre/scratch/ns5bc/code/LOLA/inst/extdata/hg19/ucsc_example//ucsc_example_files.RData
#> [1] "/sfs/lustre/scratch/ns5bc/code/LOLA/inst/extdata/hg19/ucsc_example/regions/vistaEnhancers.bed"
res = runLOLA(userSets, userUniverse, regionDB, cores=1)
#> Calculating unit set overlaps...
#> Calculating universe set overlaps...
#> Calculating Fisher scores...
locResult = res[2,] extractEnrichmentOverlaps(locResult, userSets, regionDB)
#> GRanges object with 632 ranges and 0 metadata columns: #> seqnames ranges strand #> <Rle> <IRanges> <Rle> #> [1] chr1 [18229570, 19207602] * #> [2] chr1 [35350878, 35351854] * #> [3] chr1 [38065507, 38258622] * #> [4] chr1 [38499473, 39306315] * #> [5] chr1 [42611485, 42611691] * #> ... ... ... ... #> [628] chrX [125299245, 125300436] * #> [629] chrX [136032577, 138821238] * #> [630] chrX [139018365, 148549454] * #> [631] chrX [154066672, 154251301] * #> [632] chrY [ 2880166, 7112793] * #> ------- #> seqinfo: 69 sequences from an unspecified genome; no seqlengths
writeCombinedEnrichment(locResult, "temp_outfolder")
#> Overwriting temp_outfolder/allEnrichments.tsv...
userSetsRedefined = redefineUserSets(userSets, userUniverse) resRedefined = runLOLA(userSetsRedefined, userUniverse, regionDB, cores=1)
#> Calculating unit set overlaps...
#> Calculating universe set overlaps...
#> Calculating Fisher scores...
g = plotTopLOLAEnrichments(resRedefined)