This vignette assumes you’re familiar with the “Getting started with BiocProject” vignette for the basic BiocProject information and the “An introduction to simpleCache” vignette for the basic simpleCache information.
simpleCache with BiocProject
For a large project, it can take substantial computational effort to run the initial data loading function that will load your data into R. We’d like to cache that result so that it doesn’t have to be reprocessed every time we want to load our project metadata and data. Pairing simpleCache with BiocProject allows us to do just that. This means that if your custom data processing function loads or processes large data sets that take a long time, the R object will not be recalculated, but simply reloaded.
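Briefly, the simpleCache logic is as follows: if the named object already exists in the R environment, nothing is recomputed; otherwise, if a cache file for it exists in the cache directory, the object is loaded from that file; otherwise, the code is run, and its result is assigned to the object and saved to the cache directory.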
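As a rough illustration of that logic, here is a minimal base-R sketch. It is not simpleCache's actual implementation: the helper `cache_or_compute()`, its arguments, and its return values are hypothetical names chosen for this example. It checks for the object in an environment first, then for a cache file on disk, and only runs the computation if neither exists.

```r
# Hypothetical helper illustrating a three-branch caching scheme.
# This is an illustrative sketch, NOT the simpleCache source code.
cache_or_compute <- function(name, compute, cache_dir, envir) {
  cache_file <- file.path(cache_dir, paste0(name, ".RData"))

  # Branch 1: the object is already in the environment, so do nothing.
  if (exists(name, envir = envir, inherits = FALSE)) {
    return("exists")
  }

  # Branch 2: a cache file is on disk, so load the object from it.
  if (file.exists(cache_file)) {
    load(cache_file, envir = envir)
    return("loaded")
  }

  # Branch 3: nothing cached yet, so compute, assign, and save to disk.
  assign(name, compute(), envir = envir)
  save(list = name, file = cache_file, envir = envir)
  "created"
}

# Toy usage with a fresh cache directory and a private environment.
demo_dir <- tempfile("cache_sketch_")
dir.create(demo_dir)
demo_env <- new.env()
slow_step <- function() sum(1:10)  # stands in for an expensive load

first  <- cache_or_compute("toyData", slow_step, demo_dir, demo_env)
second <- cache_or_compute("toyData", slow_step, demo_dir, demo_env)
rm("toyData", envir = demo_env)    # simulate starting a new session
third  <- cache_or_compute("toyData", slow_step, demo_dir, demo_env)
```

In this sketch the three calls take the "created", "exists", and "loaded" branches in turn, which mirrors the three situations walked through in the rest of this vignette.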
Load the libraries and set the example project config file path:
library(simpleCache)
library(BiocProject)
projectConfig = system.file(
  "extdata",
  "example_peps-master",
  "example_BiocProject",
  "project_config.yaml",
  package = "BiocProject"
)
Set the cache directory and read the data in:
setCacheDir(tempdir())
simpleCache("dataSet1", {
  BiocProject(file = projectConfig)
})
#> ::Creating cache:: /tmp/RtmpaXwULV/dataSet1.RData
#> Loading config file: /tmp/Rtmpp7Kvae/temp_libpath658d0e8e5/BiocProject/extdata/example_peps-master/example_BiocProject/project_config.yaml
#> Function 'readBedFiles' read from file '/tmp/Rtmpp7Kvae/temp_libpath658d0e8e5/BiocProject/extdata/example_peps-master/example_BiocProject/readBedFiles.R'
This loads your PEP and its data with BiocProject, then caches the result with simpleCache. If you rerun this line of code, simpleCache skips the computation because the dataSet1 object is already present in memory:
simpleCache("dataSet1", {
  BiocProject(file = projectConfig)
})
#> ::Object exists (in .GlobalEnv):: dataSet1
Suppose you return to your analysis later and the dataSet1 object is no longer in memory (simulated here by removing it with rm()). simpleCache then loads the object from the directory you specified with setCacheDir():
rm(dataSet1)
simpleCache("dataSet1", {
  BiocProject(file = projectConfig)
})
#> ::Loading cache:: /tmp/RtmpaXwULV/dataSet1.RData
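One situation the steps above do not cover is a stale cache, for example after the project config or the data processing function has changed. As a hedged sketch, simpleCache() accepts a `recreate` argument that forces the cache to be rebuilt. The cache name `demoCache` and the `runif(1)` instruction below are placeholders for illustration and are not part of the BiocProject example; the `requireNamespace()` guard is only there so the snippet is skipped when simpleCache is not installed.

```r
# Assumption: simpleCache is installed; the guard skips the demo otherwise.
if (requireNamespace("simpleCache", quietly = TRUE)) {
  library(simpleCache)
  setCacheDir(tempdir())

  # First call: nothing is cached yet, so the instruction runs.
  simpleCache("demoCache", { runif(1) })

  # Second call: demoCache is already in the workspace, so nothing reruns.
  simpleCache("demoCache", { runif(1) })

  # recreate = TRUE asks simpleCache to rebuild the cache regardless.
  simpleCache("demoCache", { runif(1) }, recreate = TRUE)
}
```

The same argument could be added to the `simpleCache("dataSet1", ...)` call whenever the cached project object needs to be refreshed.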
And that’s it! In the simplest case, this is all you need to organize, read, and process your data while avoiding costly recalculation of results every time you come back to your project.