Subprojects

The PEP that this example is based on is available in the example_peps repository, in the example_subprojects1 folder.

The example below demonstrates how and why to use the subprojects functionality to define numerous similar projects in a single project config file. This functionality is extremely convenient when one has to define projects with small differences in settings, such as different attributes in the annotation sheet: for example, libraries ABCD and EFGH instead of the original RRBS.

Import the peppy library:

import peppy

Code

Read in the project metadata by specifying the path to project_config.yaml:

p_subproj = peppy.Project("../examples/example_peps-master/example_subprojects1/project_config.yaml")
No local config file was provided
Found global config file in DIVCFG: /Users/mstolarczyk/Uczelnia/UVA/code/pepenv/uva_rivanna.yaml
Loading divvy config file: /Users/mstolarczyk/Uczelnia/UVA/code/pepenv/uva_rivanna.yaml
Use 'compute_packages' instead of 'compute'
Available packages: set(['singularity_local', 'default', 'largemem', 'singularity_slurm', 'sigterm', 'local', 'parallel'])
Activating compute package 'default'

Let's inspect the sample annotation sheet, and then check whether any subprojects are defined in project_config.yaml:

p_subproj.sheet
sample_name library organism time file_path
0 pig_0h RRBS pig 0 source1
1 pig_1h RRBS pig 1 source1
2 frog_0h RRBS frog 0 source1
3 frog_1h RRBS frog 1 source1
p_subproj.subprojects
{'newLib2': {'metadata': {'sample_annotation': 'sample_annotation_newLib2.csv'}}, 'newLib': {'metadata': {'sample_annotation': 'sample_annotation_newLib.csv'}}}

As you can see, there are two subprojects available: newLib and newLib2. Nonetheless, only the main project is "active".
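
To make the structure of that dictionary easier to read, the minimal sketch below maps each subproject name to the annotation sheet it substitutes. It works on a plain copy of the dictionary printed above and does not call peppy; the helper name annotation_overrides is hypothetical and introduced only for illustration.

```python
# Dictionary copied from the p_subproj.subprojects output shown above.
subprojects = {
    "newLib2": {"metadata": {"sample_annotation": "sample_annotation_newLib2.csv"}},
    "newLib": {"metadata": {"sample_annotation": "sample_annotation_newLib.csv"}},
}


def annotation_overrides(subprojects):
    """Map each subproject name to the annotation sheet it points to.

    Hypothetical helper: a subproject that does not override the sheet maps to None.
    """
    return {
        name: settings.get("metadata", {}).get("sample_annotation")
        for name, settings in subprojects.items()
    }


overrides = annotation_overrides(subprojects)
for name in sorted(overrides):
    print(f"{name}: {overrides[name]}")
```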

Each subproject can be activated with the following command:

sp = p_subproj.activate_subproject("newLib")
sp2 = p_subproj.activate_subproject("newLib2")
No local config file was provided
Found global config file in DIVCFG: /Users/mstolarczyk/Uczelnia/UVA/code/pepenv/uva_rivanna.yaml
Loading divvy config file: /Users/mstolarczyk/Uczelnia/UVA/code/pepenv/uva_rivanna.yaml
Use 'compute_packages' instead of 'compute'
Available packages: set(['singularity_local', 'default', 'largemem', 'singularity_slurm', 'sigterm', 'local', 'parallel'])
Activating compute package 'default'
No local config file was provided
Found global config file in DIVCFG: /Users/mstolarczyk/Uczelnia/UVA/code/pepenv/uva_rivanna.yaml
Loading divvy config file: /Users/mstolarczyk/Uczelnia/UVA/code/pepenv/uva_rivanna.yaml
Use 'compute_packages' instead of 'compute'
Available packages: set(['singularity_local', 'default', 'largemem', 'singularity_slurm', 'sigterm', 'local', 'parallel'])
Activating compute package 'default'

Let's inspect the sample annotation sheet when the newLib2 subproject is active.

sp2.sheet
sample_name library organism time file_path
0 pig_0h EFGH pig 0 source1
1 pig_1h EFGH pig 1 source1
2 frog_0h EFGH frog 0 source1
3 frog_1h EFGH frog 1 source1

The PEP

The library attribute in each sample has changed from RRBS to EFGH. This behavior is specified in project_config.yaml, where the newLib2 subproject points to a different annotation sheet, sample_annotation_newLib2.csv, in which the library attribute is changed.

with open("../examples/example_peps-master/example_subprojects1/project_config.yaml") as f:
    print(f.read())
metadata:
    sample_annotation: sample_annotation.csv
    output_dir: $HOME/hello_looper_results

derived_attributes: [file_path]
data_sources:
    source1: /data/lab/project/{organism}_{time}h.fastq
    source2: /path/from/collaborator/weirdNamingScheme_{external_id}.fastq

subprojects:
    newLib:
        metadata:
            sample_annotation: sample_annotation_newLib.csv
    newLib2:
        metadata:
            sample_annotation: sample_annotation_newLib2.csv
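
One way to picture what activating a subproject does is as an overlay of the subproject's entries on top of the main settings. The sketch below illustrates that idea with plain dictionaries that mirror the config printed above. It is an assumption-laden illustration, not peppy's actual implementation: apply_subproject and _overlay are hypothetical names, and the real library may handle additional sections differently.

```python
import copy

# Plain-dictionary mirror of the relevant parts of project_config.yaml.
config = {
    "metadata": {
        "sample_annotation": "sample_annotation.csv",
        "output_dir": "$HOME/hello_looper_results",
    },
    "subprojects": {
        "newLib": {"metadata": {"sample_annotation": "sample_annotation_newLib.csv"}},
        "newLib2": {"metadata": {"sample_annotation": "sample_annotation_newLib2.csv"}},
    },
}


def _overlay(target, overrides):
    """Recursively copy entries from overrides into target (hypothetical helper)."""
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(target.get(key), dict):
            _overlay(target[key], value)
        else:
            target[key] = copy.deepcopy(value)


def apply_subproject(config, name):
    """Return a new config with the named subproject overlaid on the main settings."""
    subprojects = config.get("subprojects", {})
    if name not in subprojects:
        raise KeyError(f"Unknown subproject: {name}")
    merged = copy.deepcopy(config)  # leave the original config untouched
    _overlay(merged, subprojects[name])
    return merged


active = apply_subproject(config, "newLib2")
print(active["metadata"]["sample_annotation"])  # sample_annotation_newLib2.csv
print(active["metadata"]["output_dir"])         # $HOME/hello_looper_results
```

Only the keys named in the subproject are replaced; settings the subproject does not mention, such as output_dir, carry over from the main project.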



with open("../examples/example_peps-master/example_subprojects1/sample_annotation_newLib2.csv") as f:
    print(f.read())
sample_name,library,organism,time,file_path
pig_0h,EFGH,pig,0,source1
pig_1h,EFGH,pig,1,source1
frog_0h,EFGH,frog,0,source1
frog_1h,EFGH,frog,1,source1
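
To check which columns actually differ between the main annotation sheet and the newLib2 one, a small comparison with the standard csv module can help. In the sketch below, the newLib2 text is copied from the file printed above, while the main sheet text is an assumption reconstructed from the p_subproj.sheet output shown earlier; the helper names read_rows and changed_columns are hypothetical, and the comparison assumes both sheets list the same samples in the same order.

```python
import csv
import io

# Assumed content of the main sheet, reconstructed from the sheet output shown earlier.
BASE_CSV = """sample_name,library,organism,time,file_path
pig_0h,RRBS,pig,0,source1
pig_1h,RRBS,pig,1,source1
frog_0h,RRBS,frog,0,source1
frog_1h,RRBS,frog,1,source1
"""

# Content of sample_annotation_newLib2.csv as printed above.
NEWLIB2_CSV = """sample_name,library,organism,time,file_path
pig_0h,EFGH,pig,0,source1
pig_1h,EFGH,pig,1,source1
frog_0h,EFGH,frog,0,source1
frog_1h,EFGH,frog,1,source1
"""


def read_rows(text):
    """Parse CSV text into a list of dictionaries keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def changed_columns(base_rows, other_rows):
    """Return the set of column names whose value differs in at least one row."""
    changed = set()
    for base, other in zip(base_rows, other_rows):
        for column in base:
            if base[column] != other.get(column):
                changed.add(column)
    return changed


changed = changed_columns(read_rows(BASE_CSV), read_rows(NEWLIB2_CSV))
print(sorted(changed))  # ['library']
```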