API Reference
For the sake of brevity, we highlight only the key parts of the mrQA API.
The most important method is mrQA.project.check_compliance.
Here is a summarized reference of commonly used methods.
mrQA.project module
- mrQA.project.check_compliance(dataset: BaseDataset, strategy: str = 'majority', decimals: int = 3, output_dir: Optional[Union[Path, str]] = None, verbose: bool = False)
Main function for checking compliance. Infers the reference protocol according to the user-chosen strategy, and then generates a compliance report.
- Parameters
dataset (BaseDataset) – BaseDataset instance for the dataset to be checked for compliance
strategy (str) – Strategy employed to specify or automatically infer the reference protocol. Allowed options: ‘majority’
output_dir (Union[Path, str]) – Path to save the report
decimals (int) – Number of decimal places to round to (default: 3).
verbose (bool) – If True, print more detailed output
- Returns
report_path – Path to the generated report
- Return type
Path
- Raises
ValueError – If the input dataset is empty or otherwise invalid
NotImplementedError – If the input strategy is not supported
NotADirectoryError – If the output directory doesn’t exist
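The documented exceptions can be anticipated with a small pre-flight check before a long-running compliance run. The following is only a sketch under stated assumptions: `validate_arguments` and `SUPPORTED_STRATEGIES` are hypothetical names that are not part of mrQA, and the checks merely mirror the Raises section above rather than reproduce the library's internal logic.

```python
from pathlib import Path
from typing import Optional, Union

# Assumption: 'majority' is the only strategy listed in this reference.
SUPPORTED_STRATEGIES = ("majority",)


def validate_arguments(strategy: str = "majority",
                       output_dir: Optional[Union[Path, str]] = None
                       ) -> Optional[Path]:
    """Hypothetical helper (not part of mrQA) mirroring the Raises section."""
    if strategy not in SUPPORTED_STRATEGIES:
        # Same exception type the reference documents for unsupported strategies
        raise NotImplementedError(f"Unsupported strategy: {strategy!r}")
    if output_dir is None:
        # output_dir is optional in the documented signature
        return None
    output_dir = Path(output_dir)
    if not output_dir.is_dir():
        # Same exception type the reference documents for a missing directory
        raise NotADirectoryError(f"Output directory not found: {output_dir}")
    return output_dir
```

Calling such a helper first surfaces a mistyped strategy or a missing report directory immediately, before any dataset is read.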
- mrQA.project.compare_with_majority(dataset: BaseDataset, decimals: int = 3) → BaseDataset
Method for post-acquisition compliance. Infers the reference protocol/values by looking for the most frequent values, and then identifies deviations from them.
- Parameters
dataset (BaseDataset) – BaseDataset instance for the dataset which is to be checked for compliance
decimals (int) – Number of decimal places to round to (default: 3).
- Returns
dataset – The same BaseDataset instance, with the non-compliance information added to it.
- Return type
BaseDataset
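The majority strategy described above can be illustrated with a small standalone sketch. It does not use mrQA or BaseDataset: scans are represented as plain dictionaries, and `infer_reference` and `find_deviations` are hypothetical names introduced only for this illustration. It assumes that every scan reports the same set of parameters.

```python
from collections import Counter
from typing import Dict, List, Tuple

# Hypothetical stand-in for a scan: parameter name -> acquired value.
Scan = Dict[str, float]


def infer_reference(scans: List[Scan], decimals: int = 3) -> Scan:
    """Pick the most frequent rounded value of each parameter."""
    reference = {}
    for name in scans[0]:
        counts = Counter(round(scan[name], decimals) for scan in scans)
        # most_common(1) returns [(value, count)] for the top value
        reference[name] = counts.most_common(1)[0][0]
    return reference


def find_deviations(scans: List[Scan],
                    decimals: int = 3) -> List[Tuple[int, str, float]]:
    """List (scan index, parameter, value) entries that differ from the majority."""
    reference = infer_reference(scans, decimals)
    deviations = []
    for index, scan in enumerate(scans):
        for name, value in scan.items():
            if round(value, decimals) != reference[name]:
                deviations.append((index, name, value))
    return deviations


scans = [
    {"EchoTime": 0.0301, "RepetitionTime": 2.0},
    {"EchoTime": 0.03014, "RepetitionTime": 2.0},
    {"EchoTime": 0.035, "RepetitionTime": 2.0},
]
print(infer_reference(scans))   # {'EchoTime': 0.03, 'RepetitionTime': 2.0}
print(find_deviations(scans))   # [(2, 'EchoTime', 0.035)]
```

Rounding before counting is what the decimals parameter controls in this sketch: the first two echo times differ only beyond the third decimal place, so they count as the same value and form the majority.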
- mrQA.project.generate_report(dataset: BaseDataset, report_path: str, sub_lists_dir_path: str, output_dir: Union[Path, str]) → Path
Generates an HTML report aggregating and summarizing the non-compliance discovered in the dataset.
- Parameters
dataset (BaseDataset) – BaseDataset instance for the dataset which is to be checked
report_path (str) – Name of the file to be generated, without extension. Ensures that naming is consistent across the report, dataset and record files
sub_lists_dir_path (str) – Path to the directory in which the subject lists should be stored
output_dir (Union[Path, str]) – Directory in which the generated report should be stored.
- Returns
output_path – Path to the generated report
- Return type
Path
mrQA.run_parallel module
This module contains functions to run the compliance checks in parallel.
- mrQA.run_parallel.create_script(data_source: Optional[Union[str, Path, Iterable]] = None, ds_format: str = 'dicom', include_phantom: bool = False, verbose: bool = False, output_dir: Optional[Union[Path, str]] = None, debug: bool = False, subjects_per_job: Optional[int] = None, hpc: bool = False, conda_dist: Optional[str] = None, conda_env: Optional[str] = None)
Given a folder (or a list of folders), divides the work into smaller jobs, each containing a fixed number of subjects. These jobs can be executed in parallel to save time.
- Parameters
data_source (str or List[str]) – Path (or list of paths) to the dataset containing the files, e.g. /path/to/my/dataset
ds_format (str) – Dataset type. Use one of [dicom]
include_phantom (bool) – Include phantom scans in the dataset
verbose (bool) – Print progress
output_dir (str) – Path to save the output dataset
debug (bool) – If True, the dataset will be created locally. This is useful for testing
subjects_per_job (int) – Number of subjects per job. Recommended value is 50 or 100
hpc (bool) – If True, the scripts will be generated for HPC, not for local execution
conda_dist (str) – Name of conda distribution
conda_env (str) – Name of conda environment
- mrQA.run_parallel.process_parallel(data_source: Union[str, Path], output_dir: Union[str, Path], out_mrds_path: Union[str, Path], name: Optional[str] = None, subjects_per_job: int = 5, conda_env: str = 'mrcheck', conda_dist: str = 'anaconda3', hpc: bool = False)
Given a folder containing the subject folders, divides the work into smaller jobs, each containing a fixed number of subjects. These jobs can be executed in parallel to save time.
- Parameters
data_source (str or Path) – Path to the folder containing the subject folders
output_dir (str or Path) – Path to the folder where the output will be saved
out_mrds_path (str or Path) – Path to the final output mrds file
name (str) – Name of the final output file
subjects_per_job (int) – Number of subjects to be processed in each job
conda_env (str) – Name of the conda environment to be used
conda_dist (str) – Name of the conda distribution to be used
hpc (bool) – Whether to use HPC or not
- mrQA.run_parallel.split_ids_list(data_source: Union[str, Path], all_ids_path: Union[str, Path], per_batch_ids: Union[str, Path], output_dir: Union[str, Path], subjects_per_job: int = 50)
Splits a given set of subjects into multiple jobs and creates separate text files containing the list of subjects. Each text file contains the list of subjects to be processed in a single job.
- Parameters
data_source (Union[str, Path]) – Path to the root directory of the data
all_ids_path (Union[str, Path]) – Path to the output directory
per_batch_ids (Union[str, Path]) – Path to a file listing the paths of the txt files for all jobs. Each of these txt files contains the list of subject ids for the corresponding job.
output_dir (Union[str, Path]) – Name of the output directory
subjects_per_job (int) – Number of subjects to process in each job
- Returns
batch_ids_path_list – Paths to the text files, each containing a list of subjects
- Return type
list
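The batching described for split_ids_list can be sketched in a few lines of standalone Python. This is an illustration of the idea only, not mrQA's implementation: `split_into_batches` is a hypothetical name, the subject ids are passed in as a list instead of being discovered from data_source, and the `batch_0000.txt` file naming is an assumption.

```python
from pathlib import Path
from typing import List, Sequence, Union


def split_into_batches(subject_ids: Sequence[str],
                       output_dir: Union[str, Path],
                       subjects_per_job: int = 50) -> List[Path]:
    """Write one text file per job, each listing that job's subject ids."""
    if subjects_per_job < 1:
        raise ValueError("subjects_per_job must be a positive integer")
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)

    batch_paths = []
    for start in range(0, len(subject_ids), subjects_per_job):
        batch = subject_ids[start:start + subjects_per_job]
        # Hypothetical naming scheme: batch_0000.txt, batch_0001.txt, ...
        batch_path = output_dir / f"batch_{len(batch_paths):04d}.txt"
        batch_path.write_text("\n".join(batch) + "\n")
        batch_paths.append(batch_path)
    return batch_paths
```

With 7 subjects and subjects_per_job=3, this sketch produces three files holding 3, 3 and 1 subject ids respectively; the last job is simply smaller than the others.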
- mrQA.run_parallel.submit_job(scripts_list_filepath: Union[str, Path], mrds_list_filepath: Union[str, Path], hpc: bool = False) → None
Executes the bash scripts listed in scripts_list_filepath, each of which creates one of the partial mrds files listed in mrds_list_filepath. Because each script covers a fixed number of subjects, these jobs can be executed in parallel to save time.
- Parameters
scripts_list_filepath (str) – Path to the file containing list of bash scripts to be executed
mrds_list_filepath (str) – Path to the file containing list of partial mrds files to be created
hpc (bool) – If True, the scripts will be generated for HPC, not for local execution
- Return type
None