API Reference

For the sake of brevity, we highlight only the key parts of the mrQA API.

The most important method is mrQA.project.check_compliance. Below is a summarized reference of commonly used methods.

mrQA.project module

mrQA.project.check_compliance(dataset: BaseDataset, strategy: str = 'majority', decimals: int = 3, output_dir: Optional[Union[Path, str]] = None, verbose: bool = False)[source]

Main function for checking compliance. Infers the reference protocol according to the user-chosen strategy, and then generates a compliance report.

Parameters
  • dataset (BaseDataset) – BaseDataset instance for the dataset to be checked for compliance

  • strategy (str) – Strategy employed to specify or automatically infer the reference protocol. Currently, ‘majority’ is the only allowed option

  • output_dir (Union[Path, str]) – Path to the directory in which to save the report

  • decimals (int) – Number of decimal places to round to (default: 3)

  • verbose (bool) – If True, print more details about the execution

Returns

report_path – Path to the generated report

Return type

Path

Raises
  • ValueError – If the input dataset is empty or otherwise invalid

  • NotImplementedError – If the input strategy is not supported

  • NotADirectoryError – If the output directory doesn’t exist
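
A minimal usage sketch for the common case. Note that the `import_dataset` helper from the companion MRdataset package, the placeholder study name, and the deferred-import structure below are illustrative assumptions, not part of this reference:

```python
def run_compliance_check(data_folder: str, output_dir: str):
    """Sketch: import a DICOM dataset and check it for compliance."""
    # Imports deferred so the sketch can be defined without mrQA installed.
    from MRdataset import import_dataset  # assumed companion package
    from mrQA.project import check_compliance

    dataset = import_dataset(data_source=data_folder, ds_format='dicom',
                             name='my_study')  # 'my_study' is a placeholder
    # Returns the Path to the generated compliance report
    return check_compliance(dataset, strategy='majority', output_dir=output_dir)
```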

mrQA.project.compare_with_majority(dataset: BaseDataset, decimals: int = 3) BaseDataset[source]

Method for post-acquisition compliance. Infers the reference protocol/values by looking for the most frequent values, and then identifies deviations from them.

Parameters
  • dataset (BaseDataset) – BaseDataset instance for the dataset which is to be checked for compliance

  • decimals (int) – Number of decimal places to round to (default: 3)

Returns

dataset – Adds the non-compliance information to the same BaseDataset instance and returns it.

Return type

BaseDataset
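
The majority strategy is easy to illustrate in isolation. The following self-contained sketch is not mrQA code; it only demonstrates the underlying idea of taking the most frequent (rounded) value as the reference and flagging subjects that deviate from it:

```python
from collections import Counter

def infer_reference(values, decimals=3):
    """Return the most frequent value (after rounding) as the reference."""
    rounded = [round(v, decimals) for v in values]
    return Counter(rounded).most_common(1)[0][0]

def find_deviations(subject_values, decimals=3):
    """Map subject -> value for subjects deviating from the majority value."""
    reference = infer_reference(subject_values.values(), decimals)
    return {sub: val for sub, val in subject_values.items()
            if round(val, decimals) != reference}

# Illustrative data: one acquisition parameter across subjects;
# sub-03 deviates from the majority value 2.0
tr = {'sub-01': 2.0, 'sub-02': 2.0, 'sub-03': 2.3, 'sub-04': 2.0}
print(infer_reference(tr.values()))   # 2.0
print(find_deviations(tr))            # {'sub-03': 2.3}
```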

mrQA.project.generate_report(dataset: BaseDataset, report_path: str, sub_lists_dir_path: str, output_dir: Union[Path, str]) Path[source]

Generates an HTML report aggregating and summarizing the non-compliance discovered in the dataset.

Parameters
  • dataset (BaseDataset) – BaseDataset instance for the dataset which is to be checked

  • report_path (str) – Name of the file to be generated, without extension. Ensures that naming is consistent across the report, dataset, and record files

  • sub_lists_dir_path (str) – Path to the directory in which the subject lists should be stored

  • output_dir (Union[Path, str]) – Directory in which the generated report should be stored.

Returns

output_path – Path to the generated report

Return type

Path

mrQA.run_parallel module

This module contains functions to run the compliance checks in parallel.

mrQA.run_parallel.create_script(data_source: Optional[Union[str, Path, Iterable]] = None, ds_format: str = 'dicom', include_phantom: bool = False, verbose: bool = False, output_dir: Optional[Union[Path, str]] = None, debug: bool = False, subjects_per_job: Optional[int] = None, hpc: bool = False, conda_dist: Optional[str] = None, conda_env: Optional[str] = None)[source]

Given a folder (or a list of folders), divides the work into smaller jobs, each containing a fixed number of subjects. These jobs can be executed in parallel to save time.

Parameters
  • data_source (str or List[str]) – Path (e.g. /path/to/my/dataset), or list of paths, to the directory containing the files

  • ds_format (str) – Dataset format. Currently, ‘dicom’ is the only supported option

  • include_phantom (bool) – Include phantom scans in the dataset

  • verbose (bool) – Print progress

  • output_dir (str) – Path to save the output dataset

  • debug (bool) – If True, the dataset will be created locally. This is useful for testing

  • subjects_per_job (int) – Number of subjects per job. Recommended value is 50 or 100

  • hpc (bool) – If True, the scripts will be generated for HPC, not for local execution

  • conda_dist (str) – Name of conda distribution

  • conda_env (str) – Name of conda environment

mrQA.run_parallel.get_parser()[source]
mrQA.run_parallel.main()[source]
mrQA.run_parallel.parse_args()[source]
mrQA.run_parallel.process_parallel(data_source: Union[str, Path], output_dir: Union[str, Path], out_mrds_path: Union[str, Path], name: Optional[str] = None, subjects_per_job: int = 5, conda_env: str = 'mrcheck', conda_dist: str = 'anaconda3', hpc: bool = False)[source]

Given a folder (or a list of folders), divides the work into smaller jobs, each containing a fixed number of subjects. These jobs can be executed in parallel to save time.

Parameters
  • data_source (str or Path) – Path to the folder containing the subject folders

  • output_dir (str or Path) – Path to the folder where the output will be saved

  • out_mrds_path (str or Path) – Path to the final output mrds file

  • name (str) – Name of the final output file

  • subjects_per_job (int) – Number of subjects to be processed in each job

  • conda_env (str) – Name of the conda environment to be used

  • conda_dist (str) – Name of the conda distribution to be used

  • hpc (bool) – If True, submit the jobs to an HPC instead of running them locally
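
A hedged usage sketch; the output filename, study name, and deferred-import structure below are illustrative assumptions:

```python
def run_parallel_check(data_root: str, out_dir: str):
    """Sketch: split a dataset into jobs and run the checks in parallel."""
    # Imports deferred so the sketch can be defined without mrQA installed.
    from pathlib import Path
    from mrQA.run_parallel import process_parallel

    out_mrds = Path(out_dir) / 'combined.mrds.pkl'  # filename is an assumption
    process_parallel(data_source=data_root,
                     output_dir=out_dir,
                     out_mrds_path=out_mrds,
                     name='my_study',       # placeholder name
                     subjects_per_job=50,   # one batch of subjects per job
                     hpc=False)             # run locally
```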

mrQA.run_parallel.split_ids_list(data_source: Union[str, Path], all_ids_path: Union[str, Path], per_batch_ids: Union[str, Path], output_dir: Union[str, Path], subjects_per_job: int = 50)[source]

Splits a given set of subjects into multiple batches and creates a separate text file per job, each containing the list of subjects to be processed in that job.

Parameters
  • data_source (Union[str, Path]) – Path to the root directory of the data

  • all_ids_path (Union[str, Path]) – Path to the file containing the complete list of subject IDs

  • per_batch_ids (Union[str, Path]) – Path to a file listing the per-job text files; each of these text files contains the subject IDs for the corresponding job

  • output_dir (Union[str, Path]) – Path to the output directory

  • subjects_per_job (int) – Number of subjects to process in each job

Returns

batch_ids_path_list – Paths to the text files, each containing a list of subjects

Return type

list
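
The batching performed here amounts to chunking the subject list. The self-contained sketch below illustrates that logic only; it is not the mrQA implementation, which also writes the per-job lists to text files on disk:

```python
def split_ids(subject_ids, subjects_per_job=50):
    """Chunk a list of subject IDs into per-job batches."""
    return [subject_ids[i:i + subjects_per_job]
            for i in range(0, len(subject_ids), subjects_per_job)]

# Seven illustrative subject IDs split into jobs of three
ids = [f'sub-{i:03d}' for i in range(1, 8)]
print(split_ids(ids, subjects_per_job=3))
# [['sub-001', 'sub-002', 'sub-003'], ['sub-004', 'sub-005', 'sub-006'], ['sub-007']]
```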

mrQA.run_parallel.submit_job(scripts_list_filepath: Union[str, Path], mrds_list_filepath: Union[str, Path], hpc: bool = False) None[source]

Executes the generated bash scripts, either locally or by submitting them to an HPC, producing the corresponding partial mrds files.

Parameters
  • scripts_list_filepath (str) – Path to the file containing list of bash scripts to be executed

  • mrds_list_filepath (str) – Path to the file containing list of partial mrds files to be created

  • hpc (bool) – If True, submit the scripts to an HPC instead of executing them locally

Return type

None