Data management and processing
Datasets collected by CATI go through the following steps:
- Each dataset is acquired by an acquisition centre and transferred to CATI via a secure solution.
- A thorough sanity and quality check is run on the dataset to ensure that it is complete and consistent with the Standard Operating Procedures (SOPs) and, where requested, to evaluate the quality of each image.
- The dataset is cleaned and organised (curation) before being indexed in the database.
- Monitoring tables are extracted to allow the principal investigator and the coordinating team to follow the imaging part of the study.
- Standard and/or dedicated preprocessing and processing pipelines are run according to study requirements, with systematic quality control procedures.
- Datasets and preprocessing/processing results are prepared and exported in BIDS format (DICOM and NIfTI) to a sharing portal (SFTP or HTTPS).
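The steps above can be sketched, very schematically, as a sequence of stages. The stage names and the `run_dataset` function below are illustrative placeholders, not CATI's actual internal tooling.

```python
# Very schematic sketch of the data flow described above; the stage names
# are illustrative placeholders, not CATI's actual internal API.
STAGES = [
    "transfer",        # secure upload from the acquisition centre
    "quality_check",   # sanity and quality check against the SOPs
    "curation",        # cleaning, organisation and database indexing
    "monitoring",      # extraction of monitoring tables
    "processing",      # (pre)processing pipelines with quality control
    "export",          # BIDS-formatted export to the sharing portal
]

def run_dataset(dataset_id: str) -> list[str]:
    """Pass a dataset through every stage in order and return the audit trail."""
    trail = []
    for stage in STAGES:
        # A real implementation would dispatch to the stage's logic here.
        trail.append(f"{dataset_id}:{stage}")
    return trail
```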
Data Transfer to CATI
CATI provides a secure web-based service for neuroimaging data upload, CATI Collector (https://cati.cea.fr/collector). The service is based entirely on the HTTPS communication protocol and relies on NeurospinCloud, a secure CEA service built on NextCloud with two-factor authentication.
CATI Collector is a fully online solution; no software needs to be installed locally on your computer. This web service allows centres to upload de-identified DICOM datasets to the website as “.zip” archives. For traceability, a transfer form must be filled in on the website before each dataset is uploaded. If no local solution can be set up, transferred datasets can be de-identified through this service to comply with the GDPR requirements of research projects.
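For illustration, a centre could package a DICOM series into a “.zip” archive before upload and record a checksum for traceability. The sketch below uses only the Python standard library; the directory layout and the idea of noting the checksum on the transfer form are hypothetical examples, not CATI requirements.

```python
import hashlib
import zipfile
from pathlib import Path

def package_series(series_dir: Path, archive: Path) -> str:
    """Zip every file under series_dir (archive must live OUTSIDE that
    directory) and return the archive's SHA-256 digest, which could be
    recorded alongside the transfer form for traceability."""
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        for f in sorted(series_dir.rglob("*")):
            if f.is_file():
                # Store paths relative to the series directory.
                zf.write(f, f.relative_to(series_dir))
    return hashlib.sha256(archive.read_bytes()).hexdigest()
```

The checksum also lets the receiving side confirm that nothing was lost or altered in transit.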
Quality Control and Monitoring
Datasets are first checked for overall consistency: identification number against the coordinator file or the baseline visit, acquisition centre and acquisition date, de-identification of DICOM datasets, etc.
The second step ensures that no data was lost during transfer and that the data format (uncompressed DICOM) is correct and readable.
A sanity check is then performed to ensure consistency with the SOPs and with the acquisition/reconstruction protocol validated for the centre, covering the following items: acquisition system (MRI and NM system, head coil and software version), sequences acquired (and their acquisition order), specific acquisition parameters for each sequence, and NM reconstruction parameters. Acquisitions are converted to the NIfTI research format, together with the BIDS sidecar JSON file (https://bids.neuroimaging.io/), and this conversion is assessed through the geometrical characteristics of the images. DICOM files are pseudonymised under strict rules to ensure compliance with the GDPR.
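The parameter part of such a sanity check can be illustrated with the BIDS sidecar JSON. The expected values and the 1 % tolerance below are hypothetical placeholders, not actual CATI SOP values, and the field names are standard BIDS metadata keys.

```python
import json
from pathlib import Path

# Hypothetical excerpt of a validated protocol for one centre; real SOP
# values depend on the study, the sequence and the acquisition system.
EXPECTED = {
    "MagneticFieldStrength": 3.0,   # tesla
    "RepetitionTime": 2.3,          # seconds
    "EchoTime": 0.00298,            # seconds
}

def check_sidecar(sidecar: Path, expected=EXPECTED, rtol=0.01) -> list[str]:
    """Compare a BIDS sidecar's acquisition parameters with the expected
    protocol values and return the list of deviations found."""
    params = json.loads(sidecar.read_text())
    issues = []
    for key, ref in expected.items():
        value = params.get(key)
        if value is None:
            issues.append(f"{key}: missing")
        elif abs(value - ref) > rtol * abs(ref):
            issues.append(f"{key}: {value} (expected {ref})")
    return issues
```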
A specific quality control procedure is then run for each type of MRI sequence and each NM dataset, using preprocessing steps and relying on both quantitative indices and systematic visual assessment. Visual assessment evaluates specific artefacts to determine whether they are system, patient or motion related, and thus whether they could be reduced by specific procedures during acquisition. Monitoring tables and views are extracted, with information on consistency with the SOPs and on quality for each sequence/NM dataset, for each centre and for each subject.
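As a small illustration of such indices and tables, the sketch below computes one classical quantitative index (a signal-to-noise ratio from a signal region and a background region, with ROI selection left out) and renders quality-control records as a CSV monitoring table. The column names are illustrative, not CATI's actual table layout.

```python
import csv
import io
from statistics import mean, stdev

def estimate_snr(signal_roi, background_roi):
    """A simple quantitative index: mean signal intensity divided by the
    standard deviation of background (air) intensities."""
    return mean(signal_roi) / stdev(background_roi)

def monitoring_table(records):
    """Render quality-control records as a CSV monitoring table, one row
    per centre/subject/sequence, for review by the coordinating team."""
    columns = ["centre", "subject", "sequence", "sop_compliant", "snr"]
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=columns)
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()
```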
Preprocessing and processing
According to the needs of the project leader, several preprocessing and processing pipelines can be run on both MRI and NM datasets. All preprocessing and processing results undergo visual quality control with in-house visual reports to ensure their reliability. Quantitative measurements are made available through spreadsheets with details of the quality control.
These pipelines can be standard, freely available tools, for example:
- structural imaging: SPM, FreeSurfer, VolBrain or Morphologist
- white matter lesions: WHASA or LesionBrain
- diffusion imaging: TractSeg or AMICO
- …
or in-house pipelines built from state-of-the-art processing steps to fulfil the study requirements, for example:
- a pipeline built from SPM, CAT12 and other tools for multimodal longitudinal data processing
- a pipeline built from MRtrix, FSL, SynthSeg for diffusion MRI preprocessing
- a pipeline built with AFNI and structural tools for functional imaging preprocessing
- a pipeline built from CERES, SUIT and FreeSurfer for cerebrum and cerebellum segmentation
- a pipeline for preprocessing neonatal datasets
- …
If no existing solution is available or satisfactory, new pipelines can be designed, implemented and validated after a feasibility assessment. New tools can also be evaluated and run if needed.
Data Sharing
Imaging data are shared in DICOM or NIfTI format with associated metadata (subject code, time points, acquisition parameters, etc.), following an organisation standard defined by the neuroimaging community: the Brain Imaging Data Structure (BIDS) (https://bids.neuroimaging.io/).
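BIDS prescribes a fixed directory and file-naming scheme, which can be illustrated with a short helper; the subject and session labels below are illustrative examples, not values from any particular study.

```python
from pathlib import Path

def bids_path(root, subject, session, modality, suffix, ext=".nii.gz"):
    """Build the standard BIDS location for one acquisition, e.g.
    sub-001/ses-M00/anat/sub-001_ses-M00_T1w.nii.gz under the dataset root."""
    name = f"sub-{subject}_ses-{session}_{suffix}{ext}"
    return Path(root) / f"sub-{subject}" / f"ses-{session}" / modality / name
```

The matching JSON sidecar of L15's sanity check would simply use `ext=".json"` next to the image file.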
Specific subsets of study data can be shared with project consortium members via secure file transfer protocol (SFTP), upon request by the project leader and with access-rights management.
A data sharing portal can also be implemented and managed by CATI to provide access to the study data, with fine-grained access rights controlled by the project leader. Data are then accessible via the NeurospinCloud service, either for download or through direct access using the WebDAV protocol, which allows the data to be mounted as a network drive on major operating systems (Windows, macOS and Linux).