About
Who are we?
CAMIS: Comparing Analysis Method Implementations in Software
We are a cross-industry PHUSE DVOST Working Group, run in collaboration with members from PHUSE, PSI, ASA and IASCT. In addition to discussions and issue comments within the GitHub repo, we meet monthly on the 2nd Monday of each month. If you would like to join us, please contact: workinggroups@phuse.global
Objectives
Through the creation of the CAMIS White Paper, the group provided guidance on the types of questions statistical staff should ask to aid replication of analysis methods across software, and to identify the fundamental sources of discrepant results.
Through the creation of this open-source repository, the group aims to avoid unnecessary repetition of work within the community. The repository welcomes contributions from the wider community and serves as a resource for comparing and documenting differences in analysis method implementations across software.
Background
Traditionally, highly regulated industries (such as the pharmaceutical industry) have limited themselves to the use of commercially available software. Under this approach, responsibility for validating and testing the product was often delegated to the software vendor, who would ensure the software performs in line with its documentation, producing accurate, reliable and reproducible results. One downside of this approach, however, is that new methods and functionality can be slow to be adopted, limiting the uptake of new method implementations and of tools that could bring efficiency.
With the rise in popularity of data science, community-led tools and methods are being developed in open-source software at a rapid rate. The availability of these advanced analytic capabilities has increased the desire of statistical staff in regulated industries to have access to, and approval to use, open-source software (Rimler et al. 2022). The use of open-source software is now widely accepted (FDA 2015); however, the increased variety of tools has resulted in an overlap of capabilities. This has raised challenging questions for traditional approaches to clinical analyses, particularly in situations where the overlap yields different results.
One example of this challenge is the discrepancies that have been discovered in statistical analysis results between different programming languages, even when working within qualified statistical computing environments. Subtle differences exist between the fundamental approaches and assumptions implemented within each language, yielding results that differ from one another yet are each correct and consistent with their respective documentation. The existence of these differences may cause unease for sponsor companies when submitting to a regulatory agency, as it is uncertain whether the agency will view them as problematic. By understanding the source of any discrepancy, sponsors can restore that confidence.
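To make this concrete, percentile estimation is a frequently cited case: different software defaults to different, equally documented definitions. The sketch below uses Python with NumPy purely for illustration; the mapping of NumPy's methods onto the R and SAS defaults follows the Hyndman & Fan (1996) taxonomy of sample quantiles and should be read as our working assumption rather than an official cross-reference.

```python
# Minimal illustration: the same 25th percentile under two documented
# definitions. Requires NumPy >= 1.22 for the `method` argument.
import numpy as np

data = np.arange(1, 11)  # observations 1, 2, ..., 10

# NumPy's default "linear" method matches R's default quantile type 7.
q_linear = np.percentile(data, 25, method="linear")

# "averaged_inverted_cdf" is Hyndman & Fan type 2, which (to our
# understanding) matches SAS PROC UNIVARIATE's default, PCTLDEF=5.
q_avg_inv = np.percentile(data, 25, method="averaged_inverted_cdf")

print(q_linear)   # 3.25
print(q_avg_inv)  # 3.0
```

Neither answer is wrong: each is correct under its software's documented definition, and this is precisely the kind of difference the repository sets out to catalogue and explain.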
This cross-industry group aims to empower statistical staff to make informed choices on the implementation of statistical analyses when multiple languages yield different results.