“Never theorize before you have data. Invariably, you end up twisting facts to suit the theories, instead of theories to suit facts.” - Sherlock Holmes

 

The increasing number of studies and publications on the use of artificial intelligence in medicine warrants closer scrutiny of the methodology of machine learning approaches.

While machine learning offers higher performance, adaptability to more complex inputs, and scalability, the older rule-based systems have the advantage of interpretability. The incomplete interpretability of machine learning models is compounded by a lack of transparency about the nature of the training, validation, and test datasets. In addition, parts of the methodology and development of machine learning models often do not receive adequate scrutiny.

These shortcomings can reduce a model's ability to generalize to other populations.

The authors propose a minimum information about clinical artificial intelligence modeling (MI-CLAIM) guideline to make projects that use artificial intelligence transparent to all stakeholders. The stated aim of this guideline is to enable a direct assessment of clinical impact (which is often not defined or even stated) and to allow replication of the technical design process.

The MI-CLAIM guideline has six parts: 1) study design (further divided into four subsections: clinical setting, performance measures, population composition, and the current baselines against which performance is measured); 2) separation of data into partitions for model training and model testing (with a focus on overfitting and information leakage as well as internal vs external testing); 3) optimization and final model selection (including the best format of the data, the type of model to be used, the optimal model hyperparameters, and any transformations); 4) performance evaluation (of the model itself and in terms of clinical performance metrics); 5) model examination (with discussion of sanity checks, biases, shifts, visual explanations, and sensitivity analysis); and 6) a reproducible pipeline.
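To make part 2 more concrete, here is a minimal Python sketch of one common way to limit information leakage: splitting records at the patient level so that no patient contributes data to both the training and the test partition. This is an illustration only, not the procedure prescribed by the MI-CLAIM authors. The function names, the `patient_id` and `visit` fields, and the 20% test fraction are all hypothetical choices made for the example.

```python
import hashlib
from collections import defaultdict


def assign_partition(patient_id: str, test_fraction: float = 0.2) -> str:
    """Deterministically map a patient to 'train' or 'test' by hashing the ID.

    Hypothetical helper: hashing keeps the assignment stable across runs,
    which also helps with reproducibility of the pipeline.
    """
    digest = hashlib.sha256(patient_id.encode("utf-8")).hexdigest()
    # Use the first 32 bits of the hash as a number in [0, 1).
    bucket = int(digest[:8], 16) / 0x100000000
    return "test" if bucket < test_fraction else "train"


def split_by_patient(records, test_fraction: float = 0.2):
    """Split records so that all records of one patient share a partition."""
    partitions = defaultdict(list)
    for record in records:
        name = assign_partition(record["patient_id"], test_fraction)
        partitions[name].append(record)
    return partitions["train"], partitions["test"]


# Made-up example data: 20 patients with 3 visits each.
records = [
    {"patient_id": f"patient-{p}", "visit": v}
    for p in range(20)
    for v in range(3)
]

train, test = split_by_patient(records)

# Sanity check: no patient appears in both partitions.
train_ids = {r["patient_id"] for r in train}
test_ids = {r["patient_id"] for r in test}
assert train_ids.isdisjoint(test_ids)
print(len(train), "training records;", len(test), "test records")
```

A split like this addresses internal testing only. External testing, which the guideline also asks about, would require data from a separate site or population.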

While all of these sections are essential for a sound project overall, the clinical relevance of manuscripts is often weak or even absent.

The full article can be read here