Handling Missing Data in Clinical and Medical Research: Concepts, Challenges, and Implementation in R Software

Madreseh, Elham; Hosseingholizadeh, Nasrin; Akhlaghi, Maassoumeh; Alikhani, Majid; Sadeghi, Shokufe

Volume 1, Issue 3 (8-2025)

2025, 1(3): 54-62

Elham Madreseh*, Nasrin Hosseingholizadeh, Maassoumeh Akhlaghi, Majid Alikhani, Shokufe Sadeghi

Rheumatology Research Center, Tehran University of Medical Sciences, Tehran, Iran & Clinical Research Development Unit, Shariati Hospital, Tehran University of Medical Sciences, Tehran, Iran

Abstract:

Missing data are among the most common and often unavoidable challenges in data science and clinical research, and they can compromise the accuracy, internal validity, and interpretation of research findings. A thorough understanding of the dataset allows health data analysts to adopt strategies that prevent or minimize missing data during the design and conduct of a study. Even so, given the nature of clinical research, some incomplete data are inevitable, so practical and robust methods for handling them are needed. This article reviews the main methods for addressing missing data, describes missing-data mechanisms and patterns, and discusses the proportion of missing data that may be considered ignorable. Finally, using a hypothetical rheumatoid arthritis dataset, it introduces one of the most widely used imputation approaches, multiple imputation by chained equations. The corresponding code is implemented and interpreted using the mice package in R. Provided that the relevant assumptions are met, researchers with varying levels of expertise in biostatistics and R can apply the code included in this article to impute missing data in their own research datasets.
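As a rough illustration of the workflow the abstract describes, the following is a minimal sketch of multiple imputation by chained equations with the mice package. It is not the authors' code: the simulated data frame `ra`, its variables (`age`, `sex`, `crp`, `das28`), the missingness rates, and the analysis model are all hypothetical and chosen only to make the example self-contained.

```r
library(mice)

# Hypothetical rheumatoid arthritis dataset (simulated; NOT the article's data)
set.seed(2025)
n <- 200
ra <- data.frame(
  age = round(rnorm(n, mean = 55, sd = 12)),
  sex = factor(sample(c("F", "M"), n, replace = TRUE, prob = c(0.75, 0.25))),
  crp = round(rlnorm(n, meanlog = 2, sdlog = 0.6), 1)
)
ra$das28 <- round(2.5 + 0.02 * ra$age + 0.05 * ra$crp + rnorm(n, sd = 0.8), 2)

# Introduce missing values: crp is more likely to be missing in older
# patients (depends on an observed variable), das28 is missing at random rows
ra$crp[runif(n) < plogis(-2 + 0.03 * (ra$age - 55))] <- NA
ra$das28[sample(n, 30)] <- NA

# Inspect the missing-data pattern before imputing
md.pattern(ra, plot = FALSE)

# Create m = 5 imputed datasets; with no method given, mice uses its
# defaults (predictive mean matching for numeric columns)
imp <- mice(ra, m = 5, maxit = 10, seed = 123, printFlag = FALSE)

# Fit the same analysis model in each imputed dataset,
# then pool the estimates with Rubin's rules
fit <- with(imp, lm(das28 ~ age + sex + crp))
pooled <- pool(fit)
summary(pooled, conf.int = TRUE)
```

The three steps (impute with `mice()`, analyze with `with()`, combine with `pool()`) are the standard mice workflow; the number of imputations, the imputation methods, and the analysis model should be chosen to suit the actual study and its missing-data assumptions.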

Article number: 7

Keywords: Missing data, Missing not at random, Statistical imputation, Machine learning, Deep learning

Type of Study: Research | Subject: General
Received: 2026/01/26 | Accepted: 2026/06/17 | Published: 2026/06/25

Madreseh E, Hosseingholizadeh N, Akhlaghi M, Alikhani M, Sadeghi S. Handling Missing Data in Clinical and Medical Research: Concepts, Challenges, and Implementation in R Software. Journal title 2025; 1 (3) : 7
URL: http://idap.ir/article-1-52-en.html

Rights and permissions: This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
