This repository contains all of the files, images, datasets, and code used in my undergraduate thesis, "Gender Bias in Clinical Studies: A Statistical Approach".
Institution: University of Crete, Department of Mathematics and Applied Mathematics
Supervisor: Pavlos Pavlidis, Associate Professor, Department of Biology, UoC, and Affiliated Researcher, ICS-FORTH
Heraklion 2025
Abstract:
This thesis investigates the impact of gender bias in clinical research, with a focus on differences in gene expression. The historical exclusion of females from clinical trials and drug testing has led to significant disparities in healthcare outcomes. To explore this issue, the thesis statistically analyzes gender-based differences in gene expression across various conditions. A few basic theoretical concepts are explained before the methodology is presented. Using datasets from the GEO database, hypothesis testing is conducted separately for each dataset. Linear models, applied through the limma package in R, identify genes that are significantly differentially expressed between genders. The results are presented and visualized, highlighting the extent of gender-specific variation in gene expression.
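To illustrate the mean-reference model mentioned in the abstract and keywords, here is a minimal sketch for a single gene. It is written in plain Python for illustration only and is not the thesis code, which uses limma in R. The function name `fit_mean_reference` and the toy expression values are hypothetical, and the t-statistic is the ordinary one, without limma's empirical Bayes moderation.

```python
from math import sqrt


def fit_mean_reference(expr, is_male):
    """Fit y = b0 + b1 * is_male by ordinary least squares for one gene.

    The design matrix has rows [1, is_male], so b0 is the mean of the
    reference level (female) and b1 is the male-minus-female difference.
    Returns (b0, b1, t), where t is the ordinary t-statistic for b1.
    """
    n = len(expr)
    n1 = sum(is_male)  # number of male samples
    sum_y = sum(expr)
    sum_y1 = sum(y for y, g in zip(expr, is_male) if g == 1)

    # Normal equations (X^T X) b = X^T y with X^T X = [[n, n1], [n1, n1]]
    det = n * n1 - n1 * n1
    b0 = (n1 * sum_y - n1 * sum_y1) / det
    b1 = (n * sum_y1 - n1 * sum_y) / det

    # Residual variance with n - 2 degrees of freedom
    rss = sum((y - b0 - b1 * g) ** 2 for y, g in zip(expr, is_male))
    s2 = rss / (n - 2)

    # Var(b1) = s2 * [(X^T X)^-1] entry (2, 2) = s2 * n / det
    t = b1 / sqrt(s2 * n / det)
    return b0, b1, t


# Hypothetical log-expression values: three female, then three male samples
expr = [5.0, 5.2, 4.8, 6.1, 5.9, 6.0]
is_male = [0, 0, 0, 1, 1, 1]

b0, b1, t = fit_mean_reference(expr, is_male)
print(f"female mean = {b0:.3f}, male - female = {b1:.3f}, t = {t:.3f}")
```

In a means model the two columns of the design matrix would instead be one indicator per gender, and the same difference would be extracted with a contrast such as male minus female.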
Keywords: Medical research, gender bias, gene expression, differentially expressed gene (DEG), statistical bias, microarray, t-test, linear model, limma, design matrix, contrast matrix, means model, mean-reference model, factor, level, empirical Bayes, GEO datasets
Datasets: The spreadsheet linked below lists all of the GEO datasets used, with links to each. The datasets themselves were not uploaded to this repository because of GitHub's file size limits.
https://docs.google.com/spreadsheets/d/1zdT-pYzlnM5JREDoc6iLs5po7ZMVYkvogP1svIat6ws/edit?usp=sharing