Mendelian Randomization Standard Operating Procedure (SOP)

This SOP outlines the standardized pipeline for conducting two-sample Mendelian Randomization (MR) analyses, from instrument selection through to causal inference adjudication.

1. Genetic Instrument Selection

Instruments must be robustly associated with the exposure and independent of one another.

1.1 Stringent Criteria (Primary Analysis)

For exposures with well-powered GWAS, apply the following strict inclusion criteria:

P-value Threshold: $p < 5 \times 10^{-8}$ (genome-wide significance)
Linkage Disequilibrium (LD) Clumping: $r^{2} < 0.001$
Clumping Window: 10,000 kb
Reference Panel: 1000 Genomes European (or population-matched to the GWAS)

1.2 Loose Criteria (Relaxed Analysis)

If the stringent criteria yield fewer than 5 Single Nucleotide Polymorphisms (SNPs), the analysis lacks the degrees of freedom required for standard sensitivity tests. In this case, relax the threshold:

P-value Threshold: $p < 5 \times 10^{-6}$
Requirement: If this relaxed threshold is used, MR-RAPS becomes a mandatory primary reporting model, because the relaxed threshold admits weaker instruments whose SNP-exposure estimates carry more measurement error.
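
The fallback from the stringent to the relaxed threshold can be sketched in base R as follows. This is a minimal illustration under stated assumptions: the input is a data frame of exposure associations that has already been LD-clumped, the column name `pval` is illustrative, and the helper `select_instruments` and its arguments are hypothetical rather than part of any package. LD clumping itself needs a reference panel and is not shown.

```r
# Hypothetical helper: choose between the stringent and relaxed p-value thresholds.
# Assumes `snps` is a data frame of LD-clumped exposure associations with a
# numeric column `pval` (an illustrative name, not a fixed convention).
select_instruments <- function(snps, strict = 5e-8, relaxed = 5e-6, min_snps = 5) {
  strict_set <- snps[snps$pval < strict, , drop = FALSE]
  if (nrow(strict_set) >= min_snps) {
    return(list(instruments = strict_set, criteria = "stringent", raps_required = FALSE))
  }
  # Fewer than `min_snps` instruments: fall back to the relaxed threshold
  relaxed_set <- snps[snps$pval < relaxed, , drop = FALSE]
  list(instruments = relaxed_set, criteria = "relaxed", raps_required = TRUE)
}

# Toy input: only three SNPs reach genome-wide significance
toy <- data.frame(
  SNP  = paste0("rs", 1:6),
  pval = c(1e-10, 3e-9, 2e-8, 8e-8, 1e-6, 4e-6)
)
sel <- select_instruments(toy)
print(sel$criteria)
print(nrow(sel$instruments))
```

Returning the `raps_required` flag alongside the instruments keeps the MR-RAPS requirement attached to the data, so it cannot be forgotten at the reporting stage.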

2. Estimating Instrument Strength

Before conducting MR, quantify the strength of the genetic instruments to assess the risk of weak instrument bias.

2.1 Variance Explained (R-squared)

Calculate the proportion of variance in the exposure explained by each SNP using the Minor Allele Frequency (MAF) and the effect size ($\beta$), where $\beta$ is expressed in standard deviation units of the exposure:

$R^{2} = 2 \times \mathrm{MAF} \times (1 - \mathrm{MAF}) \times \beta^{2}$

Note: Review the average MAF. If the average MAF is very low (< 0.05), the instruments rely on rare variants, which may have less stable effect estimates.

2.2 F-Statistics

Calculate the individual F-statistic for each SNP to measure instrument strength, where $N$ is the sample size of the exposure GWAS:

$F = \frac{R^{2} \times (N - 2)}{1 - R^{2}}$

Individual F-statistic: Any single SNP with $F < 10$ should be excluded from the standard IVW analysis.
Total/Mean F-statistic: Calculate the average F-statistic across all retained SNPs. If the Mean $F < 10$, the overall instrument is weak, and MR-RAPS must be prioritized.
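
The two formulas above can be applied per SNP in a few lines of base R. The sketch below is illustrative only: the function name `instrument_strength` and its arguments are hypothetical, and it assumes that the effect allele frequency is available and that $\beta$ is in standard deviation units of the exposure.

```r
# Per-SNP R^2 and F-statistics from the formulas in 2.1 and 2.2.
# Assumes beta is in standard-deviation units of the exposure.
instrument_strength <- function(eaf, beta, n) {
  maf <- pmin(eaf, 1 - eaf)               # minor allele frequency from effect allele frequency
  r2  <- 2 * maf * (1 - maf) * beta^2     # variance explained by each SNP
  f   <- r2 * (n - 2) / (1 - r2)          # individual F-statistic
  data.frame(maf = maf, r2 = r2, f = f, keep = f >= 10)
}

# Toy values: three SNPs from an exposure GWAS with N = 10,000
strength <- instrument_strength(
  eaf  = c(0.30, 0.85, 0.50),
  beta = c(0.10, 0.05, 0.02),
  n    = 10000
)
print(strength)

# Mean F across the SNPs retained after the individual F >= 10 filter
mean_f <- mean(strength$f[strength$keep])
print(mean_f)
```

In this toy input only the first SNP passes the individual filter, which shows how modest effect sizes can fall below $F = 10$ even in a GWAS of 10,000 participants.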

3. Analytical Models to Test

Run the following suite of models to assess causality, heterogeneity, and horizontal pleiotropy.

3.1 The Baseline Model

Inverse Variance Weighted (IVW): The primary meta-analysis of all SNPs.
- Check Cochran's Q statistic. If $p > 0.05$, report the IVW Fixed Effects (FE) model.
- If Cochran's Q $p < 0.05$, switch to the IVW Multiplicative Random Effects (RE) model to account for heterogeneity.
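
The switching rule can be illustrated with a minimal base R sketch of IVW on summary statistics. It is not a replacement for an established MR package: the function name `ivw_with_q` and its arguments are hypothetical, and it uses first-order weights that ignore uncertainty in the SNP-exposure estimates.

```r
# Minimal IVW sketch with Cochran's Q; function and argument names are illustrative.
ivw_with_q <- function(bx, by, se_y, alpha = 0.05) {
  ratio <- by / bx                        # per-SNP Wald ratio
  w     <- bx^2 / se_y^2                  # first-order inverse-variance weights
  est   <- sum(w * ratio) / sum(w)        # IVW point estimate
  se_fe <- sqrt(1 / sum(w))               # fixed-effects standard error
  q     <- sum(w * (ratio - est)^2)       # Cochran's Q
  df    <- length(bx) - 1
  q_p   <- pchisq(q, df, lower.tail = FALSE)
  # Multiplicative random effects: scale the SE by the dispersion, never below 1
  se_re  <- se_fe * sqrt(max(1, q / df))
  hetero <- q_p < alpha
  list(
    estimate = est,
    se       = if (hetero) se_re else se_fe,
    q        = q,
    q_p      = q_p,
    model    = if (hetero) "IVW (multiplicative random effects)" else "IVW (fixed effects)"
  )
}

# Toy data: homogeneous ratios versus clearly heterogeneous ratios
bx  <- c(0.10, 0.15, 0.20, 0.12, 0.08)
se  <- rep(0.01, 5)
homog  <- ivw_with_q(bx, by = 0.5 * bx, se_y = se)
hetero <- ivw_with_q(bx, by = c(0.05, 0.00, 0.20, 0.06, -0.04), se_y = se)
print(homog$model)
print(hetero$model)
```

The point estimate is identical under both models; only the standard error changes, which is why the choice affects the confidence interval rather than the effect size.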

3.2 Standard Sensitivity Models (The Big Three)

MR-Egger: Used to detect directional pleiotropy via the intercept. The slope is a valid causal estimate only under the InSIDE (Instrument Strength Independent of Direct Effect) assumption.
Weighted Median: Provides a valid estimate if at least 50% of the instrument weight comes from valid (non-pleiotropic) SNPs.
Weighted Mode: Provides a valid estimate if the largest single cluster of SNPs is valid (ZEro Modal Pleiotropy Assumption, ZEMPA).
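
As an illustration of how the MR-Egger intercept is obtained, the sketch below fits a weighted regression of SNP-outcome on SNP-exposure effects with the intercept left free. The function name `egger_sketch` is hypothetical, and the p-value is taken directly from `lm`, which may differ from the standard-error conventions used by dedicated MR packages.

```r
# Minimal MR-Egger sketch in base R; function and argument names are illustrative.
egger_sketch <- function(bx, by, se_y) {
  # Orient every SNP so that its exposure effect is positive
  flip <- ifelse(bx < 0, -1, 1)
  bx <- bx * flip
  by <- by * flip
  # Weighted regression with a free intercept (IVW forces the intercept to zero)
  fit   <- lm(by ~ bx, weights = 1 / se_y^2)
  coefs <- summary(fit)$coefficients
  list(
    intercept   = unname(coefs["(Intercept)", "Estimate"]),
    intercept_p = unname(coefs["(Intercept)", "Pr(>|t|)"]),
    slope       = unname(coefs["bx", "Estimate"])
  )
}

# Toy data: causal slope 0.5 plus a constant pleiotropic offset of 0.02
bx_raw <- c(-0.10, 0.15, 0.20, -0.12, 0.08, 0.25)
noise  <- c(0.001, -0.001, 0.0005, -0.0005, 0.001, -0.001)
by_raw <- sign(bx_raw) * (0.02 + 0.5 * abs(bx_raw) + noise)
res <- egger_sketch(bx_raw, by_raw, se_y = rep(0.01, 6))
print(res)
```

A non-zero intercept indicates that the average pleiotropic effect across SNPs is not zero, which is the directional pleiotropy that the intercept test is designed to flag.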

3.3 Advanced Robustness Models

MR-PRESSO: Run if Cochran's Q is significant. It detects and removes specific outlier SNPs driving horizontal pleiotropy and provides an outlier-corrected estimate.
MR-RAPS: Run if Mean F < 10 or if loose inclusion criteria were used. It handles measurement error from weak instruments.
CAUSE: Run as the ultimate robustness check to differentiate true causality from correlated pleiotropy (shared genetic architecture).

3.4 Calculating I-squared GX for MR-Egger

To determine whether the MR-Egger slope is reliable, test the No Measurement Error (NOME) assumption by calculating $I^{2}_{GX}$. If $I^{2}_{GX} < 0.90$, the MR-Egger slope suffers from regression dilution bias (attenuation toward the null) and should not be trusted.

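# R script for calculating I^2_GX from a harmonised TwoSampleMR data frame ('dat')

calculate_i2gx <- function(dat) {
  # MR-Egger orients every SNP to a positive exposure effect, so use absolute values
  beta_x <- abs(dat$beta.exposure)
  se_x <- dat$se.exposure

  # Observed variance of the (oriented) exposure estimates
  var_beta_x <- var(beta_x)

  # Mean of the squared standard errors (average sampling variance)
  mean_se_x2 <- mean(se_x^2)

  # Unweighted approximation: share of observed variance not due to sampling error
  I2gx <- 1 - (mean_se_x2 / var_beta_x)

  # Truncate negative values at 0 (the ratio is non-negative, so I2gx cannot exceed 1)
  I2gx <- max(0, I2gx)

  return(I2gx)
}

i2gx_value <- calculate_i2gx(dat)
print(paste("I^2_GX =", round(i2gx_value, 3)))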

4. Decision Matrix: Which Results to Report

Use the following adjudication criteria to synthesize the results from the models above.

Scenario	Data Presentation	Adjudicated Result to Report
1. Clean Signal	F > 10. Cochran's Q p > 0.05. Egger Intercept p > 0.05. CAUSE Causal Model p < 0.05.	IVW (Fixed Effects). All assumptions are met.
2. Balanced Pleiotropy	F > 10. Cochran's Q p < 0.05. Egger Intercept p > 0.05.	MR-PRESSO (Outlier Corrected) or IVW (Random Effects). Support with Weighted Median.
3. Weak Instruments	Mean F < 10 OR Relaxed criteria (p < 5e-6) used.	MR-RAPS. Baseline IVW is vulnerable to weak instrument bias.
4. Directional Pleiotropy	Egger Intercept p < 0.05.	Weighted Mode (if I2GX < 0.90) OR MR-Egger Slope (if I2GX >= 0.90). IVW is discarded.
5. Correlated Pleiotropy	CAUSE model comparison p > 0.05 (Sharing Model wins).	Null Result (Discard Causality). The association is more consistent with shared genetics than with a direct causal pathway. Any significant IVW results should be treated as likely false positives.
6. Inconclusive	MR-PRESSO removes >30% of SNPs causing power loss, OR Median and Mode significantly contradict each other.	Unresolved / Inconclusive. State that genetic evidence is too pleiotropic to reliably disentangle.
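
The matrix can also be encoded as a small function so that adjudication is applied consistently across analyses. The sketch below is hypothetical: the function name `adjudicate`, its arguments, and in particular the order in which the scenarios are checked are assumptions, since the matrix does not rank the scenarios against each other. When CAUSE has not been run (`cause_p = NA`), the CAUSE conditions are skipped.

```r
# Hypothetical encoding of the decision matrix. The order in which scenarios are
# checked is an assumption; the matrix itself does not rank them.
adjudicate <- function(mean_f, relaxed, q_p, egger_p, i2gx,
                       cause_p = NA, presso_removed = 0, median_mode_conflict = FALSE) {
  # Scenario 6: too many outliers removed, or Median and Mode contradict each other
  if (presso_removed > 0.30 || median_mode_conflict) return("Unresolved / Inconclusive")
  # Scenario 5: CAUSE does not favour the causal model
  if (!is.na(cause_p) && cause_p > 0.05) return("Null result (correlated pleiotropy)")
  # Scenario 3: weak instruments
  if (mean_f < 10 || relaxed) return("MR-RAPS")
  # Scenario 4: directional pleiotropy
  if (egger_p < 0.05) {
    return(if (i2gx >= 0.90) "MR-Egger slope" else "Weighted Mode")
  }
  # Scenario 2: balanced pleiotropy
  if (q_p < 0.05) return("MR-PRESSO (outlier corrected) or IVW (random effects)")
  # Scenario 1: clean signal
  "IVW (fixed effects)"
}

# Example: strong instruments, no heterogeneity, no directional pleiotropy
print(adjudicate(mean_f = 50, relaxed = FALSE, q_p = 0.40, egger_p = 0.60,
                 i2gx = 0.97, cause_p = 0.01))
```

Checking the inconclusive and correlated-pleiotropy scenarios first is a conservative choice, because either one overrides an otherwise clean-looking IVW result.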
