Statistical Challenges in Reproductive Research

Analyzing reproductive research poses numerous challenges, both to those conducting studies and to those interpreting them. Some challenges are universal to all research, such as distinguishing causation from association. Others are specific to reproductive research and are not found in many other health-related fields: the pitfalls of analyzing implantation rate, the large number of available outcome endpoints, and the plethora of available modeling parameters. Finally, much of the data generated during reproductive studies is subtly non-independent because of how treatment is rendered to patients, and this must be accounted for. This blog examines some of the more important and unique challenges in reproductive research study design and data analysis.

The Problem of Multiple Comparisons and Multiplicity

Government regulatory agencies overseeing drug approval have started to address the widespread problem of multiple comparisons of outcomes in clinical trials, which is referred to as multiplicity. Trials that include an excessive number of comparisons suffer from an inflated Type I error rate, which leads to false-positive findings. Regulatory agencies, such as the Food and Drug Administration in the United States, state that the Type I error rate needs to be strictly controlled not only for the primary outcome, but also for the secondary outcomes of a trial.

Assisted Reproductive Technology (ART) studies are prone to this error because of the large amount of data collected on patients during treatment cycles. When the P value threshold is held at the traditional 0.05 level without accounting for multiple comparisons, the probability of finding at least one spurious association increases with every additional comparison performed. This common error often stems from the desire to assess all potential outcomes, in the belief that doing so makes the study more complete. Unfortunately, it has the opposite effect and unintentionally increases the rate of false-positive results. Outcomes in reproductive research can be numerous, and commonly include the number of oocytes, mature oocytes retrieved, percent oocyte maturation, fertilization, embryology grades on each day of assessment, blastulation, supernumerary embryos, implantation, clinical pregnancy, ongoing pregnancy, and live birth. Additionally, analysis by age group is commonly performed in reproductive studies. This also leads to multiplicity and should be handled accordingly.
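The size of this inflation can be made concrete with a short calculation. The sketch below assumes that every null hypothesis is true and that the comparisons are independent, in which case the chance of at least one false positive across m comparisons is 1 - (1 - alpha)^m. ART outcomes are typically correlated, so the real inflation will differ from these figures; the function name and the example counts are illustrative only.

```python
def familywise_error_rate(alpha: float, n_comparisons: int) -> float:
    """Probability of at least one false positive across independent tests.

    Assumes every null hypothesis is true and the tests are independent,
    which is a simplification for correlated ART outcomes.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    if n_comparisons < 1:
        raise ValueError("n_comparisons must be at least 1")
    return 1.0 - (1.0 - alpha) ** n_comparisons


if __name__ == "__main__":
    # Illustrative numbers of outcomes, each tested at the unadjusted 0.05 level.
    for m in (1, 5, 11, 20):
        rate = familywise_error_rate(0.05, m)
        print(f"{m:>2} comparisons: chance of a false positive = {rate:.3f}")
```

Under these assumptions, testing eleven outcomes (about the number listed above) at an unadjusted 0.05 level gives roughly a 43% chance of at least one false-positive result.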

One way to control the risks associated with multiple comparisons is to define a single, clear primary outcome measure a priori. Multiplicity among any secondary outcomes can then be accounted for with one of the traditional statistical methods (Bonferroni, Holm, or Hochberg) or with more advanced methods. Applying these methods reduces the risk of falsely concluding that an effect exists when it does not.
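As a minimal sketch of the three traditional adjustments named above, the functions below compute adjusted P values that can be compared directly with the chosen threshold (for example 0.05). The P values in the example are hypothetical, and the function names are illustrative. Note that the Hochberg procedure is generally justified only when the test statistics are independent or positively dependent.

```python
from typing import List, Sequence


def bonferroni(pvals: Sequence[float]) -> List[float]:
    """Multiply each P value by the number of tests, capped at 1."""
    m = len(pvals)
    return [min(1.0, p * m) for p in pvals]


def holm(pvals: Sequence[float]) -> List[float]:
    """Holm step-down adjustment (valid under any dependence structure)."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])  # smallest P value first
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * pvals[idx])
        running_max = max(running_max, candidate)  # keep adjusted values monotone
        adjusted[idx] = running_max
    return adjusted


def hochberg(pvals: Sequence[float]) -> List[float]:
    """Hochberg step-up adjustment (assumes independence or positive dependence)."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i], reverse=True)  # largest first
    adjusted = [0.0] * m
    running_min = 1.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (rank + 1) * pvals[idx])
        running_min = min(running_min, candidate)  # keep adjusted values monotone
        adjusted[idx] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical unadjusted P values for four secondary outcomes.
    secondary = [0.01, 0.04, 0.03, 0.005]
    print("Bonferroni:", bonferroni(secondary))
    print("Holm:      ", holm(secondary))
    print("Hochberg:  ", hochberg(secondary))
```

Bonferroni is the most conservative of the three; Holm is never less powerful than Bonferroni, and Hochberg is never less powerful than Holm when its dependence assumption holds.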

Assessing Implantation as an Outcome Can Be Challenging

Implantation is a common endpoint in ART studies, and it is a reasonable secondary endpoint in some situations. There are several potential issues with using implantation as a study outcome that need to be carefully considered before conducting the analysis. First, live birth and sustained implantation are far more clinically relevant than implantation alone.

Second, multiple embryos transferred to the same patient are not necessarily independent events. The fates of embryos transferred to the same uterus tend to be similar because they share the same uterine environment. This dependence violates one of the key assumptions of many statistical tests. Therefore, unless a single embryo is transferred to the uterus, analysis of implantation requires advanced statistical methods that account for the dependence of events, such as generalized estimating equations (GEE) and mixed-effects models. In summary, implantation, especially sustained implantation, can be included in studies, but any assumptions made about dependence, and the reasons for using implantation instead of live birth, should be examined carefully.
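To give a rough sense of why this dependence matters, the sketch below uses the standard design effect for equal-sized clusters, 1 + (k - 1) * ICC, where k is the number of embryos per transfer and ICC is the intraclass correlation between embryos in the same patient. This is a simplified stand-in for illustration, not a GEE or mixed-effects analysis, and the ICC value and embryo counts are hypothetical.

```python
def design_effect(embryos_per_transfer: int, icc: float) -> float:
    """Variance inflation from clustering, assuming equal cluster sizes."""
    if embryos_per_transfer < 1:
        raise ValueError("embryos_per_transfer must be at least 1")
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must be between 0 and 1")
    return 1.0 + (embryos_per_transfer - 1) * icc


def effective_sample_size(n_embryos: int, embryos_per_transfer: int,
                          icc: float) -> float:
    """Number of independent embryos carrying the same information."""
    return n_embryos / design_effect(embryos_per_transfer, icc)


if __name__ == "__main__":
    # Hypothetical study: 200 embryos transferred, ICC of 0.3 within a patient.
    for k in (1, 2, 3):
        n_eff = effective_sample_size(200, k, 0.3)
        print(f"{k} embryo(s) per transfer: effective sample size = {n_eff:.1f}")
```

With single embryo transfer the design effect is 1 and no correction is needed; with more embryos per transfer, treating each embryo as independent overstates the amount of information and understates the standard errors.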

Adjusting for Female Age

Female age is a powerful predictor of reproductive study outcomes. The effects of female age are generally greater than those of many interventions that are studied. One way to handle this is to stratify the data by age group. Stratification is a simple and intuitive method, but a more powerful method of “adjusting” for female age is to include it in a regression model along with the intervention. For the dichotomous outcome of live birth, this can be done by including age in a logistic regression model.
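The following sketch shows what “adjusting” means in such a model: the log-odds of live birth are written as an intercept plus an age term plus an intervention term, so the intervention's odds ratio is the same at every age. The coefficients are hypothetical values chosen only for illustration, not estimates from any study, and the function names are assumptions.

```python
import math


def live_birth_probability(age: float, treated: bool, intercept: float,
                           beta_age: float, beta_treatment: float) -> float:
    """Predicted probability from a logistic model with a linear age term."""
    log_odds = intercept + beta_age * age + beta_treatment * (1.0 if treated else 0.0)
    return 1.0 / (1.0 + math.exp(-log_odds))


def odds(probability: float) -> float:
    """Convert a probability to odds."""
    return probability / (1.0 - probability)


# Hypothetical coefficients for illustration only.
INTERCEPT, BETA_AGE, BETA_TREATMENT = 3.0, -0.1, 0.4

if __name__ == "__main__":
    for age in (30, 38):
        p_control = live_birth_probability(age, False, INTERCEPT, BETA_AGE, BETA_TREATMENT)
        p_treated = live_birth_probability(age, True, INTERCEPT, BETA_AGE, BETA_TREATMENT)
        ratio = odds(p_treated) / odds(p_control)
        print(f"age {age}: control {p_control:.3f}, treated {p_treated:.3f}, "
              f"odds ratio {ratio:.3f}")
```

The predicted probabilities change with age, but the odds ratio for the intervention equals exp(beta_treatment) at both ages, which is the age-adjusted effect the model reports.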

Unfortunately, adding female age to a model as a single linear predictor is not always enough. A simple linear predictor is inadequate because the log-odds of live birth are not linearly related to female age, and logistic regression assumes a linear relationship between each continuous predictor and the log-odds of the outcome. The error is smaller when the age range is narrow, because a non-linear effect can be approximated by a line over a short interval. Currently, many studies include a wide range of ages spanning the full reproductive window of a woman’s life, so the non-linearity needs to be accounted for.

To account for the non-linearity in age, a transformation of the data or a non-linear modeling method may be used. For female age with respect to pregnancy outcomes, suitable options include piecewise linear modeling of female age or a non-linear transformation of the patient’s age. Properly modeling female age is critical: when the residual (unmodeled) effect of age is larger than the effect of an intervention, a significant effect of that intervention is unlikely to be found. This occurs because the modeling inaccuracy appears as noise in the final model, which leads to larger standard errors and wider confidence intervals and reduces the ability to detect a statistically significant effect of the intervention. This becomes even more problematic as the room for improvement in ART outcomes becomes smaller.
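One common way to set up piecewise linear modeling is to add “hinge” terms of the form max(0, age - knot) alongside age itself, so that the slope on the log-odds scale is allowed to change at each knot. The sketch below builds those terms; the knots at 35 and 40 and the coefficients are hypothetical and would need to be chosen and estimated for a real data set.

```python
from typing import List, Sequence


def piecewise_age_terms(age: float, knots: Sequence[float]) -> List[float]:
    """Return [age, max(0, age - k1), max(0, age - k2), ...] for the given knots."""
    return [float(age)] + [max(0.0, float(age) - k) for k in knots]


def linear_predictor(age: float, knots: Sequence[float], intercept: float,
                     coefficients: Sequence[float]) -> float:
    """Log-odds contribution of age under a piecewise linear model."""
    terms = piecewise_age_terms(age, knots)
    if len(coefficients) != len(terms):
        raise ValueError("need one coefficient for age plus one per knot")
    return intercept + sum(c * t for c, t in zip(coefficients, terms))


if __name__ == "__main__":
    knots = (35.0, 40.0)            # hypothetical knot locations
    coefs = (-0.02, -0.15, -0.20)   # hypothetical: decline steepens at each knot
    for age in (30, 37, 42):
        print(age, piecewise_age_terms(age, knots),
              round(linear_predictor(age, knots, 1.0, coefs), 3))
```

These terms would then be supplied to the logistic regression in place of the single age column, so the fitted slope before the first knot is the first coefficient and each later segment adds its own coefficient to the slope.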


In conclusion, studies in the field of reproductive research are at risk of numerous study design and statistical pitfalls.  By carefully considering analyses a priori and adhering to fundamental statistical rules, the plethora of data generated from reproductive studies can be analyzed in a rigorous fashion.


Author: George Patounakis