#renv::install("PowerTOST")
library(PowerTOST)
library(knitr)
library(data.table)
library(purrr)
The least ambiguous requirements are stated in the FDA Guidance for Industry, Statistical Approaches to Establishing Bioequivalence:
Sample sizes for average BE should be obtained using published formulas. Sample sizes for population and individual BE should be based on simulated data. The simulations should be conducted using a default situation allowing the two formulations to vary as much as 5% in average BA with equal variances and certain magnitude of subject-by-formulation interaction. The study should have 80 or 90% power to conclude BE between these two formulations. Sample size also depends on the magnitude of variability and the design of the study. Variance estimates to determine the number of subjects for a specific drug can be obtained from the biomedical literature and/or pilot studies.
An appropriate method is described in Diletti E, Hauschke D, Steinijans VW. Sample Size Determination for Bioequivalence Assessment by Means of Confidence Intervals. Int J Clin Pharmacol Ther Toxicol. 1991;29(1):1–8
and implemented in the R package PowerTOST, with one clarification: it is an iterative procedure (power is re-evaluated for candidate sample sizes until the target power is reached) rather than a single closed-form calculation.
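To make "iterative" concrete, here is a minimal base R sketch of such a search for a balanced 2x2 crossover. It is not PowerTOST's implementation: the power function uses a noncentral t approximation chosen for illustration, and `approx_power_tost()` and `search_n()` are hypothetical helper names.

```r
# Approximate TOST power for a balanced 2x2 crossover on the log scale.
# NOTE: noncentral-t approximation, for illustration only; this is an
# assumption of the sketch, not the method PowerTOST uses by default.
approx_power_tost <- function(n, CV, theta0 = 0.95, theta1 = 0.8,
                              theta2 = 1.25, alpha = 0.05) {
  mse  <- log(1 + CV^2)        # within-subject variance on the log scale
  se   <- sqrt(2 * mse / n)    # SE of the T - R difference (n = total)
  df   <- n - 2
  tval <- qt(1 - alpha, df)
  d1   <- (log(theta0) - log(theta1)) / se
  d2   <- (log(theta0) - log(theta2)) / se
  max(0, pt(-tval, df, ncp = d2) - pt(tval, df, ncp = d1))
}

# Hypothetical helper: increase n in steps of 2 (keeping both sequences
# balanced) until the approximate power reaches the target.
search_n <- function(CV, targetpower = 0.8, n_start = 4, n_max = 1000, ...) {
  n <- n_start
  while (approx_power_tost(n, CV, ...) < targetpower) {
    n <- n + 2
    if (n > n_max) stop("target power not reached within n_max")
  }
  c(n = n, power = approx_power_tost(n, CV, ...))
}

res <- search_n(CV = 0.3, targetpower = 0.8)
res
```

The point is only the shape of the procedure: no formula returns n directly, so power is recomputed for each candidate n until the target is met.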
The sampleN.TOST() function can calculate the sample size for different designs:
no | design | df | df2 | steps | bk | bknif | bkni | name |
---|---|---|---|---|---|---|---|---|
0 | parallel | n-2 | n-2 | 2 | 4.0 | 1/1 | 1.0000000 | 2 parallel groups |
1 | 2x2 | n-2 | n-2 | 2 | 2.0 | 1/2 | 0.5000000 | 2x2 crossover |
1 | 2x2x2 | n-2 | n-2 | 2 | 2.0 | 1/2 | 0.5000000 | 2x2x2 crossover |
2 | 3x3 | 2*n-4 | n-3 | 3 | 2.0 | 2/9 | 0.2222222 | 3x3 crossover |
3 | 3x6x3 | 2*n-4 | n-6 | 6 | 2.0 | 1/18 | 0.0555556 | 3x6x3 crossover |
4 | 4x4 | 3*n-6 | n-4 | 4 | 2.0 | 1/8 | 0.1250000 | 4x4 crossover |
5 | 2x2x3 | 2*n-3 | n-2 | 2 | 1.5 | 3/8 | 0.3750000 | 2x2x3 replicate crossover |
6 | 2x2x4 | 3*n-4 | n-2 | 2 | 1.0 | 1/4 | 0.2500000 | 2x2x4 replicate crossover |
7 | 2x4x4 | 3*n-4 | n-4 | 4 | 1.0 | 1/16 | 0.0625000 | 2x4x4 replicate crossover |
9 | 2x3x3 | 2*n-3 | n-3 | 3 | 1.5 | 1/6 | 0.1666667 | partial replicate (2x3x3) |
10 | 2x4x2 | n-2 | n-2 | 4 | 8.0 | 1/2 | 0.5000000 | Balaam’s (2x4x2) |
11 | 2x2x2r | 3*n-2 | n-2 | 2 | 1.0 | 1/4 | 0.2500000 | Liu’s 2x2x2 repeated x-over |
100 | paired | n-1 | n-1 | 1 | 2.0 | 2/1 | 2.0000000 | paired means |
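The table above has the same columns as the data frame returned by PowerTOST's known.designs(). Assuming that is how it was produced, a minimal sketch to list the designs yourself would be:

```r
library(PowerTOST)

# Data frame of supported designs with their degrees of freedom (df, df2)
# and design constants (bk, bknif, bkni).
designs <- known.designs()
print(designs)
```

The value in the design column is what gets passed as the design argument of sampleN.TOST().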
Basic usage: we specify targetpower (the minimum power to achieve, e.g. 0.8 or 0.9), theta0 (the assumed T/R ratio when logscale = TRUE, which is the convenient default) and CV (the coefficient of variation, given as a ratio rather than a percentage when logscale = TRUE).
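The calls behind the two reports that follow are not shown, so the sketch below reconstructs them from the values printed in each report (design, target power, true ratio, CV). Treat the exact calls as an assumption; the object names `res_2x2` and `res_rep` are hypothetical.

```r
library(PowerTOST)

# 2x2 crossover is the default design; alpha, theta1 and theta2 keep
# their defaults (0.05, 0.8, 1.25).
res_2x2 <- sampleN.TOST(targetpower = 0.8, theta0 = 0.95, CV = 0.3)

# Four-period full replicate design with a higher target power.
res_rep <- sampleN.TOST(targetpower = 0.9, theta0 = 0.98, CV = 0.24,
                        design = "2x2x4")
```

With print = TRUE (the default) each call prints a report; the returned data frame holds the same numbers for further processing.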
+++++++++++ Equivalence test - TOST +++++++++++
Sample size estimation
-----------------------------------------------
Study design: 2x2 crossover
log-transformed data (multiplicative model)
alpha = 0.05, target power = 0.8
BE margins = 0.8 ... 1.25
True ratio = 0.95, CV = 0.3
Sample size (total)
n power
40 0.815845
+++++++++++ Equivalence test - TOST +++++++++++
Sample size estimation
-----------------------------------------------
Study design: 2x2x4 (4 period full replicate)
log-transformed data (multiplicative model)
alpha = 0.05, target power = 0.9
BE margins = 0.8 ... 1.25
True ratio = 0.98, CV = 0.24
Sample size (total)
n power
14 0.917492
Note that the total (not per-sequence) sample size is reported.
alpha (the one-sided significance level, default 0.05) almost never needs to be changed. theta1 (the lower bioequivalence limit) and theta2 (the upper bioequivalence limit) can be changed for non-standard bioequivalence limits, e.g. for narrow therapeutic index drugs.
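As an illustration of non-standard limits, the sketch below tightens the acceptance range to 0.90 ... 1/0.90. The CV and theta0 values are hypothetical and chosen only to make the call concrete.

```r
library(PowerTOST)

# Hypothetical inputs: CV = 0.10 and theta0 = 0.975 are illustrative only.
# theta1 = 0.90 with theta2 = 1 / theta1 gives limits symmetric on the
# log scale (about 0.90 ... 1.1111).
res_narrow <- sampleN.TOST(targetpower = 0.8, theta0 = 0.975, CV = 0.10,
                           theta1 = 0.90, theta2 = 1 / 0.90,
                           design = "2x2", print = FALSE)
res_narrow[, c("theta1", "theta2", "Sample size", "Achieved power")]
```

Narrower limits leave less room around theta0, so for the same CV the required sample size is expected to be at least as large as with the standard 0.8 ... 1.25 limits.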
Reproducing Table 1 from the FDA Guidance for Industry, Statistical Approaches to Establishing Bioequivalence is quite tricky because it involves one more parameter: the subject-by-formulation interaction variance component, \(\sigma_D^2\).
\[\sigma_D^2=(\sigma_{BT}-\sigma_{BR})^2+2\times(1-\rho)\times\sigma_{BT}\times\sigma_{BR}\]
where \(\sigma_{BT}^2\) and \(\sigma_{BR}^2\) are the between-subject variances for the T and R formulations, respectively, and \(\rho\) is the correlation between the subject-specific means \(\mu_{Tj}\) and \(\mu_{Rj}\). These parameters are rarely reported in publications and cannot be estimated from CI boundaries and sample size. In the absence of this information one can assume \(\sigma_{BT}=\sigma_{BR}\) as well as \(\rho=1\). Under these reasonable assumptions \(\sigma_D^2=\sigma_D=0\), so the sampleN.TOST() calculation should be correct.
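The formula is easy to check numerically. The helper below, `sigma_d2()` (a hypothetical name), is a direct base R transcription, with arbitrary input values used only for illustration.

```r
# Subject-by-formulation interaction variance, transcribed from the formula.
# sigma_bt, sigma_br: between-subject SDs for T and R; rho: their correlation.
sigma_d2 <- function(sigma_bt, sigma_br, rho) {
  stopifnot(rho >= -1, rho <= 1)
  (sigma_bt - sigma_br)^2 + 2 * (1 - rho) * sigma_bt * sigma_br
}

sigma_d2(0.2, 0.2, rho = 1)    # equal SDs, perfect correlation: 0
sigma_d2(0.2, 0.2, rho = 0.9)  # 2 * 0.1 * 0.04 = 0.008
```

Both terms vanish only when the between-subject SDs are equal and the correlation is exactly 1, which is the case assumed in the text.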
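# Scenarios to compare with the FDA table: 80% and 90% power, 5% difference in means
targetpower <- c(0.8, 0.9)
theta0 <- 1 - 0.05
CV <- c(0.15, 0.23, 0.3, 0.5)
# 2x2x2 = two-period crossover, 2x2x4 = four-period full replicate
design <- c("2x2x2", "2x2x4")
# Cross join: one row per combination; column names match sampleN.TOST() arguments
dt <- CJ(CV, targetpower, design, theta0)
# Call sampleN.TOST() once per row, suppressing the printed report
sample_size <- purrr::pmap(dt, sampleN.TOST, print = FALSE)
kable(rbindlist(sample_size))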
Design | alpha | CV | theta0 | theta1 | theta2 | Sample size | Achieved power | Target power |
---|---|---|---|---|---|---|---|---|
2x2x2 | 0.05 | 0.15 | 0.95 | 0.8 | 1.25 | 12 | 0.8305164 | 0.8 |
2x2x4 | 0.05 | 0.15 | 0.95 | 0.8 | 1.25 | 6 | 0.8458307 | 0.8 |
2x2x2 | 0.05 | 0.15 | 0.95 | 0.8 | 1.25 | 16 | 0.9260211 | 0.9 |
2x2x4 | 0.05 | 0.15 | 0.95 | 0.8 | 1.25 | 8 | 0.9328881 | 0.9 |
2x2x2 | 0.05 | 0.23 | 0.95 | 0.8 | 1.25 | 24 | 0.8066535 | 0.8 |
2x2x4 | 0.05 | 0.23 | 0.95 | 0.8 | 1.25 | 12 | 0.8143816 | 0.8 |
2x2x2 | 0.05 | 0.23 | 0.95 | 0.8 | 1.25 | 32 | 0.9044320 | 0.9 |
2x2x4 | 0.05 | 0.23 | 0.95 | 0.8 | 1.25 | 16 | 0.9082552 | 0.9 |
2x2x2 | 0.05 | 0.30 | 0.95 | 0.8 | 1.25 | 40 | 0.8158453 | 0.8 |
2x2x4 | 0.05 | 0.30 | 0.95 | 0.8 | 1.25 | 20 | 0.8202398 | 0.8 |
2x2x2 | 0.05 | 0.30 | 0.95 | 0.8 | 1.25 | 52 | 0.9019652 | 0.9 |
2x2x4 | 0.05 | 0.30 | 0.95 | 0.8 | 1.25 | 26 | 0.9043064 | 0.9 |
2x2x2 | 0.05 | 0.50 | 0.95 | 0.8 | 1.25 | 98 | 0.8032172 | 0.8 |
2x2x4 | 0.05 | 0.50 | 0.95 | 0.8 | 1.25 | 50 | 0.8128063 | 0.8 |
2x2x2 | 0.05 | 0.50 | 0.95 | 0.8 | 1.25 | 132 | 0.9012316 | 0.9 |
2x2x4 | 0.05 | 0.50 | 0.95 | 0.8 | 1.25 | 66 | 0.9021398 | 0.9 |
As we can see, the calculated values equal the reference ones for the smallest \(\sigma_D=0.01\) when CV=0.15 or CV=0.23. For CV=0.30 at 80% power the sample sizes also agree, but for the remaining parameter combinations the calculated sample sizes are smaller than the reference ones.
Conclusion: we can trust sampleN.TOST() within limits. For CV less than or equal to 0.30 at 80% power and for CV less than or equal to 0.23 at 90% power it can be considered validated against the reference table from the FDA guidance.
CV can be calculated from CI boundaries and sample size if only these values are available:
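PowerTOST provides CI2CV() for this purpose. The code originally shown here is not preserved, so the sketch below instead spells out the underlying arithmetic in base R for a balanced 2x2 crossover with a 90% CI. The helper name `cv_from_ci()` and all input values are hypothetical.

```r
# Back-calculate CV from a 90% CI of the T/R ratio (balanced 2x2 crossover).
cv_from_ci <- function(lower, upper, n, alpha = 0.05) {
  df        <- n - 2
  halfwidth <- (log(upper) - log(lower)) / 2   # CI half-width, log scale
  se        <- halfwidth / qt(1 - alpha, df)   # SE of the T - R difference
  mse       <- se^2 * n / 2                    # since se = sqrt(2 * mse / n)
  sqrt(exp(mse) - 1)                           # CV from log-scale variance
}

# Round trip with hypothetical values: build a CI from a known CV,
# then recover that CV from the CI limits and n.
cv_true <- 0.30
n       <- 24
pe      <- 0.95
se      <- sqrt(2 * log(1 + cv_true^2) / n)
ci      <- exp(log(pe) + c(-1, 1) * qt(0.95, n - 2) * se)
cv_back <- cv_from_ci(lower = ci[1], upper = ci[2], n = n)
cv_back  # recovers cv_true up to floating-point rounding
```

The point estimate is not needed because only the CI width on the log scale carries information about the variance. Other designs change the degrees of freedom and the design constant, so this sketch should not be reused for them as is.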
R version 4.4.3 (2025-02-28)
Platform: x86_64-pc-linux-gnu
Running under: Ubuntu 24.04.2 LTS
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so; LAPACK version 3.12.0
locale:
[1] LC_CTYPE=C.UTF-8 LC_NUMERIC=C LC_TIME=C.UTF-8
[4] LC_COLLATE=C.UTF-8 LC_MONETARY=C.UTF-8 LC_MESSAGES=C.UTF-8
[7] LC_PAPER=C.UTF-8 LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C
time zone: Europe/London
tzcode source: system (glibc)
attached base packages:
[1] stats graphics grDevices datasets utils methods base
other attached packages:
[1] purrr_1.0.2 data.table_1.16.0 knitr_1.48 PowerTOST_1.5-6
loaded via a namespace (and not attached):
[1] digest_0.6.37 cubature_2.1.1 fastmap_1.2.0 xfun_0.48
[5] magrittr_2.0.3 htmltools_0.5.8.1 rmarkdown_2.28 lifecycle_1.0.4
[9] mvtnorm_1.3-1 cli_3.6.3 vctrs_0.6.5 renv_1.0.10
[13] compiler_4.4.3 tools_4.4.3 evaluate_1.0.0 Rcpp_1.0.13
[17] yaml_2.3.10 rlang_1.1.4 jsonlite_1.8.9 htmlwidgets_1.6.4