Comparing Testing Approaches under Non-Proportional Hazards.

1. Weighted Log-rank test (WLRT).

  • Peto-Peto.
  • Modified Peto-Peto.
  • Tarone-Ware.
  • Gehan-Breslow/Wilcoxon.
  • Fleming-Harrington.

2. Modestly Weighted Log-rank test.

  • Unstratified
  • Stratified

3. Max Combo test.

  • Unstratified
  • Stratified

Note on the Use of Custom R Functions:

The survminer package can perform the Peto-Peto, modified Peto-Peto, Tarone-Ware, and Gehan-Breslow/Wilcoxon tests, but it returns only p-values, not the corresponding test statistics. To obtain both, a custom R function from Kassambara’s GitHub repository can be used. This function implements weighted log-rank tests internally and supports the log-rank, Gehan-Breslow, Tarone-Ware, Peto-Peto, modified Peto-Peto, and Fleming-Harrington tests. It reproduces functionality previously available through survMisc::comp(), which is no longer available on CRAN. The custom functions used for each test in the R document were derived from this GitHub-based implementation; because the original function also supports k-group comparisons, only the sections relevant to pairwise comparisons were retained.

| Analysis | Supported in R | Supported in SAS | Match | Notes |
|---|---|---|---|---|
| WLRT: Peto-Peto | Yes | Yes | Yes | In R, survminer::surv_pvalue(method = "S1") computes the p-value for the Peto-Peto test, but survminer does not return the corresponding test statistic. The custom R function returns both the chi-square statistic and the p-value. Its results match those produced by SAS but are not consistent with those from coin::logrank_test() and survival::survdiff(). In SAS, this test is run with PROC LIFETEST, using a STRATA statement with the TEST=peto option. |
| WLRT: Modified Peto-Peto | Yes | Yes | Yes | In R, survminer::surv_pvalue(method = "S2") computes the p-value for the modified Peto-Peto test. The custom R function returns both the chi-square statistic and the p-value. In SAS, this test is run with PROC LIFETEST, using a STRATA statement with the TEST=modpeto option. |
| WLRT: Tarone-Ware | Yes | Yes | Yes | In R, coin::logrank_test() performs the Tarone-Ware test when type = "Tarone-Ware" is specified, and survminer::surv_pvalue(method = "sqrtN") computes the p-value for this test. Because survminer does not return the test statistic, the custom R function can be used to compute the chi-square statistic together with a p-value consistent with the method = "sqrtN" implementation. The results from the custom function and survminer agree with those produced in SAS using TEST=TaroneWare with the STRATA statement in PROC LIFETEST; the results from coin::logrank_test() do not match the SAS output. survminer::surv_pvalue() computes p-values from survfit objects by comparing survival curves; its default method is survdiff, which performs the standard log-rank test. |
| WLRT: Gehan-Breslow | Yes | Yes | Yes | In R, survminer::surv_pvalue(method = "n") and the custom R function yield the p-value for the Gehan-Breslow test, corresponding to its canonical weighting scheme. coin::logrank_test() performs this test when type = "Gehan-Breslow" is specified, but its results are not consistent with the SAS output. In SAS, the equivalent test is obtained by specifying TEST=Wilcoxon in PROC LIFETEST. |
| WLRT: Fleming-Harrington | Yes | Yes | Yes | In R, this test is computed with nphRCT::wlrt() and coin::logrank_test(). In SAS, specify test=FH(\(\rho,\gamma\)) in PROC LIFETEST. The results produced by nphRCT::wlrt() and PROC LIFETEST are consistent. |
| Max Combo | Yes | Yes | Yes | In R, the test can be run with nph::logrank.maxtest(), which defaults to a two-sided test. PROC LIFETEST has no built-in functionality for this test; it must be implemented via a SAS macro. Results from the R implementation and the SAS macro are similar. A stratified max-combo test can be performed in both R and SAS, and the choice of weighting can be modified, as demonstrated in the SAS file. |
| Modestly Weighted Log-rank | Yes | No | No | In R, this test can be performed with nphRCT::wlrt() by specifying either the t* or the s* parameter. Here, s* is the fixed survival probability threshold and t* is the time point at which the pooled survival probability reaches s* (see the referenced documentation for the definition of this test’s weight function). A stratified version can be run by including strata() in the model; it returns the test statistic for each stratum as well as the combined test statistic. To the knowledge of the CAMIS contributors, there is no direct implementation of this test in PROC LIFETEST. |
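
For orientation, the weight function mentioned in the Modestly Weighted Log-rank notes can be written out as below. This is a sketch based on the definition given in the nphRCT documentation listed in the references; the notation \(\hat{S}(t-)\) for the pooled Kaplan-Meier estimate just before time \(t\) is an assumption introduced here, not notation used elsewhere in this document.

\[
w(t) = \frac{1}{\max\{\hat{S}(t-),\ \hat{S}(t^*)\}}
\qquad \text{or, in terms of } s^*, \qquad
w(t) = \frac{1}{\max\{\hat{S}(t-),\ s^*\}}
\]

Under this definition the weights start at 1, increase as the pooled survival estimate falls, and are capped at \(1/s^*\). Setting \(s^* = 1\) (or \(t^* = 0\)) gives \(w(t) = 1\) for all \(t\), which is the standard log-rank test.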

Comparison Results.

\[H_0 : S_1(t) = S_2(t) \ \ \forall t \quad \text{vs.} \quad H_1 : S_1(t) \neq S_2(t) \ \text{ for some } t\]

Note: coin::logrank_test() uses the asymptotic distribution by default and reports a \(Z\) test statistic. Its square can be compared with a chi-square statistic with 1 degree of freedom: \(Z^2 = \chi^2_{(1)}\).
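
To compare a \(Z\) statistic with a chi-square statistic, it helps to put both on the same scale. The following is a minimal Python sketch (standard library only) of that conversion; the function names are hypothetical helpers, not part of any R or SAS package, and the two input values are the Peto-Peto results quoted in the comparison table.

```python
import math


def chisq1_from_z(z):
    """Square a Z statistic to put it on the chi-square (1 df) scale."""
    return z * z


def pvalue_from_chisq1(chisq):
    """Upper-tail probability of a chi-square variable with 1 df.

    Uses the identity P(X >= x) = erfc(sqrt(x / 2)), which equals the
    two-sided normal p-value for Z = sqrt(x).
    """
    return math.erfc(math.sqrt(chisq / 2.0))


# Values quoted in the comparison table for the Peto-Peto test.
z_coin = 3.0423      # Z from coin::logrank_test()
chisq_sas = 9.8238   # chi-square from PROC LIFETEST and the custom R function

print("coin Z squared:", chisq1_from_z(z_coin))
print("p-value from coin Z:", pvalue_from_chisq1(chisq1_from_z(z_coin)))
print("p-value from SAS chi-square:", pvalue_from_chisq1(chisq_sas))
```

Squaring the coin statistic gives a value below the SAS chi-square, so the mismatch reported in the table is a genuine difference in the statistic, not only a difference in scale. The two p-values should land close to the tabulated 0.0023 and 0.0017.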

| Test | Statistic | Function in R | R Result | Function in SAS | SAS Result | Match | Notes |
|---|---|---|---|---|---|---|---|
| WLRT: Peto-Peto | Chi-square | Custom R function | 9.8238 | PROC LIFETEST with STRATA group / test=peto | 9.8238 | Yes | survminer::surv_pvalue() prints only the p-value; the custom R function returns both the chi-square statistic and the corresponding p-value. |
| | | coin::logrank_test() (asymptotic) | Z = 3.0423 | | | No | |
| | | survival::survdiff() | 9.9000 | | | No | |
| | P-value | survminer::surv_pvalue() | 0.0017 | PROC LIFETEST with STRATA group / test=peto | 0.0017 | Yes | |
| | | Custom R function | 0.0017 | | | Yes | |
| | | coin::logrank_test() (asymptotic) | 0.0023 | | | No | |
| | | survival::survdiff() | 0.0020 | | | No | |
| WLRT: Modified Peto-Peto | Chi-square | Custom R function | 9.7491 | PROC LIFETEST with STRATA group / test=modpeto | 9.7491 | Yes | |
| | | coin::logrank_test() | Z = 3.0276 | | | No | |
| | P-value | survminer::surv_pvalue() | 0.0018 | PROC LIFETEST with STRATA group / test=modpeto | 0.0018 | Yes | |
| | | Custom R function | 0.0018 | | | Yes | |
| | | coin::logrank_test() | 0.002465 | | | No | |
| WLRT: Tarone-Ware | Chi-square | Custom R function | 9.4230 | PROC LIFETEST with STRATA group / test=taroneware | 9.4230 | Yes | |
| | | coin::logrank_test() | Z = 2.9636 | | | No | |
| | P-value | survminer::surv_pvalue() | 0.0021 | PROC LIFETEST with STRATA group / test=taroneware | 0.0021 | Yes | |
| | | Custom R function | 0.0021 | | | Yes | |
| | | coin::logrank_test() | 0.0030 | | | No | |
| WLRT: Gehan-Breslow/Wilcoxon | Chi-square | Custom R function | 8.2593 | PROC LIFETEST with STRATA group / test=wilcoxon | 8.2593 | Yes | |
| | | coin::logrank_test() | Z = 2.7863 | | | No | |
| | P-value | survminer::surv_pvalue() | 0.0041 | PROC LIFETEST with STRATA group / test=wilcoxon | 0.0041 | Yes | |
| | | Custom R function | 0.0041 | | | Yes | |
| | | coin::logrank_test() | 0.0053 | | | No | |
| WLRT: Fleming-Harrington | Chi-square | nphRCT::wlrt() | FH(0.5,0.5) = 10.3122; FH(1,1) = 9.8019; FH(0,1) = 9.5455; FH(0.5,2) = 8.32428; FH(1,0) = 9.9 | PROC LIFETEST with STRATA group / test=FH() | FH(0.5,0.5) = 10.3122; FH(1,1) = 9.8019; FH(0,1) = 9.5455; FH(0.5,2) = 8.32428; FH(1,0) = 9.9 | Yes | |
| | | coin::logrank_test() | FH(0.5,0.5): Z = 3.0582; FH(1,1): Z = 2.9720; FH(0,1): Z = 2.9256; FH(0.5,2): Z = 2.7163; FH(1,0): Z = 3.0423 | | | No | |
| | P-value | nphRCT::wlrt() | FH(0.5,0.5) = 0.0013; FH(1,1) = 0.0017; FH(0,1) = 0.0020; FH(0.5,2) = 0.0041; FH(1,0) = 0.0017 | PROC LIFETEST with STRATA group / test=FH() | FH(0.5,0.5) = 0.0013; FH(1,1) = 0.0017; FH(0,1) = 0.0020; FH(0.5,2) = 0.0041; FH(1,0) = 0.0017 | Yes | |
| | | coin::logrank_test() | FH(0.5,0.5) = 0.0022; FH(1,1) = 0.0030; FH(0,1) = 0.0034; FH(0.5,2) = 0.0066; FH(1,0) = 0.0023 | | | No | |
| Modestly Weighted Log-rank (unstratified) | Chi-square | nphRCT::wlrt() | 11.2786 | No direct implementation | N/A | No | nphRCT::wlrt() can perform modestly weighted log-rank tests and the Fleming-Harrington(\(\rho,\gamma\)) test, in addition to the standard log-rank test. |
| | P-value | nphRCT::wlrt() | 0.0008 | No direct implementation | N/A | No | |
| Modestly Weighted Log-rank (stratified) | Chi-square | nphRCT::wlrt() | strata1 = 7.5185; strata2 = 3.7418; Combined = 10.8359 | No direct implementation | N/A | No | |
| | P-value | nphRCT::wlrt() | strata1 = 0.0061; strata2 = 0.0531; Combined = 0.0010 | No direct implementation | N/A | No | |
| Max Combo (unstratified) | Test statistic | nph::logrank.maxtest() | 3.30 (Z statistic) | SAS macro | 3.30152 | Yes | In R, the test is two-sided by default unless specified otherwise. |
| | P-value | nph::logrank.maxtest() | 0.00196 \(\approx\) 0.0020; Bonferroni-adjusted p-value = 0.00385 | SAS macro | 0.0020 (the SAS macro does not perform a Bonferroni adjustment) | Yes | In R, logrank.maxtest() uses mvtnorm::GenzBretz(), the Genz-Bretz quasi-Monte Carlo algorithm, which is randomized; the unadjusted p-value therefore varies slightly between runs, and set.seed() is needed for reproducibility. The SAS macro uses a Monte Carlo simulation, x = RANDNORMAL(n, mean, corr2), which draws 5,000,000 samples from a k-dimensional multivariate normal distribution with a fixed seed (1). |
| Max Combo (stratified) | Test statistic | strata.MaxCombo::SMCtest() | Z1 = 3.1813 | SAS macro | 3.18132 | Yes | In R, the test returns several p-values, one per covariance estimator; the SAS macro returns a single combination test with one maximum Z statistic and one p-value. The first R p-value and z.max are the closest to the SAS macro results. |
| | | | Z2 = 3.1813 | | | Yes | |
| | | | Z3 = 3.2738 | | | No | |
| | P-value | strata.MaxCombo::SMCtest() | p1 = 0.0030 | SAS macro | 0.0034 | Closest; slight difference | The SAS macro was modified to accommodate a stratifying variable. The adjustments are documented in the SAS document. |
| | | | p2 = 0.0026 | | | No | |
| | | | p3 = 0.0021 | | | No | |
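
The notes on the unstratified Max Combo p-value describe a simulation-based calculation: the p-value is the probability that the largest absolute component of a multivariate normal vector exceeds the observed maximum statistic, and a fixed seed makes the estimate reproducible. The Python sketch below illustrates that idea with plain Monte Carlo sampling. It is not the Genz-Bretz algorithm used in R, nor the SAS macro; the function names, the correlation matrix, and the number of draws are all hypothetical choices made for illustration.

```python
import math
import random


def cholesky(a):
    """Lower-triangular L with L L^T = a, for a small positive-definite matrix."""
    k = len(a)
    low = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1):
            s = sum(low[i][m] * low[j][m] for m in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def maxcombo_pvalue(z_max, corr, n_draws=100_000, seed=1):
    """Monte Carlo estimate of the two-sided P(max_i |Z_i| >= z_max), Z ~ N(0, corr)."""
    rng = random.Random(seed)          # fixed seed -> reproducible estimate
    low = cholesky(corr)
    k = len(corr)
    hits = 0
    for _ in range(n_draws):
        e = [rng.gauss(0.0, 1.0) for _ in range(k)]
        # Correlate the independent draws: z = L e
        z = [sum(low[i][m] * e[m] for m in range(i + 1)) for i in range(k)]
        if max(abs(x) for x in z) >= z_max:
            hits += 1
    return hits / n_draws


# Hypothetical correlation matrix for three weighted log-rank statistics.
corr = [[1.0, 0.8, 0.6],
        [0.8, 1.0, 0.9],
        [0.6, 0.9, 1.0]]

print(maxcombo_pvalue(3.30, corr, seed=1))
print(maxcombo_pvalue(3.30, corr, seed=1))   # same seed, same estimate
print(maxcombo_pvalue(3.30, corr, seed=2))   # different seed, estimate may differ slightly
```

With a different seed the estimate changes in the later decimal places, which is the same behaviour the table notes describe for the R implementation when set.seed() is not used.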

Summary and Recommendation.

Combination tests of weighted log-rank statistics, such as Max Combo, are a robust alternative to a single weighted log-rank test for detecting differences between survival curves under non-proportional hazards. However, some authors urge caution: a combination test can produce statistically significant results that are not clinically meaningful. For instance, when treatment is uniformly worse than control, Max Combo can still have a high chance of rejecting the null hypothesis in favour of treatment. Magirr & Burman developed the modestly weighted log-rank test to counter these issues, especially for delayed effects; its weights are controlled so that a worse treatment effect at early time points is not rewarded.

The Wilcoxon test in the SAS documentation corresponds to the Gehan-Breslow test in R. To reproduce the SAS results for the Peto-Peto, modified Peto-Peto, Gehan-Breslow/Wilcoxon, and Tarone-Ware tests, use survminer::surv_pvalue() and the custom R function. Both compute these tests directly from their weight definitions; for example, the Gehan-Breslow/Wilcoxon weights are the size of the risk set and the Tarone-Ware weights are its square root. coin::logrank_test() and survival::survdiff() implement these tests differently, as discussed earlier. coin implements a general framework for conditional inference procedures, commonly known as permutation tests. survival::survdiff() uses a hypergeometric variance formulation for the Mantel-Cox log-rank test and implements the \(G^\rho\) family of tests.
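
To make the role of the weights concrete, here is a minimal Python sketch of a two-group weighted log-rank statistic with a hypergeometric variance. It is an illustration only, not the custom R function, survminer, or PROC LIFETEST. The function name and toy data are hypothetical, and the Peto-Peto, modified Peto-Peto, and Fleming-Harrington weights follow one common textbook convention, which is an assumption here; implementations differ in details such as whether the survival estimate is evaluated just before or at the event time.

```python
import math


def weighted_logrank(times, events, groups, weight="logrank", rho=0.0, gamma=0.0):
    """Two-group weighted log-rank chi-square (1 df) and p-value.

    times: follow-up times; events: 1 = event, 0 = censored; groups: 0 or 1.
    """
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    s_km = 1.0    # pooled Kaplan-Meier estimate just before the current time
    s_peto = 1.0  # Peto-Peto survival estimate, updated through the current time
    u = 0.0       # weighted sum of (observed - expected) events in group 1
    v = 0.0       # hypergeometric variance of u
    for t in event_times:
        n = sum(1 for x in times if x >= t)                      # at risk, pooled
        n1 = sum(1 for x, g in zip(times, groups) if x >= t and g == 1)
        d = sum(1 for x, e in zip(times, events) if x == t and e == 1)
        d1 = sum(1 for x, e, g in zip(times, events, groups)
                 if x == t and e == 1 and g == 1)
        s_peto *= 1.0 - d / (n + 1.0)
        if weight == "logrank":
            w = 1.0
        elif weight == "gehan":        # size of the risk set
            w = float(n)
        elif weight == "tarone":       # square root of the risk set
            w = math.sqrt(n)
        elif weight == "peto":         # assumed convention
            w = s_peto
        elif weight == "modpeto":      # assumed convention
            w = s_peto * n / (n + 1.0)
        elif weight == "fh":           # Fleming-Harrington(rho, gamma)
            w = s_km ** rho * (1.0 - s_km) ** gamma
        else:
            raise ValueError("unknown weight: " + weight)
        u += w * (d1 - d * n1 / n)
        if n > 1:
            v += w * w * d * (n1 / n) * (1.0 - n1 / n) * (n - d) / (n - 1.0)
        s_km *= 1.0 - d / n
    if v == 0.0:
        raise ValueError("zero variance: no informative event times")
    chi2 = u * u / v
    return chi2, math.erfc(math.sqrt(chi2 / 2.0))


# Hypothetical toy data: all events, group 0 fails first.
times = [1, 2, 3, 4, 5, 6]
events = [1, 1, 1, 1, 1, 1]
groups = [0, 0, 0, 1, 1, 1]
for name in ("logrank", "gehan", "tarone", "peto", "modpeto"):
    print(name, weighted_logrank(times, events, groups, weight=name))
print("fh(0,1)", weighted_logrank(times, events, groups, "fh", rho=0.0, gamma=1.0))
```

Only the line that sets `w` changes between tests; the observed-minus-expected sum and the variance are otherwise identical, which is why implementations built on the same weight definitions can agree exactly while a permutation-based framework can give different values.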

References.

  1. wlrt() documentation: https://search.r-project.org/CRAN/refmans/nphRCT/html/wlrt.html

  2. survdiff() documentation: https://www.rdocumentation.org/packages/survival/versions/3.8-3/topics/survdiff

  3. survminer package documentation: https://cran.r-project.org/web/packages/survminer/survminer.pdf

  4. logrank.maxtest() documentation: https://search.r-project.org/CRAN/refmans/nph/html/logrank.maxtest.html

  5. Robust modestly weighted log-rank tests documentation: https://arxiv.org/html/2412.14942v1

  6. nphRCT package documentation: https://cran.r-project.org/web/packages/nphRCT/nphRCT.pdf

  7. LIFETEST procedure documentation: https://documentation.sas.com/doc/en/statug/15.2/statug_lifetest_syntax01.htm

  8. Combination weighted log-rank tests documentation: https://support.sas.com/resources/papers/proceedings20/5062-2020.pdf

  9. LIFETEST procedure documentation: https://support.sas.com/documentation//cdl/en/statug/68162/HTML/default/viewer.htm#statug_lifetest_details16.htm

  10. Stratified modestly weighted log-rank test documentation: https://cran.r-project.org/web/packages/nphRCT/vignettes/weighted_log_rank_tests.html

  11. Stratified Max-Combo documentation: https://cran.r-project.org/web/packages/strata.MaxCombo/strata.MaxCombo.pdf

  12. Max-combo documentation: https://search.r-project.org/CRAN/refmans/nph/html/logrank.maxtest.html