Reconstructing survival data from published Kaplan-Meier curves: an algorithm validation
Background: The number of health-related publications has grown exponentially over the last decade. The cardiovascular field alone accounts for more than 670k papers indexed in MEDLINE [((cardiovascular) NOT ((("Animals"[Mesh] NOT ("Animals"[Mesh] AND "Humans"[Mesh])))], of which more than 20k relate to coronary artery bypass grafting (CABG). We are witnessing a 6% increase in publications per year. This growth in potential information is a challenge to manage; systematic reviews and meta-analyses are one way to raise the level of evidence drawn from such a vast amount of information. These approaches allow health professionals to synthesize the literature quickly, stay up to date on their topics of interest and, eventually, identify major gaps in the evidence. Meta-analysis of randomized controlled trials (RCTs) is considered one of the highest levels of evidence, since pooling the data of individual studies provides higher statistical power to detect differences, even for low-frequency events(1, 2).
Besides the common pitfalls of all systematic reviews, such as overlapping data, publication bias, a limited number of databases searched, and low quality and heterogeneity of the original studies included(2), the lack of easy access to data severely limits the use of the information present in many studies. Moreover, an overwhelming number of authors do not reply to requests from reviewers or the community for access to their papers' data. Those studies are then excluded from quantitative analysis for lack of information, which hampers the building of evidence. However, figures that convey relevant information are commonly published. More specifically, Kaplan-Meier curves estimating cumulative survival differences between study groups are usually provided, but occasionally without the measure of association, the hazard ratio (HR), which is highly relevant for evidence assessment. Where the HR is not provided, calculating the incidence rate ratio (IRR) could be an option when the number of deaths in each group is reported: IRR = (number of events in the exposed group / (patients at risk in the exposed group × follow-up time)) / (number of events in the control group / (patients at risk in the control group × follow-up time)). For RCTs this is not a constraining factor, but for observational studies it is likely to be a source of bias, since the results should be adjusted for potential confounders or unbalanced covariates. In these cases, the crude number of deaths in each group and the IRR calculation would provide an unadjusted and inconclusive analysis of the data. Adjusted HRs, or HRs obtained from previously matched (balanced) groups, are the best estimates to pool across observational studies.
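The IRR formula above can be illustrated with a short Python sketch. The function name and the counts used are hypothetical, chosen only to show the arithmetic; they are not taken from any study, and the result is a crude (unadjusted) ratio.

```python
def incidence_rate_ratio(events_exposed, n_exposed, followup_exposed,
                         events_control, n_control, followup_control):
    """Crude IRR: ratio of events per unit of person-time in each group."""
    rate_exposed = events_exposed / (n_exposed * followup_exposed)
    rate_control = events_control / (n_control * followup_control)
    return rate_exposed / rate_control


# Hypothetical counts for illustration only.
irr = incidence_rate_ratio(events_exposed=30, n_exposed=300, followup_exposed=5.0,
                           events_control=60, n_control=400, followup_control=5.0)
print(f"Crude IRR: {irr:.3f}")
```

Because nothing in this calculation accounts for covariates, it would only be a reasonable substitute for the HR when the groups are balanced, as in an RCT.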
Guyot et al.(3) developed an algorithm to reconstruct and analyse Kaplan-Meier curves, providing survival statistics. It also delivers the HR, although the authors pointed out that this was not the primary purpose of their algorithm. Indeed, their validation exercise showed excellent reproducibility and accuracy for reconstructed survival statistics, such as median survival and probability of survival, whereas HR reproducibility and accuracy were lower, mainly when less information about the number at risk and the number of events was available. The authors attribute the lower accuracy of the HR to the fact that it is a weighted average of ratios along the curve, whereas the other statistics are simple point estimates.
Our aim was to assess and validate, with our own data, the Guyot et al. algorithm for HR computation for a long-term survival outcome, and to estimate the precision of this method according to different levels of provided data.
Methods: Using GetData Graph Digitizer 2.26 (http://getdata-graph-digitizer.com/), we imported one survival curve from our cohort study (n=2414 patients), which compared long-term survival between bilateral internal mammary artery (BIMA) and single internal mammary artery (SIMA) CABG surgery. Each curve was carefully delineated after setting the limits of the X and Y axes, and two ASCII (text) files, one per group, were exported with all the resulting coordinates: time (X) and cumulative survival (Y) for each point. Three event tables were built: 1) the number of patients at risk provided for every single year (from 0 to 10 – 10 points); 2) the number at risk provided only every two years (5 points); 3) the number at risk provided only at three distinct points: beginning, median and end of follow-up (3 points). These data files were then imported by an R script that reads the number at risk at each time point, builds vectors and approximates the number of censored patients in each interval, i; it then adjusts the total number at risk and the number of events within each i according to the Kaplan-Meier (K-M) estimates read from the curves. Next, it derives the individual patient data (IPD) from the reconstructed Kaplan-Meier data, including time, event and respective arm. Finally, the reconstructed data are read so that the coxph formula can be applied. These results were compared with the Cox regression results derived directly from our dataset.
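The core idea behind turning digitized curve coordinates into patient-level records can be sketched as follows. This is a deliberately simplified illustration and not the Guyot et al. algorithm: it assumes no censoring before the last digitized time point, whereas the full method also estimates censoring within each interval from the reported numbers at risk. The function name `reconstruct_ipd` and the example coordinates are hypothetical.

```python
def reconstruct_ipd(times, survival, n_at_risk_start):
    """Approximate (time, event) records for one arm from digitized K-M points.

    Simplifying assumption (NOT part of the published algorithm): nobody is
    censored before the last digitized time, so every drop in the curve is
    explained by deaths among the patients still at risk.
    """
    ipd = []
    n = n_at_risk_start
    prev_s = 1.0  # reconstructed survival just before the current point
    for t, s in zip(times, survival):
        if n == 0 or s >= prev_s:
            continue  # no visible drop at this point
        # K-M step: S_k = S_{k-1} * (1 - d_k / n_k)  =>  d_k = n_k * (1 - S_k / S_{k-1})
        d = min(n, round(n * (1.0 - s / prev_s)))
        if d > 0:
            ipd.extend([(t, 1)] * d)   # d deaths at time t
            prev_s *= 1.0 - d / n      # survival implied by the integer count
            n -= d
    if times:
        ipd.extend([(times[-1], 0)] * n)  # survivors censored at end of follow-up
    return ipd


# Hypothetical digitized coordinates for a single arm of 100 patients.
ipd = reconstruct_ipd(times=[1, 2, 3, 4],
                      survival=[0.95, 0.90, 0.80, 0.75],
                      n_at_risk_start=100)
print(len(ipd), "records,", sum(event for _, event in ipd), "events")
```

Records produced this way for each arm, labelled with the arm, are the kind of input to which a Cox model such as R's coxph would then be fitted.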
Results: We included 2414 patients(4), of whom 1478 underwent SIMA and 936 underwent BIMA. During follow-up (median 5.5 years, maximum 12 years), 391 deaths were registered (20% SIMA vs. 10% BIMA). Both the Kaplan-Meier curves and the Cox regression showed a clear statistical difference between the groups (HR for BIMA: 0.6046, 95% CI: 0.4771 – 0.7663). The three event tables described above yielded the following HR [95% CI]: 1) 0.5994 [0.4735 – 0.7587]; 2) 0.6021 [0.4761 – 0.7614]; and 3) 0.5921 [0.4663 – 0.7518].
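One way to read these figures is to express each reconstructed HR as a relative deviation from the HR of the original dataset, and to recover the standard error of log(HR) from each confidence interval, which is the quantity typically needed for pooling. The sketch below uses only the values reported above; the helper names are hypothetical, and the standard-error step assumes the intervals are symmetric on the log scale (Wald-type).

```python
import math

# HR, lower and upper 95% CI limits as reported in the Results.
reference = (0.6046, 0.4771, 0.7663)
reconstructed = {
    "1) yearly numbers at risk": (0.5994, 0.4735, 0.7587),
    "2) every two years": (0.6021, 0.4761, 0.7614),
    "3) three time points": (0.5921, 0.4663, 0.7518),
}


def relative_deviation(hr, ref_hr):
    """Percentage difference of a reconstructed HR from the reference HR."""
    return 100.0 * (hr - ref_hr) / ref_hr


def log_hr_se(lower, upper, z=1.959964):
    """Standard error of log(HR), assuming a Wald-type 95% CI on the log scale."""
    return (math.log(upper) - math.log(lower)) / (2.0 * z)


summary = {}
for label, (hr, lower, upper) in reconstructed.items():
    summary[label] = (relative_deviation(hr, reference[0]), log_hr_se(lower, upper))
    print(f"{label}: deviation {summary[label][0]:+.2f}%, SE(log HR) {summary[label][1]:.4f}")
```

On these numbers, all three reconstructions fall within roughly 2% of the reference HR, with the largest deviation for the table that provides the number at risk at only three time points.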
Conclusion: Even though the authors of the algorithm for reconstructing Kaplan-Meier curves question its ability to compute the HR, our own experience shows otherwise. This corroborates the findings of Saluja and colleagues, who compared four methods developed to estimate the HR from Kaplan-Meier curves and recommended the Guyot method(5). This algorithm has high potential and could be used to extract data from papers in which quantitative data are absent. We recently used it to extract data from 9 studies, thereby avoiding their exclusion from a meta-analysis of RCTs and propensity score studies(6).
Copyright (c) 2020 Francisca Almeida Saraiva, João P. Leite-Moreira, André P. Lourenço, António S. Barros, Adelino Leite-Moreira
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.