Zhe-Xuan Wang, En-Xin Wang, Wei Bai, Dong-Dong Xia, Wei Mu, Jing Li, Qiao-Yi Yang, Ming Huang, Guo-Hui Xu, Jun-Hui Sun, Hai-Liang Li, Hui Zhao, Jian-Bing Wu, Shu-Fa Yang, Jia-Ping Li, Zi-Xiang Li, Chun-Qing Zhang, Xiao-Li Zhu, Yan-Bo Zheng, Qiu-He Wang, Jing Li, Jie Yuan, Xiao-Mei Li, Jing Niu, Zhan-Xin Yin, Jie-Lai Xia, Dai-Ming Fan, Guo-Hong Han, on behalf of China HCC-TACE Study Group
Abstract
BACKGROUND The treatment outcome of transarterial chemoembolization (TACE) in unresectable hepatocellular carcinoma (HCC) varies greatly due to the clinical heterogeneity of the patients. Therefore, several prognostic systems have been proposed for risk stratification and candidate identification for first TACE and repeated TACE (re-TACE).
AIM To investigate the correlations between prognostic systems and radiological response, compare the predictive abilities, and integrate them in sequence for outcome prediction.
METHODS This nationwide multicenter retrospective cohort consisted of 1107 unresectable HCC patients in 15 Chinese tertiary hospitals from January 2010 to May 2016. The Hepatoma Arterial-embolization Prognostic (HAP) score system and its modified versions (mHAP, mHAP2 and mHAP3), as well as the six-and-twelve criteria were compared in terms of their correlations with radiological response and overall survival (OS) prediction for first TACE. The same analyses were conducted in 912 patients receiving re-TACE to evaluate the ART (assessment for re-treatment with TACE) and ABCR (alpha-fetoprotein, Barcelona Clinic Liver Cancer, Child-Pugh and Response) systems for post re-TACE survival (PRTS).
RESULTS All the prognostic systems were correlated with radiological response achieved by first TACE, and the six-and-twelve criteria exhibited the highest correlation (Spearman R = 0.39, P = 0.026) and consistency (Kappa = 0.14, P = 0.019), with optimal performance by area under the receiver operating characteristic curve of 0.71 [95% confidence interval (CI): 0.68-0.74]. With regard to the prediction of OS, the mHAP3 system identified patients with a favorable outcome with the highest concordance (C)-index of 0.60 (95%CI: 0.57-0.62) and the best area under the receiver operating characteristic curve at any time point during follow-up; whereas PRTS was well-predicted by the ABCR system with a C-index of 0.61 (95%CI: 0.59-0.63), rather than ART. Finally, combining the mHAP3 and ABCR systems identified candidates suitable for TACE with an improved median PRTS of 36.6 mo, compared with non-candidates with a median PRTS of 20.0 mo (log-rank test P < 0.001).
CONCLUSION Radiological response to TACE is closely associated with tumor burden, but superior prognostic prediction could be achieved with the combination of mHAP3 and ABCR in patients with unresectable liver-confined HCC.
Key words: Transarterial chemoembolization; Hepatocellular carcinoma; Prognostic system; Radiological response; Overall survival; Predictive ability
According to the Barcelona Clinic Liver Cancer (BCLC) staging system and current treatment guidelines, transarterial chemoembolization (TACE) is the first-line treatment option for intermediate hepatocellular carcinoma (HCC) with asymptomatic, large or multifocal unresectable nodules in the absence of macrovascular invasion (MVI) or extrahepatic metastasis (EHS)[1-3]. However, the treatment outcome of TACE varies greatly, with median survival ranging from 13 to 43 mo[4,5]. Apart from the differences in TACE techniques, it is universally recognized that such a wide variation in survival results from an intrinsic disease heterogeneity including the degree of liver dysfunction, tumor burden and other factors under the general term of “intermediate HCC”, which have not been adequately captured by current staging systems[6,7]. Moreover, the current use of TACE in clinical practice exceeds guideline recommendations, covering not only patients with unresectable early HCC, but also those with liver-confined advanced diseases[8,9].
Several prognostic algorithms have been proposed to address the clinical heterogeneity of HCC patients receiving TACE[10]. Notably, the Hepatoma Arterial-embolization Prognostic (HAP) score was proposed and subsequently modified into three versions (mHAP, mHAP2 and mHAP3), all of which target unresectable HCC patients treated with TACE for outcome prediction[11-14]. However, these prediction systems were derived from highly heterogeneous populations, and their predictive values remain controversial in the majority of patients treated with TACE in the real world (patients with unresectable early, intermediate and liver-confined advanced stage disease). Recently, the “six-and-twelve” (6&12) criteria were proposed by our team to predict treatment outcomes in guideline-recommended patients treated with TACE. This prognostic model, “linear predictor = largest tumor diameter (cm) + tumor number”, divides patients into three risk strata with the cut-off values “6” and “12”, and may provide an easy-to-use tool (a nomogram developed from the statistical results) for classification and individual survival prediction[5]. However, the prognostic ability of the 6&12 criteria should be investigated in a larger population. In addition, the ART (assessment for re-treatment with TACE) and ABCR [alpha-fetoprotein (AFP), BCLC, Child-Pugh and Response] systems were proposed for outcome prediction of repeated TACE (re-TACE)[15,16]. Despite the development of these prognostic systems, there is no consensus regarding their clinical significance due to the absence of real-world validations and comparisons.
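As a minimal illustrative sketch in Python, the 6&12 linear predictor and its three risk strata could be computed as follows. The function names and stratum labels are hypothetical, and the handling of scores that fall exactly on a cut-off (here, scores of 6 and 12 are assigned to the lower stratum) is an assumption that the text above does not state.

```python
def six_and_twelve_score(largest_diameter_cm: float, tumor_number: int) -> float:
    """Linear predictor: largest tumor diameter (cm) + tumor number."""
    if largest_diameter_cm <= 0 or tumor_number < 1:
        raise ValueError("expected a positive diameter and at least one tumor")
    return largest_diameter_cm + tumor_number


def six_and_twelve_stratum(score: float) -> str:
    """Map a score to one of three risk strata using the cut-offs 6 and 12.

    Assumption: boundary values belong to the lower stratum
    (score <= 6, 6 < score <= 12, score > 12).
    """
    if score <= 6:
        return "low"
    if score <= 12:
        return "intermediate"
    return "high"


# Hypothetical patient: largest nodule 4.5 cm, two nodules.
score = six_and_twelve_score(4.5, 2)
print(score, six_and_twelve_stratum(score))
```

The sketch only covers the score and its strata; the individual survival prediction mentioned above relies on the published nomogram, which is not reproduced here.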
We carried out this nationwide multicenter study with the aim of externally validating the existing prognostic systems for TACE, investigating their correlations with radiological response, comparing their predictive abilities regarding survival and identifying the optimal combination of scoring systems for first TACE and re-TACE in real-world HCC patients.
A total of 2978 cases were extracted from a nationwide database of HCC patients treated with TACE at 15 Chinese tertiary hospitals between January 2010 and May 2016. HCC was diagnosed by either histological or imaging evaluations according to the American Association for the Study of Liver Diseases / European Association for the Study of the Liver (AASLD/EASL) guidelines. Patients meeting one of the following criteria were excluded: (1) Any previous HCC-related treatments; (2) Presence of MVI and/or EHS; (3) Child-Pugh score > 7 or decompensation; (4) Eastern Cooperative Oncology Group performance status score > 1; (5) Diffuse tumor; (6) Additional systemic treatment; and (7) Absence of baseline information or imaging. In total, 1107 patients were included, and 912 of these patients received re-TACE (Figure 1). The study protocol conformed to the ethical guidelines of the 1975 Declaration of Helsinki and was approved by the institutional Ethics Committee of the First Affiliated Hospital of the Fourth Military Medical University; patients were not required to give informed consent for this study because the analysis used anonymous clinical data that were obtained after each patient agreed to treatment by written consent.
Treatment decisions were made at the discretion of the multidisciplinary liver tumor boards in each enrolled institution in accordance with treatment guidelines. Before TACE, digital subtraction angiography (DSA) of the hepatic artery was performed to assess the vascular anatomy and tumor vascularity. During TACE, a vascular catheter was selectively inserted into the tumor-feeding artery, followed by injection of a mixture of doxorubicin (10-50 mg) and lipiodol (2-20 mL), and then embolization using gelatin sponge particles. Laboratory assessment was carried out every four to six weeks after the procedure. Radiologic evaluation using the modified Response Evaluation Criteria in Solid Tumors (mRECIST) was performed in the fourth and eighth week after TACE and every eight weeks thereafter using contrast-enhanced computed tomography (CT) or magnetic resonance imaging (MRI). However, in clinical practice, the intensity of follow-up depended on each individual's baseline characteristics (including kidney function) and response to the last treatment, i.e., it was on demand. Thus, not all patients adhered strictly to this imaging follow-up schedule. Moreover, no contrast-induced nephropathy was observed in the current cohort. For patients with residual viable lesions or local and/or distant intrahepatic recurrences during follow-up, on-demand re-TACE sessions were carried out, and TACE therapy was discontinued when persistent disease progression occurred after two sessions according to imaging assessments. Once patients entered the advanced stage according to the specialized assessment, they received the treatment recommended by the national guidelines, including systemic therapies and best supportive care. Follow-up was then continued by local investigators until a terminal event occurred or the patient was lost to follow-up.
According to the baseline characteristics, the prognostic scores based on HAP[11], mHAP[12], mHAP2[13], mHAP3[14] and the 6&12 criteria[5] were calculated (Table 1). Risk stratification and candidate identification based on HAP, mHAP and mHAP2 were obtained according to previous literature. For comparability, the quartiles and medians of the continuous scores of mHAP3 and the 6&12 criteria were used to divide patients into four risk strata and to distinguish candidates from non-candidates, respectively. For outcome prediction after re-TACE, calculation of the predictive score, patient stratification by death risk, and identification of potential candidates were based on patient characteristics before re-TACE, following ART[15] and ABCR[16]. The outcome evaluation of first TACE treatment was based on overall survival (OS), which was defined as the time from first TACE to death or the end of the study; whereas assessment of re-TACE effectiveness was based on post re-TACE survival (PRTS), which was defined as the time from the second TACE session to death or the end of the study.
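The quartile-based grading described above can be sketched in Python as follows. This is an illustration rather than the authors' code: the function names are hypothetical, the example cut points are the 6&12 quartiles reported in the Results (7.5, 9.7, 12.9), and assigning a score that equals a cut point to the lower grade is an assumption. Because the median is the second quartile, grades A and B correspond to scores at or below the median.

```python
from bisect import bisect_left

GRADES = ("A", "B", "C", "D")

# 6&12 quartile cut points as reported in the Results section.
SIX_AND_TWELVE_QUARTILES = (7.5, 9.7, 12.9)


def risk_grade(score: float, quartile_cutpoints) -> str:
    """Return grade A-D for a continuous score given (Q1, median, Q3).

    Assumption: a score equal to a cut point falls into the lower grade.
    """
    q1, q2, q3 = quartile_cutpoints
    if not q1 <= q2 <= q3:
        raise ValueError("cut points must be in ascending order")
    return GRADES[bisect_left((q1, q2, q3), score)]


def is_candidate(grade: str) -> bool:
    """Grades A and B (at or below the median) are treated as candidates."""
    return grade in ("A", "B")


# Hypothetical scores spanning the four grades.
for s in (6.0, 8.0, 11.0, 13.0):
    g = risk_grade(s, SIX_AND_TWELVE_QUARTILES)
    print(s, g, is_candidate(g))
```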
Figure 1 Flowchart of the patient selection process. HCC: Hepatocellular carcinoma; TACE: Transarterial chemoembolization.
Categorical variables were described as frequencies and percentages, and continuous data as the median with interquartile range. Median OS was estimated using Kaplan-Meier curves and compared by the log-rank test. The accompanying hazard ratio (HR) was estimated for each prediction system using the Cox proportional hazard regression model. Receiver operating characteristic (ROC) curves were used to evaluate the correlation between prediction systems and radiological response; and time-dependent area under the ROC curve (AUROC) curves were used to compare the discriminatory abilities for survival at different follow-up time points. The Spearman test and Kappa value were used to evaluate correlation and consistency between prediction systems and response. To determine the optimal prognostic system, the concordance (C)-index and likelihood ratio (LR) were calculated for each predictive score to evaluate the prognostic value regarding OS. Statistical analyses were conducted using SPSS software version 17.0 (SPSS Inc., Chicago, IL, United States) and R version 3.3.1 (R Foundation for Statistical Computing, Vienna, Austria).
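To make the C-index concrete, the following is a simplified pure-Python sketch of Harrell's concordance index for right-censored survival data. It is not the implementation used in this study (the analyses were run in SPSS and R); the function name and the toy data are hypothetical, and pairs with tied survival times are simply skipped for brevity.

```python
def harrell_c_index(times, events, risk_scores):
    """Simplified Harrell's C-index.

    times:       follow-up time for each patient
    events:      1 if death was observed, 0 if censored
    risk_scores: higher score = higher predicted risk of death

    A pair (i, j) is comparable when patient i died strictly before
    patient j's follow-up time. It is concordant when i also has the
    higher risk score; tied risk scores count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier death in a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): four patients, the third one censored.
c = harrell_c_index(
    times=[5, 10, 15, 20],
    events=[1, 1, 0, 1],
    risk_scores=[0.9, 0.6, 0.7, 0.1],
)
print(round(c, 2))
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which gives context for the C-indices of about 0.60 reported for the systems compared here.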
The study cohort consisted of 1107 HCC patients receiving at least one session of TACE, and their baseline characteristics are described in Table 2. The median age was 57 years, and hepatitis B virus infection was the main etiology of HCC. In addition, 912 patients with more than one TACE session were included in the analysis of re-TACE. The median number of TACE sessions was 3 in both the whole cohort and in those patients treated with repeated TACE.
The median scores for HAP, mHAP, mHAP2, mHAP3, and the 6&12 criteria are shown in Table 3. The patients were divided into four groups (grade A, B, C and D) based on the risk score; for mHAP3 and the 6&12 criteria, which yield continuous scores, the quartiles (0.05, 0.41, 0.83 for mHAP3; 7.5, 9.7, 12.9 for the 6&12 criteria) were used to divide the patients into four grades of risk stratification for comparability. With regard to radiological response, 149 (13.5%) patients had a complete response (CR), 441 (39.8%) had a partial response (PR), 299 had stable disease (SD) and 218 had progressive disease (PD); the response rate (CR and PR) reached 53.3%. Compared with the other scoring systems, the 6&12 criteria had the highest correlation (Spearman R = 0.39, P = 0.026) and consistency (Kappa = 0.14, P = 0.019) with treatment response to the first TACE. In the ROC analysis, the AUROC of the 6&12 criteria for predicting treatment response reached 0.71 [95% confidence interval (CI): 0.68-0.74] for the continuous score and 0.66 (95%CI: 0.63-0.69) for the risk stratification, both of which were better than those of the other systems (Figure 2A and 2B).
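The response figures above can be checked with a few lines of Python. The counts are taken directly from the text; the variable names are illustrative only.

```python
# Radiological response counts after first TACE, as reported in the text.
responses = {"CR": 149, "PR": 441, "SD": 299, "PD": 218}

total = sum(responses.values())  # should match the cohort size of 1107

# Response rate = (CR + PR) / all patients, expressed as a percentage.
response_rate = 100 * (responses["CR"] + responses["PR"]) / total

print(total, round(response_rate, 1))
for category, count in responses.items():
    print(category, round(100 * count / total, 1))
```

The four categories sum to the full cohort of 1107 patients, and CR plus PR gives the stated response rate of 53.3%.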
According to the prediction systems, patients with A and B grade of risk stratification were considered candidates for TACE; otherwise, they were considered non-candidates (Table 4). Similarly, in the mHAP3 and 6&12 systems, patients in groups A and B were defined as candidates, and those in groups C and D were considered non-candidates. As shown in Table 3, all five prediction systems distinguished TACE candidates with improved OS from non-candidates (P < 0.001). However, the mHAP3 system had the highest discriminatory ability (C-index 0.60, 95%CI: 0.57-0.62), as well as optimal homogeneity within the classification (LR χ2 = 57.5). More importantly, the mHAP3 system had the highest AUROC according to the time-dependent ROC analysis (Figure 3A). Based on the mHAP3, there were 554 TACE candidates with a median OS of 33.8 mo and 553 non-candidates with a median OS of 17.2 mo; Cox regression analysis also demonstrated that candidates defined by the mHAP3 system had an almost 50% reduced risk of death compared to non-candidates (HR = 0.52, 95%CI: 0.44-0.62, P < 0.001).
Table 1 Summary of the prognostic scoring systems (points)
Based on the ART score, the 912 available patients were divided into two groups: 646 were candidates and 266 were non-candidates (Table 4). However, no significant difference in PRTS was detected between these two groups of patients (27.0 mo vs 23.7 mo, log-rank test P = 0.222). In the ABCR assessment, the 600 candidates reached a median PRTS of 33.1 mo, which was longer than the 16.4 mo in 312 non-candidates (log-rank test P < 0.001). In addition, the Cox regression analysis showed that the candidates based on the ABCR had a more than 50% reduced risk of death compared with non-candidates (HR = 0.47, 95%CI: 0.39-0.57, P < 0.001). Compared with ART, the ABCR system had a better C-index, LR χ2, and time-dependent AUROC at any follow-up time point (Table 4 and Figure 3B).
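The phrases "almost 50%" and "more than 50%" reduced risk of death follow from reading a hazard ratio below 1 as a relative reduction in the hazard, i.e., (1 - HR) × 100%. A short Python sketch of that conversion, using the two hazard ratios reported for mHAP3 and ABCR candidates, is shown below; the function name is hypothetical.

```python
def hazard_reduction_percent(hazard_ratio: float) -> float:
    """Relative reduction in the hazard of death implied by a hazard ratio."""
    if hazard_ratio <= 0:
        raise ValueError("a hazard ratio must be positive")
    return (1.0 - hazard_ratio) * 100.0


# Hazard ratios reported in the text (candidates vs non-candidates).
for system, hr in (("mHAP3", 0.52), ("ABCR", 0.47)):
    print(system, round(hazard_reduction_percent(hr)))
```

This yields roughly 48% for mHAP3 and 53% for ABCR, in line with the wording used in the text.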
Table 2 Baseline characteristics of patients treated with first TACE and before repeated TACE, n (%) / Median [IQR]
Considering that mHAP3 had the highest prognostic value for first TACE and ABCR was correlated with treatment outcome in patients receiving re-TACE, we combined the two scoring systems to stratify the patients treated with TACE. Among patients receiving at least two sessions of TACE, the 374 patients who were candidates both for first TACE as defined by mHAP3 and for re-TACE as defined by ABCR were considered candidates, while the other 538 patients were non-candidates. According to the survival analysis, candidates achieved better outcomes than non-candidates, with a median PRTS of 36.6 vs 20.0 mo (P < 0.001) (Figure 4).
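The sequential combination rule described above amounts to a logical AND of the two candidate flags. A minimal Python sketch follows; the class and function names are hypothetical, and the per-system candidate flags are taken as inputs rather than recomputed, because the scoring rules of mHAP3 and ABCR are not spelled out in this text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Patient:
    mhap3_candidate: bool  # candidate for first TACE according to mHAP3
    abcr_candidate: bool   # candidate for re-TACE according to ABCR


def is_combined_candidate(patient: Patient) -> bool:
    """Candidate only if both mHAP3 and ABCR classify the patient as one."""
    return patient.mhap3_candidate and patient.abcr_candidate


# Hypothetical patients covering the four possible combinations.
cohort = [
    Patient(True, True),
    Patient(True, False),
    Patient(False, True),
    Patient(False, False),
]
print([is_combined_candidate(p) for p in cohort])

# Consistency check on the reported counts: 374 + 538 = 912 re-TACE patients.
print(374 + 538)
```

Under this rule a patient flagged by only one of the two systems is treated as a non-candidate, which is why the combined candidate group (374) is smaller than either the mHAP3 or the ABCR candidate group alone.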
Figure 2 Receiver operating characteristic curves for evaluating the radiological correlations of the scoring systems. A: Correlations between radiological response and predicting scores; B: Correlations between radiological response and risk stratifications based on the predictive systems for first transarterial chemoembolization. HAP: Hepatoma Arterial-embolization Prognostic; mHAP: Modified HAP; 6&12: Six-and-twelve criteria.
The strengths and novelty of the current study are as follows: (1) Validation of the prognostic values of the prediction systems for first and re-TACE in unresectable Chinese HCC patients; (2) Determination of the correlations between the prediction systems and radiological response after the first TACE; (3) A comparison of the discriminatory values of these prediction systems in a time-dependent manner; and (4) Integration of the systems in sequence to identify candidates for TACE therapy.
According to the treatment guidelines for HCC, TACE is recommended as standard treatment for intermediate HCC[2,3]. However, its clinical application widely exceeds this recommendation in real-world practice, and the heterogeneity of TACE-treated HCC has consequently resulted in the variance in treatment outcomes[7-10]. As a prognostic model with indicators including albumin, bilirubin, AFP and tumor diameter, the HAP scoring system could achieve risk stratifications for patients undergoing initial TACE[11]. Thereafter, Pinato et al[12] removed serum bilirubin from HAP, as its performance appeared inferior to other parameters, and then proposed the mHAP score. To improve the accuracy of prognosis classification, the mHAP2 was developed with the addition of tumor number as a predictor and adjustment of the cut-off for serum bilirubin[13]. Furthermore, the mHAP3 score proposed an individual prognostic model for outcome prediction in a continuous manner for each patient with unresectable HCC[14]. However, the HAP system and its modified versions were derived from populations with flexible inclusion criteria and even included patients with MVI. In contrast, our previously proposed 6&12 criteria adopted strict inclusion criteria focusing on the guideline-recommended patients, and excluded those with advanced disease but liver-confined HCC[5]. Nevertheless, TACE was mainly performed in unresectable liver-confined disease regardless of intermediate or advanced stages[8]. Consequently, we investigated the performance of these prediction systems in such a group of patients. More importantly, the current study determined their associations with radiological response for the first time, demonstrating that the 6&12 criteria had the highest correlation with treatment response, indicating that the most important predictive factor for imaging response was tumor burden. Interestingly, the 6&12 criteria were not better than the HAP system and its modified versions when predicting OS.
With regard to scope of application, the 6&12 criteria were generated in guideline-recommended TACE candidates who had little heterogeneity in terms of liver function, performance status and other characteristics, unlike the populations used for the HAP and other systems. Consequently, when predicting OS in the current study population with significant heterogeneity, the 6&12 criteria may not have been sufficiently comprehensive. In contrast, with the inclusion of more relevant factors for calculating continuous predictive scores, the mHAP3 system performed better than the others in predicting OS.
For the evaluation of re-TACE treatment, the ART system, consisting of factors related to radiological response as well as changes in aspartate aminotransferase and Child-Pugh score, was used to assess suitability for subsequent TACE[15], whereas the ABCR score selected AFP, BCLC stage, increase in Child-Pugh points and tumor response as variable parameters to provide better patient selection for re-TACE[16]. According to the current analyses, the ABCR system showed a good association with PRTS, but ART showed inferior performance. Although the radiological response and changes in liver function were included in both systems, the difference in performance may arise because the ABCR system also included the AFP change and BCLC stage. Several studies have reported that the change in AFP after TACE was correlated with treatment effectiveness[17,18]; and the inclusion of BCLC stage reflected the detailed radiological response, especially the pattern of PD (intrahepatic or extrahepatic progression)[19]. Consequently, the ABCR system may be more reliable for the evaluation of treatment outcome following re-TACE.
Table 3 Correlations between radiological response and prognostic systems for first transarterial chemoembolization
Finally, considering the predictive abilities of the mHAP3 and ABCR systems, the combination of both could identify candidates for TACE therapy. The significance of this combination includes the following: (1) There has been no such attempt at combining these systems in the past; (2) TACE is an intervention that alters patient outcome on top of the natural course of the disease. Because a patient's expected outcome may differ before and after treatment, scoring systems designed for pre-treatment use (which cannot capture how changes in predictive factors affect outcome) and those designed for post-treatment use (which cannot independently capture the impact of the patient's underlying status on outcome) might not be accurate enough when applied separately; (3) This study selected the best-performing scores for the pre-treatment and post-treatment periods, respectively, so that their combination predicted outcome more effectively than either score alone; and (4) The combination lets each score compensate for the other's shortcomings: mHAP3 predicts outcome from baseline characteristics but cannot guide the clinical decision for the next TACE procedure, whereas ABCR incorporates imaging indicators to better predict survival but cannot be used in the initial assessment of the patient at baseline (this system can only be used after TACE therapy). Taken together, the predictive power and clinical application value of the integrated mHAP3 and ABCR approach are better than those of either system alone.
There were also several limitations in this study: (1) The retrospective nature of this study may have led to some bias; (2) To compare the HAP, mHAP and mHAP2 systems, we used the quartile values of the continuous scores in mHAP3 and the 6&12 criteria to divide patients into four risk stratification groups, and used their median values to distinguish candidates from non-candidates, which might have compromised their prediction performance; (3) Given that all patients included in this study were Chinese and the main etiology of HCC was hepatitis B virus infection, caution is necessary in the generalization and extrapolation of our findings; and (4) Study results based on current developed scoring systems need further external validations in a large population from multicenter studies.
Figure 3 Time-dependent receiver operating characteristic curves for comparisons. A: Comparisons among prognostic systems in first transarterial chemoembolization; B: Comparisons among prognostic systems in repeated transarterial chemoembolization. AUROC: Area under receiver operating characteristic curve; HAP: Hepatoma Arterial-embolization Prognostic; mHAP: Modified HAP; 6&12: Six-and-twelve criteria; ART: Assessment for retreatment with TACE; ABCR: Alpha-fetoprotein, BCLC stage, Child-Pugh and Response.
In summary, this nationwide multicenter study demonstrated that previously proposed prognostic scoring systems could identify TACE candidates with radiological response and improved OS in unresectable HCC patients treated with first TACE. For re-TACE treatment, the ABCR system, but not the ART system, had a predictive ability for PRTS. Considering the optimal discriminatory abilities of mHAP3 and ABCR in predicting the prognoses of first TACE and re-TACE, these two systems could be sequentially combined to predict treatment outcome of TACE, which may provide useful data for its clinical applications.
Table 4 Comparison of prognostic performance of the predicting systems
Figure 4 Survival curves between candidates and non-candidates according to sequential use of the Hepatoma Arterial-embolization Prognostic system version 3 and alpha-fetoprotein, BCLC stage, Child-Pugh and Response system. re-TACE: Repeated transarterial chemoembolization.
Transarterial chemoembolization (TACE) is the most commonly used treatment in patients with unresectable hepatocellular carcinoma (HCC). However, the treatment outcome for such patients varies greatly. Apart from the differences in TACE techniques, the heterogeneity of liver dysfunction, tumor burden and other relevant factors should be carefully considered.
Previously, several prognostic systems have been proposed for risk stratification and clinical decision-making in first TACE and repeated TACE (re-TACE). Nevertheless, it is unknown which model has the highest predictive ability and should be chosen in clinical practice.
In this nationwide multicenter study, we aimed to validate the existing prognostic models for TACE treatment, compare their predictive abilities for overall survival, and finally identify the optimal scoring systems for first TACE and re-TACE in HCC patients.
The prognostic values of the Hepatoma Arterial-embolization Prognostic (HAP) scoring system and its modified versions (mHAP, mHAP2 and mHAP3), as well as the six-and-twelve criteria were compared in 1107 unresectable HCC patients treated with at least one session of TACE, while the same analyses were conducted in 912 patients receiving re-TACE to evaluate the ART (assessment for re-treatment with TACE) and ABCR (alpha-fetoprotein, Barcelona Clinic Liver Cancer, Child-Pugh and Response) systems for post re-TACE survival (PRTS).
With regard to the initial TACE treatment, the six-and-twelve criteria had the highest correlation and consistency with radiological response and the mHAP3 criteria had the optimal discrimination value for overall survival. For re-TACE therapy, the ABCR score significantly identified patients with improved PRTS, while the ART system failed to do so. Finally, combining mHAP3 and ABCR systems could discriminate candidates suitable for TACE with improved outcomes compared with non-candidates.
The results from this study suggest that there is high heterogeneity in patients with unresectable HCC receiving TACE treatment. The six-and-twelve criteria were closely correlated with radiological response, and mHAP3 and ABCR were reliable prognostic systems for first TACE and re-TACE, respectively. The sequential combination of these systems would facilitate risk stratification and outcome prediction.
This study clearly highlights the need for risk stratification of unresectable HCC patients treated with TACE. Comparing the prognostic abilities among the existing scoring systems, we recommend the combined use of mHAP3 and ABCR for survival prediction of HCC patients receiving TACE for the first time, which would not only refine the prognostic stratification but also facilitate individual management. Therefore, future studies focusing on external validations in a large population are necessary.
World Journal of Gastroenterology, 2020, Issue 6