CPH models, usually reported as lists of risk factors along with their parameters, are prevalent in the medical literature. One such model is the CPH model created to predict the probability of one-year survival of patients suffering from Pulmonary Arterial Hypertension. The model includes 19 binary risk factors (reproduced from the original paper in Table 1) and the baseline probability of survival. By following the method outlined above, we created the BN-Cox model shown in Fig. 3, which is equivalent to the CPH model reported in the original paper.
Evaluation of the BN-Cox model
In this section, we provide an empirical evaluation of the BN-Cox model by comparing its predictive precision to that of baseline survival analysis models, namely the CPH model and the Kaplan–Meier (K–M) estimator, and to that of Bayesian networks learned from data. We chose a classical example application of the CPH model, the Recidivism data set. The data set was collected in the course of an experimental study of 432 male prisoners, who were under observation for one year after being released from prison. The event of interest in this analysis is re-arrest, i.e., whether or not the prisoner is re-arrested during the period of study. The Recidivism data set is quite likely the most widely used example data set for survival analysis, especially for the CPH model. We use it to compare the precision of the proposed BN-Cox model, the CPH model, the K–M model, and the Bayesian network models learned directly from data. We explain how we built the models for the purpose of evaluation in Section 5.1 and show the results of the predictive comparison in Section 5.2.
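For readers less familiar with the K–M baseline, the following is a minimal sketch of the product-limit estimator in plain Python. The function name `kaplan_meier` and the toy durations are hypothetical and chosen only for illustration; they are not taken from the Recidivism data set, and the sketch is not the implementation used in the evaluation.

```python
from collections import Counter


def kaplan_meier(durations, observed):
    """Return [(time, survival_probability)] at each distinct event time.

    durations: follow-up time of each subject (e.g., weeks until re-arrest
               or until the end of observation).
    observed:  1 if the event occurred at that time, 0 if the subject
               was censored.
    """
    events = Counter(t for t, e in zip(durations, observed) if e)
    exits = Counter(durations)  # everyone leaving the risk set at time t
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = events.get(t, 0)
        if d:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


# Hypothetical toy data: eight subjects followed for at most 52 weeks.
toy_durations = [5, 8, 8, 12, 20, 52, 52, 52]
toy_observed = [1, 1, 0, 1, 1, 0, 0, 0]
km_curve = kaplan_meier(toy_durations, toy_observed)
for week, prob in km_curve:
    print(week, round(prob, 3))
```

Unlike the CPH model, the estimator uses no covariates, which is why it serves only as a population-level baseline in a comparison of this kind.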
Application of the BN-Cox model to risk calculation
As we mentioned earlier, CPH models are widely used in medical risk assessment and are often reported in the literature. Recently, we proposed a BN-Cox-based risk score calculator as an alternative to the existing Pulmonary Arterial Hypertension (PAH) risk calculator. The core of the original PAH risk score calculator by Benza et al., an electronic mobile calculator app developed by United Therapeutics Europe Limited and available at http://www.pah-app.com/, was based on the CPH model. Hence, we replaced the CPH model with a BN-Cox model constructed from the CPH parameters reported in Benza et al. (Table 1). Fig. 3 shows the structure of the BN-Cox model for the BN-Cox-based calculator. In this case, we omitted the time variable, as the purpose of the PAH Risk Calculator is to capture the risk at one point in time (one year). We created the conditional probability table (CPT) of the survival node from Equation (11): we enumerated all configurations of the risk factors (the 19 binary risk factors yield 2^19 configurations) and filled in the CPT with the survival probability of each configuration. This allowed us to fully reproduce the PAH CPH model by means of a Bayesian network. By itself, this offers no advantage over a CPH model-based calculator, but we view it as the first step toward a better calculator that relaxes some of the CPH assumptions and is capable of representing a generalized structure of interactions between the risk factors and the survival variable. With the PAH BN-Cox model, we created a risk score calculator. Equation (11) gives the survival probability given the states of the risk factors. We can extract the hidden hazard ratio of each variable by setting all other risk factors to absent. For example, the hazard ratio of a risk factor can be estimated from the ratio of the logarithm of the survival probability when only that risk factor is present to the logarithm of the survival probability when all risk factors are absent. The latter probability is similar to the baseline survival probability in the CPH model.
Hence, with this equation, we can trace back all hazard ratios. Then, we use the same criteria as the original PAH Risk Calculator to convert the hazard ratios to a score. A score of 2, for example, indicates at least a two-fold increase in the risk of mortality compared to the baseline risk. Fig. 8 shows a screenshot of the graphical user interface (GUI) of our prototype of the Bayesian network risk calculator. The left-hand pane allows for entering the risk factors for a given patient. The right-hand pane shows the calculated score and survival probabilities. Currently, the numerical risks produced by the BN-Cox calculator are identical to those of the original CPH-based PAH Risk Calculator. However, the BN-Cox model makes the assumptions of the CPH model explicit and will allow us to relax them in the future. One immediate advantage of the BN-Cox representation is that BNs make it possible to refine the parameters with additional data records.
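The CPT construction and the recovery of hazard ratios described in this section can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes that the survival node follows the standard CPH form, in which the baseline survival probability is raised to the power exp(linear predictor), and the baseline survival and coefficients below are hypothetical placeholders rather than the values in Table 1. The function names are likewise hypothetical. Three risk factors are used to keep the table small; with 19 binary factors the same loop would produce 2^19 = 524,288 entries.

```python
import itertools
import math


def build_survival_cpt(baseline_survival, betas):
    """Map every 0/1 configuration of the risk factors to P(survival).

    Assumes the CPH form: S(x) = S0 ** exp(sum_i beta_i * x_i).
    """
    cpt = {}
    for config in itertools.product((0, 1), repeat=len(betas)):
        linear_predictor = sum(b * x for b, x in zip(betas, config))
        cpt[config] = baseline_survival ** math.exp(linear_predictor)
    return cpt


def recover_hazard_ratio(cpt, index):
    """Recover the hazard ratio of one factor from the CPT alone.

    Uses log P(survival | only this factor present) divided by
    log P(survival | all factors absent).
    """
    n = len(next(iter(cpt)))
    all_absent = (0,) * n
    only_this = tuple(1 if i == index else 0 for i in range(n))
    return math.log(cpt[only_this]) / math.log(cpt[all_absent])


# Hypothetical parameters (NOT from Table 1): one-year baseline survival
# and log hazard ratios for three binary risk factors.
baseline_survival = 0.95
betas = [math.log(2.0), math.log(1.5), math.log(0.5)]

cpt = build_survival_cpt(baseline_survival, betas)
print(len(cpt))  # 2 ** 3 = 8 configurations

for i in range(len(betas)):
    print(i, round(recover_hazard_ratio(cpt, i), 6))
```

Because the CPT entry with all factors absent equals the baseline survival probability, each recovered ratio equals exp(beta) of the corresponding factor, which is why the BN-Cox calculator can reproduce the hazard ratios of the underlying CPH model from the network alone.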