
Ferraris VA, Ferraris SP. Risk Stratification and Comorbidity.
In: Cohn LH, Edmunds LH Jr, eds. Cardiac Surgery in the Adult. New York: McGraw-Hill, 2003:187–224.


Chapter 6

Risk Stratification and Comorbidity

Victor A. Ferraris / Suellen P. Ferraris

Risk Stratification
Outcomes and Risk Stratification
Measures of Comorbidity
Improving the Quality of Care
Other Goals of Outcomes Analysis
Analytic Tools of Risk Stratification
Regression Analysis
Hierarchical Models and Logistic Regression
Statistics of Survival
Bayesian Analysis: Models Based on Experience
"Breakthrough Statistics"
Risk Factors for Operative Mortality
Risk Factors for Postoperative Morbidity and Resource Utilization
Patient Satisfaction as an Outcome
Risk Stratification to Measure the Effectiveness of Care
Ethical Implications of Risk Stratification: The Dilemma of Managed Care
The Costs of Gathering Data
Decision Analysis: The Infancy of Risk Stratification and Outcomes Assessment
Volume/Outcome Relationship and Targeted Regionalization
Medical Errors and Outcomes
Public Access and Provider Accountability
New Risk Stratification Methods: Neural Networks and Other Computer-Intensive Methods
Information Management: Electronic Medical Records

It may seem a strange principle to enunciate as the very first requirement in a Hospital that it should do the sick no harm. It is quite necessary, nevertheless, to lay down such a principle, because the actual mortality in hospitals ... is very much higher than ... the mortality of the same class of diseases among patients treated out of hospital....

Florence Nightingale, 1863

The formal assessment of patient care had its beginnings in the mid-1800s. One of the earliest advocates of analyzing outcome data was Florence Nightingale, who was troubled by observations that hospitalized patients died at rates higher than those of patients treated outside of the hospital.1,2 She also noted a vast difference in mortality rates among different hospitals, with London hospitals having a mortality rate as high as 92%, while smaller rural hospitals had a much lower mortality rate (12% to 15%). Although England had tracked hospital mortality rates since the 1600s, the analysis of these rates was in its infancy during Nightingale's era. Yearly mortality statistics were calculated by dividing the number of deaths in a year by the average number of hospitalized patients on a single day of that year. Nightingale made the important observation that raw mortality rates were not an accurate reflection of outcome, since some patients were sicker when they presented to the hospital, and therefore would be expected to have a higher mortality. This was the beginning of risk adjustment based on severity of disease. She was able to carry her observations to the next level by suggesting simple measures, such as improved sanitation, less crowding, and locating hospitals distant from crowded urban areas, that would ultimately result in dramatic improvement in patients' outcomes, an example of a quality improvement project (see below).

Ernest Amory Codman, a Boston surgeon, was one of the most outspoken early advocates of outcome analysis and scrutiny of results. Codman was a classmate of Harvey Cushing, and he became interested in the issues of outcome analysis after a friendly bet with Cushing about who had the lowest complication rate associated with the delivery of anesthesia. In the early 1900s as medical students, they were responsible for administering anesthesia. Since vomiting and aspiration were common upon induction of anesthesia, many operations were over before they started. Cushing and Codman compared their results and kept records concerning the administration of anesthesia while they were medical students. This effort not only represented the first intraoperative patient records, but also served as a foundation for Codman's later interest (almost passion) for the documentation of outcomes. Codman actually paid a publisher to disseminate the results obtained in his privately owned Boston hospital.3 Codman was perhaps the first advocate of searching for a cause of all complications. He linked specific outcomes to specific interventions (or errors). He believed that most bad outcomes were the result of errors or omissions by physicians, and completely ignored any contribution to outcome from hospital-based and process-related factors. His efforts were not well received by his peers, and eventually his private hospital closed because of lack of referrals.

Both Codman and Nightingale viewed outcome analysis as an intermediate step toward the improvement of patient care. It was not enough to know the rates of a given outcome. While it is axiomatic that any valid comparison of quality of care or patient outcome must account for severity of illness, this is only the initial step toward improving patient outcome.

Further definition of outcome assessment occurred in the mid-1900s. As more and more therapeutic options became available to treat the diseases that predominated in the early 20th century (e.g., tuberculosis), a need arose to determine the best alternative among multiple therapies, leading to the advent of the controlled randomized trial and tests of effectiveness of therapy. One of the earliest randomized trials was conducted to determine whether streptomycin was effective against tuberculosis.4 Although the trial proved streptomycin's effectiveness, it also stimulated a great deal of controversy. After World War II, several clinicians advocated the use of randomized, controlled trials to better identify the optimal treatment to provide the best outcome. Foremost among these was Archie Cochrane. Every physician should know about Archie Cochrane (Fig. 6-1). He is as close to a true hero as a physician can get, but there may be those who see him as the devil incarnate. As you can see from some of the highlights of his career (see Fig. 6-1), he lived during an exciting time. In the 1930s Professor Cochrane was branded as a "Trotskyite" because he advocated a national health system for Great Britain. His advocacy was tempered by 4 years as a prisoner of war in multiple German POW camps during World War II. He saw soldiers die from tuberculosis, and he was never sure what the best treatment was. He could choose among collapse therapy, bed rest, supplemental nutrition, or even high-dose vitamin therapy. A quote from his book sums up his frustration:

I had considerable freedom of clinical choice of therapy: my trouble was that I did not know which to use and when. I would gladly have sacrificed my freedom for a little knowledge.5

FIGURE 6-1 Portrait of Archie Cochrane with brief biography.


His experience with the uncertainty about the best treatment for tuberculosis and other chest diseases continued after the war, when he became a researcher in pulmonary disease for the Medical Research Council in Great Britain. His continued interest in tuberculosis was now heightened by the fact that he had contracted the disease. Archie wanted to know the best drug therapy for tuberculosis, since there were now drugs available that could treat this disease, with streptomycin being the first really effective drug against Mycobacterium tuberculosis.4 He was a patron of the randomized controlled trial (or RCTs, as he liked to refer to them) to test important medical hypotheses. He used the evidence gained from these RCTs to make decisions about the best therapy based on available evidence, the beginning of "evidence-based" practice. He felt that RCTs were the best form of evidence to support medical decision making (so-called "class 1" evidence). Initially he was a voice in the wilderness, but this eventually changed. In 1979 he criticized the medical profession for not having a critical summary, organized by specialty and updated periodically, of relevant RCTs. In the 1980s, a database of important RCTs dealing with perinatal medicine was developed at Oxford. In 1987, the year before Cochrane died, he referred to a systematic review of randomized controlled trials (RCTs) of care during pregnancy and childbirth as "a real milestone in the history of randomized trials and in the evaluation of care," and suggested that other specialties should copy the methods used. This led to the opening of the first Cochrane center (in Oxford) in 1992 and the founding of the Cochrane Collaboration in 1993.

The Cochrane Web site has summaries of all available RCTs on a wide range of medical subjects. Thus it is fair to call Archie Cochrane the "father of evidence-based medicine." Evidence-based medicine has, at its heart, the imperative to improve outcomes by comparing alternative therapies to determine which is the best. Evidence-based studies that involve randomized trials have the advantage of being able to infer cause and effect (i.e., a new therapy or drug causes improved outcome). On the other hand, observational studies (or retrospective studies) are able to define only associations between therapies and outcome, not prove cause and effect.

Risk Stratification

Risk stratification means arranging patients according to the severity of their illness. Implicit in this definition is the ability to predict outcomes from a given intervention based on preexisting illness or the severity of the intervention. The usefulness of any risk stratification system arises from how well the system links severity to a specific outcome.

There have been numerous attempts at describing severity of illness by means of a tangible score or number. Table 6-1 is a partial listing of some of the severity measures commonly used in risk assessment of cardiac surgical patients. This list is not meant to be comprehensive, but it does give an overview of the types of risk stratification schemes that have been used for cardiac patients. The risk stratification systems listed in Table 6-1 are in constant evolution, and the descriptions in the table may not reflect current or future versions of these systems. All of these severity measures share 2 common features. First, they are all linked to a specific outcome. Second, all measures view a period of hospitalization as the episode of illness. The severity indices listed in Table 6-1 define severity predominantly based on clinical measures (e.g., risk of death, clinical instability, treatment difficulty, etc.). Two of the severity measures shown in Table 6-1 (MedisGroups used in the Pennsylvania Cardiac Surgery Reporting System and the Canadian Provincial Adult Cardiac Care Network of Ontario) define severity based on resource use (e.g., hospital length-of-stay, cost, etc.) as well as on clinical measures.6,7 Of the 9 severity measures listed in Table 6-1, only one, the APACHE III system, computes a risk score independent of patient diagnosis.8 All of the others in the table are diagnosis-specific systems that use only patients with particular diagnoses in computing severity scores.
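The core mechanics shared by the severity measures in Table 6-1 can be sketched as a minimal additive score that is linked to a specific outcome. All of the variable names, weights, and cut points below are hypothetical, for illustration only; they are not taken from any of the systems in the table:

```python
# Toy additive severity score: each risk factor present contributes a
# weight, and the total score maps to a predicted operative-mortality
# band. Weights and bands here are hypothetical, for illustration only.

WEIGHTS = {
    "age_over_75": 3,
    "ejection_fraction_under_30": 4,
    "emergency_operation": 5,
    "renal_dysfunction": 2,
}

def severity_score(patient):
    """Sum the weights of the risk factors present in this patient."""
    return sum(w for factor, w in WEIGHTS.items() if patient.get(factor))

def predicted_mortality_band(score):
    """Map a score to a coarse risk band (hypothetical cut points)."""
    if score <= 2:
        return "low"
    if score <= 6:
        return "intermediate"
    return "high"

patient = {"age_over_75": True, "renal_dysfunction": True}
s = severity_score(patient)          # 3 + 2 = 5
band = predicted_mortality_band(s)   # "intermediate"
```

The real systems in Table 6-1 differ mainly in which variables they include, how the weights are derived (e.g., by regression, as discussed later in this chapter), and which outcome the score is calibrated against.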

TABLE 6-1 Examples of risk stratification systems used for patients undergoing cardiac surgical procedures

Each of the risk stratification measures shown in Table 6-1 has been tested against a validation set of patients and found to be an adequate measure of the risk of operative mortality or of other outcomes. However, assessing the validity and performance of various risk-adjustment methods entails more than simple cross-validation. No severity tool will ever perfectly describe patients' risks for death, complications, or increased resource use. The most important reason that risk-adjustment methods fail to completely predict outcomes is that the data set used to derive the risk score comes from retrospective, observational data that contain inherent selection bias; i.e., patients were given a certain treatment that resulted in a particular outcome because a clinician had a certain selection bias about what treatment that particular patient should receive. In observational data sets, patients are not allocated to a given treatment in a randomized manner. In addition, clinician bias is not always founded in evidence-based data. An excellent review of the subtleties of evaluating the performance of risk-adjustment methods is given in the book by Iezzoni, and this reference is recommended to the interested reader.9 More attention is paid to the quality of risk-adjustment systems in subsequent sections.

Outcomes and Risk Stratification

There are at least 4 outcomes of interest to surgeons dealing with cardiac surgical patients: mortality, serious nonfatal morbidity, resource utilization, and patient satisfaction. Which patient characteristics constitute important risk factors may depend largely on the outcome of interest. For example, Table 6-2 lists the multivariate factors and odds ratios associated with various outcomes of interest for our patients having cardiac operations.10–12 The clinical variables associated with increased resource utilization after operation are different from those associated with increased mortality risk. As a generalization, the risk factors associated with in-hospital death are likely to reflect concurrent, disease-specific variables, while factors associated with increased resource utilization reflect serious comorbid illness.10,11,13 For example, mortality risk after coronary artery bypass graft (CABG) is associated with disease-specific factors such as ventricular ejection fraction, recent myocardial infarction, and hemodynamic instability at the time of operation. Risk factors for increased resource utilization (as measured by length of stay and hospital cost) include comorbid illnesses such as peripheral vascular disease, renal dysfunction, hypertension, and chronic lung disease. It is not surprising that comorbid conditions are important predictors of hospital charges, since patients with multiple comorbidities often require prolonged hospitalization, not only for treatment of the primary surgical illness but also for treatment of the comorbid conditions.
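The odds ratios reported in tables such as Table 6-2 come from logistic regression: the odds ratio for a risk factor is the exponential of its fitted coefficient, and it scales a patient's baseline odds of the outcome. The numbers below are invented for illustration, not drawn from the table:

```python
import math

# An odds ratio from a logistic regression model is exp(coefficient).
# Given a baseline outcome probability and an odds ratio for a risk
# factor, the adjusted probability follows from multiplying the odds.

def odds(p):
    """Convert a probability to odds."""
    return p / (1.0 - p)

def apply_odds_ratio(p_baseline, odds_ratio):
    """Probability after scaling the baseline odds by the odds ratio."""
    new_odds = odds(p_baseline) * odds_ratio
    return new_odds / (1.0 + new_odds)

beta = math.log(2.5)                       # fitted logistic coefficient
assert abs(math.exp(beta) - 2.5) < 1e-9    # odds ratio = exp(beta)

# 2% baseline mortality, hypothetical risk factor with odds ratio 2.5:
p = apply_odds_ratio(0.02, 2.5)            # roughly 0.049
```

Note that for rare outcomes the odds ratio approximates the relative risk (here 2.5 × 2% ≈ 5%), but the two diverge as the baseline probability grows.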

TABLE 6-2 Multivariate factors associated with various outcomes of cardiac surgery10,11

Operative mortality is an easily defined, readily measured outcome. Most studies that have attempted to define effective care have focused on mortality as an outcome. However, outcomes such as resource utilization or quality of life indicators may be more relevant postoperative factors in many instances. Outcome measures other than operative mortality are particularly important when deciding how to spend health care dollars.14,15

Measures of Comorbidity

Comorbidities are coexisting diagnoses that are indirectly related to the principal surgical diagnosis but may alter the outcome of an operation. Physicians or hospitals that care for patients with a higher prevalence of serious comorbid conditions are clearly at a significant disadvantage in unadjusted comparisons. The prevalence of comorbid illness in patients with cardiac disease has been well demonstrated. In one series of patients with myocardial infarction, 26% also had diabetes, 30% had arthritis, 6% had chronic lung problems, and 12% had gastrointestinal disorders.16

Several indices of comorbidity are available. Table 6-3 compares 5 commonly used comorbidity measures: the Charlson index, the RAND Corporation index, the Greenfield index, the Goldman index, and the APACHE III scoring system.16–23 There are many limitations of comorbidity indices, and they are not applied widely in studies of efficacy or medical effectiveness. Perhaps the most serious drawback of comorbidity scoring systems is the imprecision of the databases used to form the indices. Most of the data used to construct the indices come from two sources: (1) administrative databases in the form of computerized discharge abstract data, and (2) out-of-hospital follow-up reports. Discharge abstracts include clinical diagnoses that are often assigned by nonphysicians who were not involved in the care of the patient. Comprehensive entry of correct diagnoses is not a high priority for most clinicians, and problems with discharge coding have been identified by Iezzoni and others.24–26 These authors found that many conditions that are expected to increase the risk of death are actually associated with a lower mortality. The presumed explanation for this paradoxical finding is that less serious diagnoses are unlikely to be coded and entered in the records of the most seriously ill patients. Likewise, the accuracy of out-of-hospital follow-up studies is hard to validate, and they may contain significant inaccuracies. Because of these shortcomings, analyses that compare physician or hospital outcomes and that do not provide adequate adjustment for patient comorbidity are likely to discriminate against providers or hospitals that treat disproportionate numbers of elderly patients with multiple comorbid conditions.
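Most comorbidity indices of the type compared in Table 6-3 reduce to a weighted sum over coded diagnoses. The sketch below follows the general shape of a Charlson-style index; only an illustrative subset of conditions is shown, with weights approximating (but not reproducing) the published index:

```python
# Sketch of a Charlson-style comorbidity sum: each coexisting diagnosis
# carries a weight and the index is their total. Illustrative subset
# only; weights approximate, not reproduce, the original index.

CHARLSON_STYLE_WEIGHTS = {
    "myocardial_infarction": 1,
    "congestive_heart_failure": 1,
    "diabetes": 1,
    "chronic_pulmonary_disease": 1,
    "moderate_severe_renal_disease": 2,
    "moderate_severe_liver_disease": 3,
    "metastatic_solid_tumor": 6,
}

def comorbidity_index(diagnoses):
    """Sum the weights of the recognized comorbid diagnoses."""
    return sum(CHARLSON_STYLE_WEIGHTS.get(d, 0) for d in diagnoses)

score = comorbidity_index(["diabetes", "moderate_severe_renal_disease"])  # 3
```

The simplicity of the computation is exactly why the quality of the underlying diagnosis coding matters so much: an uncoded diagnosis contributes zero weight, silently lowering the apparent severity of the sickest patients.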

TABLE 6-3 Five comorbidity indices

A vivid example of failure to adjust for severity of illness occurred when the leaders of the Health Care Financing Administration (HCFA) released hospital mortality figures in March of 1986. Significantly higher death rates than predicted were reported for 142 hospitals. At the facility with the most aberrant death rate, 87.6% of Medicare patients died compared to a Medicare average of 22.5%. What was not taken into account was that this particular facility was a hospice caring for terminally ill patients.27 The HCFA model had not adequately accounted for patient risks and comorbidities.

Risk adjustment for severity of illness and comorbidity is equally important for patients about to undergo stressful interventions such as surgical operations or chemotherapy. For example, Goldman et al reported that preexisting heart conditions and other comorbid diseases were important predictors of postoperative cardiac complications for patients undergoing noncardiac procedures.23 The Goldman scoring system is commonly used by anesthesiologists in assessing patients preoperatively, especially prior to noncardiac procedures.

The tools of risk stratification and outcome analysis can be used to judge effectiveness of care and to aid providers in quality improvement in a number of areas, including the following: (1) cost containment, (2) patient education, (3) effectiveness-of-care studies, and (4) improving provider practices. Table 6-4 provides an idealized list of some of the potentially beneficial uses and goals of risk stratification.

TABLE 6-4 Uses of risk stratification and outcome assessment

Improving the Quality of Care

The ultimate goal of risk stratification and outcome assessment is to account for differences in patient risk factors so that patient outcomes can be used as an indicator of quality of care. A major problem arises in attaining this goal because uniform definitions of quality of care are not available. This is particularly true of cardiovascular disease. For example, there are substantial geographic differences in the rates at which patients with cardiovascular diseases undergo diagnostic procedures and, incidentally, there is little, if any, evidence that these variations are related to survival or improved outcome.28–31 In one study, coronary angiography was performed after acute myocardial infarction in 45% of patients in Texas compared to 30% of patients in New York State.29 In these patient populations the differences in the rates of coronary revascularization were not as dramatic, and the survival in these patients was not related to the type of treatment or diagnostic procedures. Regional variations of this sort suggest that a rigorous definition of the "correct" treatment of acute myocardial infarction, and other cardiovascular diseases, is elusive, and the definition of quality of care for such patients is imperfect. Similar imperfections exist for nearly all outcomes in patients with cardiothoracic disorders.

Recognizing the difficulties in defining "best practices" for a given illness, professional organizations have opted to promote practice guidelines or "suggested therapy" for given diseases.32 These guidelines represent a compilation of available published evidence, including randomized trials and risk-adjusted observational studies, as well as consensus among panels of experts proficient at treating the given disease.33 For example, the practice guideline for coronary artery bypass grafting is available for both practitioners and the lay public on the Internet. Table 6-5 summarizes the 1999 AHA/ACC guidelines for coronary artery bypass grafting in patients with acute (Q-wave) myocardial infarction. These guidelines were developed using available randomized controlled trials, risk-adjusted observational studies, and expert consensus. They are meant to provide clinicians with accepted standards of care that most would agree upon, with an ultimate goal of limiting deviations from accepted standards.

TABLE 6-5 1999 AHA/ACC guidelines for CABG in ST-segment elevation (Q-wave) MI

The methodology for developing guidelines for disease treatment is evolving. Many published guidelines do not adhere to accepted standards for developing guidelines.34 The greatest improvement is needed in the identification, evaluation, and synthesis of the scientific evidence.


There have been many efficacy studies relating to cardiothoracic surgery. These studies attempt to isolate one procedure or device and evaluate its effect on patient outcomes. The study population in efficacy studies is specifically chosen to contain as uniform a group as possible. Typical examples of efficacy studies include randomized, prospective, clinical trials (RCTs) comparing use of a procedure or device in a well-defined treatment population with an equally well-defined control population.

Efficacy studies are different from effectiveness studies.5 The latter deal with whole populations and attempt to determine the treatment option that provides optimal outcome in a population that would typically be treated by a practicing surgeon. An example of an effectiveness study is a retrospective study of outcome in a large population treated with a particular heart valve. Risk stratification is capable of isolating associations between outcome and risk factors. Methodological enhancements in risk adjustment are capable of reducing biases inherent in population-based, retrospective studies,35 but they can never eliminate all confounding biases in observational studies.

One reasonable strategy for using risk stratification to improve patient care is to isolate high-risk subsets from population-based, retrospective studies (i.e., effectiveness studies), and then to test interventions to improve outcome in high-risk subsets using RCTs. This is a strategy that should ultimately lead to the desired goal of improved patient care. For example, a population-based study on postoperative blood transfusion revealed that the following factors were significantly associated with excessive blood transfusion (defined as more than 4 units of blood products after CABG): (1) template bleeding time, (2) red blood cell volume, (3) cardiopulmonary bypass time, and (4) age.36 Cross-validation of these results was carried out on a similar population of patients undergoing CABG at another institution. Based on these retrospective studies, it was reasonable to hypothesize that interventions aimed at reducing blood transfusion after CABG were most likely to benefit patients with prolonged bleeding time and low red blood cell volume. A prospective clinical trial was then performed to test this hypothesis using two blood conservation techniques, platelet-rich plasma saving and whole blood sequestration, in patients undergoing CABG. The results of this stratified, prospective clinical trial showed that blood conservation interventions were beneficial in the high-risk subset of patients.37 The implications of these studies are that more costly interventions such as platelet-rich plasma saving are only justified in high-risk patients, with the high-risk subset being defined by risk stratification methodologies. Other strategies have been developed that use risk-adjustment methods to improve quality of care, and these methods will be discussed below.

Other Goals of Outcomes Analysis

Financial factors are a major force behind health care reform. America's health care costs amount to 15% to 20% of the gross national product, and this figure is rising at a rate of 6% annually. Institutions that pay for health care are demanding change, and these demands are fueled by studies that suggest that 20% to 30% of care is inappropriate.38 Charges of inappropriate care stem largely from the observation that there are wide regional variations in the use of expensive procedures.39,40 This has resulted in a shift in emphasis, with health care costs being emphasized on equal footing with clinical outcomes of care. Relman suggested that clinical outcomes will be used by patients, payors, and providers as a basis for distribution of future funding of health care.41 While wide differences in use of cardiac interventions initially fueled charges of overuse in certain areas,42 recent evaluations suggest that underuse of indicated cardiac interventions (either PTCA or CABG) may be a cause of this variation.42–47 Whether caused by underuse or overuse of cardiovascular services, regional variations in resource utilization make it difficult to use outcomes as an indicator of quality of care.

If the causes of regional variations in the use of cardiac interventions seem puzzling, then physician practice behavior might seem bizarre. One study showed that there were unbelievably large variations in care delivered to patients having cardiac surgery.48 Among 6 institutions that treated very similar patients (Veteran's Administration medical centers), there were large differences in the percentage of elective, urgent, and emergent cases at each institution, ranging from 58% to 96% elective, 3% to 31% urgent, and 1% to 8% emergent.48 There was also a 10-fold difference in the preoperative use of intra-aortic balloon counterpulsation for control of unstable angina, varying from 0.8% to 10.6%.48 Similar variations in physician-specific transfusion practices,49 ordering of blood chemistry tests,50 anesthetic practices,51 treatment of chronic renal failure,52 and use of antibiotics53–55 have been observed. This variation in clinical practice may reflect uncertainty about the efficacy of available interventions, or differences in practitioners' clinical judgment. Some therapies with proven benefit are underused.51,52 Whatever the causes of variations in physician practice, they distort the allocation of health care funds in an inappropriate way. Solutions to this problem involve altering physician practice patterns, something that has been extremely difficult to do.56 How can physician practice patterns be changed in order to improve outcome? Evidence suggests that the principal process of outcome assessment, the case-by-case review (traditionally done in the morbidity and mortality conference format), may not be cost-effective, may not improve quality,57 and should be replaced by profiles of practice patterns at institutional, regional, or national levels.
One proposed model for quality improvement involves oversight that emphasizes the appropriate balance between internal mechanisms of quality improvement (risk-adjusted outcome analysis) and external accountability.57


Perhaps the most important tool of any outcome assessment endeavor is a database that is made up of a representative sample of the study group of interest. The accuracy of the data elements in any such database cannot be overemphasized.58–60 Factors such as the source of data, the outcome of interest, the methods used for data collection, standardized definitions of the data elements, data reliability checking, and the time frame of data collection are essential features that must be considered when either constructing a new database or deciding how to use an existing database.59,60 The quality of the database of interest must be evaluated.

Data obtained from claims databases have been criticized. Because these data are generated for the collection of bills, their clinical accuracy is inadequate, and it is likely that these databases overestimate complications for billing purposes.61,62 Furthermore, claims data were found to underestimate the effects of comorbid illness and to have major deficiencies in important prognostic variables for CABG, namely left ventricular function and number of diseased vessels.63 The Duke Databank for Cardiovascular Disease found major discrepancies between clinical and claims databases, with claims data failing to identify more than half of the patients with important comorbid conditions such as congestive heart failure, cerebrovascular disease, and angina.64 The Health Care Financing Administration (HCFA) uses claims data to evaluate variations in the mortality rates in hospitals treating Medicare patients. After an initially disastrous effort at risk adjustment from claims data,65,66 new algorithms were developed. Despite these advances, HCFA halted release of the 1993 Medicare hospital mortality report because of concerns about the database and fears that the figures would unfairly punish inner-city public facilities.67 The importance of the quality of databases used to generate comparisons cannot be overemphasized.

Analytic Tools of Risk Stratification

Implicit in risk adjustment is the use of some analytic technique to determine the significant risk factors that are predictive of the outcome of interest. Some physicians take the "ostrich" approach when it comes to any statistical concept more sophisticated than the t test. This approach is both unscientific and potentially harmful to patient care. The current shift to outcomes analysis carries with it a more intensive reliance on statistical techniques that are capable of evaluating large populations with multiple variables of interest in an interdependent manner, i.e., multivariate analyses. A modicum of statistical knowledge helps unravel the intricacies of risk adjustment and provides confidence in the results of risk-adjustment methodologies. The following sections are not intended to provide the reader with exhaustive knowledge of the statistics of outcome analysis, but rather to provide a resource for critical assessment of these methods and to stimulate the interest of readers to learn more about this important field. Perhaps the biggest single benefit of risk adjustment for outcome analysis will come from physicians increasing their knowledge base about these analytic techniques and gaining confidence in the methodology.

Regression Analysis

The starting point for understanding multivariate statistical methods is a firm grasp of elementary statistics. Several basic texts on statistics are available that are enjoyable reading for the interested health care professional.68–70 These texts are a painless way to become familiar with the basic terminology regarding variable description, simple parametric (normally distributed) univariate statistics, linear regression, analysis of variance, nonparametric (not normally distributed) statistical techniques, and ultimately multivariate statistical methods.

A statistical technique that is commonly used to describe how one variable (the dependent or outcome variable) depends on or varies with a set of independent (or predictor) variables is regression analysis. The dependent or outcome variable of interest can be either continuous (e.g., hospital cost or length of stay) or discrete (e.g., mortality). Discrete outcome variables can be either dichotomous (two discrete values, such as alive or dead) or nominal (multiple discrete values, such as improved, unimproved, or worse). The relationship between the outcome variable and the set of descriptor variables can be any type of mathematical relationship. The books by Glantz and Slinker and by Harrell provide an enjoyable primer on regression analysis and are geared to the biomedical sciences.70,71

Regression analysis means determining the relationship that describes how an outcome variable depends on (or is associated with) a set of independent predictor variables. Put in simple terms, multivariate regression analysis is "model building." The resultant model is useful only if it accurately predicts outcomes for patients by determining significant risk factors associated with the outcome of interest, i.e., risk adjustment of outcome. When the outcome variable of interest is a continuous variable such as hospital cost, linear multivariate regression is often used to construct a model to predict outcome. A multivariate linear regression model contains a set of independent variables that are linearly related to, and can be used to predict, an outcome variable. These significant independent variables are termed risk factors, and knowledge of these risk factors allows separation of patients according to their degree of risk, i.e., risk stratification. The linear regression model has two important features. First, the model allows one to estimate the expected risk of a patient based on his/her risk characteristics. Second, various health care providers can be compared by comparing their observed outcomes to the outcomes that would be predicted from consideration of the risk factors of the patients that they treat (so-called observed to expected ratio or "O/E" ratio).

Statistical terminology used to describe variables and variable distribution patterns is particularly important in understanding linear regression statistical modeling. An important concept is the coefficient of determination (R2). R2 is a summary measure of performance of the statistical model. R2 is often described as the fraction of the total variability of the dependent variable explained by the statistical model. Most investigators routinely report R2 as a measure of the performance of linear regression risk-adjustment models.72 For example, the APACHE III risk-adjustment scoring system described in Table 6-1 can be used to predict ICU length of stay. When this is done, the model is associated with an R2 value of 0.15.19,72 This implies that 15% of the variability in ICU length of stay can be explained by the variables encompassed in the APACHE III score. Another way of saying this is that 85% of the variability in ICU length of stay is not explained by the APACHE III scoring system. This R2 value does not rate the APACHE III scoring system very highly for predicting ICU length of stay. The APACHE III scoring system uses patient data obtained within 24 hours of admission to the ICU to predict outcome and was developed to predict in-hospital mortality, not hospital costs or length of stay. We have found that the outcome of patients admitted to the ICU depends more on events that happen after admission (especially iatrogenic events occurring in the ICU) than on patient characteristics present on admission to the ICU.73 Hence, it is not surprising that the APACHE III score does not account for all of the variability in ICU length of stay. In addition, it is not clear what level of R2 can be expected when the APACHE III system is used in a very different context than the one for which it was designed.
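The arithmetic behind R2 is simple enough to sketch in a few lines. The fragment below computes R2 as one minus the ratio of the residual sum of squares to the total sum of squares; the length-of-stay values and model predictions are invented for illustration.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_obs = sum(observed) / len(observed)
    ss_total = sum((y - mean_obs) ** 2 for y in observed)
    ss_resid = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    return 1.0 - ss_resid / ss_total

# Hypothetical ICU lengths of stay (days) and a model's predictions
observed  = [2, 3, 5, 8, 13, 4, 6, 9]
predicted = [3, 3, 4, 7, 10, 5, 6, 8]

print(round(r_squared(observed, predicted), 2))
```

A value near 1 means the model explains most of the outcome's variability; a value of 0.15, as reported for APACHE III above, leaves 85% of the variability unexplained.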
Shwartz and Ash give an excellent review of evaluating the performance of risk-adjustment methods using R2 as a measure of model performance, and their work provides an enlightening insight into the tools of risk adjustment.74 Hartz et al point out that it is unlikely that any large multivariate regression model will completely account for all of the variability of any complex outcome.75 When the outcome variable of interest is a discrete variable (e.g., mortality), then nonlinear regression analysis is used. Logistic regression is the nonlinear method most widely used to model dichotomous outcomes in the health sciences. Logistic regression makes use of the mathematical fact that the expression e^x / (1 + e^x) assumes values between 0 and 1 for all values of x. The value x in the expression can be a linear sum of predictor variables (either continuous or discrete), and the value of e^x / (1 + e^x) is the probability of the outcome, between 0 (e.g., survival) and 1 (e.g., death), for any value of the predictor variables. Computer iteration techniques can be used to produce a model consisting of a set of independent variables that best predict the occurrence of a dichotomous outcome variable. The significant independent variables identified by the logistic regression model are risk factors that allow risk stratification of patients according to their risk of experiencing the dichotomous outcome (e.g., survival versus death).
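As a minimal illustration of the logistic transform, the sketch below maps a linear sum of predictor variables to a probability between 0 and 1. The intercept and coefficients are invented for illustration, not taken from any published mortality model.

```python
import math

def logistic(x):
    # e^x / (1 + e^x) maps any linear predictor x to a probability in (0, 1)
    return math.exp(x) / (1.0 + math.exp(x))

# Hypothetical (made-up) coefficients for a dichotomous mortality model:
# x = intercept + b1*age + b2*reoperation
def predicted_mortality(age, reoperation):
    x = -7.0 + 0.05 * age + 1.2 * (1 if reoperation else 0)
    return logistic(x)

print(round(predicted_mortality(65, False), 3))
print(round(predicted_mortality(65, True), 3))
```

Fitting a real model means choosing the coefficients (by maximum likelihood, via computer iteration) so that these predicted probabilities best match the observed dichotomous outcomes.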

The performance of logistic regression models can be assessed in several ways. However, there is less agreement about how best to measure performance for models that predict binary outcomes than there is about the use of R2 to evaluate linear regression models. One commonly used parameter to evaluate the performance of logistic regression models is the c-statistic.76 The c-statistic is equal to the area under the receiver operator characteristic (ROC) curve and can be generated from the sensitivity and specificity of measurements of any dichotomous outcomes. Table 6-6 describes the formulas for the statistical terms commonly used to describe dichotomous outcomes.

TABLE 6-6 Formulas for analysis of risk-adjusted dichotomous outcomes

Figure 6-2 depicts the ability of a logistic regression model to predict patients who will receive a blood transfusion after CABG based on preoperative variables including preoperative aspirin use.77 This figure is an ROC curve derived from a plot of the sensitivity versus 1 minus the specificity (same as a plot of the true positive rate on the Y-axis versus the false positive rate on the X-axis). The ROC curve in Figure 6-2 is produced by assuming a particular cutoff point for a predicted probability of one outcome (e.g., patients with a predicted probability greater than 0.5 of receiving any transfusion after CABG are considered to be positive for receiving a blood transfusion). The c-statistic for the prediction model is 0.738, suggesting only fair ability of the model to predict postoperative blood transfusion. To put this in perspective, a c-statistic of 1.0 indicates perfect discrimination of the model, and a c-statistic of 0.5 indicates no discrimination. So a c-statistic of 0.738 is about halfway between perfect and worthless. An excellent critique of the various methods used to assess performance of regression models of dichotomous outcomes is given by Ash and Shwartz in the book by Iezzoni.78
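The c-statistic has a convenient interpretation that makes it easy to compute directly: it is the probability that a randomly chosen patient who experienced the outcome received a higher predicted probability than a randomly chosen patient who did not (ties counted as one-half). The sketch below uses toy predictions, not the transfusion data of Figure 6-2.

```python
def c_statistic(probs, outcomes):
    """Area under the ROC curve via pairwise concordance.
    probs: predicted probabilities; outcomes: 1 = event, 0 = no event."""
    pos = [p for p, y in zip(probs, outcomes) if y == 1]
    neg = [p for p, y in zip(probs, outcomes) if y == 0]
    concordant = sum(1.0 for p in pos for q in neg if p > q)
    ties = sum(0.5 for p in pos for q in neg if p == q)
    return (concordant + ties) / (len(pos) * len(neg))

# Toy predictions for transfusion (1) versus no transfusion (0)
probs    = [0.9, 0.8, 0.7, 0.6, 0.55, 0.5, 0.4, 0.3, 0.2, 0.1]
outcomes = [1,   1,   0,   1,   0,    1,   0,   0,   1,   0]
print(round(c_statistic(probs, outcomes), 3))
```

A model that ranks every event above every non-event scores 1.0; a model no better than chance scores 0.5.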

FIGURE 6-2 Receiver operating characteristic (ROC) curve demonstrating association of preoperative aspirin use in 2606 patients before CABG with postoperative blood transfusion. For this analysis, postoperative transfusion was considered a dichotomous variable (patients either received a transfusion or they did not). The area between the diagonal line and the upper curve represents the c-statistic.77

Hierarchical Models and Logistic Regression

Logistic regression models have been used to develop risk profiles for providers (both hospitals and individual surgeons), or so-called report cards.79–86 This has caused anguish on the part of providers,87 and concern on the part of statisticians and epidemiologists.79,80 The typical approach that report cards take is to grade surgeons by their operative mortality for CABG. In order to grade a provider, the expected number of deaths (E or expected rate) calculated from deaths observed in the entire provider group is compared to the observed number of risk-adjusted deaths for the provider (O or observed rate). This gives an O/E ratio, which is a ratio of the risk-adjusted observed mortality rate to the expected mortality rate based on the group logistic model. In order to make comparisons between providers, a confidence interval (usually a 95% confidence interval) is assigned to the observed mortality rates, and the between-provider mortality rates are presented as a range of values for each provider (Fig. 6-3A). The expected mortality rates are assumed to be independent of the observed mortality rates, an incorrect assumption. Furthermore, no sampling error is attached to the expected values, another incorrect assumption. The effect of making these two assumptions is to identify too many outliers (in either direction).
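The naive report-card calculation can be sketched with invented provider numbers. Note that the sketch attaches a confidence interval only to the observed rate, by a normal approximation, and treats the expected rate as error-free, which is exactly the incorrect assumption described above.

```python
import math

def oe_ratio(observed_deaths, expected_deaths):
    """O/E ratio: observed deaths divided by model-expected deaths."""
    return observed_deaths / expected_deaths

def naive_ci_mortality(observed_deaths, n_cases, z=1.96):
    """Naive 95% CI for an observed mortality rate (normal approximation).
    No sampling error is attached to the expected rate -- the flawed
    assumption that leads report cards to flag too many outliers."""
    rate = observed_deaths / n_cases
    se = math.sqrt(rate * (1.0 - rate) / n_cases)
    return (max(0.0, rate - z * se), rate + z * se)

# Hypothetical provider: 9 deaths in 300 CABG cases, 6.0 expected deaths
print(round(oe_ratio(9, 6.0), 2))
print(tuple(round(x, 4) for x in naive_ci_mortality(9, 300)))
```

An O/E ratio above 1 suggests worse-than-expected mortality, but the width (and validity) of the interval determines whether outlier status is justified.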

FIGURE 6-3 Comparison of classical statistical point and interval estimates to estimates from hierarchical models for a sample of New York surgeons performing coronary artery bypass grafting. (A) Estimates and 95% intervals for risk-adjusted mortality rates, classical approach (black circle) and estimates from hierarchical models (open circle). The state average of 2.99% is shown with the vertical line. (B) Mean and 95% intervals for rank of surgeon, classical approach (solid circle) and estimates from hierarchical models (open circle). (Reproduced with permission from Goldstein H, Spiegelhalter DJ: League tables and their limitations: statistical issues in comparisons of institutional performance (with discussion). J R Stat Soc 1996; 159:385.)

Statistical methodology to account for these incorrect assumptions has been available for many years. The methodology involves construction of hierarchical regression models. Hierarchy means nesting, and the name of these models implies that they incorporate (or nest) other levels of analysis within the analysis of provider mortality. For example, patients are nested within provider groups (patients treated by a given provider), but then patients are also nested within hospital groups (patients treated at a given hospital). The most important feature of hierarchical models is that the model recognizes that the nested observations may be correlated (e.g., mortality may depend on the surgeon, the hospital where care is provided, and other unspecified variables such as referral patterns, academic status, hospital size and location, etc.) and that different sources of variation can occur at each level (or nest). Computer-intensive methods are available to produce hierarchical models for risk-adjusted surgeon mortality. Most statisticians recognize hierarchical models as the preferred method to perform this type of provider analysis, but the methods are complex, labor-intensive, and not included in most commercially available computer software.
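The qualitative behavior of hierarchical models can be illustrated with a stylized empirical-Bayes shrinkage rule: each provider's observed rate is pulled toward the group mean, and low-volume providers are pulled hardest. The pseudo-volume k below is an arbitrary illustrative constant, not an estimated variance component from a real hierarchical fit.

```python
def shrunken_rate(deaths, n_cases, group_rate, k=100.0):
    """Stylized shrinkage: weight the provider's own rate by case volume
    and the group rate by an illustrative 'pseudo-volume' k (an arbitrary
    assumption here, not an estimated variance component)."""
    observed = deaths / n_cases
    w = n_cases / (n_cases + k)
    return w * observed + (1.0 - w) * group_rate

group_rate = 0.0299   # e.g., a statewide average mortality of 2.99%

# (deaths, cases) for three hypothetical surgeons of increasing volume
for deaths, cases in [(0, 50), (6, 100), (30, 1000)]:
    raw = deaths / cases
    print(round(raw, 4), "->", round(shrunken_rate(deaths, cases, group_rate), 4))
```

The low-volume surgeon with zero deaths is pulled well up toward the state average, while the high-volume surgeon's estimate barely moves, which is why hierarchical analyses identify fewer outliers than the classical approach.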

Traditional logistic regression modeling to rank surgeons according to their risk-adjusted mortality rates results in exaggerated (incorrect) provider profiles.28,79,80,88–90 Goldstein and Spiegelhalter reexamined the 1994 New York State publicly disseminated mortality data using hierarchical logistic regression.88 Figure 6-3 summarizes their findings. The hierarchical analysis dampens the surgeon mortality rates towards the mean of all providers. More importantly, the New York State report identified three outliers (2 low and 1 high) based on simple logistic regression, while hierarchical analysis identified only one outlier (1 high). Similar results were obtained by Grunkemeier et al when they applied the principles of hierarchical regression analysis to the Providence Health System logistic regression model for CABG operative mortality.80

The use of simplistic models, as in the case of the New York State or the Providence Health System logistic regression models, is incorrect, and probably unethical, given the potential impact that outlier status might have on a surgeon's practice. Two impediments to widespread use of hierarchical models are the absence of the necessary large data sets and the lack of readily available easy-to-use software packages. There are statistical "workarounds" that adjust data sets to obtain similar results to those obtained from hierarchical models,28,91 but it is unclear if such adjustment produces a result that is qualitatively different from that produced by standard hierarchical analysis. At present, hierarchical regression is the "gold standard" for risk adjustment of dichotomous outcomes and producing provider report cards. Unfortunately, this gold standard is rarely used.

Statistics of Survival

When the outcome of interest is a time-dependent variable (e.g., hospital length of stay or survival after valve implantation), then regression modeling may be a more complex but still manageable process. Regression models for time-dependent outcome variables can be developed using computer iteration methods. Several excellent texts are available that cover the gamut of technical information from the relatively simple92,93 to the complex.94–96 One model that has been used extensively in the biomedical sciences is the Cox proportional hazards regression model.94 In some regression models, such as logistic regression, the dependent or outcome variable is known with precision. With time-dependent outcome variables, the possibility exists that only a portion of the survival time is observed for some patients. Thus the data available for analysis will consist of some outcomes that are incomplete or "censored." Some regression models, such as logistic regression, do not easily adapt to censored data. Cox's model overcomes these technical problems by assuming that the independent variables are related to survival time by a multiplicative effect on the hazard function; thus it is a "proportional hazards" model. The hazard function is defined as the slope of the survival curve (or the time decay curve) for a series of time-dependent observations. In the Cox model, one assumes that the hazard functions are proportional, a reasonable assumption when comparing survival in two or more similar groups. Hence, it is not necessary to know the underlying survival function in order to determine the relative importance of independent variables that contribute to the overall survival curve. Table 6-7 shows an example of the use of Cox regression to evaluate the independent variables that are predictive of hospital length of stay (a time-dependent outcome variable) for patients undergoing CABG.11 For the purposes of this analysis, hospital deaths were considered censored observations.
The independent variables shown in Table 6-7 are considered risk factors for increased length of stay and can be used to stratify patients into groups with varying risk of prolonged hospitalization.
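The multiplicative structure of the proportional hazards assumption can be made concrete: if S0(t) is the baseline survival at time t, a patient with covariates x has survival S0(t) raised to the power exp(beta·x). The baseline survival and coefficients below are invented for illustration, not taken from the model in Table 6-7.

```python
import math

def survival_given_covariates(s0_t, betas, covariates):
    """Proportional hazards: S(t | x) = S0(t) ** exp(beta . x),
    where exp(beta . x) is the patient's hazard ratio."""
    hazard_ratio = math.exp(sum(b * x for b, x in zip(betas, covariates)))
    return s0_t ** hazard_ratio

# Invented coefficients for two indicator covariates
betas = [0.7, 0.4]          # (reoperation, age > 70) -- made-up values
s0 = 0.80                   # hypothetical baseline 5-year survival

print(round(survival_given_covariates(s0, betas, [0, 0]), 3))  # baseline patient
print(round(survival_given_covariates(s0, betas, [1, 1]), 3))  # both risk factors
```

Because the covariates act multiplicatively on the hazard, their relative importance can be estimated without ever specifying the baseline survival function itself.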

TABLE 6-7 Cox proportional hazards regression model for significant predictor variables associated with increased hospital length of stay in 938 patients undergoing CABG during 199311

Measures of the performance of Cox regression models are less well developed than those for logistic regression or for multivariate linear regression. Simple methods of checking the predictive value of Cox models are usually reported in the medical literature. One of the most commonly used methods is called cross-validation, or jackknife analysis. This consists of using Cox regression to determine significant independent variables predictive of outcome in a "training" set of data. The significant variables and their coefficients from the Cox model are then used to compute values of the outcome variables in a different data set (called the validation set, or the jackknife set). The agreement between the predicted values and the observed values in the validation set is used as an index of the performance of the Cox model. Figure 6-4 shows a cross-validation set of data used to check the results of the Cox regression described in Table 6-7. This figure shows fair "validation" of the Cox model using a relatively small data set (1200 patients operated upon during 1994 at a single institution). Performance of Cox models can also be expressed in terms of ROC curves and the c-statistic as for logistic regression models.
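The training/validation logic of cross-validation can be sketched with synthetic data: a simple risk model is "fit" on half the patients (here just the mean length of stay per risk-factor count, a deliberately crude stand-in for the Cox coefficients in the text) and its predictions are checked against the held-out half.

```python
import random
from collections import defaultdict

random.seed(1)

# Synthetic patients: (number_of_risk_factors, length_of_stay in days)
patients = [(k, 5 + 2 * k + random.choice([-1, 0, 1]))
            for k in range(5) for _ in range(20)]
random.shuffle(patients)
training, validation = patients[:50], patients[50:]

# "Train": mean LOS for each risk-factor count in the training set
by_count = defaultdict(list)
for k, los in training:
    by_count[k].append(los)
predicted = {k: sum(v) / len(v) for k, v in by_count.items()}

# "Validate": mean absolute error of the predictions on held-out patients
errors = [abs(predicted[k] - los) for k, los in validation if k in predicted]
print(round(sum(errors) / len(errors), 2))
```

Close agreement between predicted and observed values in the held-out set, as in Figure 6-4, is the index of model performance.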

FIGURE 6-4 Cross-validation of Cox regression model used to predict hospital length of stay after CABG. Data from patients operated upon in 1993 were used as the "training" data set to predict values for LOS in patients undergoing operation in 1994 ("validation set"). Risk scores were generated by assigning numerical weights to significant variables in the Cox model based on regression coefficients shown in Table 6-7. The methods used to derive the patient risk scores are described in the references.11,287 (Reproduced with permission from Ferraris VA, Ferraris SP: Risk factors for postoperative morbidity. J Thorac Cardiovasc Surg 1996; 111:731.)

Bayesian Analysis: Models Based on Experience

Thomas Bayes was a nonconformist minister and mathematician who is given credit for describing the probability of an event based on knowledge of prior probabilities that the same event has already occurred.97 Using the Bayesian approach, three sets of probabilities are defined: (1) the probability of an event before the presence of a new finding is revealed (prior probability); (2) the probability that an event is observed given that an independent variable is positive (conditional probability); and (3) the probability of an event occurring after the presence of a new finding is revealed (posterior probability). The mathematical relationship between the three probabilities is Bayes' theorem. The prior and posterior probabilities are defined with respect to a given set of independent variables. In the sequential process common to all Bayesian analyses, the posterior probabilities for one finding become the prior probabilities for the next, and a mathematical combination of prior and conditional probabilities produces posterior probabilities. Bayes' theorem can be expressed in terms of the nomenclature of Table 6-6 as:
PBayes = (sensitivity × prior probability) / [(sensitivity × prior probability) + (1 − specificity) × (1 − prior probability)]
where PBayes is defined as the probability of a given outcome if prior probabilities are known. The principles of Bayesian statistics have been used widely in decision analysis98 and can also be used to generate multivariate regression models based on historical data about independent variables.99–102 Bayesian multivariate regression models are generated using computer-based iterative techniques99,102–104 and were used in the past, but not at present, to develop the risk stratification analysis for the Society of Thoracic Surgeons National Cardiac Database.103,105 Evaluation of the performance of Bayesian statistical regression models is usually done by cross-validation studies similar to those used to validate the Cox survival regression model in Figure 6-4. Performance of Bayesian models can also be expressed in terms of ROC curves and the c-statistic, as for logistic regression models. Marshall et al have shown that Bayesian models of risk adjustment give comparable results and produce ROC curves similar to those generated from logistic regression analysis using conventional models.106
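The sequential process described above, in which each posterior probability becomes the prior for the next finding, can be sketched for dichotomous findings. The sensitivities and specificities below are invented for illustration.

```python
def posterior(prior, sensitivity, specificity, finding_present=True):
    """Posterior probability of the outcome given one dichotomous finding,
    expressed with the sensitivity/specificity nomenclature of Table 6-6."""
    if finding_present:
        true_pos = sensitivity * prior
        false_pos = (1.0 - specificity) * (1.0 - prior)
        return true_pos / (true_pos + false_pos)
    false_neg = (1.0 - sensitivity) * prior
    true_neg = specificity * (1.0 - prior)
    return false_neg / (false_neg + true_neg)

# Sequential updating: each posterior becomes the prior for the next finding
p = 0.02                       # hypothetical prior probability of the outcome
p = posterior(p, 0.80, 0.90)   # first finding present
p = posterior(p, 0.70, 0.85)   # second finding present
print(round(p, 3))
```

Each positive finding raises the estimated probability; a negative finding (finding_present=False) would lower it.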


An implicit part of assessing outcome is the development of a best standard of care for a given illness or disease process. Once the most efficacious treatment is known, then comparisons with, or deviations from, the standard can be assessed, a process called "benchmarking." As mentioned above, the "best standard" is not always known. Meta-analysis is a quantitative approach for systematically assessing the results of multiple previous studies to determine the best or preferred outcome. The overall goal of meta-analysis is to combine the results of previous studies to arrive at a consensus conclusion about the best outcome. Stated in a different way, meta-analysis is a tool used to summarize efficacy studies (preferably RCTs) of an intervention in a defined population with a disease in order to determine which intervention is likely to be effective in a large population with a similar disorder. Meta-analysis is a tool that can relate efficacy studies to effectiveness of an intervention by summarizing available medical evidence.

In summarizing available medical evidence on a given subject, information retrieval is king. Nowhere is this more evident than in the Cochrane Collection of available randomized trials on various medical subjects. For example, a recent Cochrane review found 17 trials that evaluated postoperative neurological deficit in patients having hypothermic cardiopulmonary bypass (CPB) compared to normothermic CPB.107 This compares to a recently published meta-analysis on a similar topic that found only 11 trials with which to perform a similar analysis.108 The Cochrane reviewers perform an exhaustive search of all available literature, using not only MEDLINE but also unpublished trials, and so-called "fugitive literature" (government reports, proceedings of conferences, published Ph.D. theses, etc.). The average thoracic surgeon has not heard of "publication bias," but the Cochrane reviewers are acutely aware of it. They realize that RCTs that have a negative result are less likely to pass the peer-review editorial process into publication than RCTs with a significant treatment effect, the so-called "publication bias" in favor of positive clinical trials.109 So for each of the Cochrane reviews, attempts are made to find unpublished and/or negative trials to add to the body of evidence about a given subject.

Not all trials or observational studies that address the same outcomes of a given intervention are alike. There are almost always subtle differences in study design, sample size, analysis of results, and inclusion/exclusion criteria. The object of comparing multiple observational studies and RCTs on the same treatment outcome is to come up with a single summary estimate of the effect of the intervention. Calculating a single estimate in the face of such diversity may give a misleading picture of the truth. There are no statistical tricks that account for bias and confounding in the original studies. Heterogeneity of the various RCTs and observational studies on the same or similar treatment outcome is the issue. This heterogeneity makes comparison of RCTs a daunting task, about which volumes have been written.15

There are at least two types of heterogeneity that confound summary estimates of multiple RCTs: clinical heterogeneity and statistical heterogeneity. Statistical heterogeneity is present when the between-study variance is large; i.e., similar treatments result in widely varying outcomes in different trials. This form of heterogeneity is easiest to measure. For example, Berlin et al evaluated 22 separate meta-analyses and found that only 14 of 22 had no evidence of statistical heterogeneity.110 Three of the remaining 8 comparative studies gave different results depending on the type of statistical methods used for the analysis; the more statistical heterogeneity, the less certain the statistical inferences from the analysis.

Clinical heterogeneity of groups of RCTs that assess similar outcomes is much more difficult to assess. Measurement of treatment outcomes has plagued reviewers who try to summarize RCTs. Many RCTs address similar treatment options (e.g., hypothermic CPB versus normothermic CPB) but measure slightly different outcomes (e.g., stroke or neuropsychological dysfunction). For example, the Cochrane Heart Group found 17 RCTs that addressed the effect of CPB temperature on postoperative stroke.107 Only 4 of these 17 RCTs measured neuropsychological function, while all 17 measured neurological deficit associated with CPB. In summarizing the results of multiple RCTs comparing a given treatment, it is necessary to match "apples with apples" when looking at outcomes. In this analysis by the Cochrane Heart Group, there was a trend towards a reduction in the incidence of nonfatal strokes in the hypothermic group (OR = 0.68; 95% CI, 0.43–1.05). Conversely, the number of nonstroke-related perioperative deaths tended to be higher in the hypothermic group (OR = 1.46; 95% CI, 0.9–2.37). When all "bad" outcomes (stroke, perioperative death, myocardial infarction, low-output syndrome, intra-aortic balloon pump use) were pooled, neither hypothermia nor normothermia had a significant advantage (OR = 1.07; 95% CI, 0.92–1.24). This suggests that there is clinical heterogeneity among the various RCTs evaluated. There are statistical "tricks," such as stratification or regression, that can investigate and explore the differences among studies, but it is unlikely that clinical heterogeneity can be completely removed from the meta-analysis. Importantly, the Cochrane Group concludes from these data that there is no definite advantage of hypothermia over normothermia in the incidence of clinical events following CPB. This constitutes good evidence (multiple well-done RCTs) to support the notion that normothermic and hypothermic CPB have equal efficacy for most outcomes.
An expert panel reviewing the Cochrane evidence might suggest that there is class I evidence (according to the ACC/AHA guideline nomenclature) that neither normothermic nor hypothermic CPB results in an increased incidence of perioperative complications. This is an entirely different conclusion from the one Bartels et al reached in their meta-analysis of the same interventions. These authors suggest that there is little evidence to support the usefulness/efficacy of hypothermia in CPB.108 No one can say which meta-analysis is closer to the truth. Much depends on the details of the meta-analyses, but logic suggests that the higher-quality study, including more of the available RCTs and a statistically rigorous analysis, comes closer to the scientific truth.
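One common way to compute a single summary estimate from several trials is fixed-effect, inverse-variance pooling of the log odds ratios. This is a generic sketch with invented trial counts, not the Cochrane group's actual method, and it omits the zero-cell corrections a real analysis would need.

```python
import math

def pooled_odds_ratio(trials):
    """Fixed-effect (inverse-variance) pooling of odds ratios.
    Each trial: (events_treated, n_treated, events_control, n_control)."""
    weighted_sum, weight_total = 0.0, 0.0
    for a, n1, c, n2 in trials:
        b, d = n1 - a, n2 - c
        log_or = math.log((a * d) / (b * c))
        var = 1.0 / a + 1.0 / b + 1.0 / c + 1.0 / d   # variance of log OR
        weight = 1.0 / var                            # precise trials weigh more
        weighted_sum += weight * log_or
        weight_total += weight
    pooled_log_or = weighted_sum / weight_total
    se = math.sqrt(1.0 / weight_total)
    ci = (math.exp(pooled_log_or - 1.96 * se), math.exp(pooled_log_or + 1.96 * se))
    return math.exp(pooled_log_or), ci

# Hypothetical stroke counts from three small RCTs of hypothermic vs. normothermic CPB
trials = [(4, 100, 7, 100), (6, 150, 8, 150), (3, 80, 5, 80)]
or_pooled, (lo, hi) = pooled_odds_ratio(trials)
print(round(or_pooled, 2), (round(lo, 2), round(hi, 2)))
```

If the confidence interval crosses 1.0, as in the Cochrane stroke comparison above, the pooled evidence does not establish an advantage for either arm.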

There is some concern about the findings of meta-analysis.111–113 LeLorier et al found significant discrepancies between the conclusions of meta-analyses and subsequent large RCTs.112,113 On review of selected meta-analyses, Bailar found that "problems were so frequent and so serious, including bias on the part of the meta-analyst, that it was difficult to trust the overall best estimates that the method often produces."114,115 Great caution must be used in the interpretation of meta-analyses, but the technique has gained a strong following among clinicians since it may be applied even when the summarized studies are small and there is substantial variation in many of the factors that may have an important bearing on the findings.

"Breakthrough Statistics"

The use of complex statistics is becoming more common in assessing medical data. Arguably, the understanding of this complex material on the part of clinicians has not advanced at a similar rate. In an attempt to address this knowledge gap, Blackstone coined the term "breakthrough statistics" to denote newer methods that are available to handle complex, but clinically important, research questions.35 His goal was to acquaint clinicians with the methods in a nontechnical fashion "so that you may read reports more knowledgeably, interact with your statistical collaborators more closely, or encourage your statistician to consider these methods if they are applicable to your clinical research."35 These worthy goals have direct relevance to outcomes assessment and risk stratification.


One of Blackstone's breakthrough statistical methods deals with a very common problem: the assessment of nonrandomized comparisons. Observational studies, or nonrandomized comparisons, can detect associations between risk and outcome but cannot, strictly speaking, determine which risks cause the particular outcome. Traditionally only RCTs have been able to determine cause and effect. A so-called breakthrough technique to allow nonrandomized comparisons to come closer to inferring cause and effect is the use of balancing scores.

Simple comparison of two nonrandomized treatments is confounded by selection factors. That means that a clinician decided to treat a particular patient with a given treatment for some reason that was not obvious and was not necessarily evidence-based. The selection factors used in this nonrandomized situation are difficult to control, and RCTs eliminate this type of bias. However, RCTs are often not applicable to the general population of interest since they are very narrowly defined.116

Use of nonrandomized comparisons is more versatile and less costly. One of the earliest methods used to account for selection bias was patient matching. Two groups who received different treatment were matched as closely as possible for all factors except the variable of interest. Balancing scores were developed as an extension of patient matching. In the early 1980s, Rosenbaum and Rubin introduced the idea of balancing scores to analyze observational studies.117 They called the simplest form of a balancing score a propensity score. Their techniques were aimed at drawing causal inference from nonrandomized comparisons. The propensity score is a probability of group membership. For example, in a large group of patients having CABG, some receive aspirin before operation and others do not. One might ask whether preoperative aspirin causes increased postoperative blood transfusion. The propensity score is a probability between 0 and 1 that can be calculated for each patient, and this score represents their probability of getting an aspirin before operation. If the aspirin and nonaspirin patients are matched by their propensity scores, the patients will be as nearly matched as possible for every preoperative characteristic excluding the outcome of interest. Not all the patients may be included in the analysis because some aspirin users may have a propensity score that is not closely matched to a nonaspirin user. But those aspirin users who have a matching propensity score to a nonaspirin user will be very closely matched for every variable except for the outcome variable of interest. This is as close to a randomized trial comparison as you can get without actually doing a randomized trial.

How is the propensity score calculated? The relevant question asked to construct the propensity score is which factors predict group membership (e.g., who will receive aspirin and who will not). The probability of receiving an aspirin is a dichotomous variable that can be modeled like any other binary variable. For example, logistic regression can be used to identify factors associated with aspirin use. In the logistic regression analysis to develop the propensity score, as many risk factors as possible are included in the model, and the logistic equation is solved (or modeled) for the probability of being in the aspirin group. This probability is the propensity score. An example of the results obtained from this type of analysis is shown in Table 6-8.77 In this analysis, 2606 patients (1900 preoperative aspirin users and 606 nonusers) were "balanced" by being divided into 5 equal quintiles according to their propensity scores. Quintile 1 had the least chance of receiving aspirin while quintile 5 had the greatest chance of receiving aspirin before operation. Within each quintile, the patients were matched as closely as possible for all variables except for the outcome variable of interest, i.e., receiving any blood transfusion after CABG, almost like a randomized trial. Notice that within each quintile, aspirin users and nonusers were closely matched for other variables, such as preoperative renal function, gender, and cardiopulmonary bypass time. This indicates that the propensity score matching did what it was intended to do: match the patients for all variables except for the outcome variable of interest (i.e., postoperative transfusion). The results show that the propensity-scored quintiles are asymmetric; i.e., there is not a consistent association between aspirin and blood transfusion across all quintiles.
In the stratum least likely to receive preoperative aspirin are the patients most likely to receive postoperative transfusion (i.e., patients in quintile 1 have the longest cardiopulmonary bypass times, the greatest number of women, and the largest number of patients with preoperative renal dysfunction). This implies that some patients may have been recognized as high risk preoperatively and were not given aspirin, i.e., selection bias exists in the data set. There is some evidence that well-done observational studies give results comparable to those of RCTs addressing similar outcomes,118,119 and balancing scores provide an optimal means of analyzing nonrandomized studies.
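The quintile-balancing step described above can be sketched in a few lines of Python. Everything here is hypothetical: the covariates, the logistic coefficients, and the simulated cohort are invented for illustration and are not the data behind Table 6-8.

```python
import math
import random

def propensity(age, female, renal, b0=-1.0, b_age=0.02, b_fem=-0.5, b_ren=-0.8):
    """Toy logistic model for Pr(preoperative aspirin); all coefficients
    here are hypothetical, not taken from the chapter's analysis."""
    z = b0 + b_age * age + b_fem * female + b_ren * renal
    return 1.0 / (1.0 + math.exp(-z))

def quintile_strata(patients):
    """Rank patients by propensity score and split into 5 strata of equal size."""
    ranked = sorted(patients, key=lambda p: p["ps"])
    n = len(ranked)
    return [ranked[i * n // 5:(i + 1) * n // 5] for i in range(5)]

rng = random.Random(1)
patients = []
for _ in range(1000):
    age = rng.randint(40, 85)
    female = rng.random() < 0.3
    renal = rng.random() < 0.1
    ps = propensity(age, int(female), int(renal))
    patients.append({"age": age, "female": female, "renal": renal,
                     "ps": ps,
                     # treatment assignment tracks the score, as in practice:
                     "aspirin": rng.random() < ps})

strata = quintile_strata(patients)
# Within each stratum, aspirin users and nonusers have nearly equal mean
# propensity scores, so their covariates are approximately balanced and the
# transfusion outcome could be compared almost as in a randomized trial.
```

In a real analysis the coefficients would come from fitting the logistic model to the observed data rather than being fixed in advance.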

TABLE 6-8 Effect of aspirin (ASA) on postoperative blood transfusion using propensity score-matched quintiles


The importance of risk factor identification for comparing outcomes has already been stressed. Risk factor identification for a given outcome has become commonplace in medicine. A problem arises from this dependence on risk factor analysis, especially logistic regression. Different observers analyzing the same risk factors to predict outcome get different results. Table 6-9 is an example of the variability in risk factor identification that can result. In this table, Grunkemeier et al compared 13 published multivariate risk models for mortality following CABG.80 The number of independent risk factors cited by any one model varied from 5 to 29! Naftel described 9 different factors that contribute to different investigators obtaining different models to predict outcome (i.e., different sets of risk factors associated with the same outcome).120 Some or all of these factors may affect the risk models listed in Table 6-9. One of Naftel's factors that is important in differentiating various models of CABG mortality is variable selection. In Table 6-9, 13 different groups found 13 different variable patterns that apparently adequately predicted operative mortality. How can this be? Recent breakthrough statistical methods have surfaced that address variable selection in statistical modeling.

TABLE 6-9 Risk models for operative mortality for CABG

In the early 1980s, the ready availability of computers began to surface in the consciousness of investigators. Efron et al popularized computer-intensive computational techniques that were not practical until computers were on every investigator's desk.121–124 They coined the term "bootstrap" to describe these computer-intensive methods. Bootstrap analysis is a data-based simulation method for statistical inference. Efron's group and others found that by taking repeated random samples from a data set (1000 random samples is typical) and determining risk factors for an outcome from each new sample using statistical modeling, the predictor variables obtained from the 1000 random samples were usually different. However, some variables were never selected in the model and others were selected consistently. The frequency of occurrence of risk factors among the 1000 or more models identifies variables that have a high degree of reproducibility and reliability as independent risk factors for the given outcome. This process is called "bootstrap bagging,"125 and it has formalized the development of model building, which had previously been more of an art than a science. As a result of this work with bootstrap analysis, model building and risk adjustment will be held to a more rigorous scientific standard.
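Bootstrap bagging can be sketched as follows. To keep the example self-contained, the modeling step inside each resample is a simplified stand-in (a screen on the difference in outcome rates) rather than a full logistic regression, and the data are synthetic; the variable names and effect sizes are invented.

```python
import random

def select_variables(sample, threshold=0.15):
    """Stand-in for the modeling step: keep a factor if the outcome rate
    with the factor differs from the rate without it by more than
    `threshold`. A real analysis would fit a logistic model here."""
    chosen = []
    for var in ("renal", "female", "reop"):
        with_f = [r["dead"] for r in sample if r[var]]
        without = [r["dead"] for r in sample if not r[var]]
        if with_f and without:
            diff = abs(sum(with_f) / len(with_f) - sum(without) / len(without))
            if diff > threshold:
                chosen.append(var)
    return chosen

def bootstrap_bag(data, n_boot=200, seed=0):
    """Count how often each factor is selected across bootstrap resamples."""
    rng = random.Random(seed)
    counts = {"renal": 0, "female": 0, "reop": 0}
    for _ in range(n_boot):
        resample = [rng.choice(data) for _ in data]  # sample with replacement
        for var in select_variables(resample):
            counts[var] += 1
    return counts

# Synthetic cohort: renal dysfunction strongly raises mortality,
# reoperation raises it mildly, and gender has no effect.
rng = random.Random(42)
data = []
for _ in range(500):
    renal = rng.random() < 0.15
    female = rng.random() < 0.30
    reop = rng.random() < 0.10
    p_death = 0.05 + 0.30 * renal + 0.10 * reop
    data.append({"renal": renal, "female": female, "reop": reop,
                 "dead": rng.random() < p_death})

counts = bootstrap_bag(data)
# "renal" is selected in nearly every resample; "female" almost never.
```

The selection frequencies illustrate the point in the text: a genuinely influential variable survives resampling, while a noise variable appears only sporadically.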

Cardiac surgeons now treat patients who were considered inoperable as recently as a decade ago. Yet almost no one is happy with the health care system. It costs too much, excludes many, is inefficient, and is ignorant of its own effectiveness. This state of confusion has been likened to the conditions that existed in Japanese industry after World War II. Out of the confusion and crisis of the postwar period, Japan became a model of efficiency. Two major architects of this transformation were an American statistician, W. Edwards Deming, and a Romanian-American theoretician, J.M. Juran. They led the way in establishing and implementing principles of management and efficiency based on quality.

Deming's and Juran's principles126–129 have been given the acronym TQM, for "total quality management." The remarkable turnaround in Japanese industry has led many organizations to embrace the principles of TQM, including organizations involved in the delivery and assessment of health care.130 Using this approach, health care is viewed as a process requiring raw materials (e.g., sick patients), manufacturing steps (e.g., delivery of care to the sick), and finished products (e.g., outcomes of care). Managerial interventions are important at each step of the process to ensure a high-quality product. Table 6-10 outlines the key features of TQM.

TABLE 6-10 Principles of total quality management (TQM) applied to health care

An important component of the TQM process is the use and availability of statistical methods to provide the necessary information to the managers and workers who must make decisions about the health care process.131–133 Although the statistical methods of TQM have a slightly different focus than those outlined above for risk adjustment, the goal is the same, i.e., improving the quality of health care. Hence, it is reasonable to include a description of the methods of TQM, since those methods will inevitably come up in discussions about health care outcomes and assessing risks for these patient outcomes.

Table 6-11 provides an outline of the sequential steps involved in solving a problem using TQM. Risk stratification plays an important role in the TQM process, most notably in the early stages of a project, when the problems that affect quality are being defined. Usually a problem is identified from critical observations; e.g., excessive blood transfusion after operation may result in increased morbidity, including disease transmission, increased infection risk, and increased cost. Tools such as flow diagrams that document all of the steps in the process (e.g., the steps involved in the blood transfusion process after CABG) are helpful in this phase of the analysis. A logical starting point for efforts to improve the quality of the blood transfusion process would be to focus on a high-risk subset of patients who consume a disproportionate amount of resources. The Italian economist Vilfredo Pareto observed that a few factors account for the majority of the outcomes of a complex process, an observation that has been termed the "Pareto principle."

TABLE 6-11 Steps in a TQM project

The Pareto principle has proven to be a valuable tool in improving quality. A graphical method of displaying the spectrum of outcomes in a process, termed a Pareto diagram, is included in most statistics programs. Figure 6-5 is an example of a Pareto diagram for blood product transfusion. The data are arranged in histogram format, and the distribution of patients receiving transfusion is plotted simultaneously. The Pareto diagram is an example of a graphical method of risk identification. By inspecting the diagram, it is possible to identify the subset of patients who consume more than a given threshold of blood products. For example, it can be estimated that 20% of the patients consume 80% of the blood products transfused. Substantial savings in cost, and possibly reductions in morbidity, can result from decreasing the amount of blood transfused in these "high-end" users. Strategies can be devised and tested to decrease blood product consumption in the high-risk subset, and ultimately monitors must be set up to test the effectiveness of the new strategies.
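The "high-end user" estimate can be reproduced on a toy transfusion profile. The unit counts below are hypothetical, chosen only to produce the skewed distribution a Pareto diagram displays; they are not the Albany data of Figure 6-5.

```python
def pareto_share(units_per_patient, top_fraction=0.20):
    """Fraction of all blood products consumed by the top `top_fraction`
    of patients when ranked by units transfused."""
    ranked = sorted(units_per_patient, reverse=True)
    k = max(1, int(len(ranked) * top_fraction))
    total = sum(ranked)
    return sum(ranked[:k]) / total if total else 0.0

# Hypothetical skewed transfusion profile for 1000 patients: most receive
# little or no blood, while a small subset consumes heavily.
units = [0] * 600 + [1] * 300 + [2] * 50 + [10] * 30 + [25] * 20
share = pareto_share(units)  # about 0.83: 20% of patients use ~83% of units
```

With a profile this skewed, targeting the top quintile of consumers addresses most of the total transfusion burden, which is exactly the reasoning behind focusing quality efforts on the high-risk subset.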

FIGURE 6-5 Pareto diagram of blood transfusion in 1489 patients undergoing cardiac procedures at Albany Medical Center Hospital during 1994.

Application of the Pareto principle is a valuable tool in risk stratification and TQM, but some of the risk stratification analytic tools previously discussed can be equally useful in the TQM process. For example, linear regression analysis might be used to identify which of several factors are most important in predicting improved blood transfusion profiles after CABG, or Cox analysis might be used to identify risk factors for increased hospital length of stay in patients at high risk for excessive blood transfusion. Other tools of TQM such as data sampling strategies and use of control charts play an important role in the process.

Several tools that are typically used in industrial quality improvement and process control134 have been applied to medicine and, in particular, to cardiothoracic surgery.135–137 Shahian et al used control charts to evaluate special-cause and common-cause variability of outcomes in patients having average-risk CABG.135 Again, for this type of analysis, CABG is viewed as a process with raw materials (patients with coronary artery disease), manufacturing steps (CABG), and output (operative outcomes). The outcomes are tracked over time using control charts, a well-known quality improvement tool. Control charts are plots of data over time. The data points are usually plotted in conjunction with overlying lines that represent upper and lower control limits. The control limits are established from historical data (e.g., the rate of blood transfusion or the operative mortality rate). When a data point falls outside the control limits, the process is said to be "out of control." A process may be out of control because of either common (random) causes or special (nonrandom) causes. Shahian et al found that certain postoperative complications (e.g., postoperative bleeding, leg wound infections, and total major complications) were out of control in the early part of their study. After implementation of quality improvement measures, these complication rates showed progressive improvement, with a net reduction in hospital length of stay.
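A minimal control chart calculation is sketched below, assuming a p-chart with the usual 3-sigma limits around a historical complication rate; the monthly rates are invented for illustration.

```python
import math

def p_chart_limits(p_bar, n, sigmas=3.0):
    """Upper and lower control limits for a complication proportion observed
    in samples of size n, using sigmas-wide limits around the historical
    rate p_bar (binomial standard error)."""
    se = math.sqrt(p_bar * (1.0 - p_bar) / n)
    return max(0.0, p_bar - sigmas * se), min(1.0, p_bar + sigmas * se)

def out_of_control(rates, p_bar, n):
    """Indices of the periods whose rate falls outside the control limits."""
    lo, hi = p_chart_limits(p_bar, n)
    return [i for i, r in enumerate(rates) if r < lo or r > hi]

# Hypothetical monthly complication rates in cohorts of 100 patients,
# charted against a historical rate of 10%:
rates = [0.09, 0.11, 0.08, 0.24, 0.10, 0.12]
flagged = out_of_control(rates, p_bar=0.10, n=100)  # month index 3 is flagged
```

A flagged point prompts a search for a special (nonrandom) cause; points inside the limits are treated as common-cause variation.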

A variation on the control chart methodology, called the cumulative sum analysis, or CUSUM, was used by Novick et al to analyze the effect of changing from on-pump CABG to off-pump CABG as a primary means of operative coronary revascularization.136 These authors found that the CUSUM methodology was more sensitive than standard statistical techniques in detecting a cluster of surgical failures or successes.
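The CUSUM idea can be sketched in a few lines: each case adds (observed outcome minus expected risk) to a running total, so a cluster of failures produces a visible upward drift even when the raw mortality rate barely moves. The case series and the uniform 10% predicted risk below are hypothetical.

```python
def cusum(outcomes, expected_risks):
    """Running cumulative sum of (observed death - expected risk) per case.
    A sustained upward drift signals a cluster of failures sooner than a
    raw mortality rate would."""
    path, total = [], 0.0
    for dead, risk in zip(outcomes, expected_risks):
        total += dead - risk
        path.append(total)
    return path

# Ten consecutive cases, each with a predicted mortality of 10%;
# deaths occur at cases 4 and 5 (indices 3 and 4):
deaths = [0, 0, 0, 1, 1, 0, 0, 0, 0, 0]
path = cusum(deaths, [0.10] * 10)
# The path drifts slowly downward while cases succeed and jumps up with
# each failure; the peak marks the failure cluster.
```

Risk-adjusted variants replace the uniform 0.10 with each patient's predicted risk from a stratification model, so a death in a high-risk patient moves the curve less than a death in a low-risk patient.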

Another example of a TQM project that has been carried out in clinical cardiothoracic surgery is the study reported by de Leval et al.137 In this study, surgeons identified an increase in the mortality rate of infants undergoing total repair of D-transposition of the great vessels; the authors applied the principles of TQM to the process of care of these infants. They were able to identify risk factors for poor outcome and to separate the sources of variation in mortality rate into either random (common cause) variation or nonrandom (special cause) variation. By identifying and altering nonrandom causes of increased mortality, which were presumably related to the surgeon or to the process of care, they were able to make a positive impact on patient outcome. Examples like this go far beyond simple risk analysis and begin to get at the true value of these techniques, i.e., improving patient outcomes.

Another set of innovative TQM studies has been carried out by the Northern New England Cardiovascular Disease Study Group.138–141 These investigators used a risk-adjustment scheme (Tables 6-1 and 6-9) to predict mortality in patients undergoing CABG at 5 different institutions. After risk stratification, significant variability was found among the different institutions and providers. Statistical methods suggested that the variation in mortality rate was nonrandom ("special variability" in the TQM vernacular). A peer-based, confidential TQM project was initiated to address this variability and to improve outcomes in the region. In order to study this nonrandom variability, representatives from each institution visited all institutions and reviewed the processes involved in performing CABG. Surgical technique, communication among providers, leadership, decision making, training levels, and environment were assessed at each institution. Significant variation among many of the processes was observed, and attempts to correct deficiencies were undertaken at each institution. Subsequent publications from these authors suggest that this approach improves outcomes for all providers at all institutions.141

Risk Factors for Operative Mortality

By far, the bulk of available experience with risk stratification and outcome analysis in cardiothoracic surgery deals with risk factors associated with operative mortality, particularly in patients undergoing coronary revascularization. Most of the risk stratification analyses shown in Table 6-1 and Table 6-9 have been used to evaluate life or death outcomes in surgical patients with ischemic heart disease, in part because mortality is such an easy end point to measure and track. As previously mentioned, each of the risk stratification systems shown in Table 6-1 and Table 6-9, with the exception of the APACHE III system, computes a risk score based on risk factors that are dependent on patient diagnosis. The definition of operative mortality varies among the different systems (either 30-day mortality or in-hospital mortality), but the risk factors identified by each of the stratification schemes in Table 6-9 show many similarities. Some variables are risk factors in almost all stratification systems; some variables are never significant risk factors. Each of the models has been validated using separate data sets; hence, there is some justification in using any of the risk stratification methods both in preoperative assessment of patients undergoing coronary artery bypass grafting (CABG) and in making comparisons among providers (either physicians or hospitals), but certain caveats exist about the validity and reliability of these models (see below). At present it is not possible to recommend one risk stratification method over another. In general, the larger the sample size, the more risk factors can be found.

A large number of patient variables other than those shown in Table 6-9 have been proposed as risk factors for operative mortality following coronary revascularization. Such variables as serum BUN,142 cachexia,143 oxygen delivery,144 HIV,145 case volume,146–149 low hematocrit on bypass,150 use of the internal mammary artery,151 the diameter of the coronary artery,152 and resident involvement in the operation153,154 fit this description. In published reports the clinical relevance of these variables may seem undeniable, but very few of these putative risk factors have been tested with the rigor of the variables shown in Table 6-9. The regression diagnostics (e.g., ROC curves and cross-validation studies) performed on the models included in Tables 6-1 and 6-9 suggest that the models are good, but not perfect, at predicting outcomes. In statistical terms this means that not all of the variability in operative mortality is explained by the set of risk factors included in the regression models. Hence, it is possible that inclusion of new putative risk factors in the regression equations may improve the validity and precision of the models. New regression models, and new risk factors, must be scrutinized and tested using cross-validation methods and other regression diagnostics before acceptance. It is uncertain whether inclusion of many more risk factors will significantly improve the quality and predictive ability of regression models. For example, the STS risk stratification model described in Tables 6-1 and 6-9 includes many predictor variables, while the Toronto risk-adjustment scheme includes only 5 predictor variables. Yet the regression diagnostics for these two models are similar, suggesting that both models have equal precision and predictive capability. This suggests that the models are effective at predicting population behavior but not necessarily suited to predicting individual outcomes.
Further work needs to be done, both to explain the differences in risk factors seen between the various risk stratification models and to determine which models are best suited for studies of quality improvement.
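One of the regression diagnostics mentioned above, the area under the ROC curve, can be computed directly from predicted risks and observed outcomes using the rank (Mann-Whitney) formulation; the risks and outcomes below are invented for illustration.

```python
def roc_auc(risks, outcomes):
    """Area under the ROC curve via the rank (Mann-Whitney) formulation:
    the probability that a randomly chosen death was assigned a higher
    predicted risk than a randomly chosen survivor (ties count half)."""
    dead = [r for r, d in zip(risks, outcomes) if d]
    alive = [r for r, d in zip(risks, outcomes) if not d]
    wins = sum(1.0 if rd > ra else 0.5 if rd == ra else 0.0
               for rd in dead for ra in alive)
    return wins / (len(dead) * len(alive))

# Hypothetical predicted risks and observed mortality for six patients:
risks = [0.02, 0.05, 0.08, 0.15, 0.30, 0.60]
outcomes = [0, 0, 0, 0, 1, 1]
auc = roc_auc(risks, outcomes)  # 1.0: every death is ranked above every survivor
```

An AUC of 0.5 is no better than chance and 1.0 is perfect discrimination; published CABG mortality models typically fall well between those extremes, which is why similar AUCs for the STS and Toronto models support the point about population versus individual prediction.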

Many critical features of any risk-adjusted outcome program must be considered when determining the quality of a risk stratification method or when comparing one method to another (see below). Daley provides a summary of the key features that are necessary to validate any risk-adjustment model.59,155 She makes the point that no clear-cut evidence exists that differences in risk-adjusted mortality across providers reflect differences in the process and structure of care.156 This issue needs further study.

Risk Factors for Postoperative Morbidity and Resource Utilization

Patients with nonfatal outcomes following operations for ischemic heart disease make up more than 95% of the pool of patients undergoing operation. Of approximately 500,000 patients having CABG yearly, between 50% and 75% have what is characterized by both the patient and provider as an uncomplicated course following operation. The complications occurring in surviving patients range from serious organ system dysfunction to minor limitation or dissatisfaction with lifestyle, and account for a significant fraction of the cost of the procedures. We estimate that as much as 40% of the yearly hospital costs for CABG are consumed by 10% to 15% of the patients who have serious complications after operation.11 This is an example of the Pareto principle described above, and also suggests that reducing morbidity in CABG patients would have significant impact on cost reduction.

A great deal of information has been accumulated on nonfatal complications after operation for ischemic heart disease. Several large databases have been used to identify risk factors for both nonfatal morbidity and increased resource utilization. Table 6-12 is a summary of some of the risk factors that have been identified by available risk stratification models using either serious postoperative morbidity or increased resource utilization as measures of undesirable outcomes.

TABLE 6-12 Risk factors associated with either increased length of stay (L) or increased incidence of organ failure morbidity (M) or both (L/M) following coronary revascularization

Occasionally studies appear suggesting that a particular patient variable is not a risk factor for a particular patient outcome. Care must be exercised in interpreting negative results, because many putative risk factors labeled as "no different from control" in studies with inadequate samples have not received a fair test. For example, Burns et al studied preoperative template bleeding times in 43 patients undergoing elective CABG.157 They found no increased postoperative blood loss in patients with prolonged bleeding times. In this small sample there was a trend toward more units of blood transfused, but the differences between the high and low bleeding time groups were "not significant" at the α = .05 level. Using the authors' data, it is possible to compute a β (type II) error for this negative observation of nearly 0.5. This means that there is as much as a 50% chance that the negative finding is really a false-negative result. We have found an elevated bleeding time (> 10 minutes) to be a significant multivariate risk factor for excessive blood transfusion after CABG in two different studies.36,37 Although there is controversy about the value of the bleeding time as a screening test,158,159 discarding the bleeding time after an inconclusive negative trial, such as that of Burns et al, may ignore a potentially important risk factor.160
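The β-error computation for a negative trial can be sketched with a normal-approximation power calculation. The effect size, standard deviation, and group size below are illustrative assumptions, not the actual figures from the Burns study.

```python
import math

def phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def beta_error(delta, sd, n_per_group, z_alpha=1.96):
    """Approximate type II (beta) error for a two-sample comparison of means:
    the probability of missing a true difference `delta`, using the normal
    approximation and a two-sided alpha of .05 (z_alpha = 1.96)."""
    se = sd * math.sqrt(2.0 / n_per_group)
    power = 1.0 - phi(z_alpha - delta / se)
    return 1.0 - power

# Illustrative numbers only (not the Burns data): a true difference of
# 0.8 units of blood, SD 2.0, about 21 patients per group.
beta = beta_error(delta=0.8, sd=2.0, n_per_group=21)
# beta comes out near 0.75: a real difference of this size would be
# missed most of the time with a sample this small.
```

The point of the exercise is the one made in the text: with small samples, a "not significant" result carries a substantial probability of being a false negative.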

Patient Satisfaction as an Outcome

Other post-CABG outcomes, such as patient satisfaction and sense of well-being, have been less well studied. The increasing importance of patient-reported outcomes reflects the increasing prevalence of chronic disease in our aging population. The goals of therapeutic interventions are often to relieve symptoms and improve quality of life, rather than cure a disease and prolong survival. This is especially important in selecting elderly patients for operation. One report from the United Kingdom suggests that as many as one third of patients over the age of 70 did not have improvement in their disability and overall sense of well-being after cardiac operation.161 Risk stratification methodology may prove to be important in identifying elderly patients who are optimal candidates for revascularization based on quality of life considerations.

Surprisingly little published information is available regarding long-term functional status or patient satisfaction following CABG. One comparative study found no difference between patients older than 65 years and those 65 or younger with regard to quality of life outcomes (symptoms, cardiac functional class, activities of daily living, and emotional and social functioning).162 This study also found a direct relationship between clinical severity and quality of life indicators: patients with fewer comorbid conditions and better preoperative functional status had better quality of life indicators 6 months after operation. Rumsfeld et al found that improvement in self-reported quality of life (measured with the SF-36 form) was more likely in patients who had relatively poor health status before CABG than in those who had relatively good preoperative health status.163 Interestingly, these same authors found that poor self-reported quality of life, as measured by the SF-36 questionnaire, was an independent predictor of operative mortality following CABG.164 These findings suggest that the risk of patient dissatisfaction after CABG depends on preoperative comorbid factors as well as on the indications for, and technical complexities of, the operation itself. At present no risk stratification scheme has been devised to identify patients who are likely to report dissatisfaction with operative intervention following CABG.

There are several difficulties with measurement of patient-reported outcomes, and consequently cardiothoracic surgeons have not been deeply involved with systematic measurements of patient satisfaction after operation. One problem is that patient-reported outcomes may be dependent on the type of patient who is reporting them and not on the type of care received. For example, younger Caucasian patients with better education and higher income are more likely to give less favorable ratings of physician care.165 However, considerable research has been done dealing with instruments to measure patient satisfaction. At least two of these measures, the Short-Form Health Survey (SF-36)166 and the San Jose Medical Group's Patient Satisfaction Measure,167 have been used to monitor patient satisfaction over time. The current status of these and other measures of patient satisfaction does not allow comparisons among providers, because the quality of the data generated by these measures is poor. These instruments are characterized by low response rates, inadequate sampling, infrequent use, and unavailability of satisfactory benchmarks. Nonetheless, available evidence indicates that patient-reported outcomes can be measured reliably168,169 and that feedback on patient satisfaction data to physicians can significantly improve physician practices.170 It is likely that managed care organizations and hospitals will use patient-reported outcome measures to make comparisons between institutions and between individual providers. Risk-adjustment methods for patient-reported outcomes will be required to provide valid comparisons of this type.

There is little or no consensus about how to assess the validity of risk-adjustment methodology. As pointed out by Daley, the concept of validity is something that everyone understands but for which no single meaning exists.156 The concept of validity is made up of many parts. According to Daley, 5 of these parts are as follows:
  1. Face validity: Will whoever uses the risk model accept it as valid?
  2. Content validity: Does the model include risk factors that should have been included based on known risks?
  3. Construct validity: How well does the model compare to other measures of the same outcome?
  4. Predictive validity: How well does the model predict outcome in patients not used to construct the model?
  5. Attributional validity: Does the model measure the attribute of effectiveness of care, not patient variability?

Of these components, face and content validity are arguably the most important. Clinicians can readily accept the results of risk stratification efforts if the model uses variables that are familiar and includes risk factors that the clinician recognizes as important in determining outcome. All of the risk models shown in Table 6-9 satisfy some or all of these criteria of validity. There is no objective measure that defines validity, but most clinicians would agree that the risk models have relevance to clinical practice and contain many of the features that one would expect to be predictive of morbidity and mortality for CABG.

The reliability of a risk stratification model is more easily measured than its validity. Reliability of a risk-adjustment method refers to the statistical notion of "precision": the ability to repeat the observations using similar input variables and similar statistical techniques and obtain similar outcome findings. There are hundreds of sources of variability in any risk stratification model, including errors in data input, inconsistencies in coding or physician diagnosis, variations in use of therapeutic interventions, data fragility (the final model may be very dependent on a few influential outliers), and the type of rater (physician, nurse, or coding technician), to name a few.171 The most common measure of reliability is Cohen's kappa coefficient, which measures the level of agreement between two or more observations compared to the agreement expected by chance alone.172 The kappa coefficient is defined as:

kappa = (Po - Pc) / (1 - Pc)

where Pc is the fraction representing the agreement that would have occurred by chance, and Po is the observed agreement between the two observers. If two observers agree 70% of the time on an observation for which agreement by chance alone would occur 54% of the time (i.e., Po = 0.7 and Pc = 0.54), then kappa = 0.35. Landis and Koch have offered a performance interpretation of kappa173 as follows:

0 to 0.2: slight agreement
0.2 to 0.4: fair agreement
0.4 to 0.6: moderate agreement
0.6 to 0.8: substantial agreement
0.8 to 1.0: almost perfect agreement
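The kappa calculation from the worked example above (Po = 0.70, Pc = 0.54) can be computed directly; the 2 x 2 helper below shows how Po and Pc arise from an agreement table.

```python
def cohen_kappa(po, pc):
    """Cohen's kappa: agreement corrected for chance, (Po - Pc) / (1 - Pc)."""
    return (po - pc) / (1.0 - pc)

def kappa_from_table(a, b, c, d):
    """Kappa from a 2 x 2 agreement table, where a = both raters say yes,
    d = both say no, and b, c are the two kinds of disagreement."""
    n = a + b + c + d
    po = (a + d) / n                       # observed agreement
    p_yes = ((a + b) / n) * ((a + c) / n)  # chance agreement on "yes"
    p_no = ((c + d) / n) * ((b + d) / n)   # chance agreement on "no"
    return cohen_kappa(po, p_yes + p_no)

# The chapter's worked example: Po = 0.70, Pc = 0.54 gives kappa of about
# 0.35, i.e., "fair" agreement on the Landis-Koch scale.
k = cohen_kappa(0.70, 0.54)
```

In practice the chance-agreement term Pc is derived from the raters' marginal frequencies, as `kappa_from_table` does, rather than supplied directly.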

Other methods of measuring the agreement between two models include the weighted kappa, the intraclass correlation coefficient, the tau statistic, and the gamma statistic. These methods are discussed in the work of Hughes and Ash,171 and any of them offers an objective means of assessing the reliability of a risk-adjustment model.

Surprisingly little work has been done in assessing the reliability and validity of the risk-adjustment methods used with large cardiac surgical databases. It is absolutely essential that validity and reliability be tested in these models, both in order for clinicians to feel comfortable with the comparisons generated by the risk stratification models, and for policy makers (either government or managed care organizations) to feel confident in making decisions based on risk-adjusted outcomes.

Risk Stratification to Measure the Effectiveness of Care

The biggest single shortcoming of risk-adjustment methodology is its lack of proven effectiveness in delineating quality of care. Even though it may seem obvious that differences in risk-adjusted outcomes reflect differences in quality of care, this is far from proven, and the little information available on the subject is inconsistent. Hartz et al compared hospital mortality rates for patients undergoing CABG.174 They found that differences in hospital mortality rates were correlated with differences in quality of care between hospitals. Hannan et al attempted to evaluate the quality of care in outlier hospitals in the New York State risk-adjusted mortality cohort.175 They concluded, as did Hartz et al, that risk-adjusted mortality rates for CABG were a reflection of quality of care. The measures of quality used in these studies were somewhat arbitrary, however, and did not reflect the complete array of factors that might be expected to influence outcomes after CABG; in TQM jargon, the entire clinical process of surgical intervention for coronary revascularization was not assessed. Other studies have not found a correlation between global hospital mortality rates and quality of care indicators.83,176–178 Indeed, one study suggested that nearly all of the variation in mortality among hospitals reflects variation in patient characteristics rather than in hospital characteristics,178 while another found that identifying poor-quality hospitals on the basis of mortality rate performance, even with perfect risk adjustment, resulted in less than 20% sensitivity and greater than 50% predictive error.83 These studies suggest that reports measuring quality by risk-adjusted mortality rates misinform the public about hospital (or physician) performance.

Two ongoing quality improvement studies are using risk-adjusted outcome measurements to assess and influence the clinical process of coronary artery bypass grafting: the Northern New England Cardiovascular Disease Study138–141 and the Veterans Administration Cardiac Surgery Risk Assessment Program.179–181 Results from these studies suggest that using risk-adjusted outcomes (e.g., mortality and cost) as internal reference levels tracked over time (similar to the TQM control charts described above) can produce meaningful improvements in outcomes. Whether these risk-adjusted outcomes can be used to indicate the quality of care or the cost-effectiveness of providers across all institutions remains unanswered. At present, equating risk-adjusted outcome measurements with effective care is not justified.176–178,182–188

Controversy exists as to whether changes in a physician's report card over time reflect changes in care or are due to other factors unrelated to the care delivered by individual providers. Hannan et al suggested that the public release of surgeon-specific, risk-adjusted mortality rates led to a decline in overall CABG mortality in the state of New York from 4.17% in 1989 to 2.45% in 1992, and hence to an improvement in the quality of care.189 The cause of this decline in operative mortality is uncertain, but it probably represents a combination of improvement in the process of care (especially in the outlier hospitals), the retirement of low-volume surgeons, and an overall national trend toward decreased CABG mortality. Ghali et al found that states adjacent to New York that did not have report cards had comparable decreases in CABG operative mortality during the same time period.190 Without any formal quality improvement initiative or report card, the operative mortality rate in Massachusetts decreased from 4.7% in 1990 to 3.3% in 1994. This occurred while the expected operative mortality of patients increased from 4.7% to 5.7%.

The decline in New York State operative mortality over time associated with the publication of surgeon-specific mortality rates was greater than the overall national decrease in CABG death rates. Peterson et al found that the reduction in observed CABG mortality was 22% in New York versus 9% in the rest of the nation, a highly significant difference.191 An interesting finding in this study is that the only other area in the United States with a decline in CABG mortality comparable to that of New York was northern New England. The Northern New England Cooperative Group established a confidential TQM approach to improve CABG outcomes139,140,192–194 at about the same time that the New York State report cards were published in the lay press.

Ethical Implications of Risk Stratification: The Dilemma of Managed Care

An important component of new proposals for health care reform is mandated reporting on quality of care.195 While reporting on quality of care sounds appealing, the effort poses many problems, some of which present the clinician with an ethical dilemma.187,196–198 There is general agreement that quality indicators should be risk-adjusted to allow fair comparisons among providers. Risk adjustment in this setting is extremely difficult, may be misleading, and, worse, may not reflect quality of care at all. The release of risk-adjusted data may alienate providers and leave the sickest patients with less access to care. This may have already happened in New York State87,188,199 and in other regions where risk-adjusted mortality and cost data have been released to the public. Of even more concern is the selection bias that seems to exist in managed care HMO enrollment. Morgan et al suggest that Medicare HMOs benefit from the selective enrollment of healthier Medicare recipients and the disenrollment or outright rejection of sicker beneficiaries.200 This separation of patients into favorable and unfavorable risk categories undermines the effectiveness of the Medicare managed care system and highlights the subtle selection bias that can result when financial incentives overcome medical standards. Careful population-based studies that employ risk adjustment are needed to study this phenomenon.

A major concern about the current move toward market-oriented health care delivery is that health plans will select only the best health risk participants, a practice termed "cream skimming" by van de Ven et al.201 The result of cream skimming may be to widen the gap between impoverished, underserved patients and affluent patients. In an effort to address these concerns, plans have been proposed that would reward health plans for serving people with disabilities and residents of low-income areas.202–205 At the heart of these plans is some form of risk adjustment to allocate payments to health care organizations based on overall health risk and expected need for health care expenditures. The use of risk stratification in this setting is new and unproved but offers great promise.

A related problem is that physicians are being rewarded by hospitals and managed care organizations for limiting costs. Incentives are evolving that threaten our professionalism.197 On the surface this may seem a strong statement, but one has only to read some of the "compromise" positions advocated for dealing with the changing health care climate. "Advice" offered to physicians has included hiring lawyers to optimize managed or capitated contracts, recruiting younger patients into and steering Medicare-age patients away from a managed care practice, forbidding physicians from disclosing the existence of more costly services not covered by their managed care plan, and using accounting services to track and limit the frequency of office visits.197,206–208 A particularly telling indictment is the finding by Himmelstein et al that investor-owned HMOs deliver lower quality of care than not-for-profit plans.209 Physicians who own managed care organizations live by two standards: the professional standard of providing high-quality patient care and the financial standard of making a profit from this care delivery. Strategies must be devised that allow physicians both to maintain a professional approach to patients and to participate in the marketplace without compromising patient care.

The Costs of Gathering Data

Collecting risk-adjusted data adds to the administrative costs of the health care system. An estimated 20% of health care costs ($150–$180 billion per year) are spent on the administration of health care.210 The logistical costs of implementing a risk-adjustment system are substantial, and additional costs are incurred in implementing the quality measures that risk stratification methodology suggests. A disturbing notion is that the costs of quality care may outweigh the payers' willingness to pay for these benefits. For example, Iowa hospitals estimated that they spent $2.5 million annually to gather the MedisGroups severity data mandated by the state. Because of the cost, the state abandoned this mandate, concluding that neither consumers nor purchasers used the data anyway.211

It is possible that quality improvement may cost rather than save money, although one of the principles of TQM (often quoted by Deming) is that the least expensive means of accomplishing a task (e.g., delivering health care) is the means that employs the highest quality in the process. Ultimately, improved quality should prove cost-efficient, but start-up costs may be daunting, and several organizations have already expressed concerns about the logistical costs of data gathering and risk adjustment.211,212 It is imperative that any cost savings realized through improved quality be factored into the total costs of gathering risk-adjusted data.

Decision Analysis: The Infancy of Risk Stratification and Outcomes Assessment

A great deal of effort has gone into the development of risk models to predict outcomes from cardiac surgical interventions (e.g., Tables 6-1 and 6-9). For populations of patients undergoing operations, the models are fairly effective at predicting outcomes (with certain caveats as mentioned above). The biggest drawback of these risk models is that they exhibit dismal performance at predicting outcomes for an individual patient. Consequently, risk adjustment to predict individual outcomes is extremely difficult to apply at the bedside. For patient-specific needs, risk stratification and outcomes assessment are in their infancy.

What are needed are patient-specific predictors for clinical decision making. On the surface, the decision whether to operate on a patient with coronary artery disease would seem straightforward. In fact, a decision of this sort is an extremely complex synthesis of diverse pieces of evidence, similar to the decisions airline pilots make about complicated flight problems. There are enormous variations in the way surgeons practice, and these variations can increase cost, cause harm, and confuse patients. The tools of decision analysis have been applied to physician decision making in an effort to eliminate these variations and to provide accurate and effective decisions at the patient's bedside.213,214

In the jargon of decision analysis, the decision to operate on a given patient for the treatment of coronary artery disease is extremely complex because there are more than two alternative treatments, more than two outcomes, and many intervening events that may alter outcomes. Decision models generated to address surgical outcomes typically employ the familiar decision tree. Those unfamiliar with the methods commonly perceive them as complex and difficult to understand, especially in the context of clinical decision making, and attempts have been made to simplify the methods and apply them to the medical decision-making process.214 An important part of creating the decision tree is estimating the probabilities of each of the various outcomes for a given set of interventions. This part of the decision tree relies heavily on the results of risk stratification and regression modeling, especially computer-intensive methods such as Bayesian models,215,216 to arrive at a probability of risk for a particular outcome. All of the diverse pieces of evidence used to form consensus guidelines, including meta-analyses, expert opinions, unpublished sources, randomized trials, and observational studies, are employed to arrive at probabilities of outcomes and intervening events. For example, Gage et al performed a cost-effectiveness analysis of aspirin and warfarin for stroke prophylaxis in patients with nonvalvular atrial fibrillation.217 They used data from published meta-analyses of individual-level data from five randomized trials of antithrombotic therapy in atrial fibrillation to estimate the rate of stroke without therapy, the percentage reduction in the risk of stroke in users of warfarin, and the percentage of intracranial hemorrhages following anticoagulation that were mild, moderate, or severe.
Their decision tree suggests that in 65-year-old patients with nonvalvular atrial fibrillation (NVAF) but no other risk factors for stroke, prescribing warfarin instead of aspirin would affect quality-adjusted survival minimally but increase costs significantly. Application of decision analysis methods to clinical decision making can standardize care and decrease the risks of therapy, but these methods are in an early developmental phase, and much more work is needed before surgeons will readily accept them.
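The core computation behind such a decision tree is "folding back" probability-weighted outcomes: each strategy's expected value (e.g., quality-adjusted life years) is the sum of outcome payoffs weighted by their probabilities, and the strategy with the highest expected value is preferred. A minimal sketch with invented illustrative probabilities and payoffs (these are not the values from Gage et al):

```python
def expected_value(branches):
    """Fold back a one-level decision branch: sum of p * payoff
    over mutually exclusive outcomes whose probabilities sum to 1."""
    total_p = sum(p for p, _ in branches)
    assert abs(total_p - 1.0) < 1e-9, "outcome probabilities must sum to 1"
    return sum(p * payoff for p, payoff in branches)

# Hypothetical strategies: (probability, quality-adjusted life years)
# for three outcomes: major bleed, stroke, event-free survival
warfarin = [(0.02, 2.0), (0.05, 6.0), (0.93, 10.0)]
aspirin  = [(0.01, 2.0), (0.08, 6.0), (0.91, 10.0)]

strategies = {"warfarin": expected_value(warfarin),
              "aspirin": expected_value(aspirin)}
best = max(strategies, key=strategies.get)
```

With these invented numbers the fold-back slightly favors warfarin; a real analysis would attach costs to each leaf as well and report incremental cost-effectiveness rather than a single winner.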

Volume/Outcome Relationship and Targeted Regionalization

At least 10 large studies have addressed the notion that hospitals performing small numbers of CABG operations have higher operative mortality. Seven of these 10 studies found increased operative mortality in low-volume providers.148,218–223 Three other large studies found no such association.147,224,225 Interestingly, in the three studies done more recently (since 1996) there was no clear relationship between outcome and volume. Two separate studies of some of the same patients in the New York State cardiac surgery database reached completely opposite conclusions.223,225 The Institute of Medicine reviewed the relationship between higher volume and better outcome and concluded that procedure or patient volume is an imprecise indicator of quality, even though a majority of the studies reviewed showed some association between higher volumes and better outcomes.226

The dilemma is that some low-volume providers have excellent outcomes while some high-volume providers have poor outcomes. These observations on operator volume and outcome prompted some authorities to suggest "regionalization," that is, referring nonemergent CABG patients to large-volume centers.222,227,228 Nallamothu et al advocated a role for "selective regionalization," since they found that low-risk patients did equally well in high-volume and low-volume hospitals; they suggest regional referral to high-volume institutions only for elective high-risk patients.147 Crawford et al pointed out that a policy of regionalized referrals for CABG might have several adverse effects on health care, including increased cost, decreased patient satisfaction, and reduced availability of surgical services in remote or rural locations.229 It is simplistic to treat hospital volume as a principal surrogate for outcome, and much more sophistication is required to sort out this relationship. Nonetheless, decisions about the utilization of health care resources will undoubtedly be made on the basis of the presumed association between high volume and good outcome.

Medical Errors and Outcomes

The Institute of Medicine (IOM) released a startling report on medical errors in the U.S. health system. Based mainly on two large studies, one using 1984 data from New York State and the other using 1992 data from Colorado and Utah,230,231 the report suggested that at least 44,000 Americans die each year from preventable hospital errors.232 This estimate is not much different from those obtained in similar analyses of patients in Australia,233 England,234 and Israel.235 The IOM report caused a storm of controversy, with many experts fearing that it could harm quality improvement initiatives.236,237 Some doubted the methodology used to derive the estimates of medical errors,238,239 while others made an emotional plea for drastic measures to reduce errors.240

The airline industry has had dramatic success in limiting errors, and its experience serves as a model of successful implementation of error-avoidance behavior and process improvement.237 These principles have been applied to pediatric cardiac surgery by de Leval et al with some success.241 These workers evaluated patient and procedural variables that resulted in adverse outcomes. In addition, they employed self-assessment questionnaires and human factors researchers who observed behavior in the operating room, an approach similar to the quality improvement steps used in the airline industry. Their study highlighted the important role of human factors in adverse surgical outcomes; more importantly, they found that appropriate behavioral responses in the operating room can overcome potentially harmful events. Studies of this sort, which emphasize behavior modification and process improvement, hold great promise for future error reduction in cardiac surgery.

Computer applications have been applied to the electronic medical record in hopes of minimizing physician ordering errors. Computerized physician order entry (CPOE) is one such application; it monitors orders and offers suggestions when a physician's orders do not meet a predesigned computer algorithm. CPOE is viewed as a quality indicator, and private employer-based organizations have used the presence of CPOE to judge whether hospitals should be part of their preferred network. One of these private groups, the Leapfrog Group, found in its initial survey in 2001 that only 3.3% of responding hospitals had CPOE systems in place. In New York State, several large corporations and health care insurers have agreed to pay hospitals that meet the CPOE standards a discount bonus on all health care billings submitted. Other computer-based safety initiatives involving the electronic medical record are likely to surface in the future. The impact of these innovations on the quality of health care is untested, and any benefit remains to be proven.

Information technology used to reduce medical errors has met with mixed success. Innovations that monitor electronic medical records may reduce errors.242,243 However, one study that tried to program guidelines for treating congestive heart failure into a network of physicians' interactive microcomputer workstations found the task difficult, because the guidelines often lack the explicit definitions (e.g., for symptom severity and adverse events) needed to navigate a computer algorithm.244 Another study attempted to implement prophylactic care measures (e.g., updating tetanus immunization in trauma patients) using reminders in the electronic inpatient medical record.245,246 These investigators were unable to increase the use of prophylactic measures in hospitalized patients with this computer-based approach. Much work remains before computer-aided methods lead to medical error reduction, but more efforts of this type will surely be made.

Public Access and Provider Accountability

Multiple factors bring surgical results to public attention. Publication of individual surgeons' risk-adjusted mortality rates, limitation of referrals to high-mortality surgeons by insurers, legislative initiatives to reduce medical errors, and the proliferation of the Internet as an information resource all increase public awareness of surgical outcomes. Like it or not, thoracic surgeons must be prepared to accept this scrutiny, and perhaps even to benefit from it, since public scrutiny of surgical results is only increasing.

The World Wide Web provides ready access to medical facts of all sorts, including information about thoracic surgery. Everything from the thoracic literature, to outcomes of randomized trials, to surgeon-specific risk-adjusted mortality rates, to comparisons of hospital outcomes can be obtained by the lay public with rather simple searches on the Internet. This ready public access will undoubtedly increase. Examples of information sources available to the public are listed in Table 6-13. Very little external scrutiny attaches to most of these sources: the public accepts almost all information on these sites at face value, and quality control is limited to self-imposed efforts by the authors of the various sources. The Agency for Healthcare Research and Quality (AHRQ) has attempted to empower the public to critically evaluate Web-based sources of health care information in order to limit the spread of misinformation that may creep into various Web sites. The success of these efforts is uncertain but becomes ever more critical as the amount of health care information available on the Web skyrockets.

TABLE 6-13 Partial listing of publicly available information sources related to thoracic surgery

New Risk Stratification Methods: Neural Networks and Other Computer-Intensive Methods

The goal of risk adjustment is to account for the contribution of patient-related risk factors to the outcome of interest. This allows patient outcomes to be used as an indicator of the care rendered by physicians or administered by hospitals. This chapter has outlined some of the risk-adjustment methods commonly used for this purpose, including multivariate analyses that predict patient outcomes from patient risk factors. Inevitably, refinements and newer techniques will be brought to bear on the problem of risk adjustment. One of the most promising newer risk-adjustment methods is the use of neural networks (also termed artificial intelligence) to develop prediction models based on patient risk factors.247–250 A weakness of multivariate regression techniques is that some variables occur too infrequently to be used in multivariate regression models yet still contribute significantly to outcome. This weakness is overcome by neural networks and cluster analysis. Both techniques use computer iteration to look for patterns of variables associated with outcome and are far less affected by the low frequency of a particular variable. These methods may find more than one solution for the best prediction of outcome and may produce a combination of variables that reflects a unique clinical situation.
Neural network modeling has been used to predict length of stay in the ICU after cardiac surgery249 and to predict valve-related complications following valve implantation.248 Preliminary evidence suggests that the performance of neural network models may be superior to that of multivariate regression models, with the c-statistic for neural network models approaching 95% in ideal situations,250 although one study found no added benefit from neural networks compared to logistic regression in modeling in-hospital death following PTCA.251 Computer-intensive methods, including neural networks, bootstrapping, and balancing scores (see earlier discussion), will become more important as refinements in risk-adjustment methodology progress.
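The c-statistic used to compare these models has a simple interpretation: it is the probability that the model assigns a higher predicted risk to a patient who experienced the outcome than to one who did not, and it applies identically to logistic regression and neural network predictions. A minimal sketch of the calculation, with invented predicted risks and outcomes purely for illustration:

```python
def c_statistic(predictions, outcomes):
    """Concordance (c) statistic: fraction of (event, non-event) patient
    pairs in which the event patient received the higher predicted risk;
    tied predictions count as half-concordant."""
    events = [p for p, y in zip(predictions, outcomes) if y == 1]
    nonevents = [p for p, y in zip(predictions, outcomes) if y == 0]
    concordant = 0.0
    for e in events:
        for n in nonevents:
            if e > n:
                concordant += 1.0
            elif e == n:
                concordant += 0.5
    return concordant / (len(events) * len(nonevents))

# Invented predicted mortality risks and observed outcomes (1 = died)
preds = [0.9, 0.8, 0.3, 0.2, 0.1]
obs   = [1,   0,   1,   0,   0]
# Of the 2 x 3 = 6 event/non-event pairs, 5 are concordant: c = 5/6
```

A c-statistic of 0.5 means the model discriminates no better than chance, and 1.0 means perfect discrimination; the "approaching 95%" figure cited above corresponds to c near 0.95.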

Information Management: Electronic Medical Records

As already mentioned, accurate patient data are essential in order to apply the principles of risk stratification and quality improvement outlined in this chapter. The quality and accuracy of administrative (or claims) databases have been questioned.24,61,63–66 Therefore, risk-adjustment methodology has placed greater reliance on data extracted from the medical record. The American College of Surgeons was among the earliest advocates of the utility of medical records for quality review.252 In the 1960s, Weed advocated standardization and computerization of medical records.253–255 Little substantive progress was made toward computerized medical records until the need arose to manage the large amounts of data required for risk adjustment and outcomes assessment. Medical records are an invaluable source of information about patient risk factors and outcomes. With these facts in mind, more and more pilot studies are being undertaken to computerize and standardize the medical record in a variety of clinical situations.245,256–264 Iezzoni has pointed out the difficulties with computerized medical records, suggesting that they may not adequately reflect the importance of chronic disability and decreased functional status.265 Nevertheless, the need for data about large groups of patients clearly exists, especially for managed care and capitation initiatives, and efforts to computerize medical records can reasonably be expected to expand. Future applications of electronic medical records for cardiothoracic surgeons may include monitoring of patient outcomes,256 supporting clinical decision making with real-time analysis of the electronic medical record,257,259,262 and real-time tracking of resource utilization using computerized hospital records.260

Risk stratification and outcomes analysis will have an increasing role in cardiac surgery. The methods of risk analysis are straightforward but still in their early stages. The goal of risk adjustment in the analysis of outcomes is to account for the contribution of patient-related risk factors, so that patient outcomes can be used as an indicator of the quality of care rendered by physicians and hospitals. The future will undoubtedly see refinements in risk-adjustment methods and increasing use of these techniques at all levels of health care delivery, including the distribution of health care dollars. Thoracic surgeons have been at the forefront of these methodologies (sometimes unwillingly), but much work remains in educating surgeons about risk stratification and in applying risk-adjusted outcome analysis to improving quality of care. We are obliged to understand these techniques, with the ultimate goal of improving patient outcomes and maintaining high professional quality.

  1. Nightingale F: Notes on Hospitals. London, Longman, Green, Longman, Roberts, and Green, 1863.
  2. Cohen IB: Florence Nightingale. Sci Am 1984; 250:128.[Medline]
  3. Codman E: A Study in Hospital Efficiency as Demonstrated by the Case Report of the First Five Years of a Private Hospital. Boston, Thomas Todd Company, 1917.
  4. Daniels M, Hill A: Chemotherapy of pulmonary tuberculosis in young adults: an analysis of the combined results of three Medical Research Council trials. BMJ 1952; 1:1162.
  5. Cochrane A: Effectiveness & Efficiency: Random Reflections on Health Services. London, Royal Society of Medicine Press Limited, 1971; p. 1.
  6. Steen PM, Brewster AC, Bradbury RC, et al: Predicted probabilities of hospital death as a measure of admission severity of illness. Inquiry 1993; 30:128.[Medline]
  7. Tu JV, Jaglal SB, Naylor CD: Multicenter validation of a risk index for mortality, intensive care unit stay, and overall hospital length of stay after cardiac surgery. Steering Committee of the Provincial Adult Cardiac Care Network of Ontario. Circulation 1995; 91:677.[Medline]
  8. Knaus WA, Wagner DP, Zimmerman JE, Draper EA: Variations in mortality and length of stay in intensive care units. Ann Intern Med 1993; 118:753.[Abstract/Free Full Text]
  9. Iezzoni LI: Risk Adjustment for Measuring Healthcare Outcomes. Chicago, Health Administration Press, 1997.
  10. Ferraris VA, Ferraris SP, Singh A: Operative outcome and hospital cost. J Thorac Cardiovasc Surg 1998; 115:593.[Abstract/Free Full Text]
  11. Ferraris VA, Ferraris SP: Risk factors for postoperative morbidity. J Thorac Cardiovasc Surg 1996; 111:731.[Abstract/Free Full Text]
  12. Iezzoni LI: Risk adjustment for medical effectiveness research: an overview of conceptual and methodological considerations. J Invest Med 1995; 43:136.[Medline]
  13. Riordan CJ, Engoren M, Zacharias A, et al: Resource utilization in coronary artery bypass operation: does surgical risk predict cost? Ann Thorac Surg 2000; 69:1092.[Abstract/Free Full Text]
  14. Gold MR, Siegel JE, Russell LB, Weinstein MC: Cost-effectiveness in Health and Medicine. Oxford, Oxford University Press, 1996.
  15. Petitti D: Meta-analysis, decision analysis, and cost-effectiveness analysis, in Kelsey JL, Marmot MG, Stolley PD, Vessey MP (eds): Monographs in Epidemiology and Biostatistics, Vol. 31. New York, Oxford University Press, 2000.
  16. Stewart AL, Greenfield S, Hays RD, et al: Functional status and well-being of patients with chronic conditions: results from the Medical Outcomes Study. JAMA 1989; 262:907.[Abstract]
  17. Charlson ME, Pompei P, Ales KL, MacKenzie CR: A new method of classifying prognostic comorbidity in longitudinal studies: development and validation. J Chron Dis 1987; 40:373.[Medline]
  18. Charlson ME, Sax FL, MacKenzie CR, et al: Morbidity during hospitalization: can we predict it? J Chron Dis 1987; 40:705.[Medline]
  19. Knaus WA, Wagner DP, Draper EA, et al: The APACHE III prognostic system: risk prediction of hospital mortality for critically ill hospitalized adults. Chest 1991; 100:1619.[Abstract/Free Full Text]
  20. Keeler EB, Kahn KL, Draper D, et al: Changes in sickness at admission following the introduction of the prospective payment system. JAMA 1990; 264:1962.[Abstract]
  21. Greenfield S, Apolone G, McNeil BJ, Cleary PD: The importance of co-existent disease in the occurrence of postoperative complications and one-year recovery in patients undergoing total hip replacement: comorbidity and outcomes after hip replacement. Med Care 1993; 31:141.[Medline]
  22. Greenfield S, Aronow HU, Elashoff RM, Watanabe D: Flaws in mortality data: the hazards of ignoring comorbid disease. JAMA 1988; 260:2253.[Abstract]
  23. Goldman L, Caldera DL, Nussbaum SR, et al: Multifactorial index of cardiac risk in noncardiac surgical procedures. N Engl J Med 1977; 297:845.[Abstract]
  24. Iezzoni LI, Foley SM, Daley J, et al: Comorbidities, complications, and coding bias: does the number of diagnosis codes matter in predicting in-hospital mortality? JAMA 1992; 267:2197.[Abstract]
  25. Iezzoni LI, Daley J, Heeren T, et al: Using administrative data to screen hospitals for high complication rates. Inquiry 1994; 31:40.
  26. McCarthy EP, Iezzoni LI, Davis RB, et al: Does clinical evidence support ICD-9-CM diagnosis coding of complications? Med Care 2000; 38:868.[Medline]
  27. Brinkley J: U.S. releasing lists of hospitals with abnormal mortality rates. New York Times, March 12, 1986; p 1.
  28. Gatsonis CA, Epstein AM, Newhouse JP, et al: Variations in the utilization of coronary angiography for elderly patients with an acute myocardial infarction: an analysis using hierarchical logistic regression. Med Care 1995; 33:625.[Medline]
  29. Guadagnoli E, Hauptman PJ, Ayanian JZ, et al: Variation in the use of cardiac procedures after acute myocardial infarction. N Engl J Med 1995; 333:573.[Abstract/Free Full Text]
  30. Mark DB, Naylor CD, Hlatky MA, et al: Use of medical resources and quality of life after acute myocardial infarction in Canada and the United States. N Engl J Med 1994; 331:1130.[Abstract/Free Full Text]
  31. Anderson GM, Grumbach K, Luft HS, et al: Use of coronary artery bypass surgery in the United States and Canada: influence of age and income. JAMA 1993; 269:1661.[Abstract]
  32. Chassin MR: Improving quality of care with practice guidelines. Front Health Serv Manage 1993; 10:40.[Medline]
  33. Eagle KA, Guyton RA, Davidoff R, et al: ACC/AHA Guidelines for Coronary Artery Bypass Graft Surgery: a report of the American College of Cardiology/American Heart Association Task Force on Practice Guidelines (Committee to Revise the 1991 Guidelines for Coronary Artery Bypass Graft Surgery). J Am Coll Cardiol 1999; 34:1262.[Free Full Text]
  34. Shaneyfelt TM, Mayo-Smith MF, Rothwangl J: Are guidelines following guidelines? The methodological quality of clinical practice guidelines in the peer-reviewed medical literature. JAMA 1999; 281:1900.[Abstract/Free Full Text]
  35. Blackstone EH: Breaking down barriers: helpful breakthrough statistical methods you need to understand better. J Thorac Cardiovasc Surg 2001; 122:430.[Free Full Text]
  36. Ferraris VA, Gildengorin V: Predictors of excessive blood use after coronary artery bypass grafting: a multivariate analysis. J Thorac Cardiovasc Surg 1989; 98:492.[Abstract]
  37. Ferraris VA, Berry WR, Klingman RR: Comparison of blood reinfusion techniques used during coronary artery bypass grafting. Ann Thorac Surg 1993; 56:433.[Abstract]
  38. Barbour G: The role of outcomes data in health care reform. Ann Thorac Surg 1994; 58:1881.[Abstract]
  39. Chassin MR: Explaining geographic variations: the enthusiasm hypothesis. Med Care 1993; 31:YS37.[Medline]
  40. Wennberg JE, Freeman JL, Shelton RM, Bubolz TA: Hospital use and mortality among Medicare beneficiaries in Boston and New Haven. N Engl J Med 1989; 321:1168.[Abstract]
  41. Relman AS: Assessment and accountability: the third revolution in medical care. N Engl J Med 1988; 319:1220.[Medline]
  42. Schneider EC, Leape LL, Weissman JS, et al: Racial differences in cardiac revascularization rates: does "overuse" explain higher rates among white patients? Ann Int Med 2001; 135:328.[Abstract/Free Full Text]
  43. Philbin EF, McCullough PA, DiSalvo TG, et al: Underuse of invasive procedures among Medicaid patients with acute myocardial infarction. Am J Public Health 2001; 91:1082.[Abstract]
  44. Filardo G, Maggioni AP, Mura G, et al: The consequences of under-use of coronary revascularization; results of a cohort study in Northern Italy. Eur Heart J 2001; 22:654.[Abstract/Free Full Text]
  45. Kravitz RL, Laouri M: Measuring and averting underuse of necessary cardiac procedures: a summary of results and future directions. Jt Comm J Qual Improv 1997; 23:268.[Medline]
  46. Peterson ED, Shaw LK, DeLong ER, et al: Racial variation in the use of coronary-revascularization procedures: are the differences real? Do they matter? N Engl J Med 1997; 336:480.[Abstract/Free Full Text]
  47. Asch SM, Sloss EM, Hogan C, et al: Measuring underuse of necessary care among elderly Medicare beneficiaries using inpatient and outpatient claims. JAMA 2000; 284:2325.[Abstract/Free Full Text]
  48. Tobler HG, Sethi GK, Grover FL, et al: Variations in processes and structures of cardiac surgery practice. Med Care 1995; 33:OS43.[Medline]
  49. Stover EP, Siegel LC, Parks R, et al: Variability in transfusion practice for coronary artery bypass surgery persists despite national consensus guidelines: a 24-institution study. Institutions of the Multicenter Study of Perioperative Ischemia Research Group. Anesthesiology 1998; 88:327.[Medline]
  50. Lyon AW, Greenway DC, Hindmarsh JT: A strategy to promote rational clinical chemistry test utilization. Am J Clin Pathol 1995; 103:718.[Medline]
  51. Macario A, Chung A, Weinger MB: Variation in practice patterns of anesthesiologists in California for prophylaxis of postoperative nausea and vomiting. J Clin Anesth 2001; 13:353.[Medline]
  52. Nissenson AR, Collins AJ, Hurley J, et al: Opportunities for improving the care of patients with chronic renal insufficiency: current practice patterns. J Am Soc Nephrol 2001; 12:1713.[Abstract/Free Full Text]
  53. Nourse C, Byrne C, Leonard L, Butler K: Glycopeptide prescribing in a tertiary referral paediatric hospital and applicability of hospital infection control practices advisory committee (HICPAC) guidelines to children. Eur J Pediatr 2000; 159:193.[Medline]
  54. Gonzales R, Malone DC, Maselli JH, Sande MA: Excessive antibiotic use for acute respiratory infections in the United States. Clin Infect Dis 2001; 33:757.[Medline]
  55. Jones MI, Greenfield SM, Bradley CP: Prescribing new drugs: qualitative study of influences on consultants and general practitioners. BMJ 2001; 323:378.[Abstract/Free Full Text]
  56. Schneider EC, Eisenberg JM: Strategies and methods for aligning current and best medical practices: the role of information technologies. West J Med 1998; 168:311.[Medline]
  57. American College of Physicians. The oversight of medical care: a proposal for reform. Ann Intern Med 1994; 120:423.[Abstract/Free?Full?Text]
  58. Ebert PA: The importance of data in improving practice: effective clinical use of outcomes data. Ann Thorac Surg 1994; 58:1812.[Abstract]
  59. Daley J: Criteria by which to evaluate risk-adjusted outcomes programs in cardiac surgery. Ann Thorac Surg 1994; 58:1827.[Abstract]
  60. Edwards FH, Clark RE, Schwartz M: Practical considerations in the management of large multiinstitutional databases. Ann Thorac Surg 1994; 58:1841.[Abstract]
  61. Jollis JG, Ancukiewicz M, DeLong ER, et al: Discordance of databases designed for claims payment versus clinical information systems: implications for outcomes research. Ann Intern Med 1993; 119:844.[Abstract/Free?Full?Text]
  62. Hannan EL, Racz MJ, Jollis JG, Peterson ED: Using Medicare claims data to assess provider quality for CABG surgery: does it work well enough? Health Serv Res 1997; 31:659.[Medline]
  63. Green J, Wintfeld N: How accurate are hospital discharge data for evaluating effectiveness of care? Med Care 1993; 31:719.[Medline]
  64. Lee TH: Evaluating the Quality of Cardiovascular Care: A Primer. Bethesda, MD, American College of Cardiology Press, 1995; p 35.
  65. Blumberg MS: Comments on HCFA hospital death rate statistical outliers. Health Serv Res 1987; 21:715.
  66. Dubois RW: Hospital mortality as an indicator of quality, in Goldfield N, Nash DB (eds): Providing Quality Care. Philadelphia, American College of Physicians, 1989; p 107.
  67. Podolsky D, Beddingfield KT: America's best hospitals. U.S. News and World Report 1993; 115:66.
  68. Glantz SA: Primer of Biostatistics. New York, McGraw-Hill, 2002.
  69. Motulsky H: Intuitive Biostatistics. New York, Oxford University Press, 1995.
  70. Harrell FE: Regression Modeling Strategies: With Applications to Linear Models, Logistic Regression, and Survival Analysis. New York, Springer, 2001.
  71. Glantz SA, Slinker BK: Primer of Applied Regression and Analysis of Variance. New York, McGraw-Hill, 2001.
  72. Thomas JW, Ashcraft ML: Measuring severity of illness: six severity systems and their ability to explain cost variations. Inquiry 1991; 28:39.
  73. Ferraris VA, Propp ME: Outcome in critical care patients: a multivariate study. Crit Care Med 1992; 20:967.
  74. Shwartz M, Ash AS: Evaluating the performance of risk-adjustment methods: continuous measures, in Iezzoni LI (ed): Risk Adjustment for Measuring Health Care Outcomes. Ann Arbor, MI, Health Administration Press, 1994; p 287.
  75. Hartz AJ, Kuhn EM, Kayser KL: Report cards on cardiac surgeons. N Engl J Med 1995; 333:939.
  76. Harrell FE, Lee KL, Califf RM, et al: Regression modelling strategies for improved prognostic prediction. Stat Med 1984; 3:143.
  77. Ferraris VA, Ferraris SP, Joseph O, et al: Aspirin and postoperative bleeding after coronary artery bypass grafting. Ann Surg 2002; 235:820.
  78. Ash AS, Shwartz M: Evaluating the performance of risk-adjustment methods: dichotomous measures, in Iezzoni L (ed): Risk Adjustment for Measuring Health Care Outcomes. Ann Arbor, MI, Health Administration Press, 1997; p 427.
  79. Shahian DM, Normand SL, Torchiana DF, et al: Cardiac surgery report cards: comprehensive review and statistical critique. Ann Thorac Surg 2001; 72:2155.
  80. Grunkemeier GL, Zerr KJ, Jin R: Cardiac surgery report cards: making the grade. Ann Thorac Surg 2001; 72:1845.
  81. Schneider EC, Epstein AM: Use of public performance reports: a survey of patients undergoing cardiac surgery. JAMA 1998; 279:1638.
  82. Hofer TP, Hayward RA, Greenfield S, et al: The unreliability of individual physician "report cards" for assessing the costs and quality of care of a chronic disease. JAMA 1999; 281:2098.
  83. Thomas JW, Hofer TP: Accuracy of risk-adjusted mortality rate as a measure of hospital quality of care. Med Care 1999; 37:83.
  84. Bindman AB: Can physician profiles be trusted? JAMA 1999; 281:2142.
  85. Marshall MN, Shekelle PG, Leatherman S, Brook RH: The public release of performance data: what do we expect to gain? A review of the evidence. JAMA 2000; 283:1866.
  86. Marshall MN: Accountability and quality improvement: the role of report cards. Qual Health Care 2001; 10:67.
  87. Ferraris VA: The dangers of gathering data. J Thorac Cardiovasc Surg 1992; 104:212.
  88. Goldstein H, Spiegelhalter DJ: League tables and their limitations: statistical issues in comparisons of institutional performance (with discussion). J R Stat Soc 1996; 159:385.
  89. Thomas N, Longford NT, Rolph JE: Empirical Bayes methods for estimating hospital-specific mortality rates. Stat Med 1994; 13:889.
  90. Shwartz M, Ash AS, Iezzoni LI: Comparing outcomes across providers, in Iezzoni LI (ed): Risk Adjustment for Measuring Healthcare Outcomes. Chicago, Health Administration Press, 1997; p 471.
  91. Hartz AJ, Kuhn EM, Kayser KL, et al: Assessing providers of coronary revascularization: a method for peer review organizations. Am J Public Health 1992; 82:1631.
  92. Lagakos SW: Statistical analysis of survival data, in Bailar JC, Mosteller F (eds): Medical Uses of Statistics. Boston, NEJM Books, 1992; p 281.
  93. Marubini E, Valsecchi MG: Analyzing Survival Data from Clinical Trials and Observational Studies. New York, Wiley, 1995.
  94. Cox DR, Oakes D: Analysis of Survival Data. London, Chapman and Hall, 1984.
  95. Kalbfleisch JD, Prentice RL: The Statistical Analysis of Failure Time Data. New York, Wiley, 1980.
  96. Lee E: Statistical Methods for Survival Data Analysis. New York, Wiley, 1992.
  97. Bayes T: An essay towards solving a problem in the doctrine of chances, in Press SJ (ed): Bayesian Statistics: Principles, Models, and Applications. New York, Wiley, 1989; p 189.
  98. Pauker SG, Kassirer JP: Decision analysis, in Bailar JC, Mosteller F (eds): Medical Uses of Statistics. Boston, NEJM Books, 1992; p 159.
  99. Edwards FH, Graeber GM: The theorem of Bayes as a clinical research tool. Surg Gynecol Obstet 1987; 165:127.
  100. Press SJ: Bayesian Statistics: Principles, Models, and Applications. New York, Wiley, 1989.
  101. Bernardo JM: Bayesian Theory. New York, Wiley, 1994.
  102. Spiegelhalter DJ, Myles JP, Jones DR, Abrams KR: Bayesian methods in health technology assessment: a review. Health Technol Assess 2000; 4:1.
  103. Edwards FH, Clark RE, Schwartz M: Coronary artery bypass grafting: the Society of Thoracic Surgeons National Database experience. Ann Thorac Surg 1994; 57:12.
  104. Casella G: An introduction to empirical Bayes data-analysis. Am Statistician 1985; 39:83.
  105. Hattler BG, Madia C, Johnson C, et al: Risk stratification using the Society of Thoracic Surgeons Program. Ann Thorac Surg 1994; 58:1348.
  106. Marshall G, Shroyer AL, Grover FL, Hammermeister KE: Bayesian-logit model for risk assessment in coronary artery bypass grafting. Ann Thorac Surg 1994; 57:1492.
  107. Rees K, Beranek-Stanley M, Burke M, Ebrahim S: Hypothermia to reduce neurological damage following coronary artery bypass surgery. Cochrane Database Syst Rev 2001; 1:CD002138.
  108. Bartels C, Gerdes A, Babin-Ebell J, et al: Cardiopulmonary bypass: "evidence- or experience-based." J Thorac Cardiovasc Surg 2002; 124:20.
  109. Begg CB, Berlin JA: Publication bias and dissemination of clinical research. J Natl Cancer Inst 1989; 81:107.
  110. Berlin JA, Laird NM, Sacks HS, Chalmers TC: A comparison of statistical methods for combining event rates from clinical trials. Stat Med 1989; 8:141.
  111. Tierney WM: Meta-analysis and bouillabaisse. Ann Intern Med 1996; 125:519.
  112. LeLorier J, Gregoire G, Benhaddad A, et al: Discrepancies between meta-analyses and subsequent large randomized, controlled trials. N Engl J Med 1997; 337:536.
  113. LeLorier J, Gregoire G: Comparing results from meta-analyses vs large trials. JAMA 1998; 280:518.
  114. Bailar JC: The promise and problems of meta-analysis. N Engl J Med 1997; 337:559.
  115. Bailar JC: The practice of meta-analysis. J Clin Epidemiol 1995; 48:149.
  116. Longford NT: Selection bias and treatment heterogeneity in clinical trials. Stat Med 1999; 18:1467.
  117. Rosenbaum PR, Rubin DB: The central role of the propensity score in observational studies for causal effects. Biometrika 1983; 70:41.
  118. Benson K, Hartz AJ: A comparison of observational studies and randomized, controlled trials. N Engl J Med 2000; 342:1878.
  119. Concato J, Shah N, Horwitz RI: Randomized, controlled trials, observational studies, and the hierarchy of research designs. N Engl J Med 2000; 342:1887.
  120. Naftel DC: Do different investigators sometimes produce different multivariable equations from the same data? J Thorac Cardiovasc Surg 1994; 107:1528.
  121. Efron B, Tibshirani R: An Introduction to the Bootstrap. New York, Chapman & Hall, 1993.
  122. Diaconis P, Efron B: Computer-intensive methods in statistics. Sci Amer 1983; 248:116.
  123. Efron B: Model selection and the bootstrap. Math Soc Sci 1983; 5:236.
  124. Efron B: The bootstrap and modern statistics. J Am Stat Assoc 2000; 95:1293.
  125. Breiman L: Bagging predictors. Machine Learning 1996; 26:123.
  126. Deming WE: Out of the Crisis. Cambridge, MA, Massachusetts Institute of Technology Center for Advanced Engineering Study, 1986.
  127. Juran JM, Gryna FM: Juran's Quality Control Handbook. New York, McGraw-Hill, 1988.
  128. Juran JM: Juran on Leadership for Quality: An Executive Handbook. New York, Free Press, 1989.
  129. Juran JM: A History of Managing for Quality: The Evolution, Trends, and Future Directions of Managing for Quality. Milwaukee, WI, ASQC Quality Press, 1995.
  130. Berwick DM, Godfrey AB, Roessner J, National Demonstration Project on Quality Improvement in Health Care (eds): Curing Health Care: New Strategies for Quality Improvement. San Francisco, Jossey-Bass, 1990.
  131. Plsek PE: Resource B: A primer on quality improvement tools, in Berwick DM, Godfrey AB, Roessner J, National Demonstration Project on Quality Improvement in Health Care (eds): Curing Health Care: New Strategies for Quality Improvement. San Francisco, Jossey-Bass, 1990; p 177.
  132. Ryan T: Statistical Methods for Quality Improvement. New York, Wiley, 1989.
  133. Walton M: The Deming Management Method. New York, Putnam, 1986.
  134. Ryan TP: Statistical Methods for Quality Improvement. New York, Wiley, 2000.
  135. Shahian DM, Williamson WA, Svensson LG, Restuccia JD, D'Agostino RS: Applications of statistical quality control to cardiac surgery. Ann Thorac Surg 1996; 62:1351.
  136. Novick RJ, Fox SA, Stitt LW, et al: Cumulative sum failure analysis of a policy change from on-pump to off-pump coronary artery bypass grafting. Ann Thorac Surg 2001; 72:S1016.
  137. de Leval MR, Francois K, Bull C, Brawn W, Spiegelhalter D: Analysis of a cluster of surgical failures: application to a series of neonatal arterial switch operations. J Thorac Cardiovasc Surg 1994; 107:914.
  138. O'Connor GT, Plume SK, Olmstead EM, et al: Multivariate prediction of in-hospital mortality associated with coronary artery bypass graft surgery. Northern New England Cardiovascular Disease Study Group. Circulation 1992; 85:2110.
  139. O'Connor GT, Plume SK, Olmstead EM, et al: A regional prospective study of in-hospital mortality associated with coronary artery bypass grafting. The Northern New England Cardiovascular Disease Study Group. JAMA 1991; 266:803.
  140. Kasper JF, Plume SK, O'Connor GT: A methodology for QI in the coronary artery bypass grafting procedure involving comparative process analysis. Qual Rev Bull 1992; 18:129.
  141. Nugent WC, Schults WC: Playing by the numbers: how collecting outcomes data changed my life. Ann Thorac Surg 1994; 58:1866.
  142. Hartz AJ, Kuhn EM, Kayser KL, Johnson WD: BUN as a risk factor for mortality after coronary artery bypass grafting. Ann Thorac Surg 1995; 60:398.
  143. Otaki M: Surgical treatment of patients with cardiac cachexia: an analysis of factors affecting operative mortality. Chest 1994; 105:1347.
  144. Boyd O, Grounds RM, Bennett ED: A randomized clinical trial of the effect of deliberate perioperative increase of oxygen delivery on mortality in high-risk surgical patients. JAMA 1993; 270:2699.
  145. Frater RW, Sisto D, Condit D: Cardiac surgery in human immunodeficiency virus (HIV) carriers. Eur J Cardiothorac Surg 1989; 3:146.
  146. Mandal AK, Kaushik VS, Oparah SS: Risk of aortocoronary bypass surgery in a low-volume inner-city hospital. J Natl Med Assoc 1991; 83:519.
  147. Nallamothu BK, Saint S, Ramsey SD, Hofer TP, Vijan S, Eagle KA: The role of hospital volume in coronary artery bypass grafting: is more always better? J Am Coll Cardiol 2001; 38:1923.
  148. Hannan EL, Kilburn H, Bernard H, O'Donnell JF, Lukacik G, Shields EP: Coronary artery bypass surgery: the relationship between in-hospital mortality rate and surgical volume after controlling for clinical risk factors. Med Care 1991; 29:1094.
  149. Sowden AJ, Deeks JJ, Sheldon TA: Volume and outcome in coronary artery bypass graft surgery: true association or artifact? BMJ 1995; 311:151.
  150. DeFoe GR, Ross CS, Olmstead EM, et al: Lowest hematocrit on bypass and adverse outcomes associated with coronary artery bypass grafting. Northern New England Cardiovascular Disease Study Group. Ann Thorac Surg 2001; 71:769.
  151. Leavitt BJ, O'Connor GT, Olmstead EM, et al: Use of the internal mammary artery graft and in-hospital mortality and other adverse outcomes associated with coronary artery bypass surgery. Circulation 2001; 103:507.
  152. O'Connor NJ, Morton JR, Birkmeyer JD, et al: Effect of coronary artery diameter in patients undergoing coronary bypass surgery. Northern New England Cardiovascular Disease Study Group. Circulation 1996; 93:652.
  153. Sethi GK, Hammermeister KE, Oprian C, Henderson W: Impact of resident training on postoperative morbidity in patients undergoing single valve replacement. J Thorac Cardiovasc Surg 1991; 101:1053.
  154. Kress DC, Kroncke GM, Chopra PS, et al: Comparison of survival in cardiac surgery at a Veterans Administration hospital and its affiliated university hospital. Arch Surg 1988; 123:439.
  155. Hammermeister KE, Daley J, Grover FL: Using outcomes data to improve clinical practice: what we have learned. Ann Thorac Surg 1994; 58:1809.
  156. Daley J: Validity of risk-adjustment methods, in Iezzoni LI (ed): Risk Adjustment for Measuring Healthcare Outcomes. Chicago, Health Administration Press, 1997; p 331.
  157. Burns ER, Billett HH, Frater RW, Sisto DA: The preoperative bleeding time as a predictor of postoperative hemorrhage after cardiopulmonary bypass. J Thorac Cardiovasc Surg 1986; 92:310.
  158. Gewirtz AS, Miller ML, Keys TF: The clinical usefulness of the preoperative bleeding time. Arch Pathol Lab Med 1996; 120:353.
  159. De Caterina R, Lanza M, Manca G, et al: Bleeding time and bleeding: an analysis of the relationship of the bleeding time test with parameters of surgical bleeding. Blood 1994; 84:3363.
  160. Freiman JA, Chalmers TC, Smith HJ, Kuebler RR: The importance of beta, the type II error, and sample size in the design and interpretation of the randomized controlled trial, in Bailar JC, Mosteller F (eds): Medical Uses of Statistics. Boston, NEJM Books, 1992; p 357.
  161. Kallis P, Unsworth-White J, Munsch C, et al: Disability and distress following cardiac surgery in patients over 70 years of age. Eur J Cardiothorac Surg 1993; 7:306.
  162. Guadagnoli E, Ayanian JZ, Cleary PD: Comparison of patient-reported outcomes after elective coronary artery bypass grafting in patients aged greater than or equal to and less than 65 years. Am J Cardiol 1992; 70:60.
  163. Rumsfeld JS, Magid DJ, O'Brien M, et al: Changes in health-related quality of life following coronary artery bypass graft surgery. Ann Thorac Surg 2001; 72:2026.
  164. Rumsfeld JS, MaWhinney S, McCarthy M, et al: Health-related quality of life as a predictor of mortality following coronary artery bypass graft surgery. Participants of the Department of Veterans Affairs Cooperative Study Group on Processes, Structures, and Outcomes of Care in Cardiac Surgery. JAMA 1999; 281:1298.
  165. Lee TH, Shammash JB, Ribeiro JP, et al: Estimation of maximum oxygen uptake from clinical data: performance of the Specific Activity Scale. Am Heart J 1988; 115:203.
  166. Ware JE, Sherbourne CD: The MOS 36-item short-form health survey (SF-36): I, conceptual framework and item selection. Med Care 1992; 30:473.
  167. Lee TH, American College of Cardiology: Private Sector Relations Committee: Evaluating the Quality of Cardiovascular Care: A Primer. Bethesda, MD, American College of Cardiology, 1995; p 53.
  168. Ware JE Jr, Davies AR, Rubin HR: Patients' assessment of their care, in U.S. Congress Office of Technology Assessment (ed): The Quality of Medical Care: Information for Consumers. Washington, DC, U.S. Government Printing Office, Vol. (OTA-H-386), June, 1988.
  169. Kaplan SH, Ware JE Jr: The patient's role in health care and quality assessment, in Goldfield N, Nash DB (eds): Providing Quality Care: The Challenge to Clinicians. Philadelphia, American College of Physicians, 1989.
  170. Cope DW, Linn LS, Leake BD, Barrett PA: Modification of residents' behavior by preceptor feedback of patient satisfaction. J Gen Intern Med 1986; 1:394.
  171. Hughes JS, Ash AS: Reliability of risk-adjustment methods, in Iezzoni LI (ed): Risk Adjustment for Measuring Healthcare Outcomes. Chicago, Health Administration Press, 1997; p 365.
  172. Fleiss JL: Statistical Methods for Rates and Proportions. New York, Wiley, 1981.
  173. Landis JR, Koch GG: The measurement of observer agreement for categorical data. Biometrics 1977; 33:159.
  174. Hartz AJ, Kuhn EM, Green R, Rimm AA: The use of risk-adjusted complication rates to compare hospitals performing coronary artery bypass surgery or angioplasty. Int J Technol Assess Health Care 1992; 8:524.
  175. Hannan EL, Kilburn H, O'Donnell JF, et al: Adult open heart surgery in New York State: an analysis of risk factors and hospital mortality rates. JAMA 1990; 264:2768.
  176. Park RE, Brook RH, Kosecoff J, et al: Explaining variations in hospital death rates: randomness, severity of illness, quality of care. JAMA 1990; 264:484.
  177. Thomas JW, Holloway JJ, Guire KE: Validating risk-adjusted mortality as an indicator for quality of care. Inquiry 1993; 30:6.
  178. Silber JH, Rosenbaum PR, Ross RN: Comparing the contributions of groups of predictors: which outcomes vary with hospital rather than patient characteristics. J Am Stat Assoc 1995; 90:7.
  179. Hammermeister KE, Burchfiel C, Johnson R, Grover FL: Identification of patients at greatest risk for developing major complications at cardiac surgery. Circulation 1990; 82(5 suppl):IV380.
  180. Grover FL, Hammermeister KE, Burchfiel C: Initial report of the Veterans Administration Preoperative Risk Assessment Study for Cardiac Surgery. Ann Thorac Surg 1990; 50:12.
  181. Hammermeister KE, Johnson R, Marshall G, Grover FL: Continuous assessment and improvement in quality of care: a model from the Department of Veterans Affairs Cardiac Surgery. Ann Surg 1994; 219:281.
  182. Koska MT: Are severity data an effective consumer tool? Hospitals 1989; 63:24.
  183. Iezzoni LI: Using risk-adjusted outcomes to assess clinical practice: an overview of issues pertaining to risk adjustment. Ann Thorac Surg 1994; 58:1822.
  184. Wu AW: The measure and mismeasure of hospital quality: appropriate risk-adjustment methods in comparing hospitals. Ann Intern Med 1995; 122:149.
  185. Daley J, Henderson WG, Khuri SF: Risk-adjusted surgical outcomes. Ann Rev Med 2001; 52:275.
  186. Iezzoni LI, Greenberg LG: Widespread assessment of risk-adjusted outcomes: lessons from local initiatives. Jt Comm J Qual Improv 1994; 20:305.
  187. Kassirer JP: The use and abuse of practice profiles. N Engl J Med 1994; 330:634.
  188. Green J, Wintfeld N: Report cards on cardiac surgeons: assessing New York State's approach. N Engl J Med 1995; 332:1229.
  189. Hannan EL, Kilburn H, Racz M, Shields E, Chassin MR: Improving the outcomes of coronary artery bypass surgery in New York State. JAMA 1994; 271:761.
  190. Ghali WA, Ash AS, Hall RE, Moskowitz MA: Statewide quality improvement initiatives and mortality after cardiac surgery. JAMA 1997; 277:379.
  191. Peterson ED, DeLong ER, et al: The effects of New York's bypass surgery provider profiling on access to care and patient outcomes in the elderly. J Am Coll Cardiol 1998; 32:993.
  192. Malenka DJ, O'Connor GT: A regional collaborative effort for CQI in cardiovascular disease. Northern New England Cardiovascular Study Group. Jt Comm J Qual Improv 1995; 21:627.
  193. O'Connor GT, Plume SK, Wennberg JE: Regional organization for outcomes research. Ann N Y Acad Sci 1993; 703:44.
  194. Borer A, Gilad J, Meydan N, et al: Impact of active monitoring of infection control practices on deep sternal infection after open-heart surgery. Ann Thorac Surg 2001; 72:515.
  195. Epstein AM: Changes in the delivery of care under comprehensive health-care reform. N Engl J Med 1993; 329:1672.
  196. McNeil BJ, Pedersen SH, Gatsonis C: Current issues in profiling quality of care. Inquiry 1992; 29:298.
  197. Kassirer JP: Managed care and the morality of the marketplace. N Engl J Med 1995; 333:50.
  198. Epstein A: Sounding board: performance reports on quality: prototypes, problems, and prospects. N Engl J Med 1995; 333:57.
  199. Omoigui N, Annan K, Brown K, et al: Potential explanation for decreased CABG-related mortality in New York State: outmigration to Ohio. Circulation 1994; 90:93.
  200. Morgan RO, Virnig BA, DeVito CA, Persily NA: The Medicare-HMO revolving door: the healthy go in and the sick go out. N Engl J Med 1997; 337:169.
  201. Van de Ven WP, Van Vliet RC, Van Barneveld EM, Lamers LM: Risk-adjusted capitation: recent experiences in The Netherlands. Health Aff (Millwood) 1994; 13:120.
  202. Gauthier AK, Lamphere JA, Barrand NL: Risk selection in the health care market: a workshop overview. Inquiry 1995; 32:14.
  203. Kronick R, Zhou Z, Dreyfus T: Making risk adjustment work for everyone. Inquiry 1995; 32:41.
  204. Bowen B: The practice of risk adjustment. Inquiry 1995; 32:33.
  205. Newhouse JP: Patients at risk: health reform and risk adjustment. Health Aff (Millwood) 1994; 13:132.
  206. Rodwin MA: Conflicts in managed care. N Engl J Med 1995; 332:604.
  207. Walker LM: Managed care 1995: turn capitation into a moneymaker. Medical Economics 1995; 72:58.
  208. Morain C: When managed care takes over, watch out! Med Economics 1995; 72:38.
  209. Himmelstein DU, Woolhandler S, Hellander I, Wolfe SM: Quality of care in investor-owned vs not-for-profit HMOs. JAMA 1999; 282:159.
  210. Woolhandler S, Himmelstein DU: Costs of care and administration at for-profit and other hospitals in the United States. N Engl J Med 1997; 336:769.
  211. Anonymous: Iowa: classic test of a future concept. Med Outcomes & Guidelines Alert 1995; 3:8.
  212. Gardner E: Florida hospitals plan quality-indicator project. Mod Healthcare 1993; 6:36.
  213. Petitti DB: Meta-Analysis, Decision Analysis, and Cost-Effectiveness Analysis: Methods for Quantitative Synthesis in Medicine. New York, Oxford University Press, 2000; p 140.
  214. Hunink MG: In search of tools to aid logical thinking and communicating about medical decision making. Med Decis Making 2001; 21:267.
  215. Eddy DM, Hasselblad V, Shachter R: A Bayesian method for synthesizing evidence: the confidence profile method. Int J Technol Assess Health Care 1990; 6:31.
  216. Eddy DM: Clinical decision making: from theory to practice: resolving conflicts in practice policies. JAMA 1990; 264:389.
  217. Gage BF, Cardinalli AB, Albers GW, Owens DK: Cost-effectiveness of warfarin and aspirin for prophylaxis of stroke in patients with nonvalvular atrial fibrillation. JAMA 1995; 274:1839.
  218. Riley G, Lubitz J: Outcomes of surgery among the Medicare aged: surgical volume and mortality. Health Care Financing Rev 1985; 7:37.
  219. Showstack JA, Rosenfeld KE, Garnick DW, et al: Association of volume with outcome of coronary artery bypass graft surgery: scheduled vs nonscheduled operations. JAMA 1987; 257:785.
  220. Hannan EL, O'Donnell JF, Kilburn H, et al: Investigation of the relationship between volume and mortality for surgical procedures performed in New York State hospitals. JAMA 1989; 262:503.
  221. Farley DE, Ozminkowski RJ: Volume-outcome relationships and in-hospital mortality: the effect of changes in volume over time. Med Care 1992; 30:77.
  222. Grumbach K, Anderson GM, Luft HS, et al: Regionalization of cardiac surgery in the United States and Canada: geographic access, choice, and outcomes. JAMA 1995; 274:1282.
  223. Hannan EL, Siu AL, Kumar D, et al: The decline in coronary artery bypass graft surgery mortality in New York State: the role of surgeon volume. JAMA 1995; 273:209.
  224. Shroyer ALW, Marshall G, Warner BA, et al: No continuous relationship between Veterans Affairs hospital coronary artery bypass grafting surgical volume and operative mortality. Ann Thorac Surg 1996; 61:17.
  225. Sollano JA, Gelijns AC, Moskowitz AJ, et al: Volume-outcome relationships in cardiovascular operations: New York State, 1990-1995. J Thorac Cardiovasc Surg 1999; 117:419.
  226. Hewitt M, for the Committee on Quality of Health Care in America and the National Cancer Policy Board: Interpreting the Volume-Outcome Relationship in the Context of Health Care Quality. Workshop summary. Washington, DC, Institute of Medicine, 2000.
  227. Luft HS, Bunker JP, Enthoven AC: Should operations be regionalized? The empirical relation between surgical volume and mortality. N Engl J Med 1979; 301:1364.
  228. Tu JV, Naylor CD: Coronary artery bypass mortality rates in Ontario: a Canadian approach to quality assurance in cardiac surgery. Steering Committee of the Provincial Adult Cardiac Care Network of Ontario. Circulation 1996; 94:2429.
  229. Crawford FA, Anderson RP, Clark RE, et al: Volume requirements for cardiac surgery credentialing: a critical examination. The Ad Hoc Committee on Cardiac Surgery Credentialing of The Society of Thoracic Surgeons. Ann Thorac Surg 1996; 61:12.
  230. Brennan TA, Leape LL, Laird NM, et al: Incidence of adverse events and negligence in hospitalized patients. Results of the Harvard Medical Practice Study I. N Engl J Med 1991; 324:370.
  231. Thomas EJ, Studdert DM, Burstin HR, et al: Incidence and types of adverse events and negligent care in Utah and Colorado. Med Care 2000; 38:261.
  232. Kohn LT, Corrigan J, Donaldson MS, Institute of Medicine (U.S.), Committee on Quality of Health Care in America: To Err Is Human: Building a Safer Health System. Washington, DC, National Academy Press, 2000.
  233. Runciman WB, Webb RK, Helps SC, et al: A comparison of iatrogenic injury studies in Australia and the USA, II: reviewer behaviour and quality of care. Int J Qual Health Care 2000; 12:379.
  234. Vincent CA: Research into medical accidents: a case of negligence? BMJ 1989; 299:1150.
  235. Donchin Y, Gopher D, Olin M, et al: A look into the nature and causes of human errors in the intensive care unit. Crit Care Med 1995; 23:294.
  236. Brennan TA: The Institute of Medicine report on medical errors: could it do harm? N Engl J Med 2000; 342:1123.
  237. Richardson WC, Berwick DM, Bisgard JC, et al: The Institute of Medicine Report on Medical Errors: misunderstanding can do harm. Quality of Health Care in America Committee. MedGenMed 2000; Sep 19:E42.
  238. Hayward RA, Hofer TP: Estimating hospital deaths due to medical errors: preventability is in the eye of the reviewer. JAMA 2001; 286:415.
  239. Wears RL, Janiak B, Moorhead JC, et al: Human error in medicine: promise and pitfalls, part 2. Ann Emerg Med 2000; 36:142.
  240. Berwick DM, Leape LL: Reducing errors in medicine. Qual Health Care 1999; 8:145.
  241. de Leval MR, Carthey J, Wright DJ, et al: Human factors and cardiac surgery: a multicenter study. J Thorac Cardiovasc Surg 2000; 119:661.
  242. Langdorf MI, Fox JC, Marwah RS, et al: Physician versus computer knowledge of potential drug interactions in the emergency department. Acad Emerg Med 2000; 7:1321.
  243. Soza H: Reducing medical errors through technology. Cost Qual 2000; 6:24.
  244. Tierney WM, Overhage JM, Takesue BY, et al: Computerizing guidelines to improve care and patient outcomes: the example of heart failure. J Am Med Inform Assoc 1995; 2:316.
  245. Overhage JM, Tierney WM, McDonald CJ: Computer reminders to implement preventive care guidelines for hospitalized patients. Arch Intern Med 1996; 156:1551.
  246. Litzelman DK, Tierney WM: Physicians' reasons for failing to comply with computerized preventive care guidelines. J Gen Intern Med 1996; 11:497.
  247. Orr RK: Use of an artificial neural network to quantitate risk of malignancy for abnormal mammograms. Surgery 2001; 129:459.
  248. Katz S, Katz AS, Lowe N, Quijano RC: Neural net-bootstrap hybrid methods for prediction of complications in patients implanted with artificial heart valves. J Heart Valve Dis 1994; 3:49.
  249. Tu JV, Guerriere MR: Use of a neural network as a predictive instrument for length of stay in the intensive care unit following cardiac surgery. Comput Biomed Res 1993; 26:220.
  250. Steen PM: Approaches to predictive modeling. Ann Thorac Surg 1994; 58:1836.
  251. Freeman RV, Eagle KA, Bates ER, et al: Comparison of artificial neural networks with logistic regression in prediction of in-hospital death after percutaneous transluminal coronary angioplasty. Am Heart J 2000; 140:511.
  252. American College of Surgeons: Standard of efficiency for the first hospital survey by the College. Bull Am Coll Surg 1918; 3:1.
  253. Weed LL: Medical records that guide and teach. N Engl J Med 1968; 278:652 concl.
  254. Weed LL: Medical records that guide and teach. N Engl J Med 1968; 278:593.
  255. Weed LL: What physicians worry about: how to organize care of multiple-problem patients. Mod Hospital 1968; 110:90.
  256. Henzler C, Harper JJ: Implementing a computer-assisted appropriateness review using DRG 182/183. Jt Comm J Qual Improv 1995; 21:239.
  257. Safran C, Rind DM, Davis RB, et al: Guidelines for management of HIV infection with computer-based patient's record. Lancet 1995; 346:341.
  258. Rector AL, Nowlan WA, Kay S, Goble CA, Howkins TJ: A framework for modelling the electronic medical record. Methods Inf Med 1993; 32:109.
  259. Tierney WM: Improving clinical decisions and outcomes with information: a review. Int J Med Inform 2001; 62:1.
  260. Tierney WM, Miller ME, Overhage JM, McDonald CJ: Physician inpatient order writing on microcomputer workstations: effects on resource utilization. JAMA 1993; 269:379.
  261. Overhage JM, Tierney WM, Zhou XH, McDonald CJ: A randomized trial of "corollary orders" to prevent errors of omission. J Am Med Inform Assoc 1997; 4:364.
  262. McConnell T: Safer, cheaper, smarter: Computerized physician order entry promises to streamline and improve healthcare delivery. Health Manage Technol 2001; 22:16.
  263. Dexter PR, Perkins S, Overhage JM, et al: A computerized reminder system to increase the use of preventive care for hospitalized patients. N Engl J Med 2001; 345:965.[Abstract/Free?Full?Text]
  264. Kuperman GJ, Teich JM, Gandhi TK, Bates DW: Patient safety and computerized medication ordering at Brigham and Women's Hospital. Jt Comm J Qual Improv 2001; 27:509.[Medline]
  265. Iezzoni L: Measuring the severity of illness and case mix, in Goldfield ND (ed): Providing Quality Care: The Challenge to Clinicians. Philadelphia, American College of Physicians, 1989; p 70.
  266. Hannan EL, Kumar D, Racz M, Siu AL, Chassin MR: New York State's Cardiac Surgery Reporting System: four years later. Ann Thorac Surg 1994; 58:1852.[Abstract]
  267. Edwards FH, Grover FL, Shroyer AL, et al: The Society of Thoracic Surgeons National Cardiac Surgery Database: current risk assessment. Ann Thorac Surg 1997; 63:903.[Abstract/Free?Full?Text]
  268. Grover FL, Johnson RR, Shroyer AL, et al: The Veterans Affairs Continuous Improvement in Cardiac Surgery Study. Ann Thorac Surg 1994; 58:1845.[Abstract]
  269. Grover FL, Shroyer AL, Hammermeister KE: Calculating risk and outcome: the Veterans Affairs database. Ann Thorac Surg 1996; 62:S6.
  270. Parsonnet V, Dean D, Bernstein AD: A method of uniform stratification of risk for evaluating the results of surgery in acquired adult heart disease. Circulation 1989; 79:I3.
  271. Nashef SA, Carey F, Silcock MM, et al: Risk stratification for open heart surgery: trial of the Parsonnet system in a British hospital. BMJ 1992; 305:1066.
  272. Martinez-Alario J, Tuesta ID, Plasencia E, et al: Mortality prediction in cardiac surgery patients: comparative performance of Parsonnet and general severity systems. Circulation 1999; 99:2378.[Medline]
  273. Parsonnet V, Bernstein AD, Gera M: Clinical usefulness of risk-stratified outcome analysis in cardiac surgery in New Jersey. Ann Thorac Surg 1996; 61:S8.
  274. Tu JV, Wu K: The improving outcomes of coronary artery bypass graft surgery in Ontario, 1981 to 1995. CMAJ 1998; 159:221.[Abstract]
  275. O'Connor GT, Plume SK, Olmstead EM, et al: A regional intervention to improve the hospital mortality associated with coronary artery bypass graft surgery. The Northern New England Cardiovascular Disease Study Group. JAMA 1996; 275:841.[Abstract]
  276. Higgins TL, Estafanous FG, Loop FD, et al: Stratification of morbidity and mortality outcome by preoperative risk factors in coronary artery bypass patients: a clinical severity score. JAMA 1992; 267:2344.[Abstract]
  277. Ghali WA, Quan H, Brant R: Coronary artery bypass grafting in Canada: national and provincial mortality trends, 19921995. CMAJ 1998; 159:25.[Abstract]
  278. Weintraub WS, Wenger NK, Jones EL, et al: Changing clinical characteristics of coronary surgery patients: differences between men and women. Circulation 1993; 88:II79.
  279. Grover FL, Johnson RR, Marshall G, Hammermeister KE: Factors predictive of operative mortality among coronary artery bypass subsets. Ann Thorac Surg 1993; 56:1296.[Abstract]
  280. Iyer VS, Russell WJ, Leppard P, Craddock D: Mortality and myocardial infarction after coronary artery surgery: a review of 12,003 patients. Med J Aust 1993; 159:166.0[Medline]
  281. Ivanov J, Tu JV, Naylor CD: Ready-made, recalibrated, or remodeled? Issues in the use of risk indexes for assessing mortality after coronary artery bypass graft surgery. Circulation 1999; 99:2098.[Medline]
  282. Higgins TL, Estafanous FG, Loop FD, et al: ICU admission score for predicting morbidity and mortality risk after coronary artery bypass grafting. Ann Thorac Surg 1997; 64:1050.[Abstract/Free?Full?Text]
  283. Mozes B, Olmer L, Galai N, Simchen E: A national study of postoperative mortality associated with coronary artery bypass grafting in Israel. ISCAB Consortium. Israel Coronary Artery Bypass Study. Ann Thorac Surg 1998; 66:1254.[Abstract/Free?Full?Text]
  284. DeLong ER, Peterson ED, DeLong DM, et al: Comparing risk-adjustment methods for provider profiling. Stat Med 1997; 16:2645.[Medline]
  285. Reich DL, Bodian CA, Krol M, et al: Intraoperative hemodynamic predictors of mortality, stroke, and myocardial infarction after coronary artery bypass surgery. Anesth Analg 1999; 89:814.[Abstract/Free?Full?Text]
  286. Bernstein AD, Parsonnet V: Bedside estimation of risk as an aid for decision-making in cardiac surgery. Ann Thorac Surg 2000; 69:823.[Abstract/Free?Full?Text]
  287. Lahey SJ, Borlase BC, Lavin PT, Levitsky S: Preoperative risk factors that predict hospital length of stay in coronary artery bypass patients >60 years old. Circulation 1992; 86:II181.
