Change in outcomes in orthopaedics can be considered following operative intervention, and by examining time-series following system interventions. The measures of performance in both settings are similar and reflect the variables we consider to lie at the core of orthopaedic practice. Although there is a degree of overlap with variables used to measure learning, these are largely related to patient outcomes and health economics.
Prior to implementing and evaluating change, researchers must identify appropriate measures to determine whether an intervention works. Ideally, these should be part of routinely collected data for quality improvement purposes. One example is the National Hip Fracture Database in the United Kingdom, which routinely collects standardised outcome data. The World Hip Trauma Evaluation (WHiTE) study has built on this database to establish a reliable and organised framework for comprehensive cohort studies on fragility hip fractures.
Patient outcomes in orthopaedics mainly include mortality, postoperative complications, infection, performance testing, and PROMs. Of these, PROMs have seen a recent surge in research because they lie at the heart of patient-centred care. It is no surprise that health-related quality of life measures such as the EuroQol are increasingly being employed to guide operative decision making in trauma[29,32]. Simultaneously, there is a trend towards including patients in setting research questions through priority setting partnerships, and patient and public involvement is now indispensable to healthcare research. Cost-utility, the financial cost per unit of health gain, is the variable that the National Institute for Health and Care Excellence (NICE) uses when forming guidelines for healthcare provision. It is thus very important that orthopaedic surgeons understand and incorporate cost-utility analysis in their research.
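To make the idea of cost per unit of health gain concrete, the sketch below computes an incremental cost-effectiveness ratio, a quantity commonly reported in cost-utility analysis. The function name, the use of quality-adjusted life years (QALYs) as the unit of health gain, and all figures are illustrative assumptions and are not taken from any study or guideline cited here.

```python
def incremental_cost_per_qaly(cost_new, cost_old, qaly_new, qaly_old):
    """Extra cost paid for each extra QALY gained by the new treatment.

    All inputs are per-patient averages; the name and signature are
    hypothetical, chosen only for this illustration.
    """
    extra_qalys = qaly_new - qaly_old
    if extra_qalys == 0:
        raise ValueError("No difference in health gain; the ratio is undefined")
    return (cost_new - cost_old) / extra_qalys


# Invented figures: the new implant costs 3000 more per patient and
# yields 0.3 additional QALYs on average.
ratio = incremental_cost_per_qaly(12000, 9000, 6.5, 6.2)
print(round(ratio))
```

The ratio is only meaningful alongside the sign of each difference: a treatment that is both cheaper and more effective gives a negative ratio and should be described as dominant, not ranked by the number alone.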
Variables used to evaluate an intervention are usually divided into outcome measures, process measures, and balancing measures[5,36]. Outcome measures monitor how a system is performing, process measures assess the implementation of an intervention, and balancing measures assess unintended consequences of the intervention.
Once outcome measures are identified and data is collected, analysis of the data is required to evaluate change.
Operative intervention: Analysing change following operative intervention forms the basis of retrospective and prospective research studies. The level of evidence for a given study depends on a multitude of factors, most importantly study design. There are three types of outcome variables: Continuous (e.g., operative time), categorical (e.g., presence or absence of a complication), and time-to-event (e.g., time to revision of a joint replacement). Statistical tests comparing outcomes are chosen according to the type of variable and include parametric (e.g., t-test) and non-parametric (e.g., Mann-Whitney) tests, cross-tabulations (e.g., Chi-squared test and Fisher’s exact test), and survival analysis. These tests usually output a significance value (P-value), which is the probability of observing a result at least as extreme as the one obtained if there were truly no difference between the groups.
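As an illustration of a test for a categorical outcome, the sketch below computes a two-sided Fisher's exact test for a 2 × 2 table using only the Python standard library. The function name and the example counts are hypothetical and are not drawn from any study cited here; in practice an established statistics package would normally be used.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows could be treatment groups and columns complication yes/no.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x events in the first row,
        # given fixed row and column totals.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of every table that is no more likely than
    # the observed one (small tolerance guards against rounding error).
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )
```

For example, fisher_exact_two_sided(3, 1, 1, 3) evaluates to 34/70, or approximately 0.49, so a split of this size in eight patients is entirely compatible with no true difference between groups.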
Increased focus is being placed on the minimal clinically important difference - the smallest change in an outcome that a patient would identify as important, and which would usually indicate a change in patient management. With a large enough sample size, even a very small change can be shown to be statistically significant, but such a change may not be clinically important. There is significant variation in the reporting of sample size calculations in orthopaedic literature and, until recently, reporting guidelines were lacking. Adoption of the DELTA2 guidance on choosing a target difference and reporting sample size in RCTs should improve this.
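The link between the target difference and sample size can be sketched with the standard normal-approximation formula for comparing two means, n per group = 2(z₁₋α/₂ + z₁₋β)²σ²/δ². This is a generic textbook calculation, not a procedure prescribed by the DELTA2 guidance, and the function name, the score scale, and the example values are assumptions made for illustration.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(target_difference, sd, alpha=0.05, power=0.80):
    """Patients needed per arm to detect `target_difference` between two means.

    Uses the normal approximation with a two-sided significance level
    `alpha`; `sd` is the assumed common standard deviation of the outcome.
    """
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)
    z_beta = standard_normal.inv_cdf(power)
    n = 2 * ((z_alpha + z_beta) * sd / target_difference) ** 2
    return ceil(n)


# Hypothetical outcome score with SD 10 and a target difference of 5 points.
print(sample_size_per_group(5, 10))
# Halving the target difference roughly quadruples the required sample.
print(sample_size_per_group(2.5, 10))
```

Because the required sample grows with the inverse square of the target difference, choosing a difference smaller than the minimal clinically important difference inflates the trial without making its result more useful to patients.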
RCTs are considered the gold-standard hypothesis-testing study design. This is mainly because randomisation controls for the confounding variables that complicate observational studies. Over the last decade there has been a surge in trauma trials on an international scale, starting with the CRASH-2 trial on the effectiveness of tranexamic acid in trauma. Other large-scale randomised trials have followed suit, investigating fixation of intracapsular neck of femur fractures, fixation of distal radius fractures, and the optimal timing of hip fracture surgery, to name a few.
Although RCTs are excellent for answering certain research questions, retrospective studies remain indispensable. In the era of information technology, ‘Big Data’ is becoming ubiquitous. Using Big Data to identify research questions and guide efficient targeting of resources, and subsequently addressing these questions with randomised trials, may become routine within a few years. Early experience is promising. One major limitation that will need to be addressed if RCTs are to output the highest quality data is surgeon equipoise. Surgeons are rarely in true equipoise and usually have a clear idea of which management option is best for a given patient. Although few would question the importance of decision making in surgery, it can present an obstacle when patient randomisation is required. This must be addressed through improved surgeon education and standardised randomisation processes.
Time-series analysis: A toolbox for detecting change: Many quality improvement projects evaluate the effectiveness of an intervention by collecting data over time. Data can be graphically displayed as control charts, also known as Shewhart charts. These are a statistical process control tool used to determine whether a system is in control, and they provide immediate feedback about performance.
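A minimal sketch of one common control chart, the individuals (XmR) chart, is given below. The centre line is the mean of a baseline period and the control limits sit at 2.66 times the mean moving range on either side of it. The function names and the monthly values are hypothetical, and other chart types may suit count or proportion data better.

```python
from statistics import mean


def individuals_chart_limits(baseline):
    """Return (lower limit, centre line, upper limit) for an XmR chart."""
    # Moving range: absolute difference between consecutive observations.
    moving_ranges = [abs(b - a) for a, b in zip(baseline, baseline[1:])]
    centre = mean(baseline)
    # 2.66 is the standard constant for moving ranges of two observations.
    half_width = 2.66 * mean(moving_ranges)
    return centre - half_width, centre, centre + half_width


def out_of_control(values, baseline):
    """Indices of new observations falling outside the baseline limits."""
    lower, _, upper = individuals_chart_limits(baseline)
    return [i for i, v in enumerate(values) if v < lower or v > upper]


# Invented monthly mean times to surgery (hours) during a stable period,
# followed by four new months to be monitored.
baseline = [30, 32, 29, 31, 30, 33, 28, 31]
new_months = [31, 29, 45, 30]
print(individuals_chart_limits(baseline))
print(out_of_control(new_months, baseline))
```

Here only the third new month falls outside the limits, which would prompt a search for a special cause instead of a reaction to every month-to-month fluctuation.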
Orthopaedic surgeons may be more familiar with audit cycles. Audit is a framework of quality improvement where performance is compared to a published standard. Part of this process includes introducing an intervention and assessing its effectiveness by comparing performance before and after the intervention by simple statistical group tests. Although ubiquitous in clinical orthopaedics and indeed in all medical specialties, such approaches are sensitive to secular (background) trends. Interrupted time-series (ITS) analysis is a useful tool for evaluating the effectiveness of interventions where data is collected at several time-points before and after the intervention to determine whether any change could be explained by secular trends. Cochrane recommends this tool to evaluate interventions and several recent orthopaedic studies have employed this technique[50,51].
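The difference between a simple before-and-after comparison and ITS can be sketched as follows. A straight line is fitted by least squares to the observations on each side of the intervention, and the change in level and in slope at the intervention is reported. This is a simplified illustration with hypothetical function names and invented data; it ignores autocorrelation and seasonality, which a full ITS analysis would need to address.

```python
def fit_line(times, values):
    """Ordinary least-squares slope and intercept for one segment."""
    n = len(times)
    t_bar = sum(times) / n
    y_bar = sum(values) / n
    sxx = sum((t - t_bar) ** 2 for t in times)
    sxy = sum((t - t_bar) * (y - y_bar) for t, y in zip(times, values))
    slope = sxy / sxx
    return slope, y_bar - slope * t_bar


def interrupted_time_series(times, values, intervention_time):
    """Level and slope change at the intervention from two fitted segments."""
    pre = [(t, y) for t, y in zip(times, values) if t < intervention_time]
    post = [(t, y) for t, y in zip(times, values) if t >= intervention_time]
    pre_slope, pre_intercept = fit_line(*zip(*pre))
    post_slope, post_intercept = fit_line(*zip(*post))
    # Counterfactual: where the pre-intervention trend would have been.
    expected = pre_intercept + pre_slope * intervention_time
    observed = post_intercept + post_slope * intervention_time
    return {
        "pre_slope": pre_slope,
        "level_change": observed - expected,
        "slope_change": post_slope - pre_slope,
    }


# Invented series with a steady secular improvement and no true effect.
months = list(range(20))
hours = [40 - 0.5 * t for t in months]
result = interrupted_time_series(months, hours, intervention_time=10)
before_after = sum(hours[10:]) / 10 - sum(hours[:10]) / 10
print(result, before_after)
```

In this example the before-and-after means differ by 5 hours, yet the level and slope changes are both zero: the apparent improvement is entirely explained by the background trend.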
ITS is not without limitations: it is biased towards detecting change at the time of the studied intervention, whereas changes at other time-points may be equally, if not more, important[52,53]. Segmented linear regression models have been developed for evaluating change in retrospective studies by enabling more than one linear segment to describe the periods before and after an intervention. A recent study employing this technique revealed that improvements in time to surgery and 30-d mortality following hip fracture over a 6-year period were likely the result of a combination of surgical, anaesthetic, and procedural improvements over time, rather than the introduction of a dedicated hip fracture unit (Figure 3). Future work is required to determine the optimal way to describe retrospective time-series: How many linear segments should be used, and how to best model binary outcomes.
Figure 3 Time to surgery for neck of femur fractures.
The vertical dashed line marks the onset of a dedicated hip fracture unit. The line-plateau is the best-fitting linear model for the entire period: the line has equation y = −0.0414t + 40.1868; plateau at y = 24.7033 reached after 375 d. The initial drop may be related to the introduction of the Best Practice Tariff. The hip fracture unit did not significantly affect time to surgery.
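The line-plateau model quoted in the caption can be evaluated with the short sketch below. It assumes that t is measured in days from the start of the study period and that the line meets the plateau continuously; neither the units of y nor these assumptions are stated in the caption, and the function names are hypothetical.

```python
# Coefficients as quoted in the Figure 3 caption.
SLOPE = -0.0414
INTERCEPT = 40.1868
PLATEAU = 24.7033


def breakpoint_day():
    """Day on which the declining line reaches the plateau value."""
    return (PLATEAU - INTERCEPT) / SLOPE


def modelled_time_to_surgery(t):
    """Line-plateau prediction: follow the line until it meets the plateau."""
    # The slope is negative, so the larger of the two values applies.
    return max(SLOPE * t + INTERCEPT, PLATEAU)


print(breakpoint_day())
print(modelled_time_to_surgery(0), modelled_time_to_surgery(1000))
```

With the rounded coefficients quoted, the breakpoint comes out at roughly 374 d, close to the 375 d given in the caption; the small difference presumably reflects rounding of the published coefficients.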