P-values from random effects linear regression models

This post was originally published here

`lme4::lmer` is a useful frequentist approach to hierarchical/multilevel linear regression modelling. For good reason, the model output only includes t-values and doesn’t include p-values (partly due to the difficulty in estimating the degrees of freedom, as discussed here).

Yes, p-values are evil and we should continue to try and expunge them from our analyses. But I keep getting asked about this. So here is a simple parametric bootstrap method to generate two-sided p-values for the fixed-effects coefficients. Interpret with caution.

```r
library(lme4)

# Run model with lme4 example data
fit = lmer(angle ~ recipe + temp + (1|recipe:replicate), cake)

# Model summary
summary(fit)

# lme4 profile method confidence intervals
confint(fit)

# Parametric bootstrap of the fixed effects (bootMer's default type)
boot.out = bootMer(fit, fixef, nsim=1000) # nsim determines p-value decimal places

# Two-sided p-value: twice the smaller share of bootstrap estimates on either side of zero
p = rbind(
  (1-apply(boot.out$t<0, 2, mean))*2,
  (1-apply(boot.out$t>0, 2, mean))*2)
apply(p, 2, min)

# Alternative "pipe" syntax
library(magrittr)

lmer(angle ~ recipe + temp + (1|recipe:replicate), cake) %>%
  bootMer(fixef, nsim=100) %$%
  rbind(
    (1-apply(t<0, 2, mean))*2,
    (1-apply(t>0, 2, mean))*2) %>%
  apply(2, min)
```
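To make the p-value step easier to see on its own, here is a minimal sketch that applies the same calculation to a plain matrix of bootstrap estimates. The helper name `boot_p` and the simulated matrix `t_sim` are made up for illustration and are not part of lme4. The cap at 1 is my addition, since doubling the smaller tail can otherwise exceed 1.

```r
# Two-sided bootstrap p-values from a matrix of bootstrap estimates
# (rows = bootstrap replicates, columns = coefficients).
# boot_p is a hypothetical helper, not an lme4 function.
boot_p = function(t_mat) {
  p_lower = colMeans(t_mat <= 0)  # share of replicates at or below zero
  p_upper = colMeans(t_mat >= 0)  # share of replicates at or above zero
  # Double the smaller tail; cap at 1 (an addition to the original calculation)
  pmin(2 * pmin(p_lower, p_upper), 1)
}

# Simulated stand-in for boot.out$t: one estimate far from zero, one centred on it
set.seed(1)
t_sim = cbind(far = rnorm(1000, mean = 3), near = rnorm(1000, mean = 0))
boot_p(t_sim)
```

With a fitted model, the same helper could be called as `boot_p(boot.out$t)`. A result of exactly 0 only means that no replicate crossed zero, so it is safer to report it as smaller than 1/nsim.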

Prediction is very difficult, especially about the future

This post was originally published here

As Niels Bohr, the Danish physicist, put it, “prediction is very difficult, especially about the future”. Prognostic models are commonplace and seek to help patients and the surgical team estimate the risk of a specific event, for instance, the recurrence of disease or a complication of surgery. “Decision-support tools” aim to help patients make difficult choices, with the most useful providing personalized estimates to assist in balancing the trade-offs between risks and benefits. As we enter the world of precision medicine, these tools will become central to all our practice.

In the meantime, there are limitations. Overwhelming evidence shows that the quality of reporting of prediction model studies is poor. In some instances, the details of the actual model are considered commercially sensitive and are not published, making the assessment of the risk of bias and potential usefulness of the model difficult.

In this edition of HPB, Beal and colleagues aim to validate the American College of Surgeons National Surgical Quality Improvement Program (ACS NSQIP) Surgical Risk Calculator (SRC) using data from 854 gallbladder cancer and extrahepatic cholangiocarcinoma patients from the US Extrahepatic Biliary Malignancy Consortium. The authors conclude that the “estimates of risk were variable in terms of accuracy and generally calculator performance was poor”. The SRC underpredicted the occurrence of all examined end-points (death, readmission, reoperation and surgical site infection) and discrimination and calibration were particularly poor for readmission and surgical site infection. This is not the first report of predictive failures of the SRC. Possible explanations cited previously include small sample size, homogeneity of patients, and too few institutions in the validation set. That does not seem to be the case in the current study.

The SRC is a general-purpose risk calculator and while it may be applicable across many surgical domains, it should be used with caution in extrahepatic biliary cancer. It is not clear why the calculator does not provide measures of uncertainty around estimates. This would greatly help patients interpret its output and would go a long way to addressing some of the broader concerns around accuracy.

Radical but conservative liver surgery

This post was originally published here

Cutting-edge liver surgery is often associated with modern technology such as the robot. In this edition of HPB, Torzilli and colleagues provide a fascinating account of 12 years of “radical but conservative” open liver surgery.

This is extreme parenchymal-sparing hepatectomy (PSH) in 169 patients with colorectal liver metastases. In all cases, tumour was touching or infiltrating portal pedicles or hepatic veins, a situation where most surgeons would advocate a major hepatectomy where possible. The PSH by its nature results in a 0 mm resection margin when the vessel is preserved, which was the aim in many of these procedures. Although this is off-putting, the cut-edge recurrence rate was no higher than average.

PSH in the form of “easy atypicals” is performed by all HPB surgeons. There are two main differences here. First is the aim to detach tumours from intrahepatic vascular structures. For instance, hepatic veins in contact with tumour were preserved and only resected if infiltrated. Even then, they were tangentially incised if possible and reconstructed with a bovine pericardial patch. Second is the careful attention paid to identifying and using communicating hepatic veins. This is well described but used extensively here to allow complete resection of segments while avoiding congestion in the draining region.

Short-term mortality and morbidity rates are comparable with other published series. A median survival of 36 months and 5-year overall survival of around 30% are reasonable given some of these patients may not be offered surgery in certain centres. The authors describe the parenchymal-sparing approach “failing” in 14 (10%) patients: 7 (5%) had recurrence at the cut edge and 8 (6%) within segments which would have been removed using a standard approach. 44% of the 55 patients with liver-only recurrence underwent re-resection.

This is not small surgery. The average operating time is 8.5 h with the longest taking 18.5 h. The 66% thoracotomy rate is also notable in an era of minimally invasive surgery and certainly differs from my own practice. This study is challenging and I look forward to the debates that should arise from it.

Effect of day of the week on mortality after emergency general surgery

This post was originally published here

Our latest paper published in the BJS describes short- and long-term outcomes after emergency surgery in Scotland. We looked for a weekend effect and didn’t find one.

• In around 50,000 emergency general surgery patients, we didn’t find an association between day of surgery or day of admission and death rates;
• In around 100,000 emergency surgery patients including orthopaedic and gynaecology procedures, we didn’t find an association between day of surgery or day of admission and death rates;
• In around 500,000 emergency and planned surgery patients, we didn’t find an association between day of surgery or day of admission and death rates.

We also found that emergency surgery performed at weekends, or in those admitted at weekends, was performed a little more quickly than on weekdays.

More details can be found here:

Press coverage

Broadcast: BBC GOOD MORNING SCOTLAND, HEART FM

Print: DAILY TELEGRAPH, DAILY MIRROR, METRO, HERALD, HERALD (Leader), SCOTSMAN, THE NATIONAL, YORKSHIRE POST, GLASGOW EVENING TIMES

Publishing mortality rates for individual surgeons

This post was originally published here

This is our new analysis of an old topic. In Scotland, individual surgeon outcomes were published as far back as 2006. It wasn’t pursued in Scotland, but has been mandated for surgeons in England since 2013.

This new analysis took the current mortality data and sought to answer a simple question: how useful is this information in detecting differences in outcome at the individual surgeon level?

Well, the answer, in short, is not very useful.

We looked at mortality after planned bowel and gullet cancer surgery, hip replacement, and thyroid, obesity and aneurysm surgery. Death rates are relatively low after planned surgery, which is testament to hard-working NHS teams up and down the country. This, together with the fact that individual surgeons perform a relatively small proportion of all these procedures, means that death rates are not a good way to detect underperformance.

At the mortality rates reported for thyroid (0.08%) and obesity (0.07%) surgery, it is unlikely a surgeon would perform a sufficient number of procedures in his/her entire career to stand a good chance of detecting a mortality rate 5 times the national average.
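To get a feel for the numbers, here is a rough sketch of that kind of calculation. It is not the method used in the paper. It assumes a one-sided exact binomial test at the 5% level and asks for 80% power, both of which are my illustrative choices, and the function name `power_at_n` is made up. Only the two mortality rates and the 5-fold ratio come from the text above.

```r
# National mortality rates quoted above, as proportions
p0 = c(thyroid = 0.0008, obesity = 0.0007)

# Power of a one-sided exact binomial test to flag a surgeon whose true
# mortality is `ratio` times the national rate, after n procedures.
# alpha = 0.05 and ratio = 5 are illustrative assumptions.
power_at_n = function(n, p0, ratio = 5, alpha = 0.05) {
  # Smallest death count k with P(X >= k | national rate) <= alpha
  k = qbinom(1 - alpha, n, p0) + 1
  # Chance of seeing at least k deaths at the elevated rate
  pbinom(k - 1, n, ratio * p0, lower.tail = FALSE)
}

# Smallest caseload at which power first reaches 80%
# (exact binomial power is not monotone in n, hence "first")
n_grid = 1:10000
n_needed = sapply(p0, function(p) {
  n_grid[which(power_at_n(n_grid, p) >= 0.8)[1]]
})
n_needed
```

Comparing `n_needed` with the number of these operations a surgeon might realistically perform over a career shows how little information individual death rates carry for such low-risk procedures.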

Surgeon death rates are problematic in more fundamental ways. It is the 21st century and much of surgical care is delivered by teams of surgeons, other doctors, nurses, physiotherapists, pharmacists, dieticians etc. In liver transplantation it is common for one surgeon to choose the donor/recipient pair, for a second surgeon to do the transplant, and for a third surgeon to look after the patient after the operation. Does it make sense to look at the results of individuals? Why not of the team?

It is also important to ensure that analyses adequately account for the increased risk faced by some patients undergoing surgery. If my granny has had a heart attack and has a bad chest, I don’t want her to be deprived of much-needed surgery because a surgeon is worried that her high risk might impact on the public perception of their competence. As Harry Burns, the former Chief Medical Officer of Scotland, said, those with the highest mortality rates may be the heroes of the health service, taking on patients with difficult disease that no one else will face.

We are only now beginning to understand the results of surgery using measures that are more meaningful to patients. These sometimes get called patient-centred outcome measures. Take a planned hip replacement: the aim of the operation is to remove pain and increase mobility. If after 3 months a patient still has significant pain and can’t get out for the groceries, the operation has not been a success. Thankfully death after planned hip replacement is relatively rare and in any case, might have little to do with the quality of the surgery.

Transparency in the results of surgery is paramount and publishing death rates may be a step towards this, even if they may in fact be falsely reassuring. We must use these data as part of a much wider initiative to capture the success and failures of surgery. Only by doing this will we improve the results of surgery and ensure every patient receives the highest quality of care.

Press coverage

Print:

• New Scientist
• Scotsman
• Daily Mail
• Express
• the I

All the transplant statistics you need

This post was originally published here

If you have a hunger for statistics on organ transplantation, check out NHS Blood and Transplant. They are regularly updated and reflect what is actually happening in UK transplantation today. We should have a competition for novel ways of presenting these visually. Ideas?!