Kenny McLean and I recently attended rmedicine 2019 in Boston, MA. The conference is aimed at clinicians and non-clinicians who use R for day-to-day research and monitoring of clinical processes.
Day 1 covered two parallel workshops: R Markdown for Medicine and Wrangling Survival Data.
I attended R Markdown for Medicine run by Alison Hill from RStudio. Using .rmd files has become the default for the Surgical Informatics Group and, so it seems, a great number of others who attended rmedicine. Around a third of the presentations at rmedicine covered workflows involving sharing of data via either .rmd files or through shiny, an R package for creating deployable dashboards for data visualisation and interactive exploration.
R Markdown for Medicine
An Overview of Useful Tips and Tricks
R markdown is a document format that allows you to combine narrative text and R code within one document. This means your notes, code, results and plots are all in one place. Code chunks sit between two sets of three backticks, with {r} after the first set. Inline code can also be placed between single backticks, with r (without the curly brackets) straight after the opening backtick, followed by the code. This means that results update automatically: for a trial, the numbers of patients included and excluded only need to be changed in one place and the rest of the text (and / or flowcharts) follows. It is also possible to mix and match chunks of code from other languages.
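As a rough illustration of chunks and inline code working together, the sketch below builds a tiny R markdown document as text and knits it with knitr::knit(). The trial numbers and variable names are made up for the example, and the document is knitted from a character vector rather than a real .rmd file purely to keep the example self-contained.

```r
library(knitr)

# Three backticks, built programmatically so the chunk delimiters are easy to spot
fence <- strrep("`", 3)

# A minimal R markdown document: one code chunk plus a sentence with inline code.
# All numbers are hypothetical.
rmd_lines <- c(
  "Trial summary",
  "",
  paste0(fence, "{r}"),
  "n_screened <- 120",
  "n_excluded <- 15",
  "n_included <- n_screened - n_excluded",
  fence,
  "",
  "We included `r n_included` of `r n_screened` screened patients."
)

# knit() evaluates the chunk and the inline code and returns markdown text
md <- knit(text = rmd_lines, quiet = TRUE)
cat(md)
```

Changing n_excluded in the chunk is enough to update the sentence the next time the document is knitted.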
Parameters are set in the YAML header at the top of the .rmd document. If you set a parameter called data to a default .rda file then this can be changed for other similar files without creating a new document. A really useful example would be when you have multiple hospitals or multiple diseases, each with a separate data file: a report can then be generated for each file. If you use rmarkdown::render() along with purrr::pwalk() you can generate a separate output file for any number of hospitals / diseases / countries / individuals etc. in just a couple of lines of code.
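A minimal sketch of that pattern is below. The file names (report.Rmd, hospital_a.rda and so on) and the parameter name data are hypothetical, and the render step is guarded so that the code does nothing if the template is not present.

```r
# Hypothetical set-up: a parameterised template called report.Rmd whose YAML
# header declares a default data file, for example:
#   params:
#     data: "hospital_a.rda"
hospitals <- c("hospital_a", "hospital_b", "hospital_c")

# One element per report: the output file name and the parameters to override
jobs <- list(
  output_file = paste0("report_", hospitals, ".html"),
  params = lapply(hospitals, function(h) list(data = paste0(h, ".rda")))
)

# pwalk() calls render() once per hospital, matching list names to arguments.
# Only attempted if the (hypothetical) template exists in the working directory.
if (file.exists("report.Rmd")) {
  purrr::pwalk(jobs, rmarkdown::render, input = "report.Rmd")
}
```

Inside the template the current file would then be available as params$data, for example in a call to load().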
Use Helper Packages
There are some great .rmd helper packages that improve the workflow, improve the rendering of documents and generally make life easier.
bookdown allows several .rmd documents to be combined into a book but is also useful for single documents. Using bookdown::html_document2 in the output field of the YAML header is designed to improve cross-referencing of tables and figures compared to the default versions.
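To give a flavour of the cross-referencing, here is a small sketch written as a plain R script in knitr's spin format (text in #' comments, chunk options after #+), which rmarkdown::render() can render in the same way as an .rmd file. The spin format is used here only to keep the example in R; the chunk label accrual and the data are invented for illustration.

```r
#' ---
#' title: "Accrual report"
#' output: bookdown::html_document2
#' ---
#'
#' Accrual over time is shown in Figure \@ref(fig:accrual).

#+ accrual, fig.cap = "Hypothetical monthly accrual."
# The chunk label ("accrual") plus a figure caption is what makes the
# \@ref(fig:accrual) reference above resolve to a figure number.
accrual <- data.frame(month = 1:6, patients = c(3, 8, 15, 21, 30, 34))
plot(accrual$month, accrual$patients, type = "b",
     xlab = "Month", ylab = "Patients recruited")
```

Saved as, say, accrual.R, this could be rendered with rmarkdown::render("accrual.R"), assuming bookdown is installed.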
wordcountaddin performs an accurate word count that ignores the YAML header, code etc., and does so without knitting the document. This is much easier than knitting the document and then performing a word count!
citr allows automated insertion of markdown citations to assist with referencing. Check out my earlier blog on referencing to get an idea of how to set up .bib files. I may add another blog on this topic, watch this space!
xaringan is a useful package for creating HTML presentations with high levels of customisation. It is possible to use an additional .css file for even greater customisation and styling of your slides, but xaringan itself offers a great many user-friendly options.
distill appears to be good at supporting mobile-friendly web publishing for scientific communication with flexible figure layouts, table pagination,
LaTeX math support and incorporation of
There are countless other helper packages, with more likely on their way. Many allow additional aesthetic modification of the output documents and may allow you to run R code rather than modifying a .css file.
List Numbering the Lazy Way
List numbering in .rmd works without needing to manually enter the correct numbers. Just make a list where every element begins with 1. and .rmd will transform it into an appropriately-numbered list. Great if you need to add a new element to the middle of the list later!
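The behaviour can be checked quickly from R. The sketch below uses the commonmark package as a stand-in markdown renderer (R markdown itself uses pandoc, which treats such lists the same way); the list items are made up.

```r
library(commonmark)

# Every item starts with "1." and the renderer works out the numbering
steps <- c(
  "1. Screen patients",
  "1. Obtain consent",
  "1. Randomise",
  "1. Follow up"
)

# Convert the markdown list to HTML: the result is a single ordered list
html <- markdown_html(paste(steps, collapse = "\n"))
cat(html)
```

Inserting a new item between "Obtain consent" and "Randomise" needs no renumbering of the lines that follow.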
Multiple Plots in a Grid
I’ve previously come across
patchwork as a way to plot several plots into a grid which could be 1×2, 2×2, 3 in one column and one in the other etc. There are also two other packages
egg. I haven’t explored the differences between them but if you find that one doesn’t give you the exact customisation or alignment you need then possibly try another one.
cowplot looks as if it might perform better at overlaying plots on top of another and at exact axis line matching.
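As a brief sketch of the patchwork approach, the layouts mentioned above can be written with its +, / and | operators. The plots use the built-in mtcars data purely as placeholders.

```r
library(ggplot2)
library(patchwork)

# Four placeholder plots from the built-in mtcars data
p1 <- ggplot(mtcars, aes(wt, mpg)) + geom_point()
p2 <- ggplot(mtcars, aes(factor(cyl), mpg)) + geom_boxplot()
p3 <- ggplot(mtcars, aes(hp)) + geom_histogram(bins = 10)
p4 <- ggplot(mtcars, aes(disp, qsec)) + geom_point()

# 2x2 grid: "+" places plots side by side, "/" stacks rows
grid_2x2 <- (p1 + p2) / (p3 + p4)

# Three plots stacked in the left column, one plot in the right column
three_and_one <- (p1 / p2 / p3) | p4
```

Printing grid_2x2 or three_and_one draws the combined figure.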
Use the here package to help with file paths
here is a great package for swapping between Windows and Mac file paths (no more swapping backslashes and forward slashes!). Using here::here() will default to looking for a file in the .Rproj directory rather than the .rmd directory, which is the default otherwise. This is great if you want to have multiple .rmd documents, each in their own sub-directory, with a shared data file in the parent directory.
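A short sketch of how this looks in practice is below. The data folder and the file name hospital_a.rda are hypothetical.

```r
library(here)

# Build a path from the project root, whichever sub-directory the .rmd lives in.
# Path components are passed separately, so no slashes need to be typed.
data_path <- here("data", "hospital_a.rda")
data_path

# Load the file only if it actually exists (it is hypothetical here)
if (file.exists(data_path)) {
  load(data_path)
}
```

The same line works unchanged on Windows and Mac, and from any .rmd document within the project.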
Customise Code Outputs
R markdown allows customisation of the appearance of code and its output. Some of this can be done through modifying a .css file but there are some simpler ways to make basic changes. Try adding comment = "#>" to knitr::opts_chunk$set() to customise the prefix shown in front of each line of code output in your document.
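For example, a set-up chunk near the top of the document might contain something like the following minimal sketch.

```r
library(knitr)

# Prefix every line of printed output with "#>" instead of knitr's default "##"
opts_chunk$set(comment = "#>")

# Confirm the option that later chunks will inherit
opts_chunk$get("comment")
```

The option applies to every chunk knitted after it is set, and can still be overridden in an individual chunk header.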
Word Document Creation Tips
R markdown is generally great for HTML and PDF formats. The options for knitting to Word are not as well developed but there are some good options. The bookdown package is useful as discussed. The redoc package has been used to facilitate conversion to and from Word. I have not tried it personally, but if it can print out to Word and then handle tracked changes back into markdown then it could be very useful.
For converting more complex tables and figures to Word, an option is to knit to rtf (rich text format) and then open the rtf file in Word. This tends to be very good at keeping the desired formatting.
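One way to do this from the console is sketched below, using the rtf_document output format that ships with rmarkdown. The file name report.Rmd is hypothetical and the call is guarded so it only runs if that file exists.

```r
# Hypothetical input document
input_file <- "report.Rmd"

# Render to rich text format; the resulting .rtf file can be opened in Word.
# Setting "output: rtf_document" in the YAML header is an alternative.
if (file.exists(input_file)) {
  rmarkdown::render(input_file, output_format = "rtf_document")
}
```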
Future updates – hopefully!
R markdown is a great resource although there are a handful of minor issues which are currently difficult to resolve. One of the main problems I find is with tables and cross-referencing. I really like the syntax and customisation of the gt package but at present it appears that cross-referencing in a way which works across HTML, PDF and Word outputs is not supported. This is a great opportunity to submit a pull request if you think you can get this to work.
Other Useful rmedicine Packages and Ideas
survival Package Update
The latest version (version 3.0) of the survival package was presented by Terry Therneau and is now available on GitHub. Over 650 downstream packages depend on it. The latest version allows for multiple observations per subject, multiple endpoints per subject and multiple types of endpoint. This will be particularly useful for competing risks analyses e.g. outcomes for liver transplant patients (transplanted, still on list, removed from list as no longer eligible or died).
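As a rough sketch of what a competing risks analysis of that kind might look like, the example below fits a multi-state curve to invented waiting-list data. It assumes survival version 3.0 or later, where a factor event passed to Surv() is treated as multiple endpoint types with the first level as censoring; treating "still on list" as the censored state is my own assumption for the example.

```r
library(survival)

# Invented waiting-list data: one row per patient, time in months
waitlist <- data.frame(
  months = c(2, 5, 7, 3, 12, 9, 4, 12, 6, 10),
  status = factor(
    c("transplanted", "died", "transplanted", "removed", "on_list",
      "transplanted", "died", "on_list", "removed", "transplanted"),
    # The first level is taken as censoring (here: still on the list)
    levels = c("on_list", "transplanted", "removed", "died")
  )
)

# A factor event makes Surv() a multi-state response, so survfit() estimates
# the probability of being in each state over time
fit <- survfit(Surv(months, status) ~ 1, data = waitlist)

fit$states        # names of the states being tracked
head(fit$pstate)  # estimated probability in each state at the first event times
```

Calling plot(fit) would draw one curve per endpoint.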
Keep an eye out for Kenny McLean’s blog where he plans to cover the survival package and many other useful packages presented at rmedicine 2019.
hreport Automated Trial Reporting
hreport by Frank Harrell (currently available on GitHub) is for automated reporting of trials and studies with generation of interactive HTML graphs based on plotly. Several aspects of a study can be rendered easily into plots demonstrating accrual, exclusions, descriptive statistics, adverse events and time-to-event data. Another key theme of rmedicine 2019 appears to have been the use of plotly or similar packages to enable interaction with data.
timevis – interactive timelines
timevis allows generation of highly interactive timeline plots which allow zooming, adding or removal of events, resizing, etc.
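A minimal sketch of a timeline is below. The patient events and dates are invented; the column names (id, content, start, end) are the ones timevis() expects in its data argument.

```r
library(timevis)

# Invented events for a single patient; "end" is NA for point events
events <- data.frame(
  id = 1:4,
  content = c("Referral", "Listed", "Transplant", "Follow-up"),
  start = c("2019-01-10", "2019-02-01", "2019-06-15", "2019-07-01"),
  end = c(NA, NA, NA, "2019-12-31")
)

# Create the interactive timeline widget
timeline <- timevis(events)
```

Printing timeline in an interactive session, or including it in an HTML .rmd document, shows the zoomable timeline.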
For projects that require a number of packages and then need to be shared with a colleague, holepunch provides a quick method for generating a list of dependencies and a Dockerfile. The package creates a link for another user to open a free RStudio server with all of the required packages installed. This may be useful for troubleshooting in a department and showing code examples.
rmedicine 2019 has shown that clinical researchers are moving increasingly towards literate programming, interactive visualisations and automated workflows using R and R markdown.
The conference was a great mix of methods presentations and data presentations from R users. You definitely don’t need any in-depth knowledge of R to benefit from it and I’d highly recommend booking for rmedicine 2020.