Converting old WordPress posts to Hugo

This post was originally published here

Between 2014 and 2018 I published 29 posts on riinudata.wordpress.com. Today I’m moving all of them to my new website, which is built with blogdown and Hugo.

Step 1

Read the Migration: From WordPress chapter of the blogdown book.

Step 2

Export all your WordPress posts into a single XML file: WP Admin – Tools – Export.
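Before converting anything, it can be reassuring to check that the export holds as many posts as you expect (29 in my case). Below is a minimal sketch of such a check, assuming the export is a standard WordPress WXR file where each item carries wp:post_type and wp:status elements. The function names are my own and are not part of any tool mentioned in this post.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts in front of tag names."""
    return tag.rsplit("}", 1)[-1]


def count_published_posts(xml_path):
    """Count <item> elements that are published posts (not pages, drafts or attachments)."""
    root = ET.parse(xml_path).getroot()
    count = 0
    for item in root.iter("item"):
        # Map each child's local tag name to its text, ignoring the namespace
        # so the exact WXR version of the export does not matter.
        fields = {local_name(child.tag): (child.text or "") for child in item}
        if fields.get("post_type") == "post" and fields.get("status") == "publish":
            count += 1
    return count
```

Calling count_published_posts("riinu_wordpress.xml") should then return the number of published posts in the export.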

Step 3

Install Exitwp and its dependencies (pyyaml, beautifulsoup4, html2text).

This worked on macOS[1] High Sierra; I already had Python installed.

Step 4

Working in the directory that git clone created (exitwp):

  • Put the WordPress XML in the wordpress-xml directory.
  • Run xmllint riinu_wordpress.xml. It worked the first time for me without any errors, so I’m not sure what fixing errors, if there are any, would entail.
  • Back in the exitwp folder, run python exitwp.py.
  • This created the folder build/jekyll/riinudata.wordpress.com/_posts, which holds the converted posts as .markdown files.
  • Move all of these into the exitwp/post folder.
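The checking and file shuffling in this step could also be scripted. Here is a rough Python sketch, assuming the folder names used above. The well-formedness check is only a stand-in for xmllint (it reports whether the file parses, nothing more), and both function names are hypothetical.

```python
import shutil
import xml.etree.ElementTree as ET
from pathlib import Path


def is_well_formed(xml_path):
    """Return True if the file parses as XML; print the parser error otherwise."""
    try:
        ET.parse(xml_path)
    except ET.ParseError as err:
        print(f"{xml_path}: {err}")
        return False
    return True


def collect_posts(posts_dir, target_dir):
    """Move every .markdown file from posts_dir into target_dir and return their names."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    moved = []
    for src in sorted(Path(posts_dir).glob("*.markdown")):
        shutil.move(str(src), str(target / src.name))
        moved.append(src.name)
    return moved
```

From the exitwp folder, this would be used as is_well_formed("wordpress-xml/riinu_wordpress.xml") before running exitwp.py, and collect_posts("build/jekyll/riinudata.wordpress.com/_posts", "post") afterwards.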

Step 5

  • Take a copy of https://github.com/yihui/oldblog_xml/blob/master/convert.R to clean up these .markdown files and get them ready for Hugo. I edited the first three lines, skipped the “Do not run if…” chunk as I’d already done that in Step 3, edited authors = c(), and did not run the very last chunk (local({if (!dir.exist...})).
  • Move all of the files (now .md) into content/post of your blogdown repo. Build and voilà!

Further modifications

Most of my posts converted like a charm, with nicely formatted code blocks and images. But I noticed a few things that I think I have to fix:

  • GitHub gists are now displayed as links; I will make those into code blocks (or embed them using a Hugo shortcode).
  • Most images show up perfectly, but some have gotten stuck in a code block, e.g. showing up as <img src="https://surgicalinformatics.org/wp-content/uploads/2018/02/rplot.png" alt="Rplot"/>. I will sort these out.
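Both fixes could probably be automated with a couple of regular expressions. The sketch below makes some assumptions I have not verified against every post: that a stuck image sits alone on its own (possibly indented) line with src before alt, and that a gist link appears as a bare https://gist.github.com/user/id URL on its own line. The function names are hypothetical, and the gist line targets Hugo's built-in gist shortcode, so check that your Hugo version still ships it.

```python
import re

# A line holding nothing but one <img> tag, optionally indented (which is
# what makes Markdown treat it as a code block).
IMG_LINE = re.compile(
    r'^[ \t]*<img\s+src="(?P<src>[^"]+)"\s+alt="(?P<alt>[^"]*)"\s*/?>[ \t]*$',
    re.MULTILINE,
)

# A line holding nothing but a gist URL, optionally wrapped in <...>.
GIST_LINE = re.compile(
    r"^[ \t]*<?https://gist\.github\.com/(?P<user>[\w-]+)/(?P<gist_id>[0-9a-f]+)>?[ \t]*$",
    re.MULTILINE,
)


def fix_images(text):
    """Replace stand-alone <img> lines with unindented Markdown image syntax."""
    return IMG_LINE.sub(lambda m: f"![{m['alt']}]({m['src']})", text)


def embed_gists(text):
    """Replace stand-alone gist URLs with a Hugo gist shortcode."""
    return GIST_LINE.sub(
        lambda m: "{{< gist %s %s >}}" % (m["user"], m["gist_id"]), text
    )
```

Because the replacement drops the leading indentation, the image is no longer read as code. An image wrapped in a fenced block would still need its fences removed by hand.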

Overall, I feared a lot worse and am super happy with the conversion experience. It took exactly 3 hours.

My name is Hildegard and I approve this message.


  1. I’m only 1.5 years late to discover that OS X has been rebranded as macOS: https://www.wired.com/2016/06/apple-os-x-dead-long-live-macos/
