WordPress to Jekyll
I am slowly converting this and my other blogs from WordPress to Jekyll. One of the first things I needed to handle was converting all of the posts from WordPress into Markdown for use by Jekyll.
Jekyll itself provides a process for importing, but I was initially displeased with the results. I want my posts exported into Markdown files so I can keep them in a simple plaintext format that can be post-processed into a variety of typeset forms, whether online or perhaps in print. The default importer only outputs HTML.
In all honesty, I’m not sure why I’m using Jekyll. The Ruby dependency ecosystem always seems like such a pain to me. Dependencies not automatically resolving. Things breaking from one system to the next. But I don’t really know of any other big-name static site generators in other languages. I’d do a Python one in a heartbeat.
So, for my own future reference, this is the process I went through to get my posts out of WordPress and into Markdown:
1. Export Content from WordPress
WordPress has an export tool available when you are logged in to the admin dashboard. By selecting “All content,” I can get everything from the site in one massive XML file. This gets us a little closer.
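Before converting anything, it can be worth a quick look at what the export actually contains. The following is a minimal sketch, assuming the export follows the usual layout of an RSS feed whose `item` elements each carry a `wp:post_type` child. The `SAMPLE` document, the function names, and the namespace URI shown are illustrative stand-ins, not a real export.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on qualified tags."""
    return tag.rsplit("}", 1)[-1]


def count_post_types(xml_text):
    """Count the <item> elements in an export, grouped by their post_type child."""
    root = ET.fromstring(xml_text)
    counts = Counter()
    for item in root.iter("item"):
        for child in item:
            # Match on the local name so the exact namespace version does not matter.
            if local_name(child.tag) == "post_type":
                counts[(child.text or "").strip()] += 1
    return counts


# Hypothetical, heavily trimmed stand-in for an export file.
SAMPLE = """<rss version="2.0" xmlns:wp="http://wordpress.org/export/1.2/">
  <channel>
    <item><title>First post</title><wp:post_type>post</wp:post_type></item>
    <item><title>Second post</title><wp:post_type>post</wp:post_type></item>
    <item><title>cat.jpg</title><wp:post_type>attachment</wp:post_type></item>
  </channel>
</rss>"""

print(count_post_types(SAMPLE))
```

For a real export, read the file's contents and pass them to `count_post_types` in place of `SAMPLE`.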
2. Ignore Jekyll-Import
Jekyll has a series of importers for popular sources. It even has two for WordPress! I tried both with little satisfaction. They take the exported XML file and spit out HTML copies of our articles. Since I wanted to get back to Markdown, this would require additional post-processing.
3. ExitWP
I stumbled upon a Python tool that does the trick so much better. ExitWP takes the exported XML file and converts all of our articles into *.markdown files.
Follow the instructions to install the dependencies. Dump the XML file into the `wordpress-xml` directory and then run `python exitwp.py`. I found that there were some linting issues in my XML file that caused it to fail. Opening the file in Vim and tracking them down via its XML linting functionality made it pretty simple.
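If you would rather not hunt for the problems in an editor, the same kind of check can be sketched in Python. This is a rough example, assuming the failures are well-formedness errors such as a mismatched tag or an unescaped `&`. It uses the standard library's `xml.etree.ElementTree`, which only checks that the XML is well formed, and `find_xml_error` is a name made up for this sketch.

```python
import xml.etree.ElementTree as ET


def find_xml_error(xml_text):
    """Return (line, column, message) for the first parse error, or None if the text parses."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        # ParseError.position is a (line, column) tuple pointing at the failure.
        line, column = err.position
        return line, column, str(err)
    return None


# Small inline examples instead of a real export file.
print(find_xml_error("<a><b></b></a>"))      # well formed
print(find_xml_error("<a>\n<b>\n</a>"))      # <b> is never closed
print(find_xml_error("<a>Tom & Jerry</a>"))  # bare ampersand
```

The parser stops at the first error, so the loop is: fix the reported line, run it again, and repeat until it returns `None`.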
4. Copy Your Images Directory
Unfortunately, you are still left copying the images directory and manually updating the links to images to get things working. This isn’t a major problem for me, since a migration entails a lot of additional overhead anyway if you want to do it right: 301 redirects, image updates, cleaning up posts.
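The link updates do not have to be entirely manual. Here is a minimal sketch, assuming the images appear in the converted files as ordinary Markdown image links (`![alt](url)`) and that they all share one URL prefix. The `example.com` address and the `/images/` target are hypothetical, and `rewrite_image_links` is just a name chosen for this example.

```python
import re


def rewrite_image_links(markdown, old_prefix, new_prefix):
    """Point Markdown image links that start with old_prefix at new_prefix instead."""
    # Capture the '![alt](' part so it is kept; only the URL prefix is swapped.
    pattern = re.compile(r"(!\[[^\]]*\]\()" + re.escape(old_prefix))
    # A function replacement keeps backslashes in new_prefix from being read as escapes.
    return pattern.sub(lambda match: match.group(1) + new_prefix, markdown)


# Hypothetical post body; only the image link should change.
post = (
    "![Cat](http://example.com/wp-content/uploads/2013/05/cat.jpg)\n"
    "[A regular link](http://example.com/wp-content/uploads/notes.pdf)\n"
)

print(rewrite_image_links(post, "http://example.com/wp-content/uploads/", "/images/"))
```

Regular links are deliberately left alone here, so anything that should become a 301 redirect can still be reviewed by hand.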