Blog Migration

Like all great, wild animals, the humble weblog must occasionally embark upon the mighty journey of a migration.

Footage of me migrating the blog

The migration of my old site to this one is (pretty much) complete!

Old, migrated content is now available on the old posts page.

The process was surprisingly painless, actually. Knowing WordPress, I thought that it would take hours - most of which would be waiting for scripts to finish running - and would be very tedious. To my utter shock, the process took barely a single hour - most of which was me fussing over image sizes. In any case, I thought it would make a nice first blog post to talk about my process of migrating the site over.

Migrating Posts

Migrating posts was the easiest part of all. WordPress has a built-in export feature which lets you download the whole site’s content as a single XML archive. That file can then be imported by another blog, should you wish to move all the content between two WordPress installs. The format contains all posts, pages, images and settings - as well as references to the installed plugins on the WordPress workshop (I can’t remember what it’s actually called so let’s call it the workshop anyway).

This is your brain on Garry’s Mod

Anyway, I was already breaking out awk and perl to try extracting posts when I found this project on GitHub which saved me a couple of hours. After a surprisingly short download time waiting for npm to finish (only eleven whole dependencies! wow!), I ran the tool and pretty much everything fell into place. Nice one!
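For a sense of what extracting posts by hand would have involved, here is a minimal Python sketch. It is not how the GitHub tool works. It assumes the export is an RSS-style file in which each item carries a title, a post type and an encoded HTML body, and the sample data is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Tiny made-up sample in the general shape of a WordPress export.
SAMPLE = """<rss xmlns:wp="http://wordpress.org/export/1.2/"
     xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <item>
      <title>Hello</title>
      <wp:post_type>post</wp:post_type>
      <content:encoded>&lt;p&gt;First post&lt;/p&gt;</content:encoded>
    </item>
    <item>
      <title>About</title>
      <wp:post_type>page</wp:post_type>
      <content:encoded>&lt;p&gt;About me&lt;/p&gt;</content:encoded>
    </item>
  </channel>
</rss>"""


def local_name(tag):
    """Drop the '{namespace}' prefix ElementTree puts on qualified tags."""
    return tag.rsplit("}", 1)[-1]


def extract_posts(xml_text):
    """Return (title, html_body) pairs for every item whose type is 'post'."""
    root = ET.fromstring(xml_text)
    posts = []
    for item in root.iter("item"):
        # Map each child element's un-namespaced name to its text.
        fields = {local_name(child.tag): (child.text or "") for child in item}
        if fields.get("post_type") == "post":
            posts.append((fields.get("title", ""), fields.get("encoded", "")))
    return posts


if __name__ == "__main__":
    for title, body in extract_posts(SAMPLE):
        print(title, "->", body)
```

Matching on the un-namespaced tag name keeps the sketch independent of the exact namespace version in the file. Turning the HTML bodies into markdown is the part the tool really saves you from.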

Broken Images

I began testing the markdown files locally using the hugo server command as per usual. My worry was that the markdown generator had left absolute links pointing at media stored on the old domain. Given that I planned on eventually taking the old site down, I couldn’t have all the old content randomly breaking.

Sure enough, loading the markdown files with the old server off led to all the images breaking.

I wrote a quick script that extracted each link and rewrote it to the local path that the tool had already created in the directory. Helpfully, the filenames remained the same - only the path had changed.
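A rewrite along those lines might look like the following Python sketch. It is not the original script: the old host name, the local directory and the list of image extensions are all placeholders chosen for illustration.

```python
import re

OLD_HOST = "old.example.com"  # placeholder for the old domain
LOCAL_DIR = "images"          # placeholder for the directory the tool created

# Absolute URLs on the old host that end in a common image extension.
MEDIA_URL = re.compile(
    r"https?://" + re.escape(OLD_HOST) + r"/[^\s)\"'>]+\.(?:png|jpe?g|gif|webp)",
    re.IGNORECASE,
)


def rewrite_links(markdown):
    """Point old-domain image URLs at the local copy, keeping the filename."""
    def to_local(match):
        filename = match.group(0).rsplit("/", 1)[-1]
        return f"{LOCAL_DIR}/{filename}"

    return MEDIA_URL.sub(to_local, markdown)


if __name__ == "__main__":
    text = "![cat](https://old.example.com/wp-content/uploads/2020/01/cat.png)"
    print(rewrite_links(text))
```

This only works because the filenames stayed the same. If two uploads in different folders shared a name, the sketch would point both at the same local file.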

Resizing Images

The next image-based problem was that WordPress seemed to be using CSS to resize images on the fly, so the files themselves were far bigger than they needed to be on the page. So, I wrote a quick shell script that looped through each image file it found in the content directory and used ImageMagick to shrink them to fit a smaller bounding box. I placed a hard limit on the width that images could be to aid in the flow of the site, but left the height unbounded so that the aspect ratio was preserved.
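The original was a shell script, but the same idea can be sketched in Python. The 800 pixel limit, the list of extensions and the use of ImageMagick's mogrify command (which overwrites files in place) are assumptions for illustration, so run it on a copy first.

```python
import subprocess
from pathlib import Path

MAX_WIDTH = 800  # assumed limit; the real value is not stated in the post
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".gif"}


def resize_command(path, max_width=MAX_WIDTH):
    """Build the ImageMagick call for one file.

    The geometry "800x>" caps the width at 800, leaves the height free so
    the aspect ratio is kept, and the trailing '>' means only images that
    are larger than the limit get shrunk.
    """
    return ["mogrify", "-resize", f"{max_width}x>", str(path)]


def find_images(content_dir):
    """Every image file under content_dir, searched recursively."""
    return sorted(
        p for p in Path(content_dir).rglob("*")
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


def resize_all(content_dir, dry_run=True):
    """Return the commands for every image, running them unless dry_run."""
    commands = [resize_command(p) for p in find_images(content_dir)]
    if not dry_run:
        for command in commands:
            subprocess.run(command, check=True)
    return commands
```

Calling resize_all("content") with the default dry_run=True only lists what would be run, which is a cheap way to check the file selection before touching anything.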

Double Checking

Before I wrapped everything up and called it a day, I went through each post individually just to double check. There were a couple of images where I couldn’t figure out a good way to resize them without it looking ridiculous, so I cropped them down a bit.

And with that, the old posts were migrated across!

Old site redirect

For the moment, the old site will remain up but with an HTTP redirect in place to send people to the new location. I set this to a permanent redirect from the root and nowhere else. The idea is that people following old links will get a 404 and then try going to the root, which sends them here. This isn’t the best method and I would like to set up redirects for everything, but I think this is the most convenient way for a small-scale blog to migrate. Sorry to anybody with broken links!
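To make the difference between the two approaches concrete, here is a small Python sketch of the decision each one makes for a request path. It models the behaviour only and is not real server configuration. The domain and the example path mapping are placeholders.

```python
NEW_SITE = "https://new.example.com"  # placeholder, not the real domain


def root_only(path):
    """Current setup: permanent redirect from the root, 404 everywhere else."""
    if path == "/":
        return 301, {"Location": NEW_SITE + "/"}
    return 404, {}


def redirect_everything(path, moved):
    """Fuller alternative: 'moved' maps each old path to its new path."""
    if path == "/":
        return 301, {"Location": NEW_SITE + "/"}
    if path in moved:
        return 301, {"Location": NEW_SITE + moved[path]}
    return 404, {}


if __name__ == "__main__":
    moved = {"/2020/01/hello/": "/posts/hello/"}  # hypothetical mapping
    print(root_only("/2020/01/hello/"))
    print(redirect_everything("/2020/01/hello/", moved))
```

The second version needs a mapping entry for every old post, which is exactly the tedious part that the root-only redirect avoids.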

Email

I am yet to migrate email properly, which means I am currently operating between split mailboxes. That’s a bit annoying, but apparently there is a method to forward mail from one domain to another using MX records. The only thing I worry about is that it might get complicated with how DMARC and email keys work. To be honest, I just used a script to set up my current email signing stuff.

One annoying detail to add on to this is that my email infrastructure is a bit less flexible than I thought because my hosting provider refuses to provide reverse DNS (PTR records) for IPv6 addresses. I “fixed” this by restricting the address family that Dovecot is allowed to use to Inet4. This stops outgoing mail being rejected by Gmail, among others.

Summary

So, in summary:

  • WordPress is pretty easy to migrate to markdown
  • You should migrate your WordPress site to markdown
  • The site is pretty much migrated

Cheers and see you in the next one!

Ethan Marshall

A programmer who, to preserve his sanity, took refuge in electrical engineering. What an idiot.


First Published 2024-09-10

Categories: [ tech ]

Tags: [ personal web migration ]