For some years I have been running a tiny site on a really basic custom-built CMS. It has around 200 posts, so it's not a huge site, but that is too much to migrate to a new host by hand. I use WordPress for some other sites and really like it, so I decided to look into migrating the data from my custom CMS to WordPress (and hosting it on wordpress.com). I also wanted to replace some low-resolution images with higher-quality versions of the same images (automatically, if possible).
I decided to export all posts on my old site to an RSS file and then convert that to a WordPress WXR file (which is an RSS file with some extra tags). I couldn't find any specification for the WXR format, but it's really simple, so I'll just write down some notes.
The first part is the XML declaration and the blog description. It should look something like this:
http://localhost/~ola/wp
The Car Numberplate Game
Sat, 27 Sep 2008 16:08:57 +0000
WordpressImportFileGenerator 0.1
en
1.0
http://platespotting.wordpress.com
http://platespotting.wordpress.com
photo<![CDATA[photo]]>
plate<![CDATA[plate]]>
uncategorized<![CDATA[Uncategorized]]>
photo<![CDATA[photo]]>
plate<![CDATA[plate]]>
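As a rough illustration, here is a minimal Python sketch of how a header like this could be generated. It is not the .NET tool from this post. The element names (`wp:wxr_version`, `wp:base_site_url`, `wp:category` and so on) and the namespace URI are assumptions based on what WXR 1.0 exports usually contain, so check them against a file exported from your own WordPress install. Treating "The Car Numberplate Game" as the description is also a guess. ElementTree escapes text instead of writing CDATA sections, which is equivalent XML.

```python
import xml.etree.ElementTree as ET

# Assumed WXR 1.0 namespace; verify against a real WordPress export.
WP_NS = "http://wordpress.org/export/1.0/"
ET.register_namespace("wp", WP_NS)


def wp(tag):
    """Return a namespace-qualified name for a wp:* element."""
    return "{%s}%s" % (WP_NS, tag)


def build_channel(link, description, pub_date, site_url, categories, tags):
    """Build the rss/channel header. categories and tags are (slug, name) pairs."""
    rss = ET.Element("rss", {"version": "2.0"})
    channel = ET.SubElement(rss, "channel")
    for name, value in [
        ("link", link),
        ("description", description),
        ("pubDate", pub_date),
        ("generator", "WordpressImportFileGenerator 0.1"),
        ("language", "en"),
    ]:
        ET.SubElement(channel, name).text = value
    ET.SubElement(channel, wp("wxr_version")).text = "1.0"
    ET.SubElement(channel, wp("base_site_url")).text = site_url
    ET.SubElement(channel, wp("base_blog_url")).text = site_url
    for slug, name in categories:
        cat = ET.SubElement(channel, wp("category"))
        ET.SubElement(cat, wp("category_nicename")).text = slug
        ET.SubElement(cat, wp("cat_name")).text = name
    for slug, name in tags:
        tag = ET.SubElement(channel, wp("tag"))
        ET.SubElement(tag, wp("tag_slug")).text = slug
        ET.SubElement(tag, wp("tag_name")).text = name
    return rss


rss = build_channel(
    link="http://localhost/~ola/wp",
    description="The Car Numberplate Game",
    pub_date="Sat, 27 Sep 2008 16:08:57 +0000",
    site_url="http://platespotting.wordpress.com",
    categories=[("photo", "photo"), ("plate", "plate"),
                ("uncategorized", "Uncategorized")],
    tags=[("photo", "photo"), ("plate", "plate")],
)
xml_text = ET.tostring(rss, encoding="unicode")
```

The item elements for posts and attachments would then be appended to the `channel` element before the tree is written to disk.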
Then each blog post, page or attachment is wrapped in <item> tags:
http://localhost/~ola/wp/?p=190
Sat, 16 Aug 2008 21:19:15 +0000
<![CDATA[admin]]>
<![CDATA[photo]]>
<![CDATA[plate]]>
<![CDATA[photo]]>
<![CDATA[photo]]>
<![CDATA[plate]]>
<![CDATA[plate]]>
http://localhost/~ola/wp/?p=190
<![CDATA[
A house in our neighbourhood caught fire so we went to look and on our way I found 181.
Enjoy!]]>
190
2008-08-16 23:19:15
2008-08-16 21:19:15
open
open
181
publish
0
0
post
location
nacka, stockholm
googlemapurl
http://googlemapuprl.com/testurl
http://localhost/~ola/wp/?attachment_id=191
Sun, 21 Sep 2008 16:28:12 +0000
<![CDATA[admin]]>
<![CDATA[Uncategorized]]>
http://localhost/~ola/wp/wp-content/uploads/2008/09/181_080816.jpg
<![CDATA[]]>
191
2008-09-21 18:28:12
2008-09-21 16:28:12
open
open
181_080816
inherit
190
0
attachment
http://localhost/~ola/wp/wp-content/uploads/2008/09/181_080816.jpg
_wp_attached_file
/home/ola/public_html/wp/wp-content/uploads/2008/09/181_080816.jpg
_wp_attachment_metadata
a:6:{s:5:"width";i:2816;s:6:"height";i:2112;s:14:"hwstring_small";s:23:"height='96' width='128'";s:4:"file";s:66:"/home/ola/public_html/wp/wp-content/uploads/2008/09/181_080816.jpg";s:5:"sizes";a:2:{s:9:"thumbnail";a:3:{s:4:"file";s:22:"181_080816-150x150.jpg";s:5:"width";i:150;s:6:"height";i:150;}s:6:"medium";a:3:{s:4:"file";s:22:"181_080816-300x225.jpg";s:5:"width";i:300;s:6:"height";i:225;}}s:10:"image_meta";a:10:{s:8:"aperture";d:2.600000000000000088817841970012523233890533447265625;s:6:"credit";s:0:"";s:6:"camera";s:20:"Canon PowerShot A540";s:7:"caption";s:0:"";s:17:"created_timestamp";i:1218917746;s:9:"copyright";s:0:"";s:12:"focal_length";d:5.79999999999999982236431605997495353221893310546875;s:3:"iso";i:0;s:13:"shutter_speed";d:0.0166666666666666664353702032030923874117434024810791015625;s:5:"title";s:0:"";}}
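To make the per-item fields more concrete, here is a small Python sketch that builds one item from a dictionary. Again, this is not the .NET tool from this post, and the `wp:*` element names (`wp:post_id`, `wp:post_date`, `wp:postmeta` and so on), the `content:encoded` element and both namespace URIs are assumptions based on typical WXR 1.0 exports. The values are the ones from the post above.

```python
import xml.etree.ElementTree as ET

# Assumed namespaces; compare with a file exported from WordPress itself.
WP_NS = "http://wordpress.org/export/1.0/"
CONTENT_NS = "http://purl.org/rss/1.0/modules/content/"
ET.register_namespace("wp", WP_NS)
ET.register_namespace("content", CONTENT_NS)

# wp:* fields, in the order they appear in the listing above.
WP_FIELDS = (
    "post_id", "post_date", "post_date_gmt", "comment_status", "ping_status",
    "post_name", "status", "post_parent", "menu_order", "post_type",
)


def build_item(post):
    """Build one item element from a dict of post fields."""
    item = ET.Element("item")
    ET.SubElement(item, "link").text = post["link"]
    ET.SubElement(item, "pubDate").text = post["pub_date"]
    ET.SubElement(item, "{%s}encoded" % CONTENT_NS).text = post["content"]
    for name in WP_FIELDS:
        ET.SubElement(item, "{%s}%s" % (WP_NS, name)).text = str(post[name])
    # Custom fields become key/value pairs inside wp:postmeta elements.
    for key, value in post.get("meta", {}).items():
        meta = ET.SubElement(item, "{%s}postmeta" % WP_NS)
        ET.SubElement(meta, "{%s}meta_key" % WP_NS).text = key
        ET.SubElement(meta, "{%s}meta_value" % WP_NS).text = value
    return item


item = build_item({
    "link": "http://localhost/~ola/wp/?p=190",
    "pub_date": "Sat, 16 Aug 2008 21:19:15 +0000",
    "content": "A house in our neighbourhood caught fire...",
    "post_id": 190,
    "post_date": "2008-08-16 23:19:15",
    "post_date_gmt": "2008-08-16 21:19:15",
    "comment_status": "open",
    "ping_status": "open",
    "post_name": "181",
    "status": "publish",
    "post_parent": 0,
    "menu_order": 0,
    "post_type": "post",
    "meta": {"location": "nacka, stockholm"},
})
item_xml = ET.tostring(item, encoding="unicode")
```

An attachment would use the same function with `post_type` set to `attachment`, `status` set to `inherit` and `post_parent` pointing at the id of the post it belongs to, as in the second item above.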
Along with the item tags comes the history for each post, but since my old CMS didn't keep any history I excluded it.
To create WXR XML from my standard RSS XML I decided to build a quick .NET program (download the source code) that just reads all fields from the RSS file and then converts them to their corresponding fields in WXR. You’ll need to customize the tool but the class WordPressWxrItem.cs might be a good start…
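The reading side of such a converter can be sketched in a few lines of Python (a stand-in for illustration, not the .NET source linked above). It pulls the standard RSS fields out of each item and derives the two date fields from `pubDate`. The fixed UTC+2 offset, the output field names and the sample feed (including the title "181") are assumptions made up for this example.

```python
import xml.etree.ElementTree as ET
from datetime import timedelta, timezone
from email.utils import parsedate_to_datetime

# Assumption: local time on the blog is UTC+2, as in the example post.
LOCAL_TZ = timezone(timedelta(hours=2))
DATE_FORMAT = "%Y-%m-%d %H:%M:%S"


def rss_item_to_fields(item):
    """Map one RSS item element to the fields a WXR item needs."""
    published = parsedate_to_datetime(item.findtext("pubDate"))
    return {
        "title": item.findtext("title", default=""),
        "link": item.findtext("link", default=""),
        "content": item.findtext("description", default=""),
        "pub_date": item.findtext("pubDate"),
        "post_date": published.astimezone(LOCAL_TZ).strftime(DATE_FORMAT),
        "post_date_gmt": published.astimezone(timezone.utc).strftime(DATE_FORMAT),
    }


# Made-up sample feed with a single item.
SAMPLE_RSS = """<rss version="2.0"><channel><item>
<title>181</title>
<link>http://localhost/~ola/wp/?p=190</link>
<pubDate>Sat, 16 Aug 2008 21:19:15 +0000</pubDate>
<description>A house in our neighbourhood caught fire.</description>
</item></channel></rss>"""

posts = [rss_item_to_fields(i) for i in ET.fromstring(SAMPLE_RSS).iter("item")]
```

Fields that plain RSS does not carry, such as the post id, comment status and post type, would have to come from defaults or from the old CMS database.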