The table below summarizes the changes. The objective is to cause no disruption to blog readers (subscribers), to preserve the search engine ranking as far as possible, and to migrate the URLs to their new home safely. It's not a trivial task, so I took some time to put this article together in the hope that it helps a few people out there. Even though this article focuses on moving from blogger.com to BlogEngine.NET, the majority of the concepts apply to any blog migration.
I started blogging in March 2005, so it's slightly over 3 years now. Even though I'm not a prolific blogger, I've blogged quite a bit, mainly on BizTalk Server. My target audience has always been people in the BizTalk, SOA and BPM space, and occasionally .NET / ASP. I used Google's blogger.com and its ability to FTP the files to a remote server. This let me keep the blog posts on my own domain, http://www.digitaldeposit.net/blog/. Blogger over FTP doesn't give you the same level of functionality you get with their own allocated URL (e.g. http://digitaldeposit.blogspot.com) and their own hosting: theme support is really cumbersome, and archiving and categories (labels) are not supported properly. Comments also take the reader to Blogger's own page, which I guess put some users off. Apart from all these issues, my blog was missing a lot of the basic features of a blog site, such as search, a tag cloud and an archive, and it was not easy to browse through the content. Unless a search engine found one of my articles, it was very unlikely you'd be able to get to it. I'll explain the whole migration process here:
Blog Engine Selection:
The first choice I had to make was selecting the right blog engine. This time I wanted to host my own engine (ASP.NET 2.0 based, I really didn't have a choice here) and cut out all the interdependencies. I reviewed a few of the available ones, such as Community Server, DasBlog and Graffiti CMS, and ended up using BlogEngine.NET 1.4, an open source one, simply because of its simplicity.
Setting up BlogEngine.NET:
Setting up BlogEngine.NET is so simple that it hardly took me 2 minutes. Create a virtual directory pointing to the folder, then give write access to the "App_Data" folder in two places: in IIS (right-click the folder and use its properties page) and in Windows Explorer (right-click the folder and give the IIS_WPG group write access; this is on a Windows 2003 machine). Now you are ready to go, it's that simple.
Log in using the username/password "admin/admin", create a new user, give it admin rights, set your preferred password and delete the original admin account.
Migration from Blogger.com:
The next big step was making sure I could port all the content from Blogger.com to the new engine. BlogEngine.NET has an Import/Export option: it supports the BlogML and RSS formats for import and BlogML for export. BlogML is an open format aimed at migrating blogs from one provider to another. The new Blogger.com site (http://draft.blogger.com) has an option to export your blog, but only in Atom format. I found a useful PowerShell script to do the conversion, but it didn't work straight away because of some namespace issues. Because of my lack of PowerShell experience I ported the script to C#. During the conversion I could see the elegance of PowerShell when working with XML structures: in a lot of places a 1-line PowerShell script translates to 4 lines of C#.
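To give an idea of what such a conversion involves, here is a minimal Python sketch. It is not the C# port described above, only an illustration under assumptions: the function name atom_to_blogml is made up, the output is a heavily simplified subset of BlogML (posts only, no comments or categories), and every Atom entry is treated as a post, which a real Blogger export may not guarantee. The point it shows is the namespace handling: every lookup has to be qualified with the Atom namespace, otherwise nothing is found.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
BLOGML_NS = "http://www.blogml.com/2006/09/BlogML"


def q(tag: str) -> str:
    """Return a BlogML-qualified tag name for ElementTree."""
    return f"{{{BLOGML_NS}}}{tag}"


def atom_to_blogml(atom_xml: str) -> str:
    """Convert an Atom feed into a simplified BlogML document (posts only)."""
    ns = {"a": ATOM_NS}
    feed = ET.fromstring(atom_xml)

    # Serialize BlogML elements with a default xmlns instead of a prefix.
    ET.register_namespace("", BLOGML_NS)
    blog = ET.Element(q("blog"))
    blog_title = ET.SubElement(blog, q("title"), {"type": "text"})
    blog_title.text = feed.findtext("a:title", default="", namespaces=ns)

    posts = ET.SubElement(blog, q("posts"))
    for entry in feed.findall("a:entry", ns):
        # The permalink lives in <link rel="alternate" href="..."/>.
        link = entry.find("a:link[@rel='alternate']", ns)
        post = ET.SubElement(posts, q("post"), {
            "id": entry.findtext("a:id", default="", namespaces=ns),
            "date-created": entry.findtext("a:published", default="", namespaces=ns),
            "post-url": link.get("href", "") if link is not None else "",
            "approved": "true",
        })
        title = ET.SubElement(post, q("title"), {"type": "text"})
        title.text = entry.findtext("a:title", default="", namespaces=ns)
        content = ET.SubElement(post, q("content"), {"type": "html"})
        content.text = entry.findtext("a:content", default="", namespaces=ns)

    return ET.tostring(blog, encoding="unicode")


if __name__ == "__main__":
    sample = (
        '<feed xmlns="http://www.w3.org/2005/Atom"><title>My Blog</title>'
        "<entry><id>post-1</id><published>2008-07-04T10:00:00Z</published>"
        "<title>MVP Year 2</title>"
        '<content type="html">&lt;p&gt;Hello&lt;/p&gt;</content></entry></feed>'
    )
    print(atom_to_blogml(sample))
```

Dropping the "a:" prefix from any of the lookups makes them silently return nothing, which is the kind of namespace problem that is easy to hit with Atom files.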
Tools:
In order to migrate my blog I created 3 small utilities:
1. BlogML Generator: This one generates the BlogML XML file by reading the data from my blogger.com account using the Google GData API. It retrieves all the blog posts with their comments (labels/categories are not handled).
2. SiteMap Generator: This tool creates the sitemap file from the file system (the structure created by blogger.com on your FTP location). The tool only works with a physical file system, so you should download the full blog content to your local machine using an FTP client and run it there.
3. ISAPI_REWRITE Generator: This tool generates the ISAPI_REWRITE rules from the original sitemap file (the one created in the previous step) and the new one (BlogEngine.NET provides one out of the box). This is the key piece responsible for redirecting my old URLs to the new ones.
Note: These utilities are not bulletproof. They did the job for me, and hopefully they will do it for you as well.
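For readers who want to see the idea behind the second tool, the following is a rough Python sketch rather than the actual utility. It walks a local copy of the blog and writes one sitemap entry per HTML file. The function name build_sitemap and the choice to include only .html/.htm files are my assumptions for illustration.

```python
import os
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(local_root: str, base_url: str) -> str:
    """Build a sitemap XML string from the HTML files under local_root."""
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")

    for folder, dirs, files in os.walk(local_root):
        dirs.sort()  # visit sub-folders in a stable order
        for name in sorted(files):
            if not name.lower().endswith((".html", ".htm")):
                continue  # skip images, feeds and other non-page files
            rel_path = os.path.relpath(os.path.join(folder, name), local_root)
            url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
            loc = ET.SubElement(url, f"{{{SITEMAP_NS}}}loc")
            # Map the local path back onto the public URL of the old blog.
            loc.text = base_url.rstrip("/") + "/" + rel_path.replace(os.sep, "/")

    return ET.tostring(urlset, encoding="unicode")
```

Calling build_sitemap with the folder you downloaded over FTP and http://www.digitaldeposit.net/blog as the base URL would list every old post URL, which is the input the third tool needs.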
Once the BlogML file is generated, the import process is a piece of cake! At the bottom of the settings tab (after you log in) you'll see the import/export options. Click the import button, provide all the details asked for (there are validation buttons to check the data you've entered) and click Import. It's fairly quick.
DOWNLOAD THE TOOLS HERE (TBD)
URL Change:
I decided to move my blog to a brand new URL, http://blogs.digitaldeposit.net/saravana/, for various personal reasons. I thought this was the best time to do it, since the blog engine migration was going to change all the URLs anyway. So it doesn't matter that much whether the domain name remains the same or not.
URL Redirection:
This is the key (scary!!) piece, and it took me longer than I anticipated to get it correct (hopefully!!). For some keywords like "biztalk soap adapter", fortunately the no. 1 result in Google points to my blog, and obviously I don't want to lose it.
In order to tackle this issue, I decided to redirect every single URL from my old blog to the new one. URL rewriting is a big topic in the ASP.NET world. This article from Scott Guthrie will give you an idea. I analysed a few solutions and their pros and cons, and decided to use the free version of ISAPI_REWRITE from HeliconTech. It seems reliable, it's the free version of a commercial product, and it's been referred to by people like Scott Hanselman and Jeff Atwood.
The first thing I wanted to sort out was the entry points to my blog: I want one single entry point to my personal online presence. Once you sit down with pen and paper you'll be amazed by the number of entry points you have to your blog. The figures below give you an idea of what I mean by a single entry point.
The more entry points you have, the more you dilute the search engine traffic. For search engines every URL is different, and they work on the basis of page rank. If every single person referring to my blog points to one single URL, that's obviously going to create more votes for my blog and hence a better chance of getting a higher page rank.
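The single-entry-point idea can be sketched in a few lines of Python. This is only a model for reasoning about which host/path combinations should collapse into the one canonical URL; the function name redirect_target and the exact patterns are my assumptions for illustration and this is not how the redirects are actually served.

```python
import re
from typing import Optional

CANONICAL_URL = "http://blogs.digitaldeposit.net/saravana/"
DEFAULT_DOCS = r"(default\.aspx|default\.htm|index\.htm)"

# (host pattern, path pattern) pairs that all mean "the blog home page".
ENTRY_POINTS = [
    (r"^(www\.)?digitaldeposit\.net$", r"^/blog/?(default\.htm)?$"),
    (r"^(www\.)?digitaldeposit\.net$", rf"^/?{DEFAULT_DOCS}?$"),
    (r"^blogs\.digitaldeposit\.net$", rf"^/?{DEFAULT_DOCS}?$"),
    # The file name is mandatory here, so the canonical URL itself
    # never matches and is never redirected to itself.
    (r"^blogs\.digitaldeposit\.net$", rf"^/saravana/{DEFAULT_DOCS}$"),
]


def redirect_target(host: str, path: str) -> Optional[str]:
    """Return the canonical URL if (host, path) is an alternative entry point."""
    for host_pattern, path_pattern in ENTRY_POINTS:
        if re.match(host_pattern, host, re.IGNORECASE) and \
                re.match(path_pattern, path, re.IGNORECASE):
            return CANONICAL_URL
    return None
```

Writing the entry points down as a table like this makes it easy to check that the canonical URL does not appear in its own list, which would cause a redirect loop.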
Here are my rules.
#RewriteLogLevel 9
#LogLevel debug
#Rule 1
RewriteCond %{HTTP:Host} ^(digitaldeposit.net|www.digitaldeposit.net)$ [NC]
RewriteCond %{REQUEST_URI} ^/blog(/)?(default.htm)?$ [NC]
RewriteRule .? http://blogs.digitaldeposit.net/saravana/ [R=301,L]
#Rule 2
RewriteCond %{HTTP:Host} ^(digitaldeposit.net|www.digitaldeposit.net)$ [NC]
RewriteCond %{REQUEST_URI} ^(/)?(default.htm|default.aspx|index.htm)?$ [NC]
RewriteRule .? http://blogs.digitaldeposit.net/saravana/ [R=301,L]
#Rule 3
RewriteCond %{HTTP:Host} ^(blogs.digitaldeposit.net)$ [NC]
RewriteCond %{REQUEST_URI} ^(/)?(default.aspx|default.htm|index.htm)?$ [NC]
RewriteRule .? http://blogs.digitaldeposit.net/saravana/ [R=301,L]
#Rule 4
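# The file name is required: if it were optional, /saravana/ itself would match and be redirected to itself.
RewriteRule ^/saravana/(default.aspx|default.htm|index.htm)$ http://blogs.digitaldeposit.net/saravana/ [R=301,L]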
#Rule 5
RewriteRule ^/blog/2008/06/read-this-before-using-btscleanupmsgbox.html$ http://blogs.digitaldeposit.net/saravana/post/2008/06/04/Read-this-before-using-bts_CleanupMsgBox-stored-procedure.aspx [R=302, NC, L]
RewriteRule ^/blog/2008/07/how-can-i-find-installed-hotfixes-on.html$ http://blogs.digitaldeposit.net/saravana/post/2008/07/18/How-can-I-find-the-installed-hotfixes-on-the-serverworkstation.aspx [R=302, NC, L]
RewriteRule ^/blog/2008/07/mvp-year-2.html$ http://blogs.digitaldeposit.net/saravana/post/2008/07/04/MVP-Year-2.aspx [R=302, NC, L]
Note:
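R=301 means a permanent redirect and R=302 a temporary one. A permanent redirect is what tells search engines to move the ranking over to the new URL, so the Rule 5 lines (shown above with R=302) should use R=301 once you have verified the mappings. "L" means it's the last rule, so nothing after it is executed for that request, and "NC" means ignore case. Rule 5 is auto-generated by my utility from the sitemap files and has an entry for every single URL (blog post).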
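The generation step for Rule 5 can be sketched as below. This is an illustrative Python version, not the actual utility: it assumes the old and new URLs have already been paired up (how you match them from the two sitemaps is left out), and the names make_rules and escape_pattern, the 301 default and the flag layout without spaces are my own choices.

```python
from typing import Iterable, List, Tuple

# Characters that have a special meaning in a regular expression.
REGEX_SPECIALS = set(".^$*+?()[]{}|\\")


def escape_pattern(path: str) -> str:
    """Backslash-escape regex metacharacters so the path matches literally."""
    return "".join("\\" + ch if ch in REGEX_SPECIALS else ch for ch in path)


def make_rules(pairs: Iterable[Tuple[str, str]], status: int = 301) -> List[str]:
    """Turn (old path, new URL) pairs into one RewriteRule line each."""
    rules = []
    for old_path, new_url in pairs:
        rules.append(
            f"RewriteRule ^{escape_pattern(old_path)}$ {new_url} [R={status},NC,L]"
        )
    return rules


if __name__ == "__main__":
    pairs = [
        ("/blog/2008/07/mvp-year-2.html",
         "http://blogs.digitaldeposit.net/saravana/post/2008/07/04/MVP-Year-2.aspx"),
    ]
    for line in make_rules(pairs):
        print(line)
```

Escaping the dot in ".html" keeps the pattern from matching unintended URLs, and passing status=302 produces temporary redirects while you are still testing the mappings.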
Selecting/Modifying a theme:
There are quite a lot of themes to choose from, and you can also download community ones from the following link. Customize one if required.
Configuring Windows Live Writer:
BlogEngine.NET supports the MetaWeblog API, so you can use tools like MS Word 2007, Live Writer, etc. I configured Live Writer and published a test post with an image to make sure everything was working correctly.
Don't forget to change these settings:
Some of these settings are cosmetic, but some are crucial.
1. Update the FeedBurner URL: If you've outsourced your feeds to FeedBurner, make sure you change the settings in two different places: (a) in the BlogEngine.NET settings, point to your FeedBurner feed, and (b) log in to FeedBurner and point your feed's source to the BlogEngine.NET one.
2. robots.txt: You need to update the URL of your blog's sitemap.axd file inside the robots.txt file and un-comment the path (instructions are inside the file). This is crucial, otherwise search engines will struggle.
3. Google Analytics code: If you've signed up for Google Analytics, make sure you add the <script> code to the new blog and update the settings.
Set up backup procedures straight away:
It's important to set up a backup strategy. We normally don't plan for backups until we hit our first disaster. Decide on the backup method that best suits you. Check the options with your hosting provider: some charge a reasonable amount for backup storage, or you can use products like WS_FTP Pro, which supports scheduled transfers.
Results: 3 weeks after the migration:
I purposely didn't blog about the migration or let anyone know about it. I've left it running for nearly 3 weeks now, and you can see from the picture below that the search result is pointing to the new URL. Hurray!!
Nandri!
Saravana