Importing Blogger to Jekyll, Part 2

While waiting on PvP queues to pop in World of Warcraft, I decided to muck around with importing my blog again, and ended up closing the game because this was a far more interesting problem. I had started out with this script to import from Blogger’s “export” feature, but had to make a few changes. First of all, Jekyll was choking on several page titles because of various invalid characters. Replacing these with HTML entities cleared up most of it, except for one entry that had a backslash in its title.
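To show why the backslash was the odd one out: the script writes each title as a double-quoted YAML value, and inside double quotes YAML treats a backslash as the start of an escape sequence. Here is a minimal sketch of the idea using only core Ruby instead of the htmlentities gem. The names `TITLE_ENTITIES`, `safe_title` and `front_matter_title` are made up for this example and are not part of the import script.

```ruby
require 'yaml'

# Hypothetical lookup table (not from the import script): characters that
# cause trouble in a double-quoted YAML title, mapped to HTML entities.
TITLE_ENTITIES = {
  '&'  => '&amp;',
  '<'  => '&lt;',
  '>'  => '&gt;',
  '"'  => '&quot;',
  '\\' => '&#92;' # a lone backslash would start a YAML escape sequence
}.freeze

# Replace every troublesome character in one pass; gsub looks each match
# up in the hash and substitutes the corresponding entity.
def safe_title(raw)
  raw.gsub(/[&<>"\\]/, TITLE_ENTITIES)
end

# Build the front matter line in the same shape the script uses for titles.
def front_matter_title(raw)
  %{title: "#{safe_title(raw)}"}
end

puts front_matter_title('AC\\DC "live" & loud')
```

Doing the whole replacement in a single pass also avoids escaping the ampersand of an entity that was inserted a moment earlier.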

Next I decided to try editing the script to add the geotagging data in a sensible format that the mapping plugin can use. This is probably an atrocious hack, but it’s my first time ever poking around with Ruby and I was mostly too lazy to go look up what any of the operators mean:

--- import.rb.old       2014-04-11 22:12:40.000000000 +1000
+++ import.rb   2014-04-25 17:24:22.000000000 +1000
@@ -3,6 +3,7 @@
 require 'fileutils'
 require 'date'
 require 'uri'
+require 'htmlentities'

 # usage: ruby import.rb my-blog.xml
 # my-blog.xml is a file from Settings -> Basic -> Export in blogger.
@@ -39,9 +40,9 @@
 end

 def write(post, path='_posts')
-  puts "Post [#{post.title}] has #{post.comments.count} comments"
+#  puts "Post [#{post.title}] has #{post.comments.count} comments"

-  puts "writing #{post.file_name}"
+#  puts "writing #{post.file_name}"
   File.open(File.join(path, post.file_name), 'w') do |file|
     file.write post.header
     file.write "\n\n"
@@ -81,12 +82,27 @@
   end

   def title
-    @title ||= @node.at_css('title').content
+    @title ||= HTMLEntities.new.encode(@node.at_css('title').content).gsub('\\', '&#92;')
   end

   def content
     @content ||= @node.at_css('content').content
   end
+
+  def location
+    out = ''
+    point ||= @node.css('georss|point')[0]
+    unless point.nil?
+      out = out + "mapping:\n    latitude: " + point.content.split[0]
+      out = out + "\n    longitude: " + point.content.split[1] + "\n"
+
+      loc ||= @node.css('georss|featurename')[0]
+      unless loc.nil?
+        out = out + 'location: ' + loc.content
+      end
+    end
+    @location ||= out
+  end

   def creation_date
     @creation_date ||= creation_datetime.strftime("%Y-%m-%d")
@@ -122,6 +138,7 @@
       %{title: "#{title}"},
       %{date: #{creation_datetime}},
       %{comments: false},
+      location,
       categories,
       '---'
     ].compact.join("\n")
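For what it’s worth, the same location logic can be written a bit more tidily. This is only a sketch under a few assumptions: it takes the text of the `georss:point` element (latitude and longitude separated by whitespace) and the optional `georss:featurename` as plain strings rather than Nokogiri nodes, and `location_front_matter` is a hypothetical name that does not exist in the script. It quotes the place name via JSON, which is my own addition to keep commas and colons from confusing YAML.

```ruby
require 'json'
require 'yaml'

# Hypothetical stand-alone helper (not part of import.rb).
# point_text:   contents of <georss:point>, e.g. "lat lon"
# feature_name: contents of <georss:featurename>, or nil if absent
# Returns front matter lines as a string, or nil when there is no point.
def location_front_matter(point_text, feature_name = nil)
  return nil if point_text.nil? || point_text.strip.empty?

  latitude, longitude = point_text.split
  lines = [
    'mapping:',
    "    latitude: #{latitude}",
    "    longitude: #{longitude}"
  ]
  # A JSON string is also a valid double-quoted YAML string, so this
  # keeps punctuation in place names from being read as YAML syntax.
  lines << "location: #{feature_name.to_json}" if feature_name
  lines.join("\n")
end
```

Returning nil instead of an empty string means the `.compact` call in the header drops the entry entirely for posts without a location, rather than leaving a blank line in the front matter.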

It works so far. The plugins I’ve set up to get it to almost-Blogger functionality:

I’ve still got a bit more to do - I need to set up category pages, and then a bunch of template work (I will effectively have to throw away my very simple site layout because Blogger’s template engine is an absolute mess). There are a few other things I’d like to do simply because I never could with Blogger, like a better calendar view, having the day’s entries listed in chronological order, and so on.

It probably won’t get finished today…

Red Lion VIC 3371, Australia
