candland.net

Migrate Google Sites to Jekyll

Dusty Candland

Export

Install google-sites-backup (https://github.com/famzah/google-sites-backup), then run the export:

google-sites-backup/run.sh gdata-python-client/ google-sites-backup/

Convert to Markdown

Install reverse_markdown

cd into the exported project directory, then run:

find . -iname "*.html" -exec echo "tidy -q -omit -b -i -c {} | reverse_markdown > {}.md" \; | sed s/.html.md/.md/ > fix.sh
chmod +x fix.sh
./fix.sh
find . -iname "*.html" | xargs rm
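The generated fix.sh breaks on file names that contain spaces or shell metacharacters. As a rough sketch of the same idea with quoting, the Ruby helper below builds one conversion command per file. The name `conversion_command` is hypothetical, and the tidy flags are copied from the command above.

```ruby
require "shellwords"

# Build the shell command that converts one exported HTML file to Markdown.
# Hypothetical helper: it mirrors the tidy | reverse_markdown pipeline above
# and shell-escapes both paths.
def conversion_command(html_path)
  md_path = html_path.sub(/\.html\z/i, ".md")
  "tidy -q -omit -b -i -c #{Shellwords.escape(html_path)} | " \
    "reverse_markdown > #{Shellwords.escape(md_path)}"
end

# Usage: print one command per exported page, like fix.sh.
#   Dir.glob("**/*.html").each { |path| puts conversion_command(path) }
```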

Cleanup

Add frontmatter

find . -iname "*.md" -exec perl -0777 -i -pe 's/.*<\/head>//igs' {} \;
find . -iname "*.md" -exec perl -0777 -i -pe 's/^# (.*)$/---\nlayout: page\ntitle: $1\n---/m' {} \;
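For anyone who would rather test the front matter step on a string before touching files, here is a minimal Ruby sketch of the same two substitutions. The function name `add_front_matter` is made up for illustration, and it assumes the first level-one heading is the page title.

```ruby
# Sketch of the two perl substitutions above, applied to one document string.
# Assumption: the first "# Heading" line is the page title.
def add_front_matter(markdown)
  # Drop everything up to and including </head> (the dot matches newlines).
  body = markdown.sub(/.*<\/head>/im, "")
  # Replace only the first level-one heading with Jekyll front matter.
  body.sub(/^# (.*)$/) { "---\nlayout: page\ntitle: #{$1}\n---" }
end
```

A title containing a colon would need quoting to stay valid YAML, which neither this sketch nor the perl version handles.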

Clean up leftover extras: stray Â characters, lines containing only spaces or pipes, extra lines above the front matter, and single leading spaces.

find . -iname "*.md" | xargs -t -I {} sed -i'' 's/Â//g' {}
find . -iname "*.md" -exec perl -0777 -i -pe 's/^[\s|]*$//gm' {} \;
find . -iname "*.md" -exec perl -0777 -i -pe 's/^.*?---/---/ms' {} \;
find . -iname "*.md" -exec perl -i -pe 's/^ ([^ ].*)$/$1/g' {} \;
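The cleanup passes can also be sketched as one Ruby function, which makes it easier to check on a sample string. This assumes the patterns are meant to blank out lines holding only spaces, tabs, or pipes and to drop whatever precedes the first `---`. The name `clean_markdown` is hypothetical.

```ruby
# Sketch of the cleanup passes on a single document string.
# Assumption: the patterns below match the intent of the commands above.
def clean_markdown(text)
  text
    .gsub("Â", "")                 # stray mis-decoded characters
    .gsub(/^[ \t|]*$/, "")         # lines holding only spaces, tabs, or pipes
    .sub(/\A.*?---/m, "---")       # anything before the front matter
    .gsub(/^ ([^ ].*)$/, '\1')     # a single leading space
end
```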

Remove absolute links

ack --ignore-dir=_site -l "sites.google.com/a/roximity.com/wiki" | xargs perl -i -pe "s/https:\/\/sites.google.com\/a\/roximity.com\/wiki//g"

Fix resource links

ack --ignore-dir=_site -l "\/_\/rsrc\/\d*\/" | xargs perl -i -pe "s/\/_\/rsrc\/\d*\///g"
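Both link fixes are plain prefix removals, so they are easy to try on sample text first. The sketch below is an illustration in Ruby, with `relativize_links` and `WIKI_PREFIX` as made-up names. Note that removing the `/_/rsrc/<digits>/` segment also removes its leading slash, so resource links end up relative.

```ruby
# Hypothetical constant: the absolute prefix Google Sites put on internal links.
WIKI_PREFIX = "https://sites.google.com/a/roximity.com/wiki"

# Sketch of the two link fixes above on one string.
def relativize_links(text)
  text
    .gsub(WIKI_PREFIX, "")          # absolute wiki links become site-relative
    .gsub(%r{/_/rsrc/\d*/}, "")     # drop the /_/rsrc/<digits>/ resource segment
end
```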

Rename %20 to underscores in file names.

for i in $(find . -name "*%20*"); do mv -v "$i" "$(echo "$i" | sed 's/%20/_/g')"; done
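The shell loop rewrites %20 across the whole path, so it can fail when a directory name also contains %20. A possible alternative, sketched in Ruby under the assumption that only the last path component should change on each rename, works from the deepest paths upward. `rename_percent20` is a hypothetical name.

```ruby
# Rename every file or directory under root whose name contains "%20",
# replacing it with an underscore. Deepest paths go first so that renaming
# a directory does not invalidate the paths of entries inside it.
def rename_percent20(root)
  Dir.glob("**/*%20*", base: root)
     .sort_by { |rel| -rel.count("/") }
     .each do |rel|
       old_path = File.join(root, rel)
       new_name = File.basename(old_path).gsub("%20", "_")
       File.rename(old_path, File.join(File.dirname(old_path), new_name))
     end
end

# Usage: rename_percent20(".")
```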

I still had to do a fair amount of manual cleanup on the converted Markdown.

Plugins

These make the structure and navigation roughly match the original Google Sites site.

Related Pages

Lots of our pages had files as downloads. I like the idea of putting downloads in a subdirectory and having them auto-populate on the page. Some of our navigation is also based on pages in a matching directory. This plugin populates a sub_pages collection and a downloads collection on each page, and the view that follows it renders those collections.

module AssociateRelatedPages
  class Generator < Jekyll::Generator
    def generate(site)
      # Index every page by its source path so sub-page files can be resolved.
      page_lookup = site.pages.each_with_object({}) { |page, lookup| lookup["/" + page.path] = page }

      site.pages.each do |page|
        # A page "foo.md" owns the sibling directory "foo/".
        subdir = File.join(site.source, page.dir, page.basename)
        if File.directory?(subdir)
          entries = Dir.entries(subdir)

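          # Markdown files in the directory become sub-pages; skip any that Jekyll did not load as pages.
          page.data["sub_pages"] = entries.select{ |e|
            e =~ /\.md$/
          }.map{ |e|
            page_lookup[File.join(page.dir, page.basename, e)]
          }.compact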

          page.data["downloads"] = entries.reject{ |e|
            e == "." || e == ".." || e =~ /\.md$/ ||
              File.directory?(File.join(subdir, e))
          }.map{ |e|
            download = File.join(subdir, e)
            stat = File::Stat.new(download)
            {
              "title" => e,
              "url" => File.join(page.basename, e),
              "size" => stat.size
            }
          }
        end
      end
    end
  end
end

{% if page.sub_pages.size > 0 %}
  <ul>
  {% for sub_page in page.sub_pages %}
    <li>
      <a href="{{ sub_page.url | prepend: site.baseurl }}">{{ sub_page.title }}</a>
    </li>
  {% endfor %}
  </ul>
{% endif %}
{% if page.downloads.size > 0 %}
  <div class="post-downloads">
    <h2>Downloads</h2>
    <ul>
    {% for download in page.downloads %}
      <li>
        <a href="{{ download.url | prepend: site.baseurl }}">{{ download.title }} ({{ download.size }}b)</a>
      </li>
    {% endfor %}
    </ul>
  </div>
{% endif %}
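To make the directory convention concrete without running Jekyll, the sketch below splits a page's matching directory into sub-page files and downloads in roughly the way the generator does. It is an illustration only: `partition_entries` is a hypothetical helper, and it returns file names rather than Jekyll page objects.

```ruby
# Split the directory that belongs to a page into Markdown sub-pages and
# downloadable files. Nested directories are ignored here.
def partition_entries(subdir)
  files = Dir.children(subdir).sort.reject do |entry|
    File.directory?(File.join(subdir, entry))
  end
  sub_pages, downloads = files.partition { |entry| entry.end_with?(".md") }
  {
    "sub_pages" => sub_pages,
    "downloads" => downloads.map do |entry|
      { "title" => entry, "size" => File.size(File.join(subdir, entry)) }
    end
  }
end
```

For a page dev/setup.md, this would be called on dev/setup/, where dev/setup/ios.md counts as a sub-page and dev/setup/manual.pdf as a download.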

Navigation

The navigation on the Google Sites site was mostly based on subdirectories. This plugin creates a nav collection that the template uses to build the navigation.

module HierarchicalNavigation
  class Generator < Jekyll::Generator
    # Builds one entry per top-level directory, shaped like:
    #   { 'page' => Page (the directory's index page), 'sub' => [Page, ...] }

    def generate(site)
      nav = {}
      site.pages.sort_by(&:dir).each do |page|
        dirs = page.dir.split('/')
        dir = dirs[1] || ''

        if dirs.count <= 2
          if page.basename == 'index'
            nav[dir] ||= {'page' => nil, 'sub' => []}
            nav[dir]['page'] = page
          else
            nav[dir] ||= {'page' => nil, 'sub' => []}
            nav[dir]['sub'] << page
          end
        end
      end

      site.data['nav'] = nav.values
    end
  end
end

<ul>
{% for nav in site.data['nav'] %}
  {% if nav.page.title %}
  <li class="{% if page.url contains nav.page.url %}active{% endif %}">
    <a class="page-link" href="{{ nav.page.url | prepend: site.baseurl }}">{{ nav.page.title }}</a>
    {% if page.url contains nav.page.dir %}
      <ul>
      {% for sub in nav.sub %}
        {% if sub.title %}
          {% capture sub_dir %}{{ sub.url | remove: ".html" | append: "/" }}{% endcapture %}
          <li class="{% if page.url contains sub.url or page.dir ==  sub_dir %}active{% endif %}">
            <a class="page-link" href="{{ sub.url | prepend: site.baseurl }}">{{ sub.title }}</a>
          </li>
        {% endif %}
      {% endfor %}
      </ul>
    {% endif %}
  </li>
  {% endif %}
{% endfor %}
</ul>
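The grouping rule in the generator is easier to follow on plain data. The sketch below repeats it with a stand-in struct instead of real Jekyll pages. `PageStub` and `build_nav` are hypothetical names, and the example assumes `dir` values such as "/" and "/dev/".

```ruby
# Stand-in for a Jekyll page, carrying only the two fields the grouping uses.
PageStub = Struct.new(:dir, :basename)

# Group pages by their top-level directory. The index page of a directory
# becomes the entry's "page"; other pages at that level go into "sub".
# Pages nested more than one directory deep are skipped.
def build_nav(pages)
  nav = {}
  pages.sort_by(&:dir).each do |page|
    dirs = page.dir.split("/")
    next if dirs.count > 2

    entry = nav[dirs[1] || ""] ||= { "page" => nil, "sub" => [] }
    if page.basename == "index"
      entry["page"] = page
    else
      entry["sub"] << page
    end
  end
  nav.values
end
```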
