I've been having a bit of a behind-the-scenes tidy-up and have finally fixed the indentation of my Jekyll includes and layouts, so my source is as hunky-dory as can be (I can live with individual posts not having the correct indentation for the time being; I may investigate this plugin for indentation).

Besides indentation, the other annoying thing about Jekyll is how lines consisting purely of Liquid tags end up as blank lines in the output, especially inside loops.

I'm not keen on the idea of using yet more Liquid tags to solve a problem with Liquid tags so I toyed with the idea of stripping the blank lines out with a shell script I could run after the site had been generated:

for file in `find _site/ -name "*.html" -print`; do
    # Delete every line that is empty or contains only whitespace
    sed -E '/^[[:space:]]*$/d' "$file" > tmp
    mv tmp "$file"
done

But I then realised the flaw with this approach is that it would also strip blank lines from inside <pre> and/or <code> blocks which is almost certainly not what I want.
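
To make that flaw concrete, here is a small Ruby sketch of what the sed expression effectively does to a page. The sample HTML is made up for illustration; the blank line between the two lines inside <pre> is meant to be there, and it disappears along with the unwanted one.

```ruby
# Hypothetical sample page: the blank line inside <pre> is intentional,
# the one after the paragraph is the kind left behind by a Liquid tag.
html = "<p>Intro</p>\n\n<pre>\nline one\n\nline two\n</pre>\n"

# Same effect as the sed expression: drop every whitespace-only line,
# with no regard for where in the document it sits.
naive = html.lines.reject { |line| line.strip.empty? }.join

puts naive
```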

Fixing this properly, at source, in Jekyll is a no-no at the moment, as I'm intentionally running an older version because of the bizarre decision to introduce a dependency on Node.js.

I thought about using Nokogiri to delete the relevant nodes, but I couldn't figure out how to identify the correct nodes. This was something I tried:

doc.xpath('//text()[not(ancestor::code) and normalize-space(.)=""]').remove

But the problem is that it's node-based rather than line-based, so differentiating between intentionally "empty" (indenting) nodes and completely blank lines is tricky (well, it was for me, anyway). So I settled on a quick and dirty, but surely wrong, approach:

Dir['_site/**/*.html'].each do |file|
    file_to_strip = IO.readlines(file)
    fout = File.open("temp.html", "w")
    in_code_block = false
    file_to_strip.each do |line|
        if line.include?("<code>") or line.include?("<pre>")
            in_code_block = true
        end
        # Keep every line inside a code block, and only non-blank lines elsewhere
        if in_code_block or not line.strip.empty?
            fout << line
        end
        if line.include?("</code>") or line.include?("</pre>")
            in_code_block = false
        end
    end
    # Flush and close the temporary file before moving it over the original
    fout.close
    File.rename("temp.html", file)
end

All this does is open each HTML file under the _site directory (recursively) and read it line by line: any line that is blank and outside a <pre> and/or <code> block gets stripped, and everything else goes through untouched.
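
For anyone wanting to try the same line-based idea on a string before letting it loose on _site, here is a minimal sketch of it as a pure function. It is not the script above: strip_blank_lines is a hypothetical helper name, and I've assumed two small changes, namely matching opening tags that carry attributes (such as <pre class="...">) and counting nesting depth so that a closing </code> inside a still-open <pre> doesn't end the block early. It is still a line-based heuristic, not an HTML parser.

```ruby
# Hypothetical helper (not part of the script above): drop whitespace-only
# lines unless they fall inside a <pre> or <code> block.
def strip_blank_lines(html)
  depth = 0 # number of <pre>/<code> blocks currently open
  kept = html.each_line.select do |line|
    opens  = line.scan(/<(?:pre|code)\b/i).size   # also matches <pre class="...">
    closes = line.scan(/<\/(?:pre|code)>/i).size
    # A line counts as "inside" if a block was already open or opens on it
    inside = depth > 0 || opens > 0
    depth = [depth + opens - closes, 0].max
    inside || !line.strip.empty?
  end
  kept.join
end

# Made-up sample input for illustration
sample = "<ul>\n\n  <li>a</li>\n   \n</ul>\n" \
         "<pre class=\"x\"><code>one\n\ntwo</code></pre>\n\n<p>end</p>\n"
puts strip_blank_lines(sample)
```

Applying it to a file would then be a matter of something like File.write(file, strip_blank_lines(File.read(file))) inside the Dir loop, which also avoids the temporary file.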