I’ve been having a bit of a behind-the-scenes tidy-up and finally fixed the indentation of my Jekyll includes and layouts, so my source is as hunky-dory as can be. (I can live with individual posts not having the correct indentation for the time being; I may investigate this plugin for indentation.)
Besides indentation, the other annoying thing about Jekyll is how lines consisting purely of Liquid tags end up as blank lines in the output, especially inside loops.
I’m not keen on the idea of using yet more Liquid tags to solve a problem with Liquid tags so I toyed with the idea of stripping the blank lines out with a shell script I could run after the site had been generated:
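#!/bin/sh
# Strip whitespace-only lines from every generated HTML file.
# Reading names line by line and quoting "$file" keeps paths with spaces intact.
find _site/ -name "*.html" -print | while IFS= read -r file; do
  sed -E '/^[[:space:]]*$/d' "$file" > tmp && mv tmp "$file"
done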
But I then realised the flaw with this approach is that it would also strip blank lines from inside <pre> and/or <code> blocks, which is almost certainly not what I want.
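To make the flaw concrete, here is a minimal Ruby sketch that does roughly what the sed command does, applied to a made-up snippet. The method name `strip_blank_lines_naively` and the sample markup are purely illustrative and not part of my site.

```ruby
# Naive approach: drop every whitespace-only line, wherever it appears.
# (Hypothetical helper, roughly mirroring the sed command above.)
def strip_blank_lines_naively(html)
  html.lines.reject { |line| line.strip.empty? }.join
end

# Made-up sample: one unwanted blank line, and one intentional blank line
# inside a code listing.
sample = <<~HTML
  <p>Intro</p>

  <pre><code>first line

  second line</code></pre>
HTML

puts strip_blank_lines_naively(sample)
# Prints:
# <p>Intro</p>
# <pre><code>first line
# second line</code></pre>
```

The blank line after the paragraph is gone, as intended, but so is the one between the two lines of the listing.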
Fixing this properly, at source, in Jekyll is a no-no at the moment as I’m intentionally running an older version, to avoid the bizarre decision in newer versions to introduce a dependency on Node.js.
I thought about using Nokogiri to delete the relevant nodes, but I couldn’t figure out how to identify the correct nodes. This was something I tried:
doc.xpath('//text()[not(ancestor::code) and normalize-space(.)=""]').remove
But the problem is that it’s node-based instead of line-based, so differentiating between intentionally “empty” (indenting) nodes and completely blank lines is tricky (well, it was for me, anyway). So I settled on a quick and dirty, but surely wrong, approach to do this:
Dir['_site/**/*.html'].each do |file|
  lines = File.readlines(file)
  File.open("temp.html", "w") do |fout|
    in_code_block = false
    lines.each do |line|
      # Entering a block whose blank lines must be preserved.
      in_code_block = true if line.include?("<code>") || line.include?("<pre>")
      # Keep everything inside a block; outside one, keep only non-blank lines.
      fout << line if in_code_block || !line.strip.empty?
      # Leaving the block; checked last so the closing line itself is kept.
      in_code_block = false if line.include?("</code>") || line.include?("</pre>")
    end
  end
  # Replace the original file with the stripped copy.
  File.rename("temp.html", file)
end
All this does is walk the _site directory recursively, read each HTML file line by line, and strip any line that is blank and outside a <pre> and/or <code> block; everything else passes through unchanged.
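One reason the script is "surely wrong" is that it only recognises the bare strings <pre> and <code>, so an opening tag with attributes (a class from a syntax highlighter, say) would slip through. Below is a slightly more forgiving sketch of the same line-based idea. The names `OPEN_TAG`, `CLOSE_TAG` and `strip_blank_lines` are my own inventions for illustration, and it is still regex matching rather than real HTML parsing, so treat it as a sketch under those assumptions and not a definitive fix.

```ruby
# Hypothetical helper: strip whitespace-only lines from an HTML string while
# leaving the contents of <pre>/<code> blocks untouched.
# Assumption: tags are matched with regular expressions, not an HTML parser.
OPEN_TAG  = /<(?:pre|code)\b[^>]*>/i   # also matches <pre class="highlight">
CLOSE_TAG = %r{</(?:pre|code)\s*>}i

def strip_blank_lines(html)
  depth = 0 # how many <pre>/<code> elements are currently open
  html.each_line.select do |line|
    opened = line.scan(OPEN_TAG).size
    closed = line.scan(CLOSE_TAG).size
    # Keep the line if we are inside a block, it opens one, or it is not blank.
    keep = depth > 0 || opened > 0 || !line.strip.empty?
    # Update the nesting depth, never letting stray closing tags push it below 0.
    depth = [depth + opened - closed, 0].max
    keep
  end.join
end
```

Because it works on a string, there is no temporary file: applying it to the site would be a matter of calling `File.write(file, strip_blank_lines(File.read(file)))` for each path from `Dir['_site/**/*.html']`. Counting opens and closes also means an inline <code> element that starts and ends on the same line does not leave the script thinking it is inside a block.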