Nice. Haven't gone through it fully, but the header parsing stood out as something to improve. Use gawk's three-argument match to capture the run of '#' characters and take its length, for example:
$ echo '# ' | awk 'match($0, /^#+ /, m){print length(m[0])-1}'
1
$ echo '### ' | awk 'match($0, /^#+ /, m){print length(m[0])-1}'
3
You can also use capture groups, so that you need neither the -1 nor the substr call:
awk 'match($0, /^(#+) (.+)/, m){l=length(m[1]); print "<h" l ">" m[2] "</h" l ">"}'
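The array argument to match is a gawk extension, so the commands above will not run under mawk or BusyBox awk. If portability matters, here is a rough sketch of the same idea using only RSTART/RLENGTH, which POSIX awk sets after every match call. The function name md_headers and the sample input are made up for illustration, and unlike the capture-group version it also accepts a header with nothing after the space.

```shell
# Hypothetical helper: turn Markdown-style "# Title" lines into <hN> tags.
md_headers() {
  awk '
    match($0, /^#+ /) {
      level = RLENGTH - 1              # RLENGTH counts the hashes plus the space
      text = substr($0, RLENGTH + 1)   # everything after the "### " prefix
      print "<h" level ">" text "</h" level ">"
      next
    }
    { print }                          # non-header lines pass through unchanged
  '
}

printf '%s\n' '# Title' '### Sub heading' 'plain text' | md_headers
# <h1>Title</h1>
# <h3>Sub heading</h3>
# plain text
```

This does bring back one substr call, which is the trade-off for not depending on gawk.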