After looking around at various ebook publishers, I found some common features. Most supported the same set of distribution channels. All accepted MS Word manuscripts. Most could distribute in Kindle (AZW/Mobi, AZW3), Epub and PDF formats. And yet all had what seemed like draconian formatting standards. Obviously I wasn't expecting any issues with PDF as a final format, but I did expect that the other, HTML-based formats would have no trouble with a fairly restricted set of HTML and limited CSS, specifically:
- H1, H2, H3 and H4 headings
- Bold, italic and underlined normal text
- A monospace text for code samples
- PNG images (anchored as paragraphs)
In addition, explicit page breaks at the top of each chapter would be nice to have.
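As a rough illustration, the restricted set listed above can be checked mechanically. The sketch below uses Python's standard `html.parser` module to count any tags outside an allowed set. The mapping of features to tag names (for example `pre` and `code` for monospace text) and the names `ALLOWED_TAGS`, `TagAudit` and `audit_tags` are my own assumptions for illustration, not part of any publisher's style guide.

```python
from collections import Counter
from html.parser import HTMLParser

# Assumed mapping of the listed features onto tag names; adjust as needed.
ALLOWED_TAGS = {
    "html", "head", "title", "body", "p", "br",
    "h1", "h2", "h3", "h4",   # headings
    "b", "i", "u",            # bold, italic, underline
    "pre", "code",            # monospace code samples
    "img",                    # PNG images
}


class TagAudit(HTMLParser):
    """Counts every start tag that is not in ALLOWED_TAGS."""

    def __init__(self):
        super().__init__()
        self.unexpected = Counter()

    def handle_starttag(self, tag, attrs):
        # html.parser lower-cases tag names before calling this method.
        if tag not in ALLOWED_TAGS:
            self.unexpected[tag] += 1


def audit_tags(html_text):
    """Return a dict of unexpected tag name -> number of occurrences."""
    parser = TagAudit()
    parser.feed(html_text)
    parser.close()
    return dict(parser.unexpected)
```

Running `audit_tags` over an exported manuscript gives a quick list of the markup that still needs cleaning out before conversion.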
While Calibre seemed to cope with a reasonable conversion of the original OpenOffice manuscript, there were a lot of artefacts in the generated epub, and problems with image sizing. Rather than tinker directly with the epub or OpenOffice file, I exported the file as HTML from OpenOffice and set about cleaning it up. I was able to script most of this, but it still required hand editing to produce sensible, well-formed HTML (and to clean up the non-visible artefacts from OpenOffice, such as <I>unnecessarily</I><I> interrupted</I> <I>tags</i>). This approach had the desired outcome; converting the HTML to epub in Calibre eliminated all the conversion artefacts.
I now had something I could proofread in an e-reader.
In my research, I saw very few technical ebooks. I soon found out why, and why the style guides are so restrictive. It seems that very few readers are capable of visually representing the layout defined in the file, despite the file format explicitly supporting the HTML elements. In short, they did not properly implement the file format they claimed to support. Indeed, in many cases they missed it by a wide margin.
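The kind of scripted clean-up described above might look something like the following. This is only a minimal sketch of one step, merging needlessly interrupted inline tags, and not the script I actually used; the function name `merge_interrupted_tags` and the choice to handle only attribute-free `<I>`, `<B>` and `<U>` tags are assumptions made for the example.

```python
import re

# Matches a closing inline tag, optional whitespace, then the same tag
# reopened without attributes, e.g. "</I> <I>". The backreference stops
# "</I><B>" from being merged.
_INTERRUPTED = re.compile(r"</(i|b|u)>(\s*)<\1>", re.IGNORECASE)


def merge_interrupted_tags(html_text):
    """Collapse runs such as <I>a</I><I> b</I> into <I>a b</I>."""
    previous = None
    # Repeat until nothing changes: removing one pair can expose another,
    # e.g. "</i></b><b><i>" becomes "</i><i>" after the first pass.
    while previous != html_text:
        previous = html_text
        # Keep any whitespace that sat between the two tags.
        html_text = _INTERRUPTED.sub(r"\2", html_text)
    return html_text
```

A regular expression is enough here because the pattern is purely local; anything involving nesting or attributes is better left to a real HTML parser or to hand editing.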
Here is a sample of the programs I tried on my Lenovo Yoga 10 (Android):
|sans font?||serif font?||borders
|Moon+ eReader||no||no||yes||no||sort of||unexpected page breaks injected|
|Aldiko Reader||no||no||yes||no||no||unexpected reformatting of paragraphs|
by Vadim Lopatin
|eReader Prestigious||no||no||yes||no||no||random blocks of whitespace, particularly adjacent to imgs|
| ||yes||yes||yes||yes||yes||no margins, poor kerning and difficult navigation|
|UB reader||yes||yes||yes||yes||yes||does not honour ALIGN=CENTER|
| ||yes||yes||yes||yes||yes||does not pre-render prev/next pages, making turning pages slow and disconcerting; like UB reader, does not honour ALIGN=CENTER|
|Gitden Reader||yes||yes||yes||yes||yes||seems to be making its own mind up about image sizes, but making sensible choices; otherwise strong compatibility and MathML support|
(I was unable to open books stored on the local filesystem using Scribd or Kindle)
Overall, Gitden Reader stood out as the most capable eReader for rendering layouts, with UB reader a close second.
Although all were able to visually represent different header levels, the problems with fonts, borders and backgrounds meant that I had to use all three techniques together to have a reasonable chance that code samples would look different from body text in the rendered document.
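To illustrate what combining the three techniques can look like, here is a small sketch that wraps a code sample in a `<pre>` element carrying a monospace font, a border and a background colour at once, so that a reader which ignores one or two of them may still show the third. The specific style values and the helper name `code_block` are assumptions for the example, not the exact markup used in the book.

```python
from html import escape

# Assumed values; the exact fonts, colours and widths are illustrative only.
CODE_STYLE = (
    "font-family: monospace; "
    "border: 1px solid #888888; "
    "background-color: #eeeeee;"
)


def code_block(source):
    """Wrap a code sample in a <pre> that carries all three visual cues."""
    # escape() converts <, > and & so the sample is shown, not interpreted.
    return '<pre style="{}">{}</pre>'.format(CODE_STYLE, escape(source))
```

Inline styles are used here on the assumption that a reader which drops external stylesheets might still apply them; that is a guess on my part rather than something I tested across all the readers listed.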