ePub nerds: Test-fly a Kindle book for us?

Folks who are experts on ePub and .mobi format… would you mind test-flying an eBook version of Real World Divorce? The web version is derived in semi-real-time from Google Docs (with multiple simultaneous co-authors, a convenient collaboration environment was important). It turns out that Google Docs generates some crazily complex HTML. We strip out some of that for the web version and try to strip more of it with a Perl script for the ePub.
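For readers wondering what that kind of stripping looks like, here is a minimal sketch in Python. It is not the Perl script mentioned above, which isn't shown here. The choice of which attributes and tags count as cruft (`CRUFT_ATTRS`, `UNWRAP_TAGS`) is an assumption for illustration, as are the function and class names.

```python
from html import escape
from html.parser import HTMLParser

# Assumption for this sketch: these attributes are presentational cruft.
CRUFT_ATTRS = {"style", "class"}
# Assumption: these wrapper tags are dropped, but the text inside them is kept.
UNWRAP_TAGS = {"span", "font"}
# Void elements are written in self-closing form so the output stays XML-friendly.
VOID_TAGS = {"br", "hr", "img", "meta", "link"}


class CruftStripper(HTMLParser):
    """Re-emit HTML without cruft attributes or wrapper tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def _open(self, tag, attrs, self_close):
        if tag in UNWRAP_TAGS:
            return
        kept = "".join(
            f' {name}="{escape(value or "", quote=True)}"'
            for name, value in attrs
            if name not in CRUFT_ATTRS
        )
        self.out.append(f"<{tag}{kept}{' /' if self_close else ''}>")

    def handle_starttag(self, tag, attrs):
        self._open(tag, attrs, tag in VOID_TAGS)

    def handle_startendtag(self, tag, attrs):
        self._open(tag, attrs, True)

    def handle_endtag(self, tag):
        if tag not in UNWRAP_TAGS and tag not in VOID_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        # Text arrives unescaped, so escape it again on the way out.
        self.out.append(escape(data, quote=False))


def strip_cruft(html_text):
    parser = CruftStripper()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)


print(strip_cruft('<p class="c1" style="margin:0"><span class="c2">Hi</span><br></p>'))
```

A real pipeline would also need to deal with comments, doctype lines, and whatever structure Google Docs encodes in its class names, none of which this sketch attempts.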

So far it seems to be readable with Calibre and Kindle for PC.

Here are links to the current versions of the files:

Any feedback/corrections/etc. welcomed.

14 thoughts on “ePub nerds: Test-fly a Kindle book for us?”

  1. 1. When you open an ePub, the cover should come up as the first page – yours doesn’t.

    2. There should be a space on either side of the vertical bars: author|Goode, etc. should read author | Goode, and there should be a period after Tina, author.

    3. Your table of contents doesn’t look like a TOC. A TOC should be literally just a table, not a two-paragraph summary after each entry. I prefer numbered chapters, but a lot of ebooks don’t number them.

    4. Acknowledgments or a dedication are customary.

    5. The chapter headings should be hyperlinks back to the TOC.

    6. Chapter headings should be all caps and/or a different font and/or color.

    7. At the end there should be an “about the author(s)”, maybe with photos, and an index with hyperlinks to page #s.

  2. Thanks, Jackie!

    There is an About the Authors chapter, which does have an Acknowledgments section, at the end of the overview chapters (before the state-by-state chapters). I guess we need to help people find it.

    The standards that I could find said that one should have both an explicit TOC in the file and also that the device will generate a TOC. So I thought that the explicit TOC should have some kind of additional info, e.g., subheads or summaries.

    What does it mean to say that a chapter heading should be in a different font? I thought that the fonts were entirely up to the reader software. An eBook is not distributed with its own fonts, is it? Or are there fonts that a publisher can be guaranteed will be present on every reader and specified from the document?

  3. In iBooks on my iPad running iOS 10 there are no line breaks. It’s as if someone wrote

    * { display: inline; font-size: 13pt; }

    The blue text (links, I assume) doesn’t work.

    In iBooks on macOS it says “This book can’t be opened. This book is corrupt.”

  4. Riffing off of what Jackie said, I’d advise against bundling your own font and specifying it in your CSS unless you want to pay extra for licensing fees. The fonts shipped with readers tend to be somewhat better than the 2–4 decent free fonts available on the Internet (Lato, Alegreya, and Noto come to mind).

    While I can’t verify this (see above comment), having chapter headings in a different size and weight (either go bold and large or go thin and large) should be enough.

  5. Tagging onto what Stationary Feast said about iBooks on macOS, the book cover image does display in the thumbnail list of books in the reader. So it’s part-way there in that case.

  6. Without embedding [free] fonts, I think you can specify font families (sans-serif, serif, monospace, etc.) with an [optional] list of common preferred fonts that may already be on the device, and if the reader doesn’t have any of them it will substitute whatever it has in that family. So in the following example, the device will pick Gill or Helvetica (whichever comes first) if it has one of them, and otherwise whatever sans-serif font it has.

    body { font-family: Gill, Helvetica, sans-serif }

    https://www.w3.org/TR/CSS21/fonts.html#propdef-font-family

    Or you can just do the same font but make the chapter headings some combo of different size, color, bold, italic, all caps – anything to make them stand out.

  7. Before I even download and look at the ePub file, understand that you have to produce valid XHTML. That also means semantic XHTML, which may relate to the complaints above about lousy typography for headings and text running together. (Again without looking yet, that means you don’t have proper H1 through H6 and even the simplest thing in the world, P for paragraphs, you also didn’t manage.)

  8. Since we’re here, BBEdit, which you should be using anyway, can natively browse zip files as though they were simple documents and folders. ePub documents are, in part, zip files. You can browse and fix all your lousy HTML right in BBEdit without the unduly complex process of re-zipping the source files for ePub output.

    I’ve been reading your site for about 20 years and I just know you’re the kind of developer who argues about HTML semantics and basically can’t be reasoned with. (You also use neutral quotes and would argue in favour of those.) So I think I should give up.

  9. Joe Clark knows his stuff. I thought of him immediately when I read this post.

    ePub is fairly easy to get right, but exporting from Google Docs puts you at a disadvantage.

    BBEdit/TextWrangler is great, and I’ve had good luck with Apple’s iBooks Creator (I think that’s what it’s called). On non-Mac platforms, I’ve used Calibre (it has an ePub editor, but it’s very manual), or vi (even more manual).

    If you’re looking to set up a Google Docs to ePub conversion pipeline to pick up ongoing document revisions, you’re looking at a bit of a project. Perhaps one that has already been done by someone else, though. I’d check GitHub and GitLab public repositories before committing.

    ePub is an acceptable master source format, but even plain text would be better than Google Docs HTML. You can convert to anything else with Pandoc, or Calibre with a little pain.

    If Joe disagrees with anything I’ve written here, I’m wrong and he’s right.

  10. Joe: I’m sure that you’re right, but we already produced the book! I don’t think we can rewrite it now in BBEdit. It didn’t occur to us that Google Docs, which is more or less a native HTML application, would produce even worse HTML than Microsoft Word. (These companies are going to be building us self-driving cars next?)

    So we want some way to do an ePub/.mobi without re-keying all of the text.

    Going forward, what would be the best way for multiple authors to collaborate and yet produce usable and legal HTML? Sending around BBEdit drafts doesn’t seem like a good idea (too much chance for edits to be lost). We like the multi-user access and versioning of Google Docs.

    What about using WordPress or something similar? That has an HTML editor that doesn’t require typing tags. It doesn’t support truly simultaneous editing but at least one author could work in the morning while another worked in the afternoon. WordPress does keep revisions.

  11. If you want a fantastically aggressive cruft stripper, I’d try using an HTML-to-Markdown converter and then using a Markdown-to-XHTML converter to get proper XHTML that you can paste into your OPS/chapter\d\d.xhtml files.

    For what it’s worth, ePubs are the ___one___ place on the planet where you must use XHTML and not just HTML. This means you’ll need things that generate the proper XHTML xmlns stuff and img, link, etc. tags that’re self-closing, as in .
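One rough way to catch the self-closing-tag problem before a reader app does is to run each chapter through an XML parser. The sketch below uses Python's standard `xml.etree.ElementTree`. It only checks well-formedness, not validity against the ePub spec, and the function name and sample markup are made up for illustration.

```python
import xml.etree.ElementTree as ET

# The XHTML namespace that the root <html> element is expected to declare.
XHTML_NS = "http://www.w3.org/1999/xhtml"


def is_well_formed(xhtml_text):
    """Return (True, None) if the text parses as XML, else (False, error message)."""
    try:
        ET.fromstring(xhtml_text)
    except ET.ParseError as err:
        return False, str(err)
    return True, None


# Hypothetical chapter content, first with a self-closing <br />, then without.
good = (
    f'<html xmlns="{XHTML_NS}"><head><title>t</title></head>'
    "<body><p>Line one<br />line two</p></body></html>"
)
bad = good.replace("<br />", "<br>")

print(is_well_formed(good))
print(is_well_formed(bad))
```

The second call should report a parse error, because an unclosed `<br>` is legal in HTML but not in XML.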

    Joe: He actually does have proper h1s, although he goes from h1 to h3 without an intervening h2. Unfortunately, he has HTML5 () in .xhtml files. I also looked through his supplied CSS (in BBEdit, no less) and I can’t see anything that looks even remotely like * { display: inline; font-size: 13pt; }.

    Oh, and I’m surprised you suggested editing in BBEdit. The standard and some validator I used once say that mimetype should be the first thing in the zip file and it needs to be uncompressed (!); I did this with `zip -X0 foo.zip mimetype && zip -rDX9 foo.zip * -x mimetype && mv foo.zip foo.epub`.
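The same packaging order can be sketched with Python's standard `zipfile` module, for anyone without a `zip` binary handy. This is an approximation of the one-liner above, not a tested replacement for it. The function name and the directory layout it expects (a `mimetype` file at the top of the source folder) are assumptions.

```python
import zipfile
from pathlib import Path


def build_epub(src_dir, epub_path):
    """Zip src_dir into epub_path with an uncompressed mimetype entry first."""
    src = Path(src_dir)
    out = Path(epub_path).resolve()
    mimetype = src / "mimetype"
    with zipfile.ZipFile(epub_path, "w") as zf:
        # mimetype goes in first and is stored, not deflated.
        zf.write(mimetype, "mimetype", compress_type=zipfile.ZIP_STORED)
        for path in sorted(src.rglob("*")):
            # Skip directories, the mimetype already written, and the output file itself.
            if not path.is_file() or path == mimetype or path.resolve() == out:
                continue
            zf.write(
                path,
                path.relative_to(src).as_posix(),
                compress_type=zipfile.ZIP_DEFLATED,
                compresslevel=9,
            )
```

Calling `build_epub("book_src", "book.epub")` (both names hypothetical) would then produce the archive in one step. It is still worth running the result through a validator.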

  12. …going forward, I’d suggest a bunch of Markdown files in a directory shared between all the authors. One per chapter. Won’t lose edits, although you might need to break out diff to resolve conflicts between edits.

    I’d also run all the pages through a quote educator; TextWrangler (BBEdit’s free cousin) has Text->Educate Quotes. Of course, BBEdit does too.
