
What text format is least likely to clash with ebook formats?

+1
−0

There are several different software products for collecting and formatting the electronic written word. Microsoft and Google are two big names offering products, and there are many others.

Looking at some of the questions here on ebooks, it is obvious that all software is not created equal and some formats translate to other formats better than others.

I want to write and save my work with basic formatting only: italics, bold, and font size for chapter headers. What software and/or file type will be the most consistent when converting to ebook formats?

I want to be able to write, edit and provide basic formatting to my book, and not have to worry about doing a complete edit review for each ebook format I convert to.

4 answers

+2
−0

TL;DR: In order not to worry about the output, you have to be sure of the input and of the translation process into an ebook. The input has to show you everything that is there and hide nothing, and the translation process must be deterministic.


If you only need basic formatting and want it to be converted consistently, the best you can do is use a mark-up language like reStructuredText, Markdown or DocBook.

The reason for this is that common word processors like Word¹ are what-you-see-is-all-you've-got: you often have no idea what the internal structure of a document is even though it shows "right" on the screen.

This happens e.g. when you have three words in italics, select the middle one and set italics again. Is the internal structure such that italics ends after the first word and begins at the third? Or does it end at the beginning of the second word (after the space) and start at the end of the second (before the space)? Or is the second word just italics, with italics within italics meaning: display non-italics? And what happens to this structure when you then delete the second word?²

With a mark-up language the mark-up is explicit and you see what you do: where things start, where things end. The translation of mark-up gets you the same result every time, and that makes it predictable. And predictable means you know what you get as an end result and you don't have to check it. At least it should not be a surprise after the first time you use some construct ;-).
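To make "same input, same output" concrete, here is a minimal sketch of a toy converter. It is not a real Markdown implementation: the function name `to_html` and the two rules it handles (`*italics*` and `**bold**`) are assumptions made only for this illustration.

```python
import re


def to_html(line):
    """Toy converter (hypothetical): handles only **bold** and *italics*."""
    # Bold first, so that "**" is not mistaken for two single "*" markers.
    line = re.sub(r"\*\*(.+?)\*\*", r"<b>\1</b>", line)
    line = re.sub(r"\*(.+?)\*", r"<i>\1</i>", line)
    return line


source = "A *slanted* word and a **heavy** one."

# Converting the same visible source twice gives the same result,
# because everything that affects the output is in the text itself.
first = to_html(source)
second = to_html(source)

print(first)
print(first == second)
```

Everything the converter acts on is visible in `source`, which is the property the paragraph above relies on.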

That predictability isn't necessarily there for two documents that look the same in Word: they might have different internal structures, which translate to different output in an ebook.

For many people this is not a problem, as how much this matters depends on the target ebook format. Formats that do less rendering of structure (such as image-based ebook formats like DjVu, or typesetting-oriented formats like PDF and DVI (LaTeX output)) are similar to output to a printer. Word has control over this, and makes it look the same whether it renders to screen, to the printer or to PDF. That is why you normally have no visual difference between printing directly from Word and generating a PDF and printing that.

This can be different if your input text is translated into another mark-up, as with ebooks in EPUB, MOBI, etc. format. In that case knowing exactly what structure you started with is important, as that may influence the output rendering. That is where Word could produce different output for similar-looking input, because there might be hidden elements; something in a mark-up language will be the same if it looks the same, because nothing is hidden.

The aforementioned example, which looks the same in Word, could translate into one of the following HTML pieces (there are more possibilities, and this might be an oversimplification, but I hope it conveys the principle):

<i>abc</i> def <i>ghi</i>
<i>abc </i>def<i> ghi</i>
<i>abc <i>def</i> ghi</i>

If the output was to be LaTeX instead of HTML (where italics in italics switches back to non-italics) the third would look the same as the first two, and the inter-word spacing would differ after italics ends (it might be different in HTML rendering by Firefox as well, I just can't see it).
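The point that different internal structures can hide behind the same visible text can be checked with a short sketch. The three HTML fragments below are my own reconstruction of the three cases described earlier (italics ending at the word boundary, italics swallowing the spaces, and italics nested in italics); the class and function names are hypothetical.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collect only the character data, discarding every tag."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def visible_text(fragment):
    """Return the text a reader would see, with all mark-up stripped."""
    parser = TextOnly()
    parser.feed(fragment)
    parser.close()
    return "".join(parser.parts)


# Assumed illustrations of the three possible internal structures.
variants = [
    "<i>abc</i> def <i>ghi</i>",  # italics stops and restarts at the words
    "<i>abc </i>def<i> ghi</i>",  # the spaces are inside the italic runs
    "<i>abc <i>def</i> ghi</i>",  # italics nested inside italics
]

texts = {visible_text(v) for v in variants}

print(texts)               # one distinct visible text
print(len(set(variants)))  # but three distinct structures
```

A converter only ever sees the structure, not the text alone, so each variant can end up rendered differently in the ebook.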

The conversion process should be deterministic, and it probably is for your purposes. However, Knuth did the calculations in TeX in integers because he argued that the floating-point units in computers (at that time) were not deterministic enough because of rounding differences. IIRC this was to get the same result on different computers, so this is, hopefully, only an example of the extremes to which people go to control the repeatability of their output.
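As a rough illustration of why integer arithmetic is easier to reproduce, the sketch below adds the same three widths in two different orders. This shows sensitivity to evaluation order on a single machine, which is only a stand-in for the cross-machine rounding differences mentioned above. TeX does store lengths as whole numbers of scaled points (65536 sp = 1 pt); the widths and variable names are made up for the example.

```python
SP_PER_PT = 65536  # TeX's scaled points per point

# Hypothetical widths in points, stored as floats.
widths_pt = [0.1, 0.2, 0.3]

# Floating point: the total depends on the order of the additions.
forward_float = sum(widths_pt)
backward_float = sum(reversed(widths_pt))

# Integers: round once to scaled points, then every order agrees exactly.
widths_sp = [round(w * SP_PER_PT) for w in widths_pt]
forward_int = sum(widths_sp)
backward_int = sum(reversed(widths_sp))

print(forward_float == backward_float)
print(forward_int == backward_int)
```

The rounding happens once, when the value is converted to scaled points; after that, sums of integers are exact regardless of how they are grouped.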


For conversion of mark-up, have a look at Sphinx or pandoc. The former can generate output in PDF without having to install LaTeX; the latter has more input and output formats. Both can generate HTML, which you might use as a basis for MOBI, as I have not seen any direct generation of MOBI from mark-up languages (but since I don't have a MOBI device, I have not really looked for that).
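If you want to script the conversion, something along these lines might do. It is a sketch that assumes pandoc is installed and on your PATH; the file names `mybook.md` and `mybook.epub` and the helper names are hypothetical. pandoc picks the output format from the extension of the file given to `-o`.

```python
import shutil
import subprocess


def pandoc_command(source, target):
    """Build the argument list for a plain pandoc conversion."""
    return ["pandoc", source, "-o", target]


def convert(source, target):
    """Run pandoc if it is available; return False when it is not installed."""
    if shutil.which("pandoc") is None:
        return False
    # check=True raises CalledProcessError if pandoc reports a failure.
    subprocess.run(pandoc_command(source, target), check=True)
    return True


# Hypothetical file names, shown here without running the conversion.
command = pandoc_command("mybook.md", "mybook.epub")
print(" ".join(command))
```

Calling `convert("mybook.md", "mybook.epub")` would then produce the EPUB next to the source file, provided that file exists.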

¹ I use Word here, but this applies to OpenOffice Writer and similar editors as well.

² I fear too many of us have edited documents only to have new text at the cursor take on some font or formatting that we did not expect, the result of some program-internal formatting lurking invisibly.

This post was sourced from https://ebooks.stackexchange.com/a/1107. It is licensed under CC BY-SA 3.0.

+1
−0

I think you should take a look at Sigil, a program for creating/editing epub-format ebooks.

On the download page, there is an installer for Windows and an image for Mac.

Sigil is open source and therefore free. You can input text from whatever you are using and edit either using a code editor or a WYSIWYG editor, which makes it pretty easy. The WYSIWYG editor has basic formatting options, easy chapter breaks and even the means for generating a table of contents.

This post was sourced from https://ebooks.stackexchange.com/a/1108. It is licensed under CC BY-SA 3.0.

+1
−0

AsciiDoc is a nice compromise between bare bones (like Markdown) and full featured (like TeX).

There are lots of examples online, but the basic idea is this: you write in plain text, using as-obvious-as-reasonable styles to indicate headers, italics, lists, etc. Then you can export that format to PDF, HTML, EPUB, etc.

A quick example to get you started:

= The Book Title

== The first chapter
Nec vitae mus fringilla eu vel pede sed pellentesque. Nascetur fugiat
nobis. Eu felis id mauris sollicitudin ut. Sem volutpat feugiat.
Ornare convallis urna vitae.

Nec mauris sed aliquam nam mauris dolor lorem imperdiet.

== The second chapter
Ut suspendisse nulla. Auctor felis facilisis. Rutrum vivamus nec
lectus porttitor dui dapibus eu ridiculus tempor sodales et. Sit a
cras. Id tellus cubilia erat.

Quisque nullam et. Blandit dui tempor. Posuere in elit diam egestas
sem vivamus vel ac.

Then you can generate an EPUB formatted book file using AsciiDoc’s a2x wrapper:

a2x -fepub -dbook mybook.txt

This post was sourced from https://ebooks.stackexchange.com/a/1110. It is licensed under CC BY-SA 3.0.

+1
−0

If you are just interested in basic formatting, Markdown is by far the simplest choice. You can write your text on any platform and then convert it.

This post was sourced from https://ebooks.stackexchange.com/a/1109. It is licensed under CC BY-SA 3.0.
