This 2010 article has been updated for the release of EPUB 3. Read the newer version here: EPUB: an introduction

1. What is “ePub”?

ePub is a format for digital books. It is an XML-based format defined by the International Digital Publishing Forum (IDPF), which is the international standards body for digital publishing.

2. What is XML format, and what does it mean for books?

XML is a text-based markup language (not a programming language) that lets content be tagged semantically (ie meaningfully). It enables information to be structured in a standard way so that it can be easily exchanged between systems without knowledge of specific hardware or software. That means the same XML file can be used on multiple platforms and displayed in different ways, as long as it conforms to the rules that define that type of document. These rules are commonly set out in a Document Type Definition (DTD) or a similar schema. A file that follows XML’s basic syntax rules is said to be well-formed, and a file that also conforms to its DTD is said to be valid.
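To make this concrete, here is a minimal Python sketch using only the standard library. The element names in the sample (book, section, heading, para) are made up for illustration and are not part of ePub. Note that the standard library parser only checks that the XML is well-formed, ie that it follows XML's basic syntax rules; checking validity against a DTD needs a validating parser, which is not shown here.

```python
import xml.etree.ElementTree as ET

# Hypothetical book fragment; the element names are illustrative only.
SAMPLE = """<book>
  <title>An Example</title>
  <section id="s1">
    <heading>Opening</heading>
    <para>First paragraph.</para>
  </section>
</book>"""


def is_well_formed(xml_text: str) -> bool:
    """Return True if xml_text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


def headings(xml_text: str) -> list[str]:
    """Collect the text of every <heading> element, in document order."""
    root = ET.fromstring(xml_text)
    return [h.text or "" for h in root.iter("heading")]
```

Because the tags say what each piece of text is, any system can pull out the headings without knowing how the book will eventually be displayed.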

ePub books are thus collections of text-based files, bundled into a single ZIP archive that ends in the extension .epub. The content is structured according to the ePub specifications, as defined by the IDPF. Valid .epub files can then be used on any platform or device that supports ePub.
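In practice, an .epub file is a ZIP archive holding several text-based files: a plain "mimetype" entry stored first and uncompressed, a META-INF/container.xml file that points to the package file, and the content itself. The Python sketch below assembles such a skeleton in memory. The file names under OEBPS/, the title and the identifier are placeholder assumptions, and the package file is abridged, so the result is an illustration of the layout and not a book that would pass validation.

```python
import io
import zipfile

# Points reading software at the package file inside the archive.
CONTAINER_XML = """<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""

# Abridged package file (assumption: placeholder title and identifier).
# A real ePub 2 package also needs a table-of-contents (NCX) entry.
CONTENT_OPF = """<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="2.0" unique-identifier="bookid">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title>Example</dc:title>
    <dc:identifier id="bookid">urn:uuid:00000000-0000-0000-0000-000000000000</dc:identifier>
    <dc:language>en</dc:language>
  </metadata>
  <manifest>
    <item id="ch1" href="chapter1.xhtml" media-type="application/xhtml+xml"/>
  </manifest>
  <spine>
    <itemref idref="ch1"/>
  </spine>
</package>
"""


def build_skeleton(chapter_xhtml: str) -> bytes:
    """Return the bytes of a skeleton .epub archive containing one chapter."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        # The mimetype entry goes first and is stored without compression.
        zf.writestr("mimetype", "application/epub+zip",
                    compress_type=zipfile.ZIP_STORED)
        zf.writestr("META-INF/container.xml", CONTAINER_XML,
                    compress_type=zipfile.ZIP_DEFLATED)
        zf.writestr("OEBPS/content.opf", CONTENT_OPF,
                    compress_type=zipfile.ZIP_DEFLATED)
        zf.writestr("OEBPS/chapter1.xhtml", chapter_xhtml,
                    compress_type=zipfile.ZIP_DEFLATED)
    return buf.getvalue()
```

Writing the returned bytes to a file named, say, example.epub and opening it with any ZIP tool shows the same four entries.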

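The structure and language for the XML files defined by ePub are specific to books. For example, the ePub package file records a book’s metadata (such as title, author and language), its component files and their reading order, while the text itself is marked up in XHTML with tags such as <h1> for headings and <p> for paragraphs. By contrast, scholarly journal DTDs define journal sections by <abstract> and <article> tags (there’s more to it than this, but you get the idea – this is what is meant by a semantic language).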

3. How are ePub files different from other types of digital book files?

The main difference between ePubs and other types of digital book files, from a user perspective, is that ePubs do not define pages, so that the text can easily reflow and resize to suit different types of digital book readers – it’s a kind of “one size fits all” format.

Of course, there are some other types of digital book files that also reflow, eg other types of XML files and PDF files that are set to reflow. However, the two most common digital book formats are PDF, which is deliberately set not to reflow so that page integrity is maintained, and HTML, which tends to be defined by the page (though there may be reflow within browser windows).

4. What are the effects of no page definition on production and layout?

The lack of page definition has several implications:

  • there are no running headers or footers (though these can be set at application level to display the current section);
  • there are no set page numbers that correspond to the screen view: the reader’s place within the text is shown in different ways depending on the application;
  • layout is simplified: there is only ever one column of text, and images are not positioned in particular places alongside text, but are “in line”;
  • footnotes cannot be placed at the bottom of the relevant page, but become endnotes, placed at the end of the chapter or book, and are linked to the number in the text.

If you are used to producing layouts that depend heavily on positioning of text, images and other elements within the page, you may need to rethink this approach when doing ePub books.

5. Do I need special software and systems to produce an .epub file?

There is no single answer to this; it depends on what systems and software you are already using, your technical and production expertise, and how fussy you are about the outcome.

For example, industry-standard typesetting applications (eg InDesign) either already support .epub file export, or will do so soon. You still need to take account of the design factors listed above, and also make sure that all your text uses styles, including character styles for italics, bold, superscript, etc.

If you outsource your typesetting, your typesetter may be able to produce ePub files for you, or you can choose to outsource ePub conversion of your files to another service provider. There is no shortage of ePub conversion suppliers.

There are also low-cost applications that offer to convert PDFs to ePubs. Of course, the quality of the output varies, and also depends on the quality of the file supplied.

Large publishers may choose to develop their own in-house conversion systems, especially if they already have XML-based systems and in-house technical expertise, while small to medium publishers might prefer to work with technology partners on in-house solutions.

Whether you produce your own ePub files or have someone else do it for you, you need to make sure the process includes validation (see item 2 above) and user testing.
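As a rough illustration of what an automated check can look at, the Python sketch below runs a few basic structural tests on an .epub file: that it is a ZIP archive, that the "mimetype" entry comes first with the expected content, and that META-INF/container.xml names a package file that is actually present. The function name check_epub and the wording of the messages are my own assumptions. This is only a sanity check and not a substitute for full validation with a dedicated validator such as EPUBCheck.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by META-INF/container.xml.
CONTAINER_NS = "{urn:oasis:names:tc:opendocument:xmlns:container}"


def check_epub(data: bytes) -> list[str]:
    """Return a list of problems found; an empty list means these basic checks passed."""
    problems = []
    try:
        zf = zipfile.ZipFile(io.BytesIO(data))
    except zipfile.BadZipFile:
        return ["not a ZIP archive"]
    with zf:
        names = zf.namelist()
        # The mimetype entry should be the first one in the archive.
        if not names or names[0] != "mimetype":
            problems.append("first entry is not 'mimetype'")
        elif zf.read("mimetype") != b"application/epub+zip":
            problems.append("unexpected mimetype content")
        if "META-INF/container.xml" not in names:
            problems.append("missing META-INF/container.xml")
            return problems
        try:
            root = ET.fromstring(zf.read("META-INF/container.xml"))
        except ET.ParseError:
            problems.append("container.xml is not well-formed XML")
            return problems
        # container.xml should point at a package file that exists in the archive.
        rootfile = root.find(f"{CONTAINER_NS}rootfiles/{CONTAINER_NS}rootfile")
        if rootfile is None:
            problems.append("container.xml names no package file")
        elif rootfile.get("full-path") not in names:
            problems.append("package file listed in container.xml is missing")
    return problems
```

Passing the bytes of a file (for example, the result of reading it in binary mode) returns a list of problems, which is empty when these particular checks pass. User testing on real reading devices is still needed, because a structurally sound file can display poorly.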

6. Are all books suitable for ePub?

At present, simple text-based books such as novels and standard non-fiction books are easiest to produce as ePubs because of the design considerations noted above (item 4). They are also easiest to read on ePub reading devices as they lend themselves to immersive reading.

Graphic-rich textbooks and illustrated children’s books are probably the most difficult to produce as ePubs right now, and publishers of such books are producing digital versions in other ways that incorporate multimedia (notably multimedia “apps”).

Publishers of scientific, legal and reference material are well advanced in producing XML-based files published on their own platforms and in web-based browser platforms, with greater functionality than currently offered by ePub and ePub readers (see item 7). It is difficult to see much advantage in migrating these to the ePub format at present.

However, the ePub standard, the software and the hardware continue to develop rapidly, so that there is reason to suppose that all books may work well as ePubs in the not too distant future.

7. Where can I see examples of ePub books?

ePub books need ePub compatible software (aka applications or “apps”). There are a number of such apps available, which are either pre-installed on a hardware device (desktop, laptop, tablet or ebook reader) or available for download onto a hardware device. This is not as complicated as it sounds – it is in fact no different from needing Microsoft Word to read .doc files, Adobe Reader to read .pdf files, or a web browser such as Firefox or Safari to read .html files (see previous post: Ebook formats: the basics).

The following alphabetical list of ePub applications makes no distinction between downloadable and pre-installed software, and takes no account of commercial availability. It is also not exhaustive. Most have free books to view, as well as some for purchase.

  • Adobe Digital Editions
  • Apple iBooks
  • Barnes & Noble’s Nook
  • Kobo
  • Sony eReader
  • Stanza

Free ePubs are also available from Project Gutenberg, but you will need one of the above ePub readers to use them. There are other digital book formats around that are not ePub, most notably Amazon’s Kindle format, Mobipocket and DAISY. Conversion from ePub format to another markup-based format is relatively straightforward.

8. What can I do with my ePub book files once I have produced them?

Assuming you wish to sell your books, you will need an agreement in place with one or more ePub book vendors (unless you are a major publisher, this is easier said than done, and the topic of a future post). Alternatively, you may have your own sales platform in place.

If you wish to give your ePub books away, then you only need to place them online for readers to download, as Project Gutenberg has (with similar advice on formats and readers).

Where can I find out more?

Information on the ePub standard is found on the IDPF website.

Help with ePub production or conversion is readily available online. You may want to join one of the many ebook and ePub groups on LinkedIn and find an expert and/or ask the group for advice. If you are a publisher, author or designer, your relevant industry association should be able to help.

For assistance with downloading, buying or using ePub books, or using ePub readers, see the specific vendor websites.

© 2010 Linda Kythe Nix. All rights reserved.