1) What is ePub?

• EPUB (short for electronic publication) is a free and open, yet powerful,
e-book standard adopted by the International Digital Publishing Forum (IDPF).
• Files have the extension .epub.
• EPUB is designed for reflowable content, meaning that the text display can
be optimized for the particular display device.
• The format is meant to function as a single format that publishers and
conversion houses can use in-house, as well as for distribution and sale.
• It supersedes the Open eBook standard.

2) What does an ePub file contain?

An ePub is not a single flat file but a ZIP archive: the content files are packed
into one ZIP archive that is given the .epub extension (e.g. 123456789.epub).
The container layout follows the OCF (Open Container Format) 1.0 specification.

A typical OCF is a zip file that might look like:

mimetype
META-INF/
    container.xml
    [manifest.xml]
    [metadata.xml]
    [signatures.xml]
    [encryption.xml]
    [rights.xml]
OEBPS/
    Great Expectations.opf
    cover.html
    chapters/
        chapter01.html
        chapter02.html
        … other HTML files for the remaining chapters …

The following are the ingredients of the ePub file:

1. (file) mimetype
The first file in the ZIP Container MUST be a file by the ASCII name of
‘mimetype’ which holds the MIME type for the ZIP Container (i.e.,
“application/epub+zip” as a 20 character ASCII string; no padding, CR/LF,
white-space or case change). The file MUST NOT be compressed or encrypted,
and there MUST NOT be an extra field in its ZIP header.
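
The mimetype rules above can be sketched with Python's standard zipfile module. This is only a minimal illustration, not a complete packager: the helper name build_epub and the placeholder container.xml content are assumptions made for the example.

```python
import io
import zipfile

MIMETYPE = "application/epub+zip"  # exactly 20 ASCII characters


def build_epub(files: dict[str, bytes]) -> bytes:
    """Pack `files` (archive path -> content) into an EPUB container in memory."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        # The mimetype entry is written first, stored (not compressed),
        # and with an empty extra field (the ZipInfo default).
        info = zipfile.ZipInfo("mimetype")
        info.compress_type = zipfile.ZIP_STORED
        archive.writestr(info, MIMETYPE.encode("ascii"))
        # All remaining entries may be compressed.
        for path, content in files.items():
            archive.writestr(path, content, compress_type=zipfile.ZIP_DEFLATED)
    return buffer.getvalue()


# Placeholder content only; a real container.xml is shown in the META-INF section.
epub_bytes = build_epub({"META-INF/container.xml": b"<container/>"})
```

Writing the mimetype entry before anything else is what lets a tool identify the file type by reading a fixed position near the start of the archive.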

2. (folder) META-INF
This folder holds the container.xml file.
container.xml:
The sole purpose of container.xml is to point out where the OPF file lies.

<?xml version="1.0" encoding="UTF-8"?>
<container xmlns="urn:oasis:names:tc:opendocument:xmlns:container" version="1.0">
  <rootfiles>
    <rootfile full-path="OEBPS/package.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>

Note that the full-path attribute of the <rootfile> element indicates that
package.opf is in the OEBPS folder.
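
As a rough sketch of how a reading system might use this file, the following Python reads container.xml and returns the path of the OPF. The function name find_opf_path is hypothetical; the XML is the container.xml from this section.

```python
import xml.etree.ElementTree as ET

OCF_NS = {"ocf": "urn:oasis:names:tc:opendocument:xmlns:container"}
OPF_MEDIA_TYPE = "application/oebps-package+xml"

CONTAINER_XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<container xmlns="urn:oasis:names:tc:opendocument:xmlns:container" version="1.0">
  <rootfiles>
    <rootfile full-path="OEBPS/package.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""


def find_opf_path(container_xml: bytes) -> str:
    """Return the full-path of the first rootfile that is an OPF package."""
    root = ET.fromstring(container_xml)
    for rootfile in root.iterfind("ocf:rootfiles/ocf:rootfile", OCF_NS):
        if rootfile.get("media-type") == OPF_MEDIA_TYPE:
            return rootfile.get("full-path")
    raise ValueError("container.xml does not reference an OPF file")


opf_path = find_opf_path(CONTAINER_XML)
```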

3. (folder) OEBPS (under the OPS (Open Publication Structure) 2.0 specification):

This folder may alternatively be named OPS. It contains all the HTML
files, the OPF (package.opf), the NCX (toc.ncx), and two sub-folders:
(a) css and (b) images.

OPF (package.opf)
The OPF specification's purpose is to "define the mechanism by which the
various components of an OPS (OEBPS) publication are tied together and
provides additional structure and meaning to the electronic publication." This is
accomplished by two XML files with the extensions .opf and .ncx. The mimetype
of the OPF file is application/oebps-package+xml.

The skeleton of package.opf will look like this:

<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="2.0" unique-identifier="p9780833036117">

<metadata xmlns:opf="http://www.idpf.org/2007/opf"
xmlns:dc="http://purl.org/dc/elements/1.1/">
<dc:title> </dc:title>
<dc:creator opf:role="author" opf:file-as=""> </dc:creator>
<dc:publisher> </dc:publisher>
<dc:rights> </dc:rights>
<dc:identifier id=" " opf:scheme="ISBN"></dc:identifier>
<dc:language> </dc:language>
<meta name="cover" content="cover-image"/>
<meta name="srp-usd" content="0167.40"/>
</metadata>

<manifest>
</manifest>

<spine>
</spine>

<guide>
</guide>

</package>

(i) Package:

The package element must have the unique-identifier attribute:

<package xmlns="http://www.idpf.org/2007/opf" version="2.0" unique-identifier="p9780833036117">

In this example the value is based on the ISBN of the book.

(ii) Metadata
The metadata element contains all the metadata information for a particular
EPUB file. Three metadata tags are mandatory, though there are many more
available: title, language, identifier.
title contains the title of the book.
language contains the language of the book's contents.
identifier contains a unique identifier for the book, such as its ISBN or a URL.
The identifier's id attribute should equal the unique-identifier attribute from the
package element.

Example of Metadata:

<metadata xmlns:opf="http://www.idpf.org/2007/opf"
xmlns:dc="http://purl.org/dc/elements/1.1/">
<dc:title>Training</dc:title>
<dc:creator opf:role="author" opf:file-as="RUSSELL, GLENN,">Glenn, Russell</dc:creator>
<dc:publisher>RAND Corporation</dc:publisher>
<dc:rights>&#169; Copyright 2003 City of Los Angeles</dc:rights>
<dc:identifier id="p9780833036117" opf:scheme="ISBN">9780833036117</dc:identifier>
<dc:language>en-US</dc:language>
<meta name="cover" content="cover-image"/>
<meta name="srp-usd" content="0167.40"/>
</metadata>

(The title, identifier, and language elements are the mandatory ones.)
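
The metadata rules described above (three mandatory elements, and an identifier whose id matches the package's unique-identifier) can be checked mechanically. The sketch below is one possible way to do so in Python; check_metadata is a hypothetical helper, and the sample is a trimmed copy of the example metadata.

```python
import xml.etree.ElementTree as ET

NS = {
    "opf": "http://www.idpf.org/2007/opf",
    "dc": "http://purl.org/dc/elements/1.1/",
}

SAMPLE_OPF = b"""<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="2.0" unique-identifier="p9780833036117">
  <metadata xmlns:opf="http://www.idpf.org/2007/opf" xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title>Training</dc:title>
    <dc:identifier id="p9780833036117" opf:scheme="ISBN">9780833036117</dc:identifier>
    <dc:language>en-US</dc:language>
  </metadata>
</package>
"""


def check_metadata(opf_xml: bytes) -> list[str]:
    """Return a list of problems found in the OPF metadata (empty if none)."""
    package = ET.fromstring(opf_xml)
    metadata = package.find("opf:metadata", NS)
    if metadata is None:
        return ["missing metadata element"]
    problems = []
    # title, language and identifier must each be present and non-empty.
    for name in ("title", "language", "identifier"):
        elements = metadata.findall(f"dc:{name}", NS)
        if not any((element.text or "").strip() for element in elements):
            problems.append(f"missing dc:{name}")
    # One dc:identifier must carry the id named by package/@unique-identifier.
    unique_id = package.get("unique-identifier")
    identifier_ids = {e.get("id") for e in metadata.findall("dc:identifier", NS)}
    if unique_id is None or unique_id not in identifier_ids:
        problems.append("no dc:identifier matches the unique-identifier attribute")
    return problems


problems = check_metadata(SAMPLE_OPF)
```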

(iii) Manifest:

The manifest element lists all the files contained in the package. Each file is
represented by an item element, which has the attributes id, href, and media-type. All
XHTML (content documents), stylesheets, images or other media, embedded
fonts, and the .ncx file should be listed here. The only files that should not be
listed are the .opf file itself, container.xml, and mimetype.

<item id="cover" href="cover.xml" media-type="application/xhtml+xml"/>

Example of Manifest:
<manifest>
<item id="cover" href="cover.xml" media-type="application/xhtml+xml"/>
<item id="cover-image" href="images/9780833036117.jpg" media-type="image/jpeg"/>
<item id="style" href="css/stylesheet.css" media-type="text/css"/>
<item id="ncx" href="toc.ncx" media-type="application/x-dtbncx+xml"/>
<item id="page-template" href="css/page-template.xpgt" media-type="application/adobe-page-template+xml"/>
<item id="copyright" href="copyright.html" media-type="application/xhtml+xml"/>
<item id="toc" href="toc.html" media-type="application/xhtml+xml"/>
<item id="titlepage" href="titlepage.html" media-type="application/xhtml+xml"/>
<item id="Art_tab1.2.jpg" href="images/Art_tab1.2.jpg" media-type="image/jpeg"/>
<item id="Art_fig2.1.jpg" href="images/Art_fig2.1.jpg" media-type="image/jpeg"/>
<item id="Art_P156.jpg" href="images/Art_P156.jpg" media-type="image/jpeg"/>
<item id="Art_P157.jpg" href="images/Art_P157.jpg" media-type="image/jpeg"/>
<item id="preface001" href="preface001.html" media-type="application/xhtml+xml"/>
<item id="frontmatter001" href="frontmatter001.html" media-type="application/xhtml+xml"/>
<item id="frontmatter002" href="frontmatter002.html" media-type="application/xhtml+xml"/>
<item id="introduction" href="introduction.html" media-type="application/xhtml+xml"/>
<item id="acknowledgments" href="acknowledgments.html" media-type="application/xhtml+xml"/>
<item id="chapter001" href="chapter001.html" media-type="application/xhtml+xml"/>
<item id="chapter002" href="chapter002.html" media-type="application/xhtml+xml"/>
<item id="glossary" href="glossary.html" media-type="application/xhtml+xml"/>
<item id="appendix001" href="appendix001.html" media-type="application/xhtml+xml"/>
<item id="appendix002" href="appendix002.html" media-type="application/xhtml+xml"/>
<item id="bibliography" href="bibliography.html" media-type="application/xhtml+xml"/>
<item id="footnotes" href="footnotes.html" media-type="application/xhtml+xml"/>
</manifest>

If you embed a font, it should be an OpenType font, and it needs to be listed within
the manifest element like this:

<item id="myfont" href="css/myfont.otf" media-type="application/x-font-opentype"/>
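
Since every item repeats the same id/href/media-type pattern, manifest entries lend themselves to generation. The following is a small sketch, assuming the media type can be inferred from the file extension; the extension table and the manifest_item helper are illustrative and cover only types mentioned in this document.

```python
from pathlib import PurePosixPath
from xml.sax.saxutils import quoteattr

# Extension -> media type, limited to types that appear in this document.
MEDIA_TYPES = {
    ".html": "application/xhtml+xml",
    ".css": "text/css",
    ".ncx": "application/x-dtbncx+xml",
    ".jpg": "image/jpeg",
    ".png": "image/png",
    ".gif": "image/gif",
    ".svg": "image/svg+xml",
    ".otf": "application/x-font-opentype",
    ".xpgt": "application/adobe-page-template+xml",
}


def manifest_item(item_id: str, href: str) -> str:
    """Build one <item> element for the manifest, escaping attribute values."""
    suffix = PurePosixPath(href).suffix.lower()
    if suffix not in MEDIA_TYPES:
        raise ValueError(f"no media type known for {href!r}")
    return (
        f"<item id={quoteattr(item_id)} href={quoteattr(href)} "
        f"media-type={quoteattr(MEDIA_TYPES[suffix])}/>"
    )


chapter_item = manifest_item("chapter001", "chapter001.html")
font_item = manifest_item("myfont", "css/myfont.otf")
```

Raising an error for unknown extensions, rather than guessing, keeps unsupported files from slipping into the manifest with a wrong media type.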

(iv) Spine:

The spine element lists all the XHTML content documents in their linear reading
order. Also, any content document that can be reached through linking or the
table of contents must be listed as well. The toc attribute of spine must contain the
id of the .ncx file listed in the manifest. Each itemref element's idref is set to the id of
its respective content document.
Example of Spine:

<spine toc="ncx">
<itemref linear="no" idref="cover"/>
<itemref linear="yes" idref="toc"/>
<itemref linear="yes" idref="titlepage"/>
<itemref linear="yes" idref="copyright"/>
<itemref linear="yes" idref="preface001"/>
<itemref linear="yes" idref="frontmatter001"/>
<itemref linear="yes" idref="frontmatter002"/>
<itemref linear="yes" idref="introduction"/>
<itemref linear="yes" idref="acknowledgments"/>
<itemref linear="yes" idref="glossary"/>
<itemref linear="yes" idref="chapter001"/>
<itemref linear="yes" idref="chapter002"/>
<itemref linear="yes" idref="appendix001"/>
<itemref linear="yes" idref="appendix002"/>
<itemref linear="yes" idref="bibliography"/>
<itemref linear="yes" idref="footnotes"/>
</spine>
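
The two spine rules (toc must name the .ncx item, and every idref must resolve to a manifest id) can be verified with a short script. This is a sketch under the assumption that the OPF is available as bytes; check_spine is a hypothetical name and the sample OPF is cut down to the manifest and spine only.

```python
import xml.etree.ElementTree as ET

OPF_NS = {"opf": "http://www.idpf.org/2007/opf"}
NCX_MEDIA_TYPE = "application/x-dtbncx+xml"

SAMPLE_OPF = b"""<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="2.0" unique-identifier="p9780833036117">
  <manifest>
    <item id="ncx" href="toc.ncx" media-type="application/x-dtbncx+xml"/>
    <item id="cover" href="cover.xml" media-type="application/xhtml+xml"/>
    <item id="chapter001" href="chapter001.html" media-type="application/xhtml+xml"/>
  </manifest>
  <spine toc="ncx">
    <itemref linear="no" idref="cover"/>
    <itemref linear="yes" idref="chapter001"/>
  </spine>
</package>
"""


def check_spine(opf_xml: bytes) -> list[str]:
    """Return a list of spine problems (empty if the spine is consistent)."""
    package = ET.fromstring(opf_xml)
    # Map each manifest id to its media type.
    manifest = {
        item.get("id"): item.get("media-type")
        for item in package.iterfind("opf:manifest/opf:item", OPF_NS)
    }
    spine = package.find("opf:spine", OPF_NS)
    if spine is None:
        return ["missing spine element"]
    problems = []
    # The toc attribute must point at the manifest item for the .ncx file.
    if manifest.get(spine.get("toc")) != NCX_MEDIA_TYPE:
        problems.append("spine toc attribute does not reference the .ncx item")
    # Every itemref must refer to an id declared in the manifest.
    for itemref in spine.iterfind("opf:itemref", OPF_NS):
        idref = itemref.get("idref")
        if idref not in manifest:
            problems.append(f"itemref {idref!r} is not listed in the manifest")
    return problems


spine_problems = check_spine(SAMPLE_OPF)
```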

(v) Guide:

The guide element is an optional element for the purpose of identifying
fundamental structural components of the book. Each reference element has the
attributes type, title, and href. Files referenced in href must be listed in the
manifest, and the href may include a fragment identifier (for example,
chapter001.html#h1-2).

Example of Guide:

<guide>
<reference type="copyright-page" title="Copyright" href="copyright.html"/>
<reference type="toc" title="Table of Contents" href="toc.html"/>
<reference type="text" title="Begin Reading" href="chapter001.html"/>
</guide>

The following are the values allowed for the type attribute:

cover -- the book cover(s), jacket information, etc.
title-page -- page with possibly title, author, publisher, and other metadata
toc -- table of contents
index -- back-of-book style index
glossary
acknowledgements
bibliography
colophon
copyright-page
dedication
epigraph
foreword
loi -- list of illustrations
lot -- list of tables
notes
preface
text -- First "real" page of content (e.g. "Chapter 1")

NCX (toc.ncx)

The .ncx file (Navigation Control file for XML) exposes the hierarchical structure of a
publication to allow the user to navigate through it. The NCX is similar to a table
of contents in that it enables the reader to jump directly to any of the major
structural elements of the document, i.e. part, chapter, or section, but it will often
contain more elements of the document than the publisher chooses to include in
the original print table of contents. Its development was motivated by the need
to provide quick access to the main structural elements of a document without
the need to parse the entire document. The specification for .ncx was developed
for Digital Talking Book (DTB), and is not a part of the EPUB specification. The
.ncx file has a mimetype of application/x-dtbncx+xml.

The following four metadata items are mandatory for all NCX documents:

<meta name="dtb:uid" content="123456789X"/> <!-- same as in .opf -->
<meta name="dtb:depth" content="1"/> <!-- 1 or higher -->
<meta name="dtb:totalPageCount" content="0"/> <!-- must be 0 -->
<meta name="dtb:maxPageNumber" content="0"/> <!-- must be 0 -->

The skeleton of the NCX will be like this:


<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE ncx PUBLIC "-//NISO//DTD ncx 2005-1//EN"
"http://www.daisy.org/z3986/2005/ncx-2005-1.dtd">

<ncx version="2005-1" xml:lang="en" xmlns="http://www.daisy.org/z3986/2005/ncx/">

<head>
<meta name="dtb:uid" content="123456789X"/>
<meta name="dtb:depth" content="1"/>
<meta name="dtb:totalPageCount" content="0"/>
<meta name="dtb:maxPageNumber" content="0"/>
</head>

<docTitle>
<text>x x x</text>
</docTitle>

<docAuthor>
<text>x x, x</text>
</docAuthor>

<navMap>
<navPoint class="chapter" id="chapter1" playOrder="1">
<navLabel><text>Chapter 1</text></navLabel>
<content src="chapter1.xhtml"/>
</navPoint>
</navMap>

</ncx>

Note that the values of the docTitle, docAuthor, and meta name="dtb:uid"
elements should match their analogs in the .opf file. Also, the meta
name="dtb:depth" element is set equal to the depth of the navMap element. navPoint
elements can be nested to create a hierarchical table of contents. navLabel's
content is the text that will appear in the table of contents generated by reading
systems that use the .ncx. navPoint's content element points to a content document
listed in the manifest and can also include a fragment identifier (for example,
chapter001.html#h1-2). Each navPoint carries the attributes playOrder and id.
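
To make the dtb:depth rule concrete, the sketch below computes the nesting depth of navMap and compares it with the declared value. It also checks that playOrder runs 1, 2, 3, ... in document order, which is a simplifying assumption that holds when every navPoint has a distinct target. The names nav_depth and check_ncx are hypothetical, and the sample NCX is a small variation on the skeleton above with one nested navPoint added.

```python
import xml.etree.ElementTree as ET

NCX_URI = "http://www.daisy.org/z3986/2005/ncx/"
NCX_NS = {"ncx": NCX_URI}

SAMPLE_NCX = b"""<?xml version="1.0" encoding="UTF-8"?>
<ncx xmlns="http://www.daisy.org/z3986/2005/ncx/" version="2005-1" xml:lang="en">
  <head>
    <meta name="dtb:uid" content="123456789X"/>
    <meta name="dtb:depth" content="2"/>
    <meta name="dtb:totalPageCount" content="0"/>
    <meta name="dtb:maxPageNumber" content="0"/>
  </head>
  <docTitle><text>x x x</text></docTitle>
  <navMap>
    <navPoint id="chapter1" playOrder="1">
      <navLabel><text>Chapter 1</text></navLabel>
      <content src="chapter1.xhtml"/>
      <navPoint id="chapter1-s1" playOrder="2">
        <navLabel><text>Section 1</text></navLabel>
        <content src="chapter1.xhtml#s1"/>
      </navPoint>
    </navPoint>
  </navMap>
</ncx>
"""


def nav_depth(element: ET.Element) -> int:
    """Number of navPoint levels nested below `element`."""
    children = element.findall("ncx:navPoint", NCX_NS)
    if not children:
        return 0
    return 1 + max(nav_depth(child) for child in children)


def check_ncx(ncx_xml: bytes) -> list[str]:
    """Return a list of NCX problems (empty if none were found)."""
    root = ET.fromstring(ncx_xml)
    nav_map = root.find("ncx:navMap", NCX_NS)
    if nav_map is None:
        return ["missing navMap element"]
    problems = []
    metas = {
        meta.get("name"): meta.get("content")
        for meta in root.iterfind("ncx:head/ncx:meta", NCX_NS)
    }
    declared = metas.get("dtb:depth")
    actual = nav_depth(nav_map)
    if declared != str(actual):
        problems.append(f"dtb:depth is {declared!r} but navMap depth is {actual}")
    # iter() walks navPoints in document order, parents before children.
    orders = [p.get("playOrder") for p in nav_map.iter(f"{{{NCX_URI}}}navPoint")]
    if orders != [str(n) for n in range(1, len(orders) + 1)]:
        problems.append("playOrder values are not 1, 2, 3, ... in document order")
    return problems


ncx_problems = check_ncx(SAMPLE_NCX)
```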

The following is an example of a typical toc.ncx. The areas marked . . . have been
abridged because of repeated elements.

<?xml version="1.0" encoding="UTF-8"?>
<ncx xmlns="http://www.daisy.org/z3986/2005/ncx/" version="2005-1" xml:lang="en-US">
<head>
<meta name="dtb:uid" content="p9780833036117"/>
<meta name="dtb:depth" content="2"/>
<meta name="dtb:totalPageCount" content="0"/>
<meta name="dtb:maxPageNumber" content="0"/>
</head>
<docTitle><text>Training the 21st Century Police Officer</text></docTitle>
<docAuthor><text>Glenn, Russell</text></docAuthor>
<navMap>
<navPoint playOrder="1" id="title-page"><navLabel><text>Title
Page</text></navLabel><content src="titlepage.html"/></navPoint>
<navPoint playOrder="2" id="copyright"><navLabel><text>Copyright
Page</text></navLabel><content src="copyright.html"/></navPoint>
<navPoint playOrder="3" id="toc"><navLabel><text>Table of
Contents</text></navLabel><content src="toc.html"/></navPoint>
<navPoint playOrder="4"
id="preface001"><navLabel><text>PREFACE</text></navLabel><content
src="preface001.html"/>
<navPoint playOrder="5" id="preface001-h1-1"><navLabel><text>RAND PUBLIC SAFETY AND
JUSTICE</text></navLabel><content src="preface001.html#h1-1"/></navPoint></navPoint>
<navPoint playOrder="6" id="frontmatter001"><navLabel><text>Table of
Figures</text></navLabel><content src="frontmatter001.html"/></navPoint>
<navPoint playOrder="7" id="frontmatter002"><navLabel><text>Table of
Tables</text></navLabel><content src="frontmatter002.html"/></navPoint>
<navPoint playOrder="8"
id="introduction"><navLabel><text>SUMMARY</text></navLabel><content
src="introduction.html"/></navPoint>
<navPoint playOrder="9"
id="acknowledgments"><navLabel><text>ACKNOWLEDGMENTS</text></navLabel><content
src="acknowledgments.html"/></navPoint>
<navPoint playOrder="10"
id="glossary"><navLabel><text>ACRONYMS</text></navLabel><content
src="glossary.html"/></navPoint>
<navPoint playOrder="11" id="chapter001"><navLabel><text>Chapter One - BACKGROUND AND
METHODOLOGY</text></navLabel><content src="chapter001.html"/>
<navPoint playOrder="12" id="chapter001-h1-2"><navLabel><text>SETTING THE
CONTEXT</text></navLabel><content src="chapter001.html#h1-2"/></navPoint>
. . .
<navPoint playOrder="16" id="chapter002"><navLabel><text>Chapter Two - THE LUSTER IN
THE BADGE</text></navLabel><content src="chapter002.html"/>
<navPoint playOrder="17" id="chapter002-h1-6"><navLabel><text>WHAT DOES IT MEAN TO
SERVE</text></navLabel><content src="chapter002.html#h1-6"/></navPoint>
. . .
<navPoint playOrder="44" id="appendix001"><navLabel><text>Appendix A - PROJECT TEAM
MEMBERS</text></navLabel><content src="appendix001.html"/>
<navPoint playOrder="45" id="appendix001-h1-29"><navLabel><text>RAND</text></navLabel><content
src="appendix001.html#h1-29"/></navPoint></navPoint>
<navPoint playOrder="46" id="appendix002"><navLabel><text>Appendix B - RAND TRAINING
ASSESSMENT INSTRUMENT</text></navLabel><content src="appendix002.html"/></navPoint>
. . .
<navPoint playOrder="102"
id="bibliography"><navLabel><text>BIBLIOGRAPHY</text></navLabel><content
src="bibliography.html"/></navPoint>
. . .
</navMap>
</ncx>

HTML files:
As of version 2.0, an EPUB file uses XHTML 1.1 to construct the content of a book.
DTBook may also be used; it is especially effective for converting
educational and other highly structured material.
The following are the XHTML modules for OPS:

XHTML 1.1 Module Name          Elements (non-normative)
Structure                      body, head, html, title
Text                           abbr, acronym, address, blockquote, br, cite, code, dfn, div, em,
                               h1, h2, h3, h4, h5, h6, kbd, p, pre, q, samp, span, strong, var
Hypertext                      a
List                           dl, dt, dd, ol, ul, li
Object                         object, param
Presentation                   b, big, hr, i, small, sub, sup, tt
Edit                           del, ins
Bidirectional Text             bdo
Table                          caption, col, colgroup, table, tbody, td, tfoot, th, thead, tr
Image                          img
Client-Side Image Map          area, map
Meta-Information               meta
Style Sheet                    style
Style Attribute (deprecated)   style attribute
Link                           link
Base                           base

A typical book will contain the following HTML files:

acknowledgments.html appendix010.html chapter006.html
appendix001.html appendix011.html copyright.html
appendix002.html appendix012.html footnotes.html
appendix003.html appendix013.html frontmatter001.html
appendix004.html bibliography.html frontmatter002.html
appendix005.html chapter001.html glossary.html
appendix006.html chapter002.html introduction.html
appendix007.html chapter003.html preface001.html
appendix008.html chapter004.html titlepage.html
appendix009.html chapter005.html toc.html

Of the above, let us dissect toc.html:


<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">

<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<title>Training the 21st Century Police Officer:</title>
<link rel="stylesheet" href="css/stylesheet.css" type="text/css" />
<link rel="stylesheet" href="css/page-template.xpgt" type="application/adobe-page-template+xml" />
</head>

<body>
<h1 class="toc-title">Table of Contents</h1>
<p class="toc"><a href="titlepage.html">Title Page</a></p>
<p class="toc"><a href="copyright.html">Copyright Page</a></p>
<p class="toc"><a href="preface001.html">PREFACE</a></p>
<p class="toc-h1"><a href="preface001.html#h1-1">RAND PUBLIC SAFETY AND
JUSTICE</a></p>
<p class="toc"><a href="frontmatter001.html">Table of Figures</a></p>
<p class="toc"><a href="frontmatter002.html">Table of Tables</a></p>
<p class="toc"><a href="introduction.html">SUMMARY</a></p>
<p class="toc"><a href="acknowledgments.html">ACKNOWLEDGMENTS</a></p>
<p class="toc"><a href="chapter001.html">Chapter One - BACKGROUND AND
METHODOLOGY</a></p>
<p class="toc-h1"><a href="chapter001.html#h1-2">SETTING THE CONTEXT</a></p>
. . .
<p class="toc"><a href="chapter002.html">Chapter Two - THE LUSTER IN THE BADGE</a></p>
<p class="toc-h1"><a href="chapter002.html#h1-6">WHAT DOES IT MEAN?</a></p>
. . .
<p class="toc"><a href="chapter003.html">Chapter Three - CORPORATENESS</a></p>
<p class="toc-h1"><a href="chapter003.html#h1-11">INTRODUCTION</a></p>
<p class="toc-h1"><a href="chapter003.html#h1-12">EVERY COP A TEACHER</a></p>
. . .
<p class="toc"><a href="chapter004.html">Chapter Four - THE POLICE SOCIETY</a></p>
<p class="toc-h1"><a href="chapter004.html#h1-18">THE REDEFINED SERVICE</a></p>
. . .
<p class="toc"><a href="chapter005.html">Chapter Five - DEVELOPING POLICE
EXPERTISE</a></p>
<p class="toc-h1"><a href="chapter005.html#h1-24">OVERVIEW</a></p>
. . .
<p class="toc"><a href="chapter006.html">Chapter Six - CONCLUSION</a></p>
<p class="toc-h1"><a href="chapter006.html#h1-27">OVERARCHING RECOMMENDATION</a></p>
. . .
<p class="toc"><a href="glossary.html">ACRONYMS</a></p>
<p class="toc"><a href="appendix001.html">Appendix A - PROJECT TEAM MEMBERS</a></p>
<p class="toc-h1"><a href="appendix001.html#h1-29">RAND</a></p>
<p class="toc"><a href="appendix002.html">Appendix B - RAND TRAINING DOCUMENT</a></p>
. . .
<p class="toc"><a href="appendix012.html">Appendix L - ANALYSIS OF TRAINING
COURSES</a></p>
<p class="toc"><a href="appendix013.html">Appendix M - SUMMARY OF
RECOMMENDATIONS</a></p>
<p class="toc-h1"><a href="appendix013.html#h1-72">OVERARCHING RECOMMENDATION</a></p>
. . .
<p class="toc"><a href="bibliography.html">BIBLIOGRAPHY</a></p>
</body>
</html>

CSS:

Styling and layout are performed using a subset of CSS 2.0, referred to as OPS
Style Sheets. This specialized syntax requires only a portion of CSS properties to
be supported by reading systems and adds a few custom ones, such as
oeb-page-head, oeb-page-foot, and oeb-column-number. Font embedding can
be accomplished using the @font-face rule, together with listing the font file in
the OPF's manifest. The mimetype for CSS documents in EPUB is text/css.

EPUB also requires that PNG, JPEG, GIF, and SVG be supported as image
types. These use the mimetypes image/png, image/jpeg, image/gif, and image/svg+xml,
respectively. Other media types are allowed, but creators must include
alternative renditions in supported types.

Unicode is required, and content producers must use either UTF-8 or UTF-16
encoding. This is to support international and multilingual books. However,
reading systems are not required to provide the fonts necessary to display every
Unicode character, though they are required to display at least a placeholder for
characters that cannot be displayed fully.
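
The encoding requirement can be illustrated with a small decoding helper. This is a sketch only: the function name decode_content_document is hypothetical, and it assumes that UTF-16 input starts with a byte order mark (as XML requires) and that anything else is meant to be UTF-8.

```python
import codecs


def decode_content_document(data: bytes) -> str:
    """Decode a content document, accepting only UTF-8 or BOM-marked UTF-16."""
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        # The utf-16 codec reads the byte order mark to pick the byte order.
        return data.decode("utf-16")
    # utf-8-sig also accepts, and strips, an optional UTF-8 byte order mark.
    # Any other encoding raises UnicodeDecodeError here.
    return data.decode("utf-8-sig")


text_from_utf8 = decode_content_document("caf\u00e9".encode("utf-8"))
text_from_utf16 = decode_content_document("caf\u00e9".encode("utf-16"))
```

Input in a legacy encoding such as Latin-1 is rejected with UnicodeDecodeError whenever its bytes are not valid UTF-8, rather than being silently misread.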

Advantages and Disadvantages of ePublishing (ePublishing alone, without a print edition)

• Negligible investment by the publisher translates to a greater willingness
to take on untried writers and non-traditional characters, story lines, and
manuscript lengths.
• Faster publishing time for accepted manuscripts. Rather than waiting up
to two years for a manuscript to see print, e-publishing generally
publishes work within a few weeks to a few months after acceptance.
• Greater flexibility within the writer/publisher relationship. E-publishing
affords more say to writers in preparing works for publication. A paper
publisher might ask a writer to change a character, plot line, or other
features of a story to make it more marketable. An e-publisher might also
make suggestions, but the writer will generally have more say. The writer
might also be instrumental in providing graphics for the work, such as an
electronic jacket.
• Writers have the ability to update text often and easily at virtually no cost.
This is particularly handy for works related to fast-moving industries such
as computer technology. Since the e-publisher does not have an
investment in printed books already lining shelves, text can be
electronically updated in seconds.
• E-publishing offers greater longevity for works with slower sales. While
paper publishers will remove slow movers from active status (print),
electronic storage affords unlimited archiving. This gives new writers time
to build a following by having their entire catalog available over extended
periods of time.
• Works published electronically have an ISBN number, just like printed
books. This means anyone can walk into a storefront bookstore and order
an electronic copy of the book.
• Writers get a higher percentage of royalties through e-publishing because
the initial financial outlay for the publisher is so much less than for a
paper publisher. Some writers receive as much as 70% of the profits in
royalties.
• With e-publishing writers normally retain all other rights to the work,
such as the option to go to a paper publisher later, adapt a screenplay, or
use the work in some other capacity. Paper publishers, on the other hand,
tend to covet as many rights as possible from the writer in the initial
boilerplate contract.

If this all sounds a little too rosy, note the disadvantages of e-publishing:
• To date, electronic works sell far fewer copies than paper books. Many
people aren’t aware of e-publishing and others prefer reading a book from
print rather than electronically. Good sales, according to one e-publisher,
amount to 500 copies for a successful manuscript.
• Writers are responsible for providing their own ongoing marketing for e-
published work. A book might be great, but if nobody knows about it, it
won’t sell. Authors also can’t count on the public seeing their books on
shelves or in store windows.
• For writers interested in building credentials, e-published works do not carry
the same weight as works from traditional paper publishers. The sense is that the bar is
somehow lower for e-published works than for printed works. However,
this may change with time as e-publishing becomes more established.
• Writers do not receive an advance. This is not just a financial
disadvantage, but might disqualify e-published authors from
participating in certain organizations where membership requirements
include works paid by advance. That said, sales royalties are often paid
more frequently by e-publishers, such as quarterly rather than annually.
• Piracy is another concern in the e-publishing industry. It is a fairly simple
thing, technically speaking, for a recipient of an e-work to edit the file,
make several copies, and sell the work out from under the nose of the e-
publisher and author. Some e-publishers counter that the relatively small
market for e-works provides little impetus for this.
• Prices are not always significantly cheaper for e-works, despite the lower
overhead. This might be a deterrent to sales.

Despite the disadvantages, e-publishing can be a good way for a new writer to
gain a following. Romance, science fiction, murder mystery and fantasy are all
possible genres for e-publishing. It is also ideal for How-To books that must be
updated frequently. Businesses can also save money on employee manuals and
training materials by e-publishing them. An added advantage here is that works
can be clickable: tables of contents and indexes can make navigating through
technical e-books a breeze.