vanisitor/EpubReader

 
 


EpubReader

.NET library for reading EPUB files.

Supports .NET Framework >= 4.6, .NET Core >= 1.0, and .NET Standard >= 1.3.

Supports EPUB 2 (2.0, 2.0.1) and EPUB 3 (3.0, 3.0.1, 3.1).

Download | WPF & .NET Core demo apps

Migration from 2.x

How to migrate from 2.x to 3.x

Example

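// Namespaces used by this example (the VersOne.Epub names are assumed from the 3.x NuGet package)
using System.Collections.Generic;
using System.Drawing;
using System.IO;
using System.Linq;
using VersOne.Epub;
using VersOne.Epub.Schema;

// Opens a book and reads all of its content into memory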
EpubBook epubBook = EpubReader.ReadBook("alice_in_wonderland.epub");

            
// COMMON PROPERTIES

// Book's title
string title = epubBook.Title;

// Book's authors (comma-separated list)
string author = epubBook.Author;

// Book's authors (list of author names)
List<string> authors = epubBook.AuthorList;

// Book's cover image (null if there is no cover)
byte[] coverImageContent = epubBook.CoverImage;
if (coverImageContent != null)
{
    using (MemoryStream coverImageStream = new MemoryStream(coverImageContent))
    {
        Image coverImage = Image.FromStream(coverImageStream);
    }
}
            
// TABLE OF CONTENTS

// Enumerating chapters
foreach (EpubNavigationItem chapter in epubBook.Navigation)
{
    // Title of chapter
    string chapterTitle = chapter.Title;
                
    // Nested chapters
    List<EpubNavigationItem> subChapters = chapter.NestedItems;
}

// READING ORDER

// Enumerating the whole text content of the book in the order of reading
foreach (EpubTextContentFile textContentFile in epubBook.ReadingOrder)
{
    // HTML of current text content file
    string htmlContent = textContentFile.Content;
}

            
// CONTENT

// Book's content (HTML files, stylesheets, images, fonts, etc.)
EpubContent bookContent = epubBook.Content;

            
// IMAGES

// All images in the book (file name is the key)
Dictionary<string, EpubByteContentFile> images = bookContent.Images;

EpubByteContentFile firstImage = images.Values.First();

// Content type (e.g. EpubContentType.IMAGE_JPEG, EpubContentType.IMAGE_PNG)
EpubContentType contentType = firstImage.ContentType;

// MIME type (e.g. "image/jpeg", "image/png")
string mimeType = firstImage.ContentMimeType;

// Creating Image class instance from the content
using (MemoryStream imageStream = new MemoryStream(firstImage.Content))
{
    Image image = Image.FromStream(imageStream);
}

// Cover metadata
if (bookContent.Cover != null)
{
    string coverFileName = bookContent.Cover.FileName;
    EpubContentType coverContentType = bookContent.Cover.ContentType;
    string coverMimeType = bookContent.Cover.ContentMimeType;
}

// HTML & CSS

// All XHTML files in the book (file name is the key)
Dictionary<string, EpubTextContentFile> htmlFiles = bookContent.Html;

// All CSS files in the book (file name is the key)
Dictionary<string, EpubTextContentFile> cssFiles = bookContent.Css;

// Entire HTML content of the book
foreach (EpubTextContentFile htmlFile in htmlFiles.Values)
{
    string htmlContent = htmlFile.Content;
}

// All CSS content in the book
foreach (EpubTextContentFile cssFile in cssFiles.Values)
{
    string cssContent = cssFile.Content;
}


// OTHER CONTENT

// All fonts in the book (file name is the key)
Dictionary<string, EpubByteContentFile> fonts = bookContent.Fonts;

// All files in the book (including HTML, CSS, images, fonts, and other types of files)
Dictionary<string, EpubContentFile> allFiles = bookContent.AllFiles;


// ACCESSING RAW SCHEMA INFORMATION

// EPUB OPF data
EpubPackage package = epubBook.Schema.Package;

// Enumerating book's contributors
foreach (EpubMetadataContributor contributor in package.Metadata.Contributors)
{
    string contributorName = contributor.Contributor;
    string contributorRole = contributor.Role;
}

// EPUB 2 NCX data
Epub2Ncx epub2Ncx = epubBook.Schema.Epub2Ncx;

// Enumerating EPUB 2 NCX metadata
foreach (Epub2NcxHeadMeta meta in epub2Ncx.Head)
{
    string metadataItemName = meta.Name;
    string metadataItemContent = meta.Content;
}

// EPUB 3 navigation
Epub3NavDocument epub3NavDocument = epubBook.Schema.Epub3NavDocument;

// Accessing structural semantics data of the head item
StructuralSemanticsProperty? ssp = epub3NavDocument.Navs.First().Type;
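
The table of contents loop in the example above visits only the top-level chapters. A minimal sketch of walking the nested items recursively is shown below. To keep it runnable without the library, it uses `NavItem`, a hypothetical stand-in that mirrors only the `Title` and `NestedItems` members of `EpubNavigationItem`; `TocPrinter` and `TocDemo` are likewise illustrative names, not part of the library.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for EpubNavigationItem, reduced to the two members
// used in the example above (Title and NestedItems).
public class NavItem
{
    public string Title { get; set; }
    public List<NavItem> NestedItems { get; set; } = new List<NavItem>();
}

public static class TocPrinter
{
    // Returns one line per navigation item, indented by two spaces per nesting level.
    public static List<string> Flatten(IEnumerable<NavItem> items, int depth = 0)
    {
        var lines = new List<string>();
        foreach (NavItem item in items)
        {
            lines.Add(new string(' ', depth * 2) + item.Title);
            // Recurse into the sub-chapters before moving on to the next sibling.
            lines.AddRange(Flatten(item.NestedItems, depth + 1));
        }
        return lines;
    }
}

public static class TocDemo
{
    public static void Main()
    {
        // Hand-built sample data; with the library this would come from epubBook.Navigation.
        var toc = new List<NavItem>
        {
            new NavItem
            {
                Title = "Chapter I",
                NestedItems = { new NavItem { Title = "Section I.1" } }
            },
            new NavItem { Title = "Chapter II" }
        };

        foreach (string line in TocPrinter.Flatten(toc))
        {
            Console.WriteLine(line);
        }
    }
}
```

With the real library, the same recursion should apply to `epubBook.Navigation` by replacing `NavItem` with `EpubNavigationItem`.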

More examples

  1. How to extract the plain text of the whole book.
  2. How to extract the table of contents.
  3. How to iterate over all EPUB files in a directory and collect some statistics.
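
As a rough illustration of the first item, the sketch below reduces an HTML string (such as the `Content` of one file in the reading order) to plain text. It is a crude regex-based approach under stated assumptions, not necessarily the method used in the linked example; `HtmlTextExtractor` and `ToPlainText` are hypothetical names, and a real HTML parser will handle malformed markup and comments more reliably.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

public static class HtmlTextExtractor
{
    // Crude HTML-to-text conversion: removes script/style bodies, strips tags,
    // decodes character entities, and collapses runs of whitespace.
    public static string ToPlainText(string html)
    {
        // Drop <script> and <style> elements together with their contents.
        string text = Regex.Replace(
            html,
            @"<(script|style)\b[^>]*>.*?</\1\s*>",
            " ",
            RegexOptions.IgnoreCase | RegexOptions.Singleline);

        // Replace every remaining tag with a space so adjacent words stay separated.
        text = Regex.Replace(text, @"<[^>]+>", " ");

        // Turn entities such as &amp; into the characters they represent.
        text = WebUtility.HtmlDecode(text);

        // Collapse whitespace runs into single spaces and trim the ends.
        return Regex.Replace(text, @"\s+", " ").Trim();
    }
}

public static class PlainTextDemo
{
    public static void Main()
    {
        // Sample input; with the library this would be textContentFile.Content.
        string html = "<h1>Alice</h1><p>Down the rabbit &amp; hole</p>";
        Console.WriteLine(HtmlTextExtractor.ToPlainText(html));
    }
}
```

To cover a whole book, the same helper could be applied to each file in `epubBook.ReadingOrder` and the results joined with line breaks.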

Download latest stable release

Via NuGet package from nuget.org

DLL file from GitHub: for .NET Framework (38.3 KB) / for .NET Core (38.4 KB) / for .NET Standard (38.4 KB)

Demo apps

Download WPF demo app (WpfDemo.zip, 479 KB)

This .NET Framework application demonstrates how to open EPUB books and extract their content using the library.

The HTML renderer used in this demo app may fail to render some books correctly when their HTML structure is too complex.

Download .NET Core console demo app (NetCoreDemo.zip, 17.6 MB)

This .NET Core console application demonstrates how to open EPUB books and retrieve their text content.
