This is a DITA-OT plug-in that extends the input formats available to DITA-OT. Non-DITA
input sources can be pre-processed using Pandoc to create valid DITA source. Files written
in multiple input formats can be added directly to a *.ditamap
and processed as if they had been written in DITA.
Pandoc is a Haskell library for converting from one markup format to another, and a command-line tool that uses this library. It can convert from the following formats:
- Markdown: commonmark (CommonMark Markdown), gfm (GitHub-Flavored Markdown), markdown (Pandoc’s Markdown), markdown_mmd (MultiMarkdown), markdown_phpextra (PHP Markdown Extra), markdown_strict (original unextended Markdown)
- Wiki Formats: dokuwiki (DokuWiki markup), mediawiki (MediaWiki markup), muse (Muse), tikiwiki (TikiWiki markup), twiki (TWiki markup), vimwiki (Vimwiki)
- Other Formats: creole (Creole 1.0), docbook (DocBook), docx (Word docx), epub (EPUB), fb2 (FictionBook2 e-book), haddock (Haddock markup), html (HTML), ipynb (Jupyter notebook), jats (JATS XML), json (JSON version of native AST), latex (LaTeX), man (roff man), native (native Haskell), odt (ODT), opml (OPML), org (Emacs Org mode), rst (reStructuredText), t2t (txt2tags), textile (Textile)
This plug-in contains a Lua template which extends the output formats supported by Pandoc to include DITA. The output consists of a single DITA topic for each input file added to the ditamap.
Unlike the standard Markdown plug-in, this plug-in does not fail if the h1...h6 headers are incorrectly incremented.
This is because the Lua template has been designed to ensure that headers increment at most one level at a time.
The downside is that the output may be unexpected.
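The idea of limiting headers to one level of increment at a time can be illustrated with a short sketch. This is written in Python rather than Lua and is not the plug-in's actual code; the function name normalize_levels and the stack-based rule are assumptions made for illustration only.

```python
def normalize_levels(levels):
    """Map header levels to nesting depths that never grow by more than one.

    Hypothetical helper: one possible rule, not the plug-in's Lua logic.
    """
    stack = []   # original levels of the headers that are currently open
    depths = []
    for level in levels:
        # Close every open header at the same or a deeper original level.
        while stack and stack[-1] >= level:
            stack.pop()
        stack.append(level)
        # The nesting depth is the number of headers still open.
        depths.append(len(stack))
    return depths


# An h1 followed directly by h3 headers, then an h2 and an h4.
print(normalize_levels([1, 3, 3, 2, 4]))  # [1, 2, 2, 2, 3]
```

In this sketch the h2 ends up at the same depth as the earlier h3 headers, which is the kind of unexpected output that skipped levels can cause.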
Note that because Pandoc’s intermediate representation of a document is less expressive than many of the formats it converts between, one should not expect perfect conversions between every format and every other. Pandoc attempts to preserve the structural elements of a document, but not formatting details such as margin size. And some document elements, such as complex tables, may not fit into pandoc’s simple document model. While conversions from pandoc’s Markdown to all formats aspire to be perfect, conversions from formats more expressive than pandoc’s Markdown can be expected to be lossy.
The DITA-OT Pandoc Pass Through plug-in has been tested against DITA-OT 4.x. It is recommended that you upgrade to the latest version.
The DITA-OT Pandoc plug-in is a file reader for the DITA Open Toolkit.
- Full installation instructions for downloading DITA-OT can be found on the DITA-OT project website.
  - Download the dita-ot-4.2.zip package from the project website at dita-ot.org/download
  - Extract the contents of the package to the directory where you want to install DITA-OT.
  - Optional: Add the absolute path for the bin directory to the PATH system variable.
    This defines the necessary environment variable to run the dita command from the command line.

  Alternatively, download and extract the package from the command line:
curl -LO https://github.com/dita-ot/dita-ot/releases/download/4.2/dita-ot-4.2.zip
unzip -q dita-ot-4.2.zip
rm dita-ot-4.2.zip
- Run the plug-in installation commands:
dita install https://github.com/doctales/org.doctales.xmltask/archive/master.zip
dita install https://github.com/jason-fox/fox.jason.passthrough/archive/master.zip
dita install https://github.com/jason-fox/fox.jason.passthrough.pandoc/archive/master.zip
The dita command line tool requires no additional configuration.
To download a copy of Pandoc, follow the instructions on the Pandoc Install page.
If you run DITA-OT from the Oxygen editor on macOS and start Oxygen from the Terminal using sh oxygen.sh
in the Oxygen installation folder, the build file is able to run the pandoc executable when Oxygen runs DITA-OT.
Starting Oxygen by double-clicking the shortcut in the Finder does not work reliably. It works only if the path
to the Pandoc executable (/usr/local/bin/) is fully specified in the configuration file:
fox.jason.passthrough.pandoc/cfg/configuration.properties
To mark a file to be passed through for Pandoc processing, label it with format="pandoc"
within the *.ditamap as shown:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE bookmap PUBLIC "-//OASIS//DTD DITA BookMap//EN" "bookmap.dtd">
<bookmap>
...etc
<chapter format="pandoc" href="sample.docx"/>
</bookmap>
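To make the selection rule concrete, the following Python sketch lists the href of every map entry labelled format="pandoc". It only illustrates which entries are marked; it is not how the plug-in itself reads the map, and the function name pandoc_hrefs and the file names intro.dita and notes.md are hypothetical.

```python
import xml.etree.ElementTree as ET

# A small map in the style of the example above (file names partly hypothetical).
DITAMAP = """<bookmap>
  <chapter href="intro.dita"/>
  <chapter format="pandoc" href="sample.docx"/>
  <chapter format="pandoc" href="notes.md"/>
</bookmap>"""


def pandoc_hrefs(ditamap_xml):
    """Return the href of every element marked with format="pandoc"."""
    root = ET.fromstring(ditamap_xml)
    return [el.get("href") for el in root.iter() if el.get("format") == "pandoc"]


print(pandoc_hrefs(DITAMAP))  # ['sample.docx', 'notes.md']
```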
The additional file is run through the Pandoc XXX-to-DITA Lua filter, converted to a *.dita
file and added to the build job without further processing. The navtitle
of the included topic will be the same as the root name of
the file. Any underscores in the filename will be replaced by spaces in the title.
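The naming rule can be sketched in a few lines of Python. This is an illustration, not the plug-in's code: it assumes that "root name" means the file name without its directory and extension, and the function name navtitle_from_href and the path topics/release_notes.docx are hypothetical.

```python
from pathlib import PurePosixPath


def navtitle_from_href(href):
    """Derive a title from an href: drop directory and extension, then
    replace underscores with spaces (assumed reading of the rule)."""
    stem = PurePosixPath(href).stem
    return stem.replace("_", " ")


print(navtitle_from_href("sample.docx"))                # sample
print(navtitle_from_href("topics/release_notes.docx"))  # release notes
```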
The examples below use Markdown as a passthrough format. Other formats need to provide equivalent annotations to obtain
full functionality. Where possible, annotation aligns with the
Markdown DITA syntax reference based on
CommonMark. The chapter title is taken from the first header found. Thereafter the document
is processed as expected:
# Chapter title
The abstract (if any) goes here...
## Topic 1
Body of topic 1 goes here.
## Topic 2
Body of topic 2 goes here.
...etc
Ideally, input files should contain only a single <h1> header.
Pandoc header_attributes can be used to define id or outputclass attributes:
# Topic title {#carrot .juice}
The following class values in header_attributes have a special meaning on header levels:
- section
- example
They are used to generate <section> and <example> elements:
# Topic title
## Section title {.section}
## Example title {.example}
The following class value in header_attributes also has a special meaning on header levels:
- note
It is used to generate <note> elements:
# Topic title
Contents of the topic go here ...
---
## Note|Warning|Tip|Important {.note}
Contents of the note
---
Contents of the topic continue here ...
The type of the note is defined by the title of the header. The <note>
continues until the next header element or horizontal rule (---), whichever comes first.
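The note boundaries described above can be sketched as follows. This Python example scans raw Markdown lines, whereas the plug-in works on Pandoc's parsed document, so it is an approximation only. The function name note_spans is hypothetical, and lowercasing the header title to obtain the note type is an assumption.

```python
import re

HEADER = re.compile(r"^#{1,6}\s")
RULE = re.compile(r"^-{3,}\s*$")
NOTE_HEADER = re.compile(r"^#{1,6}\s+(.*?)\s*\{\.note\}\s*$")

SAMPLE = [
    "# Topic title",
    "Contents of the topic go here ...",
    "---",
    "## Warning {.note}",
    "Contents of the note",
    "---",
    "Contents of the topic continue here ...",
]


def note_spans(lines):
    """Return (note_type, body_lines) for each header marked {.note}."""
    notes = []
    current = None
    for line in lines:
        if HEADER.match(line) or RULE.match(line):
            # Any header or horizontal rule closes an open note.
            if current is not None:
                notes.append(current)
                current = None
            match = NOTE_HEADER.match(line)
            if match:
                # Assumption: the note type is the lowercased header title.
                current = (match.group(1).lower(), [])
        elif current is not None:
            current[1].append(line)
    if current is not None:
        notes.append(current)
    return notes


print(note_spans(SAMPLE))  # [('warning', ['Contents of the note'])]
```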
A YAML metadata block, as defined by Pandoc (pandoc_metadata_block), can be used to specify different metadata elements. The supported elements are:
- author
- source
- publisher
- permissions
- audience
- category
- keyword
- resourceid
- shortdesc
Unrecognized keys are output using the <data> element.
---
author:
- Author One
- Author Two
source: Source
publisher: Publisher
permissions: Permissions
audience: Audience
category: Category
keyword:
- Keyword1
- Keyword2
resourceid:
- Resourceid1
- Resourceid2
workflow: review
---
This produces the following DITA output:
<title>Sample with YAML header</title>
<prolog>
<author>Author One</author>
<author>Author Two</author>
<source>Source</source>
<publisher>Publisher</publisher>
<permissions view="Permissions"/>
<metadata>
<audience audience="Audience"/>
<category>Category</category>
<keywords>
<keyword>Keyword1</keyword>
<keyword>Keyword2</keyword>
</keywords>
</metadata>
<resourceid appid="Resourceid1"/>
<resourceid appid="Resourceid2"/>
<data name="workflow" value="review"/>
</prolog>
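The mapping shown in this example can be summarised in a Python sketch that builds the same prolog structure from a dictionary. It mirrors only the sample output above and is not the plug-in's Lua implementation. The names build_prolog, as_list and KNOWN_KEYS are hypothetical, and shortdesc is skipped because the sample output does not show where it is placed.

```python
import xml.etree.ElementTree as ET

# Keys listed as supported above; anything else becomes a <data> element.
KNOWN_KEYS = {"author", "source", "publisher", "permissions", "audience",
              "category", "keyword", "resourceid", "shortdesc"}


def as_list(value):
    """Accept either a single value or a list of values."""
    return value if isinstance(value, list) else [value]


def build_prolog(meta):
    """Build a <prolog> element following the sample output (a sketch)."""
    prolog = ET.Element("prolog")
    for key in ("author", "source", "publisher"):
        for value in as_list(meta.get(key, [])):
            ET.SubElement(prolog, key).text = value
    for value in as_list(meta.get("permissions", [])):
        ET.SubElement(prolog, "permissions", view=value)
    if any(key in meta for key in ("audience", "category", "keyword")):
        metadata = ET.SubElement(prolog, "metadata")
        for value in as_list(meta.get("audience", [])):
            ET.SubElement(metadata, "audience", audience=value)
        for value in as_list(meta.get("category", [])):
            ET.SubElement(metadata, "category").text = value
        if "keyword" in meta:
            keywords = ET.SubElement(metadata, "keywords")
            for value in as_list(meta["keyword"]):
                ET.SubElement(keywords, "keyword").text = value
    for value in as_list(meta.get("resourceid", [])):
        ET.SubElement(prolog, "resourceid", appid=value)
    # Unrecognized keys are emitted as <data name="..." value="..."/>.
    for key, value in meta.items():
        if key not in KNOWN_KEYS:
            ET.SubElement(prolog, "data", name=key, value=str(value))
    return prolog


meta = {
    "author": ["Author One", "Author Two"],
    "source": "Source",
    "permissions": "Permissions",
    "keyword": ["Keyword1", "Keyword2"],
    "resourceid": ["Resourceid1", "Resourceid2"],
    "workflow": "review",
}
print(ET.tostring(build_prolog(meta), encoding="unicode"))
```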
Ditamap <topicmeta> processing is also supported.
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE bookmap PUBLIC "-//OASIS//DTD DITA BookMap//EN" "bookmap.dtd">
<bookmap>
<chapter format="pandoc" processing-role="normal" type="topic" href="markdown.md">
<topicmeta>
<shortdesc>This is where the shortdesc goes</shortdesc>
<metadata>
<keywords>
<keyword>Keyword1</keyword>
<keyword>Keyword2</keyword>
</keywords>
</metadata>
</topicmeta>
</chapter>
</bookmap>
This allows for topic metadata to be added to files for formats other than Markdown.
Apache 2.0 © 2019 - 2024 Jason Fox