Rek Bellum

This is a list of notes I compiled when it came time to learn how to generate PDFs and e-books with free, open-source software. This guide was made and tested on the Elementary Linux distribution.


Necessary downloads for this tutorial:

Some of these are tricky to install, so please refer to my install tutorial if you have problems. If you only plan to export a PDF, install only Pandoc, Pdfunite and LaTeX (plus ImageMagick if you want a cover page). To generate EPUBs and MOBIs, install Pandoc, ImageMagick and Calibre.
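Before going further, it can help to check which of these tools are already on your PATH. Below is a minimal sketch, not part of the original notes: missing_tools is a hypothetical helper name, and the command names (xelatex for LaTeX, convert for ImageMagick, ebook-convert for Calibre) are the ones used later in this guide. Your distribution may name them differently.

```shell
#!/bin/sh
# Print the name of every tool given as an argument that is NOT on the PATH.
# (missing_tools is a hypothetical helper, not something Pandoc ships.)
missing_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
        fi
    done
}

# Commands used in this guide.
missing=$(missing_tools pandoc xelatex pdfunite convert ebook-convert)
if [ -n "$missing" ]; then
    echo "Not found on PATH:"
    echo "$missing"
else
    echo "All tools found."
fi
```

Anything the script lists still needs installing before the matching section of this guide will work.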

Now that you've installed all necessary tools, let's begin:

What is Pandoc?

Pandoc is a terminal utility that converts files from one markup format into another. You can convert an HTML page to Markdown, Markdown to PDF, Markdown to EPUB etc. It's a useful tool to create articles, scientific papers, and books.

When generating PDFs, Pandoc's use of LaTeX (a document preparation system) encourages authors not to worry too much about the appearance of their documents but to concentrate on getting the right content. To find out how to install LaTeX, read my tutorial.

Pandoc Markdown

Make sure that the file you want turned into a PDF has an .md extension, and take note of the path to your project file, as you'll need that to generate your PDF later. The next step is to add Pandoc Markdown syntax to the text.

Pandoc Markdown is an extended and slightly revised version of John Gruber’s Markdown syntax. Markdown is a simple, easy-to-use markup language for formatting plain-text content. The markup annotates a document in a way that is syntactically distinguishable from the text itself.

For example, to make text bold, add two asterisks on each side of the word:

In January 2016, we bought our first sailboat, a **Yamaha 33**.

Hash symbols preceding a word or phrase make a heading. Be sure to add a space between the hash and the first word:

## Introduction

The fewer the hashes, the bigger the heading: one hash makes the largest heading (a title), and each extra hash (three, four, five and so on) makes the heading one step smaller. Headings also matter when creating EPUBs, as Pandoc uses them to generate your e-book's table of contents.
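To see the syntax above in one place, here is a small sketch that writes a sample file from the terminal and lists its headings. The file name sample.md and the text of the headings are made up for illustration. The grep line is only a rough stand-in for what Pandoc would pick up, since it matches any line starting with a hash.

```shell
# Write a small sample file (sample.md is a hypothetical name).
cat > sample.md <<'EOF'
# My Book Title

## Introduction

In January 2016, we bought our first sailboat, a **Yamaha 33**.

### A smaller heading

More text goes here.
EOF

# List the heading lines (lines that start with a hash).
grep '^#' sample.md
```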

From Markdown to PDF

To generate a PDF from your .md or .txt file, locate your project file using the terminal. Use cd to navigate to your project folder. The example below assumes you saved your project file in Documents, in a folder named myproject. Remember that cd is case-sensitive:

cd Documents/myproject/

Enter the following command, replacing filename with the name of your project file, and filename_export with the name you want for the final exported PDF file:

pandoc filename.txt --pdf-engine=xelatex -o filename_export.pdf

In that same folder where you had your .txt file saved, you'll find your Pandoc-generated PDF.
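If you run this conversion often, the command can be wrapped in a small shell function. This is only a sketch under a few assumptions: export_name and to_pdf are hypothetical names, and the input is a plain file name (with an extension) in the current folder.

```shell
# Derive the export name from the input name: notes.txt -> notes_export.pdf
# (assumes a plain file name with an extension, in the current folder).
export_name() {
    base=${1%.*}                      # strip the last extension
    printf '%s_export.pdf\n' "$base"
}

# Convert one file to PDF, refusing to run if the input is missing.
to_pdf() {
    if [ ! -f "$1" ]; then
        echo "No such file: $1" >&2
        return 1
    fi
    pandoc "$1" --pdf-engine=xelatex -o "$(export_name "$1")"
}
```

After pasting the functions into your terminal, to_pdf filename.txt would run the same Pandoc command as above and write filename_export.pdf.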

Using Pandoc Markdown Extensions

In Pandoc, you can use extensions (see below) to specify the output of your PDF more granularly, say, if you want to change the paper size, add tables or footnotes etc.

Changing papersize

On my system, Pandoc generates A4 PDFs by default. If you want to change the paper size, add the following option when generating the PDF in the command line:

--variable=geometry:a5paper

Combined with the rest of the command, it looks like this:

pandoc --variable=geometry:a5paper filename.txt --pdf-engine=xelatex -o filename_export.pdf

If your file is a .md, change the extension in the command to match your project file.

Adding extensions

If you want to add tables, you'll have to add an extension. Extensions are specified in the command line when generating a PDF. In the example below, after --from markdown, I've added a + followed by the extension name, simple_tables. If the extension name has two words, an underscore between them is necessary.

--from markdown+simple_tables

You can add more extensions by putting a + between each one. In the example below, I've added both simple tables and citations.

--from markdown+simple_tables+citations
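The joining rule (markdown, then a + before each extension name) can be sketched as a tiny shell function. from_arg is a hypothetical helper name, used here only to show how the string is put together.

```shell
# Build the value for --from out of a list of extension names.
# (from_arg is a hypothetical helper, not a Pandoc feature.)
from_arg() {
    result=markdown
    for ext in "$@"; do
        result="$result+$ext"
    done
    printf '%s\n' "$result"
}

from_arg simple_tables citations
# prints: markdown+simple_tables+citations
```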

Below is an example of a terminal command for a PDF generated from a .txt file, in a5 format, with the simple tables extension.

pandoc --variable=geometry:a5paper filename.txt --pdf-engine=xelatex --from markdown+simple_tables -o filename_export.pdf
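For reference, here is roughly what a simple table looks like in the source text: a header row, a line of dashes marking the columns, one line per row, and a blank line to end the table. The sketch below writes one to a file; table_sample.txt and the table contents are made up for illustration, and the Pandoc command is printed rather than run.

```shell
# Write a sample file containing a Pandoc simple table
# (table_sample.txt is a hypothetical name).
cat > table_sample.txt <<'EOF'
## Boat supplies

Item         Price
---------- -------
Rope            12
Anchor          80

EOF

# The matching command from the text, printed rather than run:
echo 'pandoc --variable=geometry:a5paper table_sample.txt --pdf-engine=xelatex --from markdown+simple_tables -o table_sample.pdf'
```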

Book cover PDF

Adding a book cover for a PDF with Pandoc is not straightforward, or maybe there is something I am missing? I could not figure out how to add images as full-page covers, BUT—I found a workaround to do it...and again (another big BUT), it requires ImageMagick.

We will use ImageMagick to convert the cover image into a PDF, which we will then combine with the main text PDF into a single file.

Once you've installed ImageMagick (see tutorial for install), save your cover image in your project folder. If you put it in a separate folder, be sure to specify it. Now, use this ImageMagick command to convert the image to a PDF:

convert cover.jpg cover.pdf

To combine the cover PDF with the main text PDF, use pdfunite. Here, filename.pdf is the main text PDF you generated earlier, and combined.pdf is the name of your final, merged PDF:

pdfunite cover.pdf filename.pdf combined.pdf
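The two steps can be chained so that the merge only happens if the image conversion succeeded. This is a sketch rather than a polished tool: add_cover is a hypothetical function name, and it assumes the same convert and pdfunite commands used above.

```shell
# Merge a cover image and a text PDF into one file.
# Usage: add_cover cover.jpg filename.pdf combined.pdf
# (add_cover is a hypothetical helper name.)
add_cover() {
    cover_img=$1
    text_pdf=$2
    out_pdf=$3
    # Stop early if either input file is missing.
    for f in "$cover_img" "$text_pdf"; do
        if [ ! -f "$f" ]; then
            echo "Missing input: $f" >&2
            return 1
        fi
    done
    cover_pdf=${cover_img%.*}.pdf      # cover.jpg -> cover.pdf
    # Run pdfunite only if the image conversion succeeded.
    convert "$cover_img" "$cover_pdf" &&
        pdfunite "$cover_pdf" "$text_pdf" "$out_pdf"
}
```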

I am sure that this isn't the best way to do this, but it works! Sometimes hacky hacks are fine, but by all means, if you have a better solution, do say!

Making epubs

EPUB is a popular e-book format, but converting a Markdown file to an EPUB may break some of your formatting. For example, EPUBs ignore \newpage (a LaTeX command that only takes effect in PDF output), but assigning a page break to a specific heading level with CSS fixes that problem. For EPUBs, it is a good idea to use CSS to control the final look of the document.

The first step is to create a blank text file and save it as epub.css.

I prefer not to modify the look too much, so I keep my CSS to a minimum. For example:

body {font-size: 14px}
table {font-size: 14px}
h4 {page-break-before: always;}
h2 {page-break-before: always;}
blockquote {font-style: italic;}
img { max-width: 100% }

I don't recommend setting a fixed font size like in the example above, as the font may be too small to read on a phone. I specified a max-width for images because otherwise they tend to spill out of the page. Of course, these are only a few of the many things one can do to improve the look of documents with CSS.
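One way to avoid fixed sizes is to use relative units such as em, which scale with the reader's own font settings. The sketch below writes such a variant from the terminal. The file name epub-relative.css and the exact values are my assumptions, chosen so the file doesn't overwrite your epub.css; adjust them to taste.

```shell
# Write a stylesheet that uses relative font sizes instead of pixels
# (epub-relative.css is a hypothetical name; the values are examples).
cat > epub-relative.css <<'EOF'
body { font-size: 1em; }
table { font-size: 0.9em; }
h2, h4 { page-break-before: always; }
blockquote { font-style: italic; }
img { max-width: 100%; }
EOF
```

To use it, pass epub-relative.css to Pandoc's --css option in place of epub.css.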

When exporting the file as an EPUB with Pandoc, add extensions as in the example above, along with arguments for a table of contents generated from your headings, the book's metadata, a book cover and your CSS file.

Table of Contents: To change the title of the table of contents generated by Pandoc, add the following option, replacing 'Custom text' with the desired title:

-V toc-title:"Custom text"

If the table of contents depth is not specified, it defaults to 3 (level 1, 2 and 3 headers will be included in the generated table of contents). To keep the table to a depth of level 2, add:

--toc-depth=2

Metadata file: A metadata file includes all the author, rights and publishing information. Create a new file and save it as metadata.yaml. Then, add the following:

---
title:
- type: main
  text: Book title
- type: subtitle
  text: My sub book title (if any)
creator:
- role: author
  text: Your name
publisher: Your publisher's name
lang: en-CA
rights: © 2021 MyCompany, *your chosen license*
table-of-contents: true
...

The text in the .yaml file needs to be delimited by a line of three hyphens (---) at the top and a line of three hyphens (---) or three dots (...) at the bottom. Metadata will be taken from the fields of the YAML object and added to any existing document metadata. See more about metadata files. Next, pass it to Pandoc as an argument. Note that Pandoc's --epub-metadata option expects an XML file of Dublin Core elements, so a YAML file like this one is passed with --metadata-file instead:

--metadata-file=metadata.yaml

Book cover: To add a cover, save the file to the folder with the rest of your text, and pass it to Pandoc as an argument:

--epub-cover-image=cover.jpg

The final Pandoc command to convert a file from Markdown to EPUB would look like this (with variations, depending on the extensions you choose):

pandoc filename.md --from markdown+simple_tables+line_blocks --toc -V toc-title:"Table of Contents" --toc-depth=2 --metadata-file=metadata.yaml --epub-cover-image=cover.jpg --css epub.css -w epub -o filename.epub

Test your book on a variety of devices before publishing it.
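Since the EPUB command is long, it can be easier to build it piece by piece and only include the files that actually exist. Below is a sketch of that idea as a dry run that prints the command instead of running it. epub_command is a hypothetical function name. It passes the YAML file with --metadata-file, which is Pandoc's option for reading metadata from a YAML file.

```shell
# Print the pandoc command for turning a Markdown file into an EPUB.
# Usage: epub_command filename.md filename.epub
# (epub_command is a hypothetical helper name.)
epub_command() {
    input=$1
    output=$2
    # Rebuild the function's own argument list, one option group at a time.
    set -- "$input" --from markdown+simple_tables+line_blocks
    set -- "$@" --toc --toc-depth=2 -V "toc-title:Table of Contents"
    # Optional files are only added when they exist in the current folder.
    if [ -f metadata.yaml ]; then set -- "$@" --metadata-file=metadata.yaml; fi
    if [ -f cover.jpg ]; then set -- "$@" --epub-cover-image=cover.jpg; fi
    if [ -f epub.css ]; then set -- "$@" --css epub.css; fi
    set -- "$@" -w epub -o "$output"
    # Display only: quoting around arguments with spaces is not shown.
    printf 'pandoc'
    printf ' %s' "$@"
    printf '\n'
}

epub_command filename.md filename.epub
```

To run the command for real rather than print it, the three printf lines could be replaced with pandoc "$@".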

MOBI

Pandoc cannot convert files to MOBI, but Calibre can convert an EPUB into one. To start, make an EPUB (steps above). To find out how to install Calibre, read my tutorial.

Run the following command in the terminal:

ebook-convert "input_file.epub" "output_file.mobi"

input_file.epub is the input and output_file.mobi is the output. Both must be given as the first two arguments to the command. It's possible to add more options after them to further change the look of the output file. There are too many to name here, but all are listed in Calibre's ebook-convert documentation.

Calibre doubles as a book viewer, so you can test your book with it. As with an EPUB, test the MOBI file on as many devices as you can.
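If you have several EPUBs in one folder, the same command can be run in a loop. This is a sketch with hypothetical helper names (mobi_name, convert_all); it only assumes the two-argument form of ebook-convert shown above.

```shell
# book.epub -> book.mobi
mobi_name() {
    printf '%s\n' "${1%.epub}.mobi"
}

# Convert every .epub file in the current folder to .mobi.
convert_all() {
    for book in *.epub; do
        [ -f "$book" ] || continue     # skip when no .epub files exist
        ebook-convert "$book" "$(mobi_name "$book")"
    done
}
```

Run convert_all from inside the folder that holds your EPUB files.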

Now, all that's left is to publish your e-book! I personally choose not to use Amazon, and publish on itch.io. There are no publishing fees, and you can automate uploads using Butler (itch.io's command line tools) if, say, you want to publish changes to your book fast. If anyone is interested in me writing about this, let me know, and I will. Also, if you've read this and have questions, don't hesitate to ask me! I'll do my best to help.