WARNING: This document will not render correctly using nbviewer or nbconvert. To render this notebook correctly, open in IPython Notebook and run Cell->Run All from the menu bar.

Introduction

The IPython Notebook allows Markdown, HTML, and inline LaTeX in Markdown cells. The inline LaTeX is parsed with MathJax and the Markdown is parsed with marked. Any inline HTML is left to the web browser to parse. NBConvert is a utility that converts notebooks to various formats, and it uses Pandoc to parse Markdown text. Because the notebook web interface supports a mix of Markdown, HTML, and LaTeX, Pandoc has trouble converting notebook Markdown. This results in incomplete representations of the notebook in nbviewer or in a compiled LaTeX PDF.

This isn't a Pandoc flaw; Pandoc isn't designed to parse and convert a mixed format document. Unfortunately, this means that Pandoc can only support a subset of the markup supported in the notebook web interface. This notebook compares output of Pandoc to the notebook web interface.

Changes:

05102013

  • heading anchors
  • note on remote images

06102013

  • remove strip_math_space filter
  • add lxml test

Utilities

Define functions to render Markdown using the notebook and Pandoc.

In [1]:
from IPython.nbconvert.utils.pandoc import pandoc
from IPython.display import HTML, Javascript, display

from IPython.nbconvert.filters import citation2latex, strip_files_prefix, \
                                     markdown2html, markdown2latex

def pandoc_render(markdown):
    """Display nbconvert's LaTeX and HTML conversions of the Markdown side by side."""
    
    ## Convert the markdown directly to latex.  This is what nbconvert does.
    #latex = pandoc(markdown, "markdown", "latex")
    #html = pandoc(markdown, "markdown", "html", ["--mathjax"])
    
    # nbconvert template conversions
    html = strip_files_prefix(markdown2html(markdown))
    latex = markdown2latex(citation2latex(markdown))
    display(HTML(data="<div style='display: inline-block; width: 30%; vertical-align: top;'>" \
                 "<div style='background: #AAFFAA; width: 100%;'>NBConvert Latex Output</div>" \
                 "<pre class='prettyprint lang-tex' style='background: #EEFFEE; border: 1px solid #DDEEDD;'><xmp>" + latex + "</xmp></pre>"\
                 "</div>" \
                 "<div style='display: inline-block; width: 2%;'></div>" \
                 "<div style='display: inline-block; width: 30%; vertical-align: top;'>" \
                 "<div style='background: #FFAAAA; width: 100%;'>NBViewer Output</div>" \
                 "<div style='display: inline-block; width: 100%;'>" + html + "</div>" \
                 "</div>"))
    javascript = """
    $.getScript("https://google-code-prettify.googlecode.com/svn/loader/run_prettify.js");
"""
    display(Javascript(data=javascript))

def notebook_render(markdown):
    """Render the Markdown in the browser with the notebook's own MarkdownCell."""
    javascript = """
var mdcell = new IPython.MarkdownCell();
mdcell.create_element();
mdcell.set_text('""" + markdown.replace("\\", "\\\\").replace("'", "\\'").replace("\n", "\\n") + """');
mdcell.render();
$(element).append(mdcell.element)
.removeClass()
.css('left', '66%')
.css('position', 'absolute')
.css('width', '30%')
mdcell.element.prepend(
    $('<div />')
    .removeClass()
    .css('background', '#AAAAFF')
    .css('width', '100%')
    .html('Notebook Output')

);
container.show()
"""
    display(Javascript(data=javascript))

    
def pandoc_html_render(markdown):
    """Render Pandoc Markdown->LaTeX->HTML content."""
    
    # Convert the markdown directly to latex.  This is what nbconvert does.
    latex = pandoc(markdown, "markdown", "latex")
    
    # Convert the pandoc generated latex to HTML so it can be rendered in 
    # the web browser.
    html = pandoc(latex, "latex", "html", ["--mathjax"])
    display(HTML(data="<div style='background: #AAFFAA; width: 40%;'>HTML Pandoc Output</div>" \
                 "<div style='display: inline-block; width: 40%;'>" + html + "</div>"))
    return html
    
def compare_render(markdown):
    """Show the notebook rendering next to the nbconvert LaTeX and HTML output."""
    notebook_render(markdown)
    pandoc_render(markdown)
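
The hand-written escaping in notebook_render has to cover backslashes, quotes, and newlines one by one. As an alternative sketch (not what the notebook does), the standard-library json module can produce a JavaScript-safe string literal in one call. The helper names js_string_literal and set_text_call are hypothetical and exist only for this illustration.

```python
import json


def js_string_literal(text):
    """Return `text` as a double-quoted JavaScript string literal.

    Hypothetical helper: json.dumps escapes backslashes, double quotes,
    and control characters such as newlines, which is the job the
    chained .replace() calls do by hand.
    """
    return json.dumps(text)


def set_text_call(markdown):
    """Build the `mdcell.set_text(...)` statement for the given Markdown."""
    # The literal already carries its own quotes, so none are added here.
    return "mdcell.set_text(" + js_string_literal(markdown) + ");"
```

With this approach the surrounding single quotes in the JavaScript template would be dropped, since the generated literal is already quoted.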

Outputs

In [1]:
try:
    import lxml
    print('LXML found!')
except ImportError:
    print('Warning! No LXML found - the old citation2latex filter will not work')
LXML found!

General markdown

Heading level 6 has no LaTeX counterpart in Pandoc's output; it is emitted as plain text.

In [2]:
compare_render(r"""

# Heading 1 
## Heading 2 
### Heading 3 
#### Heading 4 
##### Heading 5 
###### Heading 6""")
NBConvert Latex Output
\section{Heading 1}

\subsection{Heading 2}

\subsubsection{Heading 3}

\paragraph{Heading 4}

\subparagraph{Heading 5}

Heading 6
NBViewer Output

Heading 1

Heading 2

Heading 3

Heading 4

Heading 5
Heading 6

Pandoc (possibly only on Windows) does not recognize headers unless there is a blank line above them.

In [3]:
compare_render(r"""
# Heading 1 
## Heading 2 
### Heading 3 
#### Heading 4 
##### Heading 5 
###### Heading 6 """)

print("\n"*10)
NBConvert Latex Output
\section{Heading 1}

\subsection{Heading 2}

\subsubsection{Heading 3}

\paragraph{Heading 4}

\subparagraph{Heading 5}

Heading 6
NBViewer Output

Heading 1

Heading 2

Heading 3

Heading 4

Heading 5
Heading 6
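
If the missing blank line turns out to be a problem, one possible workaround is to normalize the Markdown before it reaches Pandoc. The following is a minimal sketch under that assumption; ensure_blank_line_before_headings is a hypothetical helper, not part of nbconvert, and it only handles ATX-style (`#`) headings outside fenced code blocks.

```python
import re


def ensure_blank_line_before_headings(markdown):
    """Insert a blank line above every ATX heading that lacks one.

    Hypothetical pre-processing step; lines inside ``` fences are left
    alone so that comments such as `# note` in code are not affected.
    """
    out = []
    in_fence = False
    for line in markdown.split("\n"):
        if line.lstrip().startswith("```"):
            in_fence = not in_fence
        is_heading = (not in_fence) and re.match(r"#{1,6}\s", line)
        # Only add a separator when the previous line has visible content.
        if is_heading and out and out[-1].strip() != "":
            out.append("")
        out.append(line)
    return "\n".join(out)
```

The result could then be passed to compare_render in place of the raw string.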










Internal links to headings do not work in nbviewer or LaTeX, because the local link target does not exist there.

In [4]:
compare_render(r"""
[Link2Heading](http://127.0.0.1:8888/0a2d8086-ee24-4e5b-a32b-f66b525836cb#General-markdown)
""")
NBConvert Latex Output
\href{http://127.0.0.1:8888/0a2d8086-ee24-4e5b-a32b-f66b525836cb\#General-markdown}{Link2Heading}
NBViewer Output
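
One conceivable mitigation is to reduce links that point at the local notebook server to their fragment before conversion. This is only a sketch: localize_links is a hypothetical helper, it assumes the server runs on 127.0.0.1 or localhost, and whether the remaining `#fragment` link resolves still depends on the output format generating matching heading anchors.

```python
import re

# Matches the target of a Markdown link that points at a local notebook
# server and ends in a fragment, e.g. ](http://127.0.0.1:8888/<id>#Anchor)
LOCAL_LINK = re.compile(
    r"\]\(https?://(?:127\.0\.0\.1|localhost)(?::\d+)?/[^)#\s]*(#[^)\s]+)\)"
)


def localize_links(markdown):
    """Keep only the fragment of links to the local notebook server."""
    return LOCAL_LINK.sub(r"](\1)", markdown)
```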

Basic Markdown bold and italic works.

In [5]:
compare_render(r"""
This is Markdown **bold** and *italic* text.
""")
NBConvert Latex Output
This is Markdown \textbf{bold} and \emph{italic} text.
NBViewer Output

This is Markdown bold and italic text.

Nested lists work as well

In [6]:
compare_render(r"""
- li 1
- li 2
    1. li 3
    1. li 4
- li 5
""")
NBConvert Latex Output
\begin{itemize}
\itemsep1pt\parskip0pt\parsep0pt
\item
  li 1
\item
  li 2

  \begin{enumerate}
  \def\labelenumi{\arabic{enumi}.}
  \itemsep1pt\parskip0pt\parsep0pt
  \item
    li 3
  \item
    li 4
  \end{enumerate}
\item
  li 5
\end{itemize}
NBViewer Output
  • li 1
  • li 2
    1. li 3
    2. li 4
  • li 5

Unicode support

In [7]:
compare_render(ur"""
überschuß +***^°³³ α β θ
""")
NBConvert Latex Output
überschuß +\emph{*}\^{}°³³ α β θ
NBViewer Output

überschuß +*^°³³ α β θ

Pandoc may produce invalid LaTeX, e.g. \sout is not allowed in headings.

In [8]:
compare_render(r"""

# Heading 1 ~~strikeout~~
""")
NBConvert Latex Output
\section{Heading 1 \sout{strikeout}}
NBViewer Output

Heading 1 strikeout
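
A possible way to avoid the invalid \sout in section titles is to drop the strikeout markers from heading lines before conversion. The sketch below assumes that losing the strikeout in headings is acceptable; strip_strikeout_in_headings is a hypothetical helper, not an nbconvert filter.

```python
import re


def strip_strikeout_in_headings(markdown):
    """Remove ~~strikeout~~ markers from ATX heading lines only."""
    def fix(line):
        if re.match(r"#{1,6}\s", line):
            # Keep the struck-out text itself, drop only the markers.
            return re.sub(r"~~(.+?)~~", r"\1", line)
        return line

    return "\n".join(fix(line) for line in markdown.split("\n"))
```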

Horizontal lines work just fine

In [9]:
compare_render(r"""
above

--------

below
""")
NBConvert Latex Output
above

\begin{center}\rule{3in}{0.4pt}\end{center}

below
NBViewer Output

above


below

Extended markdown of pandoc

(maybe we should deactivate this)

In [10]:
compare_render(r"""
This is Markdown ~subscript~ and ^superscript^ text.
""")
NBConvert Latex Output
This is Markdown \textsubscript{subscript} and
\textsuperscript{superscript} text.
NBViewer Output

This is Markdown subscript and superscript text.

An underscore without a preceding space behaves inconsistently (Pandoc extension: intraword_underscores; deactivate?).

In [11]:
compare_render(r"""
This is Markdown not_italic_.
""")
NBConvert Latex Output
This is Markdown not\_italic\_.
NBViewer Output

This is Markdown not_italic_.

Pandoc allows defining TeX macros, which are respected for all output formats; the notebook does not.

In [12]:
compare_render(r"""
\newcommand{\tuple}[1]{\langle #1 \rangle}

$\tuple{a, b, c}$
""")
NBConvert Latex Output
\newcommand{\tuple}[1]{\langle #1 \rangle}

$\tuple{a, b, c}$
NBViewer Output

\(\langle a, b, c \rangle\)

When the \newcommand is placed inside a math environment, it works in the notebook and nbviewer but produces invalid LaTeX (the \newcommand is only valid within the same math environment).

In [13]:
compare_render(r"""
$\newcommand{\foo}[1]{...:: #1 ::...}$
$\foo{bar}$
""")
NBConvert Latex Output
$\newcommand{\foo}[1]{...:: #1 ::...}$ $\foo{bar}$
NBViewer Output

\(\newcommand{\foo}[1]{...:: #1 ::...}\) \(\foo{bar}\)

HTML or LaTeX injections

Raw HTML gets dropped entirely when converting to $\LaTeX$.

In [14]:
compare_render(r"""
This is HTML <b>bold</b> and <i>italic</i> text.
""")
NBConvert Latex Output
This is HTML bold and italic text.
NBViewer Output

This is HTML bold and italic text.
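
Since the tags are dropped but their text survives, simple inline HTML could be translated to Markdown before conversion so that the emphasis is kept in LaTeX as well. This is a rough sketch covering only attribute-free `<b>`, `<strong>`, `<i>`, and `<em>` tags; html_inline_to_markdown is a hypothetical helper, and a regular expression is not a general HTML parser.

```python
import re

# (pattern, replacement) pairs for attribute-free inline tags.
INLINE_TAGS = [
    (re.compile(r"<(b|strong)>(.*?)</\1>", re.I | re.S), r"**\2**"),
    (re.compile(r"<(i|em)>(.*?)</\1>", re.I | re.S), r"*\2*"),
]


def html_inline_to_markdown(markdown):
    """Replace simple bold/italic HTML tags with Markdown emphasis."""
    for pattern, replacement in INLINE_TAGS:
        markdown = pattern.sub(replacement, markdown)
    return markdown
```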

Same for something like center

In [15]:
compare_render(r"""
<center>Center aligned</center>
""")
NBConvert Latex Output
Center aligned
NBViewer Output
Center aligned

Raw $\LaTeX$ gets dropped entirely when converted to HTML. (I don't know why the HTML output is cropped here.)

In [16]:
compare_render(r"""
This is \LaTeX \bf{bold} and \emph{italic} text.
""")
NBConvert Latex Output
This is \LaTeX \bf{bold} and \emph{italic} text.
NBViewer Output

This is

A combination of raw $\LaTeX$ and raw HTML

In [17]:
compare_render(r"""
**foo** $\left( \sum_{k=1}^n a_k b_k \right)^2 \leq$ <b>b\$ar</b> $$test$$ 
\cite{}
""")
NBConvert Latex Output
\textbf{foo} $\left( \sum_{k=1}^n a_k b_k \right)^2 \leq$ b\$ar \[test\]
\cite{}
NBViewer Output

foo \(\left( \sum_{k=1}^n a_k b_k \right)^2 \leq\) b$ar \[test\]

Tables

HTML tables render in the notebook, but not in Pandoc.

In [18]:
compare_render(r"""
<table>
    <tr>
        <td>a</td>
        <td>b</td>
    </tr>
    <tr>
        <td>c</td>
        <td>d</td>
    </tr>
</table>
""")
NBConvert Latex Output
a

b

c

d
NBViewer Output
a b
c d

Instead, Pandoc supports simple ASCII tables. Unfortunately, marked.js doesn't support these, so they are not supported in the notebook.

In [19]:
compare_render(r"""
+---+---+
| a | b |
+---+---+
| c | d |
+---+---+
""")
NBConvert Latex Output
\begin{longtable}[c]{@{}ll@{}}
\hline\noalign{\medskip}
\begin{minipage}[t]{0.06\columnwidth}\raggedright
a
\end{minipage} & \begin{minipage}[t]{0.06\columnwidth}\raggedright
b
\end{minipage}
\\\noalign{\medskip}
\begin{minipage}[t]{0.06\columnwidth}\raggedright
c
\end{minipage} & \begin{minipage}[t]{0.06\columnwidth}\raggedright
d
\end{minipage}
\\\noalign{\medskip}
\hline
\end{longtable}
NBViewer Output

a

b

c

d

An alternative to basic ASCII tables is pipe tables. Pipe tables are recognized by Pandoc and supported by marked, so they are the best way to add tables.

In [20]:
compare_render(r"""
|Left |Center |Right|
|:----|:-----:|----:|
|Text1|Text2  |Text3|
""")
NBConvert Latex Output
\begin{longtable}[c]{@{}lcr@{}}
\hline\noalign{\medskip}
Left & Center & Right
\\\noalign{\medskip}
\hline\noalign{\medskip}
Text1 & Text2 & Text3
\\\noalign{\medskip}
\hline
\end{longtable}
NBViewer Output
Left Center Right
Text1 Text2 Text3
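
Because pipe tables are the one syntax both sides accept, it can be convenient to generate them from data. The following sketch builds a pipe table like the one in the cell above; pipe_table and its `align` codes ("l", "c", "r") are assumptions made for this illustration.

```python
def pipe_table(header, rows, align=None):
    """Build a Markdown pipe table from a header row and data rows.

    `align` holds one of "l", "c", "r" per column (hypothetical codes);
    columns default to left alignment.
    """
    align = align or ["l"] * len(header)
    rules = {"l": ":---", "c": ":---:", "r": "---:"}

    def fmt(cells):
        return "|" + "|".join(str(cell) for cell in cells) + "|"

    lines = [fmt(header), fmt(rules[a] for a in align)]
    lines.extend(fmt(row) for row in rows)
    return "\n".join(lines)
```

The returned string can be passed straight to compare_render.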

Pandoc recognizes cell alignment in simple tables. Since marked.js doesn't recognize ascii tables, it can't render this table.

In [21]:
compare_render(r"""
Right Aligned Center Aligned Left Aligned
------------- -------------- ------------
          Why      does      this
     actually      work?     Who
        knows       ...
""")

print("\n"*5)
NBConvert Latex Output
\begin{longtable}[c]{@{}lll@{}}
\hline\noalign{\medskip}
Right Aligned & Center Aligned & Left Aligned
\\\noalign{\medskip}
\hline\noalign{\medskip}
Why & does & this
\\\noalign{\medskip}
actually & work? & Who
\\\noalign{\medskip}
knows & \ldots{} &
\\\noalign{\medskip}
\hline
\end{longtable}
NBViewer Output
Right Aligned Center Aligned Left Aligned
Why does this
actually work? Who
knows ...





Images

Markdown images work on both. However, remote images are not allowed in $\LaTeX$. Maybe add a preprocessor to download these. The alternate text is displayed in nbviewer next to the image.

In [22]:
compare_render(r"""
![Alternate Text](http://ipython.org/_static/IPy_header.png)
""")
NBConvert Latex Output
\begin{figure}[htbp]
\centering
\includegraphics{http://ipython.org/_static/IPy_header.png}
\caption{Alternate Text}
\end{figure}
NBViewer Output
Alternate Text

Alternate Text
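
The download preprocessor suggested above might look roughly like the sketch below. It rewrites remote Markdown image links to local paths and hands each URL to a caller-supplied `fetch(url, path)` callable (for instance a wrapper around the standard library's urlretrieve). The names localize_remote_images, `fetch`, and `directory` are hypothetical, and no such preprocessor is claimed to exist in nbconvert.

```python
import posixpath
import re

# Markdown image syntax with an http(s) target: ![alt](http://...)
REMOTE_IMAGE = re.compile(r"!\[([^\]]*)\]\((https?://[^)\s]+)\)")


def localize_remote_images(markdown, fetch=None, directory="."):
    """Point remote image links at local files, optionally fetching them."""
    def replace(match):
        alt, url = match.group(1), match.group(2)
        # Drop any query string, then reuse the remote file name locally.
        filename = posixpath.basename(url.split("?")[0])
        local_path = posixpath.join(directory, filename)
        if fetch is not None:
            fetch(url, local_path)
        return "![%s](%s)" % (alt, local_path)

    return REMOTE_IMAGE.sub(replace, markdown)
```

Keeping the download behind a callable makes the rewriting step usable without network access.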

HTML Images only work in the notebook.

In [23]:
compare_render(r"""
<img src="http://ipython.org/_static/IPy_header.png">
""")
NBConvert Latex Output
NBViewer Output

Math

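Simple display-style math works fine. Inline math padded with spaces inside the \$s is not captured by Pandoc and comes out with escaped dollar signs.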

In [24]:
compare_render(r"""
My equation:
$$ 5/x=2y $$

It is inline $ 5/x=2y $ here.
""")
NBConvert Latex Output
My equation: \[ 5/x=2y \]

It is inline \$ 5/x=2y \$ here.
NBViewer Output

My equation: \[ 5/x=2y \]

It is inline $ 5/x=2y $ here.

If the first \$ is on a new line, the equation is not captured by md2tex. If both \$s are on a new line, md2html fails (note that the raw LaTeX is dropped), but the notebook renders it correctly.

In [25]:
compare_render(r"""
$5 \cdot x=2$

$
5 \cdot x=2$

$
5 \cdot x=2
$
""")
NBConvert Latex Output
$5 \cdot x=2$

\$ 5 \cdot x=2\$

\$ 5 \cdot x=2 \$
NBViewer Output

\(5 \cdot x=2\)

$ 5 x=2$

$ 5 x=2 $
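
The cases above differ only in the whitespace between the \$s and the formula, so one possible workaround is to strip that whitespace before conversion. The sketch below is similar in spirit to the strip_math_space filter mentioned in the changelog but is not its implementation; tighten_inline_math is a hypothetical helper. It deliberately leaves `$$...$$` and escaped `\$` alone, and it will misfire on prose that uses two literal dollar signs (such as prices).

```python
import re

# A single, unescaped $ ... $ pair; $$ display math is excluded by the
# lookarounds, and the body may not contain further dollar signs.
INLINE_MATH = re.compile(r"(?<![\\$])\$(?!\$)\s*([^$]+?)\s*\$(?!\$)")


def tighten_inline_math(markdown):
    """Remove whitespace (including newlines) just inside inline $...$."""
    return INLINE_MATH.sub(lambda m: "$" + m.group(1) + "$", markdown)
```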

MathJax permits some $\LaTeX$ math constructs without \$s; of course, this raw $\LaTeX$ is stripped when converting to HTML. Moreover, the & characters are escaped by the lxml parsing (#4251).

In [26]:
compare_render(r"""
\begin{align}
a & b\\
d & c
\end{align}

\begin{eqnarray}
a & b \\
c & d
\end{eqnarray}
""")
NBConvert Latex Output
\begin{align}
a &amp; b\\
d &amp; c
\end{align}

\begin{eqnarray}
a &amp; b \\
c &amp; d
\end{eqnarray}
NBViewer Output
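
Until the escaping is fixed at its source, the LaTeX could be post-processed to turn the basic entities back into characters. This only treats the symptom, and it would also rewrite entity text that was meant literally; unescape_basic_entities is a hypothetical helper shown purely as a sketch.

```python
# `&amp;` is handled last so that `&amp;lt;` becomes `&lt;`, not `<`.
ENTITIES = [("&lt;", "<"), ("&gt;", ">"), ("&amp;", "&")]


def unescape_basic_entities(latex):
    """Undo the HTML escaping of <, > and & in converted LaTeX."""
    for entity, char in ENTITIES:
        latex = latex.replace(entity, char)
    return latex
```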

There is another lxml issue, #4283

In [27]:
compare_render(r"""
1<2 is true, but 3>4 is false.

$1<2$ is true, but $3>4$ is false.

1<2 it is even worse if it is alone in a line.
""")
NBConvert Latex Output
14 is false.

$14$ is false.

1
NBViewer Output

1<2 is true, but 3>4 is false.

\(1<2\) is true, but \(3>4\) is false.

1<2 it is even worse if it is alone in a line.

Listings, and Code blocks

In [28]:
compare_render(r"""
some source code

```
a = "test"
print(a)
```
""")
NBConvert Latex Output
some source code

\begin{verbatim}
a = "test"
print(a)
\end{verbatim}
NBViewer Output

some source code

a = "test"
print(a)

Language specific syntax highlighting by Pandoc requires additional dependencies to render correctly.

In [29]:
compare_render(r"""
some source code

```python
a = "test"
print(a)
```
""")
NBConvert Latex Output
some source code

\begin{Shaded}
\begin{Highlighting}[]
\NormalTok{a = }\StringTok{"test"}
\KeywordTok{print}\NormalTok{(a)}
\end{Highlighting}
\end{Shaded}
NBViewer Output

some source code

a = "test"
print(a)