WARNING: This document will not render correctly using nbviewer or nbconvert. To render this notebook correctly, open in IPython Notebook and run Cell->Run All from the menu bar.

Introduction

The IPython Notebook allows Markdown, HTML, and inline LaTeX in Markdown cells. Inline LaTeX is rendered by MathJax and Markdown is parsed by marked; any inline HTML is left to the web browser to parse. NBConvert is a utility that converts notebooks to various formats, and it uses Pandoc to parse Markdown text. Because the notebook web interface accepts a mix of Markdown, HTML, and LaTeX, Pandoc has trouble converting notebook Markdown. This results in incomplete representations of the notebook in nbviewer or in a compiled LaTeX PDF.

This isn't a Pandoc flaw; Pandoc isn't designed to parse and convert a mixed format document. Unfortunately, this means that Pandoc can only support a subset of the markup supported in the notebook web interface. This notebook compares output of Pandoc to the notebook web interface.

Changes:

05102013

  • heading anchors
  • note on remote images

06102013

  • remove strip_math_space filter
  • add lxml test

Utilities

Define functions to render Markdown using the notebook and Pandoc.

In [1]:
from IPython.nbconvert.utils.pandoc import pandoc
from IPython.display import HTML, Javascript, display

from IPython.nbconvert.filters import citation2latex, strip_files_prefix, \
                                     markdown2html, markdown2latex

def pandoc_render(markdown):
    """Render Pandoc Markdown->LaTeX content."""
    
    ## Convert the markdown directly to latex.  This is what nbconvert does.
    #latex = pandoc(markdown, "markdown", "latex")
    #html = pandoc(markdown, "markdown", "html", ["--mathjax"])
    
    # nbconvert template conversions
    html = strip_files_prefix(markdown2html(markdown))
    latex = markdown2latex(citation2latex(markdown))
    display(HTML(data="<div style='display: inline-block; width: 30%; vertical-align: top;'>" \
                 "<div style='background: #AAFFAA; width: 100%;'>NBConvert Latex Output</div>" \
                 "<pre class='prettyprint lang-tex' style='background: #EEFFEE; border: 1px solid #DDEEDD;'><xmp>" + latex + "</xmp></pre>"\
                 "</div>" \
                 "<div style='display: inline-block; width: 2%;'></div>" \
                 "<div style='display: inline-block; width: 30%; vertical-align: top;'>" \
                 "<div style='background: #FFAAAA; width: 100%;'>NBViewer Output</div>" \
                 "<div style='display: inline-block; width: 100%;'>" + html + "</div>" \
                 "</div>"))
    javascript = """
    $.getScript("https://google-code-prettify.googlecode.com/svn/loader/run_prettify.js");
"""
    display(Javascript(data=javascript))

def notebook_render(markdown):
    """Render Markdown with the notebook's own MarkdownCell (marked + MathJax)."""
    javascript = """
var mdcell = new IPython.MarkdownCell();
mdcell.create_element();
mdcell.set_text('""" + markdown.replace("\\", "\\\\").replace("'", "\\'").replace("\n", "\\n") + """');
mdcell.render();
$(element).append(mdcell.element)
.removeClass()
.css('left', '66%')
.css('position', 'absolute')
.css('width', '30%')
mdcell.element.prepend(
    $('<div />')
    .removeClass()
    .css('background', '#AAAAFF')
    .css('width', '100%')
    .html('Notebook Output')

);
container.show()
"""
    display(Javascript(data=javascript))

    
def pandoc_html_render(markdown):
    """Render Pandoc Markdown->LaTeX->HTML content."""
    
    # Convert the markdown directly to latex.  This is what nbconvert does.
    latex = pandoc(markdown, "markdown", "latex")
    
    # Convert the pandoc generated latex to HTML so it can be rendered in 
    # the web browser.
    html = pandoc(latex, "latex", "html", ["--mathjax"])
    display(HTML(data="<div style='background: #AAFFAA; width: 40%;'>HTML Pandoc Output</div>" \
                 "<div style='display: inline-block; width: 40%;'>" + html + "</div>"))
    return html
    
def compare_render(markdown):
    notebook_render(markdown)
    pandoc_render(markdown)
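
The hand-written escaping in notebook_render has to cover backslashes, quotes, and newlines one by one. As a hedged alternative, the sketch below builds the JavaScript string literal with the standard library's json.dumps. The helper names js_string_literal and build_set_text_call are hypothetical and are not part of IPython.

```python
import json


def js_string_literal(text):
    """Return `text` as a double-quoted JavaScript string literal (hypothetical helper)."""
    # json.dumps escapes backslashes, double quotes, newlines and non-ASCII
    # characters; the result is also a valid JavaScript string literal.
    literal = json.dumps(text)
    # Assumption: the literal may end up inside an inline <script> element,
    # so "</" is broken up to avoid closing that element early.
    return literal.replace("</", "<\\/")


def build_set_text_call(markdown):
    """Build the `mdcell.set_text(...)` statement used by notebook_render."""
    return "mdcell.set_text(" + js_string_literal(markdown) + ");"
```

Single quotes need no escaping here because the literal is delimited by double quotes.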

Outputs

In [1]:
try:
    import lxml
    print('LXML found!')
except ImportError:
    print('Warning! No LXML found - the old citation2latex filter will not work')
LXML found!

General markdown

Heading level 6 is not supported by Pandoc.

In [2]:
compare_render(r"""

# Heading 1 
## Heading 2 
### Heading 3 
#### Heading 4 
##### Heading 5 
###### Heading 6""")
<IPython.core.display.Javascript at 0x21ac2d0>
NBConvert Latex Output
\section{Heading 1}

\subsection{Heading 2}

\subsubsection{Heading 3}

\paragraph{Heading 4}

\subparagraph{Heading 5}

Heading 6
NBViewer Output

Heading 1

Heading 2

Heading 3

Heading 4

Heading 5
Heading 6
<IPython.core.display.Javascript at 0x21b9fd0>
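
The LaTeX output above maps heading levels 1 to 5 onto sectioning commands and leaves level 6 as plain text. The sketch below reproduces that mapping for a single line. It is an illustration only, not Pandoc's implementation; heading_to_latex and LATEX_HEADINGS are hypothetical names, and no LaTeX escaping of the title is attempted.

```python
import re

# Sectioning commands as they appear in the LaTeX output above.
LATEX_HEADINGS = {
    1: "section",
    2: "subsection",
    3: "subsubsection",
    4: "paragraph",
    5: "subparagraph",
}


def heading_to_latex(line):
    """Convert one ATX heading line to LaTeX; other lines are returned as-is."""
    match = re.match(r"^(#{1,6})\s+(.*?)\s*$", line)
    if match is None:
        return line
    level = len(match.group(1))
    title = match.group(2)
    command = LATEX_HEADINGS.get(level)
    if command is None:
        # Level 6 has no sectioning command here, so only the text remains.
        return title
    return "\\%s{%s}" % (command, title)
```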

Headers aren't recognized (by Pandoc on Windows?) if there is no blank line above them.

In [3]:
compare_render(r"""
# Heading 1 
## Heading 2 
### Heading 3 
#### Heading 4 
##### Heading 5 
###### Heading 6 """)

print("\n"*10)
<IPython.core.display.Javascript at 0x21ac550>
NBConvert Latex Output
\section{Heading 1}

\subsection{Heading 2}

\subsubsection{Heading 3}

\paragraph{Heading 4}

\subparagraph{Heading 5}

Heading 6
NBViewer Output

Heading 1

Heading 2

Heading 3

Heading 4

Heading 5
Heading 6
<IPython.core.display.Javascript at 0x21ac550>
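
If a missing blank line above a heading is a problem on some setup, one possible workaround is to normalize the Markdown before conversion. The sketch below is a hypothetical preprocessor (blank_line_before_headings is not an nbconvert function). It inserts an empty line before every ATX heading that directly follows non-empty text, and it skips lines inside fenced code blocks on the assumption that fences are written with three backticks.

```python
import re

_HEADING = re.compile(r"^#{1,6}\s")
FENCE = "`" * 3  # three backticks, assumed to delimit fenced code blocks


def blank_line_before_headings(markdown):
    """Ensure every ATX heading is preceded by a blank line."""
    out = []
    in_fence = False
    for line in markdown.split("\n"):
        if line.lstrip().startswith(FENCE):
            # Toggle fence state; lines inside a fence are never headings.
            in_fence = not in_fence
        elif not in_fence and _HEADING.match(line) and out and out[-1].strip():
            out.append("")
        out.append(line)
    return "\n".join(out)
```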










Internal links that point at the local notebook server do not work in nbviewer or LaTeX, because the local address does not exist there.

In [4]:
compare_render(r"""
[Link2Heading](http://127.0.0.1:8888/0a2d8086-ee24-4e5b-a32b-f66b525836cb#General-markdown)
""")
<IPython.core.display.Javascript at 0x21ac210>
NBConvert Latex Output
\href{http://127.0.0.1:8888/0a2d8086-ee24-4e5b-a32b-f66b525836cb\#General-markdown}{Link2Heading}
NBViewer Output
<IPython.core.display.Javascript at 0x21ac210>
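
One way to make such links portable is to keep only the fragment, so the link refers to an anchor in the current document. The sketch below does this for links that point at a local host. localize_links and LOCAL_HOSTS are hypothetical names, and whether a fragment-only link then resolves depends on the target format generating matching heading anchors, which is an assumption here.

```python
import re
from urllib.parse import urlsplit

# Hosts treated as "the local notebook server" (an assumption).
LOCAL_HOSTS = {"127.0.0.1", "localhost"}
_LINK = re.compile(r"\[([^\]]*)\]\(([^)\s]+)\)")


def localize_links(markdown):
    """Rewrite links to a local host so that only the #fragment remains."""

    def fix(match):
        text, url = match.group(1), match.group(2)
        parts = urlsplit(url)
        if parts.hostname in LOCAL_HOSTS and parts.fragment:
            return "[%s](#%s)" % (text, parts.fragment)
        # Remote links and links without a fragment are left untouched.
        return match.group(0)

    return _LINK.sub(fix, markdown)
```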

Basic Markdown bold and italic works.

In [5]:
compare_render(r"""
This is Markdown **bold** and *italic* text.
""")
<IPython.core.display.Javascript at 0x21ac450>
NBConvert Latex Output
This is Markdown \textbf{bold} and \emph{italic} text.
NBViewer Output

This is Markdown bold and italic text.

<IPython.core.display.Javascript at 0x21ac450>

Nested lists work as well

In [6]:
compare_render(r"""
- li 1
- li 2
    1. li 3
    1. li 4
- li 5
""")
<IPython.core.display.Javascript at 0x21ac150>
NBConvert Latex Output
\begin{itemize}
\itemsep1pt\parskip0pt\parsep0pt
\item
  li 1
\item
  li 2

  \begin{enumerate}
  \def\labelenumi{\arabic{enumi}.}
  \itemsep1pt\parskip0pt\parsep0pt
  \item
    li 3
  \item
    li 4
  \end{enumerate}
\item
  li 5
\end{itemize}
NBViewer Output
  • li 1
  • li 2
    1. li 3
    2. li 4
  • li 5
<IPython.core.display.Javascript at 0x21ac150>

Unicode support

In [7]:
compare_render(ur"""
überschuß +***^°³³ α β θ
""")
<IPython.core.display.Javascript at 0x22b6950>
NBConvert Latex Output
überschuß +\emph{*}\^{}°³³ α β θ
NBViewer Output

überschuß +*^°³³ α β θ

<IPython.core.display.Javascript at 0x21ac3d0>

Pandoc may produce invalid LaTeX, e.g. \sout is not allowed in headings.

In [8]:
compare_render(r"""

# Heading 1 ~~strikeout~~
""")
<IPython.core.display.Javascript at 0x21ac590>
NBConvert Latex Output
\section{Heading 1 \sout{strikeout}}
NBViewer Output

Heading 1 strikeout

<IPython.core.display.Javascript at 0x21ac590>
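
A possible workaround is to remove the strikeout markup from heading lines before the text reaches Pandoc, so that no \sout ends up inside a sectioning command. The sketch below is a hypothetical preprocessor (unstrike_headings is not part of nbconvert), and it only handles ATX headings.

```python
import re

_HEADING = re.compile(r"^(#{1,6}\s)(.*)$")
_STRIKE = re.compile(r"~~(.+?)~~")


def unstrike_headings(markdown):
    """Drop ~~strikeout~~ markers inside ATX headings, keeping the text."""
    lines = []
    for line in markdown.split("\n"):
        match = _HEADING.match(line)
        if match:
            # Keep the heading prefix, unwrap strikeout spans in the title.
            line = match.group(1) + _STRIKE.sub(r"\1", match.group(2))
        lines.append(line)
    return "\n".join(lines)
```

Strikeout in body text is left alone, since it is only the heading case that is reported as invalid.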

Horizontal lines work just fine

In [9]:
compare_render(r"""
above

--------

below
""")
<IPython.core.display.Javascript at 0x21ac150>
NBConvert Latex Output
above

\begin{center}\rule{3in}{0.4pt}\end{center}

below
NBViewer Output

above


below

<IPython.core.display.Javascript at 0x21ac450>

Extended markdown of pandoc

(maybe we should deactivate this)

In [10]:
compare_render(r"""
This is Markdown ~subscript~ and ^superscript^ text.
""")
<IPython.core.display.Javascript at 0x21ac150>
NBConvert Latex Output
This is Markdown \textsubscript{subscript} and
\textsuperscript{superscript} text.
NBViewer Output

This is Markdown subscript and superscript text.

<IPython.core.display.Javascript at 0x21ac150>

An underscore with no space before it behaves inconsistently (Pandoc extension: intraword_underscores; deactivate?).

In [11]:
compare_render(r"""
This is Markdown not_italic_.
""")
<IPython.core.display.Javascript at 0x21ac5d0>
NBConvert Latex Output
This is Markdown not\_italic\_.
NBViewer Output

This is Markdown not_italic_.

<IPython.core.display.Javascript at 0x21ac5d0>

Pandoc allows TeX macros to be defined, and these are respected for all output formats; the notebook does not support this.

In [12]:
compare_render(r"""
\newcommand{\tuple}[1]{\langle #1 \rangle}

$\tuple{a, b, c}$
""")
<IPython.core.display.Javascript at 0x21ac450>
NBConvert Latex Output
\newcommand{\tuple}[1]{\langle #1 \rangle}

$\tuple{a, b, c}$
NBViewer Output

\(\langle a, b, c \rangle\)

<IPython.core.display.Javascript at 0x21ac450>

When the \newcommand is placed inside a math environment, it works in the notebook and nbviewer but produces invalid LaTeX (the \newcommand is only valid within the same math environment).

In [13]:
compare_render(r"""
$\newcommand{\foo}[1]{...:: #1 ::...}$
$\foo{bar}$
""")
<IPython.core.display.Javascript at 0x21ac590>
NBConvert Latex Output
$\newcommand{\foo}[1]{...:: #1 ::...}$ $\foo{bar}$
NBViewer Output

\(\newcommand{\foo}[1]{...:: #1 ::...}\) \(\foo{bar}\)

<IPython.core.display.Javascript at 0x21ac590>

HTML or LaTeX injections

Raw HTML gets dropped entirely when converting to $\LaTeX$.

In [14]:
compare_render(r"""
This is HTML <b>bold</b> and <i>italic</i> text.
""")
<IPython.core.display.Javascript at 0x21ac5d0>
NBConvert Latex Output
This is HTML bold and italic text.
NBViewer Output

This is HTML bold and italic text.

<IPython.core.display.Javascript at 0x21ac5d0>

Same for something like center

In [15]:
compare_render(r"""
<center>Center aligned</center>
""")
<IPython.core.display.Javascript at 0x21ac210>
NBConvert Latex Output
Center aligned
NBViewer Output
Center aligned
<IPython.core.display.Javascript at 0x21ac210>

Raw $\LaTeX$ gets dropped entirely when converted to HTML. (I don't know why the HTML output is cropped here.)

In [16]:
compare_render(r"""
This is \LaTeX \bf{bold} and \emph{italic} text.
""")
<IPython.core.display.Javascript at 0x21ac590>
NBConvert Latex Output
This is \LaTeX \bf{bold} and \emph{italic} text.
NBViewer Output

This is

<IPython.core.display.Javascript at 0x21ac590>

A combination of raw $\LaTeX$ and raw HTML

In [17]:
compare_render(r"""
**foo** $\left( \sum_{k=1}^n a_k b_k \right)^2 \leq$ <b>b\$ar</b> $$test$$ 
\cite{}
""")
<IPython.core.display.Javascript at 0x21ac590>
NBConvert Latex Output
\textbf{foo} $\left( \sum_{k=1}^n a_k b_k \right)^2 \leq$ b\$ar \[test\]
\cite{}
NBViewer Output

foo \(\left( \sum_{k=1}^n a_k b_k \right)^2 \leq\) b$ar \[test\]

<IPython.core.display.Javascript at 0x21ac590>

Tables

HTML tables render in the notebook, but not in Pandoc.

In [18]:
compare_render(r"""
<table>
    <tr>
        <td>a</td>
        <td>b</td>
    </tr>
    <tr>
        <td>c</td>
        <td>d</td>
    </tr>
</table>
""")
<IPython.core.display.Javascript at 0x21ac5d0>
NBConvert Latex Output
a

b

c

d
NBViewer Output
a b
c d
<IPython.core.display.Javascript at 0x21ac5d0>

Instead, Pandoc supports simple ascii tables. Unfortunately marked.js doesn't support this, and therefore it is not supported in the notebook.

In [19]:
compare_render(r"""
+---+---+
| a | b |
+---+---+
| c | d |
+---+---+
""")
<IPython.core.display.Javascript at 0x21ac210>
NBConvert Latex Output
\begin{longtable}[c]{@{}ll@{}}
\hline\noalign{\medskip}
\begin{minipage}[t]{0.06\columnwidth}\raggedright
a
\end{minipage} & \begin{minipage}[t]{0.06\columnwidth}\raggedright
b
\end{minipage}
\\\noalign{\medskip}
\begin{minipage}[t]{0.06\columnwidth}\raggedright
c
\end{minipage} & \begin{minipage}[t]{0.06\columnwidth}\raggedright
d
\end{minipage}
\\\noalign{\medskip}
\hline
\end{longtable}
NBViewer Output

a

b

c

d

<IPython.core.display.Javascript at 0x21ac210>

An alternative to basic ascii tables is pipe tables. Pipe tables can be recognized by Pandoc and are supported by marked, hence, this is the best way to add tables.

In [20]:
compare_render(r"""
|Left |Center |Right|
|:----|:-----:|----:|
|Text1|Text2  |Text3|
""")
<IPython.core.display.Javascript at 0x21ac150>
NBConvert Latex Output
\begin{longtable}[c]{@{}lcr@{}}
\hline\noalign{\medskip}
Left & Center & Right
\\\noalign{\medskip}
\hline\noalign{\medskip}
Text1 & Text2 & Text3
\\\noalign{\medskip}
\hline
\end{longtable}
NBViewer Output
Left Center Right
Text1 Text2 Text3
<IPython.core.display.Javascript at 0x21ac150>
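
Since pipe tables are the one table syntax both renderers accept, it can be convenient to generate them from data. The sketch below is a minimal generator. pipe_table and its align parameter are hypothetical names, and cells containing a `|` character are not escaped.

```python
def pipe_table(header, rows, align=None):
    """Build a Markdown pipe table; align holds 'l', 'c' or 'r' per column."""
    markers = {"l": ":---", "c": ":---:", "r": "---:"}
    align = align or ["l"] * len(header)

    def fmt(cells):
        # One table row: cells separated and surrounded by pipes.
        return "|" + "|".join(str(cell) for cell in cells) + "|"

    lines = [fmt(header), fmt(markers[a] for a in align)]
    lines.extend(fmt(row) for row in rows)
    return "\n".join(lines)
```

For example, `pipe_table(["Left", "Center", "Right"], [["Text1", "Text2", "Text3"]], ["l", "c", "r"])` yields a table equivalent to the one in the cell above.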

Pandoc recognizes cell alignment in simple tables. Since marked.js doesn't recognize ascii tables, it can't render this table.

In [21]:
compare_render(r"""
Right Aligned Center Aligned Left Aligned
------------- -------------- ------------
          Why      does      this
     actually      work?     Who
        knows       ...
""")

print("\n"*5)
<IPython.core.display.Javascript at 0x21ac450>
NBConvert Latex Output
\begin{longtable}[c]{@{}lll@{}}
\hline\noalign{\medskip}
Right Aligned & Center Aligned & Left Aligned
\\\noalign{\medskip}
\hline\noalign{\medskip}
Why & does & this
\\\noalign{\medskip}
actually & work? & Who
\\\noalign{\medskip}
knows & \ldots{} &
\\\noalign{\medskip}
\hline
\end{longtable}
NBViewer Output
Right Aligned Center Aligned Left Aligned
Why does this
actually work? Who
knows ...
<IPython.core.display.Javascript at 0x21ac450>





Images

Markdown images work in both. However, remote images are not allowed in $\LaTeX$; a preprocessor could be added to download them. In nbviewer, the alternate text is displayed next to the image.

In [22]:
compare_render(r"""
![Alternate Text](http://ipython.org/_static/IPy_header.png)
""")
<IPython.core.display.Javascript at 0x22b6690>
NBConvert Latex Output
\begin{figure}[htbp]
\centering
\includegraphics{http://ipython.org/_static/IPy_header.png}
\caption{Alternate Text}
\end{figure}
NBViewer Output
Alternate Text

Alternate Text

<IPython.core.display.Javascript at 0x21ac450>
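
The download preprocessor suggested above could start from something like the following sketch. It only rewrites remote image references to local paths and reports which URLs would need fetching; the download itself is left to the caller. localize_images and the images directory name are hypothetical, and file name collisions are not handled.

```python
import posixpath
import re
from urllib.parse import urlsplit

_IMAGE = re.compile(r"!\[([^\]]*)\]\((https?://[^)\s]+)\)")


def localize_images(markdown, directory="images"):
    """Return (rewritten_markdown, {remote_url: local_path})."""
    downloads = {}

    def fix(match):
        alt, url = match.group(1), match.group(2)
        # Use the last path component of the URL as the local file name.
        name = posixpath.basename(urlsplit(url).path) or "image"
        local = directory + "/" + name
        downloads[url] = local
        return "![%s](%s)" % (alt, local)

    return _IMAGE.sub(fix, markdown), downloads

# Each (url, local_path) pair in the returned mapping could then be fetched,
# for instance with urllib.request.urlretrieve(url, local_path).
```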

HTML images only work in the notebook.

In [23]:
compare_render(r"""
<img src="http://ipython.org/_static/IPy_header.png">
""")
<IPython.core.display.Javascript at 0x22b65d0>
NBConvert Latex Output
NBViewer Output

<IPython.core.display.Javascript at 0x21ac450>

Math

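Display-style math works fine. Inline math is not recognized by Pandoc when there are spaces just inside the \$s: the dollar signs come out escaped in the output below.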

In [24]:
compare_render(r"""
My equation:
$$ 5/x=2y $$

It is inline $ 5/x=2y $ here.
""")
<IPython.core.display.Javascript at 0x22b6950>
NBConvert Latex Output
My equation: \[ 5/x=2y \]

It is inline \$ 5/x=2y \$ here.
NBViewer Output

My equation: \[ 5/x=2y \]

It is inline $ 5/x=2y $ here.

<IPython.core.display.Javascript at 0x21ac450>
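
The changelog mentions a strip_math_space filter that was removed. The sketch below shows the general idea of such a filter, trimming spaces just inside single-dollar math so that Pandoc can recognize it. It is not the nbconvert implementation; strip_inline_math_space is a hypothetical name, and the regular expression does not know about escaped \$ or about dollar signs used as currency.

```python
import re

# A single "$" (not part of "$$"), optional blanks, the math body on one
# line, optional blanks, and a closing single "$".
_INLINE_MATH = re.compile(r"(?<!\$)\$(?!\$)[ \t]*([^$\n]+?)[ \t]*\$(?!\$)")


def strip_inline_math_space(markdown):
    """Remove blanks directly inside inline $...$ delimiters."""
    return _INLINE_MATH.sub(r"$\1$", markdown)
```

Display math written with `$$` is left unchanged because neither of its dollar signs stands alone.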

If the first \$ is on its own line, the equation is not captured by md2tex; if both \$s are on their own lines, md2html fails (note that the raw LaTeX is dropped), but the notebook renders it correctly.

In [25]:
compare_render(r"""
$5 \cdot x=2$

$
5 \cdot x=2$

$
5 \cdot x=2
$
""")
<IPython.core.display.Javascript at 0x22b66d0>
NBConvert Latex Output
$5 \cdot x=2$

\$ 5 \cdot x=2\$

\$ 5 \cdot x=2 \$
NBViewer Output

\(5 \cdot x=2\)

$ 5 x=2$

$ 5 x=2 $

<IPython.core.display.Javascript at 0x21ac450>

MathJax permits some $\LaTeX$ math constructs without \$s; of course, this raw $\LaTeX$ is stripped when converting to HTML. Moreover, the & characters are escaped by the lxml parsing (#4251).

In [26]:
compare_render(r"""
\begin{align}
a & b\\
d & c
\end{align}

\begin{eqnarray}
a & b \\
c & d
\end{eqnarray}
""")
<IPython.core.display.Javascript at 0x22b6690>
NBConvert Latex Output
\begin{align}
a &amp; b\\
d &amp; c
\end{align}

\begin{eqnarray}
a &amp; b \\
c &amp; d
\end{eqnarray}
NBViewer Output
<IPython.core.display.Javascript at 0x21ac450>
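
The `&amp;` in the LaTeX output is what an HTML serializer produces for a literal `&`. The sketch below reproduces the effect with the standard library's html module, used here as a stand-in for lxml (an assumption), and shows that unescaping restores the alignment characters. Note that html.unescape would also decode entities that were meant literally.

```python
import html


def escape_like_html_serializer(text):
    """Escape &, < and > the way an HTML serializer would (stand-in for lxml)."""
    return html.escape(text, quote=False)


def restore_entities(text):
    """Undo the escaping, turning &amp; back into & for LaTeX alignment."""
    return html.unescape(text)
```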

There is another lxml issue, #4283

In [27]:
compare_render(r"""
1<2 is true, but 3>4 is false.

$1<2$ is true, but $3>4$ is false.

1<2 it is even worse if it is alone in a line.
""")
<IPython.core.display.Javascript at 0x22b6950>
NBConvert Latex Output
14 is false.

$14$ is false.

1
NBViewer Output

1<2 is true, but 3>4 is false.

\(1<2\) is true, but \(3>4\) is false.

1<2 it is even worse if it is alone in a line.

<IPython.core.display.Javascript at 0x21ac450>
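
The first two lines of the LaTeX output are consistent with everything between `<` and `>` being treated as an HTML tag and discarded. The sketch below imitates that with a naive regular expression. It is only a model of the symptom, not lxml's parser, and it does not reproduce the truncated third line, which presumably comes from the parser treating the unclosed `<` as the start of a tag.

```python
import re

_TAG = re.compile(r"<[^>]*>")


def strip_tags(text):
    """Remove anything that looks like an HTML tag (naive model)."""
    return _TAG.sub("", text)
```

Applied to the first input line, `strip_tags` returns the same `14 is false.` seen above.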

Listings and code blocks

In [28]:
compare_render(r"""
some source code

```
a = "test"
print(a)
```
""")
<IPython.core.display.Javascript at 0x22b68d0>
NBConvert Latex Output
some source code

\begin{verbatim}
a = "test"
print(a)
\end{verbatim}
NBViewer Output

some source code

a = "test"
print(a)
<IPython.core.display.Javascript at 0x21ac450>

Language specific syntax highlighting by Pandoc requires additional dependencies to render correctly.

In [29]:
compare_render(r"""
some source code

```python
a = "test"
print(a)
```
""")
<IPython.core.display.Javascript at 0x22b6850>
NBConvert Latex Output
some source code

\begin{Shaded}
\begin{Highlighting}[]
\NormalTok{a = }\StringTok{"test"}
\KeywordTok{print}\NormalTok{(a)}
\end{Highlighting}
\end{Shaded}
NBViewer Output

some source code

a = "test"
print(a)
<IPython.core.display.Javascript at 0x21ac450>