CommonMark + GFM compatible Markdown parser and renderer https://docs.rs/comrak

Comrak

Build Status | Spec Status: 671/671 | Financial Contributors on Open Collective | crates.io version | docs.rs

Rust port of GitHub's cmark-gfm. Currently synced with release 0.29.0.gfm.13.

Installation

Specify it as a requirement in Cargo.toml:

[dependencies]
comrak = "0.21"

Comrak supports Rust stable.

Mac & Linux Binaries

curl https://webinstall.dev/comrak | bash

Windows 10 Binaries

curl.exe -A "MS" https://webinstall.dev/comrak | powershell

Usage

$ comrak --help
A 100% CommonMark-compatible GitHub Flavored Markdown parser and formatter

Usage: comrak [OPTIONS] [FILE]...

Arguments:
  [FILE]...
          CommonMark file(s) to parse; or standard input if none passed

Options:
  -c, --config-file <PATH>
          Path to config file containing command-line arguments, or 'none'
          
          [default: /Users/kivikakk/.config/comrak/config]

      --hardbreaks
          Treat newlines as hard line breaks

      --smart
          Use smart punctuation

      --github-pre-lang
          Use GitHub-style <pre lang> for code blocks

      --full-info-string
          Enable full info strings for code blocks

      --gfm
          Enable GitHub-flavored markdown extensions: strikethrough, tagfilter, table, autolink, and
          tasklist. Also enables --github-pre-lang

      --relaxed-tasklist-character
          Enable relaxing which character is allowed in a tasklists

      --relaxed-autolinks
          Enable relaxing of autolink parsing, allowing links to be recognized when in brackets

      --default-info-string <INFO>
          Default value for fenced code block's info strings if none is given

      --unsafe
          Allow raw HTML and dangerous URLs

      --gemojis
          Translate gemojis into UTF-8 characters

      --escape
          Escape raw HTML instead of clobbering it

  -e, --extension <EXTENSION>
          Specify extension name(s) to use
          
          Multiple extensions can be delimited with ",", e.g. --extension strikethrough,table
          
          [possible values: strikethrough, tagfilter, table, autolink, tasklist, superscript,
          footnotes, description-lists, multiline-block-quotes]

  -t, --to <FORMAT>
          Specify output format
          
          [default: html]
          [possible values: html, xml, commonmark]

  -o, --output <FILE>
          Write output to FILE instead of stdout

      --width <WIDTH>
          Specify wrap width (0 = nowrap)
          
          [default: 0]

      --header-ids <PREFIX>
          Use the Comrak header IDs extension, with the given ID prefix

      --front-matter-delimiter <DELIMITER>
          Ignore front-matter that starts and ends with the given string

      --syntax-highlighting <THEME>
          Syntax highlighting for codefence blocks. Choose a theme or 'none' for disabling
          
          [default: base16-ocean.dark]

      --list-style <LIST_STYLE>
          Specify bullet character for lists (-, +, *) in CommonMark output
          
          [default: dash]
          [possible values: dash, plus, star]

      --sourcepos
          Include source position attribute in HTML and XML output

  -h, --help
          Print help information (use `-h` for a summary)

  -V, --version
          Print version information

By default, Comrak will attempt to read command-line options from a config file specified by
--config-file. This behaviour can be disabled by passing --config-file none. It is not an error if
the file does not exist.

And there's a Rust interface. You can use comrak::markdown_to_html directly:

use comrak::{markdown_to_html, Options};
assert_eq!(markdown_to_html("Hello, **世界**!", &Options::default()),
           "<p>Hello, <strong>世界</strong>!</p>\n");

Or you can parse the input into an AST yourself, manipulate it, and then use your desired formatter:

extern crate comrak;
use comrak::{parse_document, format_html, Arena, Options};
use comrak::nodes::{AstNode, NodeValue};

// The returned nodes are created in the supplied Arena, and are bound by its lifetime.
let arena = Arena::new();

let root = parse_document(
    &arena,
    "This is my input.\n\n1. Also my input.\n2. Certainly my input.\n",
    &Options::default());

fn iter_nodes<'a, F>(node: &'a AstNode<'a>, f: &F)
    where F : Fn(&'a AstNode<'a>) {
    f(node);
    for c in node.children() {
        iter_nodes(c, f);
    }
}

iter_nodes(root, &|node| {
    match &mut node.data.borrow_mut().value {
        // Text nodes are assumed to hold a `String`, as in recent Comrak releases.
        &mut NodeValue::Text(ref mut text) => {
            let orig = std::mem::replace(text, String::new());
            *text = orig.replace("my", "your");
        }
        _ => (),
    }
});

let mut html = vec![];
format_html(root, &Options::default(), &mut html).unwrap();

assert_eq!(
    String::from_utf8(html).unwrap(),
    "<p>This is your input.</p>\n\
     <ol>\n\
     <li>Also your input.</li>\n\
     <li>Certainly your input.</li>\n\
     </ol>\n");

Benchmarking

For running benchmarks, you will need to install hyperfine and optionally cmake.

If you want to just run the benchmark for comrak, with the current state of the repo, you can simply run

make bench-comrak

This will build comrak in release mode and run the benchmark on it. You will see the time measurements as reported by hyperfine in the console.

The Makefile also provides a way to run benchmarks for comrak in its current state (with your changes), the comrak main branch, cmark-gfm, pulldown-cmark and markdown-it.rs. For this you will need to install cmake. After that, make sure that you have set up the git submodules. If you did not initialise the submodules when cloning, you can do so by running

git submodule update --init

After this is done, you can run

make bench-all

which will run benchmarks across all of them, and report the time taken by each as well as the relative time.

Apart from this, CI is also set up to run benchmarks when a pull request is first opened. It will add a comment with the results to the pull request, in a table comparing the 5 versions. After that, you can manually trigger this CI job by commenting /run-bench on the PR; this will update the existing comment with new results. Note that benchmarks won't be run automatically on each push.

Security

As with cmark and cmark-gfm, Comrak will scrub raw HTML and potentially dangerous links. This change was introduced in Comrak 0.4.0 in support of a safe-by-default posture.

To allow these, use the unsafe_ option (or --unsafe with the command line program). If doing so, we recommend the use of a sanitisation library like ammonia configured specific to your needs.

Extensions

Comrak supports the five extensions to CommonMark defined in the GitHub Flavored Markdown Spec:

  • Tables
  • Task list items
  • Strikethrough
  • Autolinks
  • Disallowed Raw HTML

Comrak additionally supports its own extensions, which are yet to be specced out (PRs welcome!):

  • Superscript
  • Header IDs
  • Footnotes
  • Description lists
  • Front matter
  • Shortcodes

By default none are enabled; they are individually enabled with each parse by setting the appropriate values in the ComrakExtensionOptions struct.

Plugins

Codefence syntax highlighter

At the moment syntax highlighting of codefence blocks is the only feature that can be enhanced with plugins.

Create an implementation of the SyntaxHighlighterAdapter trait, and then provide an instance of that adapter to Plugins.render.codefence_syntax_highlighter. To format a Markdown document with plugins, use the markdown_to_html_with_plugins function, which accepts your plugins as a parameter.

See the syntax_highlighter.rs and syntect.rs examples for more details.

Syntect

syntect is a syntax highlighting library for Rust. By default, comrak offers a plugin for it. In order to utilize it, create an instance of plugins::syntect::SyntectAdapter and use it as your Plugins option.

Related projects

Comrak's design goal is to model the upstream cmark-gfm as closely as possible in terms of code structure. The upside of this is that a change in cmark-gfm has a very predictable change in Comrak. Likewise, any bug in cmark-gfm is likely to be reproduced in Comrak. This could be considered a pro or a con, depending on your use case.

The downside, of course, is that the code is not what I'd call idiomatic Rust (so many RefCells), and while contributors and I have made it as fast as possible, it simply won't be as fast as some other CommonMark parsers depending on your use-case. Here are some other projects to consider:

  • Raph Levien's pulldown-cmark. It's very fast, uses a novel parsing algorithm, and doesn't construct an AST (but you can use it to make one if you want). cargo doc uses this, as do many other projects in the ecosystem. It appears semi-maintained as of March 2023.
  • markdown-rs (1.x) looks worth watching.
  • Know of another library? Please open a PR to add it!

As far as I know, Comrak is the only library to implement all of the GitHub Flavored Markdown extensions to the spec, but this tends to only be important if you want to reproduce GitHub's Markdown rendering exactly, e.g. in a GitHub client app.

Contributing

Contributions are highly encouraged; where possible I practice Optimistic Merging as described by Peter Hintjens. Please keep the code of conduct in mind when interacting with this project.

For now the preferred method is pull requests on GitHub, in order to maximise the number of eyes, but a mailing list for patches is in the works.

Thank you to Comrak's many contributors for PRs and issues opened!

Code Contributors

Financial Contributors

Become a financial contributor and help sustain Comrak's development. I'm self-employed; open-source software relies on the collective.

Individuals

Organizations

Support this project with your organization. Your logo will show up here with a link to your website.

[Contribute](https://opencollective.com/comrak/contribute)

Contact

Ashe Connor <ashe kivikakk ee>

Copyright (c) 2017–2024, Asherah Connor. Licensed under the 2-Clause BSD License.

cmark itself is copyright (c) 2014, John MacFarlane.

See COPYING for all the details.