How remark-gfm Renders Tables: The Complete, Life-Saving Technical Guide
1. High-Level Architecture: How Table Parsing Works in remark-gfm
Core Libraries and Flow
- Table support is provided by integrating two key libraries:
  - micromark-extension-gfm: Low-level tokenization of GFM features (including tables)
  - mdast-util-gfm: Converts micromark tokens into MDAST nodes (the markdown AST used by remark)
The Plugin Entry Point
- The main entry point is remarkGfm(options) (see your lib/index.js).
- When remark parses markdown, this plugin injects:
  - gfm(settings) from micromark-extension-gfm into the tokenization phase
  - gfmFromMarkdown() from mdast-util-gfm into the AST conversion phase
  - gfmToMarkdown() for serializing AST back to markdown
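The wiring described above can be sketched without any dependencies. In this sketch the three factory functions are hypothetical stand-ins for gfm, gfmFromMarkdown, and gfmToMarkdown, and the data argument plays the role of the processor's shared data object. The list names (micromarkExtensions, fromMarkdownExtensions, toMarkdownExtensions) are an assumption about how current unified releases hand extensions to remark-parse and remark-stringify, so check them against your installed version.

```javascript
// Hypothetical stand-ins for the real factories from micromark-extension-gfm
// and mdast-util-gfm; each real factory returns an extension object.
const gfm = (settings) => ({kind: 'syntax', settings})
const gfmFromMarkdown = () => ({kind: 'fromMarkdown'})
const gfmToMarkdown = (settings) => ({kind: 'toMarkdown', settings})

// Sketch of the plugin body: append one extension to each of three lists
// stored on the data object, creating a list when it is missing.
function remarkGfmSketch(data, options) {
  const settings = options || {}
  const push = (key, extension) => {
    if (!data[key]) data[key] = []
    data[key].push(extension)
  }
  push('micromarkExtensions', gfm(settings)) // tokenization phase
  push('fromMarkdownExtensions', gfmFromMarkdown()) // tokens -> MDAST
  push('toMarkdownExtensions', gfmToMarkdown(settings)) // MDAST -> markdown
  return data
}

const data = remarkGfmSketch({}, {singleTilde: false})
console.log(Object.keys(data))
```

The point of the sketch is that the plugin itself contains no table logic: it only registers extensions that other packages provide.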
2. The Table Parsing Pipeline: Step-by-Step
Step 1: Markdown Source → Tokenization (micromark)
- micromark is a streaming tokenizer/parser for markdown.
- When the parser encounters a table structure (lines with pipes (|) and header/row delimiters), the gfm extension recognizes the table syntax.
- Key logic:
  - Detects a table when a line contains pipes and is not indented as a code block.
  - Recognizes header rows (with |) and delimiter rows (with ---, :---:, etc. for alignment).
  - Emits tokens for the table, its rows, its cells (tableHeader for header cells, tableData for body cells), and the delimiter row that carries alignment.
- Relevant code: the table tokenizer lives in the micromark-extension-gfm-table package, which micromark-extension-gfm bundles (not in your copy, but open source and well-documented).
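To make the delimiter-row rule concrete, here is a minimal sketch of how alignment can be read from such a row. It uses a regular expression rather than micromark's character-by-character state machine, and parseDelimiterRow is a hypothetical helper name, not part of any library.

```javascript
// Parse a GFM delimiter row into per-column alignment values.
// Returns null when the line is not a delimiter row.
function parseDelimiterRow(line) {
  let text = line.trim()
  // Assumption of this sketch: require at least one pipe, so that a plain
  // "---" line (thematic break or setext underline) is not mistaken for one.
  if (!text.includes('|')) return null
  // Leading and trailing pipes are optional.
  if (text.startsWith('|')) text = text.slice(1)
  if (text.endsWith('|')) text = text.slice(0, -1)

  const align = []
  for (const cell of text.split('|')) {
    // One or more dashes, with an optional colon on either side.
    const match = /^(:?)-+(:?)$/.exec(cell.trim())
    if (!match) return null
    const left = match[1] === ':'
    const right = match[2] === ':'
    align.push(left && right ? 'center' : left ? 'left' : right ? 'right' : null)
  }
  return align
}

console.log(parseDelimiterRow('| --- | :---: | ---: |')) // [null, 'center', 'right']
console.log(parseDelimiterRow('| not | a | delimiter |')) // null
```

A column with no colons yields null, which matches how unaligned columns appear in the align array of the table node.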
Step 2: Token Stream → MDAST (mdast-util-gfm)
- The token stream from micromark is passed to mdast-util-gfm.
- This utility converts tokens into MDAST nodes:
  - table (type: 'table')
  - tableRow (type: 'tableRow')
  - tableCell (type: 'tableCell')
- Each node contains children for rows/cells, and alignment info is added as an align property on the table node.
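The conversion from a flat token stream to a nested tree can be illustrated with a stack. The event format below is a hypothetical simplification: real micromark events carry token objects with positions, and the real handlers live in mdast-util-gfm. Only the enter/exit bookkeeping is shown.

```javascript
// Build a tree from a simplified event list of [kind, value] pairs, where
// kind is 'enter', 'exit', or 'text'. This is not micromark's real format.
function buildTree(events, align) {
  const root = {type: 'root', children: []}
  const stack = [root]
  for (const [kind, value] of events) {
    const parent = stack[stack.length - 1]
    if (kind === 'enter') {
      // Alignment is stored once, on the table node itself.
      const node = value === 'table'
        ? {type: value, align, children: []}
        : {type: value, children: []}
      parent.children.push(node)
      stack.push(node)
    } else if (kind === 'exit') {
      stack.pop()
    } else {
      parent.children.push({type: 'text', value})
    }
  }
  return root.children[0]
}

const events = [
  ['enter', 'table'],
  ['enter', 'tableRow'],
  ['enter', 'tableCell'], ['text', 'Name'], ['exit', 'tableCell'],
  ['enter', 'tableCell'], ['text', 'Qty'], ['exit', 'tableCell'],
  ['exit', 'tableRow'],
  ['exit', 'table']
]
console.log(JSON.stringify(buildTree(events, [null, 'right']), null, 2))
```

Each enter event opens a node and makes it the current parent; each exit event closes it, so nesting in the tree mirrors nesting in the event stream.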
Step 3: MDAST → HTML/Component Rendering (remark/rehype)
- Once in MDAST, the table node can be rendered by any MDAST-aware renderer (e.g., rehype via remark-rehype, Astro, custom renderers).
- The structure of the MDAST table node is:

```js
{
  type: 'table',
  align: [null, 'center', 'right'],
  children: [
    {
      type: 'tableRow',
      children: [
        { type: 'tableCell', children: [...] },
        ...
      ]
    },
    ...
  ]
}
```

- Renderers walk this tree to produce <table>, <tr>, <td>, etc. in the final HTML.
3. Key Functions and Their Roles
In remark-gfm (your lib/index.js)
remarkGfm(options)
- Registers the GFM micromark extension and the MDAST converters.
- This is the only function in the file, but it wires up the entire GFM feature set.
In micromark-extension-gfm
gfm()
- Returns an object with extensions for tables, autolinks, strikethrough, etc.
- The table extension is responsible for detecting the markdown table syntax.
- Table detection logic:
  - Looks for lines matching the GFM table pattern (pipes, header separator row, etc).
  - Emits tokens for each structural part (table start, row, cell, alignment).
In mdast-util-gfm
gfmFromMarkdown()
- Registers handlers for micromark tokens to convert them into MDAST nodes.
- For tables:
  - table token → table node
  - tableRow token → tableRow node
  - tableHeader and tableData tokens → tableCell node
  - Alignment is extracted from the delimiter row and stored as align on the table node.
4. Table Node Format in MDAST
```js
{
  type: 'table',
  align: [null, 'center', 'right'], // alignment for each column
  children: [
    {
      type: 'tableRow',
      children: [
        { type: 'tableCell', children: [...] },
        ...
      ]
    },
    ...
  ]
}
```

5. Dependencies and How They Work Together
- remark-gfm: The plugin you use to enable GFM features in remark.
- micromark-extension-gfm: Handles the low-level parsing/tokenizing of GFM features (including tables).
- mdast-util-gfm: Converts micromark tokens into MDAST nodes for tables, footnotes, etc.
- remark-parse: The core markdown parser for remark.
- remark-stringify: Serializes MDAST back to markdown (including tables).
- unified: The processing engine that wires it all together.
6. How to Rebuild Table Rendering Independently
a. Table Detection (Tokenizer)
- Write a parser that reads lines and matches the GFM table pattern:
  - At least one pipe (|) per line
  - A header row, then a delimiter row (e.g., | --- | ---: | :---: |)
  - Optionally, leading/trailing pipes can be omitted
- Parse out:
  - Number of columns
  - Alignment for each column (from delimiter row)
  - Each row/cell’s content
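As a sketch of the "parse out each row/cell" step, the helpers below split a row into cells and fit it to the header's column count. splitRow and fitToColumns are hypothetical names. The padding and truncation behavior follows the GFM rule that short rows get empty cells and excess cells are ignored. Escaped pipes (\|) are treated as literal pipes inside a cell.

```javascript
// Split one table row into trimmed cell strings.
// A pipe preceded by a backslash is kept as a literal "|" inside the cell.
function splitRow(line) {
  const cells = []
  let current = ''
  for (let i = 0; i < line.length; i++) {
    const char = line[i]
    if (char === '\\' && line[i + 1] === '|') {
      current += '|'
      i++ // skip the escaped pipe
    } else if (char === '|') {
      cells.push(current)
      current = ''
    } else {
      current += char
    }
  }
  cells.push(current)
  // Optional leading and trailing pipes produce blank first/last entries.
  if (cells.length > 1 && cells[0].trim() === '') cells.shift()
  if (cells.length > 1 && cells[cells.length - 1].trim() === '') cells.pop()
  return cells.map((cell) => cell.trim())
}

// Pad short rows with empty cells and drop extra cells, so that every row
// has as many cells as the header row.
function fitToColumns(cells, columnCount) {
  const fitted = cells.slice(0, columnCount)
  while (fitted.length < columnCount) fitted.push('')
  return fitted
}

const header = splitRow('| Name | Notes |')
console.log(header) // ['Name', 'Notes']
console.log(fitToColumns(splitRow('a \\| b | c | extra'), header.length)) // ['a | b', 'c']
console.log(fitToColumns(splitRow('| only one |'), header.length)) // ['only one', '']
```

The number of columns is taken from the header row, which is why every later row is fitted against header.length.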
b. AST Construction
- Build a tree of nodes:
  - table → has align and children (rows)
  - tableRow → has children (cells)
  - tableCell → has children (inline markdown nodes)
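The node shapes listed above can be produced directly from already-split rows. This is a minimal sketch: buildTable is a hypothetical helper, and it assumes every cell holds plain text, so each non-empty cell becomes a single text node. A fuller implementation would run an inline markdown parser over each cell instead.

```javascript
// Build an mdast-shaped table from an align array and rows of cell strings.
// The first row is treated as the header row by convention.
function buildTable(align, rows) {
  return {
    type: 'table',
    align,
    children: rows.map((cells) => ({
      type: 'tableRow',
      children: cells.map((value) => ({
        type: 'tableCell',
        // Assumption: plain text only; empty cells get no children.
        children: value === '' ? [] : [{type: 'text', value}]
      }))
    }))
  }
}

const table = buildTable([null, 'right'], [['Item', 'Qty'], ['Apples', '3']])
console.log(JSON.stringify(table, null, 2))
```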
c. Rendering
- Walk the AST and output HTML:
  - <table> → <tr> → <td>/<th>
  - Apply alignment as style="text-align:..." on <td>/<th>
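The walk described above might look like the following sketch. renderTable and its helpers are hypothetical names. Wrapping rows in thead and tbody is a choice made here rather than something the steps above require, and cells are assumed to contain only text nodes.

```javascript
// Escape characters that are special in HTML text and attribute values.
function escapeHtml(value) {
  return value
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
}

// Flatten a cell to escaped text. Assumption: cells contain only text nodes;
// other inline nodes would need their own rendering rules.
function cellText(cell) {
  return cell.children.map((child) => escapeHtml(child.value || '')).join('')
}

// Render one row, using the given tag ('th' or 'td') for every cell.
function renderRow(row, align, tag) {
  const cells = row.children.map((cell, index) => {
    const style = align[index] ? ` style="text-align:${align[index]}"` : ''
    return `<${tag}${style}>${cellText(cell)}</${tag}>`
  })
  return `<tr>${cells.join('')}</tr>`
}

// The first row is the header row; the remaining rows form the body.
function renderTable(table) {
  const align = table.align || []
  const [head, ...body] = table.children
  let html = `<table><thead>${renderRow(head, align, 'th')}</thead>`
  if (body.length > 0) {
    html += `<tbody>${body.map((row) => renderRow(row, align, 'td')).join('')}</tbody>`
  }
  return html + '</table>'
}

const cell = (value) => ({type: 'tableCell', children: [{type: 'text', value}]})
const table = {
  type: 'table',
  align: [null, 'right'],
  children: [
    {type: 'tableRow', children: [cell('Item'), cell('Qty')]},
    {type: 'tableRow', children: [cell('A & B'), cell('3')]}
  ]
}
console.log(renderTable(table))
```

Columns whose align entry is null get no style attribute, so the browser's default alignment applies.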
7. Example: Minimal Table Parser (Pseudo-code)
```js
function parseTable(markdown) {
  // 1. Split into lines, find header/delimiter/data rows
  // 2. Parse delimiter row for alignment
  // 3. Build AST nodes as above
}
```

8. Further Reading & Official Sources
9. Life-Saving Summary
- remark-gfm does not parse tables itself: it wires up micromark (tokenizer) and mdast-util-gfm (AST converter).
- micromark-extension-gfm is where table detection happens (tokenizes the pipes, header, delimiter, and cells).
- mdast-util-gfm converts those tokens into the MDAST table/tree structure.
- You can rebuild this flow by writing your own tokenizer and AST builder as described above.
If you need a working, minimal example in code, or want to see a full implementation, just ask.