How remark-gfm Renders Tables
How remark-gfm Renders Tables: The Complete, Life-Saving Technical Guide
1. High-Level Architecture: How Table Parsing Works in remark-gfm
Core Libraries and Flow
- Table support is provided by integrating two key libraries:
- micromark-extension-gfm: Low-level tokenization of GFM features (including tables)
- mdast-util-gfm: Converts micromark tokens into MDAST nodes (the markdown AST used by remark)
The Plugin Entry Point
- The main entrypoint is
remarkGfm(options)
(see yourlib/index.js
). - When remark parses markdown, this plugin injects:
gfm(settings)
from micromark-extension-gfm into the tokenization phasegfmFromMarkdown()
from mdast-util-gfm into the AST conversion phasegfmToMarkdown()
for serializing AST back to markdown
2. The Table Parsing Pipeline: Step-by-Step
Step 1: Markdown Source → Tokenization (micromark)
- micromark is a streaming tokenizer/parser for markdown.
- When the parser encounters a table structure (lines with pipes
|
and header/row delimiters), thegfm
extension recognizes the table syntax. - Key logic:
- Detects a table when a line contains pipes and is not indented as a code block.
- Recognizes header rows (with
|
) and delimiter rows (with---
,:---:
, etc. for alignment). - Emits tokens for
table
,tableRow
,tableCell
, and alignment.
- Relevant file:
micromark-extension-gfm/table.js
(not in your copy, but open source and well-documented).
Step 2: Token Stream → MDAST (mdast-util-gfm)
- The token stream from micromark is passed to
mdast-util-gfm
. - This utility converts tokens into MDAST nodes:
table
(type: 'table')tableRow
(type: 'tableRow')tableCell
(type: 'tableCell')
- Each node contains children for rows/cells, and alignment info is added as an
align
property on the table node.
Step 3: MDAST → HTML/Component Rendering (remark/rehype)
- Once in MDAST, the table node can be rendered by any remark-compatible renderer (e.g., rehype, Astro, custom renderers).
- The structure of the MDAST table node is:js
{ type: 'table', align: [null, 'center', 'right'], children: [ { type: 'tableRow', children: [ { type: 'tableCell', children: [...] }, ... ] }, ... ] }
- Renderers walk this tree to produce
<table>
,<tr>
,<td>
, etc. in the final HTML.
3. Key Functions and Their Roles
In remark-gfm
(your lib/index.js
)
remarkGfm(options)
- Registers the GFM micromark extension and the MDAST converters.
- This is the only function in the file, but it wires up the entire GFM feature set.
In micromark-extension-gfm
gfm()
- Returns an object with extensions for tables, autolinks, strikethrough, etc.
- The table extension is responsible for detecting the markdown table syntax.
- Table detection logic:
- Looks for lines matching the GFM table pattern (pipes, header separator row, etc).
- Emits tokens for each structural part (table start, row, cell, alignment).
In mdast-util-gfm
gfmFromMarkdown()
- Registers handlers for micromark tokens to convert them into MDAST nodes.
- For tables:
table
token →table
nodetableRow
token →tableRow
nodetableCell
token →tableCell
node- Alignment is extracted from the delimiter row and stored as
align
.
4. Table Node Format in MDAST
js
{
type: 'table',
align: [null, 'center', 'right'], // alignment for each column
children: [
{
type: 'tableRow',
children: [
{ type: 'tableCell', children: [...] },
...
]
},
...
]
}
5. Dependencies and How They Work Together
- remark-gfm: The plugin you use to enable GFM features in remark.
- micromark-extension-gfm: Handles the low-level parsing/tokenizing of GFM features (including tables).
- mdast-util-gfm: Converts micromark tokens into MDAST nodes for tables, footnotes, etc.
- remark-parse: The core markdown parser for remark.
- remark-stringify: Serializes MDAST back to markdown (including tables).
- unified: The processing engine that wires it all together.
6. How to Rebuild Table Rendering Independently
a. Table Detection (Tokenizer)
- Write a parser that reads lines and matches the GFM table pattern:
- At least one pipe (
|
) per line - A header row, then a delimiter row (e.g.,
| --- | ---: | :---: |
) - Optionally, leading/trailing pipes can be omitted
- Parse out:
- Number of columns
- Alignment for each column (from delimiter row)
- Each row/cell’s content
b. AST Construction
- Build a tree of nodes:
table
→ hasalign
andchildren
(rows)tableRow
→ haschildren
(cells)tableCell
→ haschildren
(inline markdown nodes)
c. Rendering
- Walk the AST and output HTML:
<table>
→<tr>
→<td>
/<th>
- Apply alignment as
style="text-align:..."
on<td>
/<th>
7. Example: Minimal Table Parser (Pseudo-code)
js
function parseTable(markdown) {
// 1. Split into lines, find header/delimiter/data rows
// 2. Parse delimiter row for alignment
// 3. Build AST nodes as above
}
8. Further Reading & Official Sources
9. Life-Saving Summary
- remark-gfm does not parse tables itself: it wires up micromark (tokenizer) and mdast-util-gfm (AST converter).
- micromark-extension-gfm is where table detection happens (tokenizes the pipes, header, delimiter, and cells).
- mdast-util-gfm converts those tokens into the MDAST table/tree structure.
- You can rebuild this flow by writing your own tokenizer and AST builder as described above.
If you need a working, minimal example in code, or want to see a full implementation, just ask.