Handle Citations in Markdown Content
Executive Summary
This prompt describes how to implement a citation handling system for markdown content in an Astro-based website. The system extracts citation references from anywhere in the document, processes them into a structured format, and renders them in a dedicated section at the end of the content. This approach maintains clean content while providing proper attribution for sources.
Implementation Flow
The citation handling system follows a specific flow through several components:
- Entry Point: `site/src/pages/more-about/[vocabulary].astro`
  - Dynamic route for vocabulary pages
  - Loads content from the vocabulary collection
  - Passes content to OneArticle layout
- Content Processing: `site/src/layouts/OneArticle.astro`
  - Processes markdown with a pipeline of remark plugins
  - Uses `remarkCitations` plugin to extract and transform citations
  - Passes transformed MDAST to the component for rendering
- Rendering: `site/src/components/articles/OneArticleOnPage.astro`
  - Wraps content in a styled article container
  - Passes MDAST to AstroMarkdown component
- Component Rendering: `site/src/components/markdown/AstroMarkdown.astro`
  - Handles different node types in the MDAST
  - Routes citation nodes to ArticleCitations component
  - Routes blockquote nodes to ArticleCallout component
- Citation Rendering: `site/src/components/markdown/citations/ArticleCitations.astro`
  - Renders the citations container with proper styling
  - Processes each citation with appropriate formatting
- Callout Handling: `site/src/components/markdown/callouts/ArticleCallout.astro`
  - Filters out citation nodes from callout content
  - Prevents duplicate rendering of citations in callouts
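The routing step in AstroMarkdown can be pictured as a dispatch on node type. The sketch below is an illustration in plain TypeScript, not the component's actual code. The `componentFor` helper, the `RendererName` type, and the `'default'` fallback are hypothetical. The component names come from the flow above, and the `citations` and `blockquote` node type names are assumed from the rest of this document.

```typescript
// Hypothetical helper: maps an MDAST node type to the name of the component
// that would render it. Only the two routes described in the flow are modeled.
type RendererName = 'ArticleCitations' | 'ArticleCallout' | 'default';

interface MdastNodeLike {
  type: string;
}

function componentFor(node: MdastNodeLike): RendererName {
  switch (node.type) {
    case 'citations':
      // Citation section nodes produced by remarkCitations
      return 'ArticleCitations';
    case 'blockquote':
      // Callouts are written as blockquotes in the markdown source
      return 'ArticleCallout';
    default:
      // Every other node type falls through to regular rendering
      return 'default';
  }
}
```

Keeping the dispatch in one place means a new custom node type only needs one extra `case`.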
Key Components
1. remarkCitations Plugin (`site/src/utils/markdown/remarkCitations.ts`)
This remark plugin is the core of the citation handling system:
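```typescript
import { visit } from 'unist-util-visit';
import type { Paragraph, Root } from 'mdast';

// CitationNode and createCitationsSectionNode are assumed to be defined
// elsewhere (not shown in this excerpt).

// Main plugin function
export default function remarkCitations() {
  return (tree: Root) => {
    const allCitations: CitationNode[] = [];
    const nodesToRemove: number[] = [];

    // First pass: find all citations in text content
    visit(tree, 'text', (node, index, parent) => {
      // Extract citations using regex pattern
      // Add to allCitations array
      // Mark nodes for removal (indices must refer to positions in tree.children)
    });

    // Remove marked nodes, highest index first so earlier removals
    // do not shift the positions of later ones
    nodesToRemove.sort((a, b) => b - a).forEach((index) => {
      tree.children.splice(index, 1);
    });

    // Create citations section if citations were found
    if (allCitations.length > 0) {
      const citationsNode = createCitationsSectionNode(allCitations);
      // The custom citations node is not a standard MDAST type, hence the cast
      tree.children.push(citationsNode as unknown as Paragraph);
    }

    return tree;
  };
}
```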
The plugin:
- Searches for citation patterns in text nodes
- Extracts them into a structured format
- Removes them from their original location
- Creates a dedicated citations section at the end of the document
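The four steps above can be sketched without any dependencies. This is a minimal illustration, not the plugin's real implementation: the node interfaces, the `extractCitations` function, and the `CITATION_DEFINITION` pattern are all hypothetical, and the pattern is an assumption based on the `[number] URL` form described in the Citation Format section.

```typescript
// Minimal MDAST-like types so the sketch runs on its own (hypothetical shapes).
interface TextNode { type: 'text'; value: string }
interface ParagraphNode { type: 'paragraph'; children: TextNode[] }
interface CitationEntry { type: 'citation'; number: number; url: string }
interface CitationsSection { type: 'citations'; children: CitationEntry[] }
type BlockNode = ParagraphNode | CitationsSection;
interface RootNode { type: 'root'; children: BlockNode[] }

// Assumed pattern: "[n]", one space, then an http:// or https:// URL.
const CITATION_DEFINITION = /\[(\d+)\] (https?:\/\/\S+)/g;

function extractCitations(tree: RootNode): RootNode {
  const found: CitationEntry[] = [];
  const kept: BlockNode[] = [];

  for (const block of tree.children) {
    if (block.type !== 'paragraph') {
      kept.push(block);
      continue;
    }
    const children: TextNode[] = [];
    for (const child of block.children) {
      // Record each citation definition, then strip it from the text.
      const value = child.value
        .replace(CITATION_DEFINITION, (_match: string, num: string, url: string) => {
          found.push({ type: 'citation', number: Number(num), url });
          return '';
        })
        .trim();
      if (value.length > 0) children.push({ type: 'text', value });
    }
    // Drop paragraphs that contained nothing but citation definitions.
    if (children.length > 0) kept.push({ type: 'paragraph', children });
  }

  // Append a single citations section only when something was found.
  if (found.length > 0) kept.push({ type: 'citations', children: found });
  return { type: 'root', children: kept };
}
```

Building a new `children` array, rather than splicing by index during traversal, sidesteps the index-shifting problem that the reverse sort in the plugin guards against. Inline markers such as `[1]` in running prose are left alone because they are not followed by a URL.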
2. ArticleCitations Component (`site/src/components/markdown/citations/ArticleCitations.astro`)
Renders the citations in a structured format:
```astro
<div class="citations">
  {citations.map((citation) => (
    <div class={citation.type === 'citation-attribution' ? 'citation-attribution' : 'citation'}>
      {citation.children?.map((child) => {
        if (child.type === 'link') {
          return (
            <a href={child.url} target="_blank" rel="noopener noreferrer">
              {/* Guard against a link node with an empty children array */}
              {child.children?.[0]?.value}
            </a>
          );
        }
        return child.value;
      })}
    </div>
  ))}
</div>
```
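The template above implies a particular node shape: each citation has a `type` and a list of children that are either text or links. The sketch below spells that shape out and mirrors the rendering logic as a plain function that returns an HTML string. The interfaces, `escapeHtml`, and `renderCitations` are hypothetical names for illustration, and falling back to the URL when a link has no label is an added assumption.

```typescript
// Hypothetical node shapes inferred from the fields the template reads.
interface CitationText { type: 'text'; value: string }
interface CitationLink { type: 'link'; url: string; children?: CitationText[] }
interface Citation {
  type: 'citation' | 'citation-attribution';
  children?: Array<CitationText | CitationLink>;
}

// Escape characters that are unsafe in HTML text and attribute values.
function escapeHtml(s: string): string {
  return s
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

function renderCitations(citations: Citation[]): string {
  const items = citations.map((citation) => {
    const cssClass = citation.type === 'citation-attribution' ? 'citation-attribution' : 'citation';
    const body = (citation.children ?? [])
      .map((child) => {
        if (child.type === 'link') {
          // Assumption: use the URL as the label when the link has no text child.
          const label = child.children?.[0]?.value ?? child.url;
          return `<a href="${escapeHtml(child.url)}" target="_blank" rel="noopener noreferrer">${escapeHtml(label)}</a>`;
        }
        return escapeHtml(child.value);
      })
      .join('');
    return `<div class="${cssClass}">${body}</div>`;
  });
  return `<div class="citations">${items.join('')}</div>`;
}
```

Astro escapes interpolated values on its own, so the explicit `escapeHtml` is only needed here because the sketch builds the string by hand.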
3. ArticleCallout Component (`site/src/components/markdown/callouts/ArticleCallout.astro`)
Handles citations within callouts by:
- Processing callout content with remarkCitations
- Filtering out citation nodes and headers from the callout content
- Preventing duplicate rendering of citations
```typescript
// Remove citations nodes from content before converting to HTML
const contentWithoutCitations: Root = {
  type: 'root',
  children: citationsRoot.children.filter((node) => {
    // Filter out citations and citation nodes
    if (node.type === 'citations' || node.type === 'citation') {
      return false;
    }
    // Filter out the "Citations:" header node (could be heading or paragraph)
    if (node.type === 'heading' || node.type === 'paragraph') {
      // Check if this node contains "Citations:" text
      const hasOnlyChildWithCitationsText =
        node.children?.length === 1 &&
        node.children[0].type === 'text' &&
        node.children[0].value === 'Citations:';
      if (hasOnlyChildWithCitationsText) {
        return false;
      }
    }
    // Keep all other nodes
    return true;
  }),
};
```
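The same filter can be pulled out into small standalone functions, which makes it easy to check against sample nodes. This is a sketch under loose typing assumptions: `NodeLike`, `TextLike`, `isCitationsHeader`, and `stripCitations` are hypothetical names, not identifiers from the component.

```typescript
// Loosely typed stand-ins for MDAST nodes (hypothetical, for illustration only).
interface TextLike { type: string; value?: string }
interface NodeLike { type: string; children?: TextLike[] }

// True when the node is a heading or paragraph whose only child is the text "Citations:".
function isCitationsHeader(node: NodeLike): boolean {
  if (node.type !== 'heading' && node.type !== 'paragraph') return false;
  const only = node.children?.length === 1 ? node.children[0] : undefined;
  return only?.type === 'text' && only.value === 'Citations:';
}

// Drops citation nodes and the "Citations:" header, keeps everything else in order.
function stripCitations(children: NodeLike[]): NodeLike[] {
  return children.filter(
    (node) => node.type !== 'citations' && node.type !== 'citation' && !isCitationsHeader(node),
  );
}
```

Note that the header check is an exact string match, so a header written as `Citations` (no colon) or with surrounding whitespace would not be filtered.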
Citation Format
Citations should be formatted as follows in markdown content:
```markdown
[1] https://example.com/article1
[2] https://example.com/article2
```
Each citation consists of:
- A number in square brackets: `[1]`
- A space
- A URL starting with http:// or https://
The system will automatically:
- Extract these citations from anywhere in the document
- Create a "Citations:" section at the end of the content
- Render each citation with proper formatting and clickable links
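The format rule can be expressed as a single regular expression. The sketch below is an assumption about how strict the match is (whole line, exactly one space, URL with no whitespace). The plugin's real pattern is not shown in this document, and `parseCitationLine`, `ParsedCitation`, and `CITATION_LINE` are hypothetical names.

```typescript
interface ParsedCitation {
  number: number;
  url: string;
}

// Assumed strict form of the rule: "[n]", one space, then an http(s) URL, nothing else.
const CITATION_LINE = /^\[(\d+)\] (https?:\/\/\S+)$/;

// Returns the parsed citation, or null when the line does not follow the format.
function parseCitationLine(line: string): ParsedCitation | null {
  const match = CITATION_LINE.exec(line.trim());
  if (match === null) return null;
  return { number: Number(match[1]), url: match[2] };
}
```

For example, `parseCitationLine('[1] https://example.com/article1')` yields number `1` with that URL, while a line missing the space after the bracket or using a non-HTTP scheme yields `null`.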
Example Usage
In a markdown file (e.g., `content/vocabulary/Agile.md`):

```markdown
---
date_created: 2025-03-29
date_modified: 2025-04-07
---

# Agile Methodology

Agile is an iterative approach to software development[1].
It emphasizes flexibility and customer collaboration[2].

[1] https://agilemanifesto.org/
[2] https://www.atlassian.com/agile
```
This will be rendered with the citations extracted and placed at the end of the document in a properly formatted citations section.
Troubleshooting
If citations are not rendering correctly:
- Check Citation Format: Ensure citations follow the exact pattern `[number] URL`
- Inspect AST: Look at the debug output in the console to see how citations are being processed
- Check Filtering Logic: If citations appear in callouts, ensure the filtering logic is correctly identifying citation nodes
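When inspecting the AST, a compact outline of node types is often easier to read than a full JSON dump. The helper below is a hypothetical debugging aid, not part of the codebase described here. It assumes only that each node has a `type` and an optional `children` array.

```typescript
interface OutlineNode {
  type: string;
  children?: OutlineNode[];
}

// Returns one line per node, indented two spaces per nesting level.
function outline(node: OutlineNode, depth = 0): string[] {
  const lines = [`${'  '.repeat(depth)}${node.type}`];
  for (const child of node.children ?? []) {
    lines.push(...outline(child, depth + 1));
  }
  return lines;
}
```

Calling `console.log(outline(tree).join('\n'))` after the plugin has run shows at a glance whether a `citations` node was appended to the root, and whether any ended up inside a callout.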
Conclusion
This citation handling system provides a clean way to include references in markdown content. By automatically extracting and formatting citations, it maintains readability while ensuring proper attribution.