Improved Citation Handling in Markdown Content

Summary

Enhanced the citation handling system to filter and display citations in markdown content: "Citations:" headers are now removed from callouts, and the citations themselves are rendered by the citations component instead of appearing as raw text.

Why Care

Citations are a critical part of academic and reference content. This change ensures citations are extracted, formatted, and displayed consistently, without duplicate rendering or stray text such as a leftover "Citations:" header inside a callout. The resulting content is easier to read.

Implementation

Changes Made

  • Modified ArticleCallout.astro to improve filtering logic for citations:
    • Extended node filtering to catch "Citations:" headers whether they parse as heading or paragraph nodes
    • Added detection of "Citations:" text in paragraph nodes
    • Replaced the type-only check with a check on both node type and text content
  • Refined remarkCitations.ts plugin:
    • Improved citation detection pattern
    • Enhanced citation extraction from text nodes
    • Better handling of citation nodes in the AST

Technical Details

Citation Filtering in Callouts

In site/src/components/markdown/callouts/ArticleCallout.astro:
```typescript
// Before: Simple type-based filtering
const contentWithoutCitations: Root = {
  type: 'root',
  children: citationsRoot.children.filter((node) =>
    node.type !== 'citations' && node.type !== 'citation'
  )
};
```

```typescript
// After: Comprehensive filtering with text content analysis
const contentWithoutCitations: Root = {
  type: 'root',
  children: citationsRoot.children.filter((node) => {
    // Filter out citations and citation nodes
    if (node.type === 'citations' || node.type === 'citation') {
      return false;
    }

    // Filter out the "Citations:" header node (could be heading or paragraph)
    if (node.type === 'heading' || node.type === 'paragraph') {
      // Check if this node contains "Citations:" text
      const hasOnlyChildWithCitationsText = node.children?.length === 1 &&
        node.children[0].type === 'text' &&
        node.children[0].value === 'Citations:';

      if (hasOnlyChildWithCitationsText) {
        return false;
      }
    }

    // Keep all other nodes
    return true;
  })
};
```
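
To make the effect of the new filter concrete, here is a minimal, self-contained sketch that runs the same conditions against a small hand-built tree. The AstNode interface is a simplified stand-in for the mdast types, and isCitationsHeader and stripCitations are hypothetical helper names used only for illustration; they are not functions in the codebase.

```typescript
// Simplified stand-in for mdast nodes (assumption: the real component uses
// the mdast types plus the custom 'citations' / 'citation' node types).
interface AstNode {
  type: string;
  value?: string;
  children?: AstNode[];
}

// Hypothetical helper: true when a heading or paragraph consists of exactly
// one text child whose value is "Citations:".
function isCitationsHeader(node: AstNode): boolean {
  if (node.type !== 'heading' && node.type !== 'paragraph') return false;
  const children = node.children ?? [];
  if (children.length !== 1) return false;
  const only = children[0];
  return only !== undefined && only.type === 'text' && only.value === 'Citations:';
}

// Hypothetical helper: applies the same conditions as the "After" filter.
function stripCitations(nodes: AstNode[]): AstNode[] {
  return nodes.filter(
    (node) =>
      node.type !== 'citations' &&
      node.type !== 'citation' &&
      !isCitationsHeader(node),
  );
}

const sample: AstNode[] = [
  { type: 'paragraph', children: [{ type: 'text', value: 'Body text.' }] },
  { type: 'paragraph', children: [{ type: 'text', value: 'Citations:' }] },
  { type: 'citations', children: [] },
];

const kept = stripCitations(sample);
console.log(kept.length); // 1 (only the body paragraph remains)
```

Because the comparison is an exact string match on a single text child, a header with trailing whitespace ("Citations: ") or one wrapped in inline formatting would not be removed by this check.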

Citation Processing Flow

  1. remarkCitations.ts extracts citations from markdown text
  2. Citations are transformed into structured nodes
  3. ArticleCitations.astro renders the citations with proper formatting
  4. ArticleCallout.astro filters out citation nodes from callouts
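
Steps 1 and 2 of the flow above can be sketched as follows. This is not the code in remarkCitations.ts, whose detection pattern is not reproduced here; the regular expression, the CitationNode shape, and the extractCitations name are all assumptions chosen to match the "[number] URL" format mentioned under Testing Notes.

```typescript
// Hypothetical node shape; the real plugin's citation nodes may carry
// different or additional fields.
interface CitationNode {
  type: 'citation';
  number: number;
  url: string;
}

// Assumed pattern for a "[number] URL" line, e.g. "[1] https://example.com".
const CITATION_LINE = /^\[(\d+)\]\s+(\S+)$/;

// Splits raw text into structured citation nodes and the remaining lines.
function extractCitations(text: string): { citations: CitationNode[]; rest: string[] } {
  const citations: CitationNode[] = [];
  const rest: string[] = [];
  for (const line of text.split('\n')) {
    const match = CITATION_LINE.exec(line.trim());
    const num = match?.[1];
    const url = match?.[2];
    if (num !== undefined && url !== undefined) {
      citations.push({ type: 'citation', number: Number(num), url });
    } else {
      rest.push(line);
    }
  }
  return { citations, rest };
}

const parsed = extractCitations(
  [
    'Some claim [1].',
    'Citations:',
    '[1] https://example.com/a',
    '[2] https://example.com/b',
  ].join('\n'),
);

console.log(parsed.citations.map((c) => c.number)); // [ 1, 2 ]
console.log(parsed.rest); // [ 'Some claim [1].', 'Citations:' ]
```

In this sketch the "Citations:" line is left behind in the remaining text after extraction, which illustrates one way a stray header could end up in a callout and need the filtering in step 4.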

Integration Points

  • The citation handling system integrates with:
    • Markdown processing pipeline in OneArticle.astro
    • AST transformation in the remark plugin system
    • Component rendering in AstroMarkdown.astro
    • Callout handling in ArticleCallout.astro

Documentation

  • Created comprehensive documentation in content/lost-in-public/prompts/render-logic/Handle-Citations-Logic-and-Render-Citations-Component.md
  • Documentation includes:
    • Implementation flow
    • Key components
    • Citation format
    • Example usage
    • Troubleshooting tips

Testing Notes

The changes have been tested with:
  • The standard citation format ([number] URL)
  • Citations within callout blocks
  • Multiple citations in a single document
  • Citations with various numbering patterns

Future Considerations

  • Consider adding support for more citation formats (e.g., academic citations)
  • Implement citation grouping by topic or section
  • Add citation cross-referencing within the document