Refactored FileSystemObserver to Remove YAML Libraries
Summary
Removed all YAML libraries from the FileSystemObserver system and replaced them with regex-based frontmatter parsing to fix date formatting inconsistencies and prevent infinite update loops.
Why Care
The observer system was causing inconsistent date formatting across content files, adding timestamps and quotes that violated our style guidelines. This created an infinite loop of changes and made content files difficult to read and compare. The fix ensures consistent date formatting across all content files and prevents future formatting issues.
Implementation
Changes Made
- Removed js-yaml library imports from all observer files:
  - /tidyverse/observers/fileSystemObserver.ts
  - /tidyverse/observers/services/templateRegistry.ts
  - /tidyverse/observers/services/openGraphService.ts
  - /tidyverse/observers/services/citationService.ts
- Replaced YAML parsing with regex-based frontmatter extraction in:
  - /tidyverse/observers/fileSystemObserver.ts: updated extractFrontmatter function
  - /tidyverse/observers/services/openGraphService.ts: added extractFrontmatterForOpenGraph function
  - /tidyverse/observers/services/citationService.ts: updated extractFrontmatterAndBody function
- Enhanced formatFrontmatter function in /tidyverse/observers/fileSystemObserver.ts:
  - Added export to make it available to other modules
  - Added neverQuoteFields list to prevent quotes around title, lede, etc.
  - Improved date field handling using the Single Source of Truth formatDate utility
- Created and ran one-off script /tidyverse/observers/scripts/fix-date-timestamps.ts to fix existing files:
  - Processed 329 files in vocabulary directory
  - Processed 50 files in prompts directory
  - Removed timestamps and quotes from all date fields
- Updated /tidyverse/observers/index.ts to use the correct content root path
Technical Details
- Regex-based frontmatter parsing implementation:
typescript
// From fileSystemObserver.ts
async function extractFrontmatter(
  content: string,
  filePath: string,
  reportingService: ReportingService
): Promise<{
  frontmatter: Record<string, any> | null,
  hasConversions: boolean,
  convertedFrontmatter?: Record<string, any>
}> {
  // Check if content has frontmatter
  if (!content.startsWith('---')) {
    return { frontmatter: null, hasConversions: false };
  }
  // Find the end of frontmatter
  const endIndex = content.indexOf('---', 3);
  if (endIndex === -1) {
    return { frontmatter: null, hasConversions: false };
  }
  // Extract frontmatter content
  const frontmatterContent = content.substring(3, endIndex).trim();
  // Parse frontmatter using regex, not YAML library
  const frontmatter: Record<string, any> = {};
  // ... parsing logic ...
}
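The parsing logic elided above is not shown in this changelog. As a rough illustration of what regex-based, line-by-line extraction can look like, here is a minimal sketch. The names parseFrontmatterLines and stripQuotes are hypothetical, and the sketch assumes frontmatter limited to flat `key: value` pairs and simple `- item` lists; the real observer may handle more cases.

```typescript
// Hypothetical sketch, not the code in fileSystemObserver.ts.
type FrontmatterValue = string | string[] | null;

// Remove one pair of matching surrounding quotes, if present.
function stripQuotes(value: string): string {
  const quoted = value.match(/^(['"])(.*)\1$/);
  return quoted?.[2] ?? value;
}

// Parse the text found between the two '---' delimiters.
function parseFrontmatterLines(block: string): Record<string, FrontmatterValue> {
  const result: Record<string, FrontmatterValue> = {};
  let currentList: string[] | null = null;

  for (const line of block.split(/\r?\n/)) {
    // Skip blank lines and comment lines
    if (line.trim() === '' || line.trim().startsWith('#')) continue;

    // A `- item` line belongs to the most recent key with an empty value
    const listItem = line.match(/^\s*-\s+(.*)$/);
    if (listItem && currentList) {
      currentList.push(stripQuotes((listItem[1] ?? '').trim()));
      continue;
    }

    // `key: value`, where the key is a simple identifier
    const pair = line.match(/^([A-Za-z_][\w-]*):\s*(.*)$/);
    if (!pair) continue; // Unrecognized lines are ignored in this sketch

    const key = pair[1] ?? '';
    const value = (pair[2] ?? '').trim();
    if (value === '') {
      // Empty value: assume a list follows on the next lines
      currentList = [];
      result[key] = currentList;
    } else {
      result[key] = value === 'null' ? null : stripQuotes(value);
      currentList = null;
    }
  }
  return result;
}
```

Because values stay plain strings, a date such as 2025-04-11 is never converted into a Date object, so nothing can append a time component when the file is written back.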
- Enhanced date formatting to prevent quotes and timestamps:
typescript
// From fileSystemObserver.ts (excerpt)
export function formatFrontmatter(frontmatter: Record<string, any>): string {
  // Fields that should never have quotes, even if they contain special characters
  const neverQuoteFields = ['title', 'lede', 'category', 'status', 'augmented_with'];
  // ... key, value, and yamlContent below come from the enclosing loop
  //     over the frontmatter entries, which this excerpt omits ...
  // Handle date fields specially to avoid quotes and timestamps
  if (key.startsWith('date_') && value) {
    // Use the formatDate utility function from commonUtils
    const formattedDate = formatDate(value);
    yamlContent += `${key}: ${formattedDate}\n`;
  }
  // Never add quotes to certain fields, even if they contain special characters
  else if (neverQuoteFields.includes(key)) {
    yamlContent += `${key}: ${value}\n`;
  }
  // ... other formatting logic ...
}
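Since the excerpt above omits the surrounding loop, the following self-contained sketch shows one way the pieces could fit together. It is an illustration under assumptions, not the actual implementation: formatFrontmatterSketch and toDateOnly are hypothetical names, toDateOnly is a simplified stand-in for formatDate that only handles strings, and the rule for quoting other fields (quote strings containing ':' or '#') is assumed.

```typescript
// Hypothetical sketch, not the code in fileSystemObserver.ts.

// Simplified stand-in for formatDate: keeps only a leading YYYY-MM-DD.
function toDateOnly(value: unknown): string | null {
  if (typeof value !== 'string') return null;
  return value.match(/^\d{4}-\d{2}-\d{2}/)?.[0] ?? null;
}

const NEVER_QUOTE_FIELDS = ['title', 'lede', 'category', 'status', 'augmented_with'];

function formatFrontmatterSketch(frontmatter: Record<string, unknown>): string {
  let yamlContent = '---\n';
  for (const [key, value] of Object.entries(frontmatter)) {
    if (key.startsWith('date_') && value) {
      // Date fields: no quotes, no time component
      yamlContent += `${key}: ${toDateOnly(value)}\n`;
    } else if (NEVER_QUOTE_FIELDS.includes(key)) {
      // Written verbatim, even when the value contains ':' or similar
      yamlContent += `${key}: ${value}\n`;
    } else if (Array.isArray(value)) {
      yamlContent += `${key}:\n`;
      for (const item of value) yamlContent += `  - ${item}\n`;
    } else if (typeof value === 'string' && /[:#]/.test(value)) {
      // Assumed rule: quote remaining strings with YAML-significant characters
      yamlContent += `${key}: "${value.replace(/"/g, '\\"')}"\n`;
    } else {
      yamlContent += `${key}: ${value}\n`;
    }
  }
  return yamlContent + '---\n';
}
```

For example, an entry with title 'Why: A Question' and date_created '2025-04-11T14:30:00.000Z' is written as `title: Why: A Question` and `date_created: 2025-04-11`. A strict YAML parser may reject an unquoted title containing a colon, which is consistent with reading these files through regex-based parsing instead.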
- Single Source of Truth date formatting:
typescript
// From commonUtils.ts
export function formatDate(dateValue: any): string | null {
  if (!dateValue || dateValue === 'null') {
    return null;
  }
  try {
    // If it's already in YYYY-MM-DD format, return it
    if (typeof dateValue === 'string' && /^\d{4}-\d{2}-\d{2}$/.test(dateValue)) {
      return dateValue;
    }
    // Handle ISO string format with time component
    if (typeof dateValue === 'string' && dateValue.includes('T')) {
      // Just extract the date part
      return dateValue.split('T')[0];
    }
    // Format as YYYY-MM-DD
    const date = new Date(dateValue);
    // new Date() does not throw on unparseable input; it yields an Invalid Date
    if (Number.isNaN(date.getTime())) {
      return null;
    }
    const year = date.getFullYear();
    const month = String(date.getMonth() + 1).padStart(2, '0');
    const day = String(date.getDate()).padStart(2, '0');
    return `${year}-${month}-${day}`;
  } catch (error) {
    console.error(`Error formatting date:`, error);
    return null;
  }
}
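To make the branches of the date utility concrete, the sketch below repeats the same logic in condensed form and runs it on a few sample values. formatDateSketch is a hypothetical name used so the example is self-contained; the explicit Invalid Date check is an assumption added here, because `new Date()` returns an Invalid Date rather than throwing on unparseable input.

```typescript
// Hypothetical condensed copy of the date logic, for illustration only.
function formatDateSketch(dateValue: unknown): string | null {
  if (!dateValue || dateValue === 'null') return null;
  if (typeof dateValue === 'string') {
    // Already YYYY-MM-DD: return unchanged
    if (/^\d{4}-\d{2}-\d{2}$/.test(dateValue)) return dateValue;
    // ISO string with a time component: keep only the date part
    if (dateValue.includes('T')) return dateValue.slice(0, dateValue.indexOf('T'));
  }
  const date = new Date(dateValue as string | number | Date);
  // Assumption: unparseable input maps to null instead of "NaN-NaN-NaN"
  if (Number.isNaN(date.getTime())) return null;
  const year = date.getFullYear();
  const month = String(date.getMonth() + 1).padStart(2, '0');
  const day = String(date.getDate()).padStart(2, '0');
  return `${year}-${month}-${day}`;
}

// Sample inputs paired with the value each branch should produce
const dateSamples: Array<[unknown, string | null]> = [
  ['2025-04-11', '2025-04-11'],               // already normalized
  ['2025-04-11T14:30:00.000Z', '2025-04-11'], // timestamp stripped
  ['null', null],                             // literal string "null"
  [null, null],                               // missing value
  [new Date(2025, 3, 11), '2025-04-11'],      // Date built in local time
  ['not a date', null],                       // unparseable input
];

for (const [input, expected] of dateSamples) {
  const actual = formatDateSketch(input);
  console.log(String(input), '->', actual, actual === expected ? 'ok' : 'MISMATCH');
}
```

One design note: the final branch reads the year, month, and day in local time, so a Date object created at UTC midnight can land on the previous day in time zones behind UTC. The string branches avoid this because they never construct a Date.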
Integration Points
- The updated observer system works with all content directories:
  - /content/vocabulary/
  - /content/tooling/
  - /content/lost-in-public/prompts/
- The fix-date-timestamps.ts script can be run on any content directory to standardize date formatting:
bash
cd tidyverse/observers/scripts && npx ts-node fix-date-timestamps.ts <directory-path>
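The source of fix-date-timestamps.ts is not reproduced in this changelog. As a hedged sketch of how such a one-off cleanup could work, the code below rewrites only `date_*` lines inside the leading frontmatter block and walks a directory of Markdown files. The names stripDateTimestamps and fixDirectory are hypothetical, and the assumption that only .md files are processed is mine.

```typescript
import * as fs from 'fs';
import * as path from 'path';

// Hypothetical sketch, not the actual fix-date-timestamps.ts.
// Strips quotes and time components from date_* fields in the frontmatter.
function stripDateTimestamps(content: string): string {
  if (!content.startsWith('---')) return content;
  const endIndex = content.indexOf('\n---', 3);
  if (endIndex === -1) return content;

  const frontmatter = content.slice(0, endIndex);
  const rest = content.slice(endIndex); // closing delimiter and body, untouched

  // date_x: '2025-04-11T14:30:00.000Z'  becomes  date_x: 2025-04-11
  const cleaned = frontmatter.replace(
    /^(date_\w+):[ \t]*['"]?(\d{4}-\d{2}-\d{2})(?:[T ][^'"\r\n]*)?['"]?[ \t]*(\r?)$/gm,
    '$1: $2$3'
  );
  return cleaned + rest;
}

// Recursively process every .md file and return how many files changed.
function fixDirectory(directory: string): number {
  let changed = 0;
  for (const entry of fs.readdirSync(directory, { withFileTypes: true })) {
    const fullPath = path.join(directory, entry.name);
    if (entry.isDirectory()) {
      changed += fixDirectory(fullPath);
    } else if (entry.name.endsWith('.md')) {
      const original = fs.readFileSync(fullPath, 'utf8');
      const updated = stripDateTimestamps(original);
      // Write only when something changed, so clean files are left alone
      if (updated !== original) {
        fs.writeFileSync(fullPath, updated, 'utf8');
        changed += 1;
      }
    }
  }
  return changed;
}
```

A script entry point could then call `fixDirectory(process.argv[2])` and log the count. Because the transformation is idempotent, running it a second time on the same directory should report zero changed files.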
Documentation
- Created issue resolution document: /content/lost-in-public/issue-resolution/2025-04-11_Frontmatter--Date-formatting-fix.md
  - Detailed explanation of the issue, solution, and lessons learned
  - Instructions for running the fix script on other directories
- Key lessons documented:
  - Avoid YAML libraries for frontmatter parsing
  - Use regex-based parsing for better control
  - Implement Single Source of Truth for date formatting
  - Handle special fields (title, lede) explicitly