Enhanced OpenGraph Service with Asynchronous Screenshot Fetching
Summary
Enhanced the OpenGraph service in the FileSystemObserver to asynchronously fetch screenshot URLs for all Markdown files with URLs, providing a fallback image source when no OpenGraph image is available.
Why Care
This improvement ensures that all content with URLs will eventually have screenshot previews, enhancing visual representation across the site without blocking the main processing flow. The implementation follows a non-blocking approach that allows the observer to continue processing files while screenshots are fetched in the background.
Implementation
Changes Made
- Enhanced the OpenGraph service to check for the `og_screenshot_url` property and fetch it asynchronously if missing
- Updated the following files:
  - `/tidyverse/observers/services/openGraphService.ts`: Added asynchronous screenshot URL fetching
  - `/tidyverse/observers/package.json`: Removed gray-matter dependency, using js-yaml instead
  - `/tidyverse/observers/fileSystemObserver.ts`: Integrated screenshot URL fetching
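The "check and fetch if missing" step can be pictured as a small predicate. This is a minimal sketch rather than the service's actual code: the `url` frontmatter key and the `needsScreenshotFetch` name are assumptions made for illustration, and only `og_screenshot_url` is taken from this changelog.

```typescript
// Hypothetical shape of parsed frontmatter; only og_screenshot_url is named in this changelog.
type Frontmatter = Record<string, unknown>;

// Returns true when the file has a URL but no stored screenshot URL yet.
// Assumption: the page URL lives under a `url` key.
function needsScreenshotFetch(frontmatter: Frontmatter): boolean {
  const url = frontmatter["url"];
  const screenshot = frontmatter["og_screenshot_url"];
  const hasUrl = typeof url === "string" && url.trim().length > 0;
  const hasScreenshot =
    typeof screenshot === "string" && screenshot.trim().length > 0;
  return hasUrl && !hasScreenshot;
}
```

Treating an empty string as "missing" means a blank `og_screenshot_url:` line in a file would still trigger a fetch under this sketch.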
Technical Details
- Implemented a non-blocking background process for fetching screenshot URLs:
```typescript
// /tidyverse/observers/services/openGraphService.ts
function fetchScreenshotUrlInBackground(url: string, filePath: string): void {
  // Skip if we're already fetching this URL
  if (screenshotFetchInProgress.has(url)) {
    console.log(`Screenshot fetch already in progress for ${url}, skipping duplicate request`);
    return;
  }

  // Add to tracking set
  screenshotFetchInProgress.add(url);
  console.log(`Starting background screenshot fetch for ${url} (${filePath})`);

  // Don't await this promise - let it run in the background
  (async () => {
    try {
      const screenshotUrl = await fetchScreenshotUrl(url, filePath);
      if (screenshotUrl) {
        console.log(`✅ Received screenshot URL for ${url} in background process: ${screenshotUrl}`);

        // Read the file content
        const content = await fs.readFile(filePath, 'utf8');

        // Check if content has frontmatter
        if (!content.startsWith('---')) {
          console.log(`No frontmatter found in ${filePath}, cannot update screenshot URL`);
          return;
        }

        // Find the end of frontmatter
        const endIndex = content.indexOf('---', 3);
        if (endIndex === -1) {
          console.log(`Invalid frontmatter format in ${filePath}, cannot update screenshot URL`);
          return;
        }

        // Extract frontmatter content
        const frontmatterContent = content.substring(3, endIndex).trim();

        try {
          // Parse YAML frontmatter
          const frontmatter = yaml.load(frontmatterContent) as Record<string, any>;

          // Update frontmatter with screenshot URL
          frontmatter.og_screenshot_url = screenshotUrl;

          // Format the updated frontmatter
          const yamlContent = yaml.dump(frontmatter);

          // Insert updated frontmatter back into the file
          const newContent = `---\n${yamlContent}---\n\n${content.substring(endIndex + 3).trimStart()}`;
          await fs.writeFile(filePath, newContent, 'utf8');
          console.log(`Updated ${filePath} with screenshot URL in background process`);
        } catch (error) {
          console.error(`Error parsing frontmatter in ${filePath}:`, error);
        }
      } else {
        console.log(`⚠️ No screenshot URL found for ${url} in background process`);
      }
    } catch (error) {
      console.error(`Error in background screenshot fetch for ${url}:`, error);
    } finally {
      // Remove from tracking set when done
      screenshotFetchInProgress.delete(url);
    }
  })();
}
```
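One detail worth noting in the function above: `content.indexOf('---', 3)` finds the first `---` anywhere after the opening delimiter, so a `---` inside a frontmatter value (a title, for example) would be taken as the end of the block. A line-anchored split is one possible way to avoid that. The sketch below is an illustration only; `splitFrontmatter` is a hypothetical helper that does not appear in the codebase, and it leaves YAML parsing to js-yaml as before.

```typescript
interface FrontmatterSplit {
  yamlText: string; // raw YAML between the two delimiter lines
  body: string; // everything after the closing delimiter line
}

// Hypothetical helper: splits a Markdown document into frontmatter and body.
// Returns null when the document has no well-formed frontmatter block.
function splitFrontmatter(content: string): FrontmatterSplit | null {
  const lines = content.split(/\r?\n/);
  if (lines[0] !== "---") return null;

  // The closing delimiter must be a line consisting solely of '---'.
  const end = lines.indexOf("---", 1);
  if (end === -1) return null;

  return {
    yamlText: lines.slice(1, end).join("\n"),
    body: lines.slice(end + 1).join("\n"),
  };
}
```

With this shape, the `yamlText` field would be passed to `yaml.load` and the `body` field appended after the re-serialized frontmatter.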
- Used the correct OpenGraph.io API endpoint format for screenshots:
```typescript
// /tidyverse/observers/services/openGraphService.ts
const screenshotApiUrl = `https://opengraph.io/api/1.1/screenshot/${encodeURIComponent(url)}?dimensions=lg&quality=80&accept_lang=en&use_proxy=true&app_id=${apiKey}`;
```
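The one-line template above could also be factored into a small builder so that every query value is percent-encoded and the defaults sit in one place. This is a hedged sketch: `buildScreenshotApiUrl`, `ScreenshotOptions`, and `defaultScreenshotOptions` are hypothetical names, while the endpoint and parameter values are copied from the line above.

```typescript
// Query parameters copied from the template string above.
interface ScreenshotOptions {
  dimensions: string;
  quality: number;
  acceptLang: string;
  useProxy: boolean;
}

const defaultScreenshotOptions: ScreenshotOptions = {
  dimensions: "lg",
  quality: 80,
  acceptLang: "en",
  useProxy: true,
};

// Hypothetical helper: builds the screenshot request URL for a target page.
function buildScreenshotApiUrl(
  targetUrl: string,
  apiKey: string,
  options: ScreenshotOptions = defaultScreenshotOptions
): string {
  const params: Array<[string, string]> = [
    ["dimensions", options.dimensions],
    ["quality", String(options.quality)],
    ["accept_lang", options.acceptLang],
    ["use_proxy", String(options.useProxy)],
    ["app_id", apiKey],
  ];
  const query = params
    .map(([key, value]) => `${key}=${encodeURIComponent(value)}`)
    .join("&");
  // The target URL is embedded as a path segment, so it must be percent-encoded.
  return `https://opengraph.io/api/1.1/screenshot/${encodeURIComponent(targetUrl)}?${query}`;
}
```

Calling `buildScreenshotApiUrl(url, apiKey)` with the defaults yields the same parameter set as the original template.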
- Removed dependency on gray-matter, using js-yaml directly for frontmatter parsing:
```json
// /tidyverse/observers/package.json
"dependencies": {
  "chokidar": "^3.5.3",
  "dotenv": "^16.0.3",
  "fs-extra": "^11.1.1",
  "js-yaml": "^4.1.0",
  "minimatch": "^9.0.3",
  "node-fetch": "^2.6.9",
  "uuid": "^9.0.0"
}
```
Integration Points
- The screenshot URL fetching integrates with the existing FileSystemObserver system
- The implementation uses the same OpenGraph.io API key as the existing OpenGraph service
- The screenshot URLs are stored in the frontmatter of Markdown files as `og_screenshot_url`
- The ReportingService tracks successful and failed screenshot URL fetches
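Since the stored screenshot is meant as a fallback for a missing OpenGraph image, a consumer of the frontmatter might choose a preview image along these lines. This is a sketch under stated assumptions: the `og_image` key and the `pickPreviewImage` helper are hypothetical, and only `og_screenshot_url` is confirmed by this changelog.

```typescript
// Assumption: the OpenGraph image is stored under `og_image`; only
// `og_screenshot_url` is named in this changelog.
interface PreviewFields {
  og_image?: string;
  og_screenshot_url?: string;
}

// Hypothetical helper: prefers the OpenGraph image, then falls back to the screenshot.
function pickPreviewImage(fields: PreviewFields): string | undefined {
  const candidates = [fields.og_image, fields.og_screenshot_url];
  for (const candidate of candidates) {
    if (candidate !== undefined && candidate.trim() !== "") return candidate;
  }
  return undefined;
}
```

Returning `undefined` when neither field is set leaves it to the caller to decide on a placeholder.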
Documentation
- The implementation follows the project's code style with comprehensive commenting
- The OpenGraph service now checks for `og_screenshot_url` in frontmatter and fetches it if missing
- The screenshot fetching happens asynchronously to avoid blocking the main process
- The implementation uses a tracking set to prevent duplicate requests for the same URL
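The tracking-set idea can be isolated from the screenshot logic as a generic "run once per key" wrapper. The sketch below is illustrative only: `runOnceInBackground` and `inFlight` are hypothetical names, not identifiers from the service.

```typescript
// Keys (for example URLs) whose background task has started but not yet settled.
const inFlight = new Set<string>();

// Hypothetical helper: starts `task` for `key` unless one is already running.
// Returns true if the task was scheduled, false if it was skipped as a duplicate.
function runOnceInBackground(key: string, task: () => Promise<void>): boolean {
  if (inFlight.has(key)) return false;
  inFlight.add(key);

  // Deliberately not awaited: the caller continues immediately.
  void Promise.resolve()
    .then(task)
    .catch((error: unknown) => {
      console.error(`Background task failed for ${key}:`, error);
    })
    .then(() => {
      // Runs after success or failure, so the same key can be retried later.
      inFlight.delete(key);
    });

  return true;
}
```

Because the key is removed once the task settles, a later file change can trigger a fresh fetch for the same URL, while overlapping events for that URL are collapsed into one request.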