Maintain an Image Generator Obsidian Plugin

Maintain an Image Generator Obsidian Plugin
1. Executive Summary
This specification outlines the development of an Obsidian plugin that integrates with the Recraft API to generate and manage images directly within the Obsidian note-taking environment. The plugin will allow users to generate images based on text prompts and automatically update their notes' frontmatter with the generated image URLs.
2. Background & Motivation
Problem Statement
Users need an efficient way to generate and manage images within their Obsidian vault without leaving the application. Current workflows require switching between multiple tools and manually updating note metadata.
Background
The Lossless Group has implemented the image generation functionality twice:
- In the AI-Labs project
zsh
ai-labs/apis/recraft/generate-banner-and-portrait-images-recraft.py`
ai-labs/apis/imagekit/convertImageToImagkitUrl.cjs`
- In the Tidyverse Observer System
zsh
tidyverse/observers/services/imageKitService.ts
Why This Is Important Now
- Streamlines content creation workflow
- Reduces context switching for users
- Maintains consistency in image generation and management
- Integrates with existing Obsidian ecosystem
Current Limitations
- Manual process for generating and adding images
- No native Obsidian solution for AI image generation
- Inconsistent metadata management across notes
a) Reference Projects and Starter Code
- Recraft API Documentation
- Obsidian Plugin Developer Documentation
- Example implementations in
ai-labs/apis/recraft/generate-banner-and-portrait-images-recraft.py` and `ai-labs/apis/imagekit/convertImageToImagkitUrl.cjs
b) Nuances for Consideration
- We will start with our own starter code, which has its own project architecture. We will clone the code from: The Lossless Group's Obsidian Plugin Starter Code
- All functionality there will need to be replaced with custom to this project. However, the architecture, patterns and styles should be preserved.
- Obsidian plugins run on ES Build, which does not allow for typical console related logging. We will need to develop a logging and reporting mechanism in the future, but for now let's get this running.
- Secure API key management through Obsidian's settings. Obsidian handles this elegantly. No use of .env files unless for development only.
- Support for Recraft API's potential image types (e.g.,
vector_illustration
)- Support for a "Custom Style" file to include in the Recraft API Call
- Correct handling of image generation requests and responses according to Recraft API's API Docs.
- Error handling through Obsidian notifications.
c) Constraints and Applicable Lessons Learned
- Do not use TSX or any other library other than pure typescript.
- Clearly declare types for all variables and functions, and do not use generic types.
- Maintain types in the types directory.
- Define interfaces in the working file.
- Do not "lump" code into one or two files. We have a project architecture, use it.
- API rate limits are only necessary in batch processes.
- Can only show errors through Obsidian notifications API.
- logging to be developed and streamlined later.
- Handle frontmatter rigorously in extract and write processes using only string operations.
- Must use our battle-tested
yamlFrontmatter.ts
utility. - MAY NOT USE YAML LIBRARIES because they may corrupt the current YAML syntax developed by users for Obsidian.
- EXTREMELY IMPORTANT on any file or batch operations as the user is likely to not notice corruptions until well after the fact.
3. Goals & Non-Goals
Goals
- Generate images using Recraft API from within Obsidian
- Securely manage API keys through Obsidian settings
- Automatically update frontmatter with generated image URLs according to user specifications in the modal. Defaults may be in the settings.
- Support for custom styles and parameters
- User may evaluate the response images and resend for touch up or new generation.
Non-Goals
- Image editing capabilities
- Support for image generation APIs other than Recraft (initially)
- Offline image generation (initially)
4. Technical Design
High-Level Architecture
graph TD
A[Obsidian Plugin] -->|API Request| B[Recraft API]
A -->|Store| C[Obsidian Vault]
B -->|Image URL| A
A -->|Update| D[Note Frontmatter]
bash
obsidian-image-generator/
├── src/
│ ├── services/
│ │ ├── RecraftService.ts # Handles Recraft API calls
│ │ ├── ImageKitService.ts # Handles ImageKit uploads
│ │ └── fileService.ts # Handles file operations
│ │ └── directoryService.ts # Handles directory operations
│ ├── ui/
│ │ ├── CurrentFileModal.ts
│ │ ├── TargetDirectoryModal.ts
│ │ └── settings.ts
│ └── styles/
│ └── styles.css # Plugin entry point
└── main.ts
Note: the
main.ts
file must be the entry point and must be in the root directory, and must be named main.ts
. This is enforced and required by the Obsidian Plugin Marketplace.Detailed Design
API Integration
- Use Recraft API for image generation
- Support for custom styles (default:
vector_illustration
) - Secure API key storage using Obsidian's settings
- Eventual support for batch processing for a target directory.
- Eventual support for the ImageKit API for image storage.
- Eventual support for Model Context Protocol SDK.
File Operations
- Update frontmatter with
banner_image
andportrait_image
URLs, or the properties set by the user in settings or in the modal. - Preserve existing frontmatter structure and formatting while only using string operations.
Services
- File Operations Service
fileService.ts
extractAndDisplayImagePromptInModal
: Extract markdown headersextractYamlFrontmatter()
: Get YAML frontmatterupdateImagePromptBeforeRequest()
: Update frontmatter valueswriteResponseInYamlProperty()
: Maintain consistent frontmatter orderwriteResponseInContentAtCursor()
: Uses correct image embed syntax.
- Recraft API Service:
RecraftService.ts
- Manage Recraft API communication
- Handle authentication and rate limiting
- Process responses and error handling
Error Handling
- Send notifications through Obsidian's notification API
- Network request failures
- Invalid API keys
- File system errors
- Invalid frontmatter
5. Prior to Implementation
Review and Discuss Prompts
Review and Discuss Code
Key Components from Current Implementation
Image Generation (Python Script)
- Uses Recraft API for generating both banner and portrait images
- Handles YAML frontmatter manipulation via string operations
- Supports custom styles through a JSON configuration
- Includes comprehensive error handling and logging
Image Upload (Node.js Script)
- Handles uploading to ImageKit.io
- Processes markdown files to find and update image URLs
- Preserves YAML formatting and comments
- Supports both single file and directory processing
Key Observations
Critical Dependencies
- Recraft API for image generation
- ImageKit for image hosting
- Environment variables for API keys
Important Constraints
- Must handle YAML frontmatter carefully (no YAML libraries)
- Needs to preserve all existing frontmatter formatting
- Must handle API keys securely
Performance Considerations
- Current implementation processes files asynchronously
- Includes rate limiting and error handling
6. Implementation Plan
Phases and Step by Step
- Phase 1: Initialize Settings Infrastructure
- Create settings infrastructure using Obsidian's plugin API
- Offer user Base URL and API Key, Image Sizes.
- Add support for custom styles
- Update the Code Changelog file.
Finished on 2025-07-20

- Phase 2: Test API Calls
- Create basic image generation command using curl or typescript APIs
- Analyze response object.
- Document example requests and response objects.
- Update the Code Changelog file.
Finished on 2025-07-20
- Phase 3: Iterate on Modal
- Assure proper YAML extraction and
image_prompt:
value extraction using theyamlFrontmatter.ts
utility. - Load settings from data.json that display in the modal.
- Request Portrait?
- Request Banner?
- Request Square?
- Confirm style?
- Use Custom Styles?
- Create a button to submit the
image_prompt
value to the Recraft API as part of the curl based request JSON object.- Add progress indicators using the OpenGraphFetcher example
- Successfully send the
image_prompt
value to the Recraft API and receive a response,- updating the progress bar.
- Audit the implementation of proper Obsidian classes through Obsidian Class Variables
- Implement error handling and recovery. Errors need to be sent through Obsidian notifications.
- Update the Code Changelog file.
Partially Finished on 2025-07-21
- Phase 4: Introduce ImageKit API Settings & Upload Functionality
- Introduce ImageKit API Settings
- Setting to remove downloaded Recraft Generated images from the download folder but only after successfully uploading to ImageKit with a response object from ImageKit with the unique image URL written to file in place of the Recraft generated image URL.
- Toggle on Modal for removing downloaded Recraft Generated images for the above, default to the user preference in settings.
- Update the Code Changelog file.
Finished on 2025-07-21
- Phase 5: Polish and Optimization
- Assure write operations to YAML are flawless.
- Performance optimizations
- Add configuration options
- Figure out how to include the custom styles into the API request object.
- Analyze other request options for Recraft API
- Update the Code Changelog file.
- Phase 6: User Documentation and user guides
- Audit Settings and Modal for helpful user instructions and tooltips.
- Update README with thorough user instructions.
- Phase 7: Developer Documentation
- Update README for potential contributors
- Update Astro Starlight documentation.
Future Iterations
- Implement proper download/upload system to back up into an image delivery service.
- Implement batch processing for multiple images in a single file.
- Implement target directory processing for multiple images across multiple files.
- Identify and match the list of possible image generations.
- Identify and match the list of possible local images to send to image delivery service.
- Implement MCP for using LLM API calls to generate potential image prompts.
Dependencies
- Obsidian API
- Recraft API
- Node.js libraries for HTTP requests
- YAML parsing utility cloned through
Testing Strategy
- Unit tests for utility functions
- Integration tests for API communication
- End-to-end tests for plugin functionality
- Manual testing in Obsidian environment
6. Alternatives Considered
Alternative 1: Browser Extension
- Pros: Could work across multiple applications
- Cons: Less integrated with Obsidian, more complex security model
- Decision: Chose native plugin for better integration and user experience
Alternative 2: Local Image Generation
- Pros: No API dependencies
- Cons: Higher resource requirements, potentially lower quality
- Decision: Use Recraft API for consistent quality and performance
7. Open Questions
- Should we implement local caching of generated images?
- What are the rate limits for the Recraft API?
- How should we handle API key rotation?
8. Appendix
Glossary
- Frontmatter: YAML metadata at the beginning of markdown files
- Recraft API: The image generation service being integrated
- Obsidian Plugin: The custom extension being developed
References
Revision History
- 2025-07-20: Initial draft created
- 2025-07-19: Project conception
Objective
Basics
Inputs
Model API
We will use the Recraft API to generate images.
Example Requests and Responses
ai-labs/recraft/recraft-examples.md
Recraft Styles Implementation
Settings Structure
typescript
interface ImageGinSettings {
// ... existing settings
style: {
useCustomStyle: boolean;
presetStyle: {
base: 'realistic_image' | 'digital_illustration' | 'vector_illustration' | 'icon';
substyle?: string; // e.g., 'b_and_w', 'enterprise', etc.
};
customStyleId?: string; // For user-created styles
};
}
Default Configuration
typescript
const DEFAULT_SETTINGS: ImageGinSettings = {
// ... other settings
style: {
useCustomStyle: false,
presetStyle: {
base: 'digital_illustration',
substyle: 'graphic_intensity'
}
}
};
UI Components
Style Selection
- Style Type Toggle
- Radio buttons: "Use Preset Style" / "Use Custom Style"
Preset Style Mode
- Base Style Dropdown
- Options: realistic_image, digital_illustration, vector_illustration, icon
- Substyle Dropdown
- Dynamically updates based on selected base style
Custom Style Mode
- Style Selector
- Dropdown to select from user's custom styles
- Create New Style (Future)
- Button to create new style from reference images
Helper Functions
Style Parameter Builder
typescript
function buildStyleParams(settings: ImageGinSettings) {
if (settings.style.useCustomStyle && settings.style.customStyleId) {
return { style_id: settings.style.customStyleId };
} else {
const params: any = {
style: settings.style.presetStyle.base
};
if (settings.style.presetStyle.substyle) {
params.substyle = settings.style.presetStyle.substyle;
}
return params;
}
}
Style Options Configuration
typescript
const STYLE_OPTIONS = {
realistic_image: {
label: 'Realistic Image',
substyles: [
{ id: 'b_and_w', label: 'Black & White' },
{ id: 'enterprise', label: 'Enterprise' },
// ... other substyles
]
},
digital_illustration: {
label: 'Digital Illustration',
substyles: [
{ id: '2d_art_poster', label: '2D Art Poster' },
{ id: 'graphic_intensity', label: 'Graphic Intensity' },
// ... other substyles
]
},
// ... other base styles
};
API Key
The Recraft API key will be accessed through Obsidian's Settings, which keeps things like API keys secure, private, and local.
Target Output
Put the response back into the YAML frontmatter of the input file, preserving all other frontmatter structure and formatting.
banner_image: <URL>
portrait_image: <URL>
Output Script in:
Including Custom Styles from Recraft
How to Use Your Generated Custom Style
You can include a custom style generated by the Recraft API in your model request by referencing its style ID or by loading the style JSON object. Example below uses the style you generated and saved at:
ai-labs/recraft/styles-recraft-2025-04-14T21-24-01.json
Sample Style JSON:
json
{
"creation_time": "2025-04-15T02:24:01.574783871Z",
"credits": 40,
"id": "73a249b2-879e-4240-9973-c6fb1715a882",
"is_private": true,
"style": "digital_illustration"
}
Example (Python):
python
import json
with open('ai-labs/recraft/styles-recraft-2025-04-14T21-24-01.json') as f:
style_obj = json.load(f)
# Use style_obj['id'] as the style identifier in your API request
payload = {
'prompt': 'Your image prompt here',
'style': style_obj['id'],
# ...other parameters
}
Example (JS/TS):
js
const style = require('./ai-labs/recraft/styles-recraft-2025-04-14T21-24-01.json');
// Use style.id as the style identifier in your API request
const payload = {
prompt: 'Your image prompt here',
style: style.id,
// ...other parameters
};
Services
Modals
CurrentFileModal.ts
This modal allows you to interact with the current file in focus. It includes sections for:
BatchDirectoryModal.ts
This modal is for batch processing of files within a directory. It includes:
With these modals, you have full interaction capabilities for both individual files and whole directories, allowing you to perform comprehensive text operations directly within Obsidian.