Performance Optimization Strategy for Large Vault Startup
Executive Summary
When deploying ObsiViewer with a large vault (1000+ markdown files), the initial startup is slow because the application loads all notes with full content before rendering the UI. This document outlines a comprehensive strategy to improve the user experience through metadata-first loading, lazy loading, and server-side optimizations.
Expected Improvement: From 10-30 seconds startup → 2-5 seconds to interactive UI
Problem Analysis
Current Architecture Issues
1. Full Vault Load on Startup ⚠️ CRITICAL
- Location: server/index.mjs - /api/vault endpoint
- Issue: Loads ALL notes with FULL content synchronously
- Impact:
  - 1000 files × 5KB average = 5MB payload
  - Blocks UI rendering until complete
  - Network transfer time dominates
// Current flow:
app.get('/api/vault', async (req, res) => {
const notes = await loadVaultNotes(vaultDir); // ← Loads ALL notes with content
res.json({ notes });
});
2. Front-matter Enrichment on Every File ⚠️ HIGH IMPACT
- Location: server/index.mjs - loadVaultNotes() function
- Issue: Calls enrichFrontmatterOnOpen() for every file during initial load
- Impact:
  - Expensive YAML parsing for each file
  - File I/O for each enrichment
  - Multiplies load time by 2-3x
// Current code (lines 138-141):
const enrichResult = await enrichFrontmatterOnOpen(absPath);
const content = enrichResult.content;
// This happens for EVERY file during loadVaultNotes()
3. No Lazy Loading Strategy
- Client: VaultService.allNotes() stores all notes in memory
- UI: NotesListComponent renders all notes (with virtual scrolling, but still loaded)
- Issue: No on-demand content loading when a note is selected
4. Meilisearch Indexing Overhead
- Issue: Initial indexing happens during server startup
- Impact: Blocks vault watcher initialization
- Current: Fallback to filesystem if Meilisearch unavailable
5. Large JSON Payload
- Issue: Full markdown content sent for every file
- Impact: Network bandwidth, parsing time, memory usage
- Example: 1000 files × 5KB = 5MB+ payload
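The payload arithmetic above can be made concrete with a small estimate. The sketch below compares a full-content response with a metadata-only one. The helper name `estimatePayloadBytes` and the 200-byte per-note metadata size are illustrative assumptions, not measurements from ObsiViewer, and it follows the text's convention of 1 KB = 1000 bytes.

```javascript
// Hypothetical helper: rough payload size for a vault response.
// bytesPerNote is an assumed average, not a measured value.
function estimatePayloadBytes(fileCount, bytesPerNote) {
  return fileCount * bytesPerNote;
}

// Full markdown content, using the 5KB average from the text
const fullBytes = estimatePayloadBytes(1000, 5 * 1000);
// Assumed size of id/title/path/dates per note (illustrative only)
const metadataBytes = estimatePayloadBytes(1000, 200);

console.log(`full content: ${(fullBytes / 1e6).toFixed(1)} MB`);
console.log(`metadata only: ${(metadataBytes / 1e6).toFixed(1)} MB`);
```

Under these assumptions the metadata-only response is roughly 25 times smaller. The real ratio depends on actual note sizes and on which fields are returned.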
Current Data Flow
┌─────────────────────────────────────────────────────────────┐
│ Browser requests /api/vault │
└──────────────────────┬──────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ Server: loadVaultNotes(vaultDir) │
│ - Walk filesystem recursively │
│ - For EACH file: │
│ - Read file content │
│ - enrichFrontmatterOnOpen() ← EXPENSIVE │
│ - Extract title, tags │
│ - Calculate stats │
└──────────────────────┬──────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ Send large JSON payload (5MB+) │
└──────────────────────┬──────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ Client: Parse JSON, store in VaultService.allNotes() │
│ - Blocks UI rendering │
│ - High memory usage │
└──────────────────────┬──────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ Render UI with all notes │
│ - NotesListComponent renders all items │
│ - AppShellNimbusLayoutComponent initializes │
└─────────────────────────────────────────────────────────────┘
Recommended Optimization Strategy
Phase 1: Metadata-First Loading (QUICK WIN - 1-2 days)
Goal: Load UI in 2-3 seconds instead of 10-30 seconds
1.1 Split Endpoints
Create two endpoints:
- /api/files/metadata - Fast, lightweight metadata only
- /api/vault - Full content (keep for backward compatibility)
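// NEW: Fast metadata endpoint
app.get('/api/files/metadata', async (req, res) => {
  try {
    // Try Meilisearch first (already implemented)
    const client = meiliClient();
    const indexUid = vaultIndexName(vaultDir);
    const index = await ensureIndexSettings(client, indexUid);
    // Note: Meilisearch caps search results at the index's
    // pagination.maxTotalHits setting (1000 by default), so that
    // setting must be raised for this limit to take effect
    const result = await index.search('', {
      limit: 10000,
      attributesToRetrieve: ['id', 'title', 'path', 'createdAt', 'updatedAt']
    });
    const items = Array.isArray(result.hits) ? result.hits : [];
    res.json(items);
  } catch (error) {
    // Fallback to fast filesystem scan (no enrichment)
    try {
      const notes = await loadVaultMetadataOnly(vaultDir);
      res.json(buildFileMetadata(notes));
    } catch (fallbackError) {
      // Both sources failed: report the error instead of leaving the request hanging
      res.status(500).json({ error: 'Failed to load metadata' });
    }
  }
});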
// NEW: Fast metadata-only loader (no enrichment)
const loadVaultMetadataOnly = async (vaultPath) => {
const notes = [];
const walk = async (currentDir) => {
// Same as loadVaultNotes but WITHOUT enrichFrontmatterOnOpen()
// Just read file stats and extract title from first heading
};
await walk(vaultPath);
return notes;
};
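The stub above leaves the title extraction unspecified. One way to take the title from the first heading without YAML parsing is sketched below. The function name `extractTitleFromFirstHeading` is hypothetical, and it assumes that front matter, when present, is delimited by `---` lines at the top of the file and that the title is the first level-1 heading.

```javascript
// Hypothetical helper: derive a title without parsing YAML front matter.
// Skips a leading '---' front-matter block, then returns the first '# ' heading.
function extractTitleFromFirstHeading(content, fallback) {
  const lines = content.split(/\r?\n/);
  let i = 0;
  if (lines[0].trim() === '---') {
    // Skip front matter up to the closing '---'
    i = 1;
    while (i < lines.length && lines[i].trim() !== '---') i++;
    i++; // move past the closing delimiter
  }
  for (; i < lines.length; i++) {
    // Level-1 ATX heading, with optional trailing '#' characters removed
    const match = /^#\s+(.+?)\s*#*\s*$/.exec(lines[i]);
    if (match) return match[1];
  }
  // No heading found: use the caller's fallback (for example the file name)
  return fallback;
}

console.log(extractTitleFromFirstHeading('---\ntags: [a]\n---\n\n# Project Plan\nBody', 'untitled'));
```

This is a line-based scan, so a `# ` line inside a fenced code block that appears before the real heading would be picked up. That trade-off may be acceptable for a fast startup listing, since the full parse still happens when the note is opened.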
1.2 Modify Client Initialization
Update VaultService to load metadata first:
// In VaultService (pseudo-code; firstValueFrom is imported from 'rxjs')
async initializeVault() {
  // Step 1: Load metadata immediately (fast)
  // firstValueFrom replaces toPromise(), which is deprecated in RxJS 7
  const metadata = await firstValueFrom(this.http.get<any[]>('/api/files/metadata'));
  this.allNotes.set(metadata.map(m => ({
    id: m.id,
    title: m.title,
    filePath: m.path,
    createdAt: m.createdAt,
    updatedAt: m.updatedAt,
    content: '', // Empty initially
    tags: [],
    frontmatter: {}
  })));
  // Step 2: Load full content on-demand when note is selected
  // (already implemented via /api/files endpoint)
}
1.3 Defer Front-matter Enrichment
Current: Enrichment happens during loadVaultNotes() for ALL files
Proposed: Only enrich when file is opened
// In server/index.mjs - GET /api/files endpoint (already exists)
app.get('/api/files', async (req, res) => {
  try {
    const pathParam = req.query.path;
    // ... validation ...
    // For markdown files, enrich ONLY when explicitly requested
    if (!isExcalidraw && ext === '.md') {
      const enrichResult = await enrichFrontmatterOnOpen(abs);
      // ← This is fine here (on-demand), but remove from loadVaultNotes()
    }
    // ... send response ...
  } catch (error) {
    // ... existing error handling ...
  }
});
// In loadVaultNotes() - REMOVE enrichment
const loadVaultNotes = async (vaultPath) => {
  const notes = [];
  const walk = async (currentDir) => {
    // ... directory walk ...
    for (const entry of entries) {
      if (!isMarkdownFile(entry)) continue;
      try {
        // REMOVE: const enrichResult = await enrichFrontmatterOnOpen(absPath);
        // Just read the file as-is (async, so the event loop is not blocked)
        const content = await fs.promises.readFile(entryPath, 'utf-8');
        // Extract basic metadata without enrichment
        const stats = await fs.promises.stat(entryPath);
        const title = extractTitle(content, fallback);
        const tags = extractTags(content);
        notes.push({
          id: finalId,
          title,
          content,
          tags,
          mtime: stats.mtimeMs,
          // ... other fields ...
        });
      } catch (err) {
        console.error(`Failed to read note at ${entryPath}:`, err);
      }
    }
  };
  await walk(vaultPath);
  return notes;
};
1.4 Update VaultService to Load Content On-Demand
// In src/app/services/vault.service.ts
// (firstValueFrom is imported from 'rxjs')
export class VaultService {
  private allNotesMetadata = signal<Note[]>([]);
  private contentCache = new Map<string, string>();
  // Lazy-load content when note is selected
  async ensureNoteContent(noteId: string): Promise<Note | null> {
    const note = this.allNotesMetadata().find(n => n.id === noteId);
    if (!note) return null;
    // If content already loaded, return
    if (note.content) return note;
    // Load content from server
    try {
      const response = await firstValueFrom(
        this.http.get<Pick<Note, 'content' | 'frontmatter'>>('/api/files', {
          params: { path: note.filePath }
        })
      );
      // Replace the note immutably so that signal consumers are notified
      // (mutating the object in place would not trigger an update)
      const loaded: Note = { ...note, content: response.content, frontmatter: response.frontmatter };
      this.allNotesMetadata.update(notes => notes.map(n => (n.id === noteId ? loaded : n)));
      return loaded;
    } catch (error) {
      console.error('Failed to load note content:', error);
      return note;
    }
  }
}
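On-demand loading can issue the same request twice when a note is selected while a background load for it is still running. A minimal sketch of a loader that caches content and shares in-flight requests follows. The names `createContentLoader` and `fetchContent` are hypothetical, and `fetchContent` stands in for whatever call retrieves a note body from /api/files.

```javascript
// Hypothetical sketch: cache note content and share in-flight requests.
// fetchContent is an assumed async function: (path) => Promise<string>.
function createContentLoader(fetchContent) {
  const cache = new Map();    // path -> content that has already been loaded
  const inFlight = new Map(); // path -> Promise for a request still running

  return function loadContent(path) {
    if (cache.has(path)) return Promise.resolve(cache.get(path));
    if (inFlight.has(path)) return inFlight.get(path);

    const pending = fetchContent(path)
      .then(content => {
        cache.set(path, content);
        return content;
      })
      // Always clear the in-flight entry so a failed request can be retried
      .finally(() => inFlight.delete(path));

    inFlight.set(path, pending);
    return pending;
  };
}
```

Failed requests are not cached, so a later selection of the same note retries the fetch. Entries would also need to be evicted when the file changes on disk, which this sketch leaves out.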
Phase 2: Pagination & Streaming (2-3 days)
Goal: Support vaults with 10,000+ files
2.1 Implement Cursor-Based Pagination
// Server endpoint with pagination
// (the cursor is an opaque string that encodes a numeric offset)
app.get('/api/files/metadata/paginated', async (req, res) => {
  // Clamp limit to 1..500 and fall back to 100 when missing or invalid
  const requested = Number.parseInt(req.query.limit, 10);
  const limit = Math.min(Math.max(requested || 100, 1), 500);
  // A missing, invalid, or negative cursor starts from the beginning
  const offset = Math.max(Number.parseInt(req.query.cursor, 10) || 0, 0);
  try {
    const client = meiliClient();
    const indexUid = vaultIndexName(vaultDir);
    const index = await ensureIndexSettings(client, indexUid);
    const result = await index.search('', {
      limit: limit + 1, // Fetch one extra to determine if more exist
      offset,
      attributesToRetrieve: ['id', 'title', 'path', 'createdAt', 'updatedAt']
    });
    const hasMore = result.hits.length > limit;
    const items = result.hits.slice(0, limit);
    const nextCursor = hasMore ? String(offset + limit) : null;
    res.json({ items, nextCursor, hasMore });
  } catch (error) {
    res.status(500).json({ error: 'Pagination failed' });
  }
});
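The "fetch one extra" technique used by the endpoint can be tried in isolation. The sketch below applies the same logic to an in-memory array and then follows nextCursor the way a client would. The function name `paginate` and the sample data are illustrative only.

```javascript
// Hypothetical in-memory version of the "fetch limit + 1" paging technique.
function paginate(allItems, cursor, limit) {
  // The cursor encodes an offset; missing or invalid values start at 0
  const offset = Math.max(Number.parseInt(cursor, 10) || 0, 0);
  // Take one extra item to learn whether another page exists
  const chunk = allItems.slice(offset, offset + limit + 1);
  const hasMore = chunk.length > limit;
  return {
    items: chunk.slice(0, limit),
    nextCursor: hasMore ? String(offset + limit) : null,
    hasMore
  };
}

// Walk every page by following nextCursor until it is null
const all = ['a', 'b', 'c', 'd', 'e'];
const seen = [];
let cursor = null;
do {
  const page = paginate(all, cursor, 2);
  seen.push(...page.items);
  cursor = page.nextCursor;
} while (cursor !== null);

console.log(seen.join(',')); // prints "a,b,c,d,e"
```

Because the cursor is an offset, a file added or removed between two requests can shift later pages, so an item may be skipped or repeated. A cursor based on a stable sort key would avoid that, at the cost of a more involved query.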
2.2 Implement Virtual Scrolling in NotesListComponent
// In src/app/features/list/notes-list.component.ts
import { ScrollingModule } from '@angular/cdk/scrolling';
@Component({
// ...
imports: [CommonModule, ScrollableOverlayDirective, ScrollingModule],
template: `
<cdk-virtual-scroll-viewport itemSize="60" class="h-full">
<ul>
<li *cdkVirtualFor="let n of filtered()" class="p-3 hover:bg-surface1">
{{ n.title }}
</li>
</ul>
</cdk-virtual-scroll-viewport>
`
})
export class NotesListComponent {
// Virtual scrolling will only render visible items
}
Phase 3: Server-Side Caching (1-2 days)
Goal: Avoid re-scanning filesystem on every request
3.1 Implement In-Memory Metadata Cache
// In server/index.mjs
let cachedMetadata = null;
let metadataCacheTime = 0;
const METADATA_CACHE_TTL = 5 * 60 * 1000; // 5 minutes
const getMetadataFromCache = async () => {
const now = Date.now();
if (cachedMetadata && (now - metadataCacheTime) < METADATA_CACHE_TTL) {
return cachedMetadata;
}
// Rebuild cache
cachedMetadata = await loadVaultMetadataOnly(vaultDir);
metadataCacheTime = now;
return cachedMetadata;
};
// Use in endpoints
app.get('/api/files/metadata', async (req, res) => {
try {
const metadata = await getMetadataFromCache();
res.json(buildFileMetadata(metadata));
} catch (error) {
res.status(500).json({ error: 'Failed to load metadata' });
}
});
// Invalidate cache on file changes
vaultWatcher.on('add', () => { metadataCacheTime = 0; });
vaultWatcher.on('change', () => { metadataCacheTime = 0; });
vaultWatcher.on('unlink', () => { metadataCacheTime = 0; });
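With the cache above, several requests that arrive while the cache is empty or expired would each start their own filesystem scan. A possible refinement, sketched under the assumption that the loader is an async function such as loadVaultMetadataOnly, shares one rebuild among concurrent callers. The name `createTtlCache` and the injectable `now` clock are hypothetical.

```javascript
// Hypothetical sketch: TTL cache that shares one rebuild among concurrent callers.
// loader is an assumed async function; now is injectable to make testing easier.
function createTtlCache(loader, ttlMs, now = Date.now) {
  let value = null;
  let loadedAt = 0;
  let hasValue = false;
  let pending = null; // Promise for a rebuild that is currently running

  function get() {
    if (hasValue && now() - loadedAt < ttlMs) return Promise.resolve(value);
    if (!pending) {
      pending = loader()
        .then(result => {
          value = result;
          loadedAt = now();
          hasValue = true;
          return result;
        })
        // Clear the marker whether the rebuild succeeded or failed
        .finally(() => { pending = null; });
    }
    return pending;
  }

  // Call from watcher events to force the next get() to reload
  function invalidate() {
    hasValue = false;
  }

  return { get, invalidate };
}
```

Usage would look like `const metadataCache = createTtlCache(() => loadVaultMetadataOnly(vaultDir), 5 * 60 * 1000)`, with the watcher handlers calling `metadataCache.invalidate()`. A change that arrives while a rebuild is running is not tracked here, so the rebuilt value may be stale until the next invalidation or TTL expiry.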
3.2 Defer Meilisearch Indexing
// In server/index.mjs - defer initial indexing
let indexingInProgress = false;
const scheduleIndexing = async () => {
if (indexingInProgress) return;
indexingInProgress = true;
// Schedule indexing for later (don't block startup)
setImmediate(async () => {
try {
await fullReindex(vaultDir);
console.log('[Meili] Initial indexing complete');
} catch (error) {
console.warn('[Meili] Initial indexing failed:', error);
} finally {
indexingInProgress = false;
}
});
};
// Call during server startup instead of blocking
app.listen(PORT, () => {
console.log(`Server running on port ${PORT}`);
scheduleIndexing(); // Non-blocking
});
Phase 4: Client-Side Optimization (1 day)
Goal: Smooth UI interactions even with large datasets
4.1 Implement Signal-Based Lazy Loading
// In VaultService
export class VaultService {
private allNotesMetadata = signal<Note[]>([]);
private loadedNoteIds = new Set<string>();
// Load content in background
preloadNearbyNotes(currentNoteId: string, range = 5) {
const notes = this.allNotesMetadata();
const idx = notes.findIndex(n => n.id === currentNoteId);
if (idx === -1) return;
// Preload nearby notes
for (let i = Math.max(0, idx - range); i <= Math.min(notes.length - 1, idx + range); i++) {
const noteId = notes[i].id;
if (!this.loadedNoteIds.has(noteId)) {
this.ensureNoteContent(noteId).then(() => {
this.loadedNoteIds.add(noteId);
});
}
}
}
}
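The loop above requests notes in list order, from the lowest index upward. If the notes closest to the current one are the most likely to be opened next, the preload order can be arranged nearest-first. The helper below is a sketch of that ordering. The name `preloadOrder` is hypothetical and it only computes indices, leaving the actual loading to ensureNoteContent.

```javascript
// Hypothetical helper: indices to preload around the current note,
// ordered nearest-first and excluding the current note itself.
function preloadOrder(length, currentIndex, range) {
  const order = [];
  for (let distance = 1; distance <= range; distance++) {
    const before = currentIndex - distance;
    const after = currentIndex + distance;
    if (before >= 0) order.push(before);
    if (after < length) order.push(after);
  }
  return order;
}

console.log(preloadOrder(10, 5, 2)); // prints [ 4, 6, 3, 7 ]
console.log(preloadOrder(10, 0, 2)); // prints [ 1, 2 ]
```

Loading in this order also makes it straightforward to stop early, for example when the user selects a different note before the whole window has been fetched.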
4.2 Optimize Change Detection
// Already implemented in AppComponent
@Component({
// ...
changeDetection: ChangeDetectionStrategy.OnPush, // ✓ Already done
})
export class AppComponent {
// Use signals instead of observables
// Avoid unnecessary change detection cycles
}
Implementation Roadmap
Week 1: Phase 1 (Metadata-First Loading)
- Create /api/files/metadata endpoint
- Implement loadVaultMetadataOnly() function
- Remove enrichment from loadVaultNotes()
- Update VaultService to load metadata first
- Test with 1000+ file vault
- Expected Result: 10-30s → 3-5s startup time
Week 2: Phase 2 (Pagination)
- Implement cursor-based pagination
- Add virtual scrolling to NotesListComponent
- Test with 10,000+ files
- Expected Result: Support vaults with 10,000+ files without loading the full list up front
Week 3: Phase 3 (Server Caching)
- Implement in-memory metadata cache
- Defer Meilisearch indexing
- Add cache invalidation on file changes
- Expected Result: Reduced server load
Week 4: Phase 4 (Client Optimization)
- Implement preloading strategy
- Profile and optimize hot paths
- Performance testing
- Expected Result: Smooth interactions
Performance Metrics
Before Optimization
Startup Time (1000 files):
- Server processing: 15-20s
- Network transfer: 5-10s
- Client parsing: 2-3s
- Total: 22-33s
Memory Usage:
- Server: 200-300MB
- Client: 150-200MB
After Phase 1 (Metadata-First)
Startup Time (1000 files):
- Server processing: 1-2s (metadata only)
- Network transfer: 0.5-1s (small payload)
- Client parsing: 0.5-1s
- Total: 2-4s ✓
Memory Usage:
- Server: 50-100MB
- Client: 20-30MB (metadata only)
After Phase 2 (Pagination)
Startup Time (10,000 files):
- Server processing: 0.5s (first page)
- Network transfer: 0.2-0.5s
- Client parsing: 0.2-0.5s
- Total: 1-1.5s ✓
Memory Usage:
- Server: 50-100MB (cache)
- Client: 5-10MB (first page only)
Quick Wins (Can Implement Immediately)
- Remove enrichment from startup (5 minutes)
  - Comment out enrichFrontmatterOnOpen() in loadVaultNotes()
  - Defer to on-demand loading
- Add metadata-only endpoint (30 minutes)
  - Create /api/files/metadata using existing Meilisearch integration
  - Use fallback to fast filesystem scan
- Implement server-side caching (1 hour)
  - Cache metadata for 5 minutes
  - Invalidate on file changes
- Defer Meilisearch indexing (30 minutes)
  - Use setImmediate() instead of blocking startup
Testing Recommendations
Load Testing
# Generate test vault with 1000+ files
node scripts/generate-test-vault.mjs --files 1000
# Measure startup time
time curl http://localhost:3000/api/files/metadata > /dev/null
# Monitor memory usage
node --inspect server/index.mjs
Performance Profiling
// Add timing logs
console.time('loadVaultMetadata');
const metadata = await loadVaultMetadataOnly(vaultDir);
console.timeEnd('loadVaultMetadata');
// Monitor in browser DevTools:
// Performance tab → Network → measure /api/files/metadata
Conclusion
By implementing this optimization strategy in phases, you can reduce startup time for a 1000-file vault from an estimated 22-33 seconds to 2-4 seconds after Phase 1, and to roughly 1-1.5 seconds for the first page of a 10,000+ file vault after Phase 2. The metadata-first approach is the key quick win that provides immediate benefits.
Recommended Next Steps:
- Implement Phase 1 (Metadata-First) immediately
- Measure performance improvements
- Proceed with Phase 2-4 based on user feedback