# Performance Optimization Strategy for Large Vault Startup

## Executive Summary

When ObsiViewer is deployed with a large vault (1000+ markdown files), initial startup is slow because the application loads **all notes with full content** before rendering the UI. This document outlines a strategy to improve the user experience through metadata-first loading, on-demand (lazy) content loading, and server-side optimizations.

**Expected Improvement**: From 10-30 seconds at startup → 2-5 seconds to an interactive UI

---

## Problem Analysis

### Current Architecture Issues

#### 1. **Full Vault Load on Startup** ⚠️ CRITICAL
- **Location**: `server/index.mjs` - `/api/vault` endpoint
- **Issue**: Loads ALL notes with FULL content in a single blocking request
- **Impact**:
  - 1000 files × 5KB average = 5MB payload
  - Blocks UI rendering until complete
  - Network transfer of the full payload adds seconds on top of server processing

```typescript
// Current flow:
app.get('/api/vault', async (req, res) => {
  const notes = await loadVaultNotes(vaultDir); // ← Loads ALL notes with content
  res.json({ notes });
});
```

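
The payload arithmetic above can be made explicit with a back-of-the-envelope estimate. This is a minimal sketch: `estimatePayloadBytes` is a hypothetical helper, and the figure of roughly 200 bytes per metadata record is an assumption for illustration, not a measurement from ObsiViewer.

```typescript
// Back-of-the-envelope payload estimate (all sizes are assumptions, not measurements).
const KB = 1000;

function estimatePayloadBytes(fileCount: number, bytesPerFile: number): number {
  return fileCount * bytesPerFile;
}

// Full-content payload: the document's example of 1000 files at ~5KB each.
const fullPayload = estimatePayloadBytes(1000, 5 * KB);

// Metadata-only payload, ASSUMING ~200 bytes per note (id, title, path, two timestamps).
const metadataPayload = estimatePayloadBytes(1000, 200);

console.log(`full: ${fullPayload / (KB * KB)} MB`);          // full: 5 MB
console.log(`metadata: ${metadataPayload / KB} KB`);         // metadata: 200 KB
console.log(`reduction: ${fullPayload / metadataPayload}x`); // reduction: 25x
```

Under these assumptions, dropping note content from the initial response shrinks it by more than an order of magnitude, which is the motivation for the metadata-first approach described later.
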
#### 2. **Front-matter Enrichment on Every File** ⚠️ HIGH IMPACT
- **Location**: `server/index.mjs` - `loadVaultNotes()` function
- **Issue**: Calls `enrichFrontmatterOnOpen()` for every file during initial load
- **Impact**:
  - Expensive YAML parsing for each file
  - File I/O for each enrichment
  - Multiplies load time by 2-3x

```typescript
// Current code (lines 138-141):
const enrichResult = await enrichFrontmatterOnOpen(absPath);
const content = enrichResult.content;
// This happens for EVERY file during loadVaultNotes()
```

#### 3. **No Lazy Loading Strategy**
- **Client**: `VaultService.allNotes()` stores all notes in memory
- **UI**: `NotesListComponent` renders from the full list (virtual scrolling limits DOM nodes, but every note is still loaded)
- **Issue**: No on-demand content loading when a note is selected

#### 4. **Meilisearch Indexing Overhead**
- **Issue**: Initial indexing happens during server startup
- **Impact**: Blocks vault watcher initialization
- **Current**: Falls back to a filesystem scan if Meilisearch is unavailable

#### 5. **Large JSON Payload**
- **Issue**: Full markdown content sent for every file
- **Impact**: Network bandwidth, parsing time, memory usage
- **Example**: 1000 files × 5KB = 5MB+ payload

---

## Current Data Flow

```
┌─────────────────────────────────────────────────────────────┐
│ Browser requests /api/vault                                 │
└──────────────────────┬──────────────────────────────────────┘
                       │
                       ▼
┌─────────────────────────────────────────────────────────────┐
│ Server: loadVaultNotes(vaultDir)                            │
│ - Walk filesystem recursively                               │
│ - For EACH file:                                            │
│   - Read file content                                       │
│   - enrichFrontmatterOnOpen() ← EXPENSIVE                   │
│   - Extract title, tags                                     │
│   - Calculate stats                                         │
└──────────────────────┬──────────────────────────────────────┘
                       │
                       ▼
┌─────────────────────────────────────────────────────────────┐
│ Send large JSON payload (5MB+)                              │
└──────────────────────┬──────────────────────────────────────┘
                       │
                       ▼
┌─────────────────────────────────────────────────────────────┐
│ Client: Parse JSON, store in VaultService.allNotes()        │
│ - Blocks UI rendering                                       │
│ - High memory usage                                         │
└──────────────────────┬──────────────────────────────────────┘
                       │
                       ▼
┌─────────────────────────────────────────────────────────────┐
│ Render UI with all notes                                    │
│ - NotesListComponent renders all items                      │
│ - AppShellNimbusLayoutComponent initializes                 │
└─────────────────────────────────────────────────────────────┘
```

---

## Recommended Optimization Strategy

### Phase 1: Metadata-First Loading (QUICK WIN - 1-2 days)

**Goal**: Load UI in 2-3 seconds instead of 10-30 seconds

#### 1.1 Split Endpoints

Create two endpoints:
- **`/api/files/metadata`** - Fast, lightweight metadata only
- **`/api/vault`** - Full content (keep for backward compatibility)

```typescript
// NEW: Fast metadata endpoint
app.get('/api/files/metadata', async (req, res) => {
  try {
    // Try Meilisearch first (already implemented)
    const client = meiliClient();
    const indexUid = vaultIndexName(vaultDir);
    const index = await ensureIndexSettings(client, indexUid);
    const result = await index.search('', {
      // Note: Meilisearch caps search results at `pagination.maxTotalHits`
      // (default 1000), so that index setting must be raised for this limit to apply.
      limit: 10000,
      attributesToRetrieve: ['id', 'title', 'path', 'createdAt', 'updatedAt']
    });

    const items = Array.isArray(result.hits) ? result.hits : [];
    res.json(items);
  } catch (error) {
    // Fallback to fast filesystem scan (no enrichment)
    console.warn('[Meili] Metadata search failed, falling back to filesystem:', error);
    try {
      const notes = await loadVaultMetadataOnly(vaultDir);
      res.json(buildFileMetadata(notes));
    } catch (fsError) {
      res.status(500).json({ error: 'Failed to load metadata' });
    }
  }
});

// NEW: Fast metadata-only loader (no enrichment)
const loadVaultMetadataOnly = async (vaultPath) => {
  const notes = [];
  const walk = async (currentDir) => {
    // Same traversal as loadVaultNotes() but WITHOUT enrichFrontmatterOnOpen():
    // just read file stats and extract the title from the first heading
  };
  await walk(vaultPath);
  return notes;
};
```

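
The metadata-only loader above needs a way to extract a title from the first heading without parsing YAML. The following is a minimal sketch of one way to do that; `extractTitleFast` is a hypothetical name (the project's own `extractTitle` may behave differently), and the sketch assumes front matter is delimited by `---` lines at the top of the file.

```typescript
// Hypothetical helper: derive a note title without any YAML parsing.
function extractTitleFast(content: string, fallback: string): string {
  const lines = content.split(/\r?\n/);
  let i = 0;

  // Skip a leading front-matter block (only if it is properly closed).
  if (lines[0].trim() === '---') {
    let j = 1;
    while (j < lines.length && lines[j].trim() !== '---') j++;
    if (j < lines.length) i = j + 1;
  }

  let inFence = false;
  for (; i < lines.length; i++) {
    const line = lines[i];
    // Ignore "# ..." lines inside fenced code blocks.
    if (/^\s*(`{3}|~{3})/.test(line)) {
      inFence = !inFence;
      continue;
    }
    if (inFence) continue;

    // First level-1 ATX heading wins.
    const match = /^#\s+(.+?)\s*$/.exec(line);
    if (match) return match[1];
  }
  return fallback;
}

console.log(extractTitleFast('---\ntags: [a]\n---\n# Project Plan\nBody', 'plan.md')); // Project Plan
console.log(extractTitleFast('Just text, no heading', 'untitled.md'));                 // untitled.md
```

Because this only scans lines, its cost is linear in the file size, which is why it fits the "no enrichment" path better than a full front-matter parse.
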
#### 1.2 Modify Client Initialization

Update `VaultService` to load metadata first:

```typescript
// In VaultService (pseudo-code)
// Requires: import { firstValueFrom } from 'rxjs';
async initializeVault() {
  // Step 1: Load metadata immediately (fast)
  // firstValueFrom() replaces the deprecated Observable.toPromise()
  const metadata = await firstValueFrom(this.http.get<any[]>('/api/files/metadata'));
  this.allNotes.set(metadata.map(m => ({
    id: m.id,
    title: m.title,
    filePath: m.path,
    createdAt: m.createdAt,
    updatedAt: m.updatedAt,
    content: '', // Empty initially
    tags: [],
    frontmatter: {}
  })));

  // Step 2: Load full content on-demand when note is selected
  // (already implemented via /api/files endpoint)
}
```

#### 1.3 Defer Front-matter Enrichment

- **Current**: Enrichment happens during `loadVaultNotes()` for ALL files
- **Proposed**: Only enrich when a file is opened

```typescript
// In server/index.mjs - GET /api/files endpoint (already exists)
app.get('/api/files', async (req, res) => {
  try {
    const pathParam = req.query.path;
    // ... validation ...

    // For markdown files, enrich ONLY when explicitly requested
    if (!isExcalidraw && ext === '.md') {
      const enrichResult = await enrichFrontmatterOnOpen(abs);
      // ← This is fine here (on-demand), but remove from loadVaultNotes()
    }
  } catch (error) {
    // ... error handling ...
  }
});

// In loadVaultNotes() - REMOVE enrichment
const loadVaultNotes = async (vaultPath) => {
  const notes = [];
  const walk = async (currentDir) => {
    // ... directory walk ...
    for (const entry of entries) {
      if (!isMarkdownFile(entry)) continue;

      try {
        // REMOVE: const enrichResult = await enrichFrontmatterOnOpen(absPath);
        // Just read the file as-is (async I/O keeps the event loop free)
        const content = await fs.promises.readFile(entryPath, 'utf-8');

        // Extract basic metadata without enrichment
        const stats = await fs.promises.stat(entryPath);
        const title = extractTitle(content, fallback);
        const tags = extractTags(content);

        notes.push({
          id: finalId,
          title,
          content,
          tags,
          mtime: stats.mtimeMs,
          // ... other fields ...
        });
      } catch (err) {
        console.error(`Failed to read note at ${entryPath}:`, err);
      }
    }
  };
  await walk(vaultPath);
  return notes;
};
```

#### 1.4 Update VaultService to Load Content On-Demand

```typescript
// In src/app/services/vault.service.ts
// Requires: import { firstValueFrom } from 'rxjs';
export class VaultService {
  private allNotesMetadata = signal<Note[]>([]);
  private contentCache = new Map<string, string>();

  // Lazy-load content when note is selected
  async ensureNoteContent(noteId: string): Promise<Note | null> {
    const note = this.allNotesMetadata().find(n => n.id === noteId);
    if (!note) return null;

    // If content already loaded (including empty files), return
    if (note.content || this.contentCache.has(noteId)) return note;

    // Load content from server
    try {
      // firstValueFrom() replaces the deprecated Observable.toPromise()
      const response = await firstValueFrom(
        this.http.get<any>('/api/files', { params: { path: note.filePath } })
      );

      // Replace the note immutably so the signal notifies its consumers
      const loaded = { ...note, content: response.content, frontmatter: response.frontmatter };
      this.contentCache.set(noteId, response.content);
      this.allNotesMetadata.update(notes => notes.map(n => (n.id === noteId ? loaded : n)));

      return loaded;
    } catch (error) {
      console.error('Failed to load note content:', error);
      return note;
    }
  }
}
```

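
On-demand loading can issue duplicate requests when the same note is requested twice before the first response arrives (for example, a click followed by a background preload). The sketch below shows one way to share the in-flight request. It is framework-free on purpose: `LazyContentStore` and the injected `fetchContent` function are hypothetical names standing in for the `/api/files` call, not part of the existing `VaultService`. Failed requests are not cached, so a later call retries.

```typescript
type FetchContent = (filePath: string) => Promise<string>;

class LazyContentStore {
  private readonly cache = new Map<string, string>();
  private readonly inFlight = new Map<string, Promise<string>>();
  private readonly fetchContent: FetchContent;

  constructor(fetchContent: FetchContent) {
    this.fetchContent = fetchContent;
  }

  // Returns cached content, joins an in-flight request, or starts a new one.
  load(filePath: string): Promise<string> {
    const cached = this.cache.get(filePath);
    if (cached !== undefined) return Promise.resolve(cached);

    const pending = this.inFlight.get(filePath);
    if (pending) return pending;

    const request = this.fetchContent(filePath).then(
      content => {
        this.cache.set(filePath, content);
        this.inFlight.delete(filePath);
        return content;
      },
      err => {
        // Do not cache failures; allow a retry on the next call.
        this.inFlight.delete(filePath);
        throw err;
      }
    );
    this.inFlight.set(filePath, request);
    return request;
  }
}

// Usage with a stub fetcher that counts how often the "server" is hit.
let serverHits = 0;
const store = new LazyContentStore(async path => {
  serverHits++;
  return `content of ${path}`;
});

const first = store.load('notes/a.md');
const second = store.load('notes/a.md'); // joins the in-flight request
console.log(first === second, serverHits); // true 1
```
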
---

### Phase 2: Pagination & Streaming (2-3 days)

**Goal**: Support vaults with 10,000+ files

#### 2.1 Implement Cursor-Based Pagination

```typescript
// Server endpoint with pagination
// Note: the cursor is a stringified offset, so this is offset pagination
// behind a cursor-shaped API.
app.get('/api/files/metadata/paginated', async (req, res) => {
  // Clamp limit to 1..500 and treat a missing or invalid cursor as offset 0
  const limit = Math.min(Math.max(parseInt(req.query.limit, 10) || 100, 1), 500);
  const offset = Math.max(parseInt(req.query.cursor, 10) || 0, 0);

  try {
    const client = meiliClient();
    const indexUid = vaultIndexName(vaultDir);
    const index = await ensureIndexSettings(client, indexUid);

    const result = await index.search('', {
      limit: limit + 1, // Fetch one extra to determine if more exist
      offset,
      attributesToRetrieve: ['id', 'title', 'path', 'createdAt', 'updatedAt']
    });

    const hasMore = result.hits.length > limit;
    const items = result.hits.slice(0, limit);
    const nextCursor = hasMore ? String(offset + limit) : null;

    res.json({ items, nextCursor, hasMore });
  } catch (error) {
    res.status(500).json({ error: 'Pagination failed' });
  }
});
```

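
The endpoint above derives its cursor from a numeric offset. For comparison, here is a minimal sketch of keyset-style pagination over an in-memory list, which could back a filesystem fallback. It swaps in a different technique from the one in the endpoint: the cursor is the `path` of the last item already returned, so pages stay stable when files are added or removed between requests. `NoteMeta`, `Page`, and `paginateAfter` are hypothetical names, and the list is assumed to be sorted by `path` with plain string comparison and `limit` to be at least 1.

```typescript
interface NoteMeta { id: string; path: string; }
interface Page { items: NoteMeta[]; nextCursor: string | null; hasMore: boolean; }

// `sorted` must be ordered by path (plain string comparison); `limit` must be >= 1.
function paginateAfter(sorted: NoteMeta[], limit: number, cursor: string | null): Page {
  // Start at the first item whose path sorts strictly after the cursor.
  const start = cursor === null ? 0 : sorted.findIndex(n => n.path > cursor);
  if (start === -1) return { items: [], nextCursor: null, hasMore: false };

  const items = sorted.slice(start, start + limit);
  const hasMore = start + limit < sorted.length;
  return { items, nextCursor: hasMore ? items[items.length - 1].path : null, hasMore };
}

// Usage: walk five notes in pages of two.
const all: NoteMeta[] = ['a.md', 'b.md', 'c.md', 'd.md', 'e.md'].map(p => ({ id: p, path: p }));
const seen: string[] = [];
let pages = 0;
let cursor: string | null = null;
do {
  const page = paginateAfter(all, 2, cursor);
  seen.push(...page.items.map(n => n.path));
  cursor = page.nextCursor;
  pages++;
} while (cursor !== null);

console.log(pages, seen.join(',')); // 3 a.md,b.md,c.md,d.md,e.md
```

The trade-off is that keyset cursors require a stable sort key, whereas the offset form works with any result ordering the search index returns.
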
#### 2.2 Implement Virtual Scrolling in NotesListComponent

```typescript
// In src/app/features/list/notes-list.component.ts
import { ScrollingModule } from '@angular/cdk/scrolling';

@Component({
  // ...
  imports: [CommonModule, ScrollableOverlayDirective, ScrollingModule],
  template: `
    <cdk-virtual-scroll-viewport itemSize="60" class="h-full">
      <ul>
        <li *cdkVirtualFor="let n of filtered()" class="p-3 hover:bg-surface1">
          {{ n.title }}
        </li>
      </ul>
    </cdk-virtual-scroll-viewport>
  `
})
export class NotesListComponent {
  // Virtual scrolling will only render visible items
}
```

---

### Phase 3: Server-Side Caching (1-2 days)

**Goal**: Avoid re-scanning filesystem on every request

#### 3.1 Implement In-Memory Metadata Cache

```typescript
// In server/index.mjs
let cachedMetadata = null;
let metadataCacheTime = 0;
const METADATA_CACHE_TTL = 5 * 60 * 1000; // 5 minutes

const getMetadataFromCache = async () => {
  const now = Date.now();
  if (cachedMetadata && (now - metadataCacheTime) < METADATA_CACHE_TTL) {
    return cachedMetadata;
  }

  // Rebuild cache
  cachedMetadata = await loadVaultMetadataOnly(vaultDir);
  metadataCacheTime = now;
  return cachedMetadata;
};

// Use in endpoints
app.get('/api/files/metadata', async (req, res) => {
  try {
    const metadata = await getMetadataFromCache();
    res.json(buildFileMetadata(metadata));
  } catch (error) {
    res.status(500).json({ error: 'Failed to load metadata' });
  }
});

// Invalidate cache on file changes
vaultWatcher.on('add', () => { metadataCacheTime = 0; });
vaultWatcher.on('change', () => { metadataCacheTime = 0; });
vaultWatcher.on('unlink', () => { metadataCacheTime = 0; });
```

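
The TTL and invalidation logic above can be isolated into a small, testable unit. The following is a minimal sketch, assuming a hypothetical `TtlCache` class with an injectable clock so that expiry can be exercised without waiting five minutes. It is not the server's actual implementation.

```typescript
class TtlCache<T> {
  private value: T | undefined;
  private hasValue = false;
  private loadedAt = 0;
  private readonly load: () => T;
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(load: () => T, ttlMs: number, now: () => number = () => Date.now()) {
    this.load = load;
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Returns the cached value while it is fresh, otherwise reloads it.
  get(): T {
    const t = this.now();
    if (!this.hasValue || t - this.loadedAt >= this.ttlMs) {
      this.value = this.load();
      this.hasValue = true;
      this.loadedAt = t;
    }
    return this.value as T;
  }

  // Called from file-watcher events to force a reload on the next get().
  invalidate(): void {
    this.hasValue = false;
  }
}

// Usage with a fake clock and a loader that counts filesystem "scans".
let clock = 0;
let scans = 0;
const cache = new TtlCache(() => ++scans, 5 * 60 * 1000, () => clock);

cache.get();
cache.get();              // still fresh: no second scan
clock += 5 * 60 * 1000;
cache.get();              // TTL elapsed: rescan
cache.invalidate();
cache.get();              // invalidated: rescan
console.log(scans);       // 3
```

If the loader returns a `Promise`, the promise itself is cached, so concurrent requests during a rebuild share one scan. A caveat of that design is that a rejected promise would also stay cached until the TTL expires or `invalidate()` is called.
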
#### 3.2 Defer Meilisearch Indexing

```typescript
// In server/index.mjs - defer initial indexing
let indexingInProgress = false;

const scheduleIndexing = async () => {
  if (indexingInProgress) return;
  indexingInProgress = true;

  // Schedule indexing for later (don't block startup)
  setImmediate(async () => {
    try {
      await fullReindex(vaultDir);
      console.log('[Meili] Initial indexing complete');
    } catch (error) {
      console.warn('[Meili] Initial indexing failed:', error);
    } finally {
      indexingInProgress = false;
    }
  });
};

// Call during server startup instead of blocking
app.listen(PORT, () => {
  console.log(`Server running on port ${PORT}`);
  scheduleIndexing(); // Non-blocking
});
```

---

### Phase 4: Client-Side Optimization (1 day)

**Goal**: Smooth UI interactions even with large datasets

#### 4.1 Implement Signal-Based Lazy Loading

```typescript
// In VaultService
export class VaultService {
  private allNotesMetadata = signal<Note[]>([]);
  private loadedNoteIds = new Set<string>();

  // Load content in background
  preloadNearbyNotes(currentNoteId: string, range = 5) {
    const notes = this.allNotesMetadata();
    const idx = notes.findIndex(n => n.id === currentNoteId);
    if (idx === -1) return;

    // Preload nearby notes
    for (let i = Math.max(0, idx - range); i <= Math.min(notes.length - 1, idx + range); i++) {
      const noteId = notes[i].id;
      if (!this.loadedNoteIds.has(noteId)) {
        // Mark as requested up front so overlapping calls do not fetch the same note twice
        this.loadedNoteIds.add(noteId);
        this.ensureNoteContent(noteId).catch(() => this.loadedNoteIds.delete(noteId));
      }
    }
  }
}
```

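
The loop above preloads from the lowest index upward, so the notes closest to the current one are not necessarily requested first. A possible refinement, sketched here under the assumption that nearer notes are more likely to be opened next, is to compute the preload order separately. `preloadOrder` is a hypothetical helper, not an existing `VaultService` method.

```typescript
// Returns the indices to preload around `idx`, nearest first, excluding `idx` itself.
// Assumes 0 <= idx < length.
function preloadOrder(length: number, idx: number, range = 5): number[] {
  const order: number[] = [];
  for (let distance = 1; distance <= range; distance++) {
    if (idx + distance < length) order.push(idx + distance); // next note first
    if (idx - distance >= 0) order.push(idx - distance);     // then the previous one
  }
  return order;
}

console.log(preloadOrder(10, 5, 2)); // indices 6, 4, 7, 3
console.log(preloadOrder(10, 0, 3)); // indices 1, 2, 3 (nothing before the first note)
```

Keeping the index calculation pure makes the boundary cases (first and last note in the list) easy to verify in isolation from HTTP calls.
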
#### 4.2 Optimize Change Detection

```typescript
// Already implemented in AppComponent
@Component({
  // ...
  changeDetection: ChangeDetectionStrategy.OnPush, // ✓ Already done
})
export class AppComponent {
  // Use signals instead of observables
  // Avoid unnecessary change detection cycles
}
```

---

## Implementation Roadmap

### Week 1: Phase 1 (Metadata-First Loading)
- [ ] Create `/api/files/metadata` endpoint
- [ ] Implement `loadVaultMetadataOnly()` function
- [ ] Remove enrichment from `loadVaultNotes()`
- [ ] Update `VaultService` to load metadata first
- [ ] Test with 1000+ file vault
- **Expected Result**: 10-30s → 2-5s startup time

### Week 2: Phase 2 (Pagination)
- [ ] Implement cursor-based pagination
- [ ] Add virtual scrolling to `NotesListComponent`
- [ ] Test with 10,000+ files
- **Expected Result**: Startup time no longer grows with vault size (10,000+ files supported)

### Week 3: Phase 3 (Server Caching)
- [ ] Implement in-memory metadata cache
- [ ] Defer Meilisearch indexing
- [ ] Add cache invalidation on file changes
- **Expected Result**: Reduced server load

### Week 4: Phase 4 (Client Optimization)
- [ ] Implement preloading strategy
- [ ] Profile and optimize hot paths
- [ ] Performance testing
- **Expected Result**: Smooth interactions

---

## Performance Metrics

### Before Optimization
```
Startup Time (1000 files):
- Server processing: 15-20s
- Network transfer: 5-10s
- Client parsing: 2-3s
- Total: 22-33s

Memory Usage:
- Server: 200-300MB
- Client: 150-200MB
```

### After Phase 1 (Metadata-First)
```
Startup Time (1000 files):
- Server processing: 1-2s (metadata only)
- Network transfer: 0.5-1s (small payload)
- Client parsing: 0.5-1s
- Total: 2-4s ✓

Memory Usage:
- Server: 50-100MB
- Client: 20-30MB (metadata only)
```

### After Phase 2 (Pagination)
```
Startup Time (10,000 files):
- Server processing: 0.5s (first page)
- Network transfer: 0.2-0.5s
- Client parsing: 0.2-0.5s
- Total: 1-1.5s ✓

Memory Usage:
- Server: 50-100MB (cache)
- Client: 5-10MB (first page only)
```

---

## Quick Wins (Can Implement Immediately)

1. **Remove enrichment from startup** (5 minutes)
   - Comment out `enrichFrontmatterOnOpen()` in `loadVaultNotes()`
   - Defer to on-demand loading

2. **Add metadata-only endpoint** (30 minutes)
   - Create `/api/files/metadata` using existing Meilisearch integration
   - Use fallback to fast filesystem scan

3. **Implement server-side caching** (1 hour)
   - Cache metadata for 5 minutes
   - Invalidate on file changes

4. **Defer Meilisearch indexing** (30 minutes)
   - Use `setImmediate()` instead of blocking startup

---

## Testing Recommendations

### Load Testing
```bash
# Generate test vault with 1000+ files
node scripts/generate-test-vault.mjs --files 1000

# Measure response time of the metadata endpoint
time curl -s http://localhost:3000/api/files/metadata > /dev/null

# Monitor memory usage (attach Chrome DevTools via chrome://inspect)
node --inspect server/index.mjs
```

### Performance Profiling
```typescript
// Add timing logs
console.time('loadVaultMetadata');
const metadata = await loadVaultMetadataOnly(vaultDir);
console.timeEnd('loadVaultMetadata');

// Monitor in browser DevTools:
// Performance tab → Network → Measure /api/files/metadata
```

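
A single `time curl` run can be skewed by cold caches or background load, so it may help to repeat the measurement and compare medians rather than individual runs. The sketch below shows the summary step only. `summarize` is a hypothetical helper, and the sample values are invented for illustration; they are not measurements of ObsiViewer.

```typescript
// Summarize repeated timing samples (milliseconds) as min / median / max.
function summarize(samplesMs: number[]): { min: number; median: number; max: number } {
  if (samplesMs.length === 0) throw new Error('no samples');
  const sorted = samplesMs.slice().sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  const median = sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
  return { min: sorted[0], median, max: sorted[sorted.length - 1] };
}

// Illustrative (made-up) samples: one slow outlier from a cold start.
const summary = summarize([2400, 2100, 9800, 2300, 2200]);
console.log(summary.min, summary.median, summary.max); // 2100 2300 9800
```
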
---

## Conclusion

Implementing this strategy in phases is expected to reduce startup time from **22-33 seconds** to **2-4 seconds** after Phase 1, and to **1-2 seconds** once pagination is in place, while supporting vaults with 10,000+ files. The metadata-first approach is the key quick win and provides the most immediate benefit.

**Recommended Next Steps**:
1. Implement Phase 1 (Metadata-First) immediately
2. Measure performance improvements
3. Proceed with Phases 2-4 based on user feedback