
# Performance Optimization Strategy for Large Vault Startup

## Executive Summary

When deploying ObsiViewer with a large vault (1000+ markdown files), the initial startup is slow because the application loads **all notes with full content** before rendering the UI. This document outlines a comprehensive strategy to improve the user experience through metadata-first loading, lazy loading, and server-side optimizations.

**Expected Improvement**: Startup drops from an estimated 10-30 seconds to 2-5 seconds until the UI is interactive

---

## Problem Analysis

### Current Architecture Issues

#### 1. **Full Vault Load on Startup** ⚠️ CRITICAL
- **Location**: `server/index.mjs` - `/api/vault` endpoint
- **Issue**: Loads ALL notes with FULL content in a single request before responding
- **Impact**:
  - 1000 files × 5KB average = 5MB payload
  - Blocks UI rendering until complete
  - Network transfer time dominates

```typescript
// Current flow:
app.get('/api/vault', async (req, res) => {
  const notes = await loadVaultNotes(vaultDir);  // ← Loads ALL notes with content
  res.json({ notes });
});
```

#### 2. **Front-matter Enrichment on Every File** ⚠️ HIGH IMPACT
- **Location**: `server/index.mjs` - `loadVaultNotes()` function
- **Issue**: Calls `enrichFrontmatterOnOpen()` for every file during initial load
- **Impact**:
  - Expensive YAML parsing for each file
  - File I/O for each enrichment
  - Multiplies load time by 2-3x

```typescript
// Current code (lines 138-141):
const enrichResult = await enrichFrontmatterOnOpen(absPath);
const content = enrichResult.content;
// This happens for EVERY file during loadVaultNotes()
```

#### 3. **No Lazy Loading Strategy**
- **Client**: `VaultService.allNotes()` stores all notes in memory
- **UI**: `NotesListComponent` renders from the full in-memory list (virtual scrolling limits the DOM nodes, but every note is still loaded)
- **Issue**: No on-demand content loading when note is selected

#### 4. **Meilisearch Indexing Overhead**
- **Issue**: Initial indexing happens during server startup
- **Impact**: Blocks vault watcher initialization
- **Current**: Fallback to filesystem if Meilisearch unavailable

#### 5. **Large JSON Payload**
- **Issue**: Full markdown content sent for every file
- **Impact**: Network bandwidth, parsing time, memory usage
- **Example**: 1000 files × 5KB = 5MB+ payload
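
The payload arithmetic above can be checked with a small sketch. `estimatePayload` and the 5 Mbps link speed are assumptions made for illustration; they are not part of ObsiViewer, and real payloads also carry JSON escaping and metadata overhead.

```typescript
// Hypothetical helper: estimates the /api/vault payload size and transfer time.
// The bandwidth value is an assumed figure, not a measurement.
function estimatePayload(fileCount: number, avgFileKB: number, bandwidthMbps: number) {
  const payloadMB = (fileCount * avgFileKB) / 1000;        // decimal units: 1 MB = 1000 KB
  const transferSeconds = (payloadMB * 8) / bandwidthMbps; // megabytes -> megabits
  return { payloadMB, transferSeconds };
}

const estimate = estimatePayload(1000, 5, 5);
console.log(estimate.payloadMB);       // 5
console.log(estimate.transferSeconds); // 8
```

On the assumed link, the raw content alone accounts for several seconds of transfer, which is why the metadata-first approach targets payload size first.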

---

## Current Data Flow

```
┌─────────────────────────────────────────────────────────────┐
│ Browser requests /api/vault                                 │
└──────────────────────┬──────────────────────────────────────┘
                       │
                       ▼
┌─────────────────────────────────────────────────────────────┐
│ Server: loadVaultNotes(vaultDir)                            │
│ - Walk filesystem recursively                               │
│ - For EACH file:                                            │
│   - Read file content                                       │
│   - enrichFrontmatterOnOpen() ← EXPENSIVE                   │
│   - Extract title, tags                                     │
│   - Calculate stats                                         │
└──────────────────────┬──────────────────────────────────────┘
                       │
                       ▼
┌─────────────────────────────────────────────────────────────┐
│ Send large JSON payload (5MB+)                              │
└──────────────────────┬──────────────────────────────────────┘
                       │
                       ▼
┌─────────────────────────────────────────────────────────────┐
│ Client: Parse JSON, store in VaultService.allNotes()        │
│ - Blocks UI rendering                                       │
│ - High memory usage                                         │
└──────────────────────┬──────────────────────────────────────┘
                       │
                       ▼
┌─────────────────────────────────────────────────────────────┐
│ Render UI with all notes                                    │
│ - NotesListComponent renders all items                      │
│ - AppShellNimbusLayoutComponent initializes                 │
└─────────────────────────────────────────────────────────────┘
```

---

## Recommended Optimization Strategy

### Phase 1: Metadata-First Loading (QUICK WIN - 1-2 days)

**Goal**: Load UI in 2-4 seconds instead of 10-30 seconds

#### 1.1 Split Endpoints

Create two endpoints:
- **`/api/files/metadata`** - Fast, lightweight metadata only
- **`/api/vault`** - Full content (keep for backward compatibility)

```typescript
// NEW: Fast metadata endpoint
app.get('/api/files/metadata', async (req, res) => {
  try {
    // Try Meilisearch first (already implemented)
    const client = meiliClient();
    const indexUid = vaultIndexName(vaultDir);
    const index = await ensureIndexSettings(client, indexUid);
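    // Note: Meilisearch caps search results at the index setting
    // `pagination.maxTotalHits` (default 1000), so that setting must be
    // raised for `limit: 10000` to return more than 1000 documents.
    const result = await index.search('', {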
      limit: 10000,
      attributesToRetrieve: ['id', 'title', 'path', 'createdAt', 'updatedAt']
    });

    const items = Array.isArray(result.hits) ? result.hits : [];
    res.json(items);
  } catch (error) {
    // Fallback to fast filesystem scan (no enrichment)
    const notes = await loadVaultMetadataOnly(vaultDir);
    res.json(buildFileMetadata(notes));
  }
});

// NEW: Fast metadata-only loader (no enrichment)
const loadVaultMetadataOnly = async (vaultPath) => {
  const notes = [];
  const walk = async (currentDir) => {
    // Same as loadVaultNotes but WITHOUT enrichFrontmatterOnOpen()
    // Just read file stats and extract title from first heading
  };
  await walk(vaultPath);
  return notes;
};
```
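
The "extract title from first heading" step inside `walk` could look like the sketch below. `extractTitleFast` is a hypothetical name, not an existing ObsiViewer function. It skips a leading front-matter block by its `---` delimiters instead of parsing the YAML, which is the point of the metadata-only path. One known limitation: a `# ` line inside a fenced code block would be mistaken for a heading.

```typescript
// Hypothetical helper: derive a note title without any YAML parsing.
function extractTitleFast(content: string, fallback: string): string {
  const lines = content.split(/\r?\n/);
  let start = 0;

  // Skip a leading front-matter block delimited by '---' lines.
  if (lines[0]?.trim() === '---') {
    const close = lines.findIndex((line, idx) => idx > 0 && line.trim() === '---');
    if (close !== -1) start = close + 1;
  }

  // Return the text of the first level-1 heading ("# Title").
  for (const line of lines.slice(start)) {
    const title = /^#\s+(.+?)\s*$/.exec(line)?.[1];
    if (title) return title;
  }

  // No heading found: fall back to the caller-supplied value (e.g. the file name).
  return fallback;
}

console.log(extractTitleFast('---\ntags: [a]\n---\n# Project Plan\nBody', 'project-plan'));
// Project Plan
```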

#### 1.2 Modify Client Initialization

Update `VaultService` to load metadata first:

```typescript
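// In VaultService (pseudo-code)
// Requires: import { firstValueFrom } from 'rxjs';
async initializeVault() {
  // Step 1: Load metadata immediately (fast)
  // firstValueFrom replaces toPromise(), which is deprecated since RxJS 7
  const metadata = await firstValueFrom(this.http.get<any[]>('/api/files/metadata'));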
  this.allNotes.set(metadata.map(m => ({
    id: m.id,
    title: m.title,
    filePath: m.path,
    createdAt: m.createdAt,
    updatedAt: m.updatedAt,
    content: '', // Empty initially
    tags: [],
    frontmatter: {}
  })));

  // Step 2: Load full content on-demand when note is selected
  // (already implemented via /api/files endpoint)
}
```

#### 1.3 Defer Front-matter Enrichment

**Current**: Enrichment happens during `loadVaultNotes()` for ALL files
**Proposed**: Only enrich when file is opened

```typescript
// In server/index.mjs - GET /api/files endpoint (already exists)
app.get('/api/files', async (req, res) => {
  try {
    const pathParam = req.query.path;
    // ... validation ...

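    // For markdown files, enrich ONLY when explicitly requested
    if (!isExcalidraw && ext === '.md') {
      const enrichResult = await enrichFrontmatterOnOpen(abs);
      // ← This is fine here (on-demand), but remove from loadVaultNotes()
    }
    // ... send response ...
  } catch (error) {
    // A try block needs a catch (or finally) clause to be valid JavaScript;
    // the error message here is a placeholder
    res.status(500).json({ error: 'Failed to read file' });
  }
});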

// In loadVaultNotes() - REMOVE enrichment
const loadVaultNotes = async (vaultPath) => {
  const notes = [];
  const walk = async (currentDir) => {
    // ... directory walk ...
    for (const entry of entries) {
      if (!isMarkdownFile(entry)) continue;

      try {
        // REMOVE: const enrichResult = await enrichFrontmatterOnOpen(absPath);
        // Just read the file as-is; async I/O avoids blocking the event loop
        const content = await fs.promises.readFile(entryPath, 'utf-8');

        // Extract basic metadata without enrichment
        const stats = await fs.promises.stat(entryPath);
        const title = extractTitle(content, fallback);
        const tags = extractTags(content);

        notes.push({
          id: finalId,
          title,
          content,
          tags,
          mtime: stats.mtimeMs,
          // ... other fields ...
        });
      } catch (err) {
        console.error(`Failed to read note at ${entryPath}:`, err);
      }
    }
  };
  await walk(vaultPath);
  return notes;
};
```

#### 1.4 Update VaultService to Load Content On-Demand

```typescript
// In src/app/services/vault.service.ts
export class VaultService {
  private allNotesMetadata = signal<Note[]>([]);
  private contentCache = new Map<string, string>();

  // Lazy-load content when note is selected
  async ensureNoteContent(noteId: string): Promise<Note | null> {
    const note = this.allNotesMetadata().find(n => n.id === noteId);
    if (!note) return null;

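    // If content already loaded, return
    if (note.content) return note;

    // Load content from server
    try {
      // firstValueFrom (imported from 'rxjs') replaces the deprecated toPromise()
      const response = await firstValueFrom(
        this.http.get<{ content: string; frontmatter: Note['frontmatter'] }>(
          '/api/files', { params: { path: note.filePath } })
      );

      // Replace the note immutably so signal consumers are notified;
      // mutating `note` in place would not trigger a signal update
      const loaded = { ...note, content: response.content, frontmatter: response.frontmatter };
      this.allNotesMetadata.update(notes => notes.map(n => (n.id === noteId ? loaded : n)));

      return loaded;
    } catch (error) {
      console.error('Failed to load note content:', error);
      return note;
    }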
  }
}
```
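
The `contentCache` field declared above is a natural place to avoid duplicate requests when the same note is selected repeatedly. The sketch below is one possible shape, not the project's implementation: `NoteContentCache` is a hypothetical class, and `fetchContent` stands in for the real HTTP call to `/api/files`. Caching the promise rather than the resolved string means concurrent callers share a single in-flight request.

```typescript
// Hypothetical helper: caches note content by file path and shares in-flight requests.
class NoteContentCache {
  private readonly entries = new Map<string, Promise<string>>();
  private readonly fetchContent: (filePath: string) => Promise<string>;

  constructor(fetchContent: (filePath: string) => Promise<string>) {
    this.fetchContent = fetchContent; // stand-in for the GET /api/files call
  }

  get(filePath: string): Promise<string> {
    const cached = this.entries.get(filePath);
    if (cached) return cached; // resolved or still in flight: reuse either way

    const pending = this.fetchContent(filePath).catch((error) => {
      // Drop failed requests so a later selection can retry.
      this.entries.delete(filePath);
      throw error;
    });
    this.entries.set(filePath, pending);
    return pending;
  }

  // Call when the file changes on disk so the next read refetches it.
  invalidate(filePath: string): void {
    this.entries.delete(filePath);
  }
}
```

A design note: the cache is unbounded here. For very large vaults an eviction policy (for example, least recently used) would likely be needed to keep client memory in check.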

---

### Phase 2: Pagination & Streaming (2-3 days)

**Goal**: Support vaults with 10,000+ files

#### 2.1 Implement Cursor-Based Pagination

```typescript
// Server endpoint with pagination
// Note: the "cursor" here is an opaque stringified offset, not a keyset cursor
app.get('/api/files/metadata/paginated', async (req, res) => {
  // Clamp limit to 1..500; fall back to 100 when missing or invalid
  const requested = Number.parseInt(req.query.limit, 10);
  const limit = Math.min(Math.max(requested || 100, 1), 500);
  // Treat a missing or malformed cursor as offset 0
  const parsedOffset = Number.parseInt(req.query.cursor, 10);
  const offset = Number.isInteger(parsedOffset) && parsedOffset >= 0 ? parsedOffset : 0;

  try {
    const client = meiliClient();
    const indexUid = vaultIndexName(vaultDir);
    const index = await ensureIndexSettings(client, indexUid);

    const result = await index.search('', {
      limit: limit + 1, // Fetch one extra to determine if more exist
      offset,
      attributesToRetrieve: ['id', 'title', 'path', 'createdAt', 'updatedAt']
    });

    const hasMore = result.hits.length > limit;
    const items = result.hits.slice(0, limit);
    const nextCursor = hasMore ? String(offset + limit) : null;

    res.json({ items, nextCursor, hasMore });
  } catch (error) {
    res.status(500).json({ error: 'Pagination failed' });
  }
});
```
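
The "fetch one extra row" technique used by the endpoint can be isolated and tested without Meilisearch. In this sketch, `paginate`, `parseCursor`, and `MetadataPage` are hypothetical names, and the in-memory array stands in for the search index.

```typescript
interface MetadataPage<T> {
  items: T[];
  nextCursor: string | null;
  hasMore: boolean;
}

// Parses the opaque cursor into a non-negative offset; anything invalid restarts at 0.
function parseCursor(cursor: string | undefined): number {
  const offset = Number.parseInt(cursor ?? '', 10);
  return Number.isInteger(offset) && offset >= 0 ? offset : 0;
}

// `all` stands in for the indexed result set (assumption for illustration).
function paginate<T>(all: readonly T[], cursor: string | undefined, limit: number): MetadataPage<T> {
  const size = Math.min(Math.max(Math.trunc(limit) || 100, 1), 500); // clamp to 1..500
  const offset = parseCursor(cursor);
  const rows = all.slice(offset, offset + size + 1); // one extra row signals "more"
  const hasMore = rows.length > size;
  return {
    items: rows.slice(0, size),
    nextCursor: hasMore ? String(offset + size) : null,
    hasMore,
  };
}

// Usage: walk every page of a 250-item list, 100 items at a time.
const sampleIds = Array.from({ length: 250 }, (_, i) => `note-${i}`);
let page = paginate(sampleIds, undefined, 100);
while (page.nextCursor !== null) {
  page = paginate(sampleIds, page.nextCursor, 100);
}
console.log(page.items.length); // 50
```

Because the cursor is just an offset, files added or removed between requests can shift pages. A keyset cursor (for example, the last seen `path`) would avoid that at the cost of requiring a stable sort order.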

#### 2.2 Implement Virtual Scrolling in NotesListComponent

```typescript
// In src/app/features/list/notes-list.component.ts
import { ScrollingModule } from '@angular/cdk/scrolling';

@Component({
  // ...
  imports: [CommonModule, ScrollableOverlayDirective, ScrollingModule],
  template: `
    <cdk-virtual-scroll-viewport itemSize="60" class="h-full">
      <ul>
        <li *cdkVirtualFor="let n of filtered()" class="p-3 hover:bg-surface1">
          {{ n.title }}
        </li>
      </ul>
    </cdk-virtual-scroll-viewport>
  `
})
export class NotesListComponent {
  // Virtual scrolling will only render visible items
}
```
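
To make "only render visible items" concrete, the sketch below computes which slice of a fixed-height list needs DOM nodes for a given scroll position. It illustrates the general windowing idea only and is not taken from the Angular CDK source; `visibleRange` and the `buffer` parameter are hypothetical.

```typescript
// Illustrative fixed-size windowing: which item indices intersect the viewport?
function visibleRange(
  scrollTop: number,
  viewportHeight: number,
  itemSize: number,
  total: number,
  buffer = 2, // extra items rendered above and below to smooth scrolling
): { start: number; end: number } {
  const start = Math.max(0, Math.floor(scrollTop / itemSize) - buffer);
  const end = Math.min(total, Math.ceil((scrollTop + viewportHeight) / itemSize) + buffer);
  return { start, end }; // end is exclusive
}

// 10,000 notes at 60px each, 600px viewport, scrolled to item 100.
const range = visibleRange(6000, 600, 60, 10000);
console.log(range);                   // { start: 98, end: 112 }
console.log(range.end - range.start); // 14
```

With these numbers only 14 list items exist in the DOM at any time, regardless of how many notes the vault holds.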

---

### Phase 3: Server-Side Caching (1-2 days)

**Goal**: Avoid re-scanning filesystem on every request

#### 3.1 Implement In-Memory Metadata Cache

```typescript
// In server/index.mjs
let cachedMetadata = null;
let metadataCacheTime = 0;
const METADATA_CACHE_TTL = 5 * 60 * 1000; // 5 minutes

const getMetadataFromCache = async () => {
  const now = Date.now();
  if (cachedMetadata && (now - metadataCacheTime) < METADATA_CACHE_TTL) {
    return cachedMetadata;
  }

  // Rebuild cache
  cachedMetadata = await loadVaultMetadataOnly(vaultDir);
  metadataCacheTime = now;
  return cachedMetadata;
};

// Use in endpoints
app.get('/api/files/metadata', async (req, res) => {
  try {
    const metadata = await getMetadataFromCache();
    res.json(buildFileMetadata(metadata));
  } catch (error) {
    res.status(500).json({ error: 'Failed to load metadata' });
  }
});

// Invalidate cache on file changes
vaultWatcher.on('add', () => { metadataCacheTime = 0; });
vaultWatcher.on('change', () => { metadataCacheTime = 0; });
vaultWatcher.on('unlink', () => { metadataCacheTime = 0; });
```
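
One gap in the cache above is that several requests arriving while the cache is stale would each trigger a full filesystem scan. A possible refinement, sketched here under assumptions, stores the pending promise so concurrent callers share one rebuild. `createTtlCache` is a hypothetical helper, and the injectable `now` clock exists only to make the TTL logic testable.

```typescript
// Hypothetical helper: single-value TTL cache that shares one in-flight rebuild.
function createTtlCache<T>(load: () => Promise<T>, ttlMs: number, now: () => number = Date.now) {
  let value: Promise<T> | null = null;
  let loadedAt = 0;

  return {
    get(): Promise<T> {
      // Fresh (or still loading): every caller gets the same promise.
      if (value && now() - loadedAt < ttlMs) return value;

      loadedAt = now();
      const pending = load();
      value = pending;
      // If the rebuild fails, forget it so the next request retries.
      pending.catch(() => {
        if (value === pending) value = null;
      });
      return pending;
    },
    // Call from the file watcher's add/change/unlink handlers.
    invalidate(): void {
      value = null;
    },
  };
}

// Usage with a stand-in loader (the real one would be loadVaultMetadataOnly).
const metadataCache = createTtlCache(async () => ['a.md', 'b.md'], 5 * 60 * 1000);
metadataCache.get().then((files) => console.log(files.length)); // 2
```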

#### 3.2 Defer Meilisearch Indexing

```typescript
// In server/index.mjs - defer initial indexing
let indexingInProgress = false;

const scheduleIndexing = async () => {
  if (indexingInProgress) return;
  indexingInProgress = true;

  // Schedule indexing for later (don't block startup)
  setImmediate(async () => {
    try {
      await fullReindex(vaultDir);
      console.log('[Meili] Initial indexing complete');
    } catch (error) {
      console.warn('[Meili] Initial indexing failed:', error);
    } finally {
      indexingInProgress = false;
    }
  });
};

// Call during server startup instead of blocking
app.listen(PORT, () => {
  console.log(`Server running on port ${PORT}`);
  scheduleIndexing(); // Non-blocking
});
```

---

### Phase 4: Client-Side Optimization (1 day)

**Goal**: Smooth UI interactions even with large datasets

#### 4.1 Implement Signal-Based Lazy Loading

```typescript
// In VaultService
export class VaultService {
  private allNotesMetadata = signal<Note[]>([]);
  private loadedNoteIds = new Set<string>();

  // Load content in background
  preloadNearbyNotes(currentNoteId: string, range = 5) {
    const notes = this.allNotesMetadata();
    const idx = notes.findIndex(n => n.id === currentNoteId);
    if (idx === -1) return;

    // Preload nearby notes
    for (let i = Math.max(0, idx - range); i <= Math.min(notes.length - 1, idx + range); i++) {
      const noteId = notes[i].id;
      if (!this.loadedNoteIds.has(noteId)) {
        // Mark before the request starts so overlapping calls don't fetch twice
        this.loadedNoteIds.add(noteId);
        void this.ensureNoteContent(noteId);
      }
    }
  }
}
```
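
The window selection in `preloadNearbyNotes` can also be written as a pure function, which makes the boundary cases easy to verify. `preloadCandidates` is a hypothetical helper; unlike the loop above, it excludes the current note on the assumption that selecting a note already loads it.

```typescript
// Hypothetical helper: picks which notes to prefetch around the current selection.
function preloadCandidates(
  ids: readonly string[],
  currentId: string,
  range: number,
  alreadyRequested: ReadonlySet<string>,
): string[] {
  const idx = ids.indexOf(currentId);
  if (idx === -1) return [];

  // Clamp the window to the bounds of the list.
  const from = Math.max(0, idx - range);
  const to = Math.min(ids.length - 1, idx + range);

  return ids
    .slice(from, to + 1)
    .filter((id) => id !== currentId && !alreadyRequested.has(id));
}

const noteIds = ['a', 'b', 'c', 'd', 'e', 'f'];
console.log(preloadCandidates(noteIds, 'c', 2, new Set(['b']))); // [ 'a', 'd', 'e' ]
```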

#### 4.2 Optimize Change Detection

```typescript
// Already implemented in AppComponent
@Component({
  // ...
  changeDetection: ChangeDetectionStrategy.OnPush, // ✓ Already done
})
export class AppComponent {
  // Use signals instead of observables
  // Avoid unnecessary change detection cycles
}
```

---

## Implementation Roadmap

### Week 1: Phase 1 (Metadata-First Loading)
- [ ] Create `/api/files/metadata` endpoint
- [ ] Implement `loadVaultMetadataOnly()` function
- [ ] Remove enrichment from `loadVaultNotes()`
- [ ] Update `VaultService` to load metadata first
- [ ] Test with 1000+ file vault
- **Expected Result**: 10-30s → 2-4s startup time

### Week 2: Phase 2 (Pagination)
- [ ] Implement cursor-based pagination
- [ ] Add virtual scrolling to NotesListComponent
- [ ] Test with 10,000+ files
- **Expected Result**: Support vaults with 10,000+ files

### Week 3: Phase 3 (Server Caching)
- [ ] Implement in-memory metadata cache
- [ ] Defer Meilisearch indexing
- [ ] Add cache invalidation on file changes
- **Expected Result**: Reduced server load

### Week 4: Phase 4 (Client Optimization)
- [ ] Implement preloading strategy
- [ ] Profile and optimize hot paths
- [ ] Performance testing
- **Expected Result**: Smooth interactions

---

## Performance Metrics

### Before Optimization
```
Startup Time (1000 files):
- Server processing: 15-20s
- Network transfer: 5-10s
- Client parsing: 2-3s
- Total: 22-33s

Memory Usage:
- Server: 200-300MB
- Client: 150-200MB
```

### After Phase 1 (Metadata-First)
```
Startup Time (1000 files):
- Server processing: 1-2s (metadata only)
- Network transfer: 0.5-1s (small payload)
- Client parsing: 0.5-1s
- Total: 2-4s ✓

Memory Usage:
- Server: 50-100MB
- Client: 20-30MB (metadata only)
```

### After Phase 2 (Pagination)
```
Startup Time (10,000 files):
- Server processing: 0.5s (first page)
- Network transfer: 0.2-0.5s
- Client parsing: 0.2-0.5s
- Total: 1-1.5s ✓

Memory Usage:
- Server: 50-100MB (cache)
- Client: 5-10MB (first page only)
```
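
Each total in the tables above is the sum of its component ranges, which a short sketch can confirm. `sumRanges` is a hypothetical helper, and the inputs are the estimates from the tables rather than measurements. The Phase 2 components sum to 0.9-1.5s, which the table rounds to 1-1.5s.

```typescript
type SecondsRange = [number, number]; // [best case, worst case]

// Adds the lower bounds and the upper bounds separately, rounded to 0.1s
// to avoid floating-point noise such as 0.8999999999999999.
function sumRanges(parts: SecondsRange[]): SecondsRange {
  const round = (n: number) => Math.round(n * 10) / 10;
  const low = parts.reduce((sum, [lo]) => sum + lo, 0);
  const high = parts.reduce((sum, [, hi]) => sum + hi, 0);
  return [round(low), round(high)];
}

console.log(sumRanges([[15, 20], [5, 10], [2, 3]]));            // [ 22, 33 ]  before
console.log(sumRanges([[1, 2], [0.5, 1], [0.5, 1]]));           // [ 2, 4 ]    after Phase 1
console.log(sumRanges([[0.5, 0.5], [0.2, 0.5], [0.2, 0.5]]));   // [ 0.9, 1.5 ] after Phase 2
```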

---

## Quick Wins (Can Implement Immediately)

1. **Remove enrichment from startup** (5 minutes)
   - Comment out `enrichFrontmatterOnOpen()` in `loadVaultNotes()`
   - Defer to on-demand loading

2. **Add metadata-only endpoint** (30 minutes)
   - Create `/api/files/metadata` using existing Meilisearch integration
   - Use fallback to fast filesystem scan

3. **Implement server-side caching** (1 hour)
   - Cache metadata for 5 minutes
   - Invalidate on file changes

4. **Defer Meilisearch indexing** (30 minutes)
   - Use `setImmediate()` instead of blocking startup

---

## Testing Recommendations

### Load Testing
```bash
# Generate test vault with 1000+ files
node scripts/generate-test-vault.mjs --files 1000

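# Measure metadata endpoint response time (assumes the server listens on port 3000)
time curl -s -o /dev/null http://localhost:3000/api/files/metadata

# Monitor memory usage (attach Chrome DevTools to the inspector to take heap snapshots)
node --inspect server/index.mjs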
```

### Performance Profiling
```typescript
// Add timing logs
console.time('loadVaultMetadata');
const metadata = await loadVaultMetadataOnly(vaultDir);
console.timeEnd('loadVaultMetadata');

// Monitor in browser DevTools:
// Performance tab → Network → Measure /api/files/metadata
```

---

## Conclusion

By implementing this optimization strategy in phases, you can reduce startup time for a 1000-file vault from an estimated **22-33 seconds to 2-4 seconds** after Phase 1. Phase 2 pagination then keeps the first page at roughly 1-1.5 seconds even for vaults with 10,000+ files. The metadata-first approach is the key quick win that provides immediate benefits.

**Recommended Next Steps**:
1. Implement Phase 1 (Metadata-First) immediately
2. Measure performance improvements
3. Proceed with Phases 2-4 based on user feedback