# DSS Self-Debugging Methodology

**Purpose**: Use DSS to debug DSS using its own infrastructure
**Principle**: Self-referential debugging - the system observes itself
**Date**: 2025-12-05

---

## Overview: Self-Referential Debugging Architecture

```
DSS Running (Dashboard/Admin UI)
    ↓
Audit Logger (Records all actions)
    ├─ User interactions
    ├─ API calls
    ├─ Permission checks
    ├─ State changes
    └─ Errors (with stack traces)
    ↓
Workflow Persistence (Snapshots)
    ├─ Current page state
    ├─ User/team context
    ├─ Component states
    └─ Figma connection status
    ↓
Error Recovery (Crash Detection)
    ├─ Recovery points
    ├─ Error categorization
    ├─ Stack traces
    └─ Retry strategies
    ↓
Browser Console (Raw Logs)
    ├─ JavaScript errors
    ├─ Network requests
    ├─ Component lifecycle
    └─ Performance metrics
    ↓
Debug Inspector (NEW - To Be Built)
    ├─ View audit logs
    ├─ View state snapshots
    ├─ View crash reports
    ├─ Analyze performance
    └─ Replay state
```

---

## Layer 1: Audit Logger Inspection

**Purpose**: See what actions were taken and in what order

**Location**: `admin-ui/js/core/audit-logger.js`

**Available Data**:
```javascript
// Each log entry contains:
{
  timestamp: 1733425200000,
  sessionId: "session-1733425200000-abc123",
  action: "page_change",
  category: "navigation",
  level: "info",
  details: {
    from: "/",
    to: "/settings",
    user: "admin@example.com",
    timestamp: 1733425200000
  },
  redacted: false
}
```

**Inspection Query**:
```javascript
// Access from browser console (needs `window.__DSS_DEBUG`, see Phase 1 under Next Steps):
window.__DSS_DEBUG.auditLogger.getLogs({
  action: 'api_call',
  level: 'error',
  timeRange: { start: Date.now() - 3600000, end: Date.now() } // last hour
})
```

**What to Look For**:
1. ✅ Sequence of actions leading to error
2. ✅ Permission checks that failed
3. ✅ API calls and their responses
4. ✅ State mutations and side effects
5. ✅ Error patterns (repeated errors)
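
Items 1 and 5 in the list above can be sketched as two small helpers that work on the array returned by `getLogs()`. This is only an illustration: `actionsLeadingTo`, `repeatedErrors`, and the sample data are hypothetical names that do not exist in `audit-logger.js`, and the entry shape is assumed to match the sample entry shown earlier. In the browser the input would be something like `window.__DSS_DEBUG.auditLogger.getLogs()`.

```javascript
// Hypothetical helpers (not part of audit-logger.js). Entries are assumed to
// follow the sample shape: { timestamp, action, category, level, details }.

// Return the entries recorded in the `windowMs` before `errorEntry`, oldest first.
function actionsLeadingTo(logs, errorEntry, windowMs = 5000) {
  return logs
    .filter(entry =>
      entry !== errorEntry &&
      entry.timestamp <= errorEntry.timestamp &&
      entry.timestamp >= errorEntry.timestamp - windowMs)
    .sort((a, b) => a.timestamp - b.timestamp);
}

// Count error-level entries per action and keep the actions seen more than once.
function repeatedErrors(logs) {
  const counts = new Map();
  for (const entry of logs) {
    if (entry.level !== 'error') continue;
    counts.set(entry.action, (counts.get(entry.action) || 0) + 1);
  }
  return [...counts.entries()]
    .filter(([, count]) => count > 1)
    .sort((a, b) => b[1] - a[1]);
}

// Made-up sample data for illustration only.
const sampleLogs = [
  { timestamp: 1000, action: 'page_change', category: 'navigation', level: 'info' },
  { timestamp: 2000, action: 'permission_check', category: 'permission', level: 'info' },
  { timestamp: 3000, action: 'api_call', category: 'api', level: 'error' },
  { timestamp: 9000, action: 'api_call', category: 'api', level: 'error' },
];
const lastError = sampleLogs[sampleLogs.length - 1];

console.log(actionsLeadingTo(sampleLogs, lastError, 7000).map(e => e.action));
console.log(repeatedErrors(sampleLogs));
```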

---

## Layer 2: Workflow Persistence Inspection

**Purpose**: See what state the application was in at different points

**Location**: `admin-ui/js/core/workflow-persistence.js`

**Available Data**:
```javascript
// Each snapshot contains:
{
  id: "snapshot-1",
  timestamp: 1733425200000,
  label: "before-settings-change",
  state: {
    currentPage: "/settings",
    user: { id: "123", role: "TEAM_LEAD" },
    team: { id: "team-1", name: "Design" },
    figmaConnected: true,
    selectedComponent: "Button",
    settings: { theme: "dark", ... }
  }
}
```

**Inspection Query**:
```javascript
// From browser console:
window.__DSS_DEBUG.workflowPersistence.getSnapshots()
// Shows all saved snapshots

// Restore to specific point:
window.__DSS_DEBUG.workflowPersistence.restoreSnapshot("snapshot-id")
```

**What to Look For**:
1. ✅ State before error occurred
2. ✅ Configuration values at time of issue
3. ✅ User permissions and roles
4. ✅ Component selection states
5. ✅ Connection status (Figma, APIs, etc.)
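
To see what changed between two snapshots, their `state` objects can be diffed key by key. The sketch below assumes the state is plain JSON-like data, as in the sample snapshot; `diffState` and `isPlainObject` are hypothetical helpers that are not part of `workflow-persistence.js`, and the two states are made up for illustration.

```javascript
// Hypothetical helpers (not part of workflow-persistence.js).
const isPlainObject = value =>
  value !== null && typeof value === 'object' && !Array.isArray(value);

// List every path whose value differs between two snapshot states.
function diffState(before, after, prefix = '') {
  const changes = [];
  const keys = new Set([...Object.keys(before), ...Object.keys(after)]);
  for (const key of keys) {
    const path = prefix ? `${prefix}.${key}` : key;
    const a = before[key];
    const b = after[key];
    if (isPlainObject(a) && isPlainObject(b)) {
      // Recurse into nested objects such as `user` or `settings`.
      changes.push(...diffState(a, b, path));
    } else if (JSON.stringify(a) !== JSON.stringify(b)) {
      changes.push({ path, before: a, after: b });
    }
  }
  return changes;
}

// Made-up states modeled on the sample snapshot.
const stateBefore = {
  currentPage: '/',
  user: { id: '123', role: 'TEAM_LEAD' },
  figmaConnected: true,
};
const stateAfter = {
  currentPage: '/settings',
  user: { id: '123', role: 'TEAM_LEAD' },
  figmaConnected: false,
};

console.log(diffState(stateBefore, stateAfter));
```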

---

## Layer 3: Error Recovery Inspection

**Purpose**: See what crashed and how to recover

**Location**: `admin-ui/js/core/error-recovery.js`

**Available Data**:
```javascript
// Crash detection:
{
  crashDetected: true,
  lastActivityTime: 1733425200000,
  timeSinceCrash: 45000, // ms
  errorCategory: "PERMISSION_DENIED",
  error: {
    type: "Error",
    message: "User does not have permission to access settings",
    stack: "..."
  },
  recoveryPoints: [
    { id: "rp-1", timestamp: 1733425100000, label: "before-api-call" },
    { id: "rp-2", timestamp: 1733425150000, label: "before-state-update" }
  ]
}
```

**Inspection Query**:
```javascript
// From browser console:
window.__DSS_DEBUG.errorRecovery.getCrashReport()
// Returns full analysis

// Recover to specific point:
window.__DSS_DEBUG.errorRecovery.recover("rp-2")
```

**What to Look For**:
1. ✅ Error type and message
2. ✅ When crash occurred
3. ✅ Available recovery points
4. ✅ Stack trace for root cause
5. ✅ Retry strategies
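
When choosing which recovery point to pass to `recover()`, one reasonable rule is to take the most recent point recorded before the last activity time. The sketch below assumes the crash report has the shape shown above and makes no assumption about the order of `recoveryPoints`; `latestRecoveryPoint` is a hypothetical helper, not part of `error-recovery.js`.

```javascript
// Hypothetical helper (not part of error-recovery.js).
// Pick the newest recovery point that is not later than the last recorded activity.
function latestRecoveryPoint(report) {
  const candidates = (report.recoveryPoints || [])
    .filter(point => point.timestamp <= report.lastActivityTime);
  if (candidates.length === 0) return null;
  return candidates.reduce((latest, point) =>
    point.timestamp > latest.timestamp ? point : latest);
}

// Values copied from the sample crash report, deliberately listed out of order.
const sampleReport = {
  lastActivityTime: 1733425200000,
  recoveryPoints: [
    { id: 'rp-2', timestamp: 1733425150000, label: 'before-state-update' },
    { id: 'rp-1', timestamp: 1733425100000, label: 'before-api-call' },
  ],
};

const chosen = latestRecoveryPoint(sampleReport);
console.log(chosen ? chosen.id : 'no recovery point available');
```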

---

## Layer 4: Browser Console Analysis

**Purpose**: Raw JavaScript and network debugging

### JavaScript Errors
```javascript
// In browser console (F12):
// Look for:
// - Red error messages (uncaught exceptions)
// - Stack traces with file names and line numbers
// - Related warnings (yellow)

// Filter the console output by timestamp or error pattern,
// and watch the Network tab for failed requests
```

### Network Requests
```
Network Tab (F12 → Network):
- Filter by: XHR/Fetch
- Look for: 4xx/5xx responses
- Check: Request headers, response payload
- Analyze: Response time and size
```

### Performance
```
Performance Tab (F12 → Performance):
- Record user interaction
- Look for: Long tasks (red)
- Check: JavaScript execution time
- Analyze: Paint times and layout shifts
```

---

## Layer 5: Debug Inspector (To Be Built)

**Purpose**: Unified debugging dashboard within DSS

**Proposed Interface**:
```
┌─────────────────────────────────────────────┐
│ DSS Debug Inspector (Ctrl+Alt+D)            │
├─────────────────────────────────────────────┤
│                                             │
│ 📊 Dashboard                                │
│ ├─ Current State Snapshot                   │
│ ├─ Recent Audit Logs (last 20)              │
│ ├─ Last Error (if any)                      │
│ └─ Recovery Points Available                │
│                                             │
│ 🔍 Audit Logs                               │
│ ├─ Filter by action/level/time              │
│ ├─ Search by keyword                        │
│ ├─ View full details                        │
│ └─ Export as JSON                           │
│                                             │
│ 💾 Snapshots                                │
│ ├─ List all saved snapshots                 │
│ ├─ Compare two snapshots                    │
│ ├─ Restore to snapshot                      │
│ └─ Download snapshot                        │
│                                             │
│ ⚠️ Errors                                   │
│ ├─ Show crash report                        │
│ ├─ View error timeline                      │
│ ├─ Analyze error patterns                   │
│ └─ Retry from recovery point                │
│                                             │
│ ⚡ Performance                              │
│ ├─ Page load metrics                        │
│ ├─ API response times                       │
│ ├─ Component render times                   │
│ └─ Memory usage                             │
│                                             │
│ 🔐 Permissions                              │
│ ├─ Current user role                        │
│ ├─ Available actions                        │
│ ├─ Denied actions (reason)                  │
│ └─ Team permissions                         │
│                                             │
└─────────────────────────────────────────────┘
```

---

## Debugging Workflow: Step-by-Step

### Scenario 1: Dashboard Not Loading

1. **Check Browser Console** (F12)
   - Any JavaScript errors?
   - Any failed network requests?

2. **View Audit Logs**
   ```javascript
   window.__DSS_DEBUG.auditLogger.getStats()
   // Shows counts by category
   ```

3. **Check Last Snapshot**
   ```javascript
   const snapshots = window.__DSS_DEBUG.workflowPersistence.getSnapshots();
   const last = snapshots[snapshots.length - 1];
   console.log(last.state); // See state before crash
   ```

4. **Check Error Recovery**
   ```javascript
   window.__DSS_DEBUG.errorRecovery.getCrashReport()
   // Full analysis
   ```

5. **Attempt Recovery**
   ```javascript
   const report = window.__DSS_DEBUG.errorRecovery.getCrashReport();
   if (report.recoveryPoints.length > 0) {
     window.__DSS_DEBUG.errorRecovery.recover(
       report.recoveryPoints[report.recoveryPoints.length - 1].id
     );
   }
   ```

---

### Scenario 2: Permission Denied on Feature

1. **Check Current User Role**
   ```javascript
   // Read the latest snapshot without mutating the returned array
   const snapshot = window.__DSS_DEBUG.workflowPersistence.getSnapshots().at(-1);
   console.log(snapshot.state.user.role);
   ```

2. **View Permission Checks**
   ```javascript
   window.__DSS_DEBUG.auditLogger.getLogs({
     action: 'permission_check',
     details: { action: 'access_settings' }
   });
   ```

3. **See Denied Reasons**
   ```javascript
   window.__DSS_DEBUG.auditLogger.getLogs({
     level: 'warning',
     category: 'permission'
   });
   ```

4. **Compare Required vs Available**
   - Check `route-guards.js` for permission mappings
   - See what role is required vs what role the user has
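
The comparison in step 4 can be sketched as a lookup against a role ranking. Everything in this example apart from the `TEAM_LEAD` role name is an assumption: the `VIEWER` and `ADMIN` roles, the rank order, the route-to-role table, and the `compareAccess` helper are hypothetical stand-ins for whatever `route-guards.js` actually defines.

```javascript
// Hypothetical role ordering and route requirements, for illustration only.
// The real mappings live in route-guards.js and may look quite different.
const ROLE_RANK = { VIEWER: 0, TEAM_LEAD: 1, ADMIN: 2 };
const ROUTE_REQUIRED_ROLE = { '/': 'VIEWER', '/settings': 'ADMIN' };

// Explain whether `userRole` satisfies the role required for `route`.
function compareAccess(userRole, route, requiredByRoute = ROUTE_REQUIRED_ROLE, rank = ROLE_RANK) {
  const required = requiredByRoute[route];
  if (required === undefined) {
    return { route, userRole, allowed: false, reason: `no mapping for ${route}` };
  }
  if (!(userRole in rank) || !(required in rank)) {
    return { route, userRole, required, allowed: false, reason: 'unknown role' };
  }
  const allowed = rank[userRole] >= rank[required];
  const reason = allowed
    ? 'role is sufficient'
    : `${userRole} ranks below ${required}`;
  return { route, userRole, required, allowed, reason };
}

console.log(compareAccess('TEAM_LEAD', '/settings'));
console.log(compareAccess('ADMIN', '/settings'));
```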

---

### Scenario 3: API Call Failing

1. **Find API Call in Audit Log**
   ```javascript
   window.__DSS_DEBUG.auditLogger.getLogs({
     action: 'api_call',
     timeRange: {
       start: Date.now() - 60000,  // Last minute
       end: Date.now()
     }
   });
   ```

2. **Check Network Tab**
   - Filter by /api/*
   - Look for failed requests (red)
   - Check response for error message

3. **View State Before Call**
   ```javascript
   // Find snapshot just before API error
   // Compare state to understand what was sent
   ```

4. **Check Retry Logic**
   ```javascript
   window.__DSS_DEBUG.errorRecovery.getLogs()
   // See if retries occurred and outcomes
   ```
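
Step 3 above, finding the snapshot taken just before an API error, could look roughly like this. The helpers `snapshotBefore` and `stateBeforeFailures` are hypothetical, the sample data is made up, and the sketch assumes that failed calls are logged with `action: 'api_call'` and `level: 'error'` as in the Layer 1 query.

```javascript
// Hypothetical helpers, for illustration only.

// Return the newest snapshot taken strictly before `timestamp`, or null.
function snapshotBefore(snapshots, timestamp) {
  let best = null;
  for (const snapshot of snapshots) {
    if (snapshot.timestamp < timestamp &&
        (best === null || snapshot.timestamp > best.timestamp)) {
      best = snapshot;
    }
  }
  return best;
}

// Pair every failed API call with the snapshot that preceded it.
function stateBeforeFailures(logs, snapshots) {
  return logs
    .filter(entry => entry.action === 'api_call' && entry.level === 'error')
    .map(entry => ({ entry, snapshot: snapshotBefore(snapshots, entry.timestamp) }));
}

// Made-up sample data.
const sampleSnapshots = [
  { id: 'snapshot-1', timestamp: 1000, label: 'page-load', state: { currentPage: '/' } },
  { id: 'snapshot-2', timestamp: 2500, label: 'before-save', state: { currentPage: '/settings' } },
  { id: 'snapshot-3', timestamp: 4000, label: 'after-save', state: { currentPage: '/settings' } },
];
const sampleApiLogs = [
  { timestamp: 3000, action: 'api_call', level: 'error', details: { endpoint: '/api/settings' } },
  { timestamp: 3500, action: 'api_call', level: 'info', details: { endpoint: '/api/tokens' } },
];

for (const { entry, snapshot } of stateBeforeFailures(sampleApiLogs, sampleSnapshots)) {
  console.log(entry.details.endpoint, '<-', snapshot ? snapshot.label : 'no earlier snapshot');
}
```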

---

## Methodology Principles

### 1. **Layered Investigation**
Start at the highest level (audit logs) and drill down (network, code)

### 2. **Timeline Analysis**
Look at events in sequence to understand causality

### 3. **State Snapshots**
Capture state before, during, and after issues

### 4. **Permission Auditing**
Check role-based access at each step

### 5. **Error Categorization**
Group errors by type to identify patterns

### 6. **Recovery Strategy**
Always attempt recovery from a known good state

### 7. **Non-Invasive**
Debugging tools should not modify system state

---

## Browser DevTools Integration

### Keyboard Shortcuts
```
F12           - Open DevTools
Ctrl+Shift+K  - Open Web Console (Firefox)
Ctrl+Alt+D    - Open DSS Debug Inspector (when implemented)
Ctrl+Shift+J  - Open Console (Chrome/Edge); opens the separate Browser Console window in Firefox
Ctrl+Shift+I  - Open DevTools (alternate)
```

### Console Commands
```javascript
// Get debug namespace
window.__DSS_DEBUG

// Quick audit log view
window.__DSS_DEBUG.auditLogger.getLogs().slice(-10)

// Quick state view (at(-1) reads the last snapshot without removing it)
window.__DSS_DEBUG.workflowPersistence.getSnapshots().at(-1)?.state

// Quick error view
window.__DSS_DEBUG.errorRecovery.getCrashReport()

// Export for analysis
JSON.stringify(window.__DSS_DEBUG.auditLogger.getLogs(), null, 2)
```
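
The three quick views can also be merged into one chronological list, which is what the Timeline Analysis principle calls for. The sketch below is an assumption-laden illustration: `buildTimeline` is a hypothetical helper, and it only relies on the `timestamp`, `action`, and `label` fields shown in the sample data of Layers 1 to 3.

```javascript
// Hypothetical helper: merge audit logs, snapshots, and recovery points
// into a single list ordered by timestamp.
function buildTimeline({ logs = [], snapshots = [], recoveryPoints = [] }) {
  const events = [
    ...logs.map(e => ({ timestamp: e.timestamp, source: 'audit', label: e.action })),
    ...snapshots.map(s => ({ timestamp: s.timestamp, source: 'snapshot', label: s.label })),
    ...recoveryPoints.map(r => ({ timestamp: r.timestamp, source: 'recovery', label: r.label })),
  ];
  // Sorting the freshly built array leaves the three inputs untouched.
  return events.sort((a, b) => a.timestamp - b.timestamp);
}

// Made-up sample data.
const timeline = buildTimeline({
  logs: [
    { timestamp: 1000, action: 'page_change' },
    { timestamp: 3000, action: 'api_call' },
  ],
  snapshots: [{ timestamp: 2000, label: 'before-api-call' }],
  recoveryPoints: [{ timestamp: 2500, label: 'before-state-update' }],
});

for (const event of timeline) {
  console.log(event.timestamp, event.source, event.label);
}
```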

---

## Browser Log Reader Implementation

### What We Can Capture

```javascript
class BrowserLogReader {
  constructor() {
    // Shared buffers so exportLogs() can see what the capture methods collect
    this.logs = [];
    this.errors = [];
  }

  // Capture console logs by wrapping console.log and console.error
  captureConsoleLogs() {
    const originalLog = console.log;
    const originalError = console.error;

    console.log = (...args) => {
      this.logs.push({
        level: 'log',
        timestamp: Date.now(),
        message: args.join(' ')
      });
      originalLog(...args);
    };

    console.error = (...args) => {
      this.logs.push({
        level: 'error',
        timestamp: Date.now(),
        message: args.join(' '),
        stack: new Error().stack
      });
      originalError(...args);
    };

    return this.logs;
  }

  // Capture unhandled errors and unhandled promise rejections
  captureErrors() {
    window.addEventListener('error', (event) => {
      this.errors.push({
        type: 'error',
        message: event.message,
        file: event.filename,
        line: event.lineno,
        column: event.colno,
        stack: event.error?.stack
      });
    });

    window.addEventListener('unhandledrejection', (event) => {
      this.errors.push({
        type: 'unhandledRejection',
        reason: event.reason,
        stack: event.reason?.stack
      });
    });

    return this.errors;
  }

  // Export for analysis
  exportLogs() {
    return {
      consoleLogs: this.logs,
      errors: this.errors,
      timestamp: new Date().toISOString(),
      url: window.location.href
    };
  }
}
```

---

## Next Steps: Build Debug Inspector

### Phase 1: Console Commands (Week 1)
- [ ] Expose audit logger to `window.__DSS_DEBUG`
- [ ] Expose workflow persistence to `window.__DSS_DEBUG`
- [ ] Expose error recovery to `window.__DSS_DEBUG`
- [ ] Create helper functions for common queries

### Phase 2: Dashboard UI (Week 2)
- [ ] Create Debug Inspector component
- [ ] Build audit log viewer
- [ ] Build snapshot manager
- [ ] Build error analyzer

### Phase 3: Advanced Features (Week 3)
- [ ] State comparison (before/after)
- [ ] Timeline visualization
- [ ] Performance profiling
- [ ] Log export/import
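
As a starting point for Phase 1, the namespace could be registered with a small function like the one below. This is a sketch under assumptions, not the planned implementation: `exposeDebugNamespace`, the `helpers.recentErrors` query, and the stub logger are hypothetical, and the only API taken from this document is `auditLogger.getLogs()` with a `level` filter. In the browser the `target` argument would be `window`; a plain object is used here so the example runs anywhere.

```javascript
// Hypothetical wiring for Phase 1. The module objects are supplied by the caller.
function exposeDebugNamespace(target, modules) {
  const helpers = {
    // Helper for a common query: the last `n` error-level audit entries.
    recentErrors: (n = 10) =>
      modules.auditLogger.getLogs({ level: 'error' }).slice(-n),
  };
  // Freeze the namespace so console experiments cannot replace its members.
  const namespace = Object.freeze({ ...modules, helpers: Object.freeze(helpers) });
  Object.defineProperty(target, '__DSS_DEBUG', {
    value: namespace,
    writable: false,
    enumerable: false,
    configurable: true, // permits re-registration, e.g. after a reload of the modules
  });
  return namespace;
}

// Stub logger standing in for the real audit logger.
const stubEntries = [
  { timestamp: 1, action: 'page_change', level: 'info' },
  { timestamp: 2, action: 'api_call', level: 'error' },
];
const stubLogger = {
  getLogs: (query = {}) =>
    stubEntries.filter(e => !query.level || e.level === query.level),
};

const fakeWindow = {}; // in the browser, pass `window` instead
exposeDebugNamespace(fakeWindow, { auditLogger: stubLogger });
console.log(fakeWindow.__DSS_DEBUG.helpers.recentErrors(5));
```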

---

## Usage Example

```javascript
// 1. User reports: "Dashboard is slow"

// 2. In browser console:
const logs = window.__DSS_DEBUG.auditLogger.getLogs({
  action: 'api_call'
});

// 3. Find slow requests (over 1000 ms):
const slowLogs = logs.filter(log => log.details.duration > 1000);
slowLogs.forEach(log => {
  console.log(`Slow API: ${log.details.endpoint} (${log.details.duration}ms)`);
});
const slowLog = slowLogs[0]; // Inspect the first slow request

// 4. Check state at time of slowness (snapshot taken within 1 s before the call):
const snapshots = window.__DSS_DEBUG.workflowPersistence.getSnapshots();
const snapshotAtTime = slowLog && snapshots.find(s =>
  s.timestamp > slowLog.timestamp - 1000 &&
  s.timestamp < slowLog.timestamp
);

// 5. Analyze:
if (slowLog) {
  console.log({
    slowEndpoint: slowLog.details.endpoint,
    stateAtTime: snapshotAtTime?.state,
    duration: slowLog.details.duration
  });
}
```

---

## Benefits of This Approach

1. **Self-Contained**: No external tools needed
2. **Historical**: Full audit trail of recorded actions
3. **Stateful**: Can see the exact state at any saved snapshot
4. **Safe**: Inspection does not modify system state
5. **Complete**: Covers all layers, from audit logs down to the browser console
6. **Fast**: Instant access to information from the browser console

---

## Conclusion

DSS now has all the infrastructure needed for comprehensive self-debugging:
- ✅ Audit logger tracks all actions
- ✅ Workflow persistence captures state
- ✅ Error recovery analyzes crashes
- ✅ Route guards log permissions

Next step: Build Debug Inspector UI to visualize this data.