# DSS Self-Debugging Methodology
**Purpose**: Use DSS to debug DSS using its own infrastructure
**Principle**: Self-referential debugging - the system observes itself
**Date**: 2025-12-05
---
## Overview: Self-Referential Debugging Architecture
```
DSS Running (Dashboard/Admin UI)
Audit Logger (Records all actions)
├─ User interactions
├─ API calls
├─ Permission checks
├─ State changes
└─ Errors (with stack traces)
Workflow Persistence (Snapshots)
├─ Current page state
├─ User/team context
├─ Component states
└─ Figma connection status
Error Recovery (Crash Detection)
├─ Recovery points
├─ Error categorization
├─ Stack traces
└─ Retry strategies
Browser Console (Raw Logs)
├─ JavaScript errors
├─ Network requests
├─ Component lifecycle
└─ Performance metrics
Debug Inspector (NEW - To Be Built)
├─ View audit logs
├─ View state snapshots
├─ View crash reports
├─ Analyze performance
└─ Replay state
```
---
## Layer 1: Audit Logger Inspection
**Purpose**: See what actions were taken and in what order
**Location**: `admin-ui/js/core/audit-logger.js`
**Available Data**:
```javascript
// Each log entry contains:
{
  timestamp: 1733425200000,
  sessionId: "session-1733425200000-abc123",
  action: "page_change",
  category: "navigation",
  level: "info",
  details: {
    from: "/",
    to: "/settings",
    user: "admin@example.com",
    timestamp: 1733425200000
  },
  redacted: false
}
```
**Inspection Query**:
```javascript
// Access from browser console:
window.__DSS_DEBUG.auditLogger.getLogs({
  action: 'api_call',
  level: 'error',
  timeRange: { start: Date.now() - 3600000, end: Date.now() } // last hour
})
```
**What to Look For**:
1. ✅ Sequence of actions leading to error
2. ✅ Permission checks that failed
3. ✅ API calls and their responses
4. ✅ State mutations and side effects
5. ✅ Error patterns (repeated errors)
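The exact filter semantics of `getLogs` are not spelled out in this document. As a minimal sketch, assuming each filter key must match the entry field exactly and `timeRange` is inclusive, the same query could be reproduced on an exported array of log entries. `filterLogs` and the sample entries are hypothetical and not part of DSS.

```javascript
// Hypothetical helper: filters an array of audit log entries.
// Assumes entries have the shape shown above (action, category, level, timestamp).
function filterLogs(logs, { action, category, level, timeRange } = {}) {
  return logs.filter((entry) => {
    if (action !== undefined && entry.action !== action) return false;
    if (category !== undefined && entry.category !== category) return false;
    if (level !== undefined && entry.level !== level) return false;
    if (timeRange) {
      // Assumption: both bounds are inclusive and optional.
      const { start = -Infinity, end = Infinity } = timeRange;
      if (entry.timestamp < start || entry.timestamp > end) return false;
    }
    return true;
  });
}

// Made-up entries for illustration only:
const sampleLogs = [
  { timestamp: 1000, action: 'page_change', category: 'navigation', level: 'info' },
  { timestamp: 2000, action: 'api_call', category: 'api', level: 'error' },
  { timestamp: 9000, action: 'api_call', category: 'api', level: 'error' },
];

const recentApiErrors = filterLogs(sampleLogs, {
  action: 'api_call',
  level: 'error',
  timeRange: { start: 0, end: 5000 },
});
console.log(recentApiErrors);
```

Filtering an exported copy this way leaves the live logger untouched, which keeps the inspection non-invasive.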
---
## Layer 2: Workflow Persistence Inspection
**Purpose**: See what state the application was in at different points
**Location**: `admin-ui/js/core/workflow-persistence.js`
**Available Data**:
```javascript
// Each snapshot contains:
{
  id: "snapshot-1",
  timestamp: 1733425200000,
  label: "before-settings-change",
  state: {
    currentPage: "/settings",
    user: { id: "123", role: "TEAM_LEAD" },
    team: { id: "team-1", name: "Design" },
    figmaConnected: true,
    selectedComponent: "Button",
    settings: { theme: "dark", ... }
  }
}
```
**Inspection Query**:
```javascript
// From browser console:
window.__DSS_DEBUG.workflowPersistence.getSnapshots()
// Shows all saved snapshots
// Restore to specific point:
window.__DSS_DEBUG.workflowPersistence.restoreSnapshot("snapshot-id")
```
**What to Look For**:
1. ✅ State before error occurred
2. ✅ Configuration values at time of issue
3. ✅ User permissions and roles
4. ✅ Component selection states
5. ✅ Connection status (Figma, APIs, etc.)
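To see what changed between two points in time, two snapshot states can be compared field by field. The following is a minimal sketch of such a comparison; `diffState` is a hypothetical helper, and the two sample states are made up in the shape of the snapshot shown above.

```javascript
// Hypothetical helper: lists the paths at which two snapshot states differ.
// Nested objects are compared recursively; other values are compared with !==.
function diffState(before, after, path = '') {
  const changes = [];
  const keys = new Set([...Object.keys(before ?? {}), ...Object.keys(after ?? {})]);
  for (const key of keys) {
    const a = before?.[key];
    const b = after?.[key];
    const here = path ? `${path}.${key}` : key;
    const bothObjects =
      a !== null && b !== null && typeof a === 'object' && typeof b === 'object';
    if (bothObjects) {
      changes.push(...diffState(a, b, here));
    } else if (a !== b) {
      changes.push({ path: here, before: a, after: b });
    }
  }
  return changes;
}

// Made-up states for illustration only:
const stateA = { currentPage: '/', user: { id: '123', role: 'TEAM_LEAD' }, figmaConnected: true };
const stateB = { currentPage: '/settings', user: { id: '123', role: 'TEAM_LEAD' }, figmaConnected: false };

const stateChanges = diffState(stateA, stateB);
console.log(stateChanges);
```

In the browser, the two arguments would be the `state` fields of two entries returned by `getSnapshots()`.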
---
## Layer 3: Error Recovery Inspection
**Purpose**: See what crashed and how to recover
**Location**: `admin-ui/js/core/error-recovery.js`
**Available Data**:
```javascript
// Crash detection:
{
  crashDetected: true,
  lastActivityTime: 1733425200000,
  timeSinceCrash: 45000, // ms
  errorCategory: "PERMISSION_DENIED",
  error: {
    type: "Error",
    message: "User does not have permission to access settings",
    stack: "..."
  },
  recoveryPoints: [
    { id: "rp-1", timestamp: 1733425100000, label: "before-api-call" },
    { id: "rp-2", timestamp: 1733425150000, label: "before-state-update" }
  ]
}
```
**Inspection Query**:
```javascript
// From browser console:
window.__DSS_DEBUG.errorRecovery.getCrashReport()
// Returns full analysis
// Recover to specific point:
window.__DSS_DEBUG.errorRecovery.recover("rp-2")
```
**What to Look For**:
1. ✅ Error type and message
2. ✅ When crash occurred
3. ✅ Available recovery points
4. ✅ Stack trace for root cause
5. ✅ Retry strategies
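When several recovery points are available, it helps to choose one by timestamp instead of by array position. This is a minimal sketch that assumes the crash report has the shape shown above and makes no assumption about the order of `recoveryPoints`. `pickRecoveryPoint` is a hypothetical helper.

```javascript
// Hypothetical helper: returns the newest recovery point taken at or before
// `beforeTime`, or null when none qualifies.
function pickRecoveryPoint(report, beforeTime = Infinity) {
  const points = report?.recoveryPoints ?? [];
  let best = null;
  for (const point of points) {
    if (point.timestamp <= beforeTime && (best === null || point.timestamp > best.timestamp)) {
      best = point;
    }
  }
  return best;
}

// Sample report reusing the values from the crash report above:
const sampleReport = {
  lastActivityTime: 1733425200000,
  recoveryPoints: [
    { id: 'rp-1', timestamp: 1733425100000, label: 'before-api-call' },
    { id: 'rp-2', timestamp: 1733425150000, label: 'before-state-update' },
  ],
};

const chosen = pickRecoveryPoint(sampleReport, sampleReport.lastActivityTime);
console.log(chosen && chosen.id); // rp-2
```

The returned `id` is what would be passed to `errorRecovery.recover(...)`.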
---
## Layer 4: Browser Console Analysis
**Purpose**: Raw JavaScript and network debugging
### JavaScript Errors
```javascript
// In the browser console (F12), look for:
// - Red error messages (uncaught exceptions)
// - Stack traces with file names and line numbers
// - Related warnings (yellow)
// Filter the console by timestamp or error pattern,
// and watch the Network tab for failed requests.
```
### Network Requests
```
Network Tab (F12 → Network):
- Filter by: XHR/Fetch
- Look for: 4xx/5xx responses
- Check: Request headers, response payload
- Analyze: Response time and size
```
### Performance
```
Performance Tab (F12 → Performance):
- Record user interaction
- Look for: Long tasks (red)
- Check: JavaScript execution time
- Analyze: Paint times and layout shifts
```
---
## Layer 5: Debug Inspector (To Be Built)
**Purpose**: Unified debugging dashboard within DSS
**Proposed Interface**:
```
┌─────────────────────────────────────────────┐
│ DSS Debug Inspector (Ctrl+Alt+D) │
├─────────────────────────────────────────────┤
│ │
│ 📊 Dashboard │
│ ├─ Current State Snapshot │
│ ├─ Recent Audit Logs (last 20) │
│ ├─ Last Error (if any) │
│ └─ Recovery Points Available │
│ │
│ 🔍 Audit Logs │
│ ├─ Filter by action/level/time │
│ ├─ Search by keyword │
│ ├─ View full details │
│ └─ Export as JSON │
│ │
│ 💾 Snapshots │
│ ├─ List all saved snapshots │
│ ├─ Compare two snapshots │
│ ├─ Restore to snapshot │
│ └─ Download snapshot │
│ │
│ ⚠️ Errors │
│ ├─ Show crash report │
│ ├─ View error timeline │
│ ├─ Analyze error patterns │
│ └─ Retry from recovery point │
│ │
│ ⚡ Performance │
│ ├─ Page load metrics │
│ ├─ API response times │
│ ├─ Component render times │
│ └─ Memory usage │
│ │
│ 🔐 Permissions │
│ ├─ Current user role │
│ ├─ Available actions │
│ ├─ Denied actions (reason) │
│ └─ Team permissions │
│ │
└─────────────────────────────────────────────┘
```
---
## Debugging Workflow: Step-by-Step
### Scenario 1: Dashboard Not Loading
1. **Check Browser Console** (F12)
- Any JavaScript errors?
- Any failed network requests?
2. **View Audit Logs**
```javascript
window.__DSS_DEBUG.auditLogger.getStats()
// Shows counts by category
```
3. **Check Last Snapshot**
```javascript
const snapshots = window.__DSS_DEBUG.workflowPersistence.getSnapshots();
const last = snapshots[snapshots.length - 1];
console.log(last.state); // See state before crash
```
4. **Check Error Recovery**
```javascript
window.__DSS_DEBUG.errorRecovery.getCrashReport()
// Full analysis
```
5. **Attempt Recovery**
```javascript
const report = window.__DSS_DEBUG.errorRecovery.getCrashReport();
if (report.recoveryPoints.length > 0) {
  window.__DSS_DEBUG.errorRecovery.recover(
    report.recoveryPoints[report.recoveryPoints.length - 1].id
  );
}
```
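Steps 2 to 4 above can be gathered into a single read-only report. The sketch below assumes the debug namespace exposes the methods used in this scenario (`getStats`, `getSnapshots`, `getCrashReport`). `collectTriage` and the stub object are hypothetical, and the stub stands in for `window.__DSS_DEBUG` so the code runs outside DSS.

```javascript
// Hypothetical helper: collects the Scenario 1 checks into one object.
// `debug` is expected to look like window.__DSS_DEBUG as used above.
function collectTriage(debug) {
  // Run one check; if the module or method is missing, record why instead of throwing.
  const safe = (fn) => {
    try {
      return fn();
    } catch (err) {
      return { unavailable: String(err?.message ?? err) };
    }
  };

  const snapshots = safe(() => debug.workflowPersistence.getSnapshots());
  const report = safe(() => debug.errorRecovery.getCrashReport());

  return {
    stats: safe(() => debug.auditLogger.getStats()),
    lastState: Array.isArray(snapshots) && snapshots.length > 0
      ? snapshots[snapshots.length - 1].state
      : null,
    crashReport: report,
    recoveryPointIds: Array.isArray(report?.recoveryPoints)
      ? report.recoveryPoints.map((p) => p.id)
      : [],
  };
}

// Stub with made-up return values, standing in for window.__DSS_DEBUG:
const stubDebug = {
  auditLogger: { getStats: () => ({ navigation: 3, api: 1 }) },
  workflowPersistence: { getSnapshots: () => [{ id: 'snapshot-1', state: { currentPage: '/' } }] },
  errorRecovery: { getCrashReport: () => ({ crashDetected: false, recoveryPoints: [] }) },
};

const triage = collectTriage(stubDebug);
console.log(triage);
```

In the browser this would be called as `collectTriage(window.__DSS_DEBUG)`. It deliberately stops short of step 5, since recovery changes application state and should stay an explicit action.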
---
### Scenario 2: Permission Denied on Feature
1. **Check Current User Role**
```javascript
// at(-1) reads the latest snapshot without removing it from the array
const snapshot = window.__DSS_DEBUG.workflowPersistence.getSnapshots().at(-1);
console.log(snapshot.state.user.role);
```
2. **View Permission Checks**
```javascript
window.__DSS_DEBUG.auditLogger.getLogs({
  action: 'permission_check',
  details: { action: 'access_settings' }
});
```
3. **See Denied Reasons**
```javascript
window.__DSS_DEBUG.auditLogger.getLogs({
  level: 'warning',
  category: 'permission'
});
```
4. **Compare Required vs Available**
- Check route-guards.js for permission mappings
- See what role is required vs what user has
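The comparison in step 4 can be sketched as a small function. This assumes roles form a simple linear hierarchy, which this document does not confirm. Only `TEAM_LEAD` appears here; the other role names, `ROLE_ORDER`, and `explainAccess` are hypothetical placeholders to be replaced with the real mappings from route-guards.js.

```javascript
// Hypothetical role order, lowest to highest. Only TEAM_LEAD appears in this
// document; VIEWER and ADMIN are placeholders for illustration.
const ROLE_ORDER = ['VIEWER', 'TEAM_LEAD', 'ADMIN'];

// Hypothetical helper: explains whether `userRole` satisfies `requiredRole`,
// assuming a higher position in `order` implies all lower permissions.
function explainAccess(userRole, requiredRole, order = ROLE_ORDER) {
  const have = order.indexOf(userRole);
  const need = order.indexOf(requiredRole);
  if (have === -1 || need === -1) {
    return { allowed: false, reason: `unknown role: ${have === -1 ? userRole : requiredRole}` };
  }
  return have >= need
    ? { allowed: true, reason: `${userRole} satisfies ${requiredRole}` }
    : { allowed: false, reason: `${userRole} is below required ${requiredRole}` };
}

const accessCheck = explainAccess('TEAM_LEAD', 'ADMIN');
console.log(accessCheck.reason);
```

If the real permission model is not linear (for example, per-team grants), the comparison would need a lookup table instead of an ordered list.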
---
### Scenario 3: API Call Failing
1. **Find API Call in Audit Log**
```javascript
window.__DSS_DEBUG.auditLogger.getLogs({
  action: 'api_call',
  timeRange: {
    start: Date.now() - 60000, // Last minute
    end: Date.now()
  }
});
```
2. **Check Network Tab**
- Filter by /api/*
- Look for failed requests (red)
- Check response for error message
3. **View State Before Call**
```javascript
// Find snapshot just before API error
// Compare state to understand what was sent
```
4. **Check Retry Logic**
```javascript
window.__DSS_DEBUG.errorRecovery.getLogs()
// See if retries occurred and outcomes
```
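Step 3 describes the snapshot lookup only in comments. A minimal sketch of that lookup, assuming snapshots carry a numeric `timestamp` as shown in Layer 2, could look like this. `snapshotBefore` is a hypothetical helper and the sample data is made up.

```javascript
// Hypothetical helper: newest snapshot taken at or before `timestamp`, or null.
// Does not assume the snapshots array is sorted.
function snapshotBefore(snapshots, timestamp) {
  return snapshots
    .filter((s) => s.timestamp <= timestamp)
    .reduce(
      (latest, s) => (latest === null || s.timestamp > latest.timestamp ? s : latest),
      null
    );
}

// Made-up snapshots and error time for illustration only:
const sampleSnapshots = [
  { id: 'snapshot-1', timestamp: 100, state: { currentPage: '/' } },
  { id: 'snapshot-3', timestamp: 300, state: { currentPage: '/tokens' } },
  { id: 'snapshot-2', timestamp: 200, state: { currentPage: '/settings' } },
];
const apiErrorTime = 250;

const stateBeforeCall = snapshotBefore(sampleSnapshots, apiErrorTime);
console.log(stateBeforeCall && stateBeforeCall.id); // snapshot-2
```

The `timestamp` of the failing `api_call` entry from step 1 would be passed as the second argument.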
---
## Methodology Principles
### 1. **Layered Investigation**
Start at the highest level (audit logs) and drill down to lower levels (network, code).
### 2. **Timeline Analysis**
Look at events in sequence to understand causality
### 3. **State Snapshots**
Capture state before, during, and after issues
### 4. **Permission Auditing**
Check role-based access at each step
### 5. **Error Categorization**
Group errors by type to identify patterns
### 6. **Recovery Strategy**
Always attempt recovery from known good state
### 7. **Non-Invasive**
Inspection should not modify system state; restoring a snapshot or recovery point is the one deliberate exception.
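Principle 5 can be made concrete with a small counting helper. This is a sketch that assumes log entries carry `level` and `category` fields as shown in Layer 1. `countErrorsBy` is a hypothetical helper and the sample entries are made up.

```javascript
// Hypothetical helper: counts error-level entries per value of `key`.
function countErrorsBy(logs, key = 'category') {
  const counts = {};
  for (const entry of logs) {
    if (entry.level !== 'error') continue;
    const bucket = entry[key] ?? 'unknown';
    counts[bucket] = (counts[bucket] ?? 0) + 1;
  }
  return counts;
}

// Made-up entries for illustration only:
const mixedLogs = [
  { level: 'error', category: 'permission', action: 'permission_check' },
  { level: 'info', category: 'navigation', action: 'page_change' },
  { level: 'error', category: 'permission', action: 'permission_check' },
  { level: 'error', action: 'api_call' },
];

const errorCounts = countErrorsBy(mixedLogs);
console.log(errorCounts);
```

A category with a count well above the others is a reasonable first place to look for a repeated error pattern.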
---
## Browser DevTools Integration
### Keyboard Shortcuts
```
F12 - Open DevTools
Ctrl+Shift+I - Open DevTools (alternate)
Ctrl+Shift+K - Open Web Console (Firefox)
Ctrl+Shift+J - Open Console (Chrome/Edge); opens the separate Browser Console window in Firefox
Ctrl+Alt+D - Open DSS Debug Inspector (when implemented)
```
### Console Commands
```javascript
// Get debug namespace
window.__DSS_DEBUG
// Quick audit log view
window.__DSS_DEBUG.auditLogger.getLogs().slice(-10)
// Quick state view
window.__DSS_DEBUG.workflowPersistence.getSnapshots().at(-1).state
// Quick error view
window.__DSS_DEBUG.errorRecovery.getCrashReport()
// Export for analysis
JSON.stringify(window.__DSS_DEBUG.auditLogger.getLogs(), null, 2)
```
---
## Browser Log Reader Implementation
### What We Can Capture
```javascript
class BrowserLogReader {
  constructor() {
    // Shared buffers so exportLogs() can see what the capture methods collect
    this.logs = [];
    this.errors = [];
  }

  // Capture console logs by wrapping console.log and console.error
  captureConsoleLogs() {
    const originalLog = console.log;
    const originalError = console.error;

    console.log = (...args) => {
      this.logs.push({
        level: 'log',
        timestamp: Date.now(),
        message: args.map(String).join(' ')
      });
      originalLog.apply(console, args);
    };

    console.error = (...args) => {
      this.logs.push({
        level: 'error',
        timestamp: Date.now(),
        message: args.map(String).join(' '),
        stack: new Error().stack
      });
      originalError.apply(console, args);
    };

    return this.logs;
  }

  // Capture unhandled errors and unhandled promise rejections
  captureErrors() {
    window.addEventListener('error', (event) => {
      this.errors.push({
        type: 'error',
        message: event.message,
        file: event.filename,
        line: event.lineno,
        column: event.colno,
        stack: event.error?.stack
      });
    });

    window.addEventListener('unhandledrejection', (event) => {
      this.errors.push({
        type: 'unhandledRejection',
        reason: event.reason,
        stack: event.reason?.stack
      });
    });

    return this.errors;
  }

  // Export for analysis
  exportLogs() {
    return {
      consoleLogs: this.logs,
      errors: this.errors,
      timestamp: new Date().toISOString(),
      url: window.location.href
    };
  }
}
```
---
## Next Steps: Build Debug Inspector
### Phase 1: Console Commands (Week 1)
- [ ] Expose audit logger to window.__DSS_DEBUG
- [ ] Expose workflow persistence to window.__DSS_DEBUG
- [ ] Expose error recovery to window.__DSS_DEBUG
- [ ] Create helper functions for common queries
### Phase 2: Dashboard UI (Week 2)
- [ ] Create Debug Inspector component
- [ ] Build audit log viewer
- [ ] Build snapshot manager
- [ ] Build error analyzer
### Phase 3: Advanced Features (Week 3)
- [ ] State comparison (before/after)
- [ ] Timeline visualization
- [ ] Performance profiling
- [ ] Log export/import
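The first three Phase 1 items amount to attaching the existing modules to a single global. One possible shape for that step is sketched below; `exposeDebugNamespace` is a hypothetical helper, the module objects are stand-ins, and a plain object replaces `window` so the code runs anywhere.

```javascript
// Hypothetical helper: attaches the three modules to `target.__DSS_DEBUG`.
// Throws if any expected module is missing so a partial namespace is never exposed.
function exposeDebugNamespace(target, modules) {
  const required = ['auditLogger', 'workflowPersistence', 'errorRecovery'];
  const missing = required.filter((name) => !modules[name]);
  if (missing.length > 0) {
    throw new Error(`Missing debug modules: ${missing.join(', ')}`);
  }
  Object.defineProperty(target, '__DSS_DEBUG', {
    value: Object.freeze({ ...modules }), // the namespace itself cannot be reassigned
    writable: false,
    configurable: true, // allows re-exposing, e.g. after a reload of the modules
    enumerable: false,
  });
  return target.__DSS_DEBUG;
}

// In the app this would be called with `window`; a plain object is used here.
const fakeWindow = {};
exposeDebugNamespace(fakeWindow, {
  auditLogger: { getLogs: () => [] },
  workflowPersistence: { getSnapshots: () => [] },
  errorRecovery: { getCrashReport: () => null },
});
console.log(Object.keys(fakeWindow.__DSS_DEBUG));
```

Freezing the namespace object only prevents replacing its top-level properties. The modules themselves stay fully usable, so whether to expose them in production builds is a separate decision.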
---
## Usage Example
```javascript
// 1. User reports: "Dashboard is slow"

// 2. In browser console:
const logs = window.__DSS_DEBUG.auditLogger.getLogs({
  action: 'api_call'
});

// 3. Find slow requests (over 1000 ms):
const slowLogs = logs.filter(log => log.details?.duration > 1000);
slowLogs.forEach(log => {
  console.log(`Slow API: ${log.details.endpoint} (${log.details.duration}ms)`);
});

// 4. Check state at time of slowness (snapshot taken within the second before):
const slowLog = slowLogs[0];
const snapshots = window.__DSS_DEBUG.workflowPersistence.getSnapshots();
const snapshotAtTime = slowLog && snapshots.find(s =>
  s.timestamp > slowLog.timestamp - 1000 &&
  s.timestamp < slowLog.timestamp
);

// 5. Analyze:
if (slowLog) {
  console.log({
    slowEndpoint: slowLog.details.endpoint,
    stateAtTime: snapshotAtTime?.state,
    duration: slowLog.details.duration
  });
}
```
---
## Benefits of This Approach
1. **Self-Contained**: No external tools needed
2. **Historical**: Full audit trail of every logged action
3. **Stateful**: Can see exact state at any point
4. **Safe**: No modifications to system
5. **Complete**: Captures all layers
6. **Fast**: Instant access to information
---
## Conclusion
DSS now has all the infrastructure needed for comprehensive self-debugging:
- ✅ Audit logger tracks all actions
- ✅ Workflow persistence captures state
- ✅ Error recovery analyzes crashes
- ✅ Route guards log permissions
Next step: Expose these modules on `window.__DSS_DEBUG` (Phase 1), then build the Debug Inspector UI to visualize this data.