# Workflow 04: Debug Workflow Execution
**Purpose**: Meta-workflow for debugging the workflow system itself and MCP tool integration
**When to Use**:
- Workflow execution fails
- MCP tools not responding
- Workflow steps don't produce expected results
- API endpoints returning errors
- Communication between layers broken
**Estimated Time**: 10-20 minutes
---
## Prerequisites
- Understanding of 3-layer architecture (Browser → API → MCP)
- Access to server logs
- MCP server running
- API server running
---
## Architecture Overview
```
┌────────────────────────────────────────────────────────────┐
│ Layer 1: Browser (JavaScript)                              │
│ - browser-logger.js: Captures logs → sessionStorage        │
│ - window.__DSS_BROWSER_LOGS API                            │
└────────────────────────────────────────────────────────────┘
┌────────────────────────────────────────────────────────────┐
│ Layer 2: API Server (Python/FastAPI)                       │
│ - POST /api/browser-logs: Receive logs from browser        │
│ - GET /api/browser-logs/:id: Retrieve logs                 │
│ - GET /api/debug/diagnostic: System status                 │
│ - GET /api/debug/workflows: List workflows                 │
└────────────────────────────────────────────────────────────┘
┌────────────────────────────────────────────────────────────┐
│ Layer 3: MCP Server (Python/MCP)                           │
│ - dss_get_browser_diagnostic: Get browser summary          │
│ - dss_get_browser_errors: Get browser errors               │
│ - dss_get_browser_network: Get network requests            │
│ - dss_get_server_diagnostic: Get server health             │
│ - dss_run_workflow: Execute workflow                       │
│ - dss_list_workflows: List available workflows             │
└────────────────────────────────────────────────────────────┘
```
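The browser-log payload keeps the same shape at every layer. A minimal sketch of that shape as Python type hints (field names follow the examples later in this workflow; the `BrowserLog`/`LogExport` names and the `timestamp` field are illustrative, not actual DSS types):
```python
from typing import TypedDict

class BrowserLog(TypedDict, total=False):
    """One captured console entry (illustrative shape)."""
    level: str      # "log", "error", "warn", ...
    message: str
    timestamp: str  # assumed field; not shown in the examples below

class LogExport(TypedDict, total=False):
    """Payload produced by window.__DSS_BROWSER_LOGS.export() and POSTed to /api/browser-logs."""
    sessionId: str
    exportedAt: str
    logs: list[BrowserLog]
    diagnostic: dict
```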
---
## Step-by-Step Procedure
### Step 1: Verify All Services Running
**Check API Server**:
```bash
# Check if running
systemctl status dss-api
# Or check process
ps aux | grep "uvicorn.*tools.api.server:app"
# Check port listening
lsof -i :3456
```
**Expected**:
```
● dss-api.service - DSS API Server
     Loaded: loaded
     Active: active (running)
```
**If not running**:
```bash
# Start service
systemctl start dss-api
# Or run manually for debugging
cd /home/overbits/dss
uvicorn tools.api.server:app --host 0.0.0.0 --port 3456
```
---
**Check MCP Server**:
```bash
# Check if running
systemctl status dss-mcp
# Or check process
ps aux | grep "tools.dss_mcp.server"
# Check port listening
lsof -i :3457
```
**Expected**:
```
● dss-mcp.service - DSS MCP Server
     Loaded: loaded
     Active: active (running)
```
**If not running**:
```bash
# Start service
systemctl start dss-mcp
# Or run manually for debugging
cd /home/overbits/dss
python3 -m tools.dss_mcp.server
```
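To check both ports in one go, a small port probe also works (a sketch; assumes the services listen on localhost:3456 and localhost:3457 as above):
```python
import socket

# Ports used in this guide: 3456 = API server, 3457 = MCP server.
SERVICES = {"dss-api": 3456, "dss-mcp": 3457}

for name, port in SERVICES.items():
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(2)
        status = "listening" if sock.connect_ex(("localhost", port)) == 0 else "NOT listening"
        print(f"{name:8s} port {port}: {status}")
```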
---
### Step 2: Test Each Layer Independently
**Layer 1 - Browser Test**:
```javascript
// In browser console at https://dss.overbits.luz.uy/
// Check logger loaded
console.log(window.__DSS_BROWSER_LOGS ? '✅ Logger loaded' : '❌ Logger not loaded');
// Test logging
console.log('Test log');
console.error('Test error');
// Check logs captured
const logs = window.__DSS_BROWSER_LOGS.all();
console.log(`Captured ${logs.length} logs`);
// Test export
const exported = window.__DSS_BROWSER_LOGS.export();
console.log('Export structure:', Object.keys(exported));
```
**Expected**:
```
✅ Logger loaded
Captured 5 logs
Export structure: ["sessionId", "exportedAt", "logs", "diagnostic"]
```
---
**Layer 2 - API Test**:
```bash
# Test health endpoint
curl http://localhost:3456/health
# Test debug diagnostic
curl http://localhost:3456/api/debug/diagnostic
# Test workflows list
curl http://localhost:3456/api/debug/workflows
# Test browser logs receive (POST)
curl -X POST http://localhost:3456/api/browser-logs \
  -H "Content-Type: application/json" \
  -d '{
    "sessionId": "test-123",
    "logs": [{"level": "log", "message": "test"}]
  }'
# Test browser logs retrieve (GET)
curl http://localhost:3456/api/browser-logs/test-123
```
**Expected**:
```json
{"status": "healthy", "database": "ok", "mcp": "ok"}
{"status": "healthy", "browser_sessions": 1, ...}
{"workflows": ["01-capture-browser-logs", ...]}
{"status": "stored", "sessionId": "test-123"}
{"sessionId": "test-123", "logs": [...]}
```
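The same checks can be scripted with only the standard library (a sketch; endpoint paths are the ones listed above, adjust the host if the API is not local):
```python
import json
import urllib.request

BASE = "http://localhost:3456"

def get_json(path: str) -> dict:
    """GET an API endpoint and decode the JSON body."""
    with urllib.request.urlopen(BASE + path, timeout=5) as resp:
        return json.loads(resp.read())

for path in ("/health", "/api/debug/diagnostic", "/api/debug/workflows"):
    try:
        print(f"{path}: {get_json(path)}")
    except Exception as exc:  # connection refused, 404, invalid JSON, ...
        print(f"{path}: FAILED ({exc})")
```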
---
**Layer 3 - MCP Test**:
```bash
# Test MCP server directly (if running on localhost:3457)
curl http://localhost:3457/tools
# Or use Claude Code to test MCP tools
# (These are available as mcp__dss__get_browser_diagnostic etc.)
```
**Expected**: List of available MCP tools
---
### Step 3: Test Data Flow End-to-End
**Full Flow Test**:
1. **Generate browser logs**:
```javascript
// In browser console
console.log('Flow test: Step 1');
console.error('Flow test: Error');
fetch('/api/config');
```
2. **Export from browser**:
```javascript
const logs = window.__DSS_BROWSER_LOGS.export();
console.log('Exported', logs.logs.length, 'logs');
```
3. **Upload to API**:
```javascript
fetch('/api/browser-logs', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify(logs)
}).then(r => r.json()).then(result => {
  console.log('Uploaded:', result.sessionId);
  return result.sessionId;
});
```
4. **Retrieve via API**:
```bash
# Use session ID from step 3
curl http://localhost:3456/api/browser-logs/<SESSION_ID>
```
5. **Access via MCP** (from Claude Code):
```
Use tool: dss_get_browser_diagnostic
```
**Expected**: Data flows through all 3 layers successfully
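The browser half of this flow has to run in the dashboard, but the API half (steps 3-4) can be simulated with a synthetic payload. A minimal sketch, assuming the endpoints behave as described above:
```python
import json
import urllib.request

BASE = "http://localhost:3456"
SESSION_ID = "flow-test-123"  # synthetic session, stands in for the browser's real session ID

# Step 3 equivalent: upload a payload shaped like window.__DSS_BROWSER_LOGS.export().
payload = {
    "sessionId": SESSION_ID,
    "logs": [
        {"level": "log", "message": "Flow test: Step 1"},
        {"level": "error", "message": "Flow test: Error"},
    ],
}
req = urllib.request.Request(
    BASE + "/api/browser-logs",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req, timeout=5) as resp:
    print("stored:", json.loads(resp.read()))

# Step 4 equivalent: read the logs back by session ID.
with urllib.request.urlopen(f"{BASE}/api/browser-logs/{SESSION_ID}", timeout=5) as resp:
    retrieved = json.loads(resp.read())
print("retrieved", len(retrieved.get("logs", [])), "logs for", retrieved.get("sessionId"))
```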
---
### Step 4: Debug Common Workflow Issues
#### Issue 1: Workflow Not Found
**Symptoms**:
```bash
curl http://localhost:3456/api/debug/workflows
# Returns empty array or missing workflow
```
**Diagnosis**:
```bash
# Check if workflow files exist
ls -la /home/overbits/dss/.dss/WORKFLOWS/
# Ensure files are readable (fix permissions if needed)
chmod 644 .dss/WORKFLOWS/*.md
```
**Solution**:
- Ensure workflow .md files exist in `.dss/WORKFLOWS/`
- Check filenames match expected pattern: `NN-workflow-name.md`
- Restart API server to reload workflow list
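To see which files the API server should be picking up, a quick discovery check helps (a sketch of the expected naming rule; the real loader lives in the API server and may differ):
```python
import re
from pathlib import Path

WORKFLOWS_DIR = Path("/home/overbits/dss/.dss/WORKFLOWS")
PATTERN = re.compile(r"^\d{2}-[a-z0-9-]+\.md$")  # assumed NN-workflow-name.md naming rule

for path in sorted(WORKFLOWS_DIR.glob("*.md")):
    ok = bool(PATTERN.match(path.name))
    print(f"{'OK ' if ok else 'BAD'} {path.name}")
```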
---
#### Issue 2: MCP Tool Returns Error
**Symptoms**:
```
Error: Connection refused
Error: Tool not found
Error: Invalid parameters
```
**Diagnosis**:
```bash
# Check MCP server logs
journalctl -u dss-mcp -n 50
# Check API server logs
journalctl -u dss-api -n 50
# Test API endpoint directly
curl http://localhost:3456/api/debug/diagnostic
```
**Solution**:
- If connection refused: Start MCP server
- If tool not found: Check tool registration in `tools/dss_mcp/server.py`
- If invalid parameters: Check tool schema matches input
- Check for Python exceptions in logs
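For reference, tool registration with the MCP Python SDK generally looks like the sketch below (uses the SDK's `FastMCP` helper; the actual setup and transport in `tools/dss_mcp/server.py` may differ). A tool whose registered name or parameter schema doesn't match the call surfaces as "Tool not found" or "Invalid parameters":
```python
from pathlib import Path
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("dss")

@mcp.tool()
def dss_list_workflows() -> list[str]:
    """List available workflow files (simplified body for illustration)."""
    return sorted(p.stem for p in Path(".dss/WORKFLOWS").glob("*.md"))

if __name__ == "__main__":
    mcp.run()  # transport configuration omitted; the real server listens on port 3457
```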
---
#### Issue 3: Browser Logs Not Captured
**Symptoms**:
```javascript
window.__DSS_BROWSER_LOGS.all()
// Returns []
```
**Diagnosis**:
```javascript
// Check if logger initialized
console.log(window.__DSS_BROWSER_LOGGER);
// Check sessionStorage
console.log(sessionStorage.getItem('dss-browser-logs-' + window.__DSS_BROWSER_LOGGER?.sessionId));
// Check for errors
window.__DSS_BROWSER_LOGS.errors();
```
**Solution**:
- If logger undefined: Import browser-logger.js in HTML
- If sessionStorage empty: Check browser privacy settings
- If capturing but not storing: Check storage quota
- Force reload with DevTools open
---
#### Issue 4: API Endpoint Not Found
**Symptoms**:
```bash
curl http://localhost:3456/api/debug/diagnostic
# 404 Not Found
```
**Diagnosis**:
```bash
# Check if endpoint registered in server.py
grep -n "debug/diagnostic" tools/api/server.py
# Check API server started correctly
journalctl -u dss-api -n 20 | grep -i error
# Check routes registered
curl http://localhost:3456/docs
# (FastAPI auto-generated docs)
```
**Solution**:
- If endpoint not in code: Add endpoint to `tools/api/server.py`
- If server error: Fix error and restart
- If routes not registered: Check decorator syntax (`@app.get(...)`)
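For comparison, a correctly registered FastAPI route looks like this (a minimal sketch, not the real `tools/api/server.py`); if the decorator path or HTTP method doesn't match what you curl, FastAPI answers 404:
```python
from fastapi import FastAPI

app = FastAPI()

@app.get("/api/debug/diagnostic")
def debug_diagnostic() -> dict:
    """Registered via the decorator; appears in /docs and answers GET requests."""
    return {"status": "healthy"}
```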
---
### Step 5: Verify Workflow Execution
**Test Workflow Execution**:
```bash
# List available workflows
curl http://localhost:3456/api/debug/workflows
# Execute a workflow (when implemented)
curl -X POST http://localhost:3456/api/debug/workflows/01-capture-browser-logs
```
**Or via MCP**:
```
Use tool: dss_list_workflows
Use tool: dss_run_workflow with workflow_name="01-capture-browser-logs"
```
**Expected**: Workflow steps execute in order and return results
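A scripted version of the same check (a sketch; as noted above, the execute endpoint may not be implemented yet, so treat an HTTP error there as expected until it is):
```python
import json
import urllib.error
import urllib.request

BASE = "http://localhost:3456"

with urllib.request.urlopen(BASE + "/api/debug/workflows", timeout=5) as resp:
    workflows = json.loads(resp.read()).get("workflows", [])
print("available:", workflows)

if "01-capture-browser-logs" in workflows:
    req = urllib.request.Request(BASE + "/api/debug/workflows/01-capture-browser-logs", method="POST")
    try:
        with urllib.request.urlopen(req, timeout=30) as resp:
            print("result:", json.loads(resp.read()))
    except urllib.error.HTTPError as exc:
        print("execute endpoint not available yet:", exc.code)
```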
---
### Step 6: Check Persistence (Supervisord)
**Verify Services Auto-Restart**:
```bash
# Check supervisor status
supervisorctl status
# Test auto-restart
supervisorctl stop dss-api
sleep 5
supervisorctl status dss-api # Should auto-restart
# Check supervisor logs
tail -f /var/log/supervisor/dss-api-stdout.log
tail -f /var/log/supervisor/dss-mcp-stdout.log
```
**Expected**:
```
dss-api                          RUNNING   pid 12345, uptime 0:00:05
dss-mcp                          RUNNING   pid 12346, uptime 0:00:05
```
---
## Debugging Checklist
- [ ] API server running and accessible on port 3456
- [ ] MCP server running and accessible on port 3457
- [ ] Browser logger loaded in dashboard
- [ ] API health endpoint returns "healthy"
- [ ] API debug endpoints respond correctly
- [ ] Workflow files exist in `.dss/WORKFLOWS/`
- [ ] MCP tools registered and accessible
- [ ] Data flows from browser → API → MCP
- [ ] Supervisord manages services
- [ ] Services auto-restart on failure
---
## Common Error Messages
### "NameError: name 'get_connection' is not defined"
**Cause**: Missing import in server.py
**Solution**: Add to imports: `from storage.database import get_connection`
### "ModuleNotFoundError: No module named 'tools.dss_mcp.tools.debug_tools'"
**Cause**: debug_tools.py not created or not in path
**Solution**: Create `tools/dss_mcp/tools/debug_tools.py` with tool definitions
### "Address already in use"
**Cause**: Port 3456 or 3457 already in use
**Solution**: Find and kill process: `lsof -i :3456` then `kill -9 <PID>`
### "sessionStorage is not defined"
**Cause**: Running in Node.js or non-browser environment
**Solution**: Only use browser logger in actual browser context
### "Cannot read property 'diagnostic' of undefined"
**Cause**: Browser logger not loaded
**Solution**: Import browser-logger.js before other scripts
---
## Success Criteria
- ✅ All services running (API, MCP)
- ✅ Each layer tested independently
- ✅ End-to-end data flow working
- ✅ Workflows executable
- ✅ MCP tools accessible from Claude Code
- ✅ Auto-restart working via supervisord
- ✅ No errors in logs
---
## Workflow Development Guide
**Creating New Workflows**:
1. Create `.dss/WORKFLOWS/NN-workflow-name.md`
2. Use this template:
```markdown
# Workflow NN: Title
**Purpose**: What this workflow does
**When to Use**: Scenarios
**Estimated Time**: X-Y minutes
---
## Prerequisites
- Required conditions
- Required tools
---
## Step-by-Step Procedure
### Step 1: Title
**Action**: What to do
**Expected Result**: What should happen
**Troubleshooting**: If issues occur
### Step 2: Title
...
---
## Success Criteria
- ✅ Criterion 1
- ✅ Criterion 2
---
## Related Documentation
- Link 1
- Link 2
```
3. Test workflow manually following steps
4. Document actual results and refine
5. Restart API server to load new workflow
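Before restarting the API server, it can help to sanity-check the new file against the template (a hypothetical helper, not part of the DSS codebase):
```python
import sys
from pathlib import Path

# Usage: python3 check_workflow.py .dss/WORKFLOWS/NN-workflow-name.md
REQUIRED = [
    "**Purpose**:",
    "**When to Use**:",
    "## Prerequisites",
    "## Step-by-Step Procedure",
    "## Success Criteria",
]
text = Path(sys.argv[1]).read_text()
missing = [section for section in REQUIRED if section not in text]
print("OK" if not missing else f"Missing sections: {missing}")
```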
---
## Next Steps
- If all layers working: Workflows are ready to use
- If issues found: Document in `.dss/KNOWN_ISSUES.md`
- If new workflow needed: Create following template above
- Regular testing: Run this workflow monthly to verify system health
---
## Related Documentation
- `.dss/MCP_DEBUG_TOOLS_ARCHITECTURE.md` - Complete architecture spec
- `.dss/WORKFLOWS/01-capture-browser-logs.md` - Browser log workflow
- `.dss/WORKFLOWS/02-diagnose-errors.md` - Error diagnosis workflow
- `.dss/WORKFLOWS/03-debug-performance.md` - Performance workflow
---
## MCP Tool Access
**From Claude Code**:
```
Use tool: dss_list_workflows
Use tool: dss_run_workflow
Use tool: dss_get_server_diagnostic
```
These tools enable workflow execution and debugging from Claude Code.