# Workflow 04: Debug Workflow Execution
**Purpose**: Meta-workflow for debugging the workflow system itself and MCP tool integration
**When to Use**:
- Workflow execution fails
- MCP tools not responding
- Workflow steps don't produce expected results
- API endpoints returning errors
- Communication between layers broken
**Estimated Time**: 10-20 minutes
---
## Prerequisites
- Understanding of 3-layer architecture (Browser → API → MCP)
- Access to server logs
- MCP server running
- API server running
---
## Architecture Overview
```
┌────────────────────────────────────────────────────────────┐
│ Layer 1: Browser (JavaScript)                              │
│ - browser-logger.js: Captures logs → sessionStorage        │
│ - window.__DSS_BROWSER_LOGS API                            │
└────────────────────────────────────────────────────────────┘
┌────────────────────────────────────────────────────────────┐
│ Layer 2: API Server (Python/FastAPI)                       │
│ - POST /api/browser-logs: Receive logs from browser        │
│ - GET /api/browser-logs/:id: Retrieve logs                 │
│ - GET /api/debug/diagnostic: System status                 │
│ - GET /api/debug/workflows: List workflows                 │
└────────────────────────────────────────────────────────────┘
┌────────────────────────────────────────────────────────────┐
│ Layer 3: MCP Server (Python/MCP)                           │
│ - dss_get_browser_diagnostic: Get browser summary          │
│ - dss_get_browser_errors: Get browser errors               │
│ - dss_get_browser_network: Get network requests            │
│ - dss_get_server_diagnostic: Get server health             │
│ - dss_run_workflow: Execute workflow                       │
│ - dss_list_workflows: List available workflows             │
└────────────────────────────────────────────────────────────┘
```
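The browser-log payload keeps the same shape at every layer. A minimal sketch of that shape as Python type hints (field names follow the examples later in this workflow; the `BrowserLog`/`LogExport` names and the `timestamp` field are illustrative, not actual DSS types):
```python
from typing import TypedDict

class BrowserLog(TypedDict, total=False):
    """One captured console entry (illustrative shape)."""
    level: str      # "log", "error", "warn", ...
    message: str
    timestamp: str  # assumed field; not shown in the examples below

class LogExport(TypedDict, total=False):
    """Payload produced by window.__DSS_BROWSER_LOGS.export() and POSTed to /api/browser-logs."""
    sessionId: str
    exportedAt: str
    logs: list[BrowserLog]
    diagnostic: dict
```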
---
## Step-by-Step Procedure
### Step 1: Verify All Services Running
**Check API Server**:
```bash
# Check if running
systemctl status dss-api
# Or check process
ps aux | grep "uvicorn.*tools.api.server:app"
# Check port listening
lsof -i :3456
```
**Expected**:
```
● dss-api.service - DSS API Server
     Loaded: loaded
     Active: active (running)
```
**If not running**:
```bash
# Start service
systemctl start dss-api
# Or run manually for debugging
cd /home/overbits/dss
uvicorn tools.api.server:app --host 0.0.0.0 --port 3456
```
---
**Check MCP Server**:
```bash
# Check if running
systemctl status dss-mcp
# Or check process
ps aux | grep "tools.dss_mcp.server"
# Check port listening
lsof -i :3457
```
**Expected**:
```
● dss-mcp.service - DSS MCP Server
     Loaded: loaded
     Active: active (running)
```
**If not running**:
```bash
# Start service
systemctl start dss-mcp
# Or run manually for debugging
cd /home/overbits/dss
python3 -m tools.dss_mcp.server
```
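To check both ports in one go, a small port probe also works (a sketch; assumes the services listen on localhost:3456 and localhost:3457 as above):
```python
import socket

# Ports used in this guide: 3456 = API server, 3457 = MCP server.
SERVICES = {"dss-api": 3456, "dss-mcp": 3457}

for name, port in SERVICES.items():
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(2)
        status = "listening" if sock.connect_ex(("localhost", port)) == 0 else "NOT listening"
        print(f"{name:8s} port {port}: {status}")
```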
---
### Step 2: Test Each Layer Independently
**Layer 1 - Browser Test**:
```javascript
// In browser console at https://dss.overbits.luz.uy/
// Check logger loaded
console.log(window.__DSS_BROWSER_LOGS ? '✅ Logger loaded' : '❌ Logger not loaded');
// Test logging
console.log('Test log');
console.error('Test error');
// Check logs captured
const logs = window.__DSS_BROWSER_LOGS.all();
console.log(`Captured ${logs.length} logs`);
// Test export
const exported = window.__DSS_BROWSER_LOGS.export();
console.log('Export structure:', Object.keys(exported));
```
**Expected**:
```
✅ Logger loaded
Captured 5 logs
Export structure: ["sessionId", "exportedAt", "logs", "diagnostic"]
```
---
**Layer 2 - API Test**:
```bash
# Test health endpoint
curl http://localhost:3456/health
# Test debug diagnostic
curl http://localhost:3456/api/debug/diagnostic
# Test workflows list
curl http://localhost:3456/api/debug/workflows
# Test browser logs receive (POST)
curl -X POST http://localhost:3456/api/browser-logs \
  -H "Content-Type: application/json" \
  -d '{
    "sessionId": "test-123",
    "logs": [{"level": "log", "message": "test"}]
  }'
# Test browser logs retrieve (GET)
curl http://localhost:3456/api/browser-logs/test-123
```
**Expected**:
```json
{"status": "healthy", "database": "ok", "mcp": "ok"}
{"status": "healthy", "browser_sessions": 1, ...}
{"workflows": ["01-capture-browser-logs", ...]}
{"status": "stored", "sessionId": "test-123"}
{"sessionId": "test-123", "logs": [...]}
```
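The same checks can be scripted with only the standard library (a sketch; endpoint paths are the ones listed above, adjust the host if the API is not local):
```python
import json
import urllib.request

BASE = "http://localhost:3456"

def get_json(path: str) -> dict:
    """GET an API endpoint and decode the JSON body."""
    with urllib.request.urlopen(BASE + path, timeout=5) as resp:
        return json.loads(resp.read())

for path in ("/health", "/api/debug/diagnostic", "/api/debug/workflows"):
    try:
        print(f"{path}: {get_json(path)}")
    except Exception as exc:  # connection refused, 404, invalid JSON, ...
        print(f"{path}: FAILED ({exc})")
```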
---
**Layer 3 - MCP Test**:
```bash
# Test MCP server directly (if running on localhost:3457)
curl http://localhost:3457/tools
# Or use Claude Code to test MCP tools
# (These are available as mcp__dss__get_browser_diagnostic etc.)
```
**Expected**: List of available MCP tools
---
### Step 3: Test Data Flow End-to-End
**Full Flow Test**:
1. **Generate browser logs**:
```javascript
// In browser console
console.log('Flow test: Step 1');
console.error('Flow test: Error');
fetch('/api/config');
```
2. **Export from browser**:
```javascript
const logs = window.__DSS_BROWSER_LOGS.export();
console.log('Exported', logs.logs.length, 'logs');
```
3. **Upload to API**:
```javascript
fetch('/api/browser-logs', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify(logs)
}).then(r => r.json()).then(result => {
  console.log('Uploaded:', result.sessionId);
  return result.sessionId;
});
```
4. **Retrieve via API**:
```bash
# Use session ID from step 3
curl http://localhost:3456/api/browser-logs/<SESSION_ID>
```
5. **Access via MCP** (from Claude Code):
```
Use tool: dss_get_browser_diagnostic
```
**Expected**: Data flows through all 3 layers successfully
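The browser half of this flow has to run in the dashboard, but the API half (steps 3-4) can be simulated with a synthetic payload. A minimal sketch, assuming the endpoints behave as described above:
```python
import json
import urllib.request

BASE = "http://localhost:3456"
SESSION_ID = "flow-test-123"  # synthetic session, stands in for the browser's real session ID

# Step 3 equivalent: upload a payload shaped like window.__DSS_BROWSER_LOGS.export().
payload = {
    "sessionId": SESSION_ID,
    "logs": [
        {"level": "log", "message": "Flow test: Step 1"},
        {"level": "error", "message": "Flow test: Error"},
    ],
}
req = urllib.request.Request(
    BASE + "/api/browser-logs",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req, timeout=5) as resp:
    print("stored:", json.loads(resp.read()))

# Step 4 equivalent: read the logs back by session ID.
with urllib.request.urlopen(f"{BASE}/api/browser-logs/{SESSION_ID}", timeout=5) as resp:
    retrieved = json.loads(resp.read())
print("retrieved", len(retrieved.get("logs", [])), "logs for", retrieved.get("sessionId"))
```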
---
### Step 4: Debug Common Workflow Issues
#### Issue 1: Workflow Not Found
**Symptoms**:
```bash
curl http://localhost:3456/api/debug/workflows
# Returns empty array or missing workflow
```
**Diagnosis**:
```bash
# Check if workflow files exist
ls -la /home/overbits/dss/.dss/WORKFLOWS/
# Ensure files are readable (fix permissions if needed)
chmod 644 .dss/WORKFLOWS/*.md
```
**Solution**:
- Ensure workflow .md files exist in `.dss/WORKFLOWS/`
- Check filenames match expected pattern: `NN-workflow-name.md`
- Restart API server to reload workflow list
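To see which files the API server should be picking up, a quick discovery check helps (a sketch of the expected naming rule; the real loader lives in the API server and may differ):
```python
import re
from pathlib import Path

WORKFLOWS_DIR = Path("/home/overbits/dss/.dss/WORKFLOWS")
PATTERN = re.compile(r"^\d{2}-[a-z0-9-]+\.md$")  # assumed NN-workflow-name.md naming rule

for path in sorted(WORKFLOWS_DIR.glob("*.md")):
    ok = bool(PATTERN.match(path.name))
    print(f"{'OK ' if ok else 'BAD'} {path.name}")
```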
---
#### Issue 2: MCP Tool Returns Error
**Symptoms**:
```
Error: Connection refused
Error: Tool not found
Error: Invalid parameters
```
**Diagnosis**:
```bash
# Check MCP server logs
journalctl -u dss-mcp -n 50
# Check API server logs
journalctl -u dss-api -n 50
# Test API endpoint directly
curl http://localhost:3456/api/debug/diagnostic
```
**Solution**:
- If connection refused: Start MCP server
- If tool not found: Check tool registration in `tools/dss_mcp/server.py`
- If invalid parameters: Check tool schema matches input
- Check for Python exceptions in logs
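For reference, tool registration with the MCP Python SDK generally looks like the sketch below (uses the SDK's `FastMCP` helper; the actual setup and transport in `tools/dss_mcp/server.py` may differ). A tool whose registered name or parameter schema doesn't match the call surfaces as "Tool not found" or "Invalid parameters":
```python
from pathlib import Path
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("dss")

@mcp.tool()
def dss_list_workflows() -> list[str]:
    """List available workflow files (simplified body for illustration)."""
    return sorted(p.stem for p in Path(".dss/WORKFLOWS").glob("*.md"))

if __name__ == "__main__":
    mcp.run()  # transport configuration omitted; the real server listens on port 3457
```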
---
#### Issue 3: Browser Logs Not Captured
**Symptoms**:
```javascript
window.__DSS_BROWSER_LOGS.all()
// Returns []
```
**Diagnosis**:
```javascript
// Check if logger initialized
console.log(window.__DSS_BROWSER_LOGGER);
// Check sessionStorage
console.log(sessionStorage.getItem('dss-browser-logs-' + window.__DSS_BROWSER_LOGGER?.sessionId));
// Check for errors
window.__DSS_BROWSER_LOGS.errors();
```
**Solution**:
- If logger undefined: Import browser-logger.js in HTML
- If sessionStorage empty: Check browser privacy settings
- If capturing but not storing: Check storage quota
- Force reload with DevTools open
---
#### Issue 4: API Endpoint Not Found
**Symptoms**:
```bash
curl http://localhost:3456/api/debug/diagnostic
# 404 Not Found
```
**Diagnosis**:
```bash
# Check if endpoint registered in server.py
grep -n "debug/diagnostic" tools/api/server.py
# Check API server started correctly
journalctl -u dss-api -n 20 | grep -i error
# Check routes registered
curl http://localhost:3456/docs
# (FastAPI auto-generated docs)
```
**Solution**:
- If endpoint not in code: Add endpoint to `tools/api/server.py`
- If server error: Fix error and restart
- If routes not registered: Check decorator syntax (`@app.get(...)`)
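For comparison, a correctly registered FastAPI route looks like this (a minimal sketch, not the real `tools/api/server.py`); if the decorator path or HTTP method doesn't match what you curl, FastAPI answers 404:
```python
from fastapi import FastAPI

app = FastAPI()

@app.get("/api/debug/diagnostic")
def debug_diagnostic() -> dict:
    """Registered via the decorator; appears in /docs and answers GET requests."""
    return {"status": "healthy"}
```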
---
### Step 5: Verify Workflow Execution
**Test Workflow Execution**:
```bash
# List available workflows
curl http://localhost:3456/api/debug/workflows
# Execute a workflow (when implemented)
curl -X POST http://localhost:3456/api/debug/workflows/01-capture-browser-logs
```
**Or via MCP**:
```
Use tool: dss_list_workflows
Use tool: dss_run_workflow with workflow_name="01-capture-browser-logs"
```
**Expected**: Workflow steps execute in order and return results
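A scripted version of the same check (a sketch; as noted above, the execute endpoint may not be implemented yet, so treat an HTTP error there as expected until it is):
```python
import json
import urllib.error
import urllib.request

BASE = "http://localhost:3456"

with urllib.request.urlopen(BASE + "/api/debug/workflows", timeout=5) as resp:
    workflows = json.loads(resp.read()).get("workflows", [])
print("available:", workflows)

if "01-capture-browser-logs" in workflows:
    req = urllib.request.Request(BASE + "/api/debug/workflows/01-capture-browser-logs", method="POST")
    try:
        with urllib.request.urlopen(req, timeout=30) as resp:
            print("result:", json.loads(resp.read()))
    except urllib.error.HTTPError as exc:
        print("execute endpoint not available yet:", exc.code)
```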
---
### Step 6: Check Persistence (Supervisord)
**Verify Services Auto-Restart**:
```bash
# Check supervisor status
supervisorctl status
# Test auto-restart
supervisorctl stop dss-api
sleep 5
supervisorctl status dss-api # Should auto-restart
# Check supervisor logs
tail -f /var/log/supervisor/dss-api-stdout.log
tail -f /var/log/supervisor/dss-mcp-stdout.log
```
**Expected**:
```
dss-api                          RUNNING   pid 12345, uptime 0:00:05
dss-mcp                          RUNNING   pid 12346, uptime 0:00:05
```
---
## Debugging Checklist
- [ ] API server running and accessible on port 3456
- [ ] MCP server running and accessible on port 3457
- [ ] Browser logger loaded in dashboard
- [ ] API health endpoint returns "healthy"
- [ ] API debug endpoints respond correctly
- [ ] Workflow files exist in `.dss/WORKFLOWS/`
- [ ] MCP tools registered and accessible
- [ ] Data flows from browser → API → MCP
- [ ] Supervisord manages services
- [ ] Services auto-restart on failure
---
## Common Error Messages
### "NameError: name 'get_connection' is not defined"
**Cause**: Missing import in server.py
**Solution**: Add to imports: `from storage.database import get_connection`
### "ModuleNotFoundError: No module named 'tools.dss_mcp.tools.debug_tools'"
**Cause**: debug_tools.py not created or not in path
**Solution**: Create `tools/dss_mcp/tools/debug_tools.py` with tool definitions
### "Address already in use"
**Cause**: Port 3456 or 3457 already in use
**Solution**: Find and kill process: `lsof -i :3456` then `kill -9 <PID>`
### "sessionStorage is not defined"
**Cause**: Running in Node.js or non-browser environment
**Solution**: Only use browser logger in actual browser context
### "Cannot read property 'diagnostic' of undefined"
**Cause**: Browser logger not loaded
**Solution**: Import browser-logger.js before other scripts
---
## Success Criteria
- ✅ All services running (API, MCP)
- ✅ Each layer tested independently
- ✅ End-to-end data flow working
- ✅ Workflows executable
- ✅ MCP tools accessible from Claude Code
- ✅ Auto-restart working via supervisord
- ✅ No errors in logs
---
## Workflow Development Guide
**Creating New Workflows**:
1. Create `.dss/WORKFLOWS/NN-workflow-name.md`
2. Use this template:
```markdown
# Workflow NN: Title
**Purpose**: What this workflow does
**When to Use**: Scenarios
**Estimated Time**: X-Y minutes
---
## Prerequisites
- Required conditions
- Required tools
---
## Step-by-Step Procedure
### Step 1: Title
**Action**: What to do
**Expected Result**: What should happen
**Troubleshooting**: If issues occur
### Step 2: Title
...
---
## Success Criteria
- ✅ Criterion 1
- ✅ Criterion 2
---
## Related Documentation
- Link 1
- Link 2
```
3. Test workflow manually following steps
4. Document actual results and refine
5. Restart API server to load new workflow
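Before restarting the API server, it can help to sanity-check the new file against the template (a hypothetical helper, not part of the DSS codebase):
```python
import sys
from pathlib import Path

# Usage: python3 check_workflow.py .dss/WORKFLOWS/NN-workflow-name.md
REQUIRED = [
    "**Purpose**:",
    "**When to Use**:",
    "## Prerequisites",
    "## Step-by-Step Procedure",
    "## Success Criteria",
]
text = Path(sys.argv[1]).read_text()
missing = [section for section in REQUIRED if section not in text]
print("OK" if not missing else f"Missing sections: {missing}")
```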
---
## Next Steps
- If all layers working: Workflows are ready to use
- If issues found: Document in `.dss/KNOWN_ISSUES.md`
- If new workflow needed: Create following template above
- Regular testing: Run this workflow monthly to verify system health
---
## Related Documentation
- `.dss/MCP_DEBUG_TOOLS_ARCHITECTURE.md` - Complete architecture spec
- `.dss/WORKFLOWS/01-capture-browser-logs.md` - Browser log workflow
- `.dss/WORKFLOWS/02-diagnose-errors.md` - Error diagnosis workflow
- `.dss/WORKFLOWS/03-debug-performance.md` - Performance workflow
---
## MCP Tool Access
**From Claude Code**:
```
Use tool: dss_list_workflows
Use tool: dss_run_workflow
Use tool: dss_get_server_diagnostic
```
These tools enable workflow execution and debugging from Claude Code.