# ZEN CHALLENGE: Cross-Reality Design System Audit

**A Multi-Agent Collaboration Challenge for DSS PowerTools**

---

## Challenge Overview

**Difficulty**: EXPERT

**Agents Required**: 5 (Architect, Local Scout, Staging Inspector, Production Monitor, Synthesizer)

**Estimated Duration**: 4-6 hours

**Objective**: Perform a comprehensive design system audit across three runtime environments (local, staging, production) using adaptive LOCAL/REMOTE modes, then synthesize findings into an actionable improvement plan.

**Prerequisites**:

- DSS PowerTools plugin installed
- Access to local development environment
- Access to staging.dss.overbits.luz.uy (REMOTE)
- Access to dss.overbits.luz.uy (REMOTE)
- Task queue MCP server running
- Browser-logger.js active on both remote environments

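Before assigning agents, it can help to script the connectivity part of these prerequisites. Below is a minimal preflight sketch, assuming only that the two remote hosts listed above answer plain HTTPS requests at their base URLs; it is not part of DSS PowerTools, and checks for the task-queue server and browser-logger.js are omitted because their endpoints are deployment-specific.

```python
# Hypothetical preflight check (not a DSS PowerTools command): probe the two
# remote environments named in the prerequisites before agents are dispatched.
# Only the base URLs are requested; deeper health endpoints are deployment-specific.
from urllib.request import urlopen

REMOTE_ENVIRONMENTS = {
    "staging": "https://staging.dss.overbits.luz.uy",
    "production": "https://dss.overbits.luz.uy",
}

def check_reachability(timeout: float = 5.0) -> dict:
    """Return {environment: reachable?} for each remote base URL."""
    results = {}
    for name, url in REMOTE_ENVIRONMENTS.items():
        try:
            with urlopen(url, timeout=timeout) as response:
                results[name] = response.status < 500
        except OSError:  # covers URLError, timeouts, DNS failures
            results[name] = False
    return results

if __name__ == "__main__":
    for env, ok in check_reachability().items():
        print(f"{env}: {'reachable' if ok else 'UNREACHABLE'}")
```
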
---

## Challenge Scenario

Your design system team needs to understand the current state of design token adoption, component consistency, and runtime health across all environments. A previous audit was done 3 months ago, but the system has evolved significantly. You need a fresh, comprehensive analysis.

**Business Context**:

- 47 React components in the library
- 6 teams consuming the design system
- Recent reports of "visual inconsistencies" between environments
- Performance concerns on production
- Need data-driven prioritization for Q1 improvements

**Technical Context**:

- Design tokens defined in `tokens/` directory
- Components in `components/` directory
- Storybook for visual testing
- Multiple deployment environments
- Browser logs captured via auto-sync system

---

## Agent Roles & Responsibilities

### Agent 1: ARCHITECT (Planning & Coordination)

**Mode**: N/A (coordination only)

**Responsibilities**:

1. Design the audit strategy
2. Define success criteria
3. Create audit checklist
4. Coordinate other agents via task queue
5. Monitor overall progress

**Tools Available**:

- `/dss-status` - Check mode capabilities
- Task queue for agent coordination
- Planning skills

**Deliverables**:

- Audit strategy document
- Success criteria checklist
- Agent task assignments
- Coordination plan

---

### Agent 2: LOCAL SCOUT (Local Development Audit)

**Mode**: LOCAL

**Environment**: Developer's local machine

**Responsibilities**:

1. Audit local component library
2. Extract design tokens from source
3. Capture Storybook screenshots
4. Check for hardcoded values
5. Analyze component adoption patterns

**Tools Available**:

- `/dss-config set local` - Switch to LOCAL mode
- `/dss-audit components/` - Component audit
- `/dss-extract src/ --tokens` - Token extraction
- `/dss-screenshot Button --story all` - Capture screenshots
- `/dss-analyze components/` - Pattern analysis
- Direct filesystem access via LOCAL strategy

**Deliverables**:

- Component audit report (hardcoded values, token usage)
- Extracted token inventory
- Screenshot collection from Storybook
- Local console logs
- Filesystem scan results

**Session Data to Share**:

- Local token inventory (JSON)
- Screenshot URLs/paths
- Component analysis results
- Session ID for correlation

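The exact shape of the shared payload is up to the team; the following sketch shows what the Local Scout might attach when completing its task-queue entry. All field names and values are illustrative, not a DSS schema.

```python
# Illustrative payload the Local Scout could attach to its completed task.
# Field names are hypothetical; align them with whatever the Synthesizer expects.
local_scout_payload = {
    "session_id": "local-audit-2025-12-06T10-15",   # used later for correlation
    "environment": "local",
    "tokens": {                                      # extracted token inventory
        "color.primary": "#0055ff",
        "spacing.md": "16px",
    },
    "screenshots": [                                 # Storybook capture paths
        "screenshots/Button-primary.png",
        "screenshots/Card-default.png",
    ],
    "analysis": {                                    # component analysis results
        "components_scanned": 47,
        "hardcoded_values": 12,
    },
}
```
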
---

### Agent 3: STAGING INSPECTOR (Staging Environment Audit)

**Mode**: REMOTE

**Environment**: staging.dss.overbits.luz.uy

**Responsibilities**:

1. Audit staging deployment
2. Collect browser logs from staging users
3. Capture runtime screenshots
4. Check for console errors
5. Analyze network performance
6. Compare staging vs local tokens

**Tools Available**:

- `/dss-config set remote` - Switch to REMOTE mode
- `/dss-logs --session latest` - Get browser logs
- `/dss-screenshot .dashboard` - Server-side screenshots
- `/dss-errors --last 24h` - Error analysis
- `/dss-diagnostic --performance` - Performance metrics
- Remote API strategy

**Deliverables**:

- Browser log analysis (errors, warnings)
- Staging screenshot collection
- Network performance report
- Error rate trends
- DOM snapshot analysis (Shadow State)

**Session Data to Share**:

- Session ID(s) from staging users
- Error logs categorized by severity
- Performance metrics
- Screenshot comparison baseline

---

### Agent 4: PRODUCTION MONITOR (Production Environment Audit)

**Mode**: REMOTE

**Environment**: dss.overbits.luz.uy (production)

**Responsibilities**:

1. Monitor production health
2. Analyze real user browser logs
3. Identify critical errors
4. Compare production vs staging
5. Detect visual regressions
6. Benchmark performance

**Tools Available**:

- `/dss-config set remote` - Switch to REMOTE mode
- `/dss-logs --production --last 7d` - Week of logs
- `/dss-errors --severity critical` - Critical errors only
- `/dss-diagnostic --full` - Complete system diagnostic
- `/dss-screenshot --compare staging` - Visual regression
- Remote monitoring tools

**Deliverables**:

- Production health dashboard
- Critical error report
- Visual regression findings
- Performance comparison (staging vs production)
- User impact analysis

**Session Data to Share**:

- Production session IDs
- Critical error details
- Performance deltas
- Visual regression screenshots

---

### Agent 5: SYNTHESIZER (Cross-Environment Analysis)

**Mode**: Context-aware (analyzes data from all modes)

**Responsibilities**:

1. Aggregate findings from all agents
2. Identify cross-environment discrepancies
3. Correlate errors to root causes
4. Prioritize issues by impact
5. Generate comprehensive report
6. Create Q1 improvement roadmap

**Tools Available**:

- Task queue to fetch agent outputs
- Data correlation algorithms
- Report generation
- Prioritization framework

**Deliverables**:

- Executive Summary (1 page)
- Detailed Findings Report (by category)
- Cross-Environment Discrepancy Matrix
- Prioritized Issue Backlog
- Q1 Improvement Roadmap
- Visual Regression Gallery

---

## Challenge Phases

### Phase 1: Setup & Planning (30 minutes)

**Architect Agent Tasks**:

1. Review challenge requirements
2. Create audit checklist
3. Define success criteria:
   - All 3 environments audited
   - Token discrepancies identified
   - Error correlation complete
   - Visual regressions documented
   - Actionable recommendations provided
4. Create task queue entries for each agent
5. Share coordination plan

**All Agents**:

- Verify DSS PowerTools plugin installed
- Check mode capabilities: `/dss-status`
- Confirm network access to remote environments
- Review session ID strategy

---

### Phase 2: Parallel Environment Audits (2-3 hours)

**Local Scout** (LOCAL mode):

```bash
# Switch to LOCAL mode
/dss-config set local

# Verify capabilities
/dss-status

# Audit components
/dss-audit components/ --deep

# Extract tokens
/dss-extract src/tokens/ --output local-tokens.json

# Capture Storybook screenshots
/dss-screenshot Button --story primary,secondary,disabled
/dss-screenshot Card --story default,elevated
/dss-screenshot Input --story default,error,disabled

# Analyze patterns
/dss-analyze components/ --patterns tokens,hardcoded,inconsistencies

# Check local console
/dss-logs --local --last 1h

# Export results
/dss-export-session local-audit-{timestamp}
```

**Staging Inspector** (REMOTE mode):

```bash
# Switch to REMOTE mode
/dss-config set remote

# Configure remote URL
export DSS_REMOTE_URL=https://staging.dss.overbits.luz.uy

# Verify connection
/dss-status

# Get browser logs from staging
/dss-logs --session latest --limit 500

# Find errors
/dss-errors --last 24h --severity high,critical

# Capture runtime screenshots
/dss-screenshot .dashboard --fullpage
/dss-screenshot .component-library

# Performance check
/dss-diagnostic --performance --compare local

# Get DOM snapshots (Shadow State)
/dss-snapshot --latest

# Export results
/dss-export-session staging-audit-{timestamp}
```

**Production Monitor** (REMOTE mode):

```bash
# Switch to REMOTE mode
/dss-config set remote

# Configure production URL
export DSS_REMOTE_URL=https://dss.overbits.luz.uy

# Health check
/dss-health --detailed

# Get production logs (7 days)
/dss-logs --last 7d --limit 1000

# Critical errors only
/dss-errors --severity critical --grouped

# Performance benchmarks
/dss-diagnostic --performance --baseline staging

# Visual regression vs staging
/dss-screenshot .header --compare staging
/dss-screenshot .footer --compare staging

# Monitor metrics
/dss-metrics --uptime --error-rate --performance

# Export results
/dss-export-session prod-audit-{timestamp}
```

---

### Phase 3: Data Correlation (1 hour)

**Synthesizer Agent Tasks**:

1. **Fetch Agent Data**:

```python
# Via task queue
local_data = get_task_result("local-audit-{timestamp}")
staging_data = get_task_result("staging-audit-{timestamp}")
prod_data = get_task_result("prod-audit-{timestamp}")
```

2. **Token Consistency Analysis** (a sketch of `find_discrepancies` appears after these steps):

```python
# Compare token inventories
local_tokens = local_data['tokens']
staging_tokens = extract_tokens_from_logs(staging_data['logs'])
prod_tokens = extract_tokens_from_logs(prod_data['logs'])

discrepancies = find_discrepancies(local_tokens, staging_tokens, prod_tokens)
```

3. **Error Correlation**:

```python
# Group errors by root cause
staging_errors = staging_data['errors']
prod_errors = prod_data['errors']

correlated = correlate_errors(staging_errors, prod_errors)
root_causes = identify_root_causes(correlated)
```

4. **Visual Regression Analysis**:

```python
# Compare screenshots
local_screenshots = local_data['screenshots']
staging_screenshots = staging_data['screenshots']
prod_screenshots = prod_data['screenshots']

regressions = detect_visual_regressions(
    local_screenshots,
    staging_screenshots,
    prod_screenshots
)
```

5. **Performance Comparison** (a sketch of `calculate_performance_deltas` appears after these steps):

```python
# Benchmark analysis
perf_matrix = {
    'local': local_data['performance'],
    'staging': staging_data['performance'],
    'production': prod_data['performance']
}

deltas = calculate_performance_deltas(perf_matrix)
bottlenecks = identify_bottlenecks(deltas)
```

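The steps above lean on helper functions (`find_discrepancies`, `correlate_errors`, `detect_visual_regressions`, and so on) that the Synthesizer must supply. As one possible starting point, here is a sketch of the token comparison used in step 2, assuming each inventory is a flat `{token_name: value}` mapping; adapt it if your token export is nested.

```python
# Sketch of the token comparison in step 2. Assumes flat {token_name: value}
# mappings per environment; this is not a DSS-provided helper.
def find_discrepancies(local: dict, staging: dict, production: dict) -> list[dict]:
    environments = {"local": local, "staging": staging, "production": production}
    all_tokens = set().union(*environments.values())
    discrepancies = []
    for token in sorted(all_tokens):
        values = {env: inventory.get(token) for env, inventory in environments.items()}
        missing = [env for env, value in values.items() if value is None]
        distinct = {value for value in values.values() if value is not None}
        if missing or len(distinct) > 1:
            discrepancies.append({
                "token": token,       # e.g. "spacing.md"
                "values": values,     # value per environment (None = not defined)
                "missing_in": missing,
            })
    return discrepancies
```
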
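The performance comparison in step 5 can be equally simple: percentage deltas against the local baseline plus a threshold filter. A sketch follows; the metric names and the 15% threshold are illustrative assumptions, not DSS defaults.

```python
# Sketch for step 5: percentage deltas of staging/production metrics versus the
# local baseline, plus a naive bottleneck filter. Metric names are illustrative.
def calculate_performance_deltas(perf_matrix: dict) -> dict:
    baseline = perf_matrix["local"]
    deltas = {}
    for env in ("staging", "production"):
        deltas[env] = {
            metric: (perf_matrix[env][metric] - value) / value * 100
            for metric, value in baseline.items()
            if metric in perf_matrix[env] and value
        }
    return deltas

def identify_bottlenecks(deltas: dict, threshold_pct: float = 15.0) -> list[tuple]:
    """Return (environment, metric, delta_pct) entries exceeding the threshold."""
    return [
        (env, metric, round(delta, 1))
        for env, metrics in deltas.items()
        for metric, delta in metrics.items()
        if delta > threshold_pct
    ]

# Example: an 18% regression in production render time would be flagged here.
deltas = calculate_performance_deltas({
    "local": {"render_ms": 100},
    "staging": {"render_ms": 108},
    "production": {"render_ms": 118},
})
print(identify_bottlenecks(deltas))  # [('production', 'render_ms', 18.0)]
```
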
---

### Phase 4: Report Generation (1 hour)

**Synthesizer Deliverables**:

1. **Executive Summary** (1 page):

```markdown
# Design System Audit: Cross-Environment Analysis

## Key Findings
- 12 token discrepancies between local and production
- 5 critical errors affecting 23% of production users
- 3 visual regressions in staging not present locally
- 18% performance degradation in production vs local

## Top 3 Priorities
1. Fix critical ButtonGroup error (23% user impact)
2. Align spacing tokens across environments (12 discrepancies)
3. Optimize Card component rendering (18% perf impact)

## Next Steps
- See detailed findings in full report
- Q1 roadmap attached
```

2. **Detailed Findings Report**:
   - Token Discrepancies (by category: color, spacing, typography)
   - Error Analysis (by severity, frequency, impact)
   - Visual Regressions (with screenshot comparisons)
   - Performance Benchmarks (by component, environment)
   - Component Adoption Metrics

3. **Cross-Environment Discrepancy Matrix**:

   | Component | Local  | Staging | Production | Discrepancy      |
   |-----------|--------|---------|------------|------------------|
   | Button    | v2.1   | v2.1    | v2.0       | Version mismatch |
   | Card      | Tokens | Tokens  | Hardcoded  | Token usage      |
   | Input     | OK     | OK      | OK         | Consistent       |

4. **Prioritized Issue Backlog** (a scoring sketch appears after these deliverables):

```markdown
## P0 - Critical (Fix this sprint)
- [ ] ButtonGroup onClick error (23% users affected)
- [ ] Card spacing inconsistency (visual regression)

## P1 - High (Fix next sprint)
- [ ] Align spacing tokens (12 discrepancies)
- [ ] Optimize Card rendering (18% perf)
- [ ] Fix Input placeholder color mismatch

## P2 - Medium (Q1 roadmap)
- [ ] Standardize component props across environments
- [ ] Document token migration guide
- [ ] Add performance monitoring
```

5. **Q1 Improvement Roadmap**:
   - Month 1: Fix critical errors and visual regressions
   - Month 2: Token alignment and performance optimization
   - Month 3: Documentation and monitoring improvements

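To make the P0/P1/P2 buckets above reproducible, the Synthesizer can rank findings with a simple impact score. A minimal sketch follows; the severity weights and bucket thresholds are arbitrary assumptions for the team to tune.

```python
# Hypothetical scoring used to sort findings into the P0/P1/P2 backlog above.
# Severity weights and bucket thresholds are assumptions, not DSS defaults.
SEVERITY_WEIGHT = {"critical": 4, "high": 3, "medium": 2, "low": 1}

def priority_bucket(severity: str, users_affected_pct: float) -> str:
    score = SEVERITY_WEIGHT.get(severity, 1) * users_affected_pct
    if score >= 60:
        return "P0"
    if score >= 20:
        return "P1"
    return "P2"

# e.g. the ButtonGroup error: critical severity, 23% of users affected
print(priority_bucket("critical", 23))  # P0 (score 92)
print(priority_bucket("medium", 5))     # P2 (score 10)
```
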
---

## Success Criteria

### Technical Success

- [ ] All 3 environments successfully audited
- [ ] Mode switching worked seamlessly (LOCAL ↔ REMOTE)
- [ ] Session IDs enabled data correlation
- [ ] Browser logs captured from remote environments
- [ ] Screenshots compared across environments
- [ ] Performance metrics benchmarked

### Collaboration Success

- [ ] 5 agents coordinated via task queue
- [ ] Data shared between agents successfully
- [ ] No duplicate work across agents
- [ ] Findings synthesized coherently

### Business Success

- [ ] Actionable recommendations generated
- [ ] Issues prioritized by user impact
- [ ] Q1 roadmap created
- [ ] Executive summary delivered
- [ ] Visual evidence provided (screenshots)

---

## Bonus Challenges

1. **Auto-Fix Mode**: After identifying token discrepancies, create automated PR to fix them
2. **Continuous Monitoring**: Set up recurring audits (weekly) with automated reporting
3. **Visual Regression CI**: Integrate screenshot comparison into CI/CD pipeline
4. **Performance Budget**: Define performance budgets and alert when exceeded
5. **Multi-Project Audit**: Extend audit to multiple design system consumers

---

## Evaluation Rubric

| Category | Points | Criteria |
|----------|--------|----------|
| **Environment Coverage** | 20 | All 3 environments audited thoroughly |
| **Mode Mastery** | 20 | Correct LOCAL/REMOTE mode usage |
| **Data Quality** | 15 | Complete, accurate data collection |
| **Correlation Accuracy** | 15 | Correct cross-environment correlation |
| **Agent Collaboration** | 10 | Effective task queue coordination |
| **Report Quality** | 10 | Clear, actionable recommendations |
| **Prioritization** | 5 | Issues prioritized by impact |
| **Visual Evidence** | 5 | Screenshots and comparisons included |

**Total**: 100 points

**Passing Score**: 70 points

---

## Example Agent Coordination (Task Queue)

```python
# Architect creates coordination tasks
create_task({
    "title": "Local Environment Audit",
    "assigned_to": "local-scout",
    "description": "Audit local components and extract tokens",
    "priority": 1
})

create_task({
    "title": "Staging Environment Audit",
    "assigned_to": "staging-inspector",
    "description": "Audit staging deployment and browser logs",
    "priority": 1,
    "dependencies": []  # Can run in parallel
})

# Agents claim and complete tasks
claim_task("local-environment-audit")
start_task("local-environment-audit")

# ... perform audit ...

complete_task("local-environment-audit", result={
    "tokens": local_tokens,
    "screenshots": screenshot_paths,
    "session_id": "local-audit-xyz789"
})

# Synthesizer fetches results
local_results = get_task_result("local-environment-audit")
```

---

## Tips for Success

1. **Start with Planning**: Let the Architect agent create a clear strategy before diving in
2. **Use Session IDs**: Correlation depends on tracking session IDs across agents (see the grouping sketch after these tips)
3. **Parallel Execution**: Run the Local Scout, Staging Inspector, and Production Monitor in parallel
4. **Share Data Early**: Don't wait until the end to share findings
5. **Visual Evidence**: Screenshots are crucial for demonstrating issues
6. **Prioritize by Impact**: Not all issues are equal; focus on user impact
7. **Test Mode Switching**: Verify LOCAL/REMOTE modes work before starting
8. **Document Assumptions**: Make assumptions explicit in reports

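As a small illustration of tip 2, correlation can start with nothing more than grouping every agent's records by session ID before comparing them. The record shapes below are purely illustrative.

```python
# Illustration for tip 2: merge records from all agents keyed by session_id so
# the Synthesizer can line up staging and production data for the same session.
from collections import defaultdict

def group_by_session(*record_lists: list[dict]) -> dict[str, list[dict]]:
    grouped = defaultdict(list)
    for records in record_lists:
        for record in records:
            grouped[record["session_id"]].append(record)
    return dict(grouped)

merged = group_by_session(
    [{"session_id": "abc123", "source": "staging", "errors": 2}],
    [{"session_id": "abc123", "source": "production", "errors": 5}],
)
print(merged["abc123"])  # both environments' records for the same session
```
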
---

## Stretch Goals

1. **Real-Time Dashboard**: Create live dashboard showing audit progress
2. **Automated Fixes**: Generate PRs to fix identified issues
3. **Slack Integration**: Post findings to team Slack channel
4. **Trend Analysis**: Compare current audit to previous audits (3 months ago)
5. **Cost Analysis**: Estimate development cost to fix each issue

---

## Challenge Completion

When all agents have completed their work and the Synthesizer has generated the final report, submit the following:

1. Complete audit findings report
2. Cross-environment discrepancy matrix
3. Prioritized issue backlog
4. Q1 improvement roadmap
5. Screenshot gallery (regressions)
6. Agent coordination logs (task queue)

**Good luck, agents! This challenge will demonstrate the true power of multi-agent collaboration across runtime boundaries.**

---

**Challenge Created**: 2025-12-06

**Difficulty**: EXPERT

**Estimated Duration**: 4-6 hours

**Success Rate**: TBD (first run)