# ZEN CHALLENGE: Cross-Reality Design System Audit

**A Multi-Agent Collaboration Challenge for DSS PowerTools**

---

## Challenge Overview

**Difficulty**: EXPERT
**Agents Required**: 5 (Architect, Local Scout, Staging Inspector, Production Monitor, Synthesizer)
**Estimated Duration**: 4-6 hours

**Objective**: Perform a comprehensive design system audit across three runtime environments (local, staging, production) using adaptive LOCAL/REMOTE modes, then synthesize the findings into an actionable improvement plan.

**Prerequisites**:
- DSS PowerTools plugin installed
- Access to a local development environment
- Access to staging.dss.overbits.luz.uy (REMOTE)
- Access to dss.overbits.luz.uy (REMOTE)
- Task queue MCP server running
- Browser-logger.js active on both remote environments

---

## Challenge Scenario

Your design system team needs to understand the current state of design token adoption, component consistency, and runtime health across all environments. A previous audit was done 3 months ago, but the system has evolved significantly since then. You need a fresh, comprehensive analysis.

**Business Context**:
- 47 React components in the library
- 6 teams consuming the design system
- Recent reports of "visual inconsistencies" between environments
- Performance concerns on production
- Need for data-driven prioritization of Q1 improvements

**Technical Context**:
- Design tokens defined in the `tokens/` directory
- Components in the `components/` directory
- Storybook for visual testing
- Multiple deployment environments
- Browser logs captured via the auto-sync system

---

## Agent Roles & Responsibilities

### Agent 1: ARCHITECT (Planning & Coordination)

**Mode**: N/A (coordination only)

**Responsibilities**:
1. Design the audit strategy
2. Define success criteria
3. Create the audit checklist
4. Coordinate other agents via the task queue
5. Monitor overall progress

**Tools Available**:
- `/dss-status` - Check mode capabilities
- Task queue for agent coordination
- Planning skills

**Deliverables**:
- Audit strategy document
- Success criteria checklist
- Agent task assignments
- Coordination plan

---

### Agent 2: LOCAL SCOUT (Local Development Audit)

**Mode**: LOCAL
**Environment**: Developer's local machine

**Responsibilities**:
1. Audit the local component library
2. Extract design tokens from source
3. Capture Storybook screenshots
4. Check for hardcoded values (see the sketch after this section)
5. Analyze component adoption patterns

**Tools Available**:
- `/dss-config set local` - Switch to LOCAL mode
- `/dss-audit components/` - Component audit
- `/dss-extract src/ --tokens` - Token extraction
- `/dss-screenshot Button --story all` - Capture screenshots
- `/dss-analyze components/` - Pattern analysis
- Direct filesystem access via the LOCAL strategy

**Deliverables**:
- Component audit report (hardcoded values, token usage)
- Extracted token inventory
- Screenshot collection from Storybook
- Local console logs
- Filesystem scan results

**Session Data to Share**:
- Local token inventory (JSON)
- Screenshot URLs/paths
- Component analysis results
- Session ID for correlation
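The hardcoded-value check (responsibility 4 above) is the part of the local audit that teams most often end up scripting themselves. As a rough illustration only, here is a minimal Python sketch of such a scan: the `components/` path comes from the challenge context, while the file glob, regexes, and report shape are assumptions for illustration, not the behavior of `/dss-audit`.

```python
"""Minimal sketch of a hardcoded-value scan (illustrative; not the /dss-audit implementation)."""
import re
from pathlib import Path

# Patterns that usually indicate a style value bypassing design tokens (assumed heuristics).
HARDCODED_PATTERNS = {
    "hex-color": re.compile(r"#[0-9a-fA-F]{3,8}\b"),
    "px-value": re.compile(r"\b\d+px\b"),
    "rgb-color": re.compile(r"rgba?\([^)]*\)"),
}

def scan_components(root: str = "components/") -> list[dict]:
    """Walk the component tree and report suspected hardcoded style values."""
    findings = []
    for path in Path(root).rglob("*.tsx"):
        text = path.read_text(encoding="utf-8", errors="ignore")
        for kind, pattern in HARDCODED_PATTERNS.items():
            for match in pattern.finditer(text):
                findings.append({"file": str(path), "kind": kind, "value": match.group(0)})
    return findings

if __name__ == "__main__":
    for finding in scan_components():
        print(f"{finding['file']}: {finding['kind']} -> {finding['value']}")
```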
---

### Agent 3: STAGING INSPECTOR (Staging Environment Audit)

**Mode**: REMOTE
**Environment**: staging.dss.overbits.luz.uy

**Responsibilities**:
1. Audit the staging deployment
2. Collect browser logs from staging users
3. Capture runtime screenshots
4. Check for console errors
5. Analyze network performance
6. Compare staging vs local tokens

**Tools Available**:
- `/dss-config set remote` - Switch to REMOTE mode
- `/dss-logs --session latest` - Get browser logs
- `/dss-screenshot .dashboard` - Server-side screenshots
- `/dss-errors --last 24h` - Error analysis
- `/dss-diagnostic --performance` - Performance metrics
- Remote API strategy

**Deliverables**:
- Browser log analysis (errors, warnings)
- Staging screenshot collection
- Network performance report
- Error rate trends
- DOM snapshot analysis (Shadow State)

**Session Data to Share**:
- Session ID(s) from staging users
- Error logs categorized by severity
- Performance metrics
- Screenshot comparison baseline

---

### Agent 4: PRODUCTION MONITOR (Production Environment Audit)

**Mode**: REMOTE
**Environment**: dss.overbits.luz.uy (production)

**Responsibilities**:
1. Monitor production health
2. Analyze real user browser logs
3. Identify critical errors
4. Compare production vs staging
5. Detect visual regressions
6. Benchmark performance

**Tools Available**:
- `/dss-config set remote` - Switch to REMOTE mode
- `/dss-logs --production --last 7d` - Week of logs
- `/dss-errors --severity critical` - Critical errors only
- `/dss-diagnostic --full` - Complete system diagnostic
- `/dss-screenshot --compare staging` - Visual regression
- Remote monitoring tools

**Deliverables**:
- Production health dashboard
- Critical error report
- Visual regression findings
- Performance comparison (staging vs production)
- User impact analysis

**Session Data to Share**:
- Production session IDs
- Critical error details
- Performance deltas
- Visual regression screenshots

---

### Agent 5: SYNTHESIZER (Cross-Environment Analysis)

**Mode**: Context-aware (analyzes data from all modes)

**Responsibilities**:
1. Aggregate findings from all agents
2. Identify cross-environment discrepancies
3. Correlate errors to root causes
4. Prioritize issues by impact
5. Generate the comprehensive report
6. Create the Q1 improvement roadmap

**Tools Available**:
- Task queue to fetch agent outputs
- Data correlation algorithms
- Report generation
- Prioritization framework

**Deliverables**:
- Executive Summary (1 page)
- Detailed Findings Report (by category)
- Cross-Environment Discrepancy Matrix
- Prioritized Issue Backlog
- Q1 Improvement Roadmap
- Visual Regression Gallery

---

## Challenge Phases

### Phase 1: Setup & Planning (30 minutes)

**Architect Agent Tasks**:
1. Review the challenge requirements
2. Create the audit checklist
3. Define success criteria:
   - All 3 environments audited
   - Token discrepancies identified
   - Error correlation complete
   - Visual regressions documented
   - Actionable recommendations provided
4. Create task queue entries for each agent
5. Share the coordination plan

**All Agents**:
- Verify the DSS PowerTools plugin is installed
- Check mode capabilities: `/dss-status`
- Confirm network access to the remote environments
- Review the session ID strategy (see the sketch after this list)
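The export commands in Phase 2 all end in a `{timestamp}` placeholder, and Phase 3 correlation only works if every agent fills it with the same run identifier. A minimal sketch of one possible convention follows; the `audit_run_id` helper and the `DSS_AUDIT_RUN` environment variable are assumptions for illustration, not part of DSS PowerTools.

```python
"""Sketch of a shared run-ID convention so agent session IDs correlate (assumed, not a DSS API)."""
import os
from datetime import datetime, timezone

def audit_run_id() -> str:
    """Return the run ID shared by all agents for this audit.

    The Architect sets DSS_AUDIT_RUN once (e.g. in the coordination plan);
    every other agent reuses it instead of generating its own timestamp.
    """
    existing = os.environ.get("DSS_AUDIT_RUN")
    if existing:
        return existing
    return datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")

def session_name(environment: str) -> str:
    """Build a session name for /dss-export-session, e.g. 'staging-audit-20251206T140000Z'."""
    return f"{environment}-audit-{audit_run_id()}"

if __name__ == "__main__":
    for env in ("local", "staging", "prod"):
        print(session_name(env))
```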
---

### Phase 2: Parallel Environment Audits (2-3 hours)

**Local Scout** (LOCAL mode):

```bash
# Switch to LOCAL mode
/dss-config set local

# Verify capabilities
/dss-status

# Audit components
/dss-audit components/ --deep

# Extract tokens
/dss-extract src/tokens/ --output local-tokens.json

# Capture Storybook screenshots
/dss-screenshot Button --story primary,secondary,disabled
/dss-screenshot Card --story default,elevated
/dss-screenshot Input --story default,error,disabled

# Analyze patterns
/dss-analyze components/ --patterns tokens,hardcoded,inconsistencies

# Check local console
/dss-logs --local --last 1h

# Export results
/dss-export-session local-audit-{timestamp}
```

**Staging Inspector** (REMOTE mode):

```bash
# Switch to REMOTE mode
/dss-config set remote

# Configure remote URL
export DSS_REMOTE_URL=https://staging.dss.overbits.luz.uy

# Verify connection
/dss-status

# Get browser logs from staging
/dss-logs --session latest --limit 500

# Find errors
/dss-errors --last 24h --severity high,critical

# Capture runtime screenshots
/dss-screenshot .dashboard --fullpage
/dss-screenshot .component-library

# Performance check
/dss-diagnostic --performance --compare local

# Get DOM snapshots (Shadow State)
/dss-snapshot --latest

# Export results
/dss-export-session staging-audit-{timestamp}
```

**Production Monitor** (REMOTE mode):

```bash
# Switch to REMOTE mode
/dss-config set remote

# Configure production URL
export DSS_REMOTE_URL=https://dss.overbits.luz.uy

# Health check
/dss-health --detailed

# Get production logs (7 days)
/dss-logs --last 7d --limit 1000

# Critical errors only
/dss-errors --severity critical --grouped

# Performance benchmarks
/dss-diagnostic --performance --baseline staging

# Visual regression vs staging
/dss-screenshot .header --compare staging
/dss-screenshot .footer --compare staging

# Monitor metrics
/dss-metrics --uptime --error-rate --performance

# Export results
/dss-export-session prod-audit-{timestamp}
```

---

### Phase 3: Data Correlation (1 hour)

**Synthesizer Agent Tasks**:

1. **Fetch Agent Data**:
   ```python
   # Via task queue
   local_data = get_task_result("local-audit-{timestamp}")
   staging_data = get_task_result("staging-audit-{timestamp}")
   prod_data = get_task_result("prod-audit-{timestamp}")
   ```

2. **Token Consistency Analysis** (a sketch of one possible `find_discrepancies` follows this list):
   ```python
   # Compare token inventories
   local_tokens = local_data['tokens']
   staging_tokens = extract_tokens_from_logs(staging_data['logs'])
   prod_tokens = extract_tokens_from_logs(prod_data['logs'])
   discrepancies = find_discrepancies(local_tokens, staging_tokens, prod_tokens)
   ```

3. **Error Correlation**:
   ```python
   # Group errors by root cause
   staging_errors = staging_data['errors']
   prod_errors = prod_data['errors']
   correlated = correlate_errors(staging_errors, prod_errors)
   root_causes = identify_root_causes(correlated)
   ```

4. **Visual Regression Analysis**:
   ```python
   # Compare screenshots
   local_screenshots = local_data['screenshots']
   staging_screenshots = staging_data['screenshots']
   prod_screenshots = prod_data['screenshots']
   regressions = detect_visual_regressions(
       local_screenshots,
       staging_screenshots,
       prod_screenshots
   )
   ```

5. **Performance Comparison**:
   ```python
   # Benchmark analysis
   perf_matrix = {
       'local': local_data['performance'],
       'staging': staging_data['performance'],
       'production': prod_data['performance']
   }
   deltas = calculate_performance_deltas(perf_matrix)
   bottlenecks = identify_bottlenecks(deltas)
   ```
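The helpers above (`find_discrepancies`, `correlate_errors`, and so on) are left for the Synthesizer to implement. As one possible interpretation, here is a minimal sketch of `find_discrepancies` that compares flat token name/value maps across the three environments; the flat-dictionary input shape is an assumption about what the audit exports contain, not a documented format.

```python
"""Minimal sketch of find_discrepancies for token consistency analysis (input shape assumed)."""

def find_discrepancies(
    local_tokens: dict[str, str],
    staging_tokens: dict[str, str],
    prod_tokens: dict[str, str],
) -> list[dict]:
    """Return one record per token whose value (or presence) differs across environments."""
    environments = {
        "local": local_tokens,
        "staging": staging_tokens,
        "production": prod_tokens,
    }
    all_names = set().union(*(tokens.keys() for tokens in environments.values()))

    discrepancies = []
    for name in sorted(all_names):
        values = {env: tokens.get(name) for env, tokens in environments.items()}
        if len(set(values.values())) > 1:  # a missing token shows up as None
            discrepancies.append({"token": name, "values": values})
    return discrepancies

if __name__ == "__main__":
    # Tiny illustrative inputs, not real audit data.
    print(find_discrepancies(
        {"spacing.md": "16px", "color.primary": "#0055ff"},
        {"spacing.md": "16px", "color.primary": "#0055ff"},
        {"spacing.md": "12px"},  # drifted value and a missing token
    ))
```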
---

### Phase 4: Report Generation (1 hour)

**Synthesizer Deliverables**:

1. **Executive Summary** (1 page):
   ```markdown
   # Design System Audit: Cross-Environment Analysis

   ## Key Findings
   - 12 token discrepancies between local and production
   - 5 critical errors affecting 23% of production users
   - 3 visual regressions in staging not present locally
   - 18% performance degradation in production vs local

   ## Top 3 Priorities
   1. Fix critical ButtonGroup error (23% user impact)
   2. Align spacing tokens across environments (12 discrepancies)
   3. Optimize Card component rendering (18% perf impact)

   ## Next Steps
   - See detailed findings in the full report
   - Q1 roadmap attached
   ```

2. **Detailed Findings Report**:
   - Token discrepancies (by category: color, spacing, typography)
   - Error analysis (by severity, frequency, impact)
   - Visual regressions (with screenshot comparisons)
   - Performance benchmarks (by component, environment)
   - Component adoption metrics

3. **Cross-Environment Discrepancy Matrix**:

   | Component | Local  | Staging | Production | Discrepancy      |
   |-----------|--------|---------|------------|------------------|
   | Button    | v2.1   | v2.1    | v2.0       | Version mismatch |
   | Card      | Tokens | Tokens  | Hardcoded  | Token usage      |
   | Input     | OK     | OK      | OK         | Consistent       |

4. **Prioritized Issue Backlog**:
   ```markdown
   ## P0 - Critical (fix this sprint)
   - [ ] ButtonGroup onClick error (23% of users affected)
   - [ ] Card spacing inconsistency (visual regression)

   ## P1 - High (fix next sprint)
   - [ ] Align spacing tokens (12 discrepancies)
   - [ ] Optimize Card rendering (18% perf)
   - [ ] Fix Input placeholder color mismatch

   ## P2 - Medium (Q1 roadmap)
   - [ ] Standardize component props across environments
   - [ ] Document the token migration guide
   - [ ] Add performance monitoring
   ```

5. **Q1 Improvement Roadmap**:
   - Month 1: Fix critical errors and visual regressions
   - Month 2: Token alignment and performance optimization
   - Month 3: Documentation and monitoring improvements

---

## Success Criteria

### Technical Success
- [ ] All 3 environments successfully audited
- [ ] Mode switching worked seamlessly (LOCAL ↔ REMOTE)
- [ ] Session IDs enabled data correlation
- [ ] Browser logs captured from remote environments
- [ ] Screenshots compared across environments
- [ ] Performance metrics benchmarked

### Collaboration Success
- [ ] 5 agents coordinated via the task queue
- [ ] Data shared between agents successfully
- [ ] No duplicate work across agents
- [ ] Findings synthesized coherently

### Business Success
- [ ] Actionable recommendations generated
- [ ] Issues prioritized by user impact
- [ ] Q1 roadmap created
- [ ] Executive summary delivered
- [ ] Visual evidence provided (screenshots)

---

## Bonus Challenges

1. **Auto-Fix Mode**: After identifying token discrepancies, create an automated PR to fix them
2. **Continuous Monitoring**: Set up recurring (weekly) audits with automated reporting
3. **Visual Regression CI**: Integrate screenshot comparison into the CI/CD pipeline
4. **Performance Budget**: Define performance budgets and alert when they are exceeded (see the sketch after this list)
5. **Multi-Project Audit**: Extend the audit to multiple design system consumers
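Bonus challenge 4 can be prototyped with a simple check over the performance metrics exported in Phase 3. A minimal sketch is below; the budget thresholds and the metric names are illustrative assumptions, not values defined by the challenge or by DSS PowerTools.

```python
"""Sketch of a performance-budget check (thresholds and metric names are illustrative)."""

# Budgets per metric; any value above its threshold counts as a violation.
PERFORMANCE_BUDGETS = {
    "first_contentful_paint_ms": 1800,
    "time_to_interactive_ms": 3500,
    "bundle_size_kb": 450,
}

def check_budgets(metrics: dict[str, float], budgets: dict[str, float] = PERFORMANCE_BUDGETS) -> list[str]:
    """Return a human-readable alert for every metric that exceeds its budget."""
    alerts = []
    for name, limit in budgets.items():
        value = metrics.get(name)
        if value is not None and value > limit:
            alerts.append(f"{name}: {value} exceeds budget of {limit}")
    return alerts

if __name__ == "__main__":
    # Example production metrics (made up for illustration).
    prod_metrics = {"first_contentful_paint_ms": 2100, "time_to_interactive_ms": 3200, "bundle_size_kb": 510}
    for alert in check_budgets(prod_metrics):
        print("ALERT:", alert)
```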
---

## Evaluation Rubric

| Category | Points | Criteria |
|----------|--------|----------|
| **Environment Coverage** | 20 | All 3 environments audited thoroughly |
| **Mode Mastery** | 20 | Correct LOCAL/REMOTE mode usage |
| **Data Quality** | 15 | Complete, accurate data collection |
| **Correlation Accuracy** | 15 | Correct cross-environment correlation |
| **Agent Collaboration** | 10 | Effective task queue coordination |
| **Report Quality** | 10 | Clear, actionable recommendations |
| **Prioritization** | 5 | Issues prioritized by impact |
| **Visual Evidence** | 5 | Screenshots and comparisons included |

**Total**: 100 points
**Passing Score**: 70 points

---

## Example Agent Coordination (Task Queue)

```python
# Architect creates coordination tasks
create_task({
    "title": "Local Environment Audit",
    "assigned_to": "local-scout",
    "description": "Audit local components and extract tokens",
    "priority": 1
})

create_task({
    "title": "Staging Environment Audit",
    "assigned_to": "staging-inspector",
    "description": "Audit staging deployment and browser logs",
    "priority": 1,
    "dependencies": []  # Can run in parallel
})

# Agents claim and complete tasks
claim_task("local-environment-audit")
start_task("local-environment-audit")
# ... perform audit ...
complete_task("local-environment-audit", result={
    "tokens": local_tokens,
    "screenshots": screenshot_paths,
    "session_id": "local-audit-xyz789"
})

# Synthesizer fetches results
local_results = get_task_result("local-environment-audit")
```

---

## Tips for Success

1. **Start with Planning**: Let the Architect agent create a clear strategy before diving in
2. **Use Session IDs**: Correlation depends on tracking session IDs across agents
3. **Parallel Execution**: Run the Local Scout, Staging Inspector, and Production Monitor in parallel
4. **Share Data Early**: Don't wait until the end to share findings
5. **Visual Evidence**: Screenshots are crucial for demonstrating issues
6. **Prioritize by Impact**: Not all issues are equal - focus on user impact
7. **Test Mode Switching**: Verify LOCAL/REMOTE modes work before starting
8. **Document Assumptions**: Make assumptions explicit in reports

---

## Stretch Goals

1. **Real-Time Dashboard**: Create a live dashboard showing audit progress
2. **Automated Fixes**: Generate PRs to fix identified issues
3. **Slack Integration**: Post findings to the team Slack channel
4. **Trend Analysis**: Compare the current audit to the previous audit (3 months ago)
5. **Cost Analysis**: Estimate the development cost to fix each issue

---

## Challenge Completion

When all agents have completed their work and the Synthesizer has generated the final report, submit the following:

1. Complete audit findings report
2. Cross-environment discrepancy matrix
3. Prioritized issue backlog
4. Q1 improvement roadmap
5. Screenshot gallery (regressions)
6. Agent coordination logs (task queue)

**Good luck, agents! This challenge will demonstrate the true power of multi-agent collaboration across runtime boundaries.**

---

**Challenge Created**: 2025-12-06
**Difficulty**: EXPERT
**Estimated Duration**: 4-6 hours
**Success Rate**: TBD (first run)