> Migrated from design-system-swarm with a fresh git history. The old project history is preserved in /home/overbits/apps/design-system-swarm.

**Core components:**
- MCP Server (Python FastAPI with mcp 1.23.1)
- Claude Plugin (agents, commands, skills, strategies, hooks, core)
- DSS Backend (dss-mvp1: token translation, Figma sync)
- Admin UI (Node.js/React)
- Server (Node.js/Express)
- Storybook integration (dss-mvp1/.storybook)

**Self-contained configuration:**
- All paths are relative or use DSS_BASE_PATH=/home/overbits/dss
- PYTHONPATH is configured for dss-mvp1 and dss-claude-plugin
- A .env file holds all configuration
- The Claude plugin uses ${CLAUDE_PLUGIN_ROOT} for portability
# ZEN CHALLENGE: Cross-Reality Design System Audit

*A Multi-Agent Collaboration Challenge for DSS PowerTools*

## Challenge Overview

**Difficulty:** EXPERT
**Agents Required:** 5 (Architect, Local Scout, Staging Inspector, Production Monitor, Synthesizer)
**Estimated Duration:** 4-6 hours
**Objective:** Perform a comprehensive design system audit across three runtime environments (local, staging, production) using adaptive LOCAL/REMOTE modes, then synthesize the findings into an actionable improvement plan.
**Prerequisites:**
- DSS PowerTools plugin installed
- Access to a local development environment
- Access to staging.dss.overbits.luz.uy (REMOTE)
- Access to dss.overbits.luz.uy (REMOTE)
- Task queue MCP server running
- browser-logger.js active on both remote environments
## Challenge Scenario

Your design system team needs to understand the current state of design token adoption, component consistency, and runtime health across all environments. A previous audit was done 3 months ago, but the system has evolved significantly since then. You need a fresh, comprehensive analysis.

**Business Context:**
- 47 React components in the library
- 6 teams consuming the design system
- Recent reports of "visual inconsistencies" between environments
- Performance concerns on production
- Need data-driven prioritization for Q1 improvements
**Technical Context:**
- Design tokens defined in the `tokens/` directory
- Components in the `components/` directory
- Storybook for visual testing
- Multiple deployment environments
- Browser logs captured via the auto-sync system
## Agent Roles & Responsibilities

### Agent 1: ARCHITECT (Planning & Coordination)

**Mode:** N/A (coordination only)

**Responsibilities:**
- Design the audit strategy
- Define success criteria
- Create the audit checklist
- Coordinate other agents via the task queue
- Monitor overall progress

**Tools Available:**
- `/dss-status` - Check mode capabilities
- Task queue for agent coordination
- Planning skills

**Deliverables:**
- Audit strategy document
- Success criteria checklist
- Agent task assignments
- Coordination plan
### Agent 2: LOCAL SCOUT (Local Development Audit)

**Mode:** LOCAL
**Environment:** Developer's local machine

**Responsibilities:**
- Audit the local component library
- Extract design tokens from source
- Capture Storybook screenshots
- Check for hardcoded values
- Analyze component adoption patterns

**Tools Available:**
- `/dss-config set local` - Switch to LOCAL mode
- `/dss-audit components/` - Component audit
- `/dss-extract src/ --tokens` - Token extraction
- `/dss-screenshot Button --story all` - Capture screenshots
- `/dss-analyze components/` - Pattern analysis
- Direct filesystem access via the LOCAL strategy

**Deliverables:**
- Component audit report (hardcoded values, token usage)
- Extracted token inventory
- Screenshot collection from Storybook
- Local console logs
- Filesystem scan results

**Session Data to Share:**
- Local token inventory (JSON)
- Screenshot URLs/paths
- Component analysis results
- Session ID for correlation
### Agent 3: STAGING INSPECTOR (Staging Environment Audit)

**Mode:** REMOTE
**Environment:** staging.dss.overbits.luz.uy

**Responsibilities:**
- Audit the staging deployment
- Collect browser logs from staging users
- Capture runtime screenshots
- Check for console errors
- Analyze network performance
- Compare staging vs local tokens

**Tools Available:**
- `/dss-config set remote` - Switch to REMOTE mode
- `/dss-logs --session latest` - Get browser logs
- `/dss-screenshot .dashboard` - Server-side screenshots
- `/dss-errors --last 24h` - Error analysis
- `/dss-diagnostic --performance` - Performance metrics
- Remote API strategy

**Deliverables:**
- Browser log analysis (errors, warnings)
- Staging screenshot collection
- Network performance report
- Error rate trends
- DOM snapshot analysis (Shadow State)

**Session Data to Share:**
- Session ID(s) from staging users
- Error logs categorized by severity
- Performance metrics
- Screenshot comparison baseline
### Agent 4: PRODUCTION MONITOR (Production Environment Audit)

**Mode:** REMOTE
**Environment:** dss.overbits.luz.uy (production)

**Responsibilities:**
- Monitor production health
- Analyze real user browser logs
- Identify critical errors
- Compare production vs staging
- Detect visual regressions
- Benchmark performance

**Tools Available:**
- `/dss-config set remote` - Switch to REMOTE mode
- `/dss-logs --production --last 7d` - A week of logs
- `/dss-errors --severity critical` - Critical errors only
- `/dss-diagnostic --full` - Complete system diagnostic
- `/dss-screenshot --compare staging` - Visual regression
- Remote monitoring tools

**Deliverables:**
- Production health dashboard
- Critical error report
- Visual regression findings
- Performance comparison (staging vs production)
- User impact analysis

**Session Data to Share:**
- Production session IDs
- Critical error details
- Performance deltas
- Visual regression screenshots
### Agent 5: SYNTHESIZER (Cross-Environment Analysis)

**Mode:** Context-aware (analyzes data from all modes)

**Responsibilities:**
- Aggregate findings from all agents
- Identify cross-environment discrepancies
- Correlate errors to root causes
- Prioritize issues by impact
- Generate the comprehensive report
- Create the Q1 improvement roadmap

**Tools Available:**
- Task queue to fetch agent outputs
- Data correlation algorithms
- Report generation
- Prioritization framework (a sketch follows this list)
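The challenge does not prescribe a particular prioritization framework. A minimal sketch of one reasonable approach, scoring each issue by impact-weighted frequency per unit of estimated effort (the `Issue` fields, weights, and example numbers are assumptions for illustration, not part of DSS PowerTools):

```python
from dataclasses import dataclass

@dataclass
class Issue:
    title: str
    user_impact: float  # fraction of users affected, 0.0-1.0 (assumed field)
    frequency: int      # occurrences observed during the audit window
    effort_days: float  # estimated engineering effort to fix

def priority_score(issue: Issue) -> float:
    """Higher score = fix sooner: impact-weighted frequency per unit effort."""
    return (issue.user_impact * issue.frequency) / max(issue.effort_days, 0.5)

def prioritize(issues: list[Issue]) -> list[Issue]:
    return sorted(issues, key=priority_score, reverse=True)

# Hypothetical issues echoing the example findings later in this challenge
backlog = prioritize([
    Issue("ButtonGroup onClick error", user_impact=0.23, frequency=412, effort_days=2.0),
    Issue("Card spacing inconsistency", user_impact=0.10, frequency=88, effort_days=1.0),
    Issue("Input placeholder color mismatch", user_impact=0.02, frequency=30, effort_days=0.5),
])
for issue in backlog:
    print(f"{priority_score(issue):7.1f}  {issue.title}")
```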
**Deliverables:**
- Executive Summary (1 page)
- Detailed Findings Report (by category)
- Cross-Environment Discrepancy Matrix
- Prioritized Issue Backlog
- Q1 Improvement Roadmap
- Visual Regression Gallery
## Challenge Phases

### Phase 1: Setup & Planning (30 minutes)

**Architect Agent Tasks:**
- Review challenge requirements
- Create the audit checklist
- Define success criteria:
  - All 3 environments audited
  - Token discrepancies identified
  - Error correlation complete
  - Visual regressions documented
  - Actionable recommendations provided
- Create task queue entries for each agent
- Share the coordination plan
**All Agents:**
- Verify the DSS PowerTools plugin is installed
- Check mode capabilities with `/dss-status`
- Confirm network access to remote environments
- Review the session ID strategy
### Phase 2: Parallel Environment Audits (2-3 hours)

**Local Scout (LOCAL mode):**

```bash
# Switch to LOCAL mode
/dss-config set local

# Verify capabilities
/dss-status

# Audit components
/dss-audit components/ --deep

# Extract tokens
/dss-extract src/tokens/ --output local-tokens.json

# Capture Storybook screenshots
/dss-screenshot Button --story primary,secondary,disabled
/dss-screenshot Card --story default,elevated
/dss-screenshot Input --story default,error,disabled

# Analyze patterns
/dss-analyze components/ --patterns tokens,hardcoded,inconsistencies

# Check the local console
/dss-logs --local --last 1h

# Export results
/dss-export-session local-audit-{timestamp}
```
**Staging Inspector (REMOTE mode):**

```bash
# Switch to REMOTE mode
/dss-config set remote

# Configure the remote URL
export DSS_REMOTE_URL=https://staging.dss.overbits.luz.uy

# Verify the connection
/dss-status

# Get browser logs from staging
/dss-logs --session latest --limit 500

# Find errors
/dss-errors --last 24h --severity high,critical

# Capture runtime screenshots
/dss-screenshot .dashboard --fullpage
/dss-screenshot .component-library

# Performance check
/dss-diagnostic --performance --compare local

# Get DOM snapshots (Shadow State)
/dss-snapshot --latest

# Export results
/dss-export-session staging-audit-{timestamp}
```
**Production Monitor (REMOTE mode):**

```bash
# Switch to REMOTE mode
/dss-config set remote

# Configure the production URL
export DSS_REMOTE_URL=https://dss.overbits.luz.uy

# Health check
/dss-health --detailed

# Get production logs (7 days)
/dss-logs --last 7d --limit 1000

# Critical errors only
/dss-errors --severity critical --grouped

# Performance benchmarks
/dss-diagnostic --performance --baseline staging

# Visual regression vs staging
/dss-screenshot .header --compare staging
/dss-screenshot .footer --compare staging

# Monitor metrics
/dss-metrics --uptime --error-rate --performance

# Export results
/dss-export-session prod-audit-{timestamp}
```
### Phase 3: Data Correlation (1 hour)

**Synthesizer Agent Tasks:**

1. **Fetch agent data:**

```python
# Fetch exported session data via the task queue
local_data = get_task_result("local-audit-{timestamp}")
staging_data = get_task_result("staging-audit-{timestamp}")
prod_data = get_task_result("prod-audit-{timestamp}")
```
2. **Token consistency analysis:**

```python
# Compare token inventories across environments
local_tokens = local_data['tokens']
staging_tokens = extract_tokens_from_logs(staging_data['logs'])
prod_tokens = extract_tokens_from_logs(prod_data['logs'])
discrepancies = find_discrepancies(local_tokens, staging_tokens, prod_tokens)
```
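`find_discrepancies` is left undefined by the challenge; a minimal sketch, assuming each inventory is a flat dict mapping token names to resolved values:

```python
def find_discrepancies(local: dict, staging: dict, prod: dict) -> list[dict]:
    """Report tokens whose values differ, or are missing, across environments."""
    discrepancies = []
    for name in sorted(set(local) | set(staging) | set(prod)):
        values = {
            "local": local.get(name),
            "staging": staging.get(name),
            "production": prod.get(name),
        }
        if len(set(values.values())) > 1:  # any mismatch or missing value
            discrepancies.append({"token": name, **values})
    return discrepancies

# Hypothetical token inventories
local_tokens = {"spacing.md": "16px", "color.primary": "#0055ff"}
prod_tokens = {"spacing.md": "12px", "color.primary": "#0055ff"}
print(find_discrepancies(local_tokens, local_tokens, prod_tokens))
# [{'token': 'spacing.md', 'local': '16px', 'staging': '16px', 'production': '12px'}]
```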
3. **Error correlation:**

```python
# Group errors from both environments by root cause
staging_errors = staging_data['errors']
prod_errors = prod_data['errors']
correlated = correlate_errors(staging_errors, prod_errors)
root_causes = identify_root_causes(correlated)
```
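`correlate_errors` is likewise undefined. One common approach is to group errors by a normalized signature (the message with volatile numbers stripped, plus the top stack frame); a sketch under that assumption, with the error dict shape (`message`, `stack`) assumed for illustration:

```python
import re
from collections import defaultdict

def error_signature(error: dict) -> str:
    """Normalize an error to a grouping key: message with volatile digits
    replaced, plus the top stack frame when one is present."""
    message = re.sub(r"\d+", "<n>", error.get("message", ""))
    top_frame = error["stack"].splitlines()[0] if error.get("stack") else ""
    return f"{message} @ {top_frame}"

def correlate_errors(staging_errors: list[dict], prod_errors: list[dict]) -> dict:
    """Group staging and production errors under shared signatures so the same
    root cause surfaces once, with per-environment counts."""
    groups = defaultdict(lambda: {"staging": 0, "production": 0, "examples": []})
    for env, errors in (("staging", staging_errors), ("production", prod_errors)):
        for err in errors:
            sig = error_signature(err)
            groups[sig][env] += 1
            if len(groups[sig]["examples"]) < 3:  # keep a few samples per group
                groups[sig]["examples"].append(err)
    return dict(groups)
```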
4. **Visual regression analysis:**

```python
# Compare screenshots from all three environments
local_screenshots = local_data['screenshots']
staging_screenshots = staging_data['screenshots']
prod_screenshots = prod_data['screenshots']
regressions = detect_visual_regressions(
    local_screenshots,
    staging_screenshots,
    prod_screenshots,
)
```
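`detect_visual_regressions` could be as simple as a pixel diff. A minimal sketch using Pillow, assuming each argument maps component names to same-size screenshot file paths (a production pipeline would more likely use a dedicated perceptual-diff tool):

```python
from PIL import Image, ImageChops  # pip install Pillow

def diff_ratio(path_a: str, path_b: str) -> float:
    """Fraction of pixels that differ between two screenshots."""
    a = Image.open(path_a).convert("RGB")
    b = Image.open(path_b).convert("RGB")
    if a.size != b.size:
        return 1.0  # treat a size change as a full regression
    diff = ImageChops.difference(a, b)
    changed = sum(1 for px in diff.getdata() if px != (0, 0, 0))
    return changed / (a.width * a.height)

def detect_visual_regressions(local: dict, staging: dict, prod: dict,
                              threshold: float = 0.01) -> list[dict]:
    """Flag components whose staging or production screenshot drifts from the
    local baseline by more than `threshold` (1% of pixels by default)."""
    regressions = []
    for name, baseline in local.items():
        for env, shots in (("staging", staging), ("production", prod)):
            if name in shots and (ratio := diff_ratio(baseline, shots[name])) > threshold:
                regressions.append({"component": name, "env": env, "diff": ratio})
    return regressions
```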
5. **Performance comparison:**

```python
# Benchmark analysis across environments
perf_matrix = {
    'local': local_data['performance'],
    'staging': staging_data['performance'],
    'production': prod_data['performance'],
}
deltas = calculate_performance_deltas(perf_matrix)
bottlenecks = identify_bottlenecks(deltas)
```
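`calculate_performance_deltas` might compute percentage change against the local baseline. A sketch assuming each environment's entry is a flat dict of metric names to numeric values (the metric names below are hypothetical):

```python
def calculate_performance_deltas(perf_matrix: dict) -> dict:
    """Percentage change of each metric in staging/production vs local."""
    baseline = perf_matrix["local"]
    deltas = {}
    for env in ("staging", "production"):
        env_deltas = {}
        for metric, value in perf_matrix[env].items():
            base = baseline.get(metric)
            if base:  # skip metrics missing from the baseline, or zero
                env_deltas[metric] = round((value - base) / base * 100, 1)
        deltas[env] = env_deltas
    return deltas

# Hypothetical metrics: an ~18% render-time regression in production
matrix = {
    "local":      {"card.render_ms": 42.0},
    "staging":    {"card.render_ms": 44.0},
    "production": {"card.render_ms": 49.6},
}
print(calculate_performance_deltas(matrix))
# {'staging': {'card.render_ms': 4.8}, 'production': {'card.render_ms': 18.1}}
```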
### Phase 4: Report Generation (1 hour)

**Synthesizer Deliverables:**

1. **Executive Summary (1 page):**

```markdown
# Design System Audit: Cross-Environment Analysis

## Key Findings
- 12 token discrepancies between local and production
- 5 critical errors affecting 23% of production users
- 3 visual regressions in staging not present locally
- 18% performance degradation in production vs local

## Top 3 Priorities
1. Fix critical ButtonGroup error (23% user impact)
2. Align spacing tokens across environments (12 discrepancies)
3. Optimize Card component rendering (18% perf impact)

## Next Steps
- See detailed findings in the full report
- Q1 roadmap attached
```
2. **Detailed Findings Report:**
   - Token Discrepancies (by category: color, spacing, typography)
   - Error Analysis (by severity, frequency, impact)
   - Visual Regressions (with screenshot comparisons)
   - Performance Benchmarks (by component, environment)
   - Component Adoption Metrics

3. **Cross-Environment Discrepancy Matrix:**

| Component | Local | Staging | Production | Discrepancy |
|---|---|---|---|---|
| Button | v2.1 | v2.1 | v2.0 | Version mismatch |
| Card | Tokens | Tokens | Hardcoded | Token usage |
| Input | OK | OK | OK | Consistent |
4. **Prioritized Issue Backlog:**

```markdown
## P0 - Critical (fix this sprint)
- [ ] ButtonGroup onClick error (23% of users affected)
- [ ] Card spacing inconsistency (visual regression)

## P1 - High (fix next sprint)
- [ ] Align spacing tokens (12 discrepancies)
- [ ] Optimize Card rendering (18% perf)
- [ ] Fix Input placeholder color mismatch

## P2 - Medium (Q1 roadmap)
- [ ] Standardize component props across environments
- [ ] Document the token migration guide
- [ ] Add performance monitoring
```
5. **Q1 Improvement Roadmap:**
   - Month 1: Fix critical errors and visual regressions
   - Month 2: Token alignment and performance optimization
   - Month 3: Documentation and monitoring improvements
## Success Criteria

### Technical Success
- [ ] All 3 environments successfully audited
- [ ] Mode switching worked seamlessly (LOCAL ↔ REMOTE)
- [ ] Session IDs enabled data correlation
- [ ] Browser logs captured from remote environments
- [ ] Screenshots compared across environments
- [ ] Performance metrics benchmarked

### Collaboration Success
- [ ] 5 agents coordinated via the task queue
- [ ] Data shared between agents successfully
- [ ] No duplicate work across agents
- [ ] Findings synthesized coherently

### Business Success
- [ ] Actionable recommendations generated
- [ ] Issues prioritized by user impact
- [ ] Q1 roadmap created
- [ ] Executive summary delivered
- [ ] Visual evidence provided (screenshots)
## Bonus Challenges

- **Auto-Fix Mode:** After identifying token discrepancies, create an automated PR to fix them
- **Continuous Monitoring:** Set up recurring (weekly) audits with automated reporting
- **Visual Regression CI:** Integrate screenshot comparison into the CI/CD pipeline
- **Performance Budget:** Define performance budgets and alert when they are exceeded
- **Multi-Project Audit:** Extend the audit to multiple design system consumers
## Evaluation Rubric
| Category | Points | Criteria |
|---|---|---|
| Environment Coverage | 20 | All 3 environments audited thoroughly |
| Mode Mastery | 20 | Correct LOCAL/REMOTE mode usage |
| Data Quality | 15 | Complete, accurate data collection |
| Correlation Accuracy | 15 | Correct cross-environment correlation |
| Agent Collaboration | 10 | Effective task queue coordination |
| Report Quality | 10 | Clear, actionable recommendations |
| Prioritization | 5 | Issues prioritized by impact |
| Visual Evidence | 5 | Screenshots and comparisons included |
**Total:** 100 points
**Passing Score:** 70 points
## Example Agent Coordination (Task Queue)

```python
# The Architect creates coordination tasks
create_task({
    "title": "Local Environment Audit",
    "assigned_to": "local-scout",
    "description": "Audit local components and extract tokens",
    "priority": 1
})

create_task({
    "title": "Staging Environment Audit",
    "assigned_to": "staging-inspector",
    "description": "Audit staging deployment and browser logs",
    "priority": 1,
    "dependencies": []  # Can run in parallel
})

# Agents claim and complete tasks
claim_task("local-environment-audit")
start_task("local-environment-audit")
# ... perform audit ...
complete_task("local-environment-audit", result={
    "tokens": local_tokens,
    "screenshots": screenshot_paths,
    "session_id": "local-audit-xyz789"
})

# The Synthesizer fetches the results
local_results = get_task_result("local-environment-audit")
```
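The calls above are illustrative; in practice the Synthesizer has to wait for the parallel audits to finish before fetching results. A minimal polling sketch, assuming `get_task_result` returns `None` until a task completes (an assumption about the task-queue API, not documented behavior):

```python
import time

def wait_for_results(task_ids: list[str], timeout_s: int = 3600,
                     poll_interval_s: int = 30) -> dict:
    """Poll the task queue until every audit task has a result, or time out.
    Assumes get_task_result() (the task-queue call used above) returns None
    for unfinished tasks."""
    results: dict = {}
    pending = set(task_ids)
    deadline = time.monotonic() + timeout_s
    while pending and time.monotonic() < deadline:
        for task_id in list(pending):
            result = get_task_result(task_id)
            if result is not None:
                results[task_id] = result
                pending.discard(task_id)
        if pending:
            time.sleep(poll_interval_s)
    if pending:
        raise TimeoutError(f"Audits did not complete: {sorted(pending)}")
    return results
```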
## Tips for Success

- **Start with planning:** Let the Architect agent create a clear strategy before diving in
- **Use session IDs:** Correlation depends on tracking session IDs across agents (see the sketch after this list)
- **Parallel execution:** Run the Local Scout, Staging Inspector, and Production Monitor in parallel
- **Share data early:** Don't wait until the end to share findings
- **Visual evidence:** Screenshots are crucial for demonstrating issues
- **Prioritize by impact:** Not all issues are equal; focus on user impact
- **Test mode switching:** Verify LOCAL/REMOTE modes work before starting
- **Document assumptions:** Make assumptions explicit in reports
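On session IDs: they are the join key for everything the agents collect. A minimal sketch of indexing agent exports by session ID so logs, screenshots, and errors from the same run can be combined, assuming each export carries a `session_id` key as in the coordination example above:

```python
from collections import defaultdict

def index_by_session(*agent_outputs: dict) -> dict:
    """Merge each agent's findings under its session ID so data from the
    same audit run can be joined downstream. Assumes every export dict
    includes a 'session_id' key (as in the task example above)."""
    by_session: dict = defaultdict(dict)
    for output in agent_outputs:
        by_session[output["session_id"]].update(output)
    return dict(by_session)

# e.g. index_by_session(local_results, staging_results, prod_results)
```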
## Stretch Goals

- **Real-Time Dashboard:** Create a live dashboard showing audit progress
- **Automated Fixes:** Generate PRs to fix identified issues
- **Slack Integration:** Post findings to the team Slack channel
- **Trend Analysis:** Compare the current audit to the previous one (3 months ago)
- **Cost Analysis:** Estimate the development cost to fix each issue
## Challenge Completion

When all agents have completed their work and the Synthesizer has generated the final report, submit the following:

- Complete audit findings report
- Cross-environment discrepancy matrix
- Prioritized issue backlog
- Q1 improvement roadmap
- Screenshot gallery (regressions)
- Agent coordination logs (task queue)
Good luck, agents! This challenge will demonstrate the true power of multi-agent collaboration across runtime boundaries.
**Challenge Created:** 2025-12-06
**Difficulty:** EXPERT
**Estimated Duration:** 4-6 hours
**Success Rate:** TBD (first run)