# ZEN CHALLENGE: Cross-Reality Design System Audit

**A Multi-Agent Collaboration Challenge for DSS PowerTools**

---

## Challenge Overview

**Difficulty**: EXPERT

**Agents Required**: 5 (Architect, Local Scout, Staging Inspector, Production Monitor, Synthesizer)

**Estimated Duration**: 4-6 hours

**Objective**: Perform a comprehensive design system audit across three runtime environments (local, staging, production) using adaptive LOCAL/REMOTE modes, then synthesize findings into an actionable improvement plan.

**Prerequisites**:

- DSS PowerTools plugin installed
- Access to local development environment
- Access to staging.dss.overbits.luz.uy (REMOTE)
- Access to dss.overbits.luz.uy (REMOTE)
- Task queue MCP server running
- Browser-logger.js active on both remote environments

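Before assigning agents, it can help to script the connectivity part of these prerequisites. Below is a minimal preflight sketch, assuming only that the two remote hosts listed above answer plain HTTPS requests at their base URLs; it is not part of DSS PowerTools, and checks for the task-queue server and browser-logger.js are omitted because their endpoints are deployment-specific.

```python
# Hypothetical preflight check (not a DSS PowerTools command): probe the two
# remote environments named in the prerequisites before agents are dispatched.
# Only the base URLs are requested; deeper health endpoints are deployment-specific.
from urllib.request import urlopen

REMOTE_ENVIRONMENTS = {
    "staging": "https://staging.dss.overbits.luz.uy",
    "production": "https://dss.overbits.luz.uy",
}

def check_reachability(timeout: float = 5.0) -> dict:
    """Return {environment: reachable?} for each remote base URL."""
    results = {}
    for name, url in REMOTE_ENVIRONMENTS.items():
        try:
            with urlopen(url, timeout=timeout) as response:
                results[name] = response.status < 500
        except OSError:  # covers URLError, timeouts, DNS failures
            results[name] = False
    return results

if __name__ == "__main__":
    for env, ok in check_reachability().items():
        print(f"{env}: {'reachable' if ok else 'UNREACHABLE'}")
```
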
---

## Challenge Scenario

Your design system team needs to understand the current state of design token adoption, component consistency, and runtime health across all environments. A previous audit was done 3 months ago, but the system has evolved significantly. You need a fresh, comprehensive analysis.

**Business Context**:

- 47 React components in the library
- 6 teams consuming the design system
- Recent reports of "visual inconsistencies" between environments
- Performance concerns on production
- Need data-driven prioritization for Q1 improvements

**Technical Context**:

- Design tokens defined in `tokens/` directory
- Components in `components/` directory
- Storybook for visual testing
- Multiple deployment environments
- Browser logs captured via auto-sync system

---

## Agent Roles & Responsibilities

### Agent 1: ARCHITECT (Planning & Coordination)

**Mode**: N/A (coordination only)

**Responsibilities**:

1. Design the audit strategy
2. Define success criteria
3. Create audit checklist
4. Coordinate other agents via task queue
5. Monitor overall progress

**Tools Available**:

- `/dss-status` - Check mode capabilities
- Task queue for agent coordination
- Planning skills

**Deliverables**:

- Audit strategy document
- Success criteria checklist
- Agent task assignments
- Coordination plan

---

### Agent 2: LOCAL SCOUT (Local Development Audit)

**Mode**: LOCAL

**Environment**: Developer's local machine

**Responsibilities**:

1. Audit local component library
2. Extract design tokens from source
3. Capture Storybook screenshots
4. Check for hardcoded values
5. Analyze component adoption patterns

**Tools Available**:

- `/dss-config set local` - Switch to LOCAL mode
- `/dss-audit components/` - Component audit
- `/dss-extract src/ --tokens` - Token extraction
- `/dss-screenshot Button --story all` - Capture screenshots
- `/dss-analyze components/` - Pattern analysis
- Direct filesystem access via LOCAL strategy

**Deliverables**:

- Component audit report (hardcoded values, token usage)
- Extracted token inventory
- Screenshot collection from Storybook
- Local console logs
- Filesystem scan results

**Session Data to Share**:

- Local token inventory (JSON)
- Screenshot URLs/paths
- Component analysis results
- Session ID for correlation

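The exact shape of the shared payload is up to the team; the following sketch shows what the Local Scout might attach when completing its task-queue entry. All field names and values are illustrative, not a DSS schema.

```python
# Illustrative payload the Local Scout could attach to its completed task.
# Field names are hypothetical; align them with whatever the Synthesizer expects.
local_scout_payload = {
    "session_id": "local-audit-2025-12-06T10-15",   # used later for correlation
    "environment": "local",
    "tokens": {                                      # extracted token inventory
        "color.primary": "#0055ff",
        "spacing.md": "16px",
    },
    "screenshots": [                                 # Storybook capture paths
        "screenshots/Button-primary.png",
        "screenshots/Card-default.png",
    ],
    "analysis": {                                    # component analysis results
        "components_scanned": 47,
        "hardcoded_values": 12,
    },
}
```
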
---

### Agent 3: STAGING INSPECTOR (Staging Environment Audit)

**Mode**: REMOTE

**Environment**: staging.dss.overbits.luz.uy

**Responsibilities**:

1. Audit staging deployment
2. Collect browser logs from staging users
3. Capture runtime screenshots
4. Check for console errors
5. Analyze network performance
6. Compare staging vs local tokens

**Tools Available**:

- `/dss-config set remote` - Switch to REMOTE mode
- `/dss-logs --session latest` - Get browser logs
- `/dss-screenshot .dashboard` - Server-side screenshots
- `/dss-errors --last 24h` - Error analysis
- `/dss-diagnostic --performance` - Performance metrics
- Remote API strategy

**Deliverables**:

- Browser log analysis (errors, warnings)
- Staging screenshot collection
- Network performance report
- Error rate trends
- DOM snapshot analysis (Shadow State)

**Session Data to Share**:

- Session ID(s) from staging users
- Error logs categorized by severity
- Performance metrics
- Screenshot comparison baseline

---

### Agent 4: PRODUCTION MONITOR (Production Environment Audit)

**Mode**: REMOTE

**Environment**: dss.overbits.luz.uy (production)

**Responsibilities**:

1. Monitor production health
2. Analyze real user browser logs
3. Identify critical errors
4. Compare production vs staging
5. Detect visual regressions
6. Benchmark performance

**Tools Available**:

- `/dss-config set remote` - Switch to REMOTE mode
- `/dss-logs --production --last 7d` - Week of logs
- `/dss-errors --severity critical` - Critical errors only
- `/dss-diagnostic --full` - Complete system diagnostic
- `/dss-screenshot --compare staging` - Visual regression
- Remote monitoring tools

**Deliverables**:

- Production health dashboard
- Critical error report
- Visual regression findings
- Performance comparison (staging vs production)
- User impact analysis

**Session Data to Share**:

- Production session IDs
- Critical error details
- Performance deltas
- Visual regression screenshots

---

### Agent 5: SYNTHESIZER (Cross-Environment Analysis)

**Mode**: Context-aware (analyzes data from all modes)

**Responsibilities**:

1. Aggregate findings from all agents
2. Identify cross-environment discrepancies
3. Correlate errors to root causes
4. Prioritize issues by impact
5. Generate comprehensive report
6. Create Q1 improvement roadmap

**Tools Available**:

- Task queue to fetch agent outputs
- Data correlation algorithms
- Report generation
- Prioritization framework

**Deliverables**:

- Executive Summary (1 page)
- Detailed Findings Report (by category)
- Cross-Environment Discrepancy Matrix
- Prioritized Issue Backlog
- Q1 Improvement Roadmap
- Visual Regression Gallery

---

## Challenge Phases

### Phase 1: Setup & Planning (30 minutes)

**Architect Agent Tasks**:

1. Review challenge requirements
2. Create audit checklist
3. Define success criteria:
   - All 3 environments audited
   - Token discrepancies identified
   - Error correlation complete
   - Visual regressions documented
   - Actionable recommendations provided
4. Create task queue entries for each agent
5. Share coordination plan

**All Agents**:

- Verify DSS PowerTools plugin installed
- Check mode capabilities: `/dss-status`
- Confirm network access to remote environments
- Review session ID strategy

---

### Phase 2: Parallel Environment Audits (2-3 hours)

**Local Scout** (LOCAL mode):

```bash
# Switch to LOCAL mode
/dss-config set local

# Verify capabilities
/dss-status

# Audit components
/dss-audit components/ --deep

# Extract tokens
/dss-extract src/tokens/ --output local-tokens.json

# Capture Storybook screenshots
/dss-screenshot Button --story primary,secondary,disabled
/dss-screenshot Card --story default,elevated
/dss-screenshot Input --story default,error,disabled

# Analyze patterns
/dss-analyze components/ --patterns tokens,hardcoded,inconsistencies

# Check local console
/dss-logs --local --last 1h

# Export results
/dss-export-session local-audit-{timestamp}
```

**Staging Inspector** (REMOTE mode):

```bash
# Switch to REMOTE mode
/dss-config set remote

# Configure remote URL
export DSS_REMOTE_URL=https://staging.dss.overbits.luz.uy

# Verify connection
/dss-status

# Get browser logs from staging
/dss-logs --session latest --limit 500

# Find errors
/dss-errors --last 24h --severity high,critical

# Capture runtime screenshots
/dss-screenshot .dashboard --fullpage
/dss-screenshot .component-library

# Performance check
/dss-diagnostic --performance --compare local

# Get DOM snapshots (Shadow State)
/dss-snapshot --latest

# Export results
/dss-export-session staging-audit-{timestamp}
```

**Production Monitor** (REMOTE mode):

```bash
# Switch to REMOTE mode
/dss-config set remote

# Configure production URL
export DSS_REMOTE_URL=https://dss.overbits.luz.uy

# Health check
/dss-health --detailed

# Get production logs (7 days)
/dss-logs --last 7d --limit 1000

# Critical errors only
/dss-errors --severity critical --grouped

# Performance benchmarks
/dss-diagnostic --performance --baseline staging

# Visual regression vs staging
/dss-screenshot .header --compare staging
/dss-screenshot .footer --compare staging

# Monitor metrics
/dss-metrics --uptime --error-rate --performance

# Export results
/dss-export-session prod-audit-{timestamp}
```

---

### Phase 3: Data Correlation (1 hour)

**Synthesizer Agent Tasks**:

1. **Fetch Agent Data**:

```python
# Via task queue
local_data = get_task_result("local-audit-{timestamp}")
staging_data = get_task_result("staging-audit-{timestamp}")
prod_data = get_task_result("prod-audit-{timestamp}")
```

2. **Token Consistency Analysis** (a sketch of `find_discrepancies` appears after these steps):

```python
# Compare token inventories
local_tokens = local_data['tokens']
staging_tokens = extract_tokens_from_logs(staging_data['logs'])
prod_tokens = extract_tokens_from_logs(prod_data['logs'])

discrepancies = find_discrepancies(local_tokens, staging_tokens, prod_tokens)
```

3. **Error Correlation**:

```python
# Group errors by root cause
staging_errors = staging_data['errors']
prod_errors = prod_data['errors']

correlated = correlate_errors(staging_errors, prod_errors)
root_causes = identify_root_causes(correlated)
```

4. **Visual Regression Analysis**:

```python
# Compare screenshots
local_screenshots = local_data['screenshots']
staging_screenshots = staging_data['screenshots']
prod_screenshots = prod_data['screenshots']

regressions = detect_visual_regressions(
    local_screenshots,
    staging_screenshots,
    prod_screenshots
)
```

5. **Performance Comparison** (a sketch of `calculate_performance_deltas` appears after these steps):

```python
# Benchmark analysis
perf_matrix = {
    'local': local_data['performance'],
    'staging': staging_data['performance'],
    'production': prod_data['performance']
}

deltas = calculate_performance_deltas(perf_matrix)
bottlenecks = identify_bottlenecks(deltas)
```

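The steps above lean on helper functions (`find_discrepancies`, `correlate_errors`, `detect_visual_regressions`, and so on) that the Synthesizer must supply. As one possible starting point, here is a sketch of the token comparison used in step 2, assuming each inventory is a flat `{token_name: value}` mapping; adapt it if your token export is nested.

```python
# Sketch of the token comparison in step 2. Assumes flat {token_name: value}
# mappings per environment; this is not a DSS-provided helper.
def find_discrepancies(local: dict, staging: dict, production: dict) -> list[dict]:
    environments = {"local": local, "staging": staging, "production": production}
    all_tokens = set().union(*environments.values())
    discrepancies = []
    for token in sorted(all_tokens):
        values = {env: inventory.get(token) for env, inventory in environments.items()}
        missing = [env for env, value in values.items() if value is None]
        distinct = {value for value in values.values() if value is not None}
        if missing or len(distinct) > 1:
            discrepancies.append({
                "token": token,       # e.g. "spacing.md"
                "values": values,     # value per environment (None = not defined)
                "missing_in": missing,
            })
    return discrepancies
```
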
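The performance comparison in step 5 can be equally simple: percentage deltas against the local baseline plus a threshold filter. A sketch follows; the metric names and the 15% threshold are illustrative assumptions, not DSS defaults.

```python
# Sketch for step 5: percentage deltas of staging/production metrics versus the
# local baseline, plus a naive bottleneck filter. Metric names are illustrative.
def calculate_performance_deltas(perf_matrix: dict) -> dict:
    baseline = perf_matrix["local"]
    deltas = {}
    for env in ("staging", "production"):
        deltas[env] = {
            metric: (perf_matrix[env][metric] - value) / value * 100
            for metric, value in baseline.items()
            if metric in perf_matrix[env] and value
        }
    return deltas

def identify_bottlenecks(deltas: dict, threshold_pct: float = 15.0) -> list[tuple]:
    """Return (environment, metric, delta_pct) entries exceeding the threshold."""
    return [
        (env, metric, round(delta, 1))
        for env, metrics in deltas.items()
        for metric, delta in metrics.items()
        if delta > threshold_pct
    ]

# Example: an 18% regression in production render time would be flagged here.
deltas = calculate_performance_deltas({
    "local": {"render_ms": 100},
    "staging": {"render_ms": 108},
    "production": {"render_ms": 118},
})
print(identify_bottlenecks(deltas))  # [('production', 'render_ms', 18.0)]
```
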
---

### Phase 4: Report Generation (1 hour)

**Synthesizer Deliverables**:

1. **Executive Summary** (1 page):

```markdown
# Design System Audit: Cross-Environment Analysis

## Key Findings
- 12 token discrepancies between local and production
- 5 critical errors affecting 23% of production users
- 3 visual regressions in staging not present locally
- 18% performance degradation in production vs local

## Top 3 Priorities
1. Fix critical ButtonGroup error (23% user impact)
2. Align spacing tokens across environments (12 discrepancies)
3. Optimize Card component rendering (18% perf impact)

## Next Steps
- See detailed findings in full report
- Q1 roadmap attached
```

2. **Detailed Findings Report**:
   - Token Discrepancies (by category: color, spacing, typography)
   - Error Analysis (by severity, frequency, impact)
   - Visual Regressions (with screenshot comparisons)
   - Performance Benchmarks (by component, environment)
   - Component Adoption Metrics

3. **Cross-Environment Discrepancy Matrix**:

   | Component | Local  | Staging | Production | Discrepancy      |
   |-----------|--------|---------|------------|------------------|
   | Button    | v2.1   | v2.1    | v2.0       | Version mismatch |
   | Card      | Tokens | Tokens  | Hardcoded  | Token usage      |
   | Input     | OK     | OK      | OK         | Consistent       |

4. **Prioritized Issue Backlog** (a scoring sketch appears after these deliverables):

```markdown
## P0 - Critical (Fix this sprint)
- [ ] ButtonGroup onClick error (23% users affected)
- [ ] Card spacing inconsistency (visual regression)

## P1 - High (Fix next sprint)
- [ ] Align spacing tokens (12 discrepancies)
- [ ] Optimize Card rendering (18% perf)
- [ ] Fix Input placeholder color mismatch

## P2 - Medium (Q1 roadmap)
- [ ] Standardize component props across environments
- [ ] Document token migration guide
- [ ] Add performance monitoring
```

5. **Q1 Improvement Roadmap**:
   - Month 1: Fix critical errors and visual regressions
   - Month 2: Token alignment and performance optimization
   - Month 3: Documentation and monitoring improvements

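To make the P0/P1/P2 buckets above reproducible, the Synthesizer can rank findings with a simple impact score. A minimal sketch follows; the severity weights and bucket thresholds are arbitrary assumptions for the team to tune.

```python
# Hypothetical scoring used to sort findings into the P0/P1/P2 backlog above.
# Severity weights and bucket thresholds are assumptions, not DSS defaults.
SEVERITY_WEIGHT = {"critical": 4, "high": 3, "medium": 2, "low": 1}

def priority_bucket(severity: str, users_affected_pct: float) -> str:
    score = SEVERITY_WEIGHT.get(severity, 1) * users_affected_pct
    if score >= 60:
        return "P0"
    if score >= 20:
        return "P1"
    return "P2"

# e.g. the ButtonGroup error: critical severity, 23% of users affected
print(priority_bucket("critical", 23))  # P0 (score 92)
print(priority_bucket("medium", 5))     # P2 (score 10)
```
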
---

## Success Criteria

### Technical Success

- [ ] All 3 environments successfully audited
- [ ] Mode switching worked seamlessly (LOCAL ↔ REMOTE)
- [ ] Session IDs enabled data correlation
- [ ] Browser logs captured from remote environments
- [ ] Screenshots compared across environments
- [ ] Performance metrics benchmarked

### Collaboration Success

- [ ] 5 agents coordinated via task queue
- [ ] Data shared between agents successfully
- [ ] No duplicate work across agents
- [ ] Findings synthesized coherently

### Business Success

- [ ] Actionable recommendations generated
- [ ] Issues prioritized by user impact
- [ ] Q1 roadmap created
- [ ] Executive summary delivered
- [ ] Visual evidence provided (screenshots)

---

## Bonus Challenges

1. **Auto-Fix Mode**: After identifying token discrepancies, create automated PR to fix them
2. **Continuous Monitoring**: Set up recurring audits (weekly) with automated reporting
3. **Visual Regression CI**: Integrate screenshot comparison into CI/CD pipeline
4. **Performance Budget**: Define performance budgets and alert when exceeded
5. **Multi-Project Audit**: Extend audit to multiple design system consumers

---

## Evaluation Rubric

| Category | Points | Criteria |
|----------|--------|----------|
| **Environment Coverage** | 20 | All 3 environments audited thoroughly |
| **Mode Mastery** | 20 | Correct LOCAL/REMOTE mode usage |
| **Data Quality** | 15 | Complete, accurate data collection |
| **Correlation Accuracy** | 15 | Correct cross-environment correlation |
| **Agent Collaboration** | 10 | Effective task queue coordination |
| **Report Quality** | 10 | Clear, actionable recommendations |
| **Prioritization** | 5 | Issues prioritized by impact |
| **Visual Evidence** | 5 | Screenshots and comparisons included |

**Total**: 100 points

**Passing Score**: 70 points

---

## Example Agent Coordination (Task Queue)

```python
# Architect creates coordination tasks
create_task({
    "title": "Local Environment Audit",
    "assigned_to": "local-scout",
    "description": "Audit local components and extract tokens",
    "priority": 1
})

create_task({
    "title": "Staging Environment Audit",
    "assigned_to": "staging-inspector",
    "description": "Audit staging deployment and browser logs",
    "priority": 1,
    "dependencies": []  # Can run in parallel
})

# Agents claim and complete tasks
claim_task("local-environment-audit")
start_task("local-environment-audit")

# ... perform audit ...

complete_task("local-environment-audit", result={
    "tokens": local_tokens,
    "screenshots": screenshot_paths,
    "session_id": "local-audit-xyz789"
})

# Synthesizer fetches results
local_results = get_task_result("local-environment-audit")
```

---

## Tips for Success

1. **Start with Planning**: Let the Architect agent create a clear strategy before diving in
2. **Use Session IDs**: Correlation depends on tracking session IDs across agents (see the grouping sketch after these tips)
3. **Parallel Execution**: Run the Local Scout, Staging Inspector, and Production Monitor in parallel
4. **Share Data Early**: Don't wait until the end to share findings
5. **Visual Evidence**: Screenshots are crucial for demonstrating issues
6. **Prioritize by Impact**: Not all issues are equal; focus on user impact
7. **Test Mode Switching**: Verify LOCAL/REMOTE modes work before starting
8. **Document Assumptions**: Make assumptions explicit in reports

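As a small illustration of tip 2, correlation can start with nothing more than grouping every agent's records by session ID before comparing them. The record shapes below are purely illustrative.

```python
# Illustration for tip 2: merge records from all agents keyed by session_id so
# the Synthesizer can line up staging and production data for the same session.
from collections import defaultdict

def group_by_session(*record_lists: list[dict]) -> dict[str, list[dict]]:
    grouped = defaultdict(list)
    for records in record_lists:
        for record in records:
            grouped[record["session_id"]].append(record)
    return dict(grouped)

merged = group_by_session(
    [{"session_id": "abc123", "source": "staging", "errors": 2}],
    [{"session_id": "abc123", "source": "production", "errors": 5}],
)
print(merged["abc123"])  # both environments' records for the same session
```
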
---

## Stretch Goals

1. **Real-Time Dashboard**: Create live dashboard showing audit progress
2. **Automated Fixes**: Generate PRs to fix identified issues
3. **Slack Integration**: Post findings to team Slack channel
4. **Trend Analysis**: Compare current audit to previous audits (3 months ago)
5. **Cost Analysis**: Estimate development cost to fix each issue

---

## Challenge Completion

When all agents have completed their work and the Synthesizer has generated the final report, submit the following:

1. Complete audit findings report
2. Cross-environment discrepancy matrix
3. Prioritized issue backlog
4. Q1 improvement roadmap
5. Screenshot gallery (regressions)
6. Agent coordination logs (task queue)

**Good luck, agents! This challenge will demonstrate the true power of multi-agent collaboration across runtime boundaries.**

---

**Challenge Created**: 2025-12-06

**Difficulty**: EXPERT

**Estimated Duration**: 4-6 hours

**Success Rate**: TBD (first run)