---
applyTo: '**'
---

# Grafana Alloy Configuration Project

## Project Overview

This project contains Grafana Alloy configurations for monitoring infrastructure across multiple environments. The architecture supports dynamic service discovery with environment-specific labeling and centralized metrics collection.

## Directory Structure

### Environment Organization

```
/
├── README.md
├── .github/
│   └── instructions/
├── {Environment}/
│   └── {environment}.alloy
```

### Environment Naming

- Use descriptive directory names (e.g., `OpenWRT/`, `Production/`, `Staging/`)
- One `.alloy` file per environment/deployment target
- File names should match environment purpose (e.g., `openwrt.alloy`, `production.alloy`)

## Monitoring Domains

### Current Implementations

- **Storage Monitoring**: Ceph cluster monitoring with dynamic discovery
- **Network Infrastructure**: OpenWRT-based network monitoring

### Adding New Domains

When expanding to new monitoring domains:

1. Create environment-specific directory if needed
2. Follow the established discovery → relabel → scrape pipeline
3. Ensure proper integration with existing remote write configuration
4. Add appropriate documentation

## Environment Configuration Strategy

### Multi-Environment Support

- Each environment has isolated configuration files
- Environment-specific service discovery endpoints
- Consistent labeling strategy across environments
- Centralized metrics collection in Grafana Cloud

### Service Discovery Integration

- HTTP-based service discovery for dynamic target discovery
- Standardized SD endpoint patterns across environments
- Port 8765 as standard for HTTP SD services

## Development Workflow

### Making Changes

1. **Identify Target Environment**: Determine which environment(s) need updates
2. **Follow Patterns**: Use existing configurations as templates
3. **Test Locally**: Validate Alloy syntax before deployment
4. **Document Changes**: Update README or comments as needed

### Adding New Services

1. **Service Discovery Setup**: Ensure HTTP SD endpoint exists
2. **Configuration Creation**: Follow 3-stage pipeline pattern
3. **Environment Labeling**: Add appropriate cluster/environment labels
4. **Integration Testing**: Verify metrics flow to Grafana Cloud

### Code Review Guidelines

- Verify naming conventions are followed
- Check for hardcoded secrets (should use environment variables)
- Ensure proper service discovery patterns
- Validate remote write configuration

## Security Considerations

### Secrets Management

- Never commit API keys or passwords to repository
- Use environment variables for all sensitive data
- Follow principle of least privilege for API access

### Network Security

- Service discovery endpoints should be on trusted networks
- Consider firewall rules for Alloy agents
- Use HTTPS where possible for external endpoints

## Integration Points

### Grafana Cloud

- **Metrics Storage**: Prometheus-compatible remote write
- **Authentication**: Instance ID + API key
- **Endpoint**: Fixed Grafana Cloud Prometheus URL

### Service Discovery

- **Protocol**: HTTP-based service discovery
- **Format**: Prometheus SD compatible JSON
- **Endpoints**: Environment-specific discovery services

### Monitoring Targets

- **Exporters**: Various Prometheus exporters (Ceph, Node, etc.)
- **Discovery**: Dynamic target discovery via HTTP SD
- **Labeling**: Environment and cluster-specific labels

## Documentation Standards

### File Documentation

- Each `.alloy` file should have header comments explaining purpose
- Complex configurations need inline comments
- Environment-specific notes in README sections

### Change Documentation

- Update README when adding new environments
- Document new service integrations
- Note any breaking changes or migration requirements

## Troubleshooting

### Common Issues

- **Service Discovery Failures**: Check HTTP SD endpoint availability
- **Authentication Errors**: Verify environment variables are set
- **Missing Metrics**: Confirm scrape job configuration and forwarding

### Debug Strategies

- Use Alloy's built-in debugging and logging
- Verify service discovery target resolution
- Check Grafana Cloud metrics ingestion
- Validate network connectivity to all endpoints
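
## Example Pipeline Sketch

The discovery → relabel → scrape pipeline and the remote write integration described in this document could look roughly like the following. This is a minimal sketch under stated assumptions, not a copy of any configuration in this repository: the component labels (`example`, `grafana_cloud`), the SD URL and path, the label values, and the environment variable names are all hypothetical placeholders. It also assumes a recent Alloy release that provides `sys.env`; older releases may need `env(...)` instead.

```alloy
// Hypothetical example: pipeline for one environment.
// All names, URLs, and label values below are placeholders.

// Stage 1: discover targets from an HTTP SD endpoint (port 8765 per project convention).
discovery.http "example" {
  url              = "http://sd.example.internal:8765/targets" // assumed host and path
  refresh_interval = "60s"
}

// Stage 2: attach environment and cluster labels to every discovered target.
discovery.relabel "example" {
  targets = discovery.http.example.targets

  rule {
    target_label = "environment"
    replacement  = "example-env" // placeholder value
  }

  rule {
    target_label = "cluster"
    replacement  = "example-cluster" // placeholder value
  }
}

// Stage 3: scrape the relabeled targets and forward samples to remote write.
prometheus.scrape "example" {
  targets    = discovery.relabel.example.output
  forward_to = [prometheus.remote_write.grafana_cloud.receiver]
}

// Remote write to Grafana Cloud; credentials come from environment variables,
// never from values committed to the repository.
prometheus.remote_write "grafana_cloud" {
  endpoint {
    url = sys.env("GRAFANA_CLOUD_PROMETHEUS_URL") // assumed variable name

    basic_auth {
      username = sys.env("GRAFANA_CLOUD_INSTANCE_ID") // assumed variable name
      password = sys.env("GRAFANA_CLOUD_API_KEY")     // assumed variable name
    }
  }
}
```

When metrics are missing, this layout gives three places to check in order: the targets exported by `discovery.http`, the labels on the `discovery.relabel` output, and the `forward_to` reference on `prometheus.scrape`.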