Add Grafana Alloy configuration files and update examples

- Introduced detailed configuration guidelines in alloy.instructions.md
- Added general instructions for project structure in general.instructions.md
- Created config.yaml for NAMU-PC with target hostnames
- Implemented example.alloy and openwrt.alloy for service discovery and scraping
- Added alloy_seed.json for initial configuration state
- Developed demo.alloy for comprehensive monitoring setup
- Established std.alloy for repository path formatting and host configuration loading
- Updated test.alloy to utilize new host configuration loading
Commit fa57353a6e (parent ad77d5808a), 2025-08-01 16:25:31 +03:00
9 changed files with 621 additions and 0 deletions

alloy.instructions.md (new file, +114 lines)
---
applyTo: '**/*.alloy'
---
# Grafana Alloy Configuration Guidelines
## Component Naming Conventions
### Discovery Components
- HTTP discovery: `{service}_exporter_dynamic`
- Relabel discovery: `{service}_exporter_with_cluster`
### Scrape Jobs
- Component name: `metrics_integrations_integrations_{service}`
- Job name: `integrations/{service}`
### Remote Write
- Component name: `metrics_service`
- Always forward to Grafana Cloud endpoint
## Configuration Pipeline Pattern
Every monitoring target must follow this 3-stage pipeline:
```alloy
// 1. Discovery Stage
discovery.http "{service}_exporter_dynamic" {
	url = "http://{service}-{env}.{domain}:8765/sd/prometheus/sd-config?service={service}-exporter"
}

// 2. Labeling Stage
discovery.relabel "{service}_exporter_with_cluster" {
	targets = discovery.http.{service}_exporter_dynamic.targets

	rule {
		target_label = "{service}_cluster" // or appropriate cluster label
		replacement  = "{environment_id}" // hardcoded environment identifier
	}
}

// 3. Scraping Stage
prometheus.scrape "metrics_integrations_integrations_{service}" {
	targets    = discovery.relabel.{service}_exporter_with_cluster.output
	forward_to = [prometheus.remote_write.metrics_service.receiver]
	job_name   = "integrations/{service}"
}
```
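For example, instantiating the pipeline for Ceph monitoring in the "gr7" environment (the domain `example.internal` is a placeholder, not a real endpoint) would look like:

```alloy
// Discover Ceph exporter targets via the environment's HTTP SD endpoint
discovery.http "ceph_exporter_dynamic" {
	url = "http://ceph-gr7.example.internal:8765/sd/prometheus/sd-config?service=ceph-exporter"
}

// Tag every target with its cluster identifier
discovery.relabel "ceph_exporter_with_cluster" {
	targets = discovery.http.ceph_exporter_dynamic.targets

	rule {
		target_label = "ceph_cluster"
		replacement  = "gr7"
	}
}

// Scrape the labeled targets and forward to Grafana Cloud
prometheus.scrape "metrics_integrations_integrations_ceph" {
	targets    = discovery.relabel.ceph_exporter_with_cluster.output
	forward_to = [prometheus.remote_write.metrics_service.receiver]
	job_name   = "integrations/ceph"
}
```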
## Environment Configuration
### Service Discovery URLs
- Pattern: `http://{service}-{env}.{domain}:8765/sd/prometheus/sd-config?service={service}-exporter`
- Port 8765 is standard for HTTP SD endpoints
- Always use query parameter `service={service}-exporter`
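A discovery endpoint following this pattern is expected to return standard Prometheus HTTP SD JSON: an array of target groups, each with a `targets` list and a `labels` map. A hypothetical response for a ceph-exporter service (addresses and port are illustrative) might look like:

```json
[
  {
    "targets": ["10.0.0.12:9283", "10.0.0.13:9283"],
    "labels": {
      "service": "ceph-exporter"
    }
  }
]
```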
### Authentication & Secrets
- Use `sys.env("VARIABLE_NAME")` for sensitive data
- Standard variables:
- `GCLOUD_RW_API_KEY` for Grafana Cloud API key
- Never hardcode passwords or API keys
### Cluster Labeling
- Always add cluster/environment labels via `discovery.relabel`
- Use descriptive cluster names (e.g., "gr7", "prod", "staging")
- Cluster labels help with multi-environment visibility
## Remote Write Configuration
Standard remote write configuration:
```alloy
prometheus.remote_write "metrics_service" {
	endpoint {
		url = "https://prometheus-prod-24-prod-eu-west-2.grafana.net/api/prom/push"

		basic_auth {
			username = "1257735" // Grafana instance ID
			password = sys.env("GCLOUD_RW_API_KEY")
		}
	}
}
```
## Code Style Guidelines
### Comments
- Add descriptive comments above each component
- Explain the purpose, not the syntax
- Use format: `// {Action description}`
### Formatting
- Use tabs for indentation
- One empty line between components
- Align parameters vertically when reasonable
### Component References
- Always reference by full component path: `discovery.http.component_name.targets`
- Use descriptive variable names in targets/forward_to chains
## Error Prevention
### Common Mistakes to Avoid
- Don't hardcode service discovery URLs without environment variables
- Don't skip the relabel stage - always add cluster labels
- Don't use generic job names - follow `integrations/{service}` pattern
- Don't forget to forward metrics to remote write endpoint
### Validation Checklist
- [ ] Service discovery URL uses correct pattern
- [ ] Relabel adds appropriate cluster/environment labels
- [ ] Scrape job follows naming convention
- [ ] Metrics are forwarded to remote write
- [ ] No hardcoded secrets
- [ ] Comments explain component purpose

general.instructions.md (new file, +126 lines)
---
applyTo: '**'
---
# Grafana Alloy Configuration Project
## Project Overview
This project contains Grafana Alloy configurations for monitoring infrastructure across multiple environments. The architecture supports dynamic service discovery with environment-specific labeling and centralized metrics collection.
## Directory Structure
### Environment Organization
```
/
├── README.md
├── .github/
│ └── instructions/
├── {Environment}/
│ └── {environment}.alloy
```
### Environment Naming
- Use descriptive directory names (e.g., `OpenWRT/`, `Production/`, `Staging/`)
- One `.alloy` file per environment/deployment target
- File names should match environment purpose (e.g., `openwrt.alloy`, `production.alloy`)
## Monitoring Domains
### Current Implementations
- **Storage Monitoring**: Ceph cluster monitoring with dynamic discovery
- **Network Infrastructure**: OpenWRT-based network monitoring
### Adding New Domains
When expanding to new monitoring domains:
1. Create environment-specific directory if needed
2. Follow the established discovery → relabel → scrape pipeline
3. Ensure proper integration with existing remote write configuration
4. Add appropriate documentation
## Environment Configuration Strategy
### Multi-Environment Support
- Each environment has isolated configuration files
- Environment-specific service discovery endpoints
- Consistent labeling strategy across environments
- Centralized metrics collection in Grafana Cloud
### Service Discovery Integration
- HTTP-based service discovery for dynamic target discovery
- Standardized SD endpoint patterns across environments
- Port 8765 as standard for HTTP SD services
## Development Workflow
### Making Changes
1. **Identify Target Environment**: Determine which environment(s) need updates
2. **Follow Patterns**: Use existing configurations as templates
3. **Test Locally**: Validate Alloy syntax before deployment
4. **Document Changes**: Update README or comments as needed
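A minimal local validation pass, assuming the `alloy` CLI is installed, could use the formatter, which fails on syntax errors (the file path is an example):

```shell
# Parse and format the configuration; a non-zero exit indicates a syntax error
alloy fmt --write OpenWRT/openwrt.alloy
```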
### Adding New Services
1. **Service Discovery Setup**: Ensure HTTP SD endpoint exists
2. **Configuration Creation**: Follow 3-stage pipeline pattern
3. **Environment Labeling**: Add appropriate cluster/environment labels
4. **Integration Testing**: Verify metrics flow to Grafana Cloud
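The steps above can be sketched for a hypothetical node-exporter service in the "gr7" environment (the `example.internal` domain is a placeholder):

```alloy
// 1. HTTP SD endpoint for the new service
discovery.http "node_exporter_dynamic" {
	url = "http://node-gr7.example.internal:8765/sd/prometheus/sd-config?service=node-exporter"
}

// 2. Environment labeling
discovery.relabel "node_exporter_with_cluster" {
	targets = discovery.http.node_exporter_dynamic.targets

	rule {
		target_label = "cluster"
		replacement  = "gr7"
	}
}

// 3. Scrape and forward to the existing remote write component
prometheus.scrape "metrics_integrations_integrations_node" {
	targets    = discovery.relabel.node_exporter_with_cluster.output
	forward_to = [prometheus.remote_write.metrics_service.receiver]
	job_name   = "integrations/node"
}
```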
### Code Review Guidelines
- Verify naming conventions are followed
- Check for hardcoded secrets (should use environment variables)
- Ensure proper service discovery patterns
- Validate remote write configuration
## Security Considerations
### Secrets Management
- Never commit API keys or passwords to repository
- Use environment variables for all sensitive data
- Follow principle of least privilege for API access
### Network Security
- Service discovery endpoints should be on trusted networks
- Consider firewall rules for Alloy agents
- Use HTTPS where possible for external endpoints
## Integration Points
### Grafana Cloud
- **Metrics Storage**: Prometheus-compatible remote write
- **Authentication**: Instance ID + API key
- **Endpoint**: Fixed Grafana Cloud Prometheus URL
### Service Discovery
- **Protocol**: HTTP-based service discovery
- **Format**: Prometheus SD compatible JSON
- **Endpoints**: Environment-specific discovery services
### Monitoring Targets
- **Exporters**: Various Prometheus exporters (Ceph, Node, etc.)
- **Discovery**: Dynamic target discovery via HTTP SD
- **Labeling**: Environment and cluster-specific labels
## Documentation Standards
### File Documentation
- Each `.alloy` file should have header comments explaining purpose
- Complex configurations need inline comments
- Environment-specific notes in README sections
### Change Documentation
- Update README when adding new environments
- Document new service integrations
- Note any breaking changes or migration requirements
## Troubleshooting
### Common Issues
- **Service Discovery Failures**: Check HTTP SD endpoint availability
- **Authentication Errors**: Verify environment variables are set
- **Missing Metrics**: Confirm scrape job configuration and forwarding
### Debug Strategies
- Use Alloy's built-in debugging and logging
- Verify service discovery target resolution
- Check Grafana Cloud metrics ingestion
- Validate network connectivity to all endpoints
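One way to apply these strategies, assuming the `alloy` CLI is installed and the default listen address is acceptable, is to run the agent with its built-in HTTP server and inspect component state in the UI (file path is an example):

```shell
# Run the configuration with the debugging UI enabled
alloy run OpenWRT/openwrt.alloy --server.http.listen-addr=127.0.0.1:12345
# Then open http://127.0.0.1:12345 to inspect component health,
# discovered targets, and scrape status
```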