Add Grafana Alloy configuration files and update examples
- Introduced detailed configuration guidelines in alloy.instructions.md
- Added general instructions for project structure in general.instructions.md
- Created config.yaml for NAMU-PC with target hostnames
- Implemented example.alloy and openwrt.alloy for service discovery and scraping
- Added alloy_seed.json for initial configuration state
- Developed demo.alloy for comprehensive monitoring setup
- Established std.alloy for repository path formatting and host configuration loading
- Updated test.alloy to utilize new host configuration loading
114
.github/instructions/alloy.instructions.md
vendored
Normal file
@@ -0,0 +1,114 @@
---
applyTo: '**/*.alloy'
---

# Grafana Alloy Configuration Guidelines

## Component Naming Conventions

### Discovery Components
- HTTP discovery: `{service}_exporter_dynamic`
- Relabel discovery: `{service}_exporter_with_cluster`

### Scrape Jobs
- Component name: `metrics_integrations_integrations_{service}`
- Job name: `integrations/{service}`

### Remote Write
- Component name: `metrics_service`
- Always forward to Grafana Cloud endpoint

## Configuration Pipeline Pattern

Every monitoring target must follow this 3-stage pipeline:

```alloy
// 1. Discovery Stage
discovery.http "{service}_exporter_dynamic" {
	url = "http://{service}-{env}.{domain}:8765/sd/prometheus/sd-config?service={service}-exporter"
}

// 2. Labeling Stage
discovery.relabel "{service}_exporter_with_cluster" {
	targets = discovery.http.{service}_exporter_dynamic.targets

	rule {
		target_label = "{service}_cluster" // or appropriate cluster label
		replacement  = "{environment_id}"  // hardcoded environment identifier
	}
}

// 3. Scraping Stage
prometheus.scrape "metrics_integrations_integrations_{service}" {
	targets    = discovery.relabel.{service}_exporter_with_cluster.output
	forward_to = [prometheus.remote_write.metrics_service.receiver]
	job_name   = "integrations/{service}"
}
```
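
As a concrete illustration of the template above, here is a minimal sketch with the placeholders filled in for a Ceph exporter. The hostname `ceph-prod.example.internal` and the cluster value `gr7` are assumptions chosen for the example, not values from a real deployment. The `metrics_service` remote write component is assumed to be defined elsewhere in the same file.

```alloy
// Discover Ceph exporter targets from the HTTP SD endpoint.
// The host below is a hypothetical placeholder.
discovery.http "ceph_exporter_dynamic" {
	url = "http://ceph-prod.example.internal:8765/sd/prometheus/sd-config?service=ceph-exporter"
}

// Attach a static cluster label to every discovered target.
// "gr7" is an example cluster name.
discovery.relabel "ceph_exporter_with_cluster" {
	targets = discovery.http.ceph_exporter_dynamic.targets

	rule {
		target_label = "ceph_cluster"
		replacement  = "gr7"
	}
}

// Scrape the labeled targets and forward samples to remote write.
// Assumes prometheus.remote_write "metrics_service" exists in this file.
prometheus.scrape "metrics_integrations_integrations_ceph" {
	targets    = discovery.relabel.ceph_exporter_with_cluster.output
	forward_to = [prometheus.remote_write.metrics_service.receiver]
	job_name   = "integrations/ceph"
}
```

Note that `{service}` is only a template placeholder: in a real file each component label has to be a plain identifier such as `ceph_exporter_dynamic`.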

## Environment Configuration

### Service Discovery URLs
- Pattern: `http://{service}-{env}.{domain}:8765/sd/prometheus/sd-config?service={service}-exporter`
- Port 8765 is standard for HTTP SD endpoints
- Always use query parameter `service={service}-exporter`

### Authentication & Secrets
- Use `sys.env("VARIABLE_NAME")` for sensitive data
- Standard variables:
  - `GCLOUD_RW_API_KEY` for Grafana Cloud API key
- Never hardcode passwords or API keys

### Cluster Labeling
- Always add cluster/environment labels via `discovery.relabel`
- Use descriptive cluster names (e.g., "gr7", "prod", "staging")
- Cluster labels help with multi-environment visibility
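
A relabel stage can carry more than one static label. The sketch below is illustrative only: the inline target address, the `environment` label name, and the values `gr7` and `prod` are assumptions, used here so the fragment can be read without a discovery component.

```alloy
// Add cluster and environment labels to a fixed, hypothetical target.
discovery.relabel "ceph_exporter_with_cluster" {
	// Inline target list used only to keep the example short;
	// normally this would be the output of a discovery.http component.
	targets = [{"__address__" = "10.0.0.11:9283"}]

	// Cluster identifier for this deployment (example value).
	rule {
		target_label = "ceph_cluster"
		replacement  = "gr7"
	}

	// Optional second label distinguishing environments (assumed name).
	rule {
		target_label = "environment"
		replacement  = "prod"
	}
}
```

Because neither rule sets `source_labels` or `action`, each one falls back to the default replace behavior and simply writes the `replacement` string into `target_label`.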

## Remote Write Configuration

Standard remote write configuration:

```alloy
prometheus.remote_write "metrics_service" {
	endpoint {
		url = "https://prometheus-prod-24-prod-eu-west-2.grafana.net/api/prom/push"

		basic_auth {
			username = "1257735" // Grafana instance ID
			password = sys.env("GCLOUD_RW_API_KEY")
		}
	}
}
```

## Code Style Guidelines

### Comments
- Add descriptive comments above each component
- Explain the purpose, not the syntax
- Use format: `// {Action description}`

### Formatting
- Use tabs for indentation
- One empty line between components
- Align parameters vertically when reasonable

### Component References
- Always reference by full component path: `discovery.http.component_name.targets`
- Use descriptive variable names in targets/forward_to chains

## Error Prevention

### Common Mistakes to Avoid
- Don't hardcode service discovery URLs without environment variables
- Don't skip the relabel stage - always add cluster labels
- Don't use generic job names - follow `integrations/{service}` pattern
- Don't forget to forward metrics to remote write endpoint
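
One possible way to keep the service discovery host out of the file is to read it from the environment and build the URL with string concatenation. This is a sketch under assumptions: `SD_HOST` is a hypothetical variable name that does not appear elsewhere in these guidelines, and `ceph` stands in for `{service}`.

```alloy
// Build the SD URL from an environment variable instead of a literal host.
// SD_HOST is a hypothetical variable, e.g. "ceph-prod.example.internal".
discovery.http "ceph_exporter_dynamic" {
	url = "http://" + sys.env("SD_HOST") + ":8765/sd/prometheus/sd-config?service=ceph-exporter"
}
```

The port and query parameter stay fixed because the guidelines treat them as standard. Only the environment-specific host varies.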

### Validation Checklist
- [ ] Service discovery URL uses correct pattern
- [ ] Relabel adds appropriate cluster/environment labels
- [ ] Scrape job follows naming convention
- [ ] Metrics are forwarded to remote write
- [ ] No hardcoded secrets
- [ ] Comments explain component purpose
126
.github/instructions/general.instructions.md
vendored
Normal file
@@ -0,0 +1,126 @@
---
applyTo: '**'
---

# Grafana Alloy Configuration Project

## Project Overview

This project contains Grafana Alloy configurations for monitoring infrastructure across multiple environments. The architecture supports dynamic service discovery with environment-specific labeling and centralized metrics collection.

## Directory Structure

### Environment Organization
```
/
├── README.md
├── .github/
│   └── instructions/
├── {Environment}/
│   └── {environment}.alloy
```

### Environment Naming
- Use descriptive directory names (e.g., `OpenWRT/`, `Production/`, `Staging/`)
- One `.alloy` file per environment/deployment target
- File names should match environment purpose (e.g., `openwrt.alloy`, `production.alloy`)

## Monitoring Domains

### Current Implementations
- **Storage Monitoring**: Ceph cluster monitoring with dynamic discovery
- **Network Infrastructure**: OpenWRT-based network monitoring

### Adding New Domains
When expanding to new monitoring domains:
1. Create environment-specific directory if needed
2. Follow the established discovery → relabel → scrape pipeline
3. Ensure proper integration with existing remote write configuration
4. Add appropriate documentation

## Environment Configuration Strategy

### Multi-Environment Support
- Each environment has isolated configuration files
- Environment-specific service discovery endpoints
- Consistent labeling strategy across environments
- Centralized metrics collection in Grafana Cloud

### Service Discovery Integration
- HTTP-based service discovery for dynamic target discovery
- Standardized SD endpoint patterns across environments
- Port 8765 as standard for HTTP SD services

## Development Workflow

### Making Changes
1. **Identify Target Environment**: Determine which environment(s) need updates
2. **Follow Patterns**: Use existing configurations as templates
3. **Test Locally**: Validate Alloy syntax before deployment
4. **Document Changes**: Update README or comments as needed

### Adding New Services
1. **Service Discovery Setup**: Ensure HTTP SD endpoint exists
2. **Configuration Creation**: Follow 3-stage pipeline pattern
3. **Environment Labeling**: Add appropriate cluster/environment labels
4. **Integration Testing**: Verify metrics flow to Grafana Cloud

### Code Review Guidelines
- Verify naming conventions are followed
- Check for hardcoded secrets (should use environment variables)
- Ensure proper service discovery patterns
- Validate remote write configuration

## Security Considerations

### Secrets Management
- Never commit API keys or passwords to repository
- Use environment variables for all sensitive data
- Follow principle of least privilege for API access

### Network Security
- Service discovery endpoints should be on trusted networks
- Consider firewall rules for Alloy agents
- Use HTTPS where possible for external endpoints

## Integration Points

### Grafana Cloud
- **Metrics Storage**: Prometheus-compatible remote write
- **Authentication**: Instance ID + API key
- **Endpoint**: Fixed Grafana Cloud Prometheus URL

### Service Discovery
- **Protocol**: HTTP-based service discovery
- **Format**: Prometheus SD compatible JSON
- **Endpoints**: Environment-specific discovery services
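
To make the expected format more tangible, the sketch below shows a discovery component next to a sample of the JSON a Prometheus-compatible HTTP SD endpoint returns. The host, target addresses, and the `instance_role` label are made up for illustration. The `refresh_interval` value is likewise just an example setting.

```alloy
// The SD endpoint is expected to return a JSON array of target groups
// in the Prometheus HTTP SD format, for example (hypothetical values):
//
//   [
//     {
//       "targets": ["10.0.0.11:9283", "10.0.0.12:9283"],
//       "labels": {"instance_role": "mgr"}
//     }
//   ]
//
// Each entry in "targets" becomes one scrape target, and the "labels"
// map is attached to every target in that group.
discovery.http "ceph_exporter_dynamic" {
	url              = "http://ceph-prod.example.internal:8765/sd/prometheus/sd-config?service=ceph-exporter"
	refresh_interval = "60s" // how often the endpoint is re-queried
}
```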

### Monitoring Targets
- **Exporters**: Various Prometheus exporters (Ceph, Node, etc.)
- **Discovery**: Dynamic target discovery via HTTP SD
- **Labeling**: Environment and cluster-specific labels

## Documentation Standards

### File Documentation
- Each `.alloy` file should have header comments explaining purpose
- Complex configurations need inline comments
- Environment-specific notes in README sections

### Change Documentation
- Update README when adding new environments
- Document new service integrations
- Note any breaking changes or migration requirements

## Troubleshooting

### Common Issues
- **Service Discovery Failures**: Check HTTP SD endpoint availability
- **Authentication Errors**: Verify environment variables are set
- **Missing Metrics**: Confirm scrape job configuration and forwarding

### Debug Strategies
- Use Alloy's built-in debugging and logging
- Verify service discovery target resolution
- Check Grafana Cloud metrics ingestion
- Validate network connectivity to all endpoints
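
As one example of the logging strategy, a top-level `logging` block can temporarily raise verbosity while investigating discovery or scrape problems. This is a minimal sketch; whether debug logging is appropriate for a given environment is an assumption left to the operator.

```alloy
// Temporarily raise log verbosity while troubleshooting.
// Revert to "info" (or remove the block) once the issue is resolved.
logging {
	level  = "debug"
	format = "logfmt"
}
```

Debug output is considerably noisier than the default level, so this setting is best kept out of long-running production configurations.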