Add Grafana Alloy configuration files and update examples

- Introduced detailed configuration guidelines in alloy.instructions.md
- Added general instructions for project structure in general.instructions.md
- Created config.yaml for NAMU-PC with target hostnames
- Implemented example.alloy and openwrt.alloy for service discovery and scraping
- Added alloy_seed.json for initial configuration state
- Developed demo.alloy for comprehensive monitoring setup
- Established std.alloy for repository path formatting and host configuration loading
- Updated test.alloy to utilize new host configuration loading
Commit fa57353a6e (parent ad77d5808a), 2025-08-01 16:25:31 +03:00
9 changed files with 621 additions and 0 deletions

alloy.instructions.md (new file, +114 lines)
---
applyTo: '**/*.alloy'
---
# Grafana Alloy Configuration Guidelines
## Component Naming Conventions
### Discovery Components
- HTTP discovery: `{service}_exporter_dynamic`
- Relabel discovery: `{service}_exporter_with_cluster`
### Scrape Jobs
- Component name: `metrics_integrations_integrations_{service}`
- Job name: `integrations/{service}`
### Remote Write
- Component name: `metrics_service`
- Always forward to Grafana Cloud endpoint
## Configuration Pipeline Pattern
Every monitoring target must follow this 3-stage pipeline:
```alloy
// 1. Discovery Stage
discovery.http "{service}_exporter_dynamic" {
	url = "http://{service}-{env}.{domain}:8765/sd/prometheus/sd-config?service={service}-exporter"
}

// 2. Labeling Stage
discovery.relabel "{service}_exporter_with_cluster" {
	targets = discovery.http.{service}_exporter_dynamic.targets

	rule {
		target_label = "{service}_cluster" // or appropriate cluster label
		replacement  = "{environment_id}" // hardcoded environment identifier
	}
}

// 3. Scraping Stage
prometheus.scrape "metrics_integrations_integrations_{service}" {
	targets    = discovery.relabel.{service}_exporter_with_cluster.output
	forward_to = [prometheus.remote_write.metrics_service.receiver]
	job_name   = "integrations/{service}"
}
```
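For example, instantiating the pipeline for Ceph monitoring in the "gr7" environment (the domain `example.internal` is a placeholder, not a real endpoint) would look like:

```alloy
// Discover Ceph exporter targets via the environment's HTTP SD endpoint
discovery.http "ceph_exporter_dynamic" {
	url = "http://ceph-gr7.example.internal:8765/sd/prometheus/sd-config?service=ceph-exporter"
}

// Tag every target with its cluster identifier
discovery.relabel "ceph_exporter_with_cluster" {
	targets = discovery.http.ceph_exporter_dynamic.targets

	rule {
		target_label = "ceph_cluster"
		replacement  = "gr7"
	}
}

// Scrape the labeled targets and forward to Grafana Cloud
prometheus.scrape "metrics_integrations_integrations_ceph" {
	targets    = discovery.relabel.ceph_exporter_with_cluster.output
	forward_to = [prometheus.remote_write.metrics_service.receiver]
	job_name   = "integrations/ceph"
}
```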
## Environment Configuration
### Service Discovery URLs
- Pattern: `http://{service}-{env}.{domain}:8765/sd/prometheus/sd-config?service={service}-exporter`
- Port 8765 is standard for HTTP SD endpoints
- Always use query parameter `service={service}-exporter`
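A discovery endpoint following this pattern is expected to return standard Prometheus HTTP SD JSON: an array of target groups, each with a `targets` list and a `labels` map. A hypothetical response for a ceph-exporter service (addresses and port are illustrative) might look like:

```json
[
  {
    "targets": ["10.0.0.12:9283", "10.0.0.13:9283"],
    "labels": {
      "service": "ceph-exporter"
    }
  }
]
```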
### Authentication & Secrets
- Use `sys.env("VARIABLE_NAME")` for sensitive data
- Standard variables:
- `GCLOUD_RW_API_KEY` for Grafana Cloud API key
- Never hardcode passwords or API keys
### Cluster Labeling
- Always add cluster/environment labels via `discovery.relabel`
- Use descriptive cluster names (e.g., "gr7", "prod", "staging")
- Cluster labels help with multi-environment visibility
## Remote Write Configuration
Standard remote write configuration:
```alloy
prometheus.remote_write "metrics_service" {
	endpoint {
		url = "https://prometheus-prod-24-prod-eu-west-2.grafana.net/api/prom/push"

		basic_auth {
			username = "1257735" // Grafana instance ID
			password = sys.env("GCLOUD_RW_API_KEY")
		}
	}
}
```
## Code Style Guidelines
### Comments
- Add descriptive comments above each component
- Explain the purpose, not the syntax
- Use format: `// {Action description}`
### Formatting
- Use tabs for indentation
- One empty line between components
- Align parameters vertically when reasonable
### Component References
- Always reference by full component path: `discovery.http.component_name.targets`
- Use descriptive variable names in targets/forward_to chains
## Error Prevention
### Common Mistakes to Avoid
- Don't hardcode service discovery URLs without environment variables
- Don't skip the relabel stage - always add cluster labels
- Don't use generic job names - follow `integrations/{service}` pattern
- Don't forget to forward metrics to remote write endpoint
### Validation Checklist
- [ ] Service discovery URL uses correct pattern
- [ ] Relabel adds appropriate cluster/environment labels
- [ ] Scrape job follows naming convention
- [ ] Metrics are forwarded to remote write
- [ ] No hardcoded secrets
- [ ] Comments explain component purpose

general.instructions.md (new file, +126 lines)
---
applyTo: '**'
---
# Grafana Alloy Configuration Project
## Project Overview
This project contains Grafana Alloy configurations for monitoring infrastructure across multiple environments. The architecture supports dynamic service discovery with environment-specific labeling and centralized metrics collection.
## Directory Structure
### Environment Organization
```
/
├── README.md
├── .github/
│ └── instructions/
├── {Environment}/
│ └── {environment}.alloy
```
### Environment Naming
- Use descriptive directory names (e.g., `OpenWRT/`, `Production/`, `Staging/`)
- One `.alloy` file per environment/deployment target
- File names should match environment purpose (e.g., `openwrt.alloy`, `production.alloy`)
## Monitoring Domains
### Current Implementations
- **Storage Monitoring**: Ceph cluster monitoring with dynamic discovery
- **Network Infrastructure**: OpenWRT-based network monitoring
### Adding New Domains
When expanding to new monitoring domains:
1. Create environment-specific directory if needed
2. Follow the established discovery → relabel → scrape pipeline
3. Ensure proper integration with existing remote write configuration
4. Add appropriate documentation
## Environment Configuration Strategy
### Multi-Environment Support
- Each environment has isolated configuration files
- Environment-specific service discovery endpoints
- Consistent labeling strategy across environments
- Centralized metrics collection in Grafana Cloud
### Service Discovery Integration
- HTTP-based service discovery for dynamic target discovery
- Standardized SD endpoint patterns across environments
- Port 8765 as standard for HTTP SD services
## Development Workflow
### Making Changes
1. **Identify Target Environment**: Determine which environment(s) need updates
2. **Follow Patterns**: Use existing configurations as templates
3. **Test Locally**: Validate Alloy syntax before deployment
4. **Document Changes**: Update README or comments as needed
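A minimal local validation pass, assuming the `alloy` CLI is installed, could use the formatter, which fails on syntax errors (the file path is an example):

```shell
# Parse and format the configuration; a non-zero exit indicates a syntax error
alloy fmt --write OpenWRT/openwrt.alloy
```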
### Adding New Services
1. **Service Discovery Setup**: Ensure HTTP SD endpoint exists
2. **Configuration Creation**: Follow 3-stage pipeline pattern
3. **Environment Labeling**: Add appropriate cluster/environment labels
4. **Integration Testing**: Verify metrics flow to Grafana Cloud
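The steps above can be sketched for a hypothetical node-exporter service in the "gr7" environment (the `example.internal` domain is a placeholder):

```alloy
// 1. HTTP SD endpoint for the new service
discovery.http "node_exporter_dynamic" {
	url = "http://node-gr7.example.internal:8765/sd/prometheus/sd-config?service=node-exporter"
}

// 2. Environment labeling
discovery.relabel "node_exporter_with_cluster" {
	targets = discovery.http.node_exporter_dynamic.targets

	rule {
		target_label = "cluster"
		replacement  = "gr7"
	}
}

// 3. Scrape and forward to the existing remote write component
prometheus.scrape "metrics_integrations_integrations_node" {
	targets    = discovery.relabel.node_exporter_with_cluster.output
	forward_to = [prometheus.remote_write.metrics_service.receiver]
	job_name   = "integrations/node"
}
```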
### Code Review Guidelines
- Verify naming conventions are followed
- Check for hardcoded secrets (should use environment variables)
- Ensure proper service discovery patterns
- Validate remote write configuration
## Security Considerations
### Secrets Management
- Never commit API keys or passwords to repository
- Use environment variables for all sensitive data
- Follow principle of least privilege for API access
### Network Security
- Service discovery endpoints should be on trusted networks
- Consider firewall rules for Alloy agents
- Use HTTPS where possible for external endpoints
## Integration Points
### Grafana Cloud
- **Metrics Storage**: Prometheus-compatible remote write
- **Authentication**: Instance ID + API key
- **Endpoint**: Fixed Grafana Cloud Prometheus URL
### Service Discovery
- **Protocol**: HTTP-based service discovery
- **Format**: Prometheus SD compatible JSON
- **Endpoints**: Environment-specific discovery services
### Monitoring Targets
- **Exporters**: Various Prometheus exporters (Ceph, Node, etc.)
- **Discovery**: Dynamic target discovery via HTTP SD
- **Labeling**: Environment and cluster-specific labels
## Documentation Standards
### File Documentation
- Each `.alloy` file should have header comments explaining purpose
- Complex configurations need inline comments
- Environment-specific notes in README sections
### Change Documentation
- Update README when adding new environments
- Document new service integrations
- Note any breaking changes or migration requirements
## Troubleshooting
### Common Issues
- **Service Discovery Failures**: Check HTTP SD endpoint availability
- **Authentication Errors**: Verify environment variables are set
- **Missing Metrics**: Confirm scrape job configuration and forwarding
### Debug Strategies
- Use Alloy's built-in debugging and logging
- Verify service discovery target resolution
- Check Grafana Cloud metrics ingestion
- Validate network connectivity to all endpoints
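One way to apply these strategies, assuming the `alloy` CLI is installed and the default listen address is acceptable, is to run the agent with its built-in HTTP server and inspect component state in the UI (file path is an example):

```shell
# Run the configuration with the debugging UI enabled
alloy run OpenWRT/openwrt.alloy --server.http.listen-addr=127.0.0.1:12345
# Then open http://127.0.0.1:12345 to inspect component health,
# discovered targets, and scrape status
```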