- Introduced detailed configuration guidelines in alloy.instructions.md
- Added general instructions for project structure in general.instructions.md
- Created config.yaml for NAMU-PC with target hostnames
- Implemented example.alloy and openwrt.alloy for service discovery and scraping
- Added alloy_seed.json for initial configuration state
- Developed demo.alloy for comprehensive monitoring setup
- Established std.alloy for repository path formatting and host configuration loading
- Updated test.alloy to utilize new host configuration loading
---
applyTo: '**'
---

# Grafana Alloy Configuration Project

## Project Overview

This project contains Grafana Alloy configurations for monitoring infrastructure across multiple environments. The architecture supports dynamic service discovery with environment-specific labeling and centralized metrics collection.

## Directory Structure

### Environment Organization
```
/
├── README.md
├── .github/
│   └── instructions/
├── {Environment}/
│   └── {environment}.alloy
```

### Environment Naming
- Use descriptive directory names (e.g., `OpenWRT/`, `Production/`, `Staging/`)
- One `.alloy` file per environment/deployment target
- File names should match environment purpose (e.g., `openwrt.alloy`, `production.alloy`)

## Monitoring Domains

### Current Implementations
- **Storage Monitoring**: Ceph cluster monitoring with dynamic discovery
- **Network Infrastructure**: OpenWRT-based network monitoring

### Adding New Domains
When expanding to new monitoring domains:
1. Create an environment-specific directory if needed
2. Follow the established discovery → relabel → scrape pipeline
3. Forward scraped metrics to the existing remote write configuration
4. Document the new domain in the README and in the file's header comments

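The discovery → relabel → scrape pipeline from step 2 might look like the following minimal sketch. The component label `example`, the SD URL, the `environment` value, and the `prometheus.remote_write.grafana_cloud` component are all placeholders assumed for illustration; none of them is taken from this repository, and the remote write component is assumed to be defined elsewhere in the same file.

```alloy
// Stage 1: discover targets from an HTTP SD endpoint (hypothetical URL).
discovery.http "example" {
  url = "http://sd.example.internal:8765/targets"
}

// Stage 2: relabel the discovered targets before scraping.
discovery.relabel "example" {
  targets = discovery.http.example.targets

  // With no source labels, the replacement is written to the target label as-is.
  rule {
    target_label = "environment"
    replacement  = "example"
  }
}

// Stage 3: scrape the relabeled targets and forward the samples
// to a remote write component assumed to exist elsewhere.
prometheus.scrape "example" {
  targets    = discovery.relabel.example.output
  forward_to = [prometheus.remote_write.grafana_cloud.receiver]
}
```

Keeping one component label per domain across all three stages makes it easier to trace a target from discovery to scrape in the Alloy UI.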
## Environment Configuration Strategy

### Multi-Environment Support
- Each environment has isolated configuration files
- Environment-specific service discovery endpoints
- Consistent labeling strategy across environments
- Centralized metrics collection in Grafana Cloud

### Service Discovery Integration
- HTTP-based service discovery for dynamic target discovery
- Standardized SD endpoint patterns across environments
- Port 8765 as standard for HTTP SD services

## Development Workflow

### Making Changes
1. **Identify Target Environment**: Determine which environment(s) need updates
2. **Follow Patterns**: Use existing configurations as templates
3. **Test Locally**: Validate Alloy syntax before deployment (for example, by running `alloy fmt` on the changed file)
4. **Document Changes**: Update README or comments as needed

### Adding New Services
1. **Service Discovery Setup**: Ensure the HTTP SD endpoint exists
2. **Configuration Creation**: Follow the three-stage discovery → relabel → scrape pipeline pattern
3. **Environment Labeling**: Add appropriate cluster/environment labels
4. **Integration Testing**: Verify metrics flow to Grafana Cloud

### Code Review Guidelines
- Verify naming conventions are followed
- Check that no secrets are hardcoded (credentials should come from environment variables)
- Ensure proper service discovery patterns
- Validate remote write configuration

## Security Considerations

### Secrets Management
- Never commit API keys or passwords to the repository
- Use environment variables for all sensitive data
- Follow the principle of least privilege for API access

### Network Security
- Service discovery endpoints should be on trusted networks
- Consider firewall rules for Alloy agents
- Use HTTPS where possible for external endpoints

## Integration Points

### Grafana Cloud
- **Metrics Storage**: Prometheus-compatible remote write
- **Authentication**: Instance ID + API key
- **Endpoint**: Fixed Grafana Cloud Prometheus URL

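As a hedged sketch of how these three points could fit together, the block below configures a `prometheus.remote_write` component with basic authentication. The component label `grafana_cloud`, the placeholder URL, and both environment variable names are assumptions made for illustration. `sys.env` is the spelling in recent Alloy releases; older releases expose the same function as `env`.

```alloy
prometheus.remote_write "grafana_cloud" {
  endpoint {
    // Placeholder: substitute the project's fixed Grafana Cloud Prometheus URL.
    url = "https://prometheus.example.invalid/api/prom/push"

    basic_auth {
      // Instance ID as the username, API key as the password.
      // Both are read from the environment so nothing secret is committed.
      username = sys.env("GRAFANA_CLOUD_INSTANCE_ID")
      password = sys.env("GRAFANA_CLOUD_API_KEY")
    }
  }
}
```

Scrape components can then list `prometheus.remote_write.grafana_cloud.receiver` in their `forward_to` argument.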
### Service Discovery
- **Protocol**: HTTP-based service discovery
- **Format**: Prometheus SD compatible JSON
- **Endpoints**: Environment-specific discovery services

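For orientation, a `discovery.http` component polling such an endpoint might look like the sketch below. The host name, path, target address, and label values are invented for illustration; only the port follows the 8765 convention described in this document. The comment shows the general shape of a Prometheus HTTP SD response: a JSON array of target groups, each with a `targets` list and an optional `labels` map.

```alloy
// Expected response body from the SD endpoint (Prometheus HTTP SD format):
//   [
//     { "targets": ["10.0.0.5:9100"], "labels": { "role": "node" } }
//   ]
discovery.http "example" {
  // Hypothetical environment-specific discovery service.
  url = "http://sd.example.internal:8765/targets"

  // How often the endpoint is polled for an updated target list.
  refresh_interval = "60s"
}
```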
### Monitoring Targets
- **Exporters**: Various Prometheus exporters (Ceph, Node, etc.)
- **Discovery**: Dynamic target discovery via HTTP SD
- **Labeling**: Environment and cluster-specific labels

## Documentation Standards

### File Documentation
- Each `.alloy` file should have header comments explaining purpose
- Complex configurations need inline comments
- Environment-specific notes in README sections

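One possible shape for such a header, using Alloy's `//` line comments, is sketched below. The field names and their contents are a suggestion only, not an established convention in this repository.

```alloy
// openwrt.alloy
//
// Purpose:     Scrape OpenWRT network devices discovered through HTTP SD.
// Environment: OpenWRT
// Pipeline:    discovery.http -> discovery.relabel -> prometheus.scrape
// Requires:    Grafana Cloud credentials supplied through environment variables
```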
### Change Documentation
- Update README when adding new environments
- Document new service integrations
- Note any breaking changes or migration requirements

## Troubleshooting

### Common Issues
- **Service Discovery Failures**: Check HTTP SD endpoint availability
- **Authentication Errors**: Verify environment variables are set
- **Missing Metrics**: Confirm scrape job configuration and forwarding

### Debug Strategies
- Use Alloy's built-in debugging and logging
- Verify service discovery target resolution
- Check Grafana Cloud metrics ingestion
- Validate network connectivity to all endpoints
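
As a starting point for the first strategy, log verbosity can be raised with the top-level `logging` block. This is a minimal sketch intended for temporary use while debugging; the chosen level and format are examples, not project defaults.

```alloy
// Raise log verbosity while investigating a problem; revert afterwards.
logging {
  level  = "debug"
  format = "logfmt"
}
```

Alloy also serves a built-in UI (by default on `http://localhost:12345`) where each component's health, arguments, and exports can be inspected, which helps confirm that discovery components actually resolve targets.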