////
Purpose
-------
Information about how deployed systems can be observed.

Examples
--------
* Logging
* Monitoring
////

[id="observability_{context}"]
= Observability

== Infrastructure

Splunk Enterprise is central to {cust}’s infrastructure observability. An external monitoring team oversees these platforms. Alerts are configured in Prometheus and Alertmanager, which automatically notify on-call staff through Splunk On-Call and send updates by email and Teams chat.

== User Workloads

User workload observability relies on centralized logging: pod logs are forwarded to Splunk Enterprise, where they can be searched in one place. The logging setup currently uses Fluentd, and a migration to Vector is underway. Splunk OpenTelemetry is also available for tracing. Helvetia additionally uses an internal application called "Lifecycle" to consolidate data from various sources, providing insight into image versions and their status across environments.

== Audit Logs

Audit logging follows a dual approach:

* Logs are available on request through Red Hat support under the standard ROSA process.
* An internal audit tool actively intercepts API calls in the production environment.

== Cost management and chargeback

Helvetia has developed an in-house tool for cost management and chargeback that allocates actual costs primarily in proportion to each customer's relative usage. The tool uses a dedicated data pipeline that pulls and preprocesses data from a PostgreSQL database, so that cost allocation figures are ready to use. High-level costs, such as those related to reserved instances and AWS enterprise savings plans, are managed by a company-wide AWS cloud solutions team. {cust}’s custom-built chargeback tool also offers tailored insights into cost distribution and optimization opportunities across the organization.
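
The usage-based allocation described above can be illustrated with a minimal sketch. It is not the in-house tool's implementation. The function name `allocate_costs`, the customer names, and all figures are hypothetical, and it assumes that the actual cost of a period is simply split in proportion to each customer's share of total usage.

```python
from decimal import Decimal, ROUND_HALF_UP


def allocate_costs(total_cost, usage_by_customer):
    """Split total_cost across customers in proportion to their usage.

    Hypothetical helper for illustration only; the real chargeback tool
    may weight or adjust usage differently.
    """
    if any(usage < 0 for usage in usage_by_customer.values()):
        raise ValueError("usage values must not be negative")

    total_usage = sum(usage_by_customer.values(), Decimal("0"))
    if total_usage <= 0:
        raise ValueError("total usage must be positive")

    cent = Decimal("0.01")
    shares = {}
    for customer, usage in usage_by_customer.items():
        # Proportional share, rounded to whole cents.
        share = total_cost * usage / total_usage
        shares[customer] = share.quantize(cent, rounding=ROUND_HALF_UP)

    # Rounding can leave a small difference; assign it to the largest
    # consumer so that the shares always add up to the total cost.
    remainder = total_cost - sum(shares.values(), Decimal("0"))
    if remainder:
        largest = max(usage_by_customer, key=usage_by_customer.get)
        shares[largest] += remainder
    return shares


if __name__ == "__main__":
    # Hypothetical usage figures (for example CPU core-hours) per customer.
    usage = {
        "team-a": Decimal("600"),
        "team-b": Decimal("300"),
        "team-c": Decimal("100"),
    }
    for customer, cost in allocate_costs(Decimal("1000.00"), usage).items():
        print(customer, cost)
```

Using `Decimal` instead of floating point keeps the allocated amounts exact to the cent, and assigning the rounding remainder to a single customer ensures that the allocated shares reconcile with the invoiced total.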