////
Purpose
-------
Information about how deployed systems can be observed.

Examples
--------
* Logging
* Monitoring
////

[id="observability_{context}"]
= Observability

== Infrastructure

Splunk Enterprise and Splunk OpenTelemetry are central to {cust}’s infrastructure observability. An external monitoring team oversees these platforms. Alerts are configured in Prometheus and Alertmanager, which automatically page the responsible people through Splunk On-Call and send updates by email and Teams chat.

== User Workloads

User workload observability relies on centralized logging: pod logs are forwarded to Splunk Enterprise, where they are easy to access. The logging setup currently uses Fluentd, and a migration to Vector is underway.

{cust} also uses an internal application called "Lifecycle" to consolidate data from various sources, providing insight into image versions and their status across environments. In addition, purpose-built dashboards track resource allocation patterns, such as Pods configured with low requests but high limits, as well as security policy compliance.

== Audit Logs

Audit logging follows a dual approach:

* Logs are available on request through Red Hat support under the standard ROSA process.
* An internal audit tool actively intercepts API calls in the production environment.

== Cost management and chargeback

{cust} has developed an in-house tool for cost management and chargeback that allocates actual costs primarily in proportion to each customer's relative usage. The tool uses a dedicated data pipeline that pulls and preprocesses data from a PostgreSQL database, providing ready-to-use, accurate cost allocation. High-level costs, such as those related to reserved instances and AWS enterprise savings plans, are managed by a company-wide AWS cloud solutions team.

== Conclusion

Overall, the customer's observability stack is in line with industry best practices.
{cust}’s custom-built chargeback tool stands out for its advanced data architecture, offering tailored insights into cost distribution and optimization opportunities across the organization.
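
The usage-based allocation described under "Cost management and chargeback" can be illustrated with a minimal sketch. It assumes that the actual cost of a billing period is split in proportion to each customer's share of total usage; the real tool's formula, data model, and PostgreSQL pipeline are not documented here, so the function name `allocate_costs`, the customer names, and the figures are all hypothetical.

```python
from decimal import Decimal


def allocate_costs(
    total_cost: Decimal, usage_by_customer: dict[str, Decimal]
) -> dict[str, Decimal]:
    """Split total_cost across customers in proportion to their usage.

    Hypothetical illustration only; not the customer's actual implementation.
    """
    total_usage = sum(usage_by_customer.values(), Decimal(0))
    if total_usage == 0:
        raise ValueError("total usage is zero; there is nothing to allocate against")
    return {
        customer: total_cost * usage / total_usage
        for customer, usage in usage_by_customer.items()
    }


# Illustrative figures: one period's actual cost and per-customer usage units.
shares = allocate_costs(
    Decimal("1000"),
    {"team-a": Decimal("30"), "team-b": Decimal("10")},
)
```

Using `Decimal` rather than floating point keeps the allocated amounts exact for monetary values, and in this example the individual shares add back up to the period's total cost.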