Version: 2.0

Auditing And Reporting

For two decades, I’ve built and led engineering teams. We talk a lot about velocity, technical debt, and delivering value. But lately, I’ve been thinking about a surprisingly relevant parallel: auditing. Not in the sense of bug hunts (though those are important!), but in the rigorous, systematic examination of how we build things. The collapses of Enron and WorldCom – stemming from failures in financial reporting and internal controls ¹ – aren't just business school case studies; they’re stark reminders of what happens when systems of reporting and control break down. And as engineering leaders, we are, fundamentally, building and running complex systems.

This isn’t about nitpicking code quality. It’s about recognizing that transparency, accountability, and verifiable processes are crucial for long-term success – and that it’s our responsibility to foster them.

The Fragility of “Just Make It Work”

For many engineering teams, especially in the early stages of a startup, the mantra is “just make it work.” Speed is prioritized, and processes are often… flexible, let’s say. While this can be effective in the short term, it creates technical and organizational debt that can quickly become crippling. Think about it:

Hidden Dependencies: Who really understands how system A impacts system B? Without clear documentation and ownership, changes become risky, and outages become inevitable.
Shadow IT: What unsanctioned tools or services are your teams using? While they might increase productivity, they also introduce security vulnerabilities and data silos. A common scenario is a team spinning up a free analytics tool without involving security or legal, creating potential data privacy issues. Addressing this requires a centralized tool catalog and a streamlined approval process for new integrations.
Lack of Ownership: Who is accountable for the performance, security, and scalability of a critical service? If the answer is “everyone,” the answer is actually “no one.”

These are all symptoms of a lack of effective "auditing" – a systematic way to understand, verify, and report on the state of our systems.

Building an Engineering “Truth Serum”: From Visibility to Verification

Traditional financial auditing focuses on verifying the accuracy of financial statements. As engineering leaders, we need to adapt this concept to verify the truth about our technical systems. This isn’t about finding blame; it’s about identifying risks and opportunities for improvement. Instead of viewing this as a purely reactive exercise, we aim to establish proactive systems that surface issues before they become critical.

Here's how to shift from simply seeing what's happening to verifying the health of our systems:

Define Key Metrics: What does “healthy” look like for your services? Define metrics around performance, security, cost, and reliability. Think beyond just uptime; consider latency, error rates, and resource utilization. For example, track the 95th percentile latency of critical API calls, not just average response time.
Implement Comprehensive Logging & Monitoring: This is the foundation of any good auditing system. Comprehensive logging provides a detailed record of events, while monitoring tools provide real-time visibility into system behavior. Don’t just monitor for failures; monitor for anomalies that might indicate underlying problems.
Automate Audits Through CI/CD Integration: Instead of treating compliance as a separate task, embed audit checks directly into your CI/CD pipeline. Automated tests can verify code quality, security vulnerabilities, and compliance with coding standards. This “shift-left” approach makes auditing a continuous process, rather than a periodic event.
Regularly Review Audit Logs: Don't just collect logs; review them. Set aside time each week to analyze audit data and identify potential issues. Look for patterns, anomalies, and trends that might indicate underlying problems.
Establish Clear Ownership & Permissions: Who has access to what data? Who is responsible for maintaining which systems? Clear ownership and permissions are essential for accountability.

The Cost of Neglect: Lessons from the Past – and a Path Forward

The failures of Enron and WorldCom serve as cautionary tales. These companies prioritized short-term gains over long-term sustainability, and they created cultures where transparency and accountability were lacking. The consequences were devastating, including bankruptcies, job losses, and a loss of public trust.

In our own organizations, neglecting auditing and reporting can lead to:

Increased Technical Debt: Unaddressed vulnerabilities and performance issues accumulate over time, making it increasingly difficult to maintain and evolve our systems.
Security Breaches: Without proper monitoring and logging, we’re vulnerable to attacks and data breaches.
Reduced Innovation: When we’re constantly fighting fires, we have less time to focus on innovation.
Erosion of Trust: When things go wrong, it’s difficult to regain the trust of our users and stakeholders.

Beyond Compliance: Building a Culture of Transparency and Continuous Improvement

Auditing and reporting shouldn’t be seen as a chore or a compliance exercise. It’s an opportunity to build a culture of transparency, accountability, and continuous improvement.

As engineering leaders, we need to:

Lead by Example: Be open and honest about our own mistakes.
Encourage Feedback: Create a safe space for team members to raise concerns.
Celebrate Transparency: Recognize and reward team members who prioritize accountability.

By embracing these principles, we can build engineering organizations that are not only successful but also ethical, responsible, and sustainable.

Ultimately, thinking like an auditor isn’t about finding fault; it’s about ensuring the long-term health and resilience of the systems we build – and the organizations we lead.

To get started, identify the key metrics for your most critical services and implement basic logging and monitoring. This simple step can provide valuable insights into the health of your systems and help you proactively address potential issues.

See, for example, "The Fall of Enron" by Bethany McLean and Peter Elkind, and reports from the Securities and Exchange Commission regarding the accounting failures at both Enron and WorldCom. ↩

The Fragility of “Just Make It Work”​

Building an Engineering “Truth Serum”: From Visibility to Verification​

The Cost of Neglect: Lessons from the Past – and a Path Forward​

Beyond Compliance: Building a Culture of Transparency and Continuous Improvement​

Footnotes​

The Fragility of “Just Make It Work”

Building an Engineering “Truth Serum”: From Visibility to Verification

The Cost of Neglect: Lessons from the Past – and a Path Forward

Beyond Compliance: Building a Culture of Transparency and Continuous Improvement

Footnotes