This Week’s Focus: Lessons from the Newark Meltdown
In April, a single burnt-out copper wire caused a 90-second blackout at Newark Airport, leaving controllers unable to see or communicate with up to 20 planes. While no accidents occurred, the aftermath was severe: 150 flight cancellations, weeklong delays, and staff trauma leave. The Newark incident is a powerful reminder that operational excellence isn’t just about efficiency; it’s about anticipating risk and building resilience. This week we explore how this crisis could have been prevented through better staffing, technology upkeep, and change management—the process of guiding people through organizational change.
Nothing is more reassuring than getting an email from an airline’s CEO stating that safety is their top priority…
What triggered this email?
In late April 2025, a “fried” piece of copper wiring triggered a 90-second loss of all radar and communications with inbound planes at Newark Airport. Controllers “were unable to see, hear, or talk” to 15–20 aircraft during this period—an unprecedented lapse!
Thankfully, no collisions occurred, but the cascading effects were dire: at least 150 flights canceled, average delays of 2+ hours at Newark for a week after, and five air traffic controllers on 45-day trauma leave. What initially looked like a staffing “walkout” was in fact a system meltdown.
In today’s article we analyze how the crisis could have been prevented by addressing three root causes: staffing and workforce planning, technology and infrastructure risk, and governance and change management.
Let’s touch down and get to the bottom of this disruption.
Staffing and Workforce Planning Failures
Newark’s troubles were the predictable result of chronic under-staffing and mismanaged workforce planning in a high-stress, safety-critical environment.
Years of insufficient hiring and retention left the FAA with approximately 10,800 certified controllers—nearly 1,000 fewer than in 2012. In late 2024, United Airlines’ CEO had warned that the FAA was 3,000 controllers short nationwide, leading to widespread delays. At facilities like the Philadelphia TRACON (which handles Newark’s traffic), barely 22 fully certified controllers were on staff. Such thin staffing meant that mandatory overtime, fatigue, and burnout had become the norm. “They work a lot of hours, a lot of overtime. We need more of them,” noted the Aviation Safety Caucus chair in Congress, emphasizing that stress was being taken for granted.
Research in other high-stakes fields shows that chronic overwork and stress degrade performance and safety. For example, a 2023 study found that without proper recovery time after critical incidents, controller stress can “cause impaired performance and pose a risk to the safety of flight,” yet worker shortages often make immediate recovery breaks impossible. This vicious cycle of understaffing → stress → burnout → further understaffing was clearly at play in Newark.
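The dynamics of that vicious cycle can be sketched with a toy feedback model. All parameters below are invented purely for illustration—they are not FAA or NATCA figures—but the structure captures the loop: a shortfall forces overtime, overtime drives attrition, and attrition deepens the shortfall.

```python
# Toy model of the understaffing -> overtime -> burnout -> attrition loop.
# Every number here is invented to illustrate the dynamics, not real data.

required = 100          # controllers needed to cover all shifts
staff = 90              # actual headcount (already short)

for year in range(5):
    overtime_load = max(0.0, (required - staff) / required)  # shortfall share
    # Attrition rises with overtime: 3% baseline plus burnout-driven losses.
    attrition = 0.03 + 0.5 * overtime_load
    hires = 3                            # training pipeline delivers a trickle
    staff = staff * (1 - attrition) + hires
    print(f"year {year + 1}: staff = {staff:.0f}")
```

Run the loop and headcount falls faster every year: the shortfall grows, which raises overtime, which raises attrition—the self-reinforcing spiral described above.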
Burnout and trauma leave further compounded the staffing crisis. The incident on April 28—where controllers suddenly lost all contact with incoming planes—was officially classified as a “traumatic” event that justified leave under federal policy. In its aftermath, roughly 20% of Newark’s ATC workforce took leave to cope with stress.
I’m not going to argue whether this leave was justified, but what is clear is that the loss of one-fifth of controllers virtually overnight plunged the facility into paralysis. Lacking surge capacity or cross-trained backups, the FAA had no quick way to fill the gap given how specialized the skill is: becoming a certified controller takes 2–3 years of training (with a ~40% washout rate).
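The arithmetic behind that pipeline bottleneck is simple. A back-of-the-envelope sketch using only the figures cited above (a 3,000-controller shortfall and a ~40% washout rate):

```python
# Back-of-the-envelope staffing-pipeline arithmetic using the figures
# cited above; these are illustrative inputs, not FAA planning data.

shortfall = 3000        # controllers short nationwide (per United's CEO)
washout_rate = 0.40     # ~40% of trainees never certify

# To net one certified controller, you must admit 1 / (1 - washout) trainees.
trainees_needed = shortfall / (1 - washout_rate)
print(f"Trainees to admit just to close the gap: {trainees_needed:.0f}")
# -> 5000 trainees, each needing 2-3 years before they can work solo.
```

Even an immediate, fully funded hiring surge would not put a single one of those certified controllers on position for years—which is why surge capacity has to be planned long before it is needed.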
The incident reflects a failure of workforce planning: in safety-critical industries, managers must anticipate that employees will sometimes need time off for stress or emergencies. High-reliability organizations ensure redundancy in personnel just as they do in equipment. By contrast, at Newark there was no margin. Controllers were so stretched that when a handful exercised their right to trauma leave, routine operations collapsed for days. As one congressional report later put it, the system was “built to set them up for failure” under such conditions.
How could better workforce planning have prevented the crisis? Proactive hiring and retention a few years earlier might have averted the staffing shortfall that led to the desperate measures in the first place. The FAA is now belatedly “supercharging” its hiring pipeline with bonuses and targeted incentives, but these steps could take years to bear fruit. Equally important is addressing burnout to retain existing talent. Studies of ATC and similar fields find that burnout correlates with factors like constant stress and lack of support. The FAA and NATCA (the controllers’ union) should implement robust wellness programs, counseling, and reasonable scheduling to avoid pushing controllers to the brink. It is telling that controllers felt compelled to conceal health issues for fear of losing their certification. A healthier organizational culture would encourage employees to flag fatigue or stress before it becomes a safety hazard.
In short, the crisis exposed a classic failure of human resource risk management: not planning for the “what if” scenario of multiple people being unavailable in a critical job. A well-staffed, resilient workforce is the first line of defense in preventing system meltdowns.
Technology and Infrastructure Risk Mismanagement
The Newark incident also illustrates how aging technology and lack of infrastructure investment can turn a single glitch into a major crisis.
The immediate cause of the blackout was almost embarrassingly archaic: a single burnt-out copper wire, which knocked out radar feeds from a Long Island site and severed communications between pilots and controllers. To make matters worse, the redundant system failed to activate, leaving Newark ATC effectively blind and mute for 90 critical seconds.
This reveals a textbook failure of system redundancy. In safety-critical systems, one of the iron (no pun intended) laws is to avoid single points of failure through independent backups. Yet Newark’s communications were so fragile that one cable failure caused a total communications blackout. As an FAA official bluntly stated, the telecom system was antiquated. Controllers were literally relying on technology from the 1980s and 1990s—copper telephone lines, monochrome screens, even floppy disks—to manage 21st-century air traffic.
Such technology is decades past its life-cycle: the FAA itself admitted that 37% of its 138 ATC systems are currently unsustainable (with another 39% potentially unsustainable) due to age and obsolescence—many of these critical systems are over 30 years old. In a March 2025 GAO report, investigators warned that the FAA’s reliance on aging infrastructure “introduces risks to [its] ability to ensure the safe, orderly flow of air traffic.” That warning was a prophecy fulfilled at Newark.
Modern infrastructure risk management practices could have averted the outage or at least limited its scope.
Ongoing modernization and maintenance of ATC hardware would have long since replaced a brittle copper wire (and any existing failed backup circuit) with resilient, multi-path fiber links. Indeed, after the Newark fiasco, the FAA rushed to install “three new, high-bandwidth telecommunications connections” between New York and Philadelphia, and to replace old copper wires with fiber.
But these are reactive fixes to problems that were known well in advance. Industry insiders had long warned that the system was running on borrowed time—nearly 1,000 equipment failures were occurring each week across the U.S. ATC system due to outdated technology and infrastructure. The FAA’s own operational assessments identified numerous key systems that no longer meet mission needs, suffer frequent outages, and lack vendor support for want of spare parts. In other high-risk sectors like power grids or petrochemicals, operators track such warning signs and replace critical components before they fail catastrophically. Unfortunately, in the public sector, budget constraints and bureaucratic delays often lead to deferring modernization—until something breaks. Newark’s radar/comm failure was a foreseeable outcome of this “fix on failure” approach.
Another issue was the complex, failure-prone setup created by relocating Newark’s control facility. When Newark’s approach control moved to Philadelphia (100 miles away) in 2024, the FAA had to implement a unique system of radio and radar relays to transmit Newark’s data to Philly. This jury-rigged network, relying on legacy hardware, added layers of complexity. Safety experts note that tightly coupled, complex systems are especially vulnerable to “normal accidents”: small failures that cascade unpredictably.
In Newark’s case, the primary and backup comm lines were both routed through this new relay system. So when the relay link failed, controllers in Philadelphia instantly lost both radar and radio contact with Newark’s airspace. There was no local fallback, since Newark’s old tower had ceded control.
A highly redundant design—a backup control team or system in the New York area that could take over if the Philly link went down—might have stopped the domino effect. Studies by NASA and FAA have long advocated independent backups for critical ATC functions to prevent this exact kind of simultaneous loss. Yet the Newark sector move seemingly did not include an effective contingency plan for a relay failure. The result was that during those 90 seconds, a Newark controller had to tell a pilot “we don’t have a radar, so I don’t know where you are,” illustrating total situational awareness loss.
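The value of a truly independent backup can be shown with basic probability. A minimal sketch, using made-up failure rates for the circuits and the shared relay node (the real figures are unknown to me; only the structure matters):

```python
# Illustrative availability math for redundant links; the probabilities
# are invented for this sketch, not real FAA reliability figures.

p_link = 0.01    # chance an individual circuit is down at any moment
p_relay = 0.01   # chance the shared relay node is down

# Truly independent primary + backup: both must fail at the same time.
p_outage_independent = p_link * p_link          # ~1 in 10,000

# Primary and backup both routed through one relay (Newark's setup):
# the relay is a common-mode failure, so its outage alone kills both.
p_outage_common_mode = p_relay + (1 - p_relay) * p_link * p_link

print(f"independent paths : {p_outage_independent:.4%}")
print(f"shared relay      : {p_outage_common_mode:.4%}")
```

With these toy numbers, routing both paths through one relay makes a total blackout roughly 100 times more likely than with genuinely independent paths—redundancy in name only.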
Put simply, the crisis could have been prevented with proper technology risk management. The FAA needed to treat telecom and radar upgrades not as optional investments, but as mission-critical safety imperatives. The cost of prevention (replacing an old cable, installing a robust backup system) is minuscule compared to the cost of failure (nationwide delays, emergency leaves, shaken public confidence). The agency’s new “comprehensive plan” announced in May 2025—which includes overhauling 4,600 site communications, replacing 618 aging radars, and installing modern hardware across facilities—is an important step. But this plan is essentially catching up on a backlog of upgrades that should have been done years ago.
For leaders, Newark is a cautionary tale: deferred maintenance and technical debt can silently accumulate to a breaking point. In any industry reliant on complex systems (from aviation to finance), rigorous lifecycle management, frequent stress-tests on backup systems, and continuous investment in modernization are the pillars of reliability. Newark’s meltdown shows what happens when those pillars rot: one small spark (or fried wire) can bring down the house.
A similar outage recently occurred at Heathrow, which suggests these failures are structural rather than unique to Newark.
Governance and Change Management Breakdowns
The third dimension of the Newark crisis involves governance failures and mismanaged organizational change, particularly around the decision to relocate Newark’s air traffic control sector.
The transfer of Newark’s approach control from New York to Philadelphia—executed in July 2024—was intended to alleviate staffing shortages at the overburdened New York TRACON facility.
In theory, this inter-facility reorganization could have been beneficial: Philadelphia had an easier time recruiting controllers, and moving Newark’s traffic could lighten the load on New York’s understaffed team. However, the way this change was implemented (and opposed) reveals misaligned incentives and poor coordination among key stakeholders.
First, there was significant resistance from front-line experts and local authorities. The National Air Traffic Controllers Association (NATCA) strongly opposed the move, as did several New York-area political leaders. Their concerns likely ranged from safety issues to job impacts. Rather than achieving buy-in, the FAA pushed through the shift “with little fanfare,” suggesting limited transparency.
This created a classic change management problem: stakeholders felt steamrolled rather than included. In high-reliability domains, ignoring the concerns of those on the ground can be dangerous—some may remember NASA’s Challenger launch decision, where management dismissed engineers’ warnings, with tragic results. In this case, while the motive (reducing NY workload) was sound, the risk communication was inadequate.
Did FAA leadership thoroughly communicate how they would mitigate the new risks introduced by remote controlling Newark’s airspace? Did they coordinate protocols between the old and new centers? The aftermath suggests not.
An anonymous controller involved in Newark’s operations was quoted by MSNBC saying that the airport was “not safe for travelers right now,” indicating how little confidence front-line staff had in the setup. When those responsible for executing a change are unconvinced of its safety, it’s a red flag that governance has failed to align goals and address legitimate worries. To be fair, people often resist change, and not every warning proves justified. But that is precisely why disciplined change management is so crucial.
Moreover, the inter-agency (or inter-unit) transition was poorly managed from an operational standpoint. Moving such a critical function between facilities is akin to a major organizational merger—it requires synchronized training, technology, procedures, and culture. In this case, the transfer spanned different FAA regions (New York to Philadelphia) and effectively grafted Newark’s high-density traffic into a new environment. However, coordination appears to have been lacking.
The unique relay system was installed as a technical fix, but what about process and people integration? There were no clear emergency protocols for a relay failure. Philadelphia controllers were suddenly responsible for Newark traffic without an established playbook for “what if Philly loses Newark comms?” Likewise, the New York TRACON, having ceded Newark, had no mandate (and perhaps no ability) to step back in when trouble hit. This gap reflects a failure in transition risk planning.
Effective change management in public safety organizations often involves running simulations, staging incremental cutovers, and ensuring overlap where the old and new systems can back each other up during the transition. If such steps were taken, they were insufficient. Newark’s sector was effectively cut over to Philadelphia cold turkey. Indeed, a former controller noted that the move required “more complicated equipment” and that “intermittent” communication issues had occurred even before the big outage.
In other words, warning signs were present during the transition, but governance mechanisms did not act on them. A high-reliability mindset would treat every small outage as a chance to learn and adjust (the “preoccupation with failure” principle of HROs). Instead, the system limped along until a bigger failure struck.
Finally, misaligned incentives and bureaucratic inertia played a role. The FAA’s leadership was under pressure to show action on delays (hence the quick move to Philly), NATCA’s priority was controller workload and safety, while elected officials were keen to retain control in their region. These divergent incentives led to a compromise with which no one was fully satisfied: the move happened without the funding and overhaul truly needed to support it. One striking example was funding: despite a $1.2 trillion infrastructure bill in 2021, “less than 1%” was spent on upgrading the ATC system, according to Transportation Secretary Sean Duffy.
Upgrading Newark’s communications or building a redundant control hub didn’t offer the political glory that comes with cutting a ribbon at a new terminal.
This kind of misalignment is unfortunately common in the public sector, where short-term political wins can overshadow long-term risk mitigation. The result at Newark was an “organizational accident” where multiple latent failures lined up: under-staffing, under-investment, and a botched change.
The FAA could have phased the Newark transfer (e.g., handling some approach traffic from Philly as a test while retaining overlap with New York), addressed NATCA’s safety concerns openly, and secured dedicated funds to upgrade the needed infrastructure ahead of the move. Inter-organizational coordination between FAA technical teams, operations, the union, airports, and even airlines, should have been tighter.
In practice, it appears each assumed the other had things covered. As one aviation analyst observed, Newark’s controllers ended up performing “quiet heroic acts, in spite of a system that [was] built to set them up for failure.”
That is an indictment of governance. Good management would strive to never put employees in a position where only heroics avert disaster. The Newark outage was not a random act; it was the culmination of policy choices, inter-agency handoffs, and cultural attitudes that, if managed differently, could have produced a far better outcome.
Conclusion: Lessons for Leaders
Newark’s meltdown offers several strategic and operational lessons for leaders striving to manage complex organizations:
Invest in Resilience, Not Just Efficiency: The drive for cost-cutting or streamlined operations must be balanced with buffers and backups. Whether it’s maintaining extra staff on call, or upgrading aging servers, building slack into systems can prevent one failure from spiraling into a crisis. As evident from Newark’s case, running too lean can be perilous—resilience is a competitive advantage in the long run.
Proactive Risk Management Pays Off: Don’t wait for a high-profile failure to address known issues. The warning signs at Newark (staffing shortfalls, antiquated infrastructure, minor outages) were visible long before the breakdown. High-reliability organizations are proactive: they fix the roof before the storm. In practice, this means regular audits of critical systems, “red team” exercises to probe for weaknesses, and a culture that rewards raising concerns. Leaders should ask, “What’s our next Newark—and how do we prevent it?”
People and Technology Must Evolve Together: In any major change (a new IT system, a re-org, a merger), the human factor is as crucial as the tech. It’s not enough to install new equipment; you must train and prepare the workforce to use it under all conditions. Likewise, if you relocate or reassign people, ensure the technology and processes support them. Newark’s relocation lacked a holistic approach. The lesson is to synchronize the “soft” and “hard” aspects of change—communicate, train, and provide resources so that transitions are smooth rather than jarring.
Align Stakeholders and Communicate Risks: Misaligned incentives between frontline employees, management, and regulators can undermine even well-intentioned initiatives. Leaders should engage stakeholders early, transparently discuss risks, and align on a shared vision of success (and safety). In Newark’s case, better communication with NATCA and local officials might have surfaced and resolved critical concerns before implementation. In any sector, whether launching a new product or implementing a policy, collaboration and clear communication are key to avoiding unintended consequences.
At its core, the Newark crisis was a preventable failure. It resulted not from a single moment of bad luck, but from a series of management missteps. The silver lining is that each of these areas is within leaders’ control. By applying the fundamentals of sound workforce planning, robust infrastructure investment, and prudent change management, organizations can avoid “meltdowns” and ensure that the extraordinary situation witnessed at Newark remains a rare anomaly.
The overall lesson is clear: operational excellence isn’t just about optimizing what works—it’s about fortifying against what could go wrong.
But knowing the bureaucracy’s track record, I wouldn’t be surprised if I soon find myself reading another reassuring email from another CEO.