‘Very Harmful’ Lack of Data Blunts U.S. Response to Outbreaks
ANCHORAGE, Alaska — After a middle-aged woman tested positive for COVID-19 in January at her workplace in Fairbanks, public health workers sought answers to questions vital to understanding how the virus was spreading in Alaska’s rugged interior.
The woman, they learned, had existing conditions and had not been vaccinated. She had been hospitalized but had recovered. Alaska and many other states have routinely collected that kind of information about people who test positive for the virus. Part of the goal is to paint a detailed picture of how one of the worst scourges in American history evolves and continues to kill hundreds of people daily, despite determined efforts to stop it.
But most of the information about the Fairbanks woman, and about tens of millions more infected Americans, remains effectively lost to state and federal public health researchers. Decades of underinvestment in public health information systems have crippled efforts to understand the pandemic, stranding crucial data in incompatible data systems so outmoded that information often must be repeatedly typed in by hand. The data failure, a salient lesson of a pandemic that has killed more than 1 million Americans, will be expensive and time-consuming to fix.
The precise cost in needless illness and death cannot be quantified. The nation’s comparatively low vaccination rate is clearly a major factor in why the United States has recorded the highest COVID death rate among large, wealthy nations. But federal experts are certain that the lack of comprehensive, timely data has also exacted a heavy toll.
“It has been very harmful to our response,” said Dr. Ashish K. Jha, who leads the White House effort to control the pandemic. “It’s made it much harder to respond quickly.”
Details of the Fairbanks woman’s case were scattered among multiple state databases, none of which connect easily to the others, much less to the Centers for Disease Control and Prevention, the federal agency in charge of tracking the virus. Nine months after she fell ill, her information was largely useless to public health researchers because it was impossible to synthesize most of it with data on the roughly 300,000 other Alaskans and the 95 million-plus other Americans who have gotten COVID.
Those same antiquated data systems are now hampering the response to the monkeypox outbreak. Once again, state and federal officials are losing time trying to retrieve information from a digital pipeline riddled with huge holes and obstacles.
“We can’t be in a position where we have to do this for every disease and every outbreak,” Dr. Rochelle P. Walensky, the CDC director, said in an interview. “If we have to reinvent the wheel every time we have an outbreak, we will always be months behind.”
The federal government invested heavily over the past decade to modernize the data systems of private hospitals and health care providers, doling out more than $38 billion in incentives to shift to electronic health records. That has enabled doctors and health care systems to share information about patients much more efficiently.
But while the private sector was modernizing its data operations, state and local health departments were largely left with the same fax machines, spreadsheets, emails and phone calls to communicate.
States and localities need $7.84 billion for data modernization over the next five years, according to an estimate by the Council of State and Territorial Epidemiologists and other nonprofit groups. Another organization, the Healthcare Information and Management Systems Society, estimates those agencies need nearly $37 billion over the next decade.
The pandemic has laid bare the consequences of neglect. Countries with national health systems like Israel and, to a lesser extent, Britain were able to get solid, timely answers to questions such as who is being hospitalized with COVID and how well vaccines are working. American health officials, in contrast, have been forced to make do with extrapolations and educated guesses based on a mishmash of data.
Facing the wildfirelike spread of the highly contagious omicron variant last December, for example, federal officials urgently needed to know whether omicron was more deadly than the delta variant that had preceded it and whether hospitals would soon be flooded with patients. But they could not get the answer from testing, hospitalization or death data, Walensky said, because it failed to sufficiently distinguish cases by variant.
Instead, the CDC asked Kaiser Permanente of Southern California, a large private health system, to analyze its COVID patients. A preliminary study of nearly 70,000 infections from December showed patients infected with omicron were less likely to be hospitalized, need intensive care or die than those infected with delta.
But that was only a snapshot, and the agency only got it by going hat in hand to a private system. “Why is that the path?” Walensky asked.
The drought of reliable data has also repeatedly left regulators high and dry in deciding whether, when and for whom additional shots of coronavirus vaccine should be authorized. Such decisions turn on how well the vaccines perform over time and against new versions of the virus. And that requires knowing how many vaccinated people are getting so-called breakthrough infections and when.
But almost two years after the first COVID shots were administered, the CDC still has no national data on breakthrough cases. A major reason is that many states and localities, citing privacy concerns, strip out names and other identifying information from much of the data they share with the CDC, making it impossible for the agency to figure out whether any given COVID patient was vaccinated.
“The CDC data is useless for actually finding out vaccine efficacy,” said Dr. Peter Marks, the top vaccine regulator at the Food and Drug Administration. Instead, regulators had to turn to reports from various regional hospital systems, knowing that picture might be skewed, and marry them with data from other countries like Israel.
The jumble of studies confused even vaccine experts and sowed public doubt about the government’s booster decisions. Some experts partly blame the disappointing uptake of booster doses on squishy data.
The FDA now spends tens of millions of dollars annually for access to detailed COVID-related health care data from private companies, Marks said. About 30 states now also report cases and deaths by vaccination status, showing that the unvaccinated are far more likely to die of COVID than those who got shots.
But those reports are incomplete, too: The state data, for instance, does not reflect prior infections, an important factor in trying to assess vaccine effectiveness.
And it took years to get this far. “We started working on this in April of 2020, before we even had a vaccine authorized,” Marks said.
Now, as the government rolls out reformulated booster shots before a possible winter virus surge, the need for up-to-date data is as pressing as ever. The new boosters target the version of a fast-evolving virus that is currently dominant. Later this year, pharmaceutical companies are expected to deliver evidence from human clinical trials showing how well the boosters work.
“But how will we know if that’s the reality on the ground?” Jha asked. Detailed clinical data that includes past infections, history of shots and brand of vaccine “is absolutely essential for policymaking,” he said. “It is going to be incredibly hard to get.”
New Outbreak, Same Data Problems
When the first U.S. monkeypox case was confirmed May 18, federal health officials prepared to confront another information vacuum. Federal authorities cannot generally demand public health data from states and localities, which have legal authority over that realm and zealously protect it. That has made it harder to organize a federal response to a new disease that has now spread to nearly 24,000 people nationwide.
Three months into the outbreak, more than half of the people reported to have been infected were not identified by race or ethnicity, clouding the disparate impact of the disease on Black and Hispanic men.
To find out how many people were being vaccinated against monkeypox, the CDC was forced to negotiate data-sharing agreements with individual jurisdictions, just as it had to do for COVID. That process took until early September, even though the information was important to assess whether the taxpayer-funded doses were going to the right places.
The government’s declaration in early August that the monkeypox outbreak constituted a national emergency helped ease some of the legal barriers to information-sharing, health officials said. But even now, the CDC’s vaccine data is based on only 38 states, plus New York City.
Some critics say the CDC could compensate for its lack of legal clout by exercising its financial muscle, since its grants help keep state and local health departments afloat. But others say such arm-twisting could end up harming public health if departments then decide to forgo funding and not cooperate with the agency.
Nor would that address the outmoded technologies and dearth of scientists and information analysts at state and local health departments, failings that many experts say are the biggest impediment to getting timely data.
Alaska is a prime example.
Early in the pandemic, many of the state’s COVID case reports arrived by fax on the fifth floor of the state health department’s office in Anchorage. National Guard members had to be called in to serve as data entry clerks.
The health department’s highly trained specialists “didn’t have the capacity to be the epidemiologists that we needed them to be because all they could do was enter data,” said Dr. Anne Zink, Alaska’s chief medical officer, who also heads the Association of State and Territorial Health Officials.
All too often, she said, the data that was painstakingly entered was too patchy to guide decisions.
A year ago, for instance, Zink asked her team whether racial and ethnic minorities were being tested less frequently than whites to assess whether testing sites were equitably located.
But public health researchers could not tell her because for 60% of those tested, the person’s race and ethnicity were not identified, said Megan Tompkins, a data scientist and public health researcher who until this month managed the state’s COVID data operation.
Long after mass testing sites were shuttered, Tompkins’ team was culling birth records to identify people’s race, hoping to manually update tens of thousands of old case reports in the state’s disease surveillance database. State officials still think that the racial breakdown will prove useful.
“We’ve started from really broken systems,” Tompkins said. “That meant we lost a lot of the data and the ability to analyze it, produce it or do something with it.”
Boom and Bust Funding
State and local public health agencies have been shriveling, losing an estimated 15% of their staffs between 2008 and 2019, according to a study by the de Beaumont Foundation, a public-health-focused philanthropy. In 2019, public health accounted for 3% of the $3.8 trillion spent on health care in the United States.
The pandemic has prompted Congress to loosen its purse strings. The CDC’s $50 million annual budget for data modernization was doubled for the current fiscal year, and key senators seem optimistic it will double again next year. Two pandemic relief bills provided an additional $1 billion, including funds for a new center to analyze outbreaks.
But public health funding has traced a long boom-and-bust pattern, rising during crises and shrinking once they end. Although COVID still kills about 400 Americans each day, Congress’ appetite for public health spending has waned.
While $1 billion-plus for data modernization sounds impressive, it is roughly the cost of shifting a single major hospital system to electronic health records, Walensky said.
For the first two years of the pandemic, the CDC’s disease surveillance database was supposed to track not just every confirmed COVID infection, but whether infected individuals were symptomatic, had recently traveled or attended a mass gathering, had existing medical conditions, had been hospitalized, required intensive care and had survived. State and local health departments reported data on 86 million cases.
But the vast majority of data fields are usually left blank, an analysis by The New York Times found. Even race and ethnicity, factors essential to understanding the pandemic’s unequal impact, are missing in about one-third of the cases. Only the patient’s sex, age group and geographic location are routinely recorded.
While the CDC said the basic demographic data remains broadly useful, swamped health departments were too overwhelmed or too ill-equipped to provide more. In February, the agency recommended that they stop trying and focus on high-risk groups and settings instead.
The CDC has patched together other disparate sources of data, each imperfect in its own way. A second database tracks how many COVID patients turn up in about 70% of the nation’s emergency departments and urgent care centers. It is an early warning signal of rising infections. But it is spotty: Many departments in California, Minnesota, Oklahoma and elsewhere do not participate.
Another database tracks how many hospital inpatients have COVID. It, too, is not comprehensive, and it is arguably inflated because totals include patients who were admitted for reasons other than COVID but tested positive during their stay. The CDC nevertheless relies partly on those hospital numbers for its rolling, county-by-county assessment of the virus’s threat.
There are bright spots. Wastewater monitoring, a new tool that helps spot incipient coronavirus surges, is now conducted at 1,182 sites around the country. The government now tests enough viral specimens to detect whether a new version of the virus has begun to circulate.
In the long run, officials hope to leverage electronic health records to modernize the disease surveillance system that all but collapsed under the weight of the pandemic. Under the new system, if a doctor diagnoses a disease that is supposed to be flagged to public health authorities, the patient’s electronic health record would automatically generate a case report to local or state health departments.
Hospitals and clinicians are under pressure to deliver: The federal government is requiring them to show progress toward automated case reports by year’s end or face possible financial penalties. So far, though, only 15% of the nearly 5,300 hospitals certified by the Centers for Medicare and Medicaid Services are actually generating electronic case reports.
And many experts say automated case reports from the private sector are only half the solution. Unless public health departments also modernize their data operations, they will be unable to process the reports that hospitals and providers will be required to send them.
“People often say, ‘That’s great, you put the pitchers on steroids, but you didn’t give the catchers a mask or a good mitt,’” said Micky Tripathi, the national coordinator for health information technology at the Department of Health and Human Services.
One Case, Many Data Systems
The effort to document the Fairbanks woman’s COVID case shows just how far many health departments have yet to go.
After the woman was tested, her workplace transferred her nasal swab to the Fairbanks state laboratory. There, workers manually entered basic information into an electronic lab report, searching a state database for the woman’s address and telephone number.
The state lab then forwarded her case report to the state health department’s epidemiology section, where the same information had to be retyped into a database that feeds the CDC’s national disease surveillance database. A worker logged in and clicked through multiple screens in yet another state database to learn that the woman had not been vaccinated, then manually updated her file.
The epidemiology section then added the woman’s case to a spreadsheet with more than 1,500 others recorded that day. That was forwarded to a different team of contact tracers, who gathered other important details about the woman by telephone, then plugged those details into yet another database.
The result was a rich stew of information, but because the contact tracers’ database is incompatible with the public health researchers’ database, the information could not be easily shared at either the state or the federal level.
For example, when the contact tracers learned a few days later that the woman had been hospitalized with COVID, they had to inform the epidemiology section by email, and the public health researchers got the hospital’s confirmation by fax.
Tompkins said Alaska’s problem was not so much that it was short of information but that it was unable to meld the data it had into usable form. Alaska’s health officials reached the same conclusion as many of their state and federal counterparts: The disease surveillance system “did not work,” Tompkins said, “and we need to start rethinking it from the ground up.”
The CDC awarded Alaska a $3.3 million grant for data modernization last year. State officials considered that a start but anticipated much more when a second five-year public health grant for personnel and infrastructure was awarded this summer.
They hoped not only to improve their digital systems but also to beef up their tiny workforce, including by hiring a data modernization director.
Carrie Paykoc, the health department’s data coordinator, texted Zink at 8 p.m. June 22 after news of the grant arrived.
The award was $1.8 million a year, including just $213,000 for data modernization. “Pretty dire,” she wrote.
“We were hoping for moonshot funding,” Paykoc said. “We learned it was a nice camper van.”
This article originally appeared in The New York Times.