As AI is being used across the health sector to speed the search for new treatments, how do we keep it from being used to make bioweapons—viruses, toxins, bacteria, and the like?
That’s the focus of a paper recently published in Science, lead-authored by Fordham Law professor Doni Bloomfield. It offers a blueprint for regulating new biological data that could be used for such nefarious ends, borrowing a page from current policies that protect the privacy of people whose genetic data are used in research.
“We need to be taking the biosecurity risk posed by AI seriously,” said Bloomfield, an expert in biosecurity and health law, noting that “we have our first examples of AI-designed viruses—though fortunately, these viruses don’t infect people or animals; rather, they infect bacteria that can make people sick.”
His co-authors include colleagues at Johns Hopkins University’s Center for Health Security and other universities. They propose the creation of a government panel to regulate a narrow slice of yet-to-be-generated biological data that could be used to make bioweapons. The idea, Bloomfield said, is to closely guard data that could be used to create deadly weapons while also permitting research that advances human health and scientific understanding. Fordham Now sat down with Bloomfield to learn more.
What are bioweapons?
The term refers to viruses, toxins, bacteria, or fungi used as weapons. The U.S. dismantled its biological weapons program in the late 1960s and led the effort to get other nations to sign on to the Biological Weapons Convention after deciding that biological weapons are indiscriminate, cannot be well targeted, and should never be used.
What are the biosecurity concerns around AI?
Fortunately, making biological weapons is difficult; some states and several well-funded terrorist groups have tried and failed. The main worry is that AI could make it easier, enabling bad actors to design and deploy biological agents more dangerous than those found in nature. Without adequate safeguards, common tools like ChatGPT and Claude may eventually be able to help non-experts create infectious agents in a lab. And specialized AI models could take a more direct role in designing viruses and other threats, making connections that humans can’t, so that work that might have taken months or years can be done in days or weeks. Our proposed regulations would focus on these specialized models, making it harder to use them for concerning tasks like designing more infectious viruses.
Last year, researchers at Stanford used a biological AI model to design modified viruses that were more effective at killing E. coli than those found in nature. That capability could be used for good (to target antibiotic-resistant bacteria, for instance). But what if people try to optimize human-infecting viruses, like HIV or a coronavirus, to make them more durable and transmissible?
So how can regulation make a difference here?
Regulating biological AI models themselves would be difficult; the ability to build them is becoming relatively cheap and widespread. But if we can regulate a slice of the specialized data that the models depend on, we can make it harder to train models that could be misused, without affecting the vast majority of beneficial work.
Freely available biological data are great, and were essential to fighting COVID-19. But some data will be liable to misuse. We see an opportunity for well-targeted regulation: focus on a small subset of data that doesn’t yet exist, the kind that would make for especially dangerous AI models, and ensure that these data are available only to responsible scientists.
How would that work?
We can take a model from privacy law. Some people’s genetic data are held securely in trusted research environments, or TREs, where scientists can query them (on genetic propensities toward heart disease or obesity, say) without downloading the data directly. A similar approach could securely house biological data that might help create models with threatening capabilities.
We propose a five-tiered framework: the vast majority of data would sit on the unregulated bottom tier, data in the next two tiers would face light restrictions, and restrictions would be tightest for top-tier data. An expert panel created by the government would regulate how certain data are deposited in the TREs, and the panel would work with scientists to design the tiers of data. The data could be hosted by the government or by vetted private institutions.
These TREs could also be repositories in which scientists deposit biological data for AI training more generally. That way, the repositories could not only improve security but also accelerate the use of AI for good—for example, by helping design vaccines to guard against pandemics and other ills.
