What do you get when the world’s largest CRM breaks into the research industry and leverages AI to build its products? You get ProGen, a new AI system that can design artificial enzymes from scratch that work just as well as real ones found in nature. ProGen was made by Salesforce Research (yes, that Salesforce) and uses language processing to learn about biology. In short, ProGen learns from amino acid sequences and generates new ones that can be made into working proteins.
In 1999, biologist Günter Blobel won the Nobel Prize in Physiology or Medicine for discovering that proteins carry built-in signals governing their transport and localization in the cell, but this new AI-powered tech may already be pushing protein science well past that milestone. ProGen speeds up the creation of new proteins, which can be used for many things, like medicines or breaking down plastic in landfills, presumably aiding us in avoiding the looming Great Garbage Avalanche of 2505.
“The artificial designs are better than ones made by the normal process,” said James Fraser, a scientist involved in the project. “We can now make specific types of enzymes, like ones that work well in hot temperatures or acid.”
To make ProGen, the scientists at Salesforce fed the system amino acid sequences from 280 million different proteins. The AI system quickly generated a staggering one million protein sequences, of which 100 were picked for testing. Of those, five were made into actual proteins and tested in cells. That’s just 0.0005% of the generated results! It seems like the next frontier is to develop an AI to test all the possibilities. Two of the artificial enzymes were just as good at breaking down bacterial cell walls as lysozyme, the natural enzyme found in egg whites. Even so, the two artificial sequences were only 18% identical to each other.
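For readers who want to check the numbers above, here is a minimal Python sketch of the selection funnel, plus a simplified way to think about "percent alike." The helper names (`percent`, `percent_identity`) and the two short example sequences are made up for illustration. Real sequence identity is computed after aligning the sequences, which this sketch assumes has already been done.

```python
def percent(part, whole):
    """Return `part` as a percentage of `whole`."""
    return 100.0 * part / whole


def percent_identity(a, b):
    """Share of matching positions between two equal-length sequences.

    Assumes the sequences are already aligned; this is a simplification,
    not the method used in the ProGen study.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)


# Figures reported in the article
generated = 1_000_000
picked = 100
tested_in_cells = 5

share_picked = percent(picked, generated)           # 0.01 (percent)
share_tested = percent(tested_in_cells, generated)  # 0.0005 (percent)

# Made-up five-letter sequences: 3 of 5 positions match
toy_identity = percent_identity("MKTAY", "MKSAV")   # 60.0 (percent)
```

By this measure, two enzymes that are 18% identical share fewer than one in five positions, which is what makes their similar performance notable.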
ProGen was built in 2020 on the kind of language model originally designed for writing text, similar to the one behind ChatGPT. The AI system learned the rules and structure of proteins by looking at a lot of data. The number of possible proteins is tremendous, but ProGen can still produce working enzymes, even when the results vary widely from one another.
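To give a feel for what "learning the rules from data" means, here is a toy Python sketch. It is not ProGen's architecture, which is a far larger neural language model. It is a simple bigram model that counts which amino acid tends to follow which, then samples a new sequence one letter at a time. The training sequences, function names, and the `^`/`$` start and end markers are all made up for illustration.

```python
import random
from collections import defaultdict

# Made-up training sequences in one-letter amino acid codes; not real proteins.
TRAINING = ["MKTAYIAK", "MKVAYLAK", "MKTAFIAR"]
START, END = "^", "$"


def train_bigram(sequences):
    """Count how often each residue follows each other residue."""
    counts = defaultdict(lambda: defaultdict(int))
    for seq in sequences:
        tokens = [START] + list(seq) + [END]
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts


def generate(counts, rng, max_len=20):
    """Sample a sequence residue by residue, weighted by the learned counts."""
    out = []
    prev = START
    while len(out) < max_len:
        options = counts[prev]
        residues = list(options)
        weights = [options[r] for r in residues]
        nxt = rng.choices(residues, weights=weights, k=1)[0]
        if nxt == END:
            break
        out.append(nxt)
        prev = nxt
    return "".join(out)


model = train_bigram(TRAINING)
sample = generate(model, random.Random(0))  # seeded for repeatability
```

Every made-up training sequence starts with "MK", so every sample does too. In that tiny sense the model has picked up a "rule" from its data, which is the same idea at a vastly smaller scale.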
“This is a new tool for protein engineers and we’re excited to see what it can be used for,” said Ali Madani, a scientist involved in the project. This project seems incredibly valuable and must have cost Salesforce a fortune to get going, so we’re surprised to see that the code for ProGen is available on GitHub for anyone who wants to try it (or add to it).