Google's artificial intelligence predicts the structure of all known proteins and opens a new universe for science


AlphaFold prediction of the structure of vitellogenin, an essential protein for all egg-laying animals. DeepMind

An artificial intelligence owned by Google has predicted the structure of almost all known proteins: some 200 million molecules that are essential to understanding the biology of every living being on the planet and the mechanisms of some of the most prevalent diseases, from malaria to Alzheimer’s and cancer.

“This work ushers in a new era of digital biology,” said Demis Hassabis, the 45-year-old programming and neuroscience expert who is the main creator of AlphaFold, the neural network system that has almost completely solved one of the biggest problems in biology.

Hassabis, who is British, was a young chess and video game prodigy who founded DeepMind in 2010, a company focused on creating an artificial intelligence capable of learning the way humans do. In 2013, its system proved better than any human at playing Atari video games. The following year, Google bought the company for around 500 million euros. In 2017, AlphaGo swept aside the top champions of Go, the highly complex Asian board game that resembles chess. Since then, Hassabis has focused his efforts on a much bigger challenge: predicting the three-dimensional shape a protein will take by reading only its genetic sequence, written in two dimensions with DNA letters.

Working out the three-dimensional structure of these molecules from their genetic sequence is essential to understanding their function, but it is a problem of immense difficulty. It is like finishing a puzzle with tens of thousands of pieces without knowing what image it represents.

Until this system appeared, elucidating the shape of a single protein made up of 100 basic units, called amino acids, could take 13.7 billion years, the age of the universe. At best, it took scientists years using X-ray microscopy or huge particle accelerators such as the European synchrotron in Grenoble, France. By contrast, Google’s algorithm predicts the structure of any protein in a few seconds.

“This universe of proteins” is “a gift to humanity”, Hassabis said as he presented the new database at a press conference held on Tuesday together with scientists from the European Molecular Biology Laboratory (EMBL), a public institution that collaborated in the development of AlphaFold.

Until the arrival of this technology, the structure of some 200,000 proteins had been determined, a task that took 60 years and the participation of millions of scientists. That database has been the learning material for Google’s artificial intelligence, which searched it for valid patterns that predict the shape of proteins for which only the two-dimensional sequence is known. By 2021, the system had already solved the structure of a million proteins, including all human ones. This year’s new release extends the tally to 200 million: practically all the known proteins of all living beings on the planet.

Access to this new database is open and free, and the computer code of its artificial intelligence is open and downloadable. This “Google of life” shows the two-dimensional sequence of any protein alongside a three-dimensional model that indicates how reliable the prediction is. Its margin of error is similar to, or even lower than, that of conventional methods.

It is important to note that AlphaFold does not determine reality, but rather predicts it. It reads the genetic sequence and estimates the most likely way the amino acids will be arranged. The prediction is highly reliable, which saves scientists a lot of time and money: they can do theoretical work and turn to expensive equipment to determine the actual structure of a protein only when absolutely necessary.

The applications of this new tool are almost endless, since microscopic proteins are involved in every imaginable biological process, from the mass death of bees to the resistance of crops to heat, by way of countless diseases.

The team led by Matt Higgins, from the University of Oxford (United Kingdom), has used AlphaFold as part of its project to develop an antibody, a type of protein, capable of neutralizing one of the proteins that the malaria pathogen needs in order to reproduce. Within years, this type of research could yield the first highly protective vaccine against the disease, since it would prevent the parasite from being transmitted from one person to another through mosquito bites.

Milestones achieved

Another milestone already achieved is the most detailed structure to date of the nuclear pore, a doughnut-shaped complex of proteins that serves as the entrance and exit gate of the nucleus of human cells and that is linked to countless diseases, including cancer and cardiovascular diseases. This new tool gives unprecedented insight into “how the recipe of life [written in the genome] comes into operation when it is translated into proteins”, Jan Kosinski, a researcher at the EMBL who co-authored this finding, explained to this newspaper.

Hassabis and the others responsible at DeepMind and the EMBL have given assurances that the possible risks of publishing this database and making it accessible to everyone have been analyzed. “The benefits are clearly greater than the threats”, said the creator of the system, who added that in the future, as this technology develops, it will be up to the international community to decide whether its use should be limited.

One of the most tangible applications is the design of tailor-made molecules that can block harmful proteins or, better yet, modulate their activity, a much more desirable effect in the design of new drugs, explains Carlos Fernández, CSIC scientist and group leader of structural biology of the Spanish Society of Molecular Biology. His team has used AlphaFold to elucidate part of the structure of a complex made up of several proteins essential for the propagation of the trypanosome that causes sleeping sickness, a disease found in sub-Saharan African countries.

Years of work now lie ahead to confirm whether the predictions are correct, explains biologist José Márquez, an expert in protein structure at the Grenoble synchrotron. “The next frontier will be for AlphaFold to contribute to the design of drugs that activate or block proteins, a problem they are already addressing,” he says. Another stumbling block: the system does not say why a protein adopts its final shape, something that can be essential in research into diseases such as Alzheimer’s or Parkinson’s, which are related to incorrect protein folding.

Alfonso Valencia, director of life sciences at the Barcelona Supercomputing Center, points to the shortcomings of the system. “Not everything is solved, because AlphaFold can only predict things that are in the domain of known things. For example, it cannot accurately predict the structure of a type of protein that protects against freezing, because such proteins are rare and there are not many examples in the databases. Nor can it predict the consequences of mutations, which is a major drawback for medicine”, he says.

He also acknowledges one of its strengths: the code for the entire system is open, meaning that other scientists can improve or modify it as they please, even if Google decides to take the system offline. “It is evident that the people at DeepMind are seeking to win the Nobel Prize by acting in this transparent way,” says Valencia. “On the one hand, they gain a great public image and an advantage over their competitors, like Facebook. On the other hand, they have already suggested that they are reserving for private use specific data on health and drug design,” he adds.
