Big data and Google BigQuery improve cancer drug development by detecting bacteria
Biotech firm BioCorteX uses what may be the world’s largest biological knowledge graph to understand how bacteria impact the effectiveness of cancer drugs
Developing new drugs is risky and expensive. Creating a new drug and bringing it to clinical trials can cost billions of pounds, with no guarantees of success. And sometimes a drug can fail to meet expectations during a clinical trial in one part of the world, even though it was effective in another.
One reason is the bacteria in the human body. Each person has a slightly different mix of bacteria in their bodies, and these bacteria are now known to play an important role in how well a medicine works – or even if it works at all.
Understanding that relationship is all the more important in cancer treatment, where bacteria in the tumour risk blocking potentially life-saving treatments.
The relationship between bacteria and drugs is extremely complex and hard to predict. But with a new drug, or “asset”, costing as much as $2.6bn to develop, being able to model that relationship is hugely important to both pharmaceutical researchers and clinicians.
BioCorteX is a specialist research company, set up to use advanced data science techniques to analyse the relationship between bacteria and drug candidates, with an initial focus on oncology and, in particular, antibody-drug conjugates.
By better understanding how bacteria interfere with medicines, BioCorteX and the drug researchers it works with aim to increase the success rate of drugs going through clinical trials. This should lead to shorter drug development cycles and more effective treatments for patients.
“One of the reasons we founded the company was the frustration that, when you look after people as a clinician, you spot that people respond very, very differently to treatments, and often it’s difficult to understand why,” says BioCorteX co-founder Nik Sharma. “We saw an opportunity for a stepwise change in how we think about drugs and pharmaceuticals. Bacteria, which are really integral to human health, actually interact with pharmaceuticals. We think that is one of the core reasons why drugs may work in an individual, but at a larger scale may fail.”
Clinical trials
In clinical trials, a drug might be successful in one geography or population group but fail in another due to different bacteria in the human body.
Understanding that relationship better is, though, a vast mathematical challenge, given the number of variables involved: both the human body’s bacteria and the number of drugs being tested. “The amount of bacteria that we have is phenomenal,” says Sharma. “The amount of pharmaceutical [treatments] is obviously large. The complexity is huge.”
Sharma and Mo Alomari, a Rolls-Royce engineer at the time, worked together on finding a solution. Alomari was working on ways to model systems with extremely large numbers of variables.
This offered a way to look at bacteria and drug interaction, and Sharma and Alomari went on to co-found BioCorteX. The idea was to test these interactions “in silico”, or on computer hardware.
To do this, the firm built one of the largest knowledge graphs in biology. Modelling the interaction between bacteria and a drug candidate involves 15 to 16 billion connections.
This was beyond the reach of any commercial off-the-shelf database or analytics tool. So, BioCorteX built its own, using Google’s BigQuery. “There is no software out there that can handle the knowledge graph of that size that we’ve been able to find, and we’ve looked at all commercial offerings out there,” says Sharma.
“What we’ve done is build our knowledge graph using BigQuery, and that’s what really allowed us to scale and, importantly, work economically, and issue new versions of our knowledge graph and merge those data two or three times a day.”
Knowledge graph
The knowledge graph has some three billion nodes and 16 billion edges, all stored on BigQuery.
“On the graph databases, there are a number of them out there,” says Alomari. “Not one of them is able to handle the billions of nodes. So, basically, we came up with a bespoke solution built on top of BigQuery, where we added the layer on top that basically treats BigQuery as a graph database.”
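BioCorteX has not published how its graph layer works, but the general idea of treating a relational warehouse as a graph can be sketched with two tables, one for nodes and one for edges, where a traversal becomes a join on the edge table. The minimal Python sketch below is an assumption-laden illustration only: the table names, columns, sample rows and the `bacteria_reaching_drug` helper are all hypothetical, and the standard-library `sqlite3` module stands in for BigQuery so the example runs locally.

```python
import sqlite3


def build_toy_graph():
    """Create an in-memory graph as two tables (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE nodes (id INTEGER PRIMARY KEY, kind TEXT, name TEXT);
        CREATE TABLE edges (src INTEGER, dst INTEGER, relation TEXT);
        """
    )
    # Illustrative rows only; these are not real biological data.
    conn.executemany(
        "INSERT INTO nodes VALUES (?, ?, ?)",
        [
            (1, "bacterium", "bacterium_A"),
            (2, "enzyme", "enzyme_X"),
            (3, "drug", "drug_Y"),
            (4, "bacterium", "bacterium_B"),
        ],
    )
    conn.executemany(
        "INSERT INTO edges VALUES (?, ?, ?)",
        [
            (1, 2, "produces"),
            (2, 3, "metabolises"),
            (4, 2, "produces"),
        ],
    )
    return conn


def bacteria_reaching_drug(conn, drug_name):
    """Return two-hop paths bacterium -> intermediate -> drug.

    Each hop is one row of the edge table, so a two-hop traversal
    is a self-join of `edges` on e1.dst = e2.src.
    """
    return conn.execute(
        """
        SELECT b.name, e1.relation, m.name, e2.relation, d.name
        FROM edges AS e1
        JOIN edges AS e2 ON e1.dst = e2.src
        JOIN nodes AS b ON b.id = e1.src
        JOIN nodes AS m ON m.id = e1.dst
        JOIN nodes AS d ON d.id = e2.dst
        WHERE b.kind = 'bacterium' AND d.kind = 'drug' AND d.name = ?
        ORDER BY b.name
        """,
        (drug_name,),
    ).fetchall()
```

Calling `bacteria_reaching_drug(build_toy_graph(), "drug_Y")` returns one row per path from a bacterium to the drug. The same join pattern is plain SQL, which is why a columnar warehouse can in principle serve graph-style queries over billions of edge rows, though the real system's design may differ entirely.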
If a scientist wants to run new data through the system, BioCorteX can do this several times a day at minimal cost. “It takes approximately 20 minutes,” says Sharma.
BioCorteX takes data from pharmaceutical companies and runs it through the knowledge graph to identify possible bacterial interference in a drug, and how it might impact its effectiveness in a larger number of patients.
“The bacteria makes some drugs for some individuals incompatible, whereas they’re compatible for others,” says Sharma. “We can determine that interaction. We can determine what is compatible at scale. So, the product is really the ability to be able to look at those assets fast forward.”
The process is quicker, and of course cheaper, than a clinical trial. Nor is BioCorteX’s analysis limited to new drugs.
“What we’re also able to do is look at assets that haven’t been successful,” says Sharma. “You would have seen that a number of studies fail. Those drugs are often what’s called ‘out licensed’, so another company will take them on and see whether they can develop them. We’re able to look at those assets and see whether they failed because of the hidden interaction between the bacteria, the tumour and the drug.”
This type of modelling is becoming more important as the development of medicines becomes more international. By using vast amounts of data, BioCorteX can run scenarios to model, say, a phase one study in Australia and then the differences between a phase two study in the US and Europe.
Nor is BioCorteX’s technology limited to cancer treatment. Although the current focus is on oncology and bacteria, the approach is already being used to study viruses and fungi. “The engines are applicable across different verticals; we’ve done some work in consumer health,” says Sharma.
“What we’re able to do is provide further insights,” he says. “This isn’t a choice. A pharmaceutical company or a drug company doesn’t have a choice of whether this interaction is occurring – it’s happening.
“So, they can either choose to understand it, or they can do what they’re doing now with a 96% failure rate. In the future, we hope we can offer the right drug, first time, for all.”
Originally published at ECT News