Trial and error – with the vast majority of trials ending in error – has long been the default method of drug discovery. Of the USD2.6 billion spent each year on drug discovery by the world’s largest pharmaceutical companies, at least nine-tenths will go toward funding drug research that will not lead to a successfully marketed product.

There are many pitfalls. Finding active compounds that appear to show promise is hard enough in the first instance, but winnowing these down to the few with no serious side effects using traditional techniques is both costly and time-consuming. Both tasks – seeking likely compounds in the first place, and then undertaking basic qualification of the compounds for more serious investigation – are dependent upon data analysis and repetitive checks. Since these are activities at which the current state-of-the-art machine learning systems excel, it is little wonder that AI-driven drug discovery is all the rage among big pharmaceutical companies.

Syntax error

While pharmaceutical companies are the experts when it comes to traditional drug development pipelines, they must generally turn to the talents of others where AI is concerned. These range from relatively commoditized offerings (IBM’s Watson technology might be among the more broadly available solutions, although it is still scarcely “commercial, off-the-shelf” software) to entirely bespoke drug discovery/development machine learning systems developed by startups. Whether it is a traditional software licensing arrangement, some form of joint research and development agreement, a collaboration agreement, or an outright corporate buy-out, there will usually be some kind of legal arrangement between the pharmaceutical company and the technology company.

This arrangement sets the basis of the relationship and determines questions of ownership and risk in relation to the outputs of the AI system.

Data error

Articles addressing AI in life sciences will often consider the use of AI in a diagnostic context, and therefore (rightly) lean heavily on data protection and privacy issues arising from the use of patient data. While some drug development AIs might have access to patient records or similar personal data (and, to the extent that they do, they will need to be designed to comply with data protection requirements), in our experience the majority rely on data sources that are unlikely to contain much, if any, personal data.

Instead, systems identify promising compounds based upon surveys of relevant literature, published (anonymized) studies, and a broad array of similar unstructured sources. Using “unsupervised” machine learning techniques and natural language processing, the systems can surface compounds that may provide promising avenues for research, revealed in correlations that only become obvious once vast “data lakes” have been analyzed. Initial qualification of apparently promising compounds follows a similar approach: little (if any) use of personal data, but large quantities of biochemical data regarding the likely interactions of different compounds with biological systems.

Ownership of the incredibly valuable intellectual property rights in the discovered compounds, and the correlations derived from the data sets, will be uppermost in the minds of both the pharmaceutical company and the technology company. A close second will be questions associated with liability should anything go wrong. Therefore, when it comes to the most important issues to address in drug discovery agreements, data protection/GDPR issues take something of a back seat to these intellectual property and liability issues.

Licensing error

From an IP perspective, there are likely to be five separate ownership and usage rights questions to be addressed, relating to: (i) the (untrained) AI or machine learning software system; (ii) the unstructured data sets used to train the system; (iii) the rights in the newly trained system; (iv) the newly structured data or correlations derived from the unstructured data sets; and (v) the rights in relation to the compounds discovered and/or qualified by the system.

The technology company will obviously want to retain ownership of the underlying software system, granting a license to its pharmaceutical sector partner. Similarly, it is only really the pharmaceutical company that will be in a position to progress the discovered drugs through the various regulatory hurdles necessary to achieve marketing authorization for the final product. It therefore makes sense for ownership of the discoveries to be theirs, although the technology company may wish to secure a percentage of the profit achieved from sale of the final drug.

Unstructured data sets will often be provided by the pharmaceutical company, although it will need to check carefully that use of any periodicals, published data etc. within the system does not breach the terms of subscription arrangements with the relevant journal’s publishers or other relevant licensors.

The rights in the trained AI, and the newly structured data sets derived from the unstructured input data, will be perhaps the most hotly contended in terms of ownership. From the technology company’s perspective, these become assets that could be licensed to other potential customers or partners in the life sciences sector. From the pharmaceutical company’s perspective, they represent valuable research intelligence that provides a significant competitive advantage, and it will therefore wish to ensure that there is no possibility of their being disclosed or licensed to competitors. This is ultimately a commercial question for the parties to factor into their negotiations.

Errors and omissions insurance

While liability debates between contract parties are never truly straightforward, the liability question for drug development AI systems is far less vexed than in diagnostic contexts.

The major personal injury/false positive or false negative risks that must be addressed in a diagnostic context are far more remote in discovery-focused systems. Since any discovered or qualified drug must then undergo much additional in-vitro and in-vivo testing, culminating in human trials, before the drug is granted a marketing authorization, there is little real prospect of any issues that may manifest in use being definitively tied back to avoidable failures at the discovery or early qualification stages.

Therefore, the debate becomes much more akin to the usual vendor/customer liability conversation for any software system, although with the obvious caveat that any AI or machine learning system is a statistical engine. Even after thorough training, the system has a non-zero probability of error, and this must be factored into service standards and limitation of liability clauses.

Human error

For either pharma companies or tech companies that are considering an AI-driven drug discovery collaboration, it is imperative to have these and related questions in mind. To err might be human, but doing so in these contracts can have long-lasting and very expensive consequences.

If your business is embarking upon an AI or machine learning life sciences project, get in touch with your usual DLA Piper contact to find out more about how we can help. We would also be delighted if you were to join us on 15 October at our DLA Piper European Technology Summit 2019 in London, where one of the panel sessions is set to explore the regulatory framework likely to govern future innovation in AI-related technologies, robotics, automation and machine learning.