Part 4: Large Language Models have Roadblocks to Discovery

Biomimetic Data Selection for Drug Discovery

According to the article "Artificial intelligence in drug discovery: what is realistic, what are illusions?" (https://pubmed.ncbi.nlm.nih.gov/33346134/), computational approaches to drug discovery focus on algorithms, but two questions come logically first: which data are being used to achieve a goal, and whether these data allow us, even in principle, to answer the question at hand.

The article identifies biological complexity as the core challenge:

  • Biological systems suffer from drift (e.g., cell line drift), plasticity and high heterogeneity, which is difficult to define. In many cases, this even leads a cell line to respond differently to the same drug.
  • Biological readouts are highly dependent on the experimental system/assay used and are thus often not reproducible. Correction of batch effects is needed (e.g., for the integration of single-cell RNAseq data or histopathology images).
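
To make the batch-effect point concrete, here is a minimal sketch of the simplest form of correction, per-batch mean-centering. It is not the method used in the cited article; the function name, batch labels, and readout values are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from statistics import mean


def center_by_batch(readouts):
    """Shift each batch so its mean equals the global mean.

    `readouts` is a list of (batch_id, value) pairs (a hypothetical format).
    """
    by_batch = defaultdict(list)
    for batch_id, value in readouts:
        by_batch[batch_id].append(value)

    batch_means = {b: mean(values) for b, values in by_batch.items()}
    global_mean = mean(value for _, value in readouts)

    # Remove the batch offset, then restore the overall level.
    return [(b, v - batch_means[b] + global_mean) for b, v in readouts]


# Hypothetical assay readouts: batch "B" reads 2 units higher than batch "A".
readouts = [("A", 1.0), ("A", 3.0), ("B", 3.0), ("B", 5.0)]
corrected = center_by_batch(readouts)
print(corrected)
```

Mean-centering only removes a constant offset per batch. If batches are confounded with the biological condition being studied, it also removes the real signal, which is one reason the choice of data has to be settled before the choice of algorithm.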

Target-based LLM processes cannot incorporate critical context elements (red rectangle).

The section on Modeling Abstract Concepts explained that building conceptual models that represent the real world requires a biomimetic digital twins ecosystem approach that begins with “Identifying the real-world components that are critical to the model purpose.” Identifying the required components drives model conceptualization and the selection of relevant data. The importance of this ecosystem modeling approach was demonstrated in the Princeton study on the stealth mode capability of the Hawaiian bobtail squid, which led to the discovery of the microbiome and the metagenome.

The squid hunts at night near the water surface. It uses a sensor to measure the overhead moonlight and starlight and dynamically adjusts the cover of its underbelly light organ to match the overhead light, making it invisible from below. It hatches with an underdeveloped light organ and no bioluminescent function; Vibrio fischeri bacteria then enter and colonize the light organ. The bacterial genome stimulates the squid genome to complete the development of the light organ and provides the necessary bioluminescence.

dynamic adaptability to changing light

The fascinating insights about the squid led to critical advancements in understanding human biology.

Identifying the real-world components that are critical to the model purpose helps scope the required data for discovery. The next step is twinning each component independently to the level of detail required by the purpose.
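
As a rough illustration of how a component list can scope data, the sketch below keeps only the datasets that cover at least one critical component and reports components that no dataset covers. The function name, component labels, and dataset names are all hypothetical; they are loosely based on the squid example and do not describe any actual tooling.

```python
def scope_datasets(critical_components, datasets):
    """Return (relevant dataset names, components with no covering dataset).

    `datasets` maps a dataset name to the components it describes
    (a hypothetical catalog format).
    """
    needed = set(critical_components)

    # A dataset is relevant if it covers at least one critical component.
    relevant = [name for name, covered in datasets.items() if needed & set(covered)]

    covered_all = set()
    for covered in datasets.values():
        covered_all.update(covered)

    # Components that no available dataset describes are data gaps.
    gaps = sorted(needed - covered_all)
    return relevant, gaps


# Hypothetical components and catalog for a light-organ model.
components = ["squid genome", "bacterial genome", "overhead light"]
datasets = {
    "squid_rnaseq": ["squid genome"],
    "vibrio_assembly": ["bacterial genome"],
    "tank_temperature_log": ["water temperature"],
}
relevant, gaps = scope_datasets(components, datasets)
print(relevant, gaps)
```

In this toy catalog the temperature log is excluded as irrelevant to the model purpose, and the overhead light shows up as a gap: a critical component for which no data has been collected yet.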

About the author: Joe Glick, Co-Founder, Chief Innovation Officer, RYLTI
