News

Jul 10 '24

Mining for Molecular Function from a Universe of Knowledge: Building Terpene DB

#AIFS

#Education

Kristin Singhasemanon

Justin Siegel

Did you know that many antioxidant bioactives, anticancer therapeutics, essential oils, and food flavors all come from the same class of molecules?

In the world of natural molecules, terpenes are one of the largest, most studied, and most widely commercialized classes. Among other qualities, terpenes are known for their strong and often pleasant odors, which we encounter in essential oils and food flavorings.

Biomolecules such as terpenes offer an enormous opportunity to better our lives, from life-saving therapeutics to foods to energy. Each molecule has a unique property and is produced through a unique set of proteins.

The key to unlocking this potential is identifying which protein has the function needed to generate the terpene of interest. The massive challenge today is determining which protein sequences we want. The number of possibilities is beyond astronomical: literally, on the order of 10 to the 400th power.
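To give a sense of where a number of that size can come from, here is a minimal sketch of the arithmetic. It assumes an alphabet of the 20 standard amino acids and treats every position in a protein chain as a free choice; the article does not state how its figure was derived, so this is an illustration rather than the source of the number. The function names are hypothetical.

```python
import math

# Assumption (not from the article): proteins are built from the
# 20 standard amino acids, and every position can vary freely.
AMINO_ACIDS = 20


def log10_sequence_count(length: int, alphabet: int = AMINO_ACIDS) -> float:
    """Return log10 of the number of distinct sequences of the given length."""
    return length * math.log10(alphabet)


def length_for_exponent(exponent: float, alphabet: int = AMINO_ACIDS) -> int:
    """Return the smallest chain length whose sequence count reaches 10**exponent."""
    return math.ceil(exponent / math.log10(alphabet))


if __name__ == "__main__":
    n = length_for_exponent(400)
    # Print the chain length and the exponent of the sequence count at that length.
    print(n, round(log10_sequence_count(n), 1))
```

Under these assumptions, a chain of roughly 300 amino acids, a fairly ordinary protein length, is already enough to reach a sequence space on the order of 10 to the 400th power.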

To get to the function, we need to connect a protein sequence to its structure, which then dictates function.

In the early 21st century, we learned how to readily obtain billions of protein sequences, but understanding how sequence relates to structure and function remained a bottleneck. In 2021, Google's DeepMind achieved a major breakthrough by applying cutting-edge AI tools to predict structure from sequence accurately, in a rapid, cost-effective, and commoditized way.

With a 'Holy Grail' of the biosciences in sight, a race to the finish line is on. In the last six months alone, we have seen about $2 billion of startup investment aimed at closing the final gap between protein sequence and function, unlocking biomolecular solutions to the world's biggest problems.

In academia, we don't have the money to compete with these large companies, so what is our role?

To provide training to students who can work at these companies or start their own
To provide public knowledge about technology gaps
To generate public datasets to address knowledge gaps

Dr. Siegel is working with graduate student Ian Anderson to explore approaches that use natural language processing and machine learning for the isolation and utilization of high-value terpenes, expanding their footprint in the therapeutics, pesticides, flavors, and materials markets.

In his talk, Dr. Siegel explains the steps they used to create Terpene DB and the surprises they encountered along the way. This model can be replicated with other classes of molecules to expand the library of publicly available datasets in the area of biomolecular function.

FEATURED SPEAKER: Justin B. Siegel, PhD, Professor of Chemistry, Biochemistry & Molecular Medicine and Faculty Director of the Innovation Institute for Food and Health (IIFH) at the University of California, Davis.
