The Deterministic Biological Knowledge Graph
How Haeckel cross-references your biomarkers, supplements, and wearables against your genome through a curated graph that flags drug-gene conflicts before you take anything.
The Optimize surface is where the platform brings together everything you track in order to improve: biomarkers, wearables, supplements and supplement stacks, sleep, cognitive assessments, hormones, and body composition. Underneath that surface sits a Deterministic Biological Knowledge Graph (DBKG), which encodes the relationships between the nodes the platform tracks: your variants, your biomarkers, the supplements and drugs you might take, the metabolic pathways those substances pass through, and the published evidence behind each link.
Why "deterministic" matters
The graph is not a generative model. It does not invent connections, and it does not hallucinate. Every edge has a source, a strength, and a directionality, and every conclusion the platform draws is traceable back through the graph to the underlying citations. When the platform tells you that a particular CYP2D6 genotype reduces tamoxifen activation, that conclusion comes from a chain of edges you can inspect, not from a probabilistic guess.
PGx safety gating
Whenever you add a new supplement or upload a new prescription, the platform walks the DBKG to find every gene-substance edge involving that compound. Each edge is checked against your actual genotype, and any conflict above a clinical-evidence threshold is surfaced as a safety gate before the compound is added. The gates are based on CPIC guidelines for prescription drugs and on a curated literature base for over-the-counter supplements.
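The gating check described above can be sketched in a few lines of Python. This is a minimal illustration, not the platform's implementation: the `GeneSubstanceEdge` class, the `safety_gates` function, the rank table, the default threshold of level B, and the `GENE_X` entry are all hypothetical. Only the CYP2D6 and tamoxifen pairing at CPIC Level A comes from this article.

```python
from dataclasses import dataclass

# Assumed ordering for illustration: a lower rank means stronger evidence.
CPIC_RANK = {"A": 0, "B": 1, "C": 2, "D": 3}


@dataclass(frozen=True)
class GeneSubstanceEdge:
    gene: str
    phenotype: str   # the phenotype this edge applies to
    substance: str
    cpic_level: str  # "A" through "D"


def safety_gates(edges, substance, user_phenotypes, threshold="B"):
    """Return edges for `substance` that match the user's phenotype and
    whose evidence level is at or above the (assumed) threshold."""
    cutoff = CPIC_RANK[threshold]
    return [
        e for e in edges
        if e.substance == substance
        and user_phenotypes.get(e.gene) == e.phenotype
        and CPIC_RANK[e.cpic_level] <= cutoff
    ]


# Toy edge list. GENE_X is a placeholder, not a real curated edge.
EDGES = [
    GeneSubstanceEdge("CYP2D6", "poor metaboliser", "tamoxifen", "A"),
    GeneSubstanceEdge("GENE_X", "poor metaboliser", "tamoxifen", "D"),
]

user = {"CYP2D6": "poor metaboliser", "GENE_X": "poor metaboliser"}
gates = safety_gates(EDGES, "tamoxifen", user)
for g in gates:
    print(f"Safety gate: {g.gene} ({g.phenotype}) vs {g.substance}, CPIC {g.cpic_level}")
```

In this sketch the level D edge matches the user's phenotype but falls below the threshold, so it does not raise a gate.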
What the graph tracks today
- Variants: every clinically significant variant in the 89-gene panel plus the 12 pharmacogenes.
- Biomarkers: roughly 200 lab values across lipids, metabolic panels, hormones, vitamins, and inflammation.
- Substances: every drug in the FDA pharmacogenomic biomarker table plus 700 commonly used supplements.
- Pathways: ~40 curated biological pathways covering metabolism, neurotransmission, hormone signalling, and immune function.
- Evidence: every edge carries a CPIC level (A through D for drugs) or a literature confidence score (curated for supplements).
Graph schema
The DBKG is a property graph implemented in PostgreSQL. Nodes and edges are both typed, and both carry structured properties.
- Node types: Variant (rsID, gene, position, classification), Biomarker (LOINC code, units, reference range), Substance (drug name, ATC code or supplement equivalent, mechanism class), Pathway (KEGG ID, name, organism), Outcome (UMLS concept, condition name, severity), Source (publication, guideline body, evidence tier).
- Edge types: modulates (positive or negative effect on a biomarker or pathway), metabolizes (a gene product breaks down a substance), inhibits and induces (a substance affects gene expression or enzyme activity), treats (a substance is indicated for an outcome), contraindicates (a substance should not be combined with a genotype or another substance), evidence-supports (a Source connects to a fact-edge with a confidence weight).
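To make the schema concrete, the node and edge types listed above could be mirrored in application code roughly as follows. This is a sketch under stated assumptions: the class names, field names, and node identifiers are invented for illustration, and the real storage layer is PostgreSQL rather than Python objects.

```python
from dataclasses import dataclass, field
from enum import Enum


class NodeType(Enum):
    VARIANT = "Variant"
    BIOMARKER = "Biomarker"
    SUBSTANCE = "Substance"
    PATHWAY = "Pathway"
    OUTCOME = "Outcome"
    SOURCE = "Source"


class EdgeType(Enum):
    MODULATES = "modulates"
    METABOLIZES = "metabolizes"
    INHIBITS = "inhibits"
    INDUCES = "induces"
    TREATS = "treats"
    CONTRAINDICATES = "contraindicates"
    EVIDENCE_SUPPORTS = "evidence-supports"


@dataclass
class Node:
    node_id: str  # hypothetical identifier scheme
    node_type: NodeType
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    edge_type: EdgeType
    source_id: str
    target_id: str
    properties: dict = field(default_factory=dict)


def edges_into(edges, target_id, edge_type):
    """Return every edge of the given type that points at target_id."""
    return [e for e in edges if e.target_id == target_id and e.edge_type == edge_type]


# Illustrative instances only.
variant = Node("variant:CYP2D6*4", NodeType.VARIANT, {"gene": "CYP2D6"})
drug = Node("substance:tamoxifen", NodeType.SUBSTANCE, {"drug_name": "tamoxifen"})
edges = [Edge(EdgeType.METABOLIZES, variant.node_id, drug.node_id)]

incoming = edges_into(edges, drug.node_id, EdgeType.METABOLIZES)
```

Keeping the type in an enum and the details in a `properties` dictionary matches the property-graph idea: a small fixed vocabulary of types with free-form structured attributes per node or edge.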
Source taxonomy and how confidence is assigned
Every edge in the DBKG carries at least one Source pointer with an evidence tier. The tier system is layered:
- CPIC (Clinical Pharmacogenetics Implementation Consortium) levels A through D for prescription drug-gene pairs. Level A is the strongest: drug-gene interactions with consensus prescribing guidance and demonstrated patient outcome impact. Level D is preliminary or contested.
- DPWG (Dutch Pharmacogenetics Working Group) guidelines as a secondary source for pairs that CPIC has not graded.
- FDA drug labelling pharmacogenomic biomarker table for any pair the FDA has formally listed.
- Peer-reviewed pharmacology literature for substance-substance interactions where regulatory bodies have not weighed in.
- NIH Office of Dietary Supplements evidence tiers for supplement claims (effective, possibly effective, insufficient evidence, possibly ineffective, unproven).
- Mechanistic literature for known molecular interactions where outcome data is sparse.
When two Sources disagree, the higher-tier one wins, and the disagreement is logged for human review. The platform never silently averages contradictory evidence into a single number: it either surfaces both Sources with their respective tiers or withholds a conclusion and flags "insufficient evidence".
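The disagreement rule can be expressed as a small resolution function. The sketch below assumes that the order of the tier list above is also the ranking order, and that a tie between equally ranked Sources is the case where the conclusion is withheld. Neither assumption is confirmed by this article, and the names `Claim`, `Resolution`, and `resolve` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed ranking: the order of the tier list, strongest first.
TIER_RANK = {
    "CPIC": 0,
    "DPWG": 1,
    "FDA_LABEL": 2,
    "PEER_REVIEWED": 3,
    "NIH_ODS": 4,
    "MECHANISTIC": 5,
}


@dataclass(frozen=True)
class Claim:
    tier: str
    conclusion: str  # e.g. "conflict" or "no conflict"


@dataclass
class Resolution:
    conclusion: Optional[str]  # None stands for "insufficient evidence"
    shown: list                # every claim surfaced, with its tier
    needs_review: bool         # True when a disagreement was logged


def resolve(claims):
    """Pick a conclusion without ever averaging contradictory evidence."""
    if not claims:
        return Resolution(None, [], False)
    ranked = sorted(claims, key=lambda c: TIER_RANK[c.tier])
    if len({c.conclusion for c in ranked}) == 1:
        return Resolution(ranked[0].conclusion, ranked, False)
    # Disagreement: look only at the claims in the strongest tier present.
    top_rank = TIER_RANK[ranked[0].tier]
    top = [c for c in ranked if TIER_RANK[c.tier] == top_rank]
    if len({c.conclusion for c in top}) == 1:
        return Resolution(top[0].conclusion, ranked, True)
    # Equally ranked Sources contradict each other: withhold a conclusion.
    return Resolution(None, ranked, True)


mixed = resolve([Claim("MECHANISTIC", "no conflict"), Claim("CPIC", "conflict")])
tied = resolve([Claim("PEER_REVIEWED", "conflict"), Claim("PEER_REVIEWED", "no conflict")])
```

Note that every claim stays in `shown` in both branches, so the lower-tier Source remains visible even when it loses.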
A concrete traversal
Suppose you add tamoxifen to your prescription list. The platform walks the DBKG as follows:
- Look up the Substance node for tamoxifen.
- Find every metabolizes edge pointing to tamoxifen. CYP2D6 is the primary metaboliser; CYP3A4 contributes secondarily.
- For each gene found, query your genotype. Suppose you carry CYP2D6*4/*4 (poor metaboliser).
- Walk to the contraindicates edge between (CYP2D6 poor metaboliser) and (tamoxifen). The edge exists, sourced to CPIC Level A.
- Render a safety gate in the UI: "Tamoxifen requires CYP2D6 conversion to its active metabolite endoxifen. Your CYP2D6 poor-metaboliser status produces sub-therapeutic endoxifen levels. CPIC recommends an alternative such as anastrozole or exemestane. Discuss with your prescriber."
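The walk above can be reproduced on a toy in-memory graph. This is a simplified sketch: the `Edge` tuple layout, the `gene:phenotype` key format, the diplotype lookup table, and the `find_gates` function are assumptions made for illustration, and the live graph is queried in PostgreSQL rather than scanned as a Python list.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Edge:
    kind: str
    source: str
    target: str
    evidence: Optional[str] = None


# Toy graph holding only the edges named in the walkthrough.
GRAPH = [
    Edge("metabolizes", "CYP2D6", "tamoxifen"),
    Edge("metabolizes", "CYP3A4", "tamoxifen"),
    Edge("contraindicates", "CYP2D6:poor metaboliser", "tamoxifen", "CPIC Level A"),
]

# Hypothetical lookup from (gene, diplotype) to phenotype.
DIPLOTYPE_TO_PHENOTYPE = {("CYP2D6", "*4/*4"): "poor metaboliser"}


def find_gates(substance, genotype):
    """Follow metabolizes edges to genes, map the user's genotype to a
    phenotype, then collect matching contraindicates edges."""
    gates = []
    genes = [e.source for e in GRAPH
             if e.kind == "metabolizes" and e.target == substance]
    for gene in genes:
        phenotype = DIPLOTYPE_TO_PHENOTYPE.get((gene, genotype.get(gene)))
        if phenotype is None:
            continue  # no genotype on file, or no phenotype mapping
        key = f"{gene}:{phenotype}"
        gates.extend(e for e in GRAPH
                     if e.kind == "contraindicates"
                     and e.source == key and e.target == substance)
    return gates


gates = find_gates("tamoxifen", {"CYP2D6": "*4/*4"})
for g in gates:
    print(f"Gate: {g.source} contraindicates {g.target} ({g.evidence})")
```

Each gate returned here still carries its evidence label, which is what lets the UI message cite CPIC Level A rather than an unattributed warning.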
The "insufficient evidence" branch
When the traversal cannot find an edge with adequate evidence, the platform reports "insufficient evidence" rather than greenlighting silently. The user is told that no conflict was found in the curated literature, that this does not guarantee safety, and that they should discuss with a prescriber if any concern arises. The honest "we do not know" is more useful than a confident "looks fine" that the data does not support.
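One way to keep this behaviour honest in code is to give the result type no "safe" value at all. The sketch below is illustrative only: the `Verdict` enum, the `report` function, and the exact message wording are assumptions, with the message paraphrasing the paragraph above.

```python
from enum import Enum


class Verdict(Enum):
    SAFETY_GATE = "safety gate"
    INSUFFICIENT_EVIDENCE = "insufficient evidence"
    # Deliberately no SAFE member: absence of an edge is not proof of safety.


INSUFFICIENT_MESSAGE = (
    "No conflict was found in the curated literature. "
    "This does not guarantee safety; discuss with a prescriber "
    "if any concern arises."
)


def report(adequate_edges):
    """Map the edges that met the evidence threshold to a verdict and message."""
    if adequate_edges:
        count = len(adequate_edges)
        return Verdict.SAFETY_GATE, f"{count} conflict(s) require review."
    return Verdict.INSUFFICIENT_EVIDENCE, INSUFFICIENT_MESSAGE


verdict, message = report([])
print(verdict.value, "-", message)
```

Because callers can only ever receive one of these two values, a UI built on this type cannot accidentally render a green "looks fine" state.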
User overrides with consent log
If the platform flags a conflict but the user (or their prescriber) has decided to proceed anyway, an override mechanism exists. The override creates an explicit consent record in the audit log: the conflict shown, the override action, the timestamp, and a free-text reason if provided. The override applies to that specific substance for that specific user, does not generalise, and remains visible in the safety panel as a yellow flag rather than disappearing entirely.
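A consent record with the fields listed above might look like the following. This is a hypothetical sketch: the `OverrideRecord` class, the in-memory `AUDIT_LOG` list, the status strings, and the user identifiers are invented for illustration and say nothing about how the platform actually stores its audit log.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class OverrideRecord:
    user_id: str
    substance: str
    conflict_shown: str
    action: str
    timestamp: datetime
    reason: Optional[str] = None  # free text, if provided


AUDIT_LOG = []  # stand-in for a persistent, append-only audit log


def record_override(user_id, substance, conflict_shown, reason=None):
    """Append an explicit consent record and return it."""
    record = OverrideRecord(
        user_id, substance, conflict_shown, "override",
        datetime.now(timezone.utc), reason,
    )
    AUDIT_LOG.append(record)
    return record


def panel_status(user_id, substance, has_conflict):
    """An overridden conflict stays visible as a yellow flag."""
    if not has_conflict:
        return "none"
    overridden = any(
        r.user_id == user_id and r.substance == substance for r in AUDIT_LOG
    )
    return "yellow" if overridden else "gate"


record_override("user-1", "tamoxifen", "CYP2D6 poor metaboliser, CPIC Level A",
                reason="Agreed with prescriber")
print(panel_status("user-1", "tamoxifen", True))  # overridden by this user
print(panel_status("user-2", "tamoxifen", True))  # still gated for another user
```

Matching on both `user_id` and `substance` is what keeps the override from generalising to other users or other compounds.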
Update cadence
New CPIC guidelines are published roughly monthly, FDA labelling changes arrive a few times per year, and the supplement evidence base updates continuously. The platform reviews every CPIC publication within two weeks of release and propagates new edges or revised confidence into the live DBKG. A change log is exposed at /help so users can see when their results were last refreshed against current evidence.
- Caudle KE et al. (2014). Incorporation of pharmacogenomics into routine clinical practice: the Clinical Pharmacogenetics Implementation Consortium (CPIC) guideline development process. Current Drug Metabolism.
- Swen JJ et al. (2023). A 12-gene pharmacogenetic panel to prevent adverse drug reactions: an open-label, multicentre, controlled, cluster-randomised crossover implementation study. Lancet.
- NIH Office of Dietary Supplements: ods.od.nih.gov.
- PharmGKB: pharmgkb.org.