2024–2025 Research Assistant · Drake University Published · UCNC 2025

Algebraic GPAC to CRN Compiler

GPAC
Input Format
General Purpose Analog Computer
CRN
Output Format
Chemical Reaction Networks
0 %
Correctness
Verified transformation
Optimized
Reactions
Minimal reaction count
01

Project Overview

A General Purpose Analog Computer specification is a continuous mathematical system. A Chemical Reaction Network is a set of discrete molecular interactions. This compiler translates the first into the second while keeping the math identical.

This work was conducted as a Research Assistant at Drake University (Sep 2024 – May 2025), supported by Department of Energy Office of Science Award DE-SC0024278. The resulting paper, "A Selective Dual-Railing Technique for General-Purpose Analog Computers," was published at UCNC 2025 (Unconventional Computation and Natural Computation), Springer LNCS.

The project lives at the intersection of compiler theory and molecular computing, contributing to Drake University's research in population protocols — distributed computation where agents with limited memory interact pairwise to collectively compute functions. The compiler makes CRN design tractable by automating a transformation that researchers previously did by hand.

The hard constraint is semantic preservation: the generated CRN must compute exactly the same function as the input GPAC specification, provably. The optimization challenge is minimizing the number of species and reactions produced, since every unnecessary species is one more thing a molecular implementation has to synthesize.

02

My Role

Compiler Design

Architected the multi-pass compiler: lexical analysis, parsing, semantic analysis, and a code generation phase that emits CRN reaction sets from the internal AST.

Transformation Algorithms

Developed the core algorithms that map GPAC differential equations to mass-action kinetics while preserving the computational semantics of the original specification.

Optimization Passes

Implemented algebraic simplification passes that identify equivalent species and merge them, and constraint-satisfaction search for minimal reaction sets that achieve the same output.

Formal Verification

Built bisimulation checking to verify behavioral correspondence between input GPAC and output CRN. Used Z3 and Mathematica for symbolic equivalence proofs on generated reactions.

03

Technical Stack

Languages

Python Core implementation
C++ Performance-critical
LaTeX Documentation
Mathematica Symbolic verification

Compiler Tools

PLY (Lex/Yacc) Parser generation
NetworkX Graph algorithms
SymPy Symbolic math
Z3 Solver Constraint solving

Research Tools

Git Version control
Jupyter Experiments
pytest Testing framework
Graphviz Visualization
04

Challenges & Solutions

Semantic Preservation Across Paradigms

GPAC dynamics are continuous — differential equations over real-valued signals. CRN reactions are discrete — integer-valued molecular counts governed by mass-action kinetics. Bridging those two without losing the mathematical meaning of the original specification is the core difficulty of the entire problem.

Fix. A formal framework using category theory to establish semantic equivalence, with bisimulation checking as the runtime verification method. Every generated CRN is tested for behavioral correspondence against the input GPAC before it leaves the compiler.

State Space Explosion

Naively translated CRNs grow exponentially. A moderately complex GPAC specification produces a CRN with so many species and reactions that no molecular implementation could handle it.

Fix. Algebraic simplification passes that identify equivalent species and merge them during compilation, reducing the reaction set to a minimum. Constraint satisfaction search finds the smallest set of reactions that produces the correct computational behavior.

Rate Constant Inference

CRN reactions don't execute instantly — their speed is governed by rate constants that determine how quickly species concentrations change. The correct constants aren't obvious from the GPAC equations, and wrong constants produce a CRN that computes the wrong answer at the right topology.

Fix. A rate constant inference algorithm grounded in mass-action kinetics, followed by numerical simulation to validate that generated CRNs converge to the correct output values before the compilation is considered complete.

05

Key Learnings

Compilers make assumptions about what programs mean. When the source and target languages operate on fundamentally different physical principles, those assumptions need formal proofs, not just tests.

01

Multi-pass compiler optimization isn't just about speed. The simplification passes here existed to make the output physically realizable, not to make it run faster.

02

Bisimulation is a more useful correctness criterion than testing for research compilers. Tests verify specific inputs; bisimulation proves the transformation holds for all of them.

03

Population protocols theory changes how you think about computation. Algorithms that work when agents have no memory and interact randomly are genuinely surprising to reason about.

04

Writing for academic publication is the opposite of writing documentation. Every claim needs a citation or a proof. Handwaving that would pass code review will fail peer review.