Language strategy¶

Why this repository uses both Python and Julia, and what each owns.

1. The question¶

The work splits into two very different kinds of task:

Text & structure work — tokenize and parse an authoring grammar, build an AST, normalize it into an IR, validate it, serialize it, and generate code for other languages.
Symbolic & numeric work — turn the IR into a ModelingToolkit System, simulate, linearize, and analyze it.

These have different "natural" languages. Forcing both into one runtime would either make the parser carry a heavy symbolic dependency, or make the symbolic work fight a language poor at parsing and CLI ergonomics.

2. The split¶

Concern	Language	Module
CLI / orchestration	Python	`src/model_parser/cli.py`
Tokenizer + expression parser	Python	`frontends/expr_parser.py`
Authoring frontend (INI → IR)	Python	`frontends/exprtk_ini.py`
IR data model + JSON Schema	Python (Pydantic)	`ir/`, `schema.py`
Validators (semantic + profile)	Python	`validation/`
Codegen IR → MTK `.jl`	Python	`backends/julia_mtk.py`
IR → `System` (in memory)	Julia	`julia/ModelParserJL`
Simulation / analysis / conformance reference	Julia	`julia/ModelParserJL` (+ SciML)

The boundary between the two languages is the IR JSON file — a plain, language-neutral artifact. Neither side imports the other; they communicate only through the IR.

3. Why Python owns the AST/IR and the codegen¶

Parsing & CLI strength. Python has excellent tooling for tokenizers, data modeling (Pydantic → free JSON Schema + validation), and CLIs (Typer).
No symbolic runtime needed to emit code. Generating a .jl file is pure string assembly from the IR tree; it needs no Julia process, so parse and emit are fast, dependency-light, and CI-friendly.
Ecosystem alignment. The org repository map already earmarks a Python orchestration CLI (apc-tooling); this repository is its concrete, focused first incarnation.

4. Why Julia owns the `System` build¶

Natural fit. ModelingToolkit is a Julia library; building a System is idiomatic there and benefits directly from MTK v11's faster TTFX.
Two complementary paths, same contract.
Codegen (emit julia, Python) → a durable, diffable .jl artifact. This is the default and the version-controlled form (see ADR 0002).
In-memory loader (ModelParserJL.build_system, Julia) → builds the System from IR JSON at run time, for dynamic/interactive use and as the conformance reference.

Because both consume the same IR and the same expression vocabulary, they cannot drift apart silently — conformance tests pin them to the same fixtures.

5. Avoiding the three-semantics trap¶

The known failure mode (org risk note) is having three independent implementations of expression semantics: Julia string rewrites, C++ ExprTk, and Python. This repository prevents that by:

representing expressions as an explicit tree in the IR (one semantics),
making every backend a renderer of that tree (no re-parsing),
testing rendering at the IR level (the same fixture drives Python codegen and, in time, the Julia loader and a C++ backend).

A future C++/realtime-cpp backend is therefore another renderer of the same tree, not a fourth parser.

6. Operational notes¶

Python deps via uv (uv add / uv sync); Julia deps via Pkg.
The Julia package keeps Manifest.toml out of version control (see .gitignore); the Python uv.lock is committed.
The Julia path is optional: a user who only wants IR → .jl never needs a Julia runtime; a user who wants in-memory systems never needs the generated file. Both are first-class.