Language strategy¶
Why this repository uses both Python and Julia, and what each owns.
1. The question¶
The work splits into two very different kinds of task:
- Text & structure work — tokenize and parse an authoring grammar, build an AST, normalize it into an IR, validate it, serialize it, and generate code for other languages.
- Symbolic & numeric work — turn the IR into a ModelingToolkit
System, simulate, linearize, and analyze it.
These have different "natural" languages. Forcing both into one runtime would either make the parser carry a heavy symbolic dependency, or make the symbolic work fight a language poor at parsing and CLI ergonomics.
2. The split¶
| Concern | Language | Module |
|---|---|---|
| CLI / orchestration | Python | src/model_parser/cli.py |
| Tokenizer + expression parser | Python | frontends/expr_parser.py |
| Authoring frontend (INI → IR) | Python | frontends/exprtk_ini.py |
| IR data model + JSON Schema | Python (Pydantic) | ir/, schema.py |
| Validators (semantic + profile) | Python | validation/ |
Codegen IR → MTK .jl |
Python | backends/julia_mtk.py |
IR → System (in memory) |
Julia | julia/ModelParserJL |
| Simulation / analysis / conformance reference | Julia | julia/ModelParserJL (+ SciML) |
The boundary between the two languages is the IR JSON file — a plain, language-neutral artifact. Neither side imports the other; they communicate only through the IR.
3. Why Python owns the AST/IR and the codegen¶
- Parsing & CLI strength. Python has excellent tooling for tokenizers, data modeling (Pydantic → free JSON Schema + validation), and CLIs (Typer).
- No symbolic runtime needed to emit code. Generating a
.jlfile is pure string assembly from the IR tree; it needs no Julia process, soparseandemitare fast, dependency-light, and CI-friendly. - Ecosystem alignment. The org repository map already earmarks a Python
orchestration CLI (
apc-tooling); this repository is its concrete, focused first incarnation.
4. Why Julia owns the System build¶
- Natural fit. ModelingToolkit is a Julia library; building a
Systemis idiomatic there and benefits directly from MTK v11's faster TTFX. - Two complementary paths, same contract.
- Codegen (
emit julia, Python) → a durable, diffable.jlartifact. This is the default and the version-controlled form (see ADR 0002). - In-memory loader (
ModelParserJL.build_system, Julia) → builds theSystemfrom IR JSON at run time, for dynamic/interactive use and as the conformance reference.
Because both consume the same IR and the same expression vocabulary, they cannot drift apart silently — conformance tests pin them to the same fixtures.
5. Avoiding the three-semantics trap¶
The known failure mode (org risk note) is having three independent implementations of expression semantics: Julia string rewrites, C++ ExprTk, and Python. This repository prevents that by:
- representing expressions as an explicit tree in the IR (one semantics),
- making every backend a renderer of that tree (no re-parsing),
- testing rendering at the IR level (the same fixture drives Python codegen and, in time, the Julia loader and a C++ backend).
A future C++/realtime-cpp backend is therefore another renderer of the same
tree, not a fourth parser.
6. Operational notes¶
- Python deps via uv (
uv add/uv sync); Julia deps via Pkg. - The Julia package keeps
Manifest.tomlout of version control (see.gitignore); the Pythonuv.lockis committed. - The Julia path is optional: a user who only wants IR →
.jlnever needs a Julia runtime; a user who wants in-memory systems never needs the generated file. Both are first-class.