FLARE¶
- class milp_flare.flare.FLARE(harness)[source]¶
Agentic MILP reformulation verifier.
FLARE (Formulation-Level Automated Reformulation Evaluation) uses an LLM-based agent and the Lean proof assistant to verify MILP reformulations according to the FormulationBench definition of reformulation. See the FLARE Paper for more details.
- Parameters:
- harness
Harness The agent harness to use for auto-formalization and proof synthesis. See Agent Harnesses for the available harnesses.
- harness
- Attributes:
- harness
Harness The configured agent harness.
- harness
Examples
Use FLARE to verify if formulation
bof problemp1from FormulationBench is a reformulation of formulationa(also see Running FLARE on FormulationBench):from pathlib import Path from formulation_bench import Dataset from milp_flare import FLARE, FormulationInput from milp_flare.harness import ClaudeCodeHarness ds = Dataset.load() a = ds.problems[1].formulations["a"] b = ds.problems[1].formulations["b"] harness = ClaudeCodeHarness(model="claude-opus-4-7") flare = FLARE(harness=harness) result = flare.verify( FormulationInput(a.render_markdown(), a.gen_solve_py()), FormulationInput(b.render_markdown(), b.gen_solve_py()), output_path=Path("runs/p1_a_b"), )
- verify(a, b, output_path)[source]¶
Run FLARE on a pair of MILP formulations.
Run FLARE to determine if formulation
bis a reformulation of formulationaaccording to the FormulationBench definition of reformulation. This method creates the agent working directory (see below), triggers the agent, and evaluates the agent output. Finally, it populatesoutput_pathwith:The agent working directory (
wd/)The FLARE configuration (
config.json)The result dictionary (
result.json)
The agent working directory contains descriptions of each formulation, the agent prompt
prompt.txt(see Prompts), and the necessary Lake environment files.wd/ ├── A/ │ ├── formulation.md │ ├── solve.py │ └── Formulation.lean # written by agent ├── B/ │ ├── formulation.md │ ├── solve.py │ └── Formulation.lean # written by agent ├── Reformulation.lean # written by agent ├── prompt.txt ├── Common.lean ├── lake-manifest.json ├── lakefile.toml └── lean-toolchain
- Parameters:
- a
FormulationInput Inputs for formulation A.
- b
FormulationInput Inputs for formulation B (the candidate reformulation of A).
- output_path
pathlib.Path Directory to populate with run artifacts.
- a
- Returns:
- result
FLAREResult The verdict, duration, cost, and per-check metadata.
- result
Examples
Run FLARE on formulations
aandband inspect the result:flare = FLARE(harness=harness) result = flare.verify(a, b, output_path=Path("runs/a_b")) result.is_reformulation True result.metadata["form_a_written"] True result.cost_usd 1.49 result.duration_s 322
See Running FLARE on FormulationBench for more example usage.
- class milp_flare.flare.FormulationInput(formulation_md, solve_py)[source]¶
Formulation input for the FLARE agent.
- Attributes:
- formulation_md
str Markdown description of the formulation. Typically produced by
Formulation.render_markdown()from FormulationBench.- solve_py
str Python solver script with Gurobi implementation of the formulation.
FLAREdoes not execute this script; it is only used as an additional reference for the agent. Typically produced byFormulation.gen_solve_py()from FormulationBench.
- formulation_md
Examples
Construct a
FormulationInputfrom formulationp1.afrom FormulationBench:>>> from formulation_bench import Dataset >>> from milp_flare import FormulationInput >>> ds = Dataset.load() >>> a = ds.problems[1].formulations["a"] >>> inp = FormulationInput(a.render_markdown(), a.gen_solve_py()) >>> print(inp.formulation_md) # Amusement Park Ticket Machines ## Problem Description An amusement park is installing cash-based machines and card-only machines... >>> print(inp.solve_py) import json from gurobipy import Model, GRB import argparse def main(params_path: str, solution_path: str) -> None: ...
- class milp_flare.flare.FLAREResult(is_reformulation, duration_s=None, cost_usd=None, metadata=<factory>)[source]¶
Result from FLARE.
- Attributes:
- is_reformulationbool
Final verdict. True if all sub-checks pass: both
Formulation.leanfiles compile,Reformulation.leancompiles and contains adef : MILPReformulation, and the proof issorry-free using only theSTANDARD_AXIOMS.- duration_s
float, optional Wall-clock duration of the agent run, in seconds.
- cost_usd
float, optional Estimated USD cost of the agent run.
Nonewhen the harness does not report cost.- metadata
dict[str,Any] Per-check breakdown and harness run metadata:
form_a_written,form_a_compiled,form_b_written,form_b_compiled,proof_compiled,milp_reform_found,sorry_free,no_new_axioms,axioms,agent_decision, and token counts.
- milp_flare.flare.STANDARD_AXIOMS = frozenset({'Classical.choice', 'Quot.sound', 'propext'})¶
Axioms permitted in a verified reformulation proof. All of Mathlib is built on these three; a proof depending on anything else is not trusted by FLARE.