sidekick/TODO.md
Simon Cruanes 01a4bfc50c
todo
2021-09-05 10:56:06 -04:00

182 lines
7.5 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

## TODO
- dyn-trans for CC: add hooks in CC explanation of conflict
so the theory can intercept that
- emit proofs for SMT;
* remove dproof, emit them directly (maybe have `emit_input` vs `emit_input_sat` so we can avoid duplicating `i` lines)
- data oriented congruence closure (all based on VecI32.t)
* then add Egg
- new proofs:
* simplify : use subproof
* preprocess: no subproof
* rule for proving `t=u` by CC modulo all previous unit equalities (needed
for simp/preprocess)
* to prove `if a b b = b`, do `a => if a b b = b`,
`¬a => if a b b = b`, and use resolution on these to remove conditions.
- dynamic ackermannization and transitivity (and maybe datatype acyclicity)
* clause pool for each type of dynamic inference (dynamic ack, etc.), with
its own GC limit.
* Probably need removable atoms too, once all clauses containing it are gone,
for atoms not added by the user explicitely.
- enable conflict minimization (from branch) in SAT solver
- sidekick-check:
* implement deletion
* seems like drat-trim is much faster in backward checking mode, see why
* implement checking of CC/preprocess
* checking of LRA
* checking of DT
- refactor:
* revamp msat proofs
(store proofs in allocator, also encoded as 0-terminated series of ints
indicating clause numbers?
this way we can GC them just like we GC clauses)
⇒ keep arrays, simpler
+ means the type of proofs is independent of variables
+ actually: try to use RUP (side buffer to store a representation of
a list of clauses, each "line" is `c_i := l1…ln 0` like dimacs.
Add delete lines to get DRUP.)
proof reconstruction to get, _if wanted_, proper resolution.
+ or just store clause premises in a set, use local RUP to reconstruct
(assert ¬C and propagate)
* fix proof production for minimized conflicts (or come back after proof are better)
- implement Egg paper
* use it for existing theories
- stronger preprocessing:
* first, split conjunction eagerly
* add every toplevel `x=t` where x atomic to a congruence closure;
then replace toplevel `assert p` with `assert [p]` where `[p]` is
the nested representative of `p` (replace each subterm by its representative).
proof is just `x1=t1 … xn=tn |- p=[p]`
* using Egg, should be possible to integrate theory preprocessing this way
* then:
- one-point rule for LRA and UF (`x = t /\ F[x]` becomes `x=t /\ F[t]` and
we forget `x` in practice)
- eliminate LRA equations true at level 0 (why is alt-ergo faster on uart7?)
- preprocess LRA for:
`./sidekick tests/unsat/pb_real_50_150_30_47.smt2`
`./sidekick tests/unsat/clocksynchro_9clocks.main_invar.base.smt2`
- proof production:
* [x] produce Quip
* [x] have it proofchecked
* [x] do not produce stuff like `(cc-imply (refl tseitin0) (= tseitin0 <the term>))`
* [x] do not produce stuff like `(hres (init (assert t)) (p1 (refl t)))`
(just `assert t` is enough)
* [ ] theory lemmas
- [x] functor for mapping equiv classes to values of some monoid, code to be
used in 99% of the theories
* [ ] also allow to register to "joins" between distinct classes
(useful for injectivity or array reduction or selector reduction)
→ in a way this must become a (backtracking) query API for making theories
easy to write
* [x] use it in th-data for cstors
* [x] use it in th-data for select
* [x] use it in th-data for is-a
* [ ] use it in th-array
* [ ] use it in th-define-subsort
**NOTE** this is actually subsumed by Egg
- provide a notion of smtlib theory, with `smtlib.ast --> Term.t option`
and typing rule (extensible parsing+typing)
* use if for th-bool
* use if for th-distinct
- cleanup of CC/optims
* store lits directly in (boolean) classes
- add terms of input goals pre-emptively to the CC/theories
- add `CC.add_rw_step t u` work, where `t-->u`
(keep `t` in sig table, but flag is so it's inert and will not
be pushed into `cc.pending` ever, should have almost 0 overhead after that)
* use it for local simplifications
* use it for function evaluation (i.e. rewrite rules)
- design evaluation system (guards + `eval:(term -> value) option` in custom TC)
- compilation of rec functions to defined constants
- theory of defined constants relying on congruence closure + evaluation
of guards of each case
- abstract domain propagation in CC
- domain propagation (intervals) for arith
## Main goals
- Allow to plug one's code into boolean propagation
* react upon propagation (possibly by propagating more, or side-effect)
* more advanced/specific propagation (2-clauses)?
* implement 'constraints' (see https://www.lri.fr/~conchon/TER/2013/3/minisat.pdf )
## Long term goals
- max-sat/max-smt
- coq proofs ?
## Done
- write theory of datatypes (without acyclicity) with local case splits + injectivity + selectors
- datatype acyclicity check
- make CC work on QF_UF
* [x] public `add` function should push_pending then call main CC loop
(internal one is called within the loop so no need)
* [x] internalize terms on the fly (backtrackable) OR: at the beginning, backtrack to 0 to add a new term and then add trail again
* ~~[ ] basic notion of activity (of subterms) for `ite` (and recfuns, later)?~~
- use `stmlib-utils`
- parser for cstors (`declare-cstor`, same as `declare-fun`)
- refactor:
1. have sidekick.env with only the signature that is instance-dependent + notion of theory (as module sig);
2. solver takes this env and SAT solver with compatible lits
3. move concrete terms, etc. into smtlib lib
4. update theories
- fix: have normal theories `on_new_term` work properly (callback in the CC?)
- proper statistics
- make the core library entirely functorized
* current term/cst/lit/… would become an implementation library to
be used in the binary
- theory for `ite`
- theory for constructors
- do a `th-distinct` package
* in CC, use payload to store set of distinct tags (if non empty)
* if `¬distinct(t1…tn)` asserted, make big disjunction of cases (split on demand)
- in CC, store literal associated with term, for bool propagation
- in CC, propagation of boolean terms
- in CC, merge classes eagerly
- when CC merging two classes, if one is a value then pick it as representative
(i.e. values are always representatives)
- large refactoring of CC
- simplify CC by separating cleanly public/internal API, remove excess batching
- notion of `value` (V_true,V_false,V_custom, + maybe undefined some day)
- checking models (for each clause, at least one true lit also evals to true in model)
* have a `function` section in the model with `(id*value list) -> value`,
extracted from congruence closure
- simplify terms a lot, remove solving/semantic terms
- remove `detach_clause`. never detach a clause until it's really dead;
instead, do (and re-do) `internalize_clause` for its atoms.
- remove push/pop
- merge Solver_types.ml with Internal.ml
- theory of boolean constructs (on the fly Tseitin using local clauses)
* test `Sat_solver.add_clause` and `new_atom` behavior upon backtracking
→ when we backtrack an atom, its Tseitin encoding should be removed
* NOTE: clauses should never be added twice
(or else: what if a clause is added, removed (but not filtered from watched lits) then re-added? duplicate?)
- backtrackable addition of clauses/atoms in Sat solver
- remove tseitin lib
- typing and translation Ast -> Term (from mc2?)
- main executable for SMT solver
- base types (Term, Lit, …)
- theory combination
- basic design of theories