diff --git a/.gitignore b/.gitignore index cfb63410..ee79f402 100644 --- a/.gitignore +++ b/.gitignore @@ -7,4 +7,6 @@ _build/ .session *.docdir src/util/log.ml +doc/index.html + msat diff --git a/VERSION b/VERSION new file mode 100644 index 00000000..4b9fcbec --- /dev/null +++ b/VERSION @@ -0,0 +1 @@ +0.5.1 diff --git a/docs/articles/mcsat-vmcai2013.pdf b/articles/mcsat-vmcai2013.pdf similarity index 100% rename from docs/articles/mcsat-vmcai2013.pdf rename to articles/mcsat-vmcai2013.pdf diff --git a/docs/articles/mcsat_design.pdf b/articles/mcsat_design.pdf similarity index 100% rename from docs/articles/mcsat_design.pdf rename to articles/mcsat_design.pdf diff --git a/docs/articles/minisat.pdf b/articles/minisat.pdf similarity index 100% rename from docs/articles/minisat.pdf rename to articles/minisat.pdf diff --git a/doc/deploy b/doc/deploy new file mode 100755 index 00000000..7fa4a604 --- /dev/null +++ b/doc/deploy @@ -0,0 +1,28 @@ +#!/bin/bash -e + +# Enssure we are on master branch +git checkout master + +# Generate documentation +make doc +(cd doc && asciidoc index.txt) + +# Checkout gh-pages +git checkout gh-pages +git pull + +# Copy doc to the right locations +cp doc/index.html ./ +cp msat.docdir/* ./dev/ + +# Add potentially new pages +git add ./dev/* +git add ./index.html + +# Commit it all & push +git commit -m 'Doc update' +git push + +# Get back to master branch +git checkout master + diff --git a/doc/index.txt b/doc/index.txt new file mode 100644 index 00000000..1fce0d1a --- /dev/null +++ b/doc/index.txt @@ -0,0 +1,8 @@ + +Dolmen +====== +Guillaume Bury + +* link:dev[] +* link:0.5[] + diff --git a/doc/release b/doc/release new file mode 100755 index 00000000..465f672a --- /dev/null +++ b/doc/release @@ -0,0 +1,34 @@ +#!/bin/bash -e + +# Get current verison number +version=`cat VERSION` + +# Enssure we are on master branch +git checkout master + +# Generate documentation +(cd src && make doc) +(cd doc && asciidoc index.txt) + +# Checkout gh-pages +git checkout gh-pages +git pull + +# Create doc folder if it does not exists +mkdir -p ./$version + +# Copy doc to the right locations +cp doc/index.html ./ +cp src/dolmen.docdir/* ./$version/ + +# Add potentially new pages +git add ./$version/* +git add ./index.html + +# Commit it all & push +git commit -m "Doc release $version" +git push + +# Get back to master branch +git checkout master + diff --git a/docs/biblio.bib b/docs/biblio.bib deleted file mode 100644 index 51f613a2..00000000 --- a/docs/biblio.bib +++ /dev/null @@ -1,13 +0,0 @@ -# $Id: fc4f8c6e8645238b51d6c5bc228b9c166291aa46 $ - -@inproceedings{FMCAD13, - author = "Jovanovic, Devan and Barrett, Clark and de Moura, Leonardo", - title = "The Design and Implementation of the Model Constructing Satisfiability Calculus", - year = 2013 -} - -@inproceedings{VMCAI13, - author = "Jovanovic, Devan and de Moura, Leonardo", - title = "A Model-Constructing Satisfiability Calculus", - year = 2013 -} diff --git a/docs/macros.tex b/docs/macros.tex deleted file mode 100644 index a21ad2fb..00000000 --- a/docs/macros.tex +++ /dev/null @@ -1,17 +0,0 @@ - -\def\ocaml{\textsf{OCaml}} - - -\def\sat{\textsf{Sat}} -\def\smt{\textsf{SMT}} -\def\mcsat{\textsf{McSat}} - -\def\dpll{\textsf{DPLL}} -\def\cdcl{\textsf{CDCL}} - -\def\msat{\textsf{mSAT}} - -\newcommand{\irule}[1]{{\small\textsc{#1}}} - -\EnableBpAbbreviations{} -\newcommand{\LLc}[1]{\LL{\irule{#1}}} diff --git a/docs/msat.pdf b/docs/msat.pdf deleted file mode 100644 index 99075567..00000000 Binary files a/docs/msat.pdf and /dev/null differ diff --git a/docs/msat.tex b/docs/msat.tex deleted file mode 100644 index 0043967c..00000000 --- a/docs/msat.tex +++ /dev/null @@ -1,610 +0,0 @@ - -\documentclass{article} - -\usepackage[T1]{fontenc} -\usepackage{makeidx} -\usepackage{bussproofs} -\usepackage{amsmath,amssymb} -\usepackage{tabularx} -\usepackage{pgf, tikz} -% \usepackage[margin=4cm]{geometry} - -\usepackage{float} -\floatstyle{boxed} -\restylefloat{figure} - -\usepackage{listings} -\lstset{language=[Objective]caml} - -\input{macros} - -\usetikzlibrary{arrows, automata} - -\begin{document} - -\title{\msat{}: a \sat{}/\smt{}/\mcsat{} library} -\author{Guillaume~Bury} - -\maketitle - -\section{Introduction} - -The goal of the \msat{} library is to provide a way to easily -create automated theorem provers based on a \sat{} solver. More precisely, -the library, written in \ocaml{}, provides functors which, once instantiated, -provide a \sat{}, \smt{} or \mcsat{} solver. - -Given the current state of the art of \smt{} solvers, where most \sat{} solvers -are written in C and heavily optimised\footnote{Some solvers now have instructions -to manage a processor's cache}, the \msat{} library does not aim to provide solvers -competitive with the existing implementations, but rather an easy way to create -reasonably efficient solvers. - -\msat{} currently uses the following techniques: -\begin{itemize} - \item 2-watch literals scheme - \item Activity based decisions - \item Restarts -\end{itemize} - -Additionally, \msat{} has the following features: -\begin{itemize} - \item Local assumptions - \item Proof/Model output - \item Adding clauses during proof search -\end{itemize} - -\clearpage - -\tableofcontents{} - -\clearpage - -\section{\sat{} Solvers: principles and formalization}\label{sec:sat} - -\subsection{Idea} - -\subsection{Inference rules}\label{sec:trail} - -The SAT algorithm can be formalized as follows. During the search, the solver keeps -a set of clauses, containing the problem hypotheses and the learnt clauses, and -a trail, which is the current ordered list of assumptions and/or decisions made by -the solver. - -Each element in the trail (decision or propagation) has a level, which is the number of decision -appearing in the trail up to (and including) it. So for instance, propagations made before any -decisions have level $0$, and the first decision has level $1$. Propagations are written -$a \leadsto_C \top$, with $C$ the clause that caused the propagation, and decisions -$a \mapsto_n \top$, with $n$ the level of the decision. Trails are read -chronologically from left to right. - -In the following, given a trail $t$ and an atomic formula $a$, we will use the following notation: -$a \in t$ if $a \mapsto_n \top$ or $a \leadsto_C \top$ is in $t$, i.e.~$a \in t$ is $a$ is true -in the trail $t$. In this context, the negation $\neg$ is supposed to be involutive -(i.e.~$\neg \neg a = a$), so that, if $a \in t$ then $\neg \neg a = a \in t$. - -There exists two main \sat{} algorithms: \dpll{} and \cdcl{}. In both, there exists -two states: first, the starting state $\text{Solve}$, where propagations -and decisions are made, until a conflict is detected, at which point the algorithm -enters in the $\text{Analyse}$ state, where it analyzes the conflict, backtracks, -and re-enter the $\text{Solve}$ state. The difference between \dpll{} and \cdcl{} -is the treatment of conflicts during the $\text{Analyze}$ phase: \dpll{} will use -the conflict only to known where to backtrack/backjump, while in \cdcl{} the result -of the conflict analysis will be added to the set of hypotheses, so that the same -conflict does not occur again. -The $\text{Solve}$ state take as argument the set of hypotheses and the trail, while -the $\text{Analyze}$ state also take as argument the current conflict clause. - -We can now formalize the \cdcl{} algorithm using the inference rules -in Figure~\ref{fig:smt_rules}. In order to completely recover the \sat{} algorithm, -one must apply the rules with the following precedence and termination conditions, -depending on the current state: -\begin{itemize} - \item If the empty clause is in $\mathbb{S}$, then the problem is unsat. - If there is no more rule to apply, the problem is sat. - \item If we are in $\text{Solve}$ mode: - \begin{enumerate} - \item First is the rule \irule{Conflict}; - \item Then the try and use \irule{Propagate}; - \item Finally, is there is nothing left to be propagated, the \irule{Decide} - rule is used. - \end{enumerate} - \item If we are in $\text{Analyze}$ mode, we have a choice concerning - the order of application. First we can observe that the rules - \irule{Analyze-Propagate}, \irule{Analyze-Decision} and \irule{Analyze-Resolution} - can not apply simultaneously, and we will thus group them in a super-rule - \irule{Analyze}. We now have the choice of when to apply the \irule{Backjump} rule - compared to the \irule{Analyze} rule: - using \irule{Backjump} eagerly will result in the first UIP strategy, while delaying it - until later will yield other strategies, both of which are valid. -\end{itemize} - -\begin{figure} - \center{\underline{\sat{}}} - \begin{center} - \begin{tabular}{c@{\hspace{1cm}}l} - % Propagation (boolean) - \AXC{$\text{Solve}(\mathbb{S}, t)$} - \LLc{Propagate} - \UIC{$\text{Sove}(\mathbb{S}, t :: a \leadsto_C \top)$} - \DP{} & - \begin{tabular}{l} - $a \in C, C \in \mathbb{S}, \neg a \notin t$ \\ - $\forall b \in C. b \neq a \rightarrow \neg b \in t$ \\ - \end{tabular} - \\ \\ - % Decide (boolean) - \AXC{$\text{Solve}(\mathbb{S}, t)$} - \LLc{Decide} - \UIC{$\text{Solve}(\mathbb{S}, t :: a \mapsto_n \top)$} - \DP{} & - \begin{tabular}{l} - $a \notin t, \neg a \notin t, a \in \mathbb{S}$ \\ - $n = \text{max\_level}(t) + 1$ \\ - \end{tabular} - \\ \\ - % Conflict (boolean) - \AXC{$\text{Solve}(\mathbb{S}, t)$} - \LLc{Conflict} - \UIC{$\text{Analyze}(\mathbb{S}, t, C)$} - \DP{} & - $C \in \mathbb{S}, \forall a \in C. \neg a \in t$ - \\ \\ - \end{tabular} - \end{center} - \begin{center} - \begin{tabular}{c@{\hspace{.5cm}}l} - % Analyze (propagation) - \AXC{$\text{Analyze}(\mathbb{S}, t :: a \leadsto_C \top, D)$} - \LLc{Analyze-propagation} - \UIC{$\text{Analyze}(\mathbb{S}, t, D)$} - \DP{} & - $\neg a \notin D$ - \\ \\ - % Analyze (decision) - \AXC{$\text{Analyze}(\mathbb{S}, t :: a \mapsto_n \top, D)$} - \LLc{Analyze-decision} - \UIC{$\text{Analyze}(\mathbb{S}, t, D)$} - \DP{} & - $\neg a \notin D$ - \\ \\ - % Resolution - \AXC{$\text{Analyze}(\mathbb{S}, t :: a \leadsto_C \top, D)$} - \LLc{Analyze-Resolution} - \UIC{$\text{Analyze}(\mathbb{S}, t, (C - \{a\}) \cup (D - \{ \neg a\}))$} - \DP{} & - $\neg a \in D$ - \\ \\ - \end{tabular} - \end{center} - \begin{center} - \begin{tabular}{c@{\hspace{1cm}}l} - % BackJump - \AXC{$\text{Analyze}(\mathbb{S}, t :: a \mapsto_d \top :: t', C)$} - \LLc{Backjump} - \UIC{$\text{Solve}(\mathbb{S} \cup \{ C \}, t)$} - \DP{} & - \begin{tabular}{l} - $\text{is\_uip}(C)$ \\ - $d \leq \text{uip\_level}(C)$ \\ - \end{tabular} - \\ \\ - \end{tabular} - \end{center} - - \center{\underline{\smt{}}} - \begin{center} - \begin{tabular}{c@{\hspace{1cm}}l} - % Conflict (theory) - \AXC{$\text{Solve}(\mathbb{S}, t)$} - \LLc{Conflict-Smt} - \UIC{$\text{Analyze}(\mathbb{S}, t, C)$} - \DP{} & - \begin{tabular}{l} - $\mathcal{T} \vdash C$ \\ - $\forall a \in C. \neg a \in t$ \\ - \end{tabular} - \\ \\ - \end{tabular} - \end{center} - \caption{Inference rules for \sat{} and \smt{}}\label{fig:smt_rules} -\end{figure} - -\begin{figure} - \center{\underline{\mcsat{}}} - \begin{center} - \begin{tabular}{c@{\hspace{.2cm}}l} - % Decide (assignment) - \AXC{$\text{Solve}(\mathbb{S}, t)$} - \LLc{Theory-Decide} - \UIC{$\text{Solve}(\mathbb{S}, t :: a \mapsto_n v)$} - \DP{} & - \begin{tabular}{l} - $a \mapsto_k \_ \notin t$ \\ - $n = \text{max\_level}(t) + 1$ \\ - $\sigma_{t :: a \mapsto_n v} \text{ compatible with } t$ - \end{tabular} - \\ \\ - % Propagation (semantic) - \AXC{$\text{Solve}(\mathbb{S}, t)$} - \LLc{Propagate-Theory} - \UIC{$\text{Solve}(\mathbb{S}, t :: a \leadsto_n \top)$} - \DP{} & - $\sigma_t(a) = \top$ - \\ \\ - % Conflict (semantic) - \AXC{$\text{Solve}(\mathbb{S}, t)$} - \LLc{Conflict-Mcsat} - \UIC{$\text{Analyze}(\mathbb{S}, t, C)$} - \DP{} & - \begin{tabular}{l} - $\mathcal{T} \vdash C$ \\ - $\forall a \in C, \neg a \in t \lor \sigma_t(a) = \bot$ - \end{tabular} - \\ \\ - \end{tabular} - \end{center} - \begin{center} - \begin{tabular}{c@{\hspace{.2cm}}l} - % Analyze (assign) - \AXC{$\text{Analyze}(\mathbb{S}, t :: a \mapsto_n v, C)$} - \LLc{Analyze-Assign} - \UIC{$\text{Analyze}(\mathbb{S}, t, C)$} - \DP{} & - \\ \\ - % Analyze (semantic) - \AXC{$\text{Analyze}(\mathbb{S}, t :: a \leadsto_n \top, C)$} - \LLc{Analyze-Semantic} - \UIC{$\text{Analyze}(\mathbb{S}, t, C)$} - \DP{} & - \\ \\ - \end{tabular} - \end{center} - \begin{center} - \begin{tabular}{c@{\hspace{.2cm}}l} - % Backjump (semantic) - \AXC{$\text{Analyze}(\mathbb{S}, t :: a \mapsto_n \_ :: \_, C)$} - \LLc{Backjump-Semantic} - \UIC{$\text{Solve}(\mathbb{S}, t)$} - \DP{} & - \begin{tabular}{l} - $\text{is\_semantic}(C)$ \\ - $n = \text{slevel}(C)$ \\ - \end{tabular} - \\ \\ - \end{tabular} - \end{center} - \caption{\mcsat{} specific inference rules}\label{fig:mcsat_rules} -\end{figure} - -\subsection{Invariants, correctness and completeness} - -The following invariants are maintained by the inference rules in Figure~\ref{fig:smt_rules}: -\begin{description} - \item[Trail Soundness] In $\text{Solve}(\mathbb{S}, t)$, if $a \in t$ then $\neg a \notin t$ - \item[Conflict Analysis] In $\text{Analyze}(\mathbb{S}, t, C)$, $C$ is a clause implied by the - clauses in $\mathbb{S}$, and $\forall a \in C. \neg a \in t$ (i.e.~$C$ is entailed by the - hypotheses, yet false in the partial model formed by the trail $t$). - \item[Equivalence] In any rule \AXC{$s_1$}\UIC{$s_2$}\DP{}, the set of hypotheses - (usually written $\mathbb{S}$) in $s_1$ is equivalent to that of $s_2$. -\end{description} - -These invariants are relatively easy to prove, and provide an easy proof of correctness for -the \cdcl{} algorithm. Termination can be proved by observing that the same trail appears -at most twice during proof search (once during propagation, and eventually a second time -right after backjumping\footnote{This could be avoided by making the \irule{Backjump} rule -directly propagate the relevant literal of the conflict clause, but it needlessly -complicates the rule.}). Correctness and termination implies completeness of the \sat{} -algorithm. - - -\section{\smt{} solver architecture}\label{sec:smt} - -\subsection{Idea}\label{sec:smt_flow} - -In a \smt{} solver, after each propagation and decision, the solver sends the newly -assigned literals to the theory. The theory then has the possibility to declare the -current set of literals incoherent, and give the solver a tautology in which all -literals are currently assigned to $\bot$\footnote{or rather for each literal, its negation -is assigned to $\top$}, thus prompting the solver to backtrack. -We can represent a simplified version of the information flow (not taking into -account backtracking) of usual \smt{} Solvers, using the graph in fig~\ref{fig:smt_flow}. - -Contrary to what the Figure~\ref{fig:smt_flow} could suggest, it is not impossible -for the theory to propagate information back to the \sat{} solver. Indeed, -some \smt{} solvers already allow the theory to propagate entailed literals back to the -\sat{} solver. However, these propagations are in practice limited by the complexity -of deciding entailment. Moreover, procedures in a \smt{} solver should be incremental -in order to get decent performances, and deciding entailment in an incremental manner -is not easy (TODO~: ref nécessaire). Doing efficient, incremental entailment is exactly -what \mcsat{} allows (see Section~\ref{sec:mcsat}). - -\subsection{Formalization and Theory requirements} - -An \smt{} solver is the combination of a \sat{} solver, and a theory $\mathcal{T}$. -The role of the theory $\mathcal{T}$ is to stop the proof search as soon as the trail -of the \sat{} solver is inconsistent. A trail is inconsistent iff there exists a clause -$C$, which is a tautology of $\mathcal{T}$ (thus $\mathcal{T} \vdash C$), but is not -satisfied in the current trail (each of its literals has either been decided or -propagated to false). Thus, we can add the \irule{Conflict-Smt} rule (see -Figure~\ref{fig:smt_rules}) to the \cdcl{} inference rules in order to get a \smt{} solver. -We give the \irule{Conflict-Smt} rule a slightly lower precedence than the -\irule{Conflict} rule for performance reason (detecting boolean conflict is -faster than theory specific conflicts). - -So, what is the minimum that a theory must implement in practice to be used in a -\smt{} solver~? The theory has to ensure that the current trail is consistent -(when seen as a conjunction of literals), that is to say, given a trail $t$, -it must ensure that there exists a model $\mathcal{M}$ of $\mathcal{T} $so that -$\forall a \in t. \mathcal{M} \vDash a$, or if it is impossible (i.e.~the trail -is inconsistent) produce a conflict. - -\begin{figure} - \begin{center} - \begin{tikzpicture}[ - ->, % Arrow style - > = stealth, % arrow head style - shorten > = 1pt, % don't touch arrow head to node - node distance = 2cm, % distance between nodes - semithick, % line style - auto - ] - - \tikzstyle{state}=[rectangle,draw=black!75] - - \node (sat) {SAT Core}; - \node (th) [right of=sat, node distance=6cm] {Theory}; - \node[state] (d) [below of=sat, node distance=1cm] {Decision (boolean)}; - \node[state] (bp) [below of=d, node distance=2cm] {Boolean propagation}; - \node[state] (tp) [right of=bp, node distance=6cm] {Theory propagation}; - - \draw (d) edge [bend left=30] (bp); - \draw (bp) edge [bend left=30] (d); - \draw (bp) edge (tp); - - \draw[black!50] (-2,1) rectangle (2,-4); - \draw[black!50] (4,1) rectangle (8,-4); - - \end{tikzpicture} - \end{center} - \caption{Simplified \smt{} Solver architecture}\label{fig:smt_flow} -\end{figure} - -\section{\mcsat{}: An extension of \smt{} Solvers}\label{sec:mcsat} - -\subsection{Motivation and idea} - -\mcsat{} is an extension of usual \smt{} solvers, introduced in~\cite{VMCAI13} and~\cite{FMCAD13}. -In usual \smt{} Solvers, interaction between the core SAT Solver and the Theory is pretty limited~: -the SAT Solver make boolean decisions and propagations, and sends them to the theory, -whose role is in return to stop the SAT Solver as soon as the current set of assumptions -is incoherent. This means that the information that theories can give the SAT Solver is -pretty limited, and furthermore it heavily restricts the ability of theories to guide -the proof search (see Section~\ref{sec:smt_flow}). - -While it appears to leave a reasonably simple job to the theory, since it completely -hides the propositional structure of the problem, this simple interaction between the -SAT Solver and the theory makes it harder to combine multiple theories into one. Usual -techniques for combining theories in \smt{} solvers typically require to keep track of -equivalence classes (with respect to the congruence closure) and for instance -in the Nelson-Oppen method for combining theories (TODO~: ref nécessaire) require of -theories to propagate any equality implied by the current assertions. - -\mcsat{} extends the SAT paradigm by allowing more exchange of information between the theory -and the SAT Solver. This is achieved by allowing the solver to not only decide on the truth value -of atomic propositions, but also to decide assignments for terms that appear in the problem. -For instance, if the SAT Solver assigns a variable $x$ to $0$, -an arithmetic theory could propagate to the SAT Solver that the formula $x < 1$ must also hold, -instead of waiting for the SAT Solver to guess the truth value of $x < 1$ and then -inform the SAT Solver that the conjunction~: $x = 0 \land \neg x < 1$ is incoherent. - -This exchange of information between the SAT Solver and the theories results in -the construction of a model throughout the proof search (which explains the name -Model Constructing SAT). - -The main addition of \mcsat{} is that when the solver makes a decision, instead of -being restricted to making boolean assignment of formulas, the solver now can -decide to assign a value to a term belonging to one of the literals. In order to do so, -the solver first chooses a term that has not yet been assigned, and then asks -the theory for a possible assignment. Like in usual SMT Solvers, a \mcsat{} solver -only exchange information with one theory, but, as we will see, combination -of theories into one becomes easier in this framework, because assignments are -actually a very good way to exchange information. - -Using the assignments on terms, the theory can then very easily do efficient -propagation of formulas implied by the current assignments: it is enough to -evaluate formulas using the current partial assignment. -The information flow then looks like fig~\ref{fig:mcsat_flow}. - -\begin{figure} - \begin{center} - \begin{tikzpicture}[ - ->, % Arrow style - > = stealth, % arrow head style - shorten > = 1pt, % don't touch arrow head to node - node distance = 2cm, % distance between nodes - semithick, % line style - auto - ] - - \tikzstyle{state}=[rectangle,draw=black!75] - - \node (sat) {SAT Core}; - \node (th) [right of=sat, node distance=6cm] {Theory}; - \node[state] (d) [below of=sat, node distance=1cm] {Decision}; - \node[state] (ass) [right of=d, node distance=6cm] {Assignment}; - \node[state] (bp) [below of=d, node distance=2cm] {Boolean propagation}; - \node[state] (tp) [right of=bp, node distance=6cm] {Theory propagation}; - - \draw (bp)[right] edge [bend left=5] (tp); - \draw (tp) edge [bend left=5] (bp); - \draw (bp) edge [bend left=30] (d); - \draw (ass) edge [bend left=5] (d); - \draw (d) edge [bend left=5] (ass); - \draw (d) edge [bend left=30] (bp); - \draw[<->] (ass) edge (tp); - - \draw[black!50] (-2,1) rectangle (2,-4); - \draw[black!50] (4,1) rectangle (8,-4); - - \end{tikzpicture} - \end{center} - \caption{Simplified \mcsat{} Solver architecture}\label{fig:mcsat_flow} -\end{figure} - -\subsection{Decisions and propagations} - -In \mcsat{}, semantic propagations are a bit different from the propagations -used in traditional \smt{} Solvers. In the case of \mcsat{} (or at least the version presented here), -semantic propagations strictly correspond to the evaluation of formulas in the -current assignment. Moreover, in order to be able to correctly handle these semantic -propagations during backtrack, they are assigned a level: each decision is given a level -(using the same process as in a \sat{} solvers: a decision level is the number of decision -previous to it, plus one), and a formula is propagated at the maximum level of the decisions -used to evaluate it. - -We can thus extend the notations introduced in Section~\ref{sec:trail}: $t \mapsto_n v$ is -a semantic decision which assign $t$ to a concrete value $v$, $a \leadsto_n \top$ is a -semantic propagation at level $n$. - -For instance, if the current trail is $\{x \mapsto_1 0, x + y + z = 0 \mapsto_2 \top, y\mapsto_3 0\}$, -then $x + y = 0$ can be propagated at level $3$, but $z = 0$ can not be propagated, because -$z$ is not assigned yet, even if there is only one remaining valid value for $z$. -The problem with assignments as propagations is that it is not clear what to do with -them during the $\text{Analyze}$ phase of the solver, see later. - -\subsection{First order terms \& Models} - -A model traditionally is a triplet which comprises a domain, a signature and an interpretation -function. Since most problems define their signature using type definitions at the beginning, and -built-in theories such as arithmetic usually have canonic domain and signature, -we will consider in the following that the domain $\mathbb{D}$ and signature are given (and constant). -Additionally, we consider a fixed interpretation of the theory-defined symbols (which -usually does along with the domain). - -In the following, we use the following notations: -\begin{itemize} - \item $\mathbb{V}$ is an infinite set of variables; - \item $\mathbb{C}$ is the possibly infinite set of constants defined - by the input problem's theories\footnote{For instance, the theory of arithmetic - defines the usual operators $+, -, *, /$ as well as $0, -5, \frac{1}{2}, -2.3, \ldots$}; - \item $\mathbb{S}$ is the finite set of constants defined by the input problem's type - definitions; - \item $\mathbb{T}$ is the (infinite) set of first-order terms over $\mathbb{V}$, $\mathbb{C}$ - and $\mathbb{S}$ (for instance $a, f(0), x + y, \ldots$); - \item $\mathbb{F}$ is the (infinite) set of first order quantified formulas - over the terms in $\mathbb{T}$. -\end{itemize} - -We are therefore interested in finding an interpretation of a problem, and more -specifically an interpretation of the symbols in $\mathbb{S}$ not defined by -the theory, i.e.~non-interpreted functions. An interpretation -$\mathcal{I}$ can easily be extended to a function from ground terms and formulas -to model value by recursively applying it: -\[ - \mathcal{I}( f ( e_1, \ldots, e_n ) ) = - \mathcal{I}_f ( \mathcal{I}(e_1), \ldots, \mathcal{I}(e_n) ) -\] - -\paragraph{Partial Interpretation} -The goal of \mcsat{} is to construct a first-order model of the input formulas. To do so, -it has to build what we will call partial interpretations: intuitively, a partial -interpretation is a partial function from the constants symbols to the model values. -It is, however, not so simple: during proof search, the \mcsat{} algorithm maintains -a partial mapping from expressions to model values (and not from constant symbols -to model values). The intention of this mapping is to represent a set of constraints -on the partial interpretation that the algorithm is building. -For instance, given a function symbol $f$ of type $(\text{int} \rightarrow \text{int})$ and an -integer constant $a$, one such constraint that we would like to be able to express in -our mapping is the following: $f(a) \mapsto 0$, regardless of the values that $f$ takes on -other argument, and also regardless of the value mapped to $a$. To that end we introduce -the notion of abstract partial interpretation. - -An abstract partial interpretation $\sigma$ is a mapping from ground expressions to model values. -To each abstract partial interpretation correspond a set of complete models that realize it. -More precisely, any mapping $\sigma$ can be completed in various ways, leading to a set of -potential interpretations: -\[ - \text{Complete}(\sigma) = - \left\{ - \mathcal{I} - \; | \; - \forall t \mapsto v \in \sigma , \mathcal{I} ( t ) = v - \right\} -\] - -\paragraph{Coherence} -An abstract partial interpretation $\sigma$ is said to be coherent iff -there exists at least one model that completes it, -i.e.~$\text{Complete}(\sigma) \neq \emptyset$. One example -of incoherent abstract partial interpretation is the following -mapping: -\[ - \sigma = \left\{ - \begin{matrix} - a \mapsto 0 \\ - b \mapsto 0 \\ - f(a) \mapsto 0 \\ - f(b) \mapsto 1 \\ - \end{matrix} - \right. -\] - -\paragraph{Compatibility} -In order to do semantic propagations, we want to have some kind of notion -of evaluation for abstract partial interpretations, and we thus define the -partial interpretation function in the following way: -\[ - \forall t \in \mathbb{T} \cup \mathbb{F}. \forall v \in \mathbb{D}. - \left( - \forall \mathcal{I} \in \text{Complete}(\sigma). - \mathcal{I}(t) = v - \right) \rightarrow - \sigma(t) = v -\] - -The partial interpretation function is the intersection of the interpretation -functions of all the completions of $\sigma$, i.e.~it is the interpretation -where all completions agree. We can now say that a mapping $\sigma$ is compatible -with a trail $t$ iff $\sigma$ is coherent, and -$\forall a \in t. \not (\sigma(a) = \bot)$, or in other words, for every literal $a$ -true in the trail, there exists at least one model that completes $\sigma$ and where -$a$ is satisfied. - -\paragraph{Completeness} -We need one last property on abstract partial interpretations, which is to -specify the relation between the terms in a mapping, and the terms it can evaluate, -according to its partial interpretation function defined above. Indeed, at the end -of proof search, we want all terms (and sub-terms) of the initial problem to be -evaluated using the final mapping returned by the algorithm when it finds that the -problem is satisfiable: that is what we will call completeness of a mapping. -To that end we will here enumerate some sufficient conditions on the mapping domain -$\text{Dom}(\sigma)$ compared to the finite set of terms (and sub-terms) $T$ that -appear in the problem. - -A first way, is to have $\text{Dom}(\sigma) = T$, i.e.~all terms (and sub-terms) of the -initial problem are present in the mapping. While this is the simplest way to ensure that -the mapping is complete, it might be a bit heavy: for instance, we might have to assign -both $x$ and $2x$, which is redundant. The problem in that case is that we try and assign -a term for which we could actually compute the value from its arguments: indeed, -since the multiplication is interpreted, we do not need to interpret it in our mapping. -This leads to another way to have a complete mapping: if $\text{Dom}(\sigma)$ is the -set of terms of $T$ whose head symbol is uninterpreted (i.e.~not defined by the theory), -it is enough to ensure that the mapping is complete, because the theory will automatically -constrain the value of terms whose head symbol is interpreted. - -\subsection{Inference rules} - -In \mcsat{}, a trail $t$ contains boolean decision and propagations as well as semantic -decisions (assignments) and propagations. We can thus define $\sigma_t$ as the mapping -formed by all assignments in $t$. This lets us define the last rules in Figure~\ref{fig:mcsat_rules}, -that, together with the other rules, define the \mcsat{} algorithm. - - -\bibliographystyle{plain} -\bibliography{biblio} - -\clearpage -\appendix - -\end{document} diff --git a/myocamlbuild.ml b/myocamlbuild.ml new file mode 100644 index 00000000..818b0017 --- /dev/null +++ b/myocamlbuild.ml @@ -0,0 +1,16 @@ + +(* This file is free software, part of mSAT. See file "LICENSE" for more information. *) + +open Ocamlbuild_plugin;; + +let doc_intro = "src/doc.txt";; + +dispatch begin function + | After_rules -> + (* Documentation index *) + dep ["ocaml"; "doc"; "extension:html"] & [doc_intro] ; + flag ["ocaml"; "doc"; "extension:html"] + & S [ A "-t"; A "mSAT doc"; A "-intro"; P doc_intro ]; + | _ -> () +end + diff --git a/src/doc.txt b/src/doc.txt new file mode 100644 index 00000000..aaea57e3 --- /dev/null +++ b/src/doc.txt @@ -0,0 +1,83 @@ +{1 Dolmen} + +{2 License} + +This code is free, under the Apache 2.0 license. + +{2 Contents} + +mSAT is an ocaml library provinding SAT/SMT/McSat solvers. More precisely, +what mSAT provides are functors to easily create such solvers. Indeed, the core +of a sat solver does not need much information about neither the exact representation +of terms nor the innner workings of a theory. + +{4 Solver creation} + +The following modules allow to easily create a SAT or SMT solver (remark: a SAT solver is +simply an SMT solver with an empty theory). + +{!modules: +Solver +Formula_intf +Theory_intf +} + +The following modules allow the creation of a McSat solver (Model Constructing solver): + +{!ṃodules: +Mcsolver +Expr_intf +Plugin_intf +} + +Lastly, mSAT also provides an implementation of Tseitin's CNF conversion: + +{!modules +Tseitin +Tseitin_intf +} + +{4 Proof Management} + +mSAT solvers are able to provide detailed proofs when an unsat state is reached. To do +so, it require the provided theory to give proofs of the tautologies it gives the solver. +These proofs will be called lemmas. The type of lemmas is defined by the theory and can +very well be [unit]. + +In this context a proof is a resolution tree, whose conclusion (i.e. root) is the +empty clause, effectively allowing to deduce [false] from the hypotheses. +A resolution tree is a binary tree whose nodes are clauses. Inner nodes' clauses are +obtained by performing resolution between the two clauses of the children nodes, while +leafs of the tree are either hypotheses, or tautologies (i.e. conflicts returned by +the theory). + +{!modules: +Res +Res_intf +} + +Backends for exporting proofs to different formats: + +{!modules: +Dot +Dedukti +Backend_intf +} + +{4 Internal modules} + +WARNING: for advanced users only ! These modules expose a lot of unsafe functions +that must be used with care to not break the required invariants. Additionally, these +interfaces are not part of the main API and so are subject to a lot more breaking changes +than the safe modules above. + +{!modules: +Internal +External +Solver_types +Solver_types_intf +} + +{2 Index} + +{!indexlist}