<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml"><head><title>Genlex (ocaml.Stdlib.Genlex)</title><link rel="stylesheet" href="../../../_odoc-theme/odoc.css"/><meta charset="utf-8"/><meta name="generator" content="odoc 2.2.1"/><meta name="viewport" content="width=device-width,initial-scale=1.0"/><script src="../../../highlight.pack.js"></script><script>hljs.initHighlightingOnLoad();</script></head><body class="odoc"><nav class="odoc-nav"><a href="../index.html">Up</a> <a href="../../index.html">ocaml</a> &#x00BB; <a href="../index.html">Stdlib</a> &#x00BB; Genlex</nav><header class="odoc-preamble"><h1>Module <code><span>Stdlib.Genlex</span></code></h1><ul class="at-tags"><li class="deprecated"><span class="at-tag">deprecated</span> Use the camlp-streams library instead.</li></ul><p>A generic lexical analyzer.</p><p>This module implements a simple 'standard' lexical analyzer, presented as a function from character streams to token streams. It implements roughly the lexical conventions of OCaml, but is parameterized by the set of keywords of your language.</p><p>Example: a lexer suitable for a desk calculator is obtained by</p><pre class="language-ocaml"><code>let lexer = make_lexer [&quot;+&quot;; &quot;-&quot;; &quot;*&quot;; &quot;/&quot;; &quot;let&quot;; &quot;=&quot;; &quot;(&quot;; &quot;)&quot;]</code></pre><p>The associated parser would be a function from <code>token stream</code> to, for instance, <code>int</code>, and would have rules such as:</p><pre class="language-ocaml"><code>let rec parse_expr = parser
| [&lt; n1 = parse_atom; n2 = parse_remainder n1 &gt;] -&gt; n2
and parse_atom = parser
| [&lt; 'Int n &gt;] -&gt; n
| [&lt; 'Kwd &quot;(&quot;; n = parse_expr; 'Kwd &quot;)&quot; &gt;] -&gt; n
and parse_remainder n1 = parser
| [&lt; 'Kwd &quot;+&quot;; n2 = parse_expr &gt;] -&gt; n1 + n2
| [&lt; &gt;] -&gt; n1</code></pre><p>Note that the <code>parser</code> keyword and the associated notation for streams are only available through camlp4 extensions. This means that one has to preprocess one's sources, <i>e.g.</i> by using the <code>&quot;-pp&quot;</code> command-line switch of the compilers.</p></header><div class="odoc-content"><div class="odoc-spec"><div class="spec type anchored" id="type-token"><a href="#type-token" class="anchor"></a><code><span><span class="keyword">type</span> token</span><span> = </span></code><ol><li id="type-token.Kwd" class="def variant constructor anchored"><a href="#type-token.Kwd" class="anchor"></a><code><span>| </span><span><span class="constructor">Kwd</span> <span class="keyword">of</span> string</span></code></li><li id="type-token.Ident" class="def variant constructor anchored"><a href="#type-token.Ident" class="anchor"></a><code><span>| </span><span><span class="constructor">Ident</span> <span class="keyword">of</span> string</span></code></li><li id="type-token.Int" class="def variant constructor anchored"><a href="#type-token.Int" class="anchor"></a><code><span>| </span><span><span class="constructor">Int</span> <span class="keyword">of</span> int</span></code></li><li id="type-token.Float" class="def variant constructor anchored"><a href="#type-token.Float" class="anchor"></a><code><span>| </span><span><span class="constructor">Float</span> <span class="keyword">of</span> float</span></code></li><li id="type-token.String" class="def variant constructor anchored"><a href="#type-token.String" class="anchor"></a><code><span>| </span><span><span class="constructor">String</span> <span class="keyword">of</span> string</span></code></li><li id="type-token.Char" class="def variant constructor anchored"><a href="#type-token.Char" class="anchor"></a><code><span>| </span><span><span class="constructor">Char</span> <span class="keyword">of</span> char</span></code></li></ol></div><div 
class="spec-doc"><p>The type of tokens. The lexical classes are: <code>Int</code> and <code>Float</code> for integer and floating-point numbers; <code>String</code> for string literals, enclosed in double quotes; <code>Char</code> for character literals, enclosed in single quotes; <code>Ident</code> for identifiers (either sequences of letters, digits, underscores and quotes, or sequences of 'operator characters' such as <code>+</code>, <code>*</code>, etc.); and <code>Kwd</code> for keywords (either identifiers or single 'special characters' such as <code>(</code>, <code>}</code>, etc.).</p></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-make_lexer"><a href="#val-make_lexer" class="anchor"></a><code><span><span class="keyword">val</span> make_lexer : <span><span>string list</span> <span class="arrow">&#45;&gt;</span></span> <span><span>char <a href="../Stream/index.html#type-t">Stream.t</a></span> <span class="arrow">&#45;&gt;</span></span> <span><a href="#type-token">token</a> <a href="../Stream/index.html#type-t">Stream.t</a></span></span></code></div><div class="spec-doc"><p>Construct the lexer function. The first argument is the list of keywords. An identifier <code>s</code> is returned as <code>Kwd s</code> if <code>s</code> belongs to this list, and as <code>Ident s</code> otherwise. A special character <code>s</code> is returned as <code>Kwd s</code> if <code>s</code> belongs to this list, and causes a lexical error (exception <a href="../Stream/index.html#exception-Error"><code>Stream.Error</code></a> with the offending lexeme as its parameter) otherwise. Blanks and newlines are skipped. Comments delimited by <code>(*</code> and <code>*)</code> are skipped as well, and can be nested. A <a href="../Stream/index.html#exception-Failure"><code>Stream.Failure</code></a> exception is raised if the end of the stream is reached unexpectedly.</p></div></div></div></body></html>