<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml"><head><title>Lexing (ocaml.Stdlib.Lexing)</title><link rel="stylesheet" href="../../../_odoc-theme/odoc.css"/><meta charset="utf-8"/><meta name="generator" content="odoc 2.2.1"/><meta name="viewport" content="width=device-width,initial-scale=1.0"/><script src="../../../highlight.pack.js"></script><script>hljs.initHighlightingOnLoad();</script></head><body class="odoc"><nav class="odoc-nav"><a href="../index.html">Up</a> <a href="../../index.html">ocaml</a> &#x00BB; <a href="../index.html">Stdlib</a> &#x00BB; Lexing</nav><header class="odoc-preamble"><h1>Module <code><span>Stdlib.Lexing</span></code></h1><p>The run-time library for lexers generated by <code>ocamllex</code>.</p></header><nav class="odoc-toc"><ul><li><a href="#positions">Positions</a></li><li><a href="#lexer-buffers">Lexer buffers</a></li><li><a href="#functions-for-lexer-semantic-actions">Functions for lexer semantic actions</a></li><li><a href="#miscellaneous-functions">Miscellaneous functions</a></li></ul></nav><div class="odoc-content"><h2 id="positions"><a href="#positions" class="anchor"></a>Positions</h2><div class="odoc-spec"><div class="spec type anchored" id="type-position"><a href="#type-position" class="anchor"></a><code><span><span class="keyword">type</span> position</span><span> = </span><span>{</span></code><ol><li id="type-position.pos_fname" class="def record field anchored"><a href="#type-position.pos_fname" class="anchor"></a><code><span>pos_fname : string;</span></code></li><li id="type-position.pos_lnum" class="def record field anchored"><a href="#type-position.pos_lnum" class="anchor"></a><code><span>pos_lnum : int;</span></code></li><li id="type-position.pos_bol" class="def record field anchored"><a href="#type-position.pos_bol" class="anchor"></a><code><span>pos_bol : int;</span></code></li><li id="type-position.pos_cnum" class="def record field anchored"><a href="#type-position.pos_cnum" class="anchor"></a><code><span>pos_cnum : 
int;</span></code></li></ol><code><span>}</span></code></div><div class="spec-doc"><p>A value of type <code>position</code> describes a point in a source file. <code>pos_fname</code> is the file name; <code>pos_lnum</code> is the line number; <code>pos_bol</code> is the offset of the beginning of the line (number of characters between the beginning of the lexbuf and the beginning of the line); <code>pos_cnum</code> is the offset of the position (number of characters between the beginning of the lexbuf and the position). The difference between <code>pos_cnum</code> and <code>pos_bol</code> is the character offset within the line (i.e. the column number, assuming each character is one column wide).</p><p>See the documentation of type <code>lexbuf</code> for information about how the lexing engine will manage positions.</p></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-dummy_pos"><a href="#val-dummy_pos" class="anchor"></a><code><span><span class="keyword">val</span> dummy_pos : <a href="#type-position">position</a></span></code></div><div class="spec-doc"><p>A value of type <code>position</code>, guaranteed to be different from any valid position.</p></div></div><h2 id="lexer-buffers"><a href="#lexer-buffers" class="anchor"></a>Lexer buffers</h2><div class="odoc-spec"><div class="spec type anchored" id="type-lexbuf"><a href="#type-lexbuf" class="anchor"></a><code><span><span class="keyword">type</span> lexbuf</span><span> = </span><span>{</span></code><ol><li id="type-lexbuf.refill_buff" class="def record field anchored"><a href="#type-lexbuf.refill_buff" class="anchor"></a><code><span>refill_buff : <span><a href="#type-lexbuf">lexbuf</a> <span class="arrow">&#45;&gt;</span></span> unit;</span></code></li><li id="type-lexbuf.lex_buffer" class="def record field anchored"><a href="#type-lexbuf.lex_buffer" class="anchor"></a><code><span><span class="keyword">mutable</span> lex_buffer : bytes;</span></code></li><li 
id="type-lexbuf.lex_buffer_len" class="def record field anchored"><a href="#type-lexbuf.lex_buffer_len" class="anchor"></a><code><span><span class="keyword">mutable</span> lex_buffer_len : int;</span></code></li><li id="type-lexbuf.lex_abs_pos" class="def record field anchored"><a href="#type-lexbuf.lex_abs_pos" class="anchor"></a><code><span><span class="keyword">mutable</span> lex_abs_pos : int;</span></code></li><li id="type-lexbuf.lex_start_pos" class="def record field anchored"><a href="#type-lexbuf.lex_start_pos" class="anchor"></a><code><span><span class="keyword">mutable</span> lex_start_pos : int;</span></code></li><li id="type-lexbuf.lex_curr_pos" class="def record field anchored"><a href="#type-lexbuf.lex_curr_pos" class="anchor"></a><code><span><span class="keyword">mutable</span> lex_curr_pos : int;</span></code></li><li id="type-lexbuf.lex_last_pos" class="def record field anchored"><a href="#type-lexbuf.lex_last_pos" class="anchor"></a><code><span><span class="keyword">mutable</span> lex_last_pos : int;</span></code></li><li id="type-lexbuf.lex_last_action" class="def record field anchored"><a href="#type-lexbuf.lex_last_action" class="anchor"></a><code><span><span class="keyword">mutable</span> lex_last_action : int;</span></code></li><li id="type-lexbuf.lex_eof_reached" class="def record field anchored"><a href="#type-lexbuf.lex_eof_reached" class="anchor"></a><code><span><span class="keyword">mutable</span> lex_eof_reached : bool;</span></code></li><li id="type-lexbuf.lex_mem" class="def record field anchored"><a href="#type-lexbuf.lex_mem" class="anchor"></a><code><span><span class="keyword">mutable</span> lex_mem : <span>int array</span>;</span></code></li><li id="type-lexbuf.lex_start_p" class="def record field anchored"><a href="#type-lexbuf.lex_start_p" class="anchor"></a><code><span><span class="keyword">mutable</span> lex_start_p : <a href="#type-position">position</a>;</span></code></li><li id="type-lexbuf.lex_curr_p" class="def record 
field anchored"><a href="#type-lexbuf.lex_curr_p" class="anchor"></a><code><span><span class="keyword">mutable</span> lex_curr_p : <a href="#type-position">position</a>;</span></code></li></ol><code><span>}</span></code></div><div class="spec-doc"><p>The type of lexer buffers. A lexer buffer is the argument passed to the scanning functions defined by the generated scanners. The lexer buffer holds the current state of the scanner, plus a function to refill the buffer from the input.</p><p>Lexers can optionally maintain the <code>lex_curr_p</code> and <code>lex_start_p</code> position fields. This &quot;position tracking&quot; mode is the default, and it corresponds to passing <code>~with_positions:true</code> to functions that create lexer buffers. In this mode, the lexing engine and lexer actions are co-responsible for properly updating the position fields, as described in the next paragraph. When the mode is explicitly disabled (with <code>~with_positions:false</code>), the lexing engine will not touch the position fields and the lexer actions should be careful not to modify them either; the <code>lex_curr_p</code> and <code>lex_start_p</code> fields will then always hold the <code>dummy_pos</code> invalid position. Not tracking positions avoids allocations and memory writes and can significantly improve the performance of the lexer in contexts where <code>lex_start_p</code> and <code>lex_curr_p</code> are not needed.</p><p>Position tracking mode works as follows. At each token, the lexing engine will copy <code>lex_curr_p</code> to <code>lex_start_p</code>, then change the <code>pos_cnum</code> field of <code>lex_curr_p</code> by updating it with the number of characters read since the start of the <code>lexbuf</code>. The other fields are left unchanged by the lexing engine. In order to keep them accurate, they must be initialised before the first use of the lexbuf, and updated by the relevant lexer actions (i.e. 
at each end of line -- see also <code>new_line</code>).</p></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-from_channel"><a href="#val-from_channel" class="anchor"></a><code><span><span class="keyword">val</span> from_channel : <span>?with_positions:bool <span class="arrow">&#45;&gt;</span></span> <span><a href="../index.html#type-in_channel">in_channel</a> <span class="arrow">&#45;&gt;</span></span> <a href="#type-lexbuf">lexbuf</a></span></code></div><div class="spec-doc"><p>Create a lexer buffer on the given input channel. <code>Lexing.from_channel inchan</code> returns a lexer buffer which reads from the input channel <code>inchan</code>, at the current reading position.</p></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-from_string"><a href="#val-from_string" class="anchor"></a><code><span><span class="keyword">val</span> from_string : <span>?with_positions:bool <span class="arrow">&#45;&gt;</span></span> <span>string <span class="arrow">&#45;&gt;</span></span> <a href="#type-lexbuf">lexbuf</a></span></code></div><div class="spec-doc"><p>Create a lexer buffer which reads from the given string. Reading starts from the first character in the string. An end-of-input condition is generated when the end of the string is reached.</p></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-from_function"><a href="#val-from_function" class="anchor"></a><code><span><span class="keyword">val</span> from_function : <span>?with_positions:bool <span class="arrow">&#45;&gt;</span></span> <span><span>(<span>bytes <span class="arrow">&#45;&gt;</span></span> <span>int <span class="arrow">&#45;&gt;</span></span> int)</span> <span class="arrow">&#45;&gt;</span></span> <a href="#type-lexbuf">lexbuf</a></span></code></div><div class="spec-doc"><p>Create a lexer buffer with the given function as its reading method. 
When the scanner needs more characters, it will call the given function, giving it a byte sequence <code>s</code> and a byte count <code>n</code>. The function should put <code>n</code> bytes or fewer in <code>s</code>, starting at index 0, and return the number of bytes provided. A return value of 0 means end of input.</p></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-set_position"><a href="#val-set_position" class="anchor"></a><code><span><span class="keyword">val</span> set_position : <span><a href="#type-lexbuf">lexbuf</a> <span class="arrow">&#45;&gt;</span></span> <span><a href="#type-position">position</a> <span class="arrow">&#45;&gt;</span></span> unit</span></code></div><div class="spec-doc"><p>Set the initial tracked input position for <code>lexbuf</code> to a custom value. Ignores <code>pos_fname</code>. See <a href="#val-set_filename"><code>set_filename</code></a> for changing this field.</p><ul class="at-tags"><li class="since"><span class="at-tag">since</span> 4.11</li></ul></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-set_filename"><a href="#val-set_filename" class="anchor"></a><code><span><span class="keyword">val</span> set_filename : <span><a href="#type-lexbuf">lexbuf</a> <span class="arrow">&#45;&gt;</span></span> <span>string <span class="arrow">&#45;&gt;</span></span> unit</span></code></div><div class="spec-doc"><p>Set filename in the initial tracked position to <code>file</code> in <code>lexbuf</code>.</p><ul class="at-tags"><li class="since"><span class="at-tag">since</span> 4.11</li></ul></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-with_positions"><a href="#val-with_positions" class="anchor"></a><code><span><span class="keyword">val</span> with_positions : <span><a href="#type-lexbuf">lexbuf</a> <span class="arrow">&#45;&gt;</span></span> bool</span></code></div><div class="spec-doc"><p>Tell whether the lexer buffer keeps track of position fields 
<code>lex_curr_p</code> / <code>lex_start_p</code>, as determined by the corresponding optional argument for functions that create lexer buffers (whose default value is <code>true</code>).</p><p>When <code>with_positions</code> is <code>false</code>, lexer actions should not modify position fields. Doing so nevertheless could re-enable the <code>with_positions</code> mode and degrade performance.</p></div></div><h2 id="functions-for-lexer-semantic-actions"><a href="#functions-for-lexer-semantic-actions" class="anchor"></a>Functions for lexer semantic actions</h2><p>The following functions can be called from the semantic actions of lexer definitions (the ML code enclosed in braces that computes the value returned by lexing functions). They give access to the character string matched by the regular expression associated with the semantic action. These functions must be applied to the argument <code>lexbuf</code>, which, in the code generated by <code>ocamllex</code>, is bound to the lexer buffer passed to the parsing function.</p><div class="odoc-spec"><div class="spec value anchored" id="val-lexeme"><a href="#val-lexeme" class="anchor"></a><code><span><span class="keyword">val</span> lexeme : <span><a href="#type-lexbuf">lexbuf</a> <span class="arrow">&#45;&gt;</span></span> string</span></code></div><div class="spec-doc"><p><code>Lexing.lexeme lexbuf</code> returns the string matched by the regular expression.</p></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-lexeme_char"><a href="#val-lexeme_char" class="anchor"></a><code><span><span class="keyword">val</span> lexeme_char : <span><a href="#type-lexbuf">lexbuf</a> <span class="arrow">&#45;&gt;</span></span> <span>int <span class="arrow">&#45;&gt;</span></span> char</span></code></div><div class="spec-doc"><p><code>Lexing.lexeme_char lexbuf i</code> returns character number <code>i</code> in the matched string.</p></div></div><div class="odoc-spec"><div class="spec value anchored" 
id="val-lexeme_start"><a href="#val-lexeme_start" class="anchor"></a><code><span><span class="keyword">val</span> lexeme_start : <span><a href="#type-lexbuf">lexbuf</a> <span class="arrow">&#45;&gt;</span></span> int</span></code></div><div class="spec-doc"><p><code>Lexing.lexeme_start lexbuf</code> returns the offset in the input stream of the first character of the matched string. The first character of the stream has offset 0.</p></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-lexeme_end"><a href="#val-lexeme_end" class="anchor"></a><code><span><span class="keyword">val</span> lexeme_end : <span><a href="#type-lexbuf">lexbuf</a> <span class="arrow">&#45;&gt;</span></span> int</span></code></div><div class="spec-doc"><p><code>Lexing.lexeme_end lexbuf</code> returns the offset in the input stream of the character following the last character of the matched string. The first character of the stream has offset 0.</p></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-lexeme_start_p"><a href="#val-lexeme_start_p" class="anchor"></a><code><span><span class="keyword">val</span> lexeme_start_p : <span><a href="#type-lexbuf">lexbuf</a> <span class="arrow">&#45;&gt;</span></span> <a href="#type-position">position</a></span></code></div><div class="spec-doc"><p>Like <code>lexeme_start</code>, but return a complete <code>position</code> instead of an offset. When position tracking is disabled, the function returns <code>dummy_pos</code>.</p></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-lexeme_end_p"><a href="#val-lexeme_end_p" class="anchor"></a><code><span><span class="keyword">val</span> lexeme_end_p : <span><a href="#type-lexbuf">lexbuf</a> <span class="arrow">&#45;&gt;</span></span> <a href="#type-position">position</a></span></code></div><div class="spec-doc"><p>Like <code>lexeme_end</code>, but return a complete <code>position</code> instead of an offset. 
When position tracking is disabled, the function returns <code>dummy_pos</code>.</p></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-new_line"><a href="#val-new_line" class="anchor"></a><code><span><span class="keyword">val</span> new_line : <span><a href="#type-lexbuf">lexbuf</a> <span class="arrow">&#45;&gt;</span></span> unit</span></code></div><div class="spec-doc"><p>Update the <code>lex_curr_p</code> field of the lexbuf to reflect the start of a new line. You can call this function in the semantic action of the rule that matches the end-of-line character. The function does nothing when position tracking is disabled.</p><ul class="at-tags"><li class="since"><span class="at-tag">since</span> 3.11.0</li></ul></div></div><h2 id="miscellaneous-functions"><a href="#miscellaneous-functions" class="anchor"></a>Miscellaneous functions</h2><div class="odoc-spec"><div class="spec value anchored" id="val-flush_input"><a href="#val-flush_input" class="anchor"></a><code><span><span class="keyword">val</span> flush_input : <span><a href="#type-lexbuf">lexbuf</a> <span class="arrow">&#45;&gt;</span></span> unit</span></code></div><div class="spec-doc"><p>Discard the contents of the buffer and reset the current position to 0. The next use of the lexbuf will trigger a refill.</p></div></div></div></body></html>