mirror of
https://github.com/c-cube/moonpool.git
synced 2025-12-16 23:56:49 -05:00
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml"><head><title>Str (ocaml.Str)</title><link rel="stylesheet" href="../../_odoc-theme/odoc.css"/><meta charset="utf-8"/><meta name="generator" content="odoc 2.2.1"/><meta name="viewport" content="width=device-width,initial-scale=1.0"/><script src="../../highlight.pack.js"></script><script>hljs.initHighlightingOnLoad();</script></head><body class="odoc"><nav class="odoc-nav"><a href="../index.html">Up</a> – <a href="../index.html">ocaml</a> » Str</nav><header class="odoc-preamble"><h1>Module <code><span>Str</span></code></h1><p>Regular expressions and high-level string processing</p></header><nav class="odoc-toc"><ul><li><a href="#regular-expressions">Regular expressions</a></li><li><a href="#string-matching-and-searching">String matching and searching</a></li><li><a href="#replacement">Replacement</a></li><li><a href="#splitting">Splitting</a></li><li><a href="#extracting-substrings">Extracting substrings</a></li></ul></nav><div class="odoc-content"><h2 id="regular-expressions"><a href="#regular-expressions" class="anchor"></a>Regular expressions</h2><div class="odoc-spec"><div class="spec type anchored" id="type-regexp"><a href="#type-regexp" class="anchor"></a><code><span><span class="keyword">type</span> regexp</span></code></div><div class="spec-doc"><p>The type of compiled regular expressions.</p></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-regexp"><a href="#val-regexp" class="anchor"></a><code><span><span class="keyword">val</span> regexp : <span>string <span class="arrow">-></span></span> <a href="#type-regexp">regexp</a></span></code></div><div class="spec-doc"><p>Compile a regular expression. The following constructs are recognized:</p><ul><li><code>. 
</code> Matches any character except newline.</li><li><code>* </code> (postfix) Matches the preceding expression zero, one or several times</li><li><code>+ </code> (postfix) Matches the preceding expression one or several times</li><li><code>? </code> (postfix) Matches the preceding expression once or not at all</li><li><code>[..] </code> Character set. Ranges are denoted with <code>-</code>, as in <code>[a-z]</code>. An initial <code>^</code>, as in <code>[^0-9]</code>, complements the set. To include a <code>]</code> character in a set, make it the first character of the set. To include a <code>-</code> character in a set, make it the first or the last character of the set.</li><li><code>^ </code> Matches at beginning of line: either at the beginning of the matched string, or just after a '\n' character.</li><li><code>$ </code> Matches at end of line: either at the end of the matched string, or just before a '\n' character.</li><li><code>\| </code> (infix) Alternative between two expressions.</li><li><code>\(..\)</code> Grouping and naming of the enclosed expression.</li><li><code>\1 </code> The text matched by the first <code>\(...\)</code> expression (<code>\2</code> for the second expression, and so on up to <code>\9</code>).</li><li><code>\b </code> Matches word boundaries.</li><li><code>\ </code> Quotes special characters. The special characters are <code>$^\.*+?[]</code>.</li></ul><p>In regular expressions you will often use backslash characters; it's easier to use a quoted string literal <code>{|...|}</code> to avoid having to escape backslashes.</p><p>For example, the following expression:</p><pre class="language-ocaml"><code>let r = Str.regexp {|hello \([A-Za-z]+\)|} in
Str.replace_first r {|\1|} "hello world" </code></pre><p>returns the string <code>"world"</code>.</p><p>If you want a regular expression that matches a literal backslash character, you need to double it: <code>Str.regexp {|\\|}</code>.</p><p>You can use regular string literals <code>"..."</code> too, however you will have to escape backslashes. The example above can be rewritten with a regular string literal as:</p><pre class="language-ocaml"><code>let r = Str.regexp "hello \\([A-Za-z]+\\)" in
Str.replace_first r "\\1" "hello world" </code></pre><p>And the regular expression for matching a backslash becomes a quadruple backslash: <code>Str.regexp "\\\\"</code>.</p></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-regexp_case_fold"><a href="#val-regexp_case_fold" class="anchor"></a><code><span><span class="keyword">val</span> regexp_case_fold : <span>string <span class="arrow">-></span></span> <a href="#type-regexp">regexp</a></span></code></div><div class="spec-doc"><p>Same as <code>regexp</code>, but the compiled expression will match text in a case-insensitive way: uppercase and lowercase letters will be considered equivalent.</p></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-quote"><a href="#val-quote" class="anchor"></a><code><span><span class="keyword">val</span> quote : <span>string <span class="arrow">-></span></span> string</span></code></div><div class="spec-doc"><p><code>Str.quote s</code> returns a regexp string that matches exactly <code>s</code> and nothing else.</p></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-regexp_string"><a href="#val-regexp_string" class="anchor"></a><code><span><span class="keyword">val</span> regexp_string : <span>string <span class="arrow">-></span></span> <a href="#type-regexp">regexp</a></span></code></div><div class="spec-doc"><p><code>Str.regexp_string s</code> returns a regular expression that matches exactly <code>s</code> and nothing else.</p></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-regexp_string_case_fold"><a href="#val-regexp_string_case_fold" class="anchor"></a><code><span><span class="keyword">val</span> regexp_string_case_fold : <span>string <span class="arrow">-></span></span> <a href="#type-regexp">regexp</a></span></code></div><div class="spec-doc"><p><code>Str.regexp_string_case_fold</code> is similar to <a href="#val-regexp_string"><code>Str.regexp_string</code></a>, but the regexp 
matches in a case-insensitive way.</p></div></div><h2 id="string-matching-and-searching"><a href="#string-matching-and-searching" class="anchor"></a>String matching and searching</h2><div class="odoc-spec"><div class="spec value anchored" id="val-string_match"><a href="#val-string_match" class="anchor"></a><code><span><span class="keyword">val</span> string_match : <span><a href="#type-regexp">regexp</a> <span class="arrow">-></span></span> <span>string <span class="arrow">-></span></span> <span>int <span class="arrow">-></span></span> bool</span></code></div><div class="spec-doc"><p><code>string_match r s start</code> tests whether a substring of <code>s</code> that starts at position <code>start</code> matches the regular expression <code>r</code>. The first character of a string has position <code>0</code>, as usual.</p></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-search_forward"><a href="#val-search_forward" class="anchor"></a><code><span><span class="keyword">val</span> search_forward : <span><a href="#type-regexp">regexp</a> <span class="arrow">-></span></span> <span>string <span class="arrow">-></span></span> <span>int <span class="arrow">-></span></span> int</span></code></div><div class="spec-doc"><p><code>search_forward r s start</code> searches the string <code>s</code> for a substring matching the regular expression <code>r</code>. The search starts at position <code>start</code> and proceeds towards the end of the string. 
Returns the position of the first character of the matched substring.</p><ul class="at-tags"><li class="raises"><span class="at-tag">raises</span> <span class="value">Not_found</span> <p>if no substring matches.</p></li></ul></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-search_backward"><a href="#val-search_backward" class="anchor"></a><code><span><span class="keyword">val</span> search_backward : <span><a href="#type-regexp">regexp</a> <span class="arrow">-></span></span> <span>string <span class="arrow">-></span></span> <span>int <span class="arrow">-></span></span> int</span></code></div><div class="spec-doc"><p><code>search_backward r s last</code> searches the string <code>s</code> for a substring matching the regular expression <code>r</code>. The search first considers substrings that start at position <code>last</code> and proceeds towards the beginning of the string. Returns the position of the first character of the matched substring.</p><ul class="at-tags"><li class="raises"><span class="at-tag">raises</span> <span class="value">Not_found</span> <p>if no substring matches.</p></li></ul></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-string_partial_match"><a href="#val-string_partial_match" class="anchor"></a><code><span><span class="keyword">val</span> string_partial_match : <span><a href="#type-regexp">regexp</a> <span class="arrow">-></span></span> <span>string <span class="arrow">-></span></span> <span>int <span class="arrow">-></span></span> bool</span></code></div><div class="spec-doc"><p>Similar to <a href="#val-string_match"><code>Str.string_match</code></a>, but also returns true if the argument string is a prefix of a string that matches. 
This includes the case of a true complete match.</p></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-matched_string"><a href="#val-matched_string" class="anchor"></a><code><span><span class="keyword">val</span> matched_string : <span>string <span class="arrow">-></span></span> string</span></code></div><div class="spec-doc"><p><code>matched_string s</code> returns the substring of <code>s</code> that was matched by the last call to one of the following matching or searching functions:</p><ul><li><a href="#val-string_match"><code>Str.string_match</code></a></li><li><a href="#val-search_forward"><code>Str.search_forward</code></a></li><li><a href="#val-search_backward"><code>Str.search_backward</code></a></li><li><a href="#val-string_partial_match"><code>Str.string_partial_match</code></a></li><li><a href="#val-global_substitute"><code>Str.global_substitute</code></a></li><li><a href="#val-substitute_first"><code>Str.substitute_first</code></a></li></ul><p>provided that none of the following functions was called in between:</p><ul><li><a href="#val-global_replace"><code>Str.global_replace</code></a></li><li><a href="#val-replace_first"><code>Str.replace_first</code></a></li><li><a href="#val-split"><code>Str.split</code></a></li><li><a href="#val-bounded_split"><code>Str.bounded_split</code></a></li><li><a href="#val-split_delim"><code>Str.split_delim</code></a></li><li><a href="#val-bounded_split_delim"><code>Str.bounded_split_delim</code></a></li><li><a href="#val-full_split"><code>Str.full_split</code></a></li><li><a href="#val-bounded_full_split"><code>Str.bounded_full_split</code></a></li></ul><p>Note: in the case of <code>global_substitute</code> and <code>substitute_first</code>, a call to <code>matched_string</code> is only valid within the <code>subst</code> argument, not after <code>global_substitute</code> or <code>substitute_first</code> returns.</p><p>The user must make sure that the parameter <code>s</code> is the same string 
that was passed to the matching or searching function.</p></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-match_beginning"><a href="#val-match_beginning" class="anchor"></a><code><span><span class="keyword">val</span> match_beginning : <span>unit <span class="arrow">-></span></span> int</span></code></div><div class="spec-doc"><p><code>match_beginning()</code> returns the position of the first character of the substring that was matched by the last call to a matching or searching function (see <a href="#val-matched_string"><code>Str.matched_string</code></a> for details).</p></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-match_end"><a href="#val-match_end" class="anchor"></a><code><span><span class="keyword">val</span> match_end : <span>unit <span class="arrow">-></span></span> int</span></code></div><div class="spec-doc"><p><code>match_end()</code> returns the position of the character following the last character of the substring that was matched by the last call to a matching or searching function (see <a href="#val-matched_string"><code>Str.matched_string</code></a> for details).</p></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-matched_group"><a href="#val-matched_group" class="anchor"></a><code><span><span class="keyword">val</span> matched_group : <span>int <span class="arrow">-></span></span> <span>string <span class="arrow">-></span></span> string</span></code></div><div class="spec-doc"><p><code>matched_group n s</code> returns the substring of <code>s</code> that was matched by the <code>n</code>th group <code>\(...\)</code> of the regular expression that was matched by the last call to a matching or searching function (see <a href="#val-matched_string"><code>Str.matched_string</code></a> for details). When <code>n</code> is <code>0</code>, it returns the substring matched by the whole regular expression. 
The user must make sure that the parameter <code>s</code> is the same string that was passed to the matching or searching function.</p><ul class="at-tags"><li class="raises"><span class="at-tag">raises</span> <span class="value">Not_found</span> <p>if the <code>n</code>th group of the regular expression was not matched. This can happen with groups inside alternatives <code>\|</code>, options <code>?</code> or repetitions <code>*</code>. For instance, the empty string will match <code>\(a\)*</code>, but <code>matched_group 1 ""</code> will raise <code>Not_found</code> because the first group itself was not matched.</p></li></ul></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-group_beginning"><a href="#val-group_beginning" class="anchor"></a><code><span><span class="keyword">val</span> group_beginning : <span>int <span class="arrow">-></span></span> int</span></code></div><div class="spec-doc"><p><code>group_beginning n</code> returns the position of the first character of the substring that was matched by the <code>n</code>th group of the regular expression that was matched by the last call to a matching or searching function (see <a href="#val-matched_string"><code>Str.matched_string</code></a> for details).</p><ul class="at-tags"><li class="raises"><span class="at-tag">raises</span> <span class="value">Not_found</span> <p>if the <code>n</code>th group of the regular expression was not matched.</p></li></ul><ul class="at-tags"><li class="raises"><span class="at-tag">raises</span> <span class="value">Invalid_argument</span> <p>if there are fewer than <code>n</code> groups in the regular expression.</p></li></ul></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-group_end"><a href="#val-group_end" class="anchor"></a><code><span><span class="keyword">val</span> group_end : <span>int <span class="arrow">-></span></span> int</span></code></div><div class="spec-doc"><p><code>group_end n</code> returns the position of 
the character following the last character of the substring that was matched by the <code>n</code>th group of the regular expression that was matched by the last call to a matching or searching function (see <a href="#val-matched_string"><code>Str.matched_string</code></a> for details).</p><ul class="at-tags"><li class="raises"><span class="at-tag">raises</span> <span class="value">Not_found</span> <p>if the <code>n</code>th group of the regular expression was not matched.</p></li></ul><ul class="at-tags"><li class="raises"><span class="at-tag">raises</span> <span class="value">Invalid_argument</span> <p>if there are fewer than <code>n</code> groups in the regular expression.</p></li></ul></div></div><h2 id="replacement"><a href="#replacement" class="anchor"></a>Replacement</h2><div class="odoc-spec"><div class="spec value anchored" id="val-global_replace"><a href="#val-global_replace" class="anchor"></a><code><span><span class="keyword">val</span> global_replace : <span><a href="#type-regexp">regexp</a> <span class="arrow">-></span></span> <span>string <span class="arrow">-></span></span> <span>string <span class="arrow">-></span></span> string</span></code></div><div class="spec-doc"><p><code>global_replace regexp templ s</code> returns a string identical to <code>s</code>, except that all substrings of <code>s</code> that match <code>regexp</code> have been replaced by <code>templ</code>. The replacement template <code>templ</code> can contain <code>\1</code>, <code>\2</code>, etc.; these sequences will be replaced by the text matched by the corresponding group in the regular expression. 
<code>\0</code> stands for the text matched by the whole regular expression.</p></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-replace_first"><a href="#val-replace_first" class="anchor"></a><code><span><span class="keyword">val</span> replace_first : <span><a href="#type-regexp">regexp</a> <span class="arrow">-></span></span> <span>string <span class="arrow">-></span></span> <span>string <span class="arrow">-></span></span> string</span></code></div><div class="spec-doc"><p>Same as <a href="#val-global_replace"><code>Str.global_replace</code></a>, except that only the first substring matching the regular expression is replaced.</p></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-global_substitute"><a href="#val-global_substitute" class="anchor"></a><code><span><span class="keyword">val</span> global_substitute : <span><a href="#type-regexp">regexp</a> <span class="arrow">-></span></span> <span><span>(<span>string <span class="arrow">-></span></span> string)</span> <span class="arrow">-></span></span> <span>string <span class="arrow">-></span></span> string</span></code></div><div class="spec-doc"><p><code>global_substitute regexp subst s</code> returns a string identical to <code>s</code>, except that all substrings of <code>s</code> that match <code>regexp</code> have been replaced by the result of function <code>subst</code>. 
The function <code>subst</code> is called once for each matching substring, and receives <code>s</code> (the whole text) as argument.</p></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-substitute_first"><a href="#val-substitute_first" class="anchor"></a><code><span><span class="keyword">val</span> substitute_first : <span><a href="#type-regexp">regexp</a> <span class="arrow">-></span></span> <span><span>(<span>string <span class="arrow">-></span></span> string)</span> <span class="arrow">-></span></span> <span>string <span class="arrow">-></span></span> string</span></code></div><div class="spec-doc"><p>Same as <a href="#val-global_substitute"><code>Str.global_substitute</code></a>, except that only the first substring matching the regular expression is replaced.</p></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-replace_matched"><a href="#val-replace_matched" class="anchor"></a><code><span><span class="keyword">val</span> replace_matched : <span>string <span class="arrow">-></span></span> <span>string <span class="arrow">-></span></span> string</span></code></div><div class="spec-doc"><p><code>replace_matched repl s</code> returns the replacement text <code>repl</code> in which <code>\1</code>, <code>\2</code>, etc. have been replaced by the text matched by the corresponding groups in the regular expression that was matched by the last call to a matching or searching function (see <a href="#val-matched_string"><code>Str.matched_string</code></a> for details). 
<code>s</code> must be the same string that was passed to the matching or searching function.</p></div></div><h2 id="splitting"><a href="#splitting" class="anchor"></a>Splitting</h2><div class="odoc-spec"><div class="spec value anchored" id="val-split"><a href="#val-split" class="anchor"></a><code><span><span class="keyword">val</span> split : <span><a href="#type-regexp">regexp</a> <span class="arrow">-></span></span> <span>string <span class="arrow">-></span></span> <span>string list</span></span></code></div><div class="spec-doc"><p><code>split r s</code> splits <code>s</code> into substrings, taking as delimiters the substrings that match <code>r</code>, and returns the list of substrings. For instance, <code>split (regexp "[ \t]+") s</code> splits <code>s</code> into blank-separated words. An occurrence of the delimiter at the beginning or at the end of the string is ignored.</p></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-bounded_split"><a href="#val-bounded_split" class="anchor"></a><code><span><span class="keyword">val</span> bounded_split : <span><a href="#type-regexp">regexp</a> <span class="arrow">-></span></span> <span>string <span class="arrow">-></span></span> <span>int <span class="arrow">-></span></span> <span>string list</span></span></code></div><div class="spec-doc"><p>Same as <a href="#val-split"><code>Str.split</code></a>, but splits into at most <code>n</code> substrings, where <code>n</code> is the extra integer parameter.</p></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-split_delim"><a href="#val-split_delim" class="anchor"></a><code><span><span class="keyword">val</span> split_delim : <span><a href="#type-regexp">regexp</a> <span class="arrow">-></span></span> <span>string <span class="arrow">-></span></span> <span>string list</span></span></code></div><div class="spec-doc"><p>Same as <a href="#val-split"><code>Str.split</code></a> but occurrences of the delimiter at the beginning 
and at the end of the string are recognized and returned as empty strings in the result. For instance, <code>split_delim (regexp " ") " abc "</code> returns <code>[""; "abc"; ""]</code>, while <code>split</code> with the same arguments returns <code>["abc"]</code>.</p></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-bounded_split_delim"><a href="#val-bounded_split_delim" class="anchor"></a><code><span><span class="keyword">val</span> bounded_split_delim : <span><a href="#type-regexp">regexp</a> <span class="arrow">-></span></span> <span>string <span class="arrow">-></span></span> <span>int <span class="arrow">-></span></span> <span>string list</span></span></code></div><div class="spec-doc"><p>Same as <a href="#val-bounded_split"><code>Str.bounded_split</code></a>, but occurrences of the delimiter at the beginning and at the end of the string are recognized and returned as empty strings in the result.</p></div></div><div class="odoc-spec"><div class="spec type anchored" id="type-split_result"><a href="#type-split_result" class="anchor"></a><code><span><span class="keyword">type</span> split_result</span><span> = </span></code><ol><li id="type-split_result.Text" class="def variant constructor anchored"><a href="#type-split_result.Text" class="anchor"></a><code><span>| </span><span><span class="constructor">Text</span> <span class="keyword">of</span> string</span></code></li><li id="type-split_result.Delim" class="def variant constructor anchored"><a href="#type-split_result.Delim" class="anchor"></a><code><span>| </span><span><span class="constructor">Delim</span> <span class="keyword">of</span> string</span></code></li></ol></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-full_split"><a href="#val-full_split" class="anchor"></a><code><span><span class="keyword">val</span> full_split : <span><a href="#type-regexp">regexp</a> <span class="arrow">-></span></span> <span>string <span class="arrow">-></span></span> 
<span><a href="#type-split_result">split_result</a> list</span></span></code></div><div class="spec-doc"><p>Same as <a href="#val-split_delim"><code>Str.split_delim</code></a>, but returns the delimiters as well as the substrings contained between delimiters. The former are tagged <code>Delim</code> in the result list; the latter are tagged <code>Text</code>. For instance, <code>full_split (regexp "[{}]") "{ab}"</code> returns <code>[Delim "{"; Text "ab"; Delim "}"]</code>.</p></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-bounded_full_split"><a href="#val-bounded_full_split" class="anchor"></a><code><span><span class="keyword">val</span> bounded_full_split : <span><a href="#type-regexp">regexp</a> <span class="arrow">-></span></span> <span>string <span class="arrow">-></span></span> <span>int <span class="arrow">-></span></span> <span><a href="#type-split_result">split_result</a> list</span></span></code></div><div class="spec-doc"><p>Same as <a href="#val-bounded_split_delim"><code>Str.bounded_split_delim</code></a>, but returns the delimiters as well as the substrings contained between delimiters. 
The former are tagged <code>Delim</code> in the result list; the latter are tagged <code>Text</code>.</p></div></div><h2 id="extracting-substrings"><a href="#extracting-substrings" class="anchor"></a>Extracting substrings</h2><div class="odoc-spec"><div class="spec value anchored" id="val-string_before"><a href="#val-string_before" class="anchor"></a><code><span><span class="keyword">val</span> string_before : <span>string <span class="arrow">-></span></span> <span>int <span class="arrow">-></span></span> string</span></code></div><div class="spec-doc"><p><code>string_before s n</code> returns the substring of all characters of <code>s</code> that precede position <code>n</code> (excluding the character at position <code>n</code>).</p></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-string_after"><a href="#val-string_after" class="anchor"></a><code><span><span class="keyword">val</span> string_after : <span>string <span class="arrow">-></span></span> <span>int <span class="arrow">-></span></span> string</span></code></div><div class="spec-doc"><p><code>string_after s n</code> returns the substring of all characters of <code>s</code> that follow position <code>n</code> (including the character at position <code>n</code>).</p></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-first_chars"><a href="#val-first_chars" class="anchor"></a><code><span><span class="keyword">val</span> first_chars : <span>string <span class="arrow">-></span></span> <span>int <span class="arrow">-></span></span> string</span></code></div><div class="spec-doc"><p><code>first_chars s n</code> returns the first <code>n</code> characters of <code>s</code>. 
This is the same function as <a href="#val-string_before"><code>Str.string_before</code></a>.</p></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-last_chars"><a href="#val-last_chars" class="anchor"></a><code><span><span class="keyword">val</span> last_chars : <span>string <span class="arrow">-></span></span> <span>int <span class="arrow">-></span></span> string</span></code></div><div class="spec-doc"><p><code>last_chars s n</code> returns the last <code>n</code> characters of <code>s</code>.</p></div></div></div></body></html>