linol/uutf/Uutf/String/index.html
2024-05-08 15:15:46 +00:00

<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml"><head><title>String (uutf.Uutf.String)</title><meta charset="utf-8"/><link rel="stylesheet" href="../../../_odoc-theme/odoc.css"/><meta name="generator" content="odoc 2.4.2"/><meta name="viewport" content="width=device-width,initial-scale=1.0"/><script src="../../../highlight.pack.js"></script><script>hljs.initHighlightingOnLoad();</script></head><body class="odoc"><nav class="odoc-nav"><a href="../index.html">Up</a> <a href="../../index.html">uutf</a> &#x00BB; <a href="../index.html">Uutf</a> &#x00BB; String</nav><header class="odoc-preamble"><h1>Module <code><span>Uutf.String</span></code></h1><p>Fold over the characters of UTF encoded OCaml <code>string</code> values.</p><p><b>Note.</b> Since OCaml 4.14, UTF decoders are available in <a href="../../../ocaml/Stdlib/String/index.html"><code>Stdlib.String</code></a>. You are encouraged to migrate to them.</p></header><nav class="odoc-toc"><ul><li><a href="#encoding-guess">Encoding guess</a></li><li><a href="#string-folders">String folders</a></li></ul></nav><div class="odoc-content"><h2 id="encoding-guess"><a href="#encoding-guess" class="anchor"></a>Encoding guess</h2><div class="odoc-spec"><div class="spec value anchored" id="val-encoding_guess"><a href="#val-encoding_guess" class="anchor"></a><code><span><span class="keyword">val</span> encoding_guess : <span>string <span class="arrow">&#45;&gt;</span></span> <span>[ `UTF_8 <span>| `UTF_16BE</span> <span>| `UTF_16LE</span> ]</span> * bool</span></code></div><div class="spec-doc"><p><code>encoding_guess s</code> is the encoding guessed for <code>s</code> coupled with <code>true</code> iff there's an initial <a href="http://unicode.org/glossary/#byte_order_mark">BOM</a>.</p></div></div><h2 id="string-folders"><a href="#string-folders" class="anchor"></a>String folders</h2><p><b>Note.</b> Initial <a href="http://unicode.org/glossary/#byte_order_mark">BOM</a>s are also folded over.</p><div class="odoc-spec"><div class="spec 
type anchored" id="type-folder"><a href="#type-folder" class="anchor"></a><code><span><span class="keyword">type</span> <span>'a folder</span></span><span> =
<span><span class="type-var">'a</span> <span class="arrow">&#45;&gt;</span></span>
<span>int <span class="arrow">&#45;&gt;</span></span>
<span><span>[ <span>`Uchar of <a href="../../../ocaml/Stdlib/Uchar/index.html#type-t">Stdlib.Uchar.t</a></span> <span><span>| `Malformed</span> of string</span> ]</span> <span class="arrow">&#45;&gt;</span></span>
<span class="type-var">'a</span></span></code></div><div class="spec-doc"><p>The type for character folders. The integer is the index in the string where the <code>`Uchar</code> or <code>`Malformed</code> starts.</p></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-fold_utf_8"><a href="#val-fold_utf_8" class="anchor"></a><code><span><span class="keyword">val</span> fold_utf_8 : <span><span class="optlabel">?pos</span>:int <span class="arrow">&#45;&gt;</span></span> <span><span class="optlabel">?len</span>:int <span class="arrow">&#45;&gt;</span></span> <span><span><span class="type-var">'a</span> <a href="#type-folder">folder</a></span> <span class="arrow">&#45;&gt;</span></span> <span><span class="type-var">'a</span> <span class="arrow">&#45;&gt;</span></span> <span>string <span class="arrow">&#45;&gt;</span></span> <span class="type-var">'a</span></span></code></div><div class="spec-doc"><p><code>fold_utf_8 ?pos ?len f a s</code> is <code>f (</code> ... <code>(f (f a pos u</code><sub>0</sub><code>) j</code><sub>1</sub><code> u</code><sub>1</sub><code>)</code> ... <code>)</code> ... <code>)
j</code><sub>n</sub><code> u</code><sub>n</sub> where <code>u</code><sub>i</sub>, <code>j</code><sub>i</sub> are the characters and their start positions in the UTF-8 encoded substring of <code>s</code> that starts at <code>pos</code> and is <code>len</code> bytes long. The default value for <code>pos</code> is <code>0</code> and the default for <code>len</code> is <code>String.length s - pos</code>.</p></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-fold_utf_16be"><a href="#val-fold_utf_16be" class="anchor"></a><code><span><span class="keyword">val</span> fold_utf_16be : <span><span class="optlabel">?pos</span>:int <span class="arrow">&#45;&gt;</span></span> <span><span class="optlabel">?len</span>:int <span class="arrow">&#45;&gt;</span></span> <span><span><span class="type-var">'a</span> <a href="#type-folder">folder</a></span> <span class="arrow">&#45;&gt;</span></span> <span><span class="type-var">'a</span> <span class="arrow">&#45;&gt;</span></span> <span>string <span class="arrow">&#45;&gt;</span></span> <span class="type-var">'a</span></span></code></div><div class="spec-doc"><p><code>fold_utf_16be ?pos ?len f a s</code> is <code>f (</code> ... <code>(f (f a pos u</code><sub>0</sub><code>) j</code><sub>1</sub><code> u</code><sub>1</sub><code>)</code> ... <code>)</code> ... <code>)
j</code><sub>n</sub><code> u</code><sub>n</sub> where <code>u</code><sub>i</sub>, <code>j</code><sub>i</sub> are the characters and their start positions in the UTF-16BE encoded substring of <code>s</code> that starts at <code>pos</code> and is <code>len</code> bytes long. The default value for <code>pos</code> is <code>0</code> and the default for <code>len</code> is <code>String.length s - pos</code>.</p></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-fold_utf_16le"><a href="#val-fold_utf_16le" class="anchor"></a><code><span><span class="keyword">val</span> fold_utf_16le : <span><span class="optlabel">?pos</span>:int <span class="arrow">&#45;&gt;</span></span> <span><span class="optlabel">?len</span>:int <span class="arrow">&#45;&gt;</span></span> <span><span><span class="type-var">'a</span> <a href="#type-folder">folder</a></span> <span class="arrow">&#45;&gt;</span></span> <span><span class="type-var">'a</span> <span class="arrow">&#45;&gt;</span></span> <span>string <span class="arrow">&#45;&gt;</span></span> <span class="type-var">'a</span></span></code></div><div class="spec-doc"><p><code>fold_utf_16le ?pos ?len f a s</code> is <code>f (</code> ... <code>(f (f a pos u</code><sub>0</sub><code>) j</code><sub>1</sub><code> u</code><sub>1</sub><code>)</code> ... <code>)</code> ... <code>)
j</code><sub>n</sub><code> u</code><sub>n</sub> where <code>u</code><sub>i</sub>, <code>j</code><sub>i</sub> are the characters and their start positions in the UTF-16LE encoded substring of <code>s</code> that starts at <code>pos</code> and is <code>len</code> bytes long. The default value for <code>pos</code> is <code>0</code> and the default for <code>len</code> is <code>String.length s - pos</code>.</p></div></div></div></body></html>