mirror of
https://github.com/c-cube/ocaml-containers.git
synced 2025-12-06 11:15:31 -05:00
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml"><head><title>CCUtf8_string (containers.CCUtf8_string)</title><link rel="stylesheet" href="../../_odoc-theme/odoc.css"/><meta charset="utf-8"/><meta name="generator" content="odoc 2.2.0"/><meta name="viewport" content="width=device-width,initial-scale=1.0"/><script src="../../highlight.pack.js"></script><script>hljs.initHighlightingOnLoad();</script></head><body class="odoc"><nav class="odoc-nav"><a href="../index.html">Up</a> – <a href="../index.html">containers</a> » CCUtf8_string</nav><header class="odoc-preamble"><h1>Module <code><span>CCUtf8_string</span></code></h1><p>Unicode String, in UTF8</p></header><div class="odoc-content"><p>A unicode string represented by a utf8 bytestring. This representation is convenient for manipulating normal OCaml strings that are encoded in UTF8.</p><p>We perform only basic decoding and encoding between codepoints and bytestrings. For more elaborate operations, please use the excellent <a href="http://erratique.ch/software/uutf">Uutf</a>.</p><p><b>status: experimental</b></p><ul class="at-tags"><li class="since"><span class="at-tag">since</span> 2.1</li></ul><div class="odoc-spec"><div class="spec type anchored" id="type-uchar"><a href="#type-uchar" class="anchor"></a><code><span><span class="keyword">type</span> uchar</span><span> = <a href="../../ocaml/Stdlib/Uchar/index.html#type-t">Stdlib.Uchar.t</a></span></code></div></div><div class="odoc-spec"><div class="spec type anchored" id="type-gen"><a href="#type-gen" class="anchor"></a><code><span><span class="keyword">type</span> <span>'a gen</span></span><span> = <span>unit <span class="arrow">-></span></span> <span><span class="type-var">'a</span> option</span></span></code></div></div><div class="odoc-spec"><div class="spec type anchored" id="type-iter"><a href="#type-iter" class="anchor"></a><code><span><span class="keyword">type</span> <span>'a iter</span></span><span> = <span><span>(<span><span class="type-var">'a</span> <span 
class="arrow">-></span></span> unit)</span> <span class="arrow">-></span></span> unit</span></code></div><div class="spec-doc"><p>Fast internal iterator.</p><ul class="at-tags"><li class="since"><span class="at-tag">since</span> 2.8</li></ul></div></div><div class="odoc-spec"><div class="spec type anchored" id="type-t"><a href="#type-t" class="anchor"></a><code><span><span class="keyword">type</span> t</span><span> = <span class="keyword">private</span> string</span></code></div><div class="spec-doc"><p>A UTF8 string</p></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-equal"><a href="#val-equal" class="anchor"></a><code><span><span class="keyword">val</span> equal : <span><a href="#type-t">t</a> <span class="arrow">-></span></span> <span><a href="#type-t">t</a> <span class="arrow">-></span></span> bool</span></code></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-hash"><a href="#val-hash" class="anchor"></a><code><span><span class="keyword">val</span> hash : <span><a href="#type-t">t</a> <span class="arrow">-></span></span> int</span></code></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-compare"><a href="#val-compare" class="anchor"></a><code><span><span class="keyword">val</span> compare : <span><a href="#type-t">t</a> <span class="arrow">-></span></span> <span><a href="#type-t">t</a> <span class="arrow">-></span></span> int</span></code></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-pp"><a href="#val-pp" class="anchor"></a><code><span><span class="keyword">val</span> pp : <span><a href="../../ocaml/Stdlib/Format/index.html#type-formatter">Stdlib.Format.formatter</a> <span class="arrow">-></span></span> <span><a href="#type-t">t</a> <span class="arrow">-></span></span> unit</span></code></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-to_string"><a href="#val-to_string" class="anchor"></a><code><span><span 
class="keyword">val</span> to_string : <span><a href="#type-t">t</a> <span class="arrow">-></span></span> string</span></code></div><div class="spec-doc"><p>Identity.</p></div></div><div class="odoc-spec"><div class="spec exception anchored" id="exception-Malformed"><a href="#exception-Malformed" class="anchor"></a><code><span><span class="keyword">exception</span> </span><span><span class="exception">Malformed</span> <span class="keyword">of</span> string * int</span></code></div><div class="spec-doc"><p>Malformed string at given offset</p></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-to_gen"><a href="#val-to_gen" class="anchor"></a><code><span><span class="keyword">val</span> to_gen : <span>?idx:int <span class="arrow">-></span></span> <span><a href="#type-t">t</a> <span class="arrow">-></span></span> <span><a href="#type-uchar">uchar</a> <a href="#type-gen">gen</a></span></span></code></div><div class="spec-doc"><p>Generator of unicode codepoints.</p><ul class="at-tags"><li class="parameter"><span class="at-tag">parameter</span> <span class="value">idx</span> <p>offset where to start the decoding.</p></li></ul></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-to_iter"><a href="#val-to_iter" class="anchor"></a><code><span><span class="keyword">val</span> to_iter : <span>?idx:int <span class="arrow">-></span></span> <span><a href="#type-t">t</a> <span class="arrow">-></span></span> <span><a href="#type-uchar">uchar</a> <a href="#type-iter">iter</a></span></span></code></div><div class="spec-doc"><p>Iterator of unicode codepoints.</p><ul class="at-tags"><li class="parameter"><span class="at-tag">parameter</span> <span class="value">idx</span> <p>offset where to start the decoding.</p></li></ul><ul class="at-tags"><li class="since"><span class="at-tag">since</span> 2.8</li></ul></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-to_seq"><a href="#val-to_seq" 
class="anchor"></a><code><span><span class="keyword">val</span> to_seq : <span>?idx:int <span class="arrow">-></span></span> <span><a href="#type-t">t</a> <span class="arrow">-></span></span> <span><a href="#type-uchar">uchar</a> <a href="../../ocaml/Stdlib/Seq/index.html#type-t">Stdlib.Seq.t</a></span></span></code></div><div class="spec-doc"><p>Sequence of unicode codepoints. Renamed from <code>to_std_seq</code> since 3.0.</p><ul class="at-tags"><li class="parameter"><span class="at-tag">parameter</span> <span class="value">idx</span> <p>offset where to start the decoding.</p></li></ul><ul class="at-tags"><li class="since"><span class="at-tag">since</span> 3.0</li></ul></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-to_list"><a href="#val-to_list" class="anchor"></a><code><span><span class="keyword">val</span> to_list : <span>?idx:int <span class="arrow">-></span></span> <span><a href="#type-t">t</a> <span class="arrow">-></span></span> <span><a href="#type-uchar">uchar</a> list</span></span></code></div><div class="spec-doc"><p>List of unicode codepoints.</p><ul class="at-tags"><li class="parameter"><span class="at-tag">parameter</span> <span class="value">idx</span> <p>offset where to start the decoding.</p></li></ul></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-fold"><a href="#val-fold" class="anchor"></a><code><span><span class="keyword">val</span> fold : <span>?idx:int <span class="arrow">-></span></span> <span><span>(<span><span class="type-var">'a</span> <span class="arrow">-></span></span> <span><a href="#type-uchar">uchar</a> <span class="arrow">-></span></span> <span class="type-var">'a</span>)</span> <span class="arrow">-></span></span> <span><span class="type-var">'a</span> <span class="arrow">-></span></span> <span><a href="#type-t">t</a> <span class="arrow">-></span></span> <span class="type-var">'a</span></span></code></div></div><div class="odoc-spec"><div class="spec value anchored" 
id="val-iter"><a href="#val-iter" class="anchor"></a><code><span><span class="keyword">val</span> iter : <span>?idx:int <span class="arrow">-></span></span> <span><span>(<span><a href="#type-uchar">uchar</a> <span class="arrow">-></span></span> unit)</span> <span class="arrow">-></span></span> <span><a href="#type-t">t</a> <span class="arrow">-></span></span> unit</span></code></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-n_chars"><a href="#val-n_chars" class="anchor"></a><code><span><span class="keyword">val</span> n_chars : <span><a href="#type-t">t</a> <span class="arrow">-></span></span> int</span></code></div><div class="spec-doc"><p>Number of characters.</p></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-n_bytes"><a href="#val-n_bytes" class="anchor"></a><code><span><span class="keyword">val</span> n_bytes : <span><a href="#type-t">t</a> <span class="arrow">-></span></span> int</span></code></div><div class="spec-doc"><p>Number of bytes.</p></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-map"><a href="#val-map" class="anchor"></a><code><span><span class="keyword">val</span> map : <span><span>(<span><a href="#type-uchar">uchar</a> <span class="arrow">-></span></span> <a href="#type-uchar">uchar</a>)</span> <span class="arrow">-></span></span> <span><a href="#type-t">t</a> <span class="arrow">-></span></span> <a href="#type-t">t</a></span></code></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-filter_map"><a href="#val-filter_map" class="anchor"></a><code><span><span class="keyword">val</span> filter_map : <span><span>(<span><a href="#type-uchar">uchar</a> <span class="arrow">-></span></span> <span><a href="#type-uchar">uchar</a> option</span>)</span> <span class="arrow">-></span></span> <span><a href="#type-t">t</a> <span class="arrow">-></span></span> <a href="#type-t">t</a></span></code></div></div><div class="odoc-spec"><div class="spec value 
anchored" id="val-flat_map"><a href="#val-flat_map" class="anchor"></a><code><span><span class="keyword">val</span> flat_map : <span><span>(<span><a href="#type-uchar">uchar</a> <span class="arrow">-></span></span> <a href="#type-t">t</a>)</span> <span class="arrow">-></span></span> <span><a href="#type-t">t</a> <span class="arrow">-></span></span> <a href="#type-t">t</a></span></code></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-empty"><a href="#val-empty" class="anchor"></a><code><span><span class="keyword">val</span> empty : <a href="#type-t">t</a></span></code></div><div class="spec-doc"><p>Empty string.</p><ul class="at-tags"><li class="since"><span class="at-tag">since</span> 3.5</li></ul></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-append"><a href="#val-append" class="anchor"></a><code><span><span class="keyword">val</span> append : <span><a href="#type-t">t</a> <span class="arrow">-></span></span> <span><a href="#type-t">t</a> <span class="arrow">-></span></span> <a href="#type-t">t</a></span></code></div><div class="spec-doc"><p>Append two strings together.</p></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-concat"><a href="#val-concat" class="anchor"></a><code><span><span class="keyword">val</span> concat : <span><a href="#type-t">t</a> <span class="arrow">-></span></span> <span><span><a href="#type-t">t</a> list</span> <span class="arrow">-></span></span> <a href="#type-t">t</a></span></code></div><div class="spec-doc"><p><code>concat sep l</code> concatenates the strings in <code>l</code>, inserting <code>sep</code> between consecutive strings. 
Similar to <code>String.concat</code>.</p></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-of_uchar"><a href="#val-of_uchar" class="anchor"></a><code><span><span class="keyword">val</span> of_uchar : <span><a href="#type-uchar">uchar</a> <span class="arrow">-></span></span> <a href="#type-t">t</a></span></code></div><div class="spec-doc"><p><code>of_uchar c</code> is a string containing only the unicode character <code>c</code>.</p><ul class="at-tags"><li class="since"><span class="at-tag">since</span> 3.5</li></ul></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-make"><a href="#val-make" class="anchor"></a><code><span><span class="keyword">val</span> make : <span>int <span class="arrow">-></span></span> <span><a href="#type-uchar">uchar</a> <span class="arrow">-></span></span> <a href="#type-t">t</a></span></code></div><div class="spec-doc"><p><code>make n c</code> makes a new string with <code>n</code> copies of <code>c</code> in it.</p><ul class="at-tags"><li class="since"><span class="at-tag">since</span> 3.5</li></ul></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-of_seq"><a href="#val-of_seq" class="anchor"></a><code><span><span class="keyword">val</span> of_seq : <span><span><a href="#type-uchar">uchar</a> <a href="../../ocaml/Stdlib/Seq/index.html#type-t">Stdlib.Seq.t</a></span> <span class="arrow">-></span></span> <a href="#type-t">t</a></span></code></div><div class="spec-doc"><p>Build a string from unicode codepoints. Renamed from <code>of_std_seq</code> since 3.0.</p><ul class="at-tags"><li class="since"><span class="at-tag">since</span> 3.0</li></ul></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-of_iter"><a href="#val-of_iter" class="anchor"></a><code><span><span class="keyword">val</span> of_iter : <span><span><a href="#type-uchar">uchar</a> <a href="#type-iter">iter</a></span> <span class="arrow">-></span></span> <a href="#type-t">t</a></span></code></div><div 
class="spec-doc"><p>Build a string from unicode codepoints.</p><ul class="at-tags"><li class="since"><span class="at-tag">since</span> 2.8</li></ul></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-uchar_to_bytes"><a href="#val-uchar_to_bytes" class="anchor"></a><code><span><span class="keyword">val</span> uchar_to_bytes : <span><a href="#type-uchar">uchar</a> <span class="arrow">-></span></span> <span>char <a href="#type-iter">iter</a></span></span></code></div><div class="spec-doc"><p>Translate the unicode codepoint to its utf-8 bytes, exposed as an iterator. This can be used, for example, in combination with <code>Buffer.add_char</code> on a pre-allocated buffer to add the bytes one by one (despite its name, <code>Buffer.add_char</code> takes individual bytes, not unicode codepoints).</p><ul class="at-tags"><li class="since"><span class="at-tag">since</span> 3.2</li></ul></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-of_gen"><a href="#val-of_gen" class="anchor"></a><code><span><span class="keyword">val</span> of_gen : <span><span><a href="#type-uchar">uchar</a> <a href="#type-gen">gen</a></span> <span class="arrow">-></span></span> <a href="#type-t">t</a></span></code></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-of_list"><a href="#val-of_list" class="anchor"></a><code><span><span class="keyword">val</span> of_list : <span><span><a href="#type-uchar">uchar</a> list</span> <span class="arrow">-></span></span> <a href="#type-t">t</a></span></code></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-of_string_exn"><a href="#val-of_string_exn" class="anchor"></a><code><span><span class="keyword">val</span> of_string_exn : <span>string <span class="arrow">-></span></span> <a href="#type-t">t</a></span></code></div><div class="spec-doc"><p>Validate a string by checking that it is valid UTF8.</p><ul class="at-tags"><li class="raises"><span class="at-tag">raises</span> <span 
class="value">Invalid_argument</span> <p>if the string is not valid UTF8.</p></li></ul></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-of_string"><a href="#val-of_string" class="anchor"></a><code><span><span class="keyword">val</span> of_string : <span>string <span class="arrow">-></span></span> <span><a href="#type-t">t</a> option</span></span></code></div><div class="spec-doc"><p>Safe version of <a href="#val-of_string_exn"><code>of_string_exn</code></a>.</p></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-is_valid"><a href="#val-is_valid" class="anchor"></a><code><span><span class="keyword">val</span> is_valid : <span>string <span class="arrow">-></span></span> bool</span></code></div><div class="spec-doc"><p>Check whether the string is valid UTF8.</p></div></div><div class="odoc-spec"><div class="spec value anchored" id="val-unsafe_of_string"><a href="#val-unsafe_of_string" class="anchor"></a><code><span><span class="keyword">val</span> unsafe_of_string : <span>string <span class="arrow">-></span></span> <a href="#type-t">t</a></span></code></div><div class="spec-doc"><p>Conversion from a string without validation. <b>CAUTION</b>: this is unsafe and can break all the other functions in this module. Use only if you're sure the string is valid UTF8. Upon iteration, if an invalid substring is met, <a href="#exception-Malformed"><code>Malformed</code></a> will be raised.</p></div></div></div></body></html>