diff --git a/dev/containers/CCParse/index.html b/dev/containers/CCParse/index.html index d082d7eb..89909972 100644 --- a/dev/containers/CCParse/index.html +++ b/dev/containers/CCParse/index.html @@ -43,6 +43,6 @@ assert (l=l');;
Same as chars_fold but with the following differences:
`Continue _. The string is built from characters returned by `Yield.`Yield (acc, c) adds c to the returned string and continues parsing with acc.take len parses exactly len characters from the input. Fails if the input doesn't contain at least len chars.
take_if f takes characters as long as they satisfy the predicate f.
take1_if f takes characters as long as they satisfy the predicate f. Fails if no character satisfies f.
val char_if : ?descr:string -> (char -> bool) -> char tchar_if f parses a character c if f c = true. Fails if the next char does not satisfy f.
val chars_if : (char -> bool) -> string tchars_if f parses a string of chars that satisfy f. Cannot fail.
val chars1_if : ?descr:string -> (char -> bool) -> string tLike chars_if, but accepts only non-empty strings. chars1_if p fails if the string accepted by chars_if p is empty. chars1_if p is equivalent to take1_if p >|= Slice.to_string.
val endline : char tParse '\n'.
val space : char tTab or space.
val white : char tTab or space or newline.
val skip_chars : (char -> bool) -> unit tSkip 0 or more chars satisfying the predicate.
val skip_space : unit tSkip ' ' and '\t'.
val skip_white : unit tSkip ' ' and '\t' and '\n'.
suspend f is the same as f (), but evaluates f () only when needed.
A practical use case is to implement recursive parsers manually, as described in fix. The parser is let rec p () = …, and suspend p can be used in the definition to use p.
val string : string -> string tstring s parses exactly the string s, and nothing else.
many p parses p repeatedly, until p fails, and collects the results into a list.
optional p tries to parse p, and return () whether it succeeded or failed. Cannot fail itself. It consumes input if p succeeded (as much as p consumed), but consumes not input if p failed.
try_ p is just like p (it used to play a role in backtracking semantics but no more).
try_opt p tries to parse using p, and return Some x if p succeeded with x (and consumes what p consumed). Otherwise it returns None and consumes nothing. This cannot fail.
many_until ~until p parses as many p as it can until the until parser successfully returns. If p fails before that then many_until ~until p fails as well. Typically until can be a closing ')' or another termination condition, and what is consumed by until is also consumed by many_until ~until p.
EXPERIMENTAL
try_or p1 ~f ~else_:p2 attempts to parse x using p1, and then becomes f x. If p1 fails, then it becomes p2. This can be useful if f is expensive but only ever works if p1 matches (e.g. after an opening parenthesis or some sort of prefix).
try_or_l ?else_ l tries each pair (test, p) in order. If the n-th test succeeds, then try_or_l l behaves like n-th p, whether p fails or not. If test consumes input, the state is restored before calling p. If they all fail, and else_ is defined, then it behaves like else_. If all fail, and else_ is None, then it fails as well.
This is a performance optimization compared to (<|>). We commit to a branch if the test succeeds, without backtracking at all. It can also provide better error messages, because failures in the parser will not be reported as failures in try_or_l.
See lookahead_ignore for a convenient way of writing the test conditions.
or_ p1 p2 tries to parse p1, and if it fails, tries p2 from the same position.
both a b parses a, then b, then returns the pair of their results.
many1 p is like many p excepts it fails if the list is empty (i.e. it needs p to succeed at least once).
skip p parses zero or more times p and ignores its result. It is eager, meaning it will continue as long as p succeeds. As soon as p fails, skip p stops consuming any input.
Same as sep but stop when until parses successfully.
lookahead p behaves like p, except it doesn't consume any input.
EXPERIMENTAL
lookahead_ignore p tries to parse input with p, and succeeds if p succeeds. However it doesn't consume any input and returns (), so in effect its only use-case is to detect whether p succeeds, e.g. in try_or_l.
EXPERIMENTAL
Fixpoint combinator. fix (fun self -> p) is the parser p, in which self refers to the parser p itself (which is useful to parse recursive structures.
An alternative, manual implementation to let p = fix (fun self -> q) is:
let rec p () =
+ ('acc * string) tSame as chars_fold but with the following differences:
`Continue _. The string is built from characters returned by `Yield.`Yield (acc, c) adds c to the returned string and continues parsing with acc.take_until_success p accumulates characters of the input into a slice, until p successfully parses a value x; then it returns slice, x.
NOTE performance wise, if p does a lot of work at each position, this can be costly (thing naive substring search if p is string "very long needle").
take len parses exactly len characters from the input. Fails if the input doesn't contain at least len chars.
take_if f takes characters as long as they satisfy the predicate f.
take1_if f takes characters as long as they satisfy the predicate f. Fails if no character satisfies f.
val char_if : ?descr:string -> (char -> bool) -> char tchar_if f parses a character c if f c = true. Fails if the next char does not satisfy f.
val chars_if : (char -> bool) -> string tchars_if f parses a string of chars that satisfy f. Cannot fail.
val chars1_if : ?descr:string -> (char -> bool) -> string tLike chars_if, but accepts only non-empty strings. chars1_if p fails if the string accepted by chars_if p is empty. chars1_if p is equivalent to take1_if p >|= Slice.to_string.
val endline : char tParse '\n'.
val space : char tTab or space.
val white : char tTab or space or newline.
val skip_chars : (char -> bool) -> unit tSkip 0 or more chars satisfying the predicate.
val skip_space : unit tSkip ' ' and '\t'.
val skip_white : unit tSkip ' ' and '\t' and '\n'.
suspend f is the same as f (), but evaluates f () only when needed.
A practical use case is to implement recursive parsers manually, as described in fix. The parser is let rec p () = …, and suspend p can be used in the definition to use p.
val string : string -> string tstring s parses exactly the string s, and nothing else.
many p parses p repeatedly, until p fails, and collects the results into a list.
optional p tries to parse p, and return () whether it succeeded or failed. Cannot fail itself. It consumes input if p succeeded (as much as p consumed), but consumes not input if p failed.
try_ p is just like p (it used to play a role in backtracking semantics but no more).
try_opt p tries to parse using p, and return Some x if p succeeded with x (and consumes what p consumed). Otherwise it returns None and consumes nothing. This cannot fail.
many_until ~until p parses as many p as it can until the until parser successfully returns. If p fails before that then many_until ~until p fails as well. Typically until can be a closing ')' or another termination condition, and what is consumed by until is also consumed by many_until ~until p.
EXPERIMENTAL
try_or p1 ~f ~else_:p2 attempts to parse x using p1, and then becomes f x. If p1 fails, then it becomes p2. This can be useful if f is expensive but only ever works if p1 matches (e.g. after an opening parenthesis or some sort of prefix).
try_or_l ?else_ l tries each pair (test, p) in order. If the n-th test succeeds, then try_or_l l behaves like n-th p, whether p fails or not. If test consumes input, the state is restored before calling p. If they all fail, and else_ is defined, then it behaves like else_. If all fail, and else_ is None, then it fails as well.
This is a performance optimization compared to (<|>). We commit to a branch if the test succeeds, without backtracking at all. It can also provide better error messages, because failures in the parser will not be reported as failures in try_or_l.
See lookahead_ignore for a convenient way of writing the test conditions.
or_ p1 p2 tries to parse p1, and if it fails, tries p2 from the same position.
both a b parses a, then b, then returns the pair of their results.
many1 p is like many p excepts it fails if the list is empty (i.e. it needs p to succeed at least once).
skip p parses zero or more times p and ignores its result. It is eager, meaning it will continue as long as p succeeds. As soon as p fails, skip p stops consuming any input.
Same as sep but stop when until parses successfully.
lookahead p behaves like p, except it doesn't consume any input.
EXPERIMENTAL
lookahead_ignore p tries to parse input with p, and succeeds if p succeeds. However it doesn't consume any input and returns (), so in effect its only use-case is to detect whether p succeeds, e.g. in try_or_l.
EXPERIMENTAL
Fixpoint combinator. fix (fun self -> p) is the parser p, in which self refers to the parser p itself (which is useful to parse recursive structures.
An alternative, manual implementation to let p = fix (fun self -> q) is:
let rec p () =
let self = suspend p in
qval line_str : string tline_str is line >|= Slice.to_string. It parses the next line and turns the slice into a string. The state points to the character immediately after the '\n' character.
split_1 ~on_char looks for on_char in the input, and returns a pair sl1, sl2, where:
sl1 is the slice of the input the precedes the first occurrence of on_char, or the whole input if on_char cannot be found. It does not contain on_char.sl2 is the slice that comes after on_char, or None if on_char couldn't be found. It doesn't contain the first occurrence of on_char (if any).The parser is now positioned at the end of the input.
EXPERIMENTAL
split_list ~on_char splits the input on all occurrences of on_char, returning a list of slices.
EXPERIMENTAL
split_list_at_most ~on_char n applies split_1 ~on_char at most n times, to get a list of n+1 elements. The last element might contain on_char. This is useful to limit the amount of work done by split_list.
EXPERIMENTAL
split_2 ~on_char splits the input into exactly 2 fields, and fails if the split yields less or more than 2 items. EXPERIMENTAL
split_list_map ~on_char p uses split_list ~on_char to split the input, then parses each chunk of the input thus obtained using p.
The difference with sep ~by:(char on_char) p is that sep calls p first, and only tries to find on_char after p returns. While it is more flexible, this technique also means p has to be careful not to consume on_char by error.
A useful specialization of this is each_line, which is basically each_split ~on_char:'\n' p.
EXPERIMENTAL
all returns all the unconsumed input as a slice, and consumes it. Use Slice.to_string to turn it into a string.
Note that lookahead all can be used to peek at the rest of the input without consuming anything.
val all_str : string tall_str accepts all the remaining chars and extracts them into a string. Similar to all but with a string.
EXPERIMENTAL
Memoize the parser. memo p will behave like p, but when called in a state (read: position in input) it has already processed, memo p returns a result directly. The implementation uses an underlying hashtable. This can be costly in memory, but improve the run time a lot if there is a lot of backtracking involving p.
Do not call memo inside other functions, especially with (>>=), map, etc. being so prevalent. Instead the correct way to use it is in a toplevel definition:
let my_expensive_parser = memo (foo *> bar >>= fun i -> …)This function is not thread-safe.
module Infix : sig ... endinclude module type of InfixAlias to map. p >|= f parses an item x using p, and returns f x.
Alias to bind. p >>= f results in a new parser which behaves as p then, in case of success, applies f to the result.
a <* b parses a into x, parses b and ignores its result, and returns x.
a *> b parses a, then parses b into x, and returns x. The result of a is ignored.
Alias to or_.
a <|> b tries to parse a, and if a fails without consuming any input, backtracks and tries to parse b, otherwise it fails as a.
a <?> msg behaves like a, but if a fails, a <?> msg fails with msg instead. Useful as the last choice in a series of <|>. For example: a <|> b <|> c <?> "expected one of a, b, c".
Alias to both. a ||| b parses a, then b, then returns the pair of their results.
Let operators on OCaml >= 4.08.0, nothing otherwise
Turn a Error.t-oriented result into a more basic string result.
Version of parse_string that returns a more detailed error.
val parse_string_exn : 'a t -> string -> 'aparse_file p filename parses file named filename with p by opening the file and reading it whole.
Version of parse_file that returns a more detailed error.
val parse_file_exn : 'a t -> string -> 'aSame as parse_file, but
module U : sig ... endmodule Debug_ : sig ... endDebugging utils. EXPERIMENTAL