This module defines the common Either interface that is provided for all OCaml versions. For documentation of these functions, refer to the standard library.
compare_and_set r seen v sets the new value of r to v only if its current value is physically equal to seen -- the comparison and the set occur atomically. Returns true if the comparison succeeded (so the set happened) and false otherwise.
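A typical use of compare_and_set is a read-compute-retry loop: read the current value, compute the new one, and retry if another thread raced us in between. A minimal sketch, using the stdlib-style Atomic interface this module mirrors:

```ocaml
(* Atomically add [delta] to [r] using a CAS retry loop. *)
let rec atomic_add (r : int Atomic.t) (delta : int) : unit =
  let seen = Atomic.get r in
  if not (Atomic.compare_and_set r seen (seen + delta)) then
    (* another thread updated [r] between the read and the CAS; retry *)
    atomic_add r delta

let () =
  let c = Atomic.make 10 in
  atomic_add c 32;
  assert (Atomic.get c = 42)
```

Atomic.fetch_and_add already covers this particular case; the retry loop generalizes to any pure update function.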
Blocking_queue (moonpool.Moonpool.Blocking_queue)
Module Moonpool.Blocking_queue
A simple blocking queue.
This queue is quite basic and will not behave well under heavy contention. However, it can be sufficient for many practical use cases.
NOTE: this queue will typically block the caller thread in case the operation (push/pop) cannot proceed. Be wary of deadlocks when using the queue from a pool when you expect the other end to also be produced/consumed from the same pool.
See discussion on Fut.wait_block for more details on deadlocks and how to mitigate the risk of running into them.
More scalable queues can be found in Lockfree (https://github.com/ocaml-multicore/lockfree/)
type 'a t
Unbounded blocking queue.
This queue is thread-safe and will block when calling pop on it when it's empty.
Number of items currently in the queue. Note that pop might still block if this returns a non-zero number, since another thread might have consumed the items in the meantime.
try_pop q immediately pops the first element of q, if any, or returns None without blocking.
parameter force_lock
if true, use Mutex.lock (which can block under contention); if false, use Mutex.try_lock, which might return None even in the presence of an element if there is contention
transfer bq q2 transfers all items presently in bq into q2 in one atomic section, and clears bq. It blocks if no element is in bq.
This is useful to consume elements from the queue in batch. Create a Queue.t locally:
let dowork (work_queue: job Bb_queue.t) =
(* local queue, not thread safe *)
let local_q = Queue.create() in
try
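The snippet above is truncated; a completed sketch of the batch-consuming loop follows. Here the job type and process_job are hypothetical, Bb_queue stands for this module as in the original snippet, and the queue is assumed to raise Closed once it is closed and drained:

```ocaml
let dowork (work_queue : job Bb_queue.t) =
  (* local queue, not thread safe *)
  let local_q = Queue.create () in
  try
    while true do
      (* process jobs already transferred to the local queue,
         without touching the shared queue's lock *)
      while not (Queue.is_empty local_q) do
        process_job (Queue.pop local_q)
      done;
      (* refill the local queue: grab everything currently in the
         blocking queue in one critical section; blocks if it is empty *)
      Bb_queue.transfer work_queue local_q
    done
  with Bb_queue.Closed -> ()
```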
Bounded_queue (moonpool.Moonpool.Bounded_queue)
Module Moonpool.Bounded_queue
A blocking queue of finite size.
This queue, while still using locks underneath (like the regular blocking queue), should be adequate under reasonable contention.
The bounded size is helpful whenever some form of backpressure is desirable: if the queue is used to communicate between producer(s) and consumer(s), the consumer(s) can limit the rate at which producer(s) send new work down their way. Whenever the queue is full, producer(s) have to wait before pushing new work.
try_pop ~force_lock q tries to pop the first element, or returns None if no element is available or if it failed to acquire q.
parameter force_lock
if true, use Mutex.lock (which can block under contention); if false, use Mutex.try_lock, which might return None even in the presence of an element if there is contention.
to_gen q returns a (transient) generator from the queue.
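To illustrate the backpressure mechanism described above, here is a sketch of a slow consumer throttling a fast producer. The create ~max_size argument and the push/pop names are assumed for illustration:

```ocaml
(* A bounded queue of capacity 16 between a fast producer and a
   deliberately slow consumer: once 16 items are in flight, [push]
   blocks, so the producer cannot outrun the consumer unboundedly. *)
let run_pipeline () =
  let q : int Bounded_queue.t = Bounded_queue.create ~max_size:16 () in
  let producer =
    Thread.create
      (fun () ->
        for i = 1 to 1000 do
          Bounded_queue.push q i (* blocks whenever the queue is full *)
        done)
      ()
  in
  let consumer =
    Thread.create
      (fun () ->
        for _ = 1 to 1000 do
          let _item = Bounded_queue.pop q in
          Thread.delay 0.001 (* slow consumer: producer is throttled *)
        done)
      ()
  in
  Thread.join producer;
  Thread.join consumer
```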
Like pop, but blocks if an element is not available immediately. The precautions around blocking from inside a thread pool are the same as explained in Fut.wait_block.
FIFO: first-in, first-out. Tasks are put into a queue, and worker threads pull them out of the queue at the other end.
Since this uses a single blocking queue to manage tasks, it's very simple and reliable. The number of worker threads is fixed, but they are spread over several domains to enable parallelism.
This can be useful for latency-sensitive applications (e.g. as a pool of workers for network servers). Work-stealing pools might have higher throughput but they're very unfair to some tasks; by contrast, here, older tasks have priority over younger tasks.
If a runner is no longer needed, shutdown can be used to signal all worker threads in it to stop (after they finish their work), and wait for them to stop.
The threads are distributed across a fixed domain pool (whose size is determined by Domain.recommended_domain_count on OCaml 5, and simply the single runtime on OCaml 4).
run_wait_block pool f schedules f for later execution on the pool, like run_async. It then blocks the current thread until f() is done executing, and returns its result. If f() raises an exception, then run_wait_block pool f will raise it as well.
NOTE be careful with deadlocks (see notes in Fut.wait_block about the required discipline to avoid deadlocks).
called at the beginning of each new thread in the pool.
parameter min
minimum size of the pool. See Pool.create_args. The default is Domain.recommended_domain_count(), i.e. one worker per CPU core. On OCaml 4 the default is 4 (since there is only one domain).
parameter on_exit_thread
called at the end of each worker thread in the pool.
parameter around_task
a pair of before, after functions run around each task. See Pool.create_args.
with_ () f calls f pool, where pool is obtained via create. When f pool returns or fails, pool is shut down and its resources are released. Most parameters are the same as in create.
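For example, with_ can scope the pool's lifetime so that shutdown is never forgotten (a sketch; run_wait_block comes from the runner interface documented above):

```ocaml
let () =
  Moonpool.Fifo_pool.with_ () (fun pool ->
      (* the pool is alive within this scope and shut down when the
         function returns, even if it raises *)
      Moonpool.Runner.run_wait_block pool (fun () ->
          print_endline "hello from a worker thread"))
```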
Example:
let total_sum = Atomic.make 0
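The Example above is truncated; it belongs to Fork_join.for_ and computes a parallel sum. A completed sketch (the pool creation and ~chunk_size:5 are illustrative choices):

```ocaml
let total_sum = Atomic.make 0

let () =
  let pool = Moonpool.Ws_pool.create () in
  Moonpool.Runner.run_wait_block pool (fun () ->
      Moonpool.Fork_join.for_ ~chunk_size:5 100 (fun low high ->
          (* sum one chunk sequentially, publish it once at the end *)
          let local = ref 0 in
          for i = low to high do
            local := !local + i
          done;
          ignore (Atomic.fetch_and_add total_sum !local : int)));
  (* 0 + 1 + ... + 99 *)
  assert (Atomic.get total_sum = 4950);
  Moonpool.Runner.shutdown pool
```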
Fork_join (moonpool.Moonpool.Fork_join)
Module Moonpool.Fork_join
Fork-join primitives.
NOTE These are only available on OCaml 5.0 and above.
since 0.3
val both : (unit -> 'a) -> (unit -> 'b) -> 'a * 'b
both f g runs f() and g(), potentially in parallel, and returns both results once both are done. If either f() or g() fails, the whole computation fails.
This must be run from within the pool: for example, inside Pool.run or inside a Fut.spawn computation. This is because it relies on an effect handler to be installed.
since 0.3
NOTE this is only available on OCaml 5.
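A minimal use of both, run from inside a pool so that the effect handler is installed (a sketch; the pool choice is illustrative):

```ocaml
let () =
  let pool = Moonpool.Ws_pool.create () in
  let x, y =
    Moonpool.Runner.run_wait_block pool (fun () ->
        (* both closures may run in parallel on the pool's workers *)
        Moonpool.Fork_join.both
          (fun () -> 21 * 2)
          (fun () -> String.length "moonpool"))
  in
  assert (x = 42 && y = 8);
  Moonpool.Runner.shutdown pool
```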
val both_ignore : (unit -> _) -> (unit -> _) -> unit
Same as both f g |> ignore.
since 0.3
NOTE this is only available on OCaml 5.
val for_ : ?chunk_size:int -> int -> (int -> int -> unit) -> unit
for_ n f is the parallel version of for i=0 to n-1 do f i done.
f is called with parameters low and high and must use them like so:
for j = low to high do (* … actual work *) done
If chunk_size=1 then low=high and the loop is not actually needed.
parameter chunk_size
controls the granularity of parallelism. The default chunk size is not specified. See all_array or all_list for more details.
infix runner makes a new infix module with intermediate computations running on the given runner.
since 0.2
Fut (moonpool.Moonpool.Fut)
Module Moonpool.Fut
Futures.
A future of type 'a t represents the result of a computation that will yield a value of type 'a.
Typically, the computation is running on a thread pool Runner.t and will proceed on some worker. Once set, a future cannot change. It either succeeds (storing an Ok x with x: 'a) or fails (storing an Error (exn, bt) with an exception and the corresponding backtrace).
Combinators such as map and join_array can be used to produce futures from other futures (in a monadic way). Some combinators take an on argument to specify a runner on which the intermediate computation takes place; for example map ~on:pool ~f fut maps the value in fut using function f, applicatively; the call to f happens on the runner pool (once fut resolves successfully with a value).
bind ?on ~f fut returns a new future fut2 that resolves like the future f x if fut resolved with x; and fails with e if fut fails with e or f x raises e.
bind_reify_error ?on ~f fut returns a new future fut2 that resolves like the future f (Ok x) if fut resolved with x; and resolves like the future f (Error (exn, bt)) if fut fails with exn and backtrace bt.
choose a b succeeds with Left x or Right y if a succeeds with x or b succeeds with y; it fails only if both of them fail. If they both succeed, it is not specified which result is used.
choose_same a b succeeds with the value of one of a or b if they succeed, or fails if both fail. If they both succeed, it is not specified which result is used.
wait_list l waits for all futures in l to resolve. It discards the individual results of futures in l. It fails if any future fails.
val for_ : on:Runner.t -> int -> (int -> unit) -> unit t
for_ ~on n f runs f 0, f 1, …, f (n-1) on the runner, and returns a future that resolves when all the tasks have resolved, or fails as soon as one task has failed.
val for_array : on:Runner.t -> 'a array -> (int -> 'a -> unit) -> unit t
for_array ~on arr f runs f 0 arr.(0), …, f (n-1) arr.(n-1) in the runner (where n = Array.length arr), and returns a future that resolves when all the tasks are done, or fails if any of them fails.
since 0.2
val for_list : on:Runner.t -> 'a list -> ('a -> unit) -> unit t
for_list ~on l f is like for_array ~on (Array.of_list l) f.
wait_block fut blocks the current thread until fut is resolved, and returns its value.
NOTE: A word of warning: this will monopolize the calling thread until the future resolves. This can also easily cause deadlocks, if enough threads in a pool call wait_block on futures running on the same pool or a pool depending on it.
A good rule to avoid deadlocks is to run this from outside of any pool, or to have an acyclic order between pools where wait_block is only called from a pool on futures evaluated in a pool that comes lower in the hierarchy. If this rule is broken, it is possible for all threads in a pool to wait for futures that can only make progress on these same threads, hence the deadlock.
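Tying this together, a sketch that spawns a computation on a pool, maps its result, and blocks only from the main thread, following the discipline above (wait_block_exn is assumed to be the variant that re-raises the future's exception):

```ocaml
let () =
  let pool = Moonpool.Ws_pool.create () in
  let fut =
    Moonpool.Fut.spawn ~on:pool (fun () -> 6 * 7)
    |> Moonpool.Fut.map ~on:pool ~f:(fun x -> x + 1)
  in
  (* we are on the main thread, not a pool worker, so blocking is safe *)
  let v = Moonpool.Fut.wait_block_exn fut in
  assert (v = 43);
  Moonpool.Runner.shutdown pool
```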
Runner that runs tasks immediately in the caller thread.
Whenever a task is submitted to this runner via Runner.run_async r task, the task is run immediately in the caller thread as task(). There are no background threads and no resources; this is just a trivial implementation of the interface.
This can be useful when an implementation needs a runner, but there isn't enough work to justify starting an actual full thread pool.
Another situation is when threads cannot be used at all (e.g. because you plan to call Unix.fork later).
If a runner is no longer needed, shutdown can be used to signal all worker threads in it to stop (after they finish their work), and wait for them to stop.
The threads are distributed across a fixed domain pool (whose size is determined by Domain.recommended_domain_count on OCaml 5, and simply the single runtime on OCaml 4).
run_wait_block pool f schedules f for later execution on the pool, like run_async. It then blocks the current thread until f() is done executing, and returns its result. If f() raises an exception, then run_wait_block pool f will raise it as well.
NOTE be careful with deadlocks (see notes in Fut.wait_block about the required discipline to avoid deadlocks).
with_ l f runs f x where x is the value protected with the lock l, in a critical section. If f x fails, with_ l f fails too, but the lock is released.
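For example, reading a value under the lock (a sketch; Lock.create is assumed to wrap an initial value, matching the description above):

```ocaml
let () =
  let protected = Moonpool.Lock.create 0 in
  (* the critical section receives the protected value; the lock is
     released when the function returns, and also if it raises *)
  Moonpool.Lock.with_ protected (fun n ->
      Printf.printf "current value: %d\n" n)
```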
If a pool is no longer needed, shutdown can be used to signal all threads in it to stop (after they finish their work), and wait for them to stop.
The threads are distributed across a fixed domain pool (whose size is determined by Domain.recommended_domain_count on OCaml 5, and simply the single runtime on OCaml 4).
If a runner is no longer needed, shutdown can be used to signal all worker threads in it to stop (after they finish their work), and wait for them to stop.
The threads are distributed across a fixed domain pool (whose size is determined by Domain.recommended_domain_count on OCaml 5, and simply the single runtime on OCaml 4).
run_wait_block pool f schedules f for later execution on the pool, like run_async. It then blocks the current thread until f() is done executing, and returns its result. If f() raises an exception, then run_wait_block pool f will raise it as well.
NOTE be careful with deadlocks (see notes in Fut.wait_block).
type thread_loop_wrapper = thread:Thread.t -> pool:t -> (unit -> unit) -> unit -> unit
A thread wrapper f takes the current thread, the current pool, and the worker function loop : unit -> unit which is the worker's main loop, and returns a new loop function. By default it just returns the same loop function but it can be used to install tracing, effect handlers, etc.
add_global_thread_loop_wrapper f installs f so that it is used in every new pool worker thread, for all existing pools and all new pools created with create. These wrappers accumulate: they all apply, but their order is not specified.
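For instance, a wrapper that logs worker start and stop might look like this (a sketch; the logging is illustrative, and the module path of add_global_thread_loop_wrapper is assumed from the surrounding section):

```ocaml
let logging_wrapper ~thread:_ ~pool:_ (loop : unit -> unit) () : unit =
  Printf.printf "worker starting\n%!";
  (* run the worker's normal main loop; log on the way out even if
     the loop exits with an exception *)
  Fun.protect ~finally:(fun () -> Printf.printf "worker stopping\n%!") loop

let () = Moonpool.Pool.add_global_thread_loop_wrapper logging_wrapper
```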
called at the beginning of each new thread in the pool.
parameter min
minimum size of the pool. It will be at least 1 internally, so 0 or negative values make no sense.
parameter per_domain
the number of threads allocated per domain in the fixed domain pool. The default value is 0, but setting, say, ~per_domain:2 means that if there are 8 domains (which might be the case on an 8-core machine) then the minimum size of the pool is 16. If both min and per_domain are specified, the maximum of min and per_domain * num_of_domains is used.
a pair of before, after, where before pool is called before a task is processed, on the worker thread about to run it, and returns x; and after pool x is called by the same thread after the task is over. (since 0.2)
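For example, around_task can time each task (a sketch; it requires the unix library for wall-clock time, and the other create arguments are elided):

```ocaml
let timed_pool : Moonpool.Pool.t =
  Moonpool.Pool.create
    ~around_task:
      ( (fun _pool -> Unix.gettimeofday ()) (* before: record start time *),
        (fun _pool t0 ->
          (* after: runs on the same thread, receives before's result *)
          Printf.printf "task took %.3fs\n%!" (Unix.gettimeofday () -. t0)) )
    ()
```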
If a runner is no longer needed, shutdown can be used to signal all worker threads in it to stop (after they finish their work), and wait for them to stop.
The threads are distributed across a fixed domain pool (whose size is determined by Domain.recommended_domain_count on OCaml 5, and simply the single runtime on OCaml 4).
run_wait_block pool f schedules f for later execution on the pool, like run_async. It then blocks the current thread until f() is done executing, and returns its result. If f() raises an exception, then run_wait_block pool f will raise it as well.
NOTE be careful with deadlocks (see notes in Fut.wait_block).
This provides an abstraction for running tasks in the background, which is implemented by various thread pools.
since 0.3
type task = unit -> unit
type t
A runner.
If a runner is no longer needed, shutdown can be used to signal all worker threads in it to stop (after they finish their work), and wait for them to stop.
The threads are distributed across a fixed domain pool (whose size is determined by Domain.recommended_domain_count on OCaml 5, and simply the single runtime on OCaml 4).
run_wait_block pool f schedules f for later execution on the pool, like run_async. It then blocks the current thread until f() is done executing, and returns its result. If f() raises an exception, then run_wait_block pool f will raise it as well.
NOTE be careful with deadlocks (see notes in Fut.wait_block about the required discipline to avoid deadlocks).
The handler that knows what to do with the suspended computation.
The handler is given two things:
the suspended computation (which can be resumed with a result eventually);
a run function that can be used to start tasks to perform some computation.
This means that a fork-join primitive, for example, can use a single call to suspend to:
suspend the caller until the fork-join is done
use run to start all the tasks. Typically run is called multiple times, which is where the "fork" part comes from. Each call to run potentially runs in parallel with the other calls. The calls must coordinate so that, once they are all done, the suspended caller is resumed with the aggregated result of the computation.
The effect used to suspend the current thread and pass it, suspended, to the handler. The handler will ensure that the suspension is resumed later once some computation has been done.
suspend h jumps back to the nearest with_suspend and calls h.handle with the current continuation k and a task runner function.
val with_suspend : run:(with_handler:bool -> task -> unit) -> (unit -> unit) -> unit
with_suspend ~run f runs f() in an environment where suspend will work. If f() suspends with suspension handler h, this calls h ~run k where k is the suspension.
A pool of threads with a work-stealing scheduler. The pool contains a fixed number of threads that wait for work items to come, process these, and loop.
This is good for CPU-intensive workloads that feature a lot of small tasks. Note that tasks will not always be processed in the order they are scheduled, so this is not great for workloads where the latency of individual tasks matters (for that, see Fifo_pool).
If a pool is no longer needed, shutdown can be used to signal all threads in it to stop (after they finish their work), and wait for them to stop.
The threads are distributed across a fixed domain pool (whose size is determined by Domain.recommended_domain_count on OCaml 5, and simply the single runtime on OCaml 4).
If a runner is no longer needed, shutdown can be used to signal all worker threads in it to stop (after they finish their work), and wait for them to stop.
The threads are distributed across a fixed domain pool (whose size is determined by Domain.recommended_domain_count on OCaml 5, and simply the single runtime on OCaml 4).
run_wait_block pool f schedules f for later execution on the pool, like run_async. It then blocks the current thread until f() is done executing, and returns its result. If f() raises an exception, then run_wait_block pool f will raise it as well.
NOTE be careful with deadlocks (see notes in Fut.wait_block about the required discipline to avoid deadlocks).
called at the beginning of each new thread in the pool.
parameter num_threads
size of the pool, i.e. the number of worker threads. It will be at least 1 internally, so 0 or negative values make no sense. The default is Domain.recommended_domain_count(), i.e. one worker thread per CPU core. On OCaml 4 the default is 4 (since there is only one domain).
parameter on_exit_thread
called at the end of each thread in the pool.
parameter around_task
a pair of before, after, where before pool is called before a task is processed, on the worker thread about to run it, and returns x; and after pool x is called by the same thread after the task is over. (since 0.2)
val start_thread_on_some_domain : ('a -> unit) -> 'a -> Thread.t
Similar to Thread.create, but it picks a background domain at random to run the thread. This ensures that we don't always pick the same domain to run all the various threads needed in an application (timers, event loops, etc.)
A pool within a bigger pool (i.e. the ocean). Here, we're talking about pools of Thread.t that are dispatched over several Domain.t to enable parallelism.
We provide several implementations of pools with distinct scheduling strategies, alongside some concurrency primitives such as guarding locks (Lock.t) and futures (Fut.t).
Default pool. Please explicitly pick an implementation instead.
val start_thread_on_some_domain : ('a -> unit) -> 'a -> Thread.t
Similar to Thread.create, but it picks a background domain at random to run the thread. This ensures that we don't always pick the same domain to run all the various threads needed in an application (timers, event loops, etc.)
run_async runner task schedules the task to run on the given runner. This means task() will be executed at some point in the future, possibly in another thread.
since NEXT_RELEASE
val recommended_thread_count : unit -> int
Number of threads recommended to saturate the CPU. For IO pools this makes little sense (you might want more threads than this because many of them will be blocked most of the time).
diff --git a/dev/moonpool/_doc-dir/README.md b/dev/moonpool/_doc-dir/README.md
index 60f478d3..5135d00b 100644
--- a/dev/moonpool/_doc-dir/README.md
+++ b/dev/moonpool/_doc-dir/README.md
@@ -24,22 +24,31 @@ In addition, some concurrency and parallelism primitives are provided:
## Usage
-The user can create several thread pools. These pools use regular posix threads,
-but the threads are spread across multiple domains (on OCaml 5), which enables
-parallelism.
+The user can create several thread pools (implementing the interface `Runner.t`).
+These pools use regular posix threads, but the threads are spread across
+multiple domains (on OCaml 5), which enables parallelism.
-The function `Pool.run_async pool task` runs `task()` on one of the workers
-of `pool`, as soon as one is available. No result is returned.
+Currently we provide these pool implementations:
+- `Fifo_pool` is a thread pool that uses a blocking queue to schedule tasks,
+ which means they're picked in the same order they've been scheduled ("fifo").
+ This pool is simple and will behave fine for coarse-granularity concurrency,
+ but will slow down under heavy contention.
+- `Ws_pool` is a work-stealing pool, where each thread has its own local queue
+ in addition to a global queue of tasks. This is efficient for workloads
+ with many short tasks that spawn other tasks, but the order in which
+ tasks are run is less predictable. This is useful when throughput is
+ the important thing to optimize.
+
+The function `Runner.run_async pool task` schedules `task()` to run on one of
+the workers of `pool`, as soon as one is available. No result is returned by `run_async`.
```ocaml
# #require "threads";;
-# let pool = Moonpool.Pool.create ~min:4 ();;
-val pool : Moonpool.Runner.t =
- {Moonpool.Pool.run_async = ; shutdown = ; size = ;
- num_tasks = }
+# let pool = Moonpool.Fifo_pool.create ~num_threads:4 ();;
+val pool : Moonpool.Runner.t =
# begin
- Moonpool.Pool.run_async pool
+ Moonpool.Runner.run_async pool
(fun () ->
Thread.delay 0.1;
print_endline "running from the pool");
@@ -51,11 +60,13 @@ running from the pool
- : unit = ()
```
-To wait until the task is done, you can use `Pool.run_wait_block` instead:
+To wait until the task is done, you can use `Runner.run_wait_block`[^1] instead:
+
+[^1]: beware of deadlock! See documentation for more details.
```ocaml
# begin
- Moonpool.Pool.run_wait_block pool
+ Moonpool.Runner.run_wait_block pool
(fun () ->
Thread.delay 0.1;
print_endline "running from the pool");
@@ -157,7 +168,11 @@ val expected_sum : int = 5050
On OCaml 5, again using effect handlers, the module `Fork_join`
implements the [fork-join model](https://en.wikipedia.org/wiki/Fork%E2%80%93join_model).
-It must run on a pool (using [Pool.run] or inside a future via [Future.spawn]).
+It must run on a pool (using [Runner.run_async] or inside a future via [Fut.spawn]).
+
+It is generally better to use the work-stealing pool for workloads that rely on
+fork-join for better performance, because fork-join will tend to spawn lots of
+shorter tasks.
```ocaml
# let rec select_sort arr i len =
@@ -259,7 +274,7 @@ This works for OCaml >= 4.08.
the same pool, too — this is useful for threads blocking on IO).
A useful analogy is that each domain is a bit like a CPU core, and `Thread.t` is a logical thread running on a core.
- Multiple threads have to share a single core and do not run in parallel on it[^1].
+ Multiple threads have to share a single core and do not run in parallel on it[^2].
We can therefore build pools that spread their worker threads on multiple cores to enable parallelism within each pool.
TODO: actually use https://github.com/haesbaert/ocaml-processor to pin domains to cores,
@@ -275,4 +290,4 @@ MIT license.
$ opam install moonpool
```
-[^1]: let's not talk about hyperthreading.
+[^2]: let's not talk about hyperthreading.
add key data m returns a map containing the same bindings as m, plus a binding of key to data. If key was already bound in m to a value that is physically equal to data, m is returned unchanged (the result of the function is then physically equal to m). Otherwise, the previous binding of key in m disappears.
before 4.03
Physical equality was not ensured.
val update : key -> ('a option -> 'a option) -> 'a t -> 'a t
update key f m returns a map containing the same bindings as m, except for the binding of key. Depending on the value of y where y is f (find_opt key m), the binding of key is added, removed or updated. If y is None, the binding is removed if it exists; otherwise, if y is Some z then key is associated to z in the resulting map. If key was already bound in m to a value that is physically equal to z, m is returned unchanged (the result of the function is then physically equal to m).
remove x m returns a map containing the same bindings as m, except for x which is unbound in the returned map. If x was not in m, m is returned unchanged (the result of the function is then physically equal to m).
before 4.03
Physical equality was not ensured.
val merge :