there's a single blocking queue, and one WS_queue per worker. Scheduling
into the pool from a worker (e.g. via fork_join or explicitly) pushes
onto that worker's WS_queue; otherwise the task goes into the main
blocking queue. Workers always try to empty their local queue first,
then try to steal work from other workers, then block on the main
queue.
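
The scheduling policy described above can be sketched roughly as follows.
This is a minimal Python model, not the pool's actual code: the names
PoolSketch, schedule, and next_task are hypothetical, the per-worker
locks stand in for whatever synchronization the real WS_queue uses, and
the LIFO-pop / FIFO-steal ordering is an assumption borrowed from common
work-stealing designs rather than something stated here.

```python
import collections
import queue
import threading


class PoolSketch:
    """Toy model: one shared blocking queue plus one local deque per worker."""

    def __init__(self, num_workers):
        self.main_queue = queue.Queue()  # the single blocking queue
        self.local = [collections.deque() for _ in range(num_workers)]
        self.locks = [threading.Lock() for _ in range(num_workers)]

    def schedule(self, task, from_worker=None):
        """Route a task: local deque if scheduled from a worker, else main queue."""
        if from_worker is None:
            self.main_queue.put(task)
        else:
            with self.locks[from_worker]:
                self.local[from_worker].append(task)

    def next_task(self, worker, timeout=None):
        """Return (source, task) following the local -> steal -> main order."""
        # 1. Drain the worker's own queue first (newest task first; assumed).
        with self.locks[worker]:
            if self.local[worker]:
                return "local", self.local[worker].pop()
        # 2. Try to steal the oldest task from another worker (assumed order).
        for victim in range(len(self.local)):
            if victim == worker:
                continue
            with self.locks[victim]:
                if self.local[victim]:
                    return "stolen", self.local[victim].popleft()
        # 3. Nothing to run or steal: block on the main queue.
        #    Raises queue.Empty if a timeout is given and expires.
        return "main", self.main_queue.get(timeout=timeout)
```

In this model, a task scheduled with from_worker=0 is only visible to
worker 0 or to a thief, while a task scheduled without a worker id is
only picked up once a worker has found nothing locally and nothing to
steal.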