template<typename Fn>
struct Foundation::Core::ParallelForJob< Fn >
Self-draining for-loop job used by ThreadPool::ParallelFor.
One instance is co-invoked across N workers plus the calling thread via ThreadPool::CoInvoke. Each participant pulls indices from a shared atomic cursor and invokes fn once per index until the range is exhausted. The job lives on the caller's stack: no heap allocation, no future.
fn is invoked concurrently as fn(i) or fn(i, workerId), so it must be thread-safe: shared captures must be read-only, writes must be disjoint per index, and any scratch storage must be keyed by workerId. A worker never runs two invocations at once, so per-worker scratch is race-free.
Per-index granularity is intentional: a caller that wants coarser units of work batches them itself by choosing count (for example, by letting each index stand for a chunk of items).
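
The self-draining pattern described above can be sketched roughly as follows. This is a minimal illustration, not the library's definition: SketchParallelForJob, SketchParallelFor, and SketchDemo are hypothetical names, plain std::thread stands in for ThreadPool::CoInvoke, only the fn(i, workerId) form is supported, and workerId 0 is assumed to denote the calling thread.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for ParallelForJob<Fn>: holds the callable, the
// range size, and the shared atomic cursor that all participants drain.
template <typename Fn>
struct SketchParallelForJob {
    Fn& fn;
    std::size_t count;
    std::atomic<std::size_t> cursor;

    SketchParallelForJob(Fn& f, std::size_t n) : fn(f), count(n), cursor(0) {}

    // Every participant calls Run; each claims the next unclaimed index
    // until the cursor passes the end of the range.
    void Run(std::size_t workerId) {
        for (;;) {
            const std::size_t i = cursor.fetch_add(1, std::memory_order_relaxed);
            if (i >= count) return;
            fn(i, workerId);
        }
    }
};

// Hypothetical stand-in for ThreadPool::ParallelFor. Spawning std::thread
// objects here replaces the real pool's CoInvoke (an assumption made only
// to keep the sketch self-contained).
template <typename Fn>
void SketchParallelFor(std::size_t count, std::size_t workers, Fn fn) {
    SketchParallelForJob<Fn> job(fn, count);  // lives on the caller's stack
    std::vector<std::thread> threads;
    threads.reserve(workers);
    for (std::size_t w = 0; w < workers; ++w) {
        threads.emplace_back([&job, w] { job.Run(w + 1); });
    }
    job.Run(0);                        // the calling thread participates too
    for (auto& t : threads) t.join();  // job must outlive all participants
}

// Usage: writes to out[i] are disjoint per index, and the partial sums are
// scratch keyed by workerId, so no locking is needed inside fn.
inline long long SketchDemo() {
    const std::size_t kCount = 1000;
    const std::size_t kWorkers = 3;
    std::vector<long long> out(kCount, 0);
    std::vector<long long> partial(kWorkers + 1, 0);  // one slot per participant
    SketchParallelFor(kCount, kWorkers,
                      [&](std::size_t i, std::size_t workerId) {
                          assert(workerId < partial.size());
                          out[i] = static_cast<long long>(i) * static_cast<long long>(i);
                          partial[workerId] += out[i];
                      });
    long long total = 0;
    for (long long p : partial) total += p;  // combine after all workers joined
    return total;
}
```

Which participant handles which index varies from run to run, but each index is claimed exactly once, so the combined total is deterministic. Adjacent partial slots may share a cache line; a real implementation would likely pad per-worker scratch, which this sketch omits for brevity.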