
Commit 4fcc712

Kent Overstreet authored and torvalds committed
aio: fix io_destroy() regression by using call_rcu()

There was a regression introduced by 36f5588 ("aio: refcounting cleanup"), reported by Jens Axboe - the refcounting cleanup switched to using RCU in the shutdown path, but the synchronize_rcu() was done in the context of the io_destroy() syscall greatly increasing the time it could block.

This patch switches it to call_rcu() and makes shutdown asynchronous (more asynchronous than it was originally; before the refcount changes io_destroy() would still wait on pending kiocbs).

Note that there's a global quota on the max outstanding kiocbs, and that quota must be manipulated synchronously; otherwise io_setup() could return -EAGAIN when there isn't quota available, and userspace won't have any way of waiting until shutdown of the old kioctxs has finished (besides busy looping).

So we release our quota before kioctx shutdown has finished, which should be fine since the quota never corresponded to anything real anyways.

Signed-off-by: Kent Overstreet <koverstreet@google.com>
Cc: Zach Brown <zab@redhat.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Reported-by: Jens Axboe <axboe@kernel.dk>
Tested-by: Jens Axboe <axboe@kernel.dk>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>
Tested-by: Benjamin LaHaise <bcrl@kvack.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
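
The quota argument in the commit message can be illustrated with a small userspace model. This is only a sketch, not kernel code: `AIO_MAX_NR`, `struct model_ctx`, and the `model_*` functions are hypothetical names, and the limit check is simplified compared to what fs/aio.c actually does. It shows the quota being returned at destroy time, while the rest of the teardown happens later.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical userspace model; the names only loosely mirror fs/aio.c. */
enum { AIO_MAX_NR = 128 };        /* stand-in for the global kiocb limit */

static unsigned long aio_nr;      /* quota currently reserved */

struct model_ctx {
	unsigned long max_reqs;   /* quota held by this context */
	bool dead;                /* destroy has been requested */
	bool freed;               /* deferred teardown has run */
};

/* Reserve quota the way io_setup() does; -EAGAIN when none is left. */
static int model_io_setup(struct model_ctx *ctx, unsigned long nr)
{
	if (aio_nr + nr > AIO_MAX_NR)
		return -EAGAIN;

	aio_nr += nr;
	ctx->max_reqs = nr;
	ctx->dead = false;
	ctx->freed = false;
	return 0;
}

/* Release the quota synchronously; the actual free is left for later. */
static void model_io_destroy(struct model_ctx *ctx)
{
	if (ctx->dead)
		return;

	ctx->dead = true;
	assert(aio_nr >= ctx->max_reqs);  /* same intent as the BUG_ON() */
	aio_nr -= ctx->max_reqs;
}

/* Stand-in for the callback that runs after the RCU grace period. */
static void model_deferred_free(struct model_ctx *ctx)
{
	ctx->freed = true;
}
```

In this model, a context that holds the whole quota can be destroyed and a replacement set up immediately, before `model_deferred_free()` has run for the old one. If the quota were returned in the deferred step instead, the second `model_io_setup()` would fail with `-EAGAIN` and the caller would have nothing to wait on.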
1 parent bba00e5 commit 4fcc712

1 file changed, +16 -20 lines changed

fs/aio.c

Lines changed: 16 additions & 20 deletions
@@ -141,9 +141,6 @@ static void aio_free_ring(struct kioctx *ctx)
 	for (i = 0; i < ctx->nr_pages; i++)
 		put_page(ctx->ring_pages[i]);
 
-	if (ctx->mmap_size)
-		vm_munmap(ctx->mmap_base, ctx->mmap_size);
-
 	if (ctx->ring_pages && ctx->ring_pages != ctx->internal_pages)
 		kfree(ctx->ring_pages);
 }
@@ -322,11 +319,6 @@ static void free_ioctx(struct kioctx *ctx)
 
 	aio_free_ring(ctx);
 
-	spin_lock(&aio_nr_lock);
-	BUG_ON(aio_nr - ctx->max_reqs > aio_nr);
-	aio_nr -= ctx->max_reqs;
-	spin_unlock(&aio_nr_lock);
-
 	pr_debug("freeing %p\n", ctx);
 
 	/*
@@ -435,17 +427,24 @@ static void kill_ioctx(struct kioctx *ctx)
 {
 	if (!atomic_xchg(&ctx->dead, 1)) {
 		hlist_del_rcu(&ctx->list);
-		/* Between hlist_del_rcu() and dropping the initial ref */
-		synchronize_rcu();
 
 		/*
-		 * We can't punt to workqueue here because put_ioctx() ->
-		 * free_ioctx() will unmap the ringbuffer, and that has to be
-		 * done in the original process's context. kill_ioctx_rcu/work()
-		 * exist for exit_aio(), as in that path free_ioctx() won't do
-		 * the unmap.
+		 * It'd be more correct to do this in free_ioctx(), after all
+		 * the outstanding kiocbs have finished - but by then io_destroy
+		 * has already returned, so io_setup() could potentially return
+		 * -EAGAIN with no ioctxs actually in use (as far as userspace
+		 * could tell).
 		 */
-		kill_ioctx_work(&ctx->rcu_work);
+		spin_lock(&aio_nr_lock);
+		BUG_ON(aio_nr - ctx->max_reqs > aio_nr);
+		aio_nr -= ctx->max_reqs;
+		spin_unlock(&aio_nr_lock);
+
+		if (ctx->mmap_size)
+			vm_munmap(ctx->mmap_base, ctx->mmap_size);
+
+		/* Between hlist_del_rcu() and dropping the initial ref */
+		call_rcu(&ctx->rcu_head, kill_ioctx_rcu);
 	}
 }
 
@@ -495,10 +494,7 @@ void exit_aio(struct mm_struct *mm)
 		 */
 		ctx->mmap_size = 0;
 
-		if (!atomic_xchg(&ctx->dead, 1)) {
-			hlist_del_rcu(&ctx->list);
-			call_rcu(&ctx->rcu_head, kill_ioctx_rcu);
-		}
+		kill_ioctx(ctx);
 	}
 }
 
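
Two idioms in the diff above may be easier to follow in isolation: the exchange-based guard that lets exit_aio() simply call kill_ioctx(), and the unsigned wraparound test inside the BUG_ON(). The following is a rough userspace sketch using C11 atomics in place of the kernel's atomic_xchg(). `struct kill_model`, `model_kill()`, and `quota_would_underflow()` are hypothetical names introduced only for illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the parts of struct kioctx used by the guard. */
struct kill_model {
	atomic_int dead;   /* 0 = live, 1 = teardown already started */
	int teardowns;     /* how many times the teardown body ran */
};

/*
 * Returns true only for the caller that flips dead from 0 to 1.
 * atomic_exchange() returns the previous value, so later callers see 1
 * and skip the body, much like the !atomic_xchg(&ctx->dead, 1) test.
 */
static bool model_kill(struct kill_model *ctx)
{
	if (atomic_exchange(&ctx->dead, 1) == 0) {
		ctx->teardowns++;
		return true;
	}
	return false;
}

/*
 * Same condition as BUG_ON(aio_nr - ctx->max_reqs > aio_nr): with
 * unsigned arithmetic the difference wraps around and becomes larger
 * than aio_nr exactly when max_reqs exceeds aio_nr.
 */
static bool quota_would_underflow(unsigned long aio_nr, unsigned long max_reqs)
{
	return aio_nr - max_reqs > aio_nr;
}
```

Calling `model_kill()` twice on the same object runs the teardown body once, which is why the open-coded guard in exit_aio() could be replaced by a plain call to kill_ioctx().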