rious decrement as opposed to some sort of leak. Further tracing of the dirty cluster count deltas and an LLM scan of the resulting output identified the cause as a double decrement in the error path between ext4_mb_mark_diskspace_used() and the caller ext4_mb_new_blocks(). First, note that generic/388 is a shutdown vs. fsstress test and so produces a random set of operations and shutdown injections. In the problematic case, the shutdown triggers an error return from the ext4_handle_dirty_metadata() call(s) made from ext4_mb_mark_context(). The changed value is non-zero at this point, so ext4_mb_mark_diskspace_used() does not exit after the error bubbles up from ext4_mb_mark_context(). Instead, the former decrements both cluster counters and returns the error up to ext4_mb_new_blocks(). The latter falls into the !ar->len out path which decrements the dirty clusters counter a second time, creating the inconsistency. To avoid this problem and simplify ownership of the cluster reservation in this codepath, lift the counter reduction to a single place in the caller. This makes it more clear that ext4_mb_new_blocks() is responsible for acquiring cluster reservation (via ext4_claim_free_clusters()) in the !delalloc case as well as releasing it, regardless of whether it ends up consumed or returned due to failure. Fixes: 0087d9fb3f29 ("ext4: Fix s_dirty_blocks_counter if block allocation failed with nodelalloc") Signed-off-by: Brian Foster Reviewed-by: Baokun Li Link: https://patch.msgid.link/20260113171905.118284-1-bfoster@redhat.com Signed-off-by: Theodore Ts'o Cc: stable@kernel.org [ Drop mballoc-test changes ] Signed-off-by: Sasha Levin --- fs/ext4/mballoc.c | 21 +++++---------------- 1 file changed, 5 insertions(+), 16 deletions(-) diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c index 0fd921ada8973..e16f64daf1334 100644 --- a/fs/ext4/mballoc.c +++ b/fs/ext4/mballoc.c @@ -3836,8 +3836,7 @@ void ext4_exit_mballoc(void) * Returns 0 if success or error code */ static noinline_for_stack int -ext4_mb_mark_diskspace_used(struct ext4_allocation_context *ac, - handle_t *handle, unsigned int reserv_clstrs) +ext4_mb_mark_diskspace_used(struct ext4_allocation_context *ac, handle_t *handle) { struct buffer_head *bitmap_bh = NULL; struct ext4_group_desc *gdp; @@ -3925,13 +3924,6 @@ ext4_mb_mark_diskspace_used(struct ext4_allocation_context *ac, ext4_unlock_group(sb, ac->ac_b_ex.fe_group); percpu_counter_sub(&sbi->s_freeclusters_counter, ac->ac_b_ex.fe_len); - /* - * Now reduce the dirty block count also. Should not go negative - */ - if (!(ac->ac_flags & EXT4_MB_DELALLOC_RESERVED)) - /* release all the reserved blocks if non delalloc */ - percpu_counter_sub(&sbi->s_dirtyclusters_counter, - reserv_clstrs); if (sbi->s_log_groups_per_flex) { ext4_group_t flex_group = ext4_flex_group(sbi, @@ -5803,7 +5795,7 @@ ext4_fsblk_t ext4_mb_new_blocks(handle_t *handle, ext4_mb_pa_free(ac); } if (likely(ac->ac_status == AC_STATUS_FOUND)) { - *errp = ext4_mb_mark_diskspace_used(ac, handle, reserv_clstrs); + *errp = ext4_mb_mark_diskspace_used(ac, handle); if (*errp) { ext4_discard_allocated_blocks(ac); goto errout; @@ -5835,12 +5827,9 @@ ext4_fsblk_t ext4_mb_new_blocks(handle_t *handle, kmem_cache_free(ext4_ac_cachep, ac); if (inquota && ar->len < inquota) dquot_free_block(ar->inode, EXT4_C2B(sbi, inquota - ar->len)); - if (!ar->len) { - if ((ar->flags & EXT4_MB_DELALLOC_RESERVED) == 0) - /* release all the reserved blocks if non delalloc */ - percpu_counter_sub(&sbi->s_dirtyclusters_counter, - reserv_clstrs); - } + /* release any reserved blocks */ + if (reserv_clstrs) + percpu_counter_sub(&sbi->s_dirtyclusters_counter, reserv_clstrs); trace_ext4_allocate_blocks(ar, (unsigned long long)block); -- 2.51.0[PATCH 5.15.y] ext4: fix dirtyclusters double decrement on fs shutdownSasha Levin undefinedstable@vger.kernel.org undefined undefined undefined undefined undefined