Problem Statement:
VirtualDev::alloc_blk takes a 16-bit nblks, but existing callers in homestore work with a 32-bit data size and pass a 32-bit nblks down to the vdev layer. When the data size is larger than 256MB this overflows, and the write can end up corrupted (worst case, a data mismatch).
The Meta Svc layer has some protection when the overflow happens: it cannot match the input context_sz against the size implied by nblks and hits a release assert. That still results in a crash, however.
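To make the 256MB threshold concrete, here is a minimal standalone sketch (assuming a 4 KB block size purely for illustration) of how a 32-bit block count silently narrows to 16 bits:

#include <cstdint>
#include <iostream>

int main() {
    constexpr uint32_t blk_size = 4096; // assumed 4 KB block size, for illustration only

    // 256 MiB of data needs 65536 blocks, 257 MiB needs 65792 blocks.
    for (uint64_t data_size : {256ull << 20, 257ull << 20}) {
        const uint32_t nblks_32 = static_cast<uint32_t>(data_size / blk_size);
        const uint16_t nblks_16 = static_cast<uint16_t>(nblks_32); // narrowing at the vdev boundary

        std::cout << (data_size >> 20) << " MiB -> nblks as 32-bit: " << nblks_32
                  << ", as 16-bit: " << nblks_16 << '\n';
        // Prints 65536 vs 0 for 256 MiB, 65792 vs 256 for 257 MiB.
    }
    return 0;
}

At exactly 256 MiB the 16-bit count wraps to 0 (the "SUCCESS with zero blocks" case below); above that it wraps to a value smaller than requested, which is the data-mismatch case.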
Code pointers:
repl_req_ctx::alloc_local_blks
--> auto status = data_service().alloc_blks(sisl::round_up(uint32_cast(data_size), // <<< overflow
BlkDataService::alloc_blks
--> return m_vdev->alloc_blks(nblks, hints, out_blkids); // <<< overflow
MetaBlkService::alloc_meta_blks
--> const auto ret = m_sb_vdev->alloc_blks(nblks, blk_alloc_hints{}, bids); // <<< overflow
When the overflow happens, the underlying blk allocators (both the bitmap and the append blk allocator) return SUCCESS with bids.size() equal to zero. There is no bug in the underlying blk allocator itself; the point is that there is also no assert in them to detect that the input nblks is actually 0, so they simply return SUCCESS with nothing allocated. When that happens, silent data corruption follows.
The fix needs to be done at the vdev layer.
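For illustration, a minimal sketch of the kind of guard the vdev layer could apply before the count narrows to 16 bits. blk_count_t stands in for the 16-bit type referenced above; BlkAllocStatus and its INVALID_INPUT value are placeholders rather than homestore's actual enum, and this is a sketch of the idea, not the proposed patch:

#include <cassert>
#include <cstdint>
#include <limits>

using blk_count_t = uint16_t;                                  // the 16-bit count the allocators consume
enum class BlkAllocStatus { SUCCESS, INVALID_INPUT, FAILED };  // placeholder status enum

// Sketch: validate the caller's 32-bit block count before it is narrowed,
// so an overflow can never reach the allocators as nblks == 0 or a wrapped value.
BlkAllocStatus vdev_alloc_blks_checked(uint32_t requested_nblks) {
    if (requested_nblks == 0 ||
        requested_nblks > std::numeric_limits<blk_count_t>::max()) {
        assert(false && "nblks is zero or exceeds 16-bit blk_count_t at vdev layer");
        return BlkAllocStatus::INVALID_INPUT; // fail loudly instead of "succeeding" with no blocks
    }
    const blk_count_t nblks = static_cast<blk_count_t>(requested_nblks); // now provably lossless
    // ... forward nblks to the underlying blk allocator exactly as today ...
    (void)nblks;
    return BlkAllocStatus::SUCCESS;
}

An alternative along the same lines would be for the vdev layer to split an oversized request into multiple allocations instead of rejecting it; the sketch above only covers the detection part.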