path: root/fs/bcachefs/movinggc.c
Commit message | Author | Age | Files | Lines
* bcachefs: Fix failure to flush moves before sleeping in copygc | Kent Overstreet | 2024-08-24 | 1 | -1/+1
  This fixes an apparent deadlock - rebalance would get stuck trying to take nocow locks because they weren't being released by copygc.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Improve copygc_wait_to_text() | Kent Overstreet | 2024-07-14 | 1 | -3/+8
  printing the raw values can occasionally be very useful
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Enable automatic shrinking for rhashtables | Kent Overstreet | 2024-06-10 | 1 | -3/+4
  Since the key cache shrinker walks the rhashtable, a mostly empty rhashtable leads to really nasty reclaim performance issues.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: bch2_trans_unlock() must always be followed by relock() or begin() | Kent Overstreet | 2024-05-08 | 1 | -0/+2
  We're about to add new asserts for btree_trans locking consistency, and part of that requires that we aren't using the btree_trans while it's unlocked.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: iter/update/trigger/str_hash flag cleanup | Kent Overstreet | 2024-05-08 | 1 | -1/+1
  Combine iter/update/trigger/str_hash flags into a single enum, and x-macroize them for a to_text() function later.
  These flags are all for a specific iter/key/update context, so it makes sense to group them together - iter/update/trigger flags were already given distinct bits, this cleans up and unifies that handling.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Improve bch2_fatal_error() | Kent Overstreet | 2024-03-18 | 1 | -2/+1
  error messages should always include __func__
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: for_each_member_device() now declares loop iter | Kent Overstreet | 2024-01-01 | 1 | -5/+2
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: for_each_btree_key() now declares loop iter | Kent Overstreet | 2024-01-01 | 1 | -2/+0
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: for_each_btree_key_upto() -> for_each_btree_key_old_upto() | Kent Overstreet | 2024-01-01 | 1 | -1/+1
  And for_each_btree_key2_upto -> for_each_btree_key_upto
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: darray_for_each() now declares loop iter | Kent Overstreet | 2024-01-01 | 1 | -1/+0
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: bch_err_(fn|msg) check if should print | Kent Overstreet | 2024-01-01 | 1 | -4/+3
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: copygc shouldn't try moving buckets on error | Daniel Hill | 2024-01-01 | 1 | -4/+12
  Co-developed-by: Kent Overstreet <kent.overstreet@linux.dev>
  Signed-off-by: Daniel Hill <daniel@gluo.nz>
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: copygc should wakeup on shutdown if disabled | Daniel Hill | 2024-01-01 | 1 | -1/+2
  Signed-off-by: Daniel Hill <daniel@gluo.nz>
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: remove dead bch2_evacuate_bucket() | Daniel Hill | 2024-01-01 | 1 | -1/+1
  Signed-off-by: Daniel Hill <daniel@gluo.nz>
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: bch2_btree_write_buffer_flush() -> bch2_btree_write_buffer_tryflush() | Kent Overstreet | 2024-01-01 | 1 | -2/+2
  More accurate naming.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Clean up btree write buffer write ref handling | Kent Overstreet | 2024-01-01 | 1 | -0/+3
  __bch2_btree_write_buffer_flush() now assumes a write ref is already held (as called by the transaction commit path); and the wrappers bch2_write_buffer_flush() and flush_sync() take an explicit write ref.
  This means internally the write buffer code can always use BTREE_INSERT_NOCHECK_RW, instead of in the previous code passing flags around and hoping the NOCHECK_RW flag was always carried around correctly.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: New bucket sector count helpers | Kent Overstreet | 2024-01-01 | 1 | -1/+1
  This introduces bch2_bucket_sectors() and bch2_bucket_sectors_dirty(), prep work for separately accounting stripe sectors.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Extra kthread_should_stop() calls for copygc | Kent Overstreet | 2023-11-28 | 1 | -1/+1
  This fixes a bug where going read-only was taking longer than it should have due to copygc forgetting to check kthread_should_stop().
  Additionally: fix a missing is_kthread check in bch2_move_ratelimit().
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: fix odebug warn and lockdep splat due to on-stack rhashtable | Brian Foster | 2023-11-04 | 1 | -10/+14
  Guenter Roeck reports a lockdep splat and DEBUG_OBJECTS_WORK related warning when bch2_copygc_thread() initializes its rhashtable. The lockdep splat relates to a warning print caused by the fact that the rhashtable exists on the stack but is not annotated as such. This is something that could be addressed by INIT_WORK_ONSTACK(), but rhashtable doesn't expose that control and it probably isn't worth the churn for just one user.
  Instead, dynamically allocate the buckets_in_flight structure and avoid the splat that way.
  Reported-by: Guenter Roeck <linux@roeck-us.net>
  Tested-by: Guenter Roeck <linux@roeck-us.net>
  Signed-off-by: Brian Foster <bfoster@redhat.com>
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Data move path now uses bch2_trans_unlock_long() | Kent Overstreet | 2023-11-04 | 1 | -2/+2
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Ensure copygc does not spin | Kent Overstreet | 2023-11-04 | 1 | -2/+18
  If copygc does no work - finds no fragmented buckets - wait for a bit of IO to happen.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: move: move_stats refactoring | Kent Overstreet | 2023-10-31 | 1 | -0/+1
  data_progress_list is gone - it was redundant with moving_context_list
  The upcoming rebalance rewrite is going to have it using two different move_stats objects with the same moving_context, depending on whether it's scanning or using the rebalance_work btree - this patch plumbs stats around a bit differently so that will work.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: moving_context now owns a btree_trans | Kent Overstreet | 2023-10-31 | 1 | -20/+16
  btree_trans and moving_context are used together, and having the moving_context own the transaction object reduces some plumbing.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Heap allocate btree_trans | Kent Overstreet | 2023-10-22 | 1 | -9/+9
  We're using more stack than we'd like in a number of functions, and btree_trans is the biggest object that we stack allocate.
  But we have to do a heap allocation to initialize it anyways, so there's no real downside to heap allocating the entire thing.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Fix W=12 build errors | Kent Overstreet | 2023-10-22 | 1 | -13/+13
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Fix -Wcompare-distinct-pointer-types in bch2_copygc_get_buckets() | Nathan Chancellor | 2023-10-22 | 1 | -1/+1
  When building bcachefs for 32-bit ARM, there is a warning when using max() to compare an expression involving 'size_t' with an 'unsigned long' literal:
    fs/bcachefs/movinggc.c:159:21: error: comparison of distinct pointer types ('typeof (16UL) *' (aka 'unsigned long *') and 'typeof (buckets_in_flight->nr / 4) *' (aka 'unsigned int *')) [-Werror,-Wcompare-distinct-pointer-types]
      159 |         size_t nr_to_get = max(16UL, buckets_in_flight->nr / 4);
          |                            ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    include/linux/minmax.h:76:19: note: expanded from macro 'max'
       76 | #define max(x, y)       __careful_cmp(x, y, >)
          |                         ^~~~~~~~~~~~~~~~~~~~~~
    include/linux/minmax.h:38:24: note: expanded from macro '__careful_cmp'
       38 |         __builtin_choose_expr(__safe_cmp(x, y), \
          |                               ^~~~~~~~~~~~~~~~
    include/linux/minmax.h:28:4: note: expanded from macro '__safe_cmp'
       28 |                 (__typecheck(x, y) && __no_side_effects(x, y))
          |                  ^~~~~~~~~~~~~~~~~
    include/linux/minmax.h:22:28: note: expanded from macro '__typecheck'
       22 |         (!!(sizeof((typeof(x) *)1 == (typeof(y) *)1)))
          |                    ~~~~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~
    1 error generated.
  On 64-bit architectures, size_t is 'unsigned long', so there is no warning when comparing these two expressions. Use max_t(size_t, ...) for this situation, eliminating the warning.
  Fixes: dd49018737d4 ("bcachefs: Rhashtable based buckets_in_flight for copygc")
  Signed-off-by: Nathan Chancellor <nathan@kernel.org>
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
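  The fix described in this entry can be illustrated outside the kernel. What follows is a minimal userspace sketch, not the kernel's macro: demo_max_t and demo_nr_to_get are hypothetical names, and the macro only casts both operands to the requested type before comparing, which is the property the message relies on when it switches to max_t(size_t, ...).

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for the kernel's max_t(): cast both operands to a
 * single type, then compare. Unlike the kernel macro, this simplified
 * version evaluates its arguments more than once, so it should only be
 * given side-effect-free expressions.
 */
#define demo_max_t(type, x, y) \
	((type)(x) > (type)(y) ? (type)(x) : (type)(y))

/*
 * Mirrors the expression quoted in the diagnostic: ask for at least 16
 * buckets, or a quarter of the number already in flight if that is larger.
 * Because both operands are cast to size_t, the result does not depend on
 * whether size_t is 'unsigned int' or 'unsigned long' on the target.
 */
size_t demo_nr_to_get(size_t nr_in_flight)
{
	return demo_max_t(size_t, 16, nr_in_flight / 4);
}
```

  With this sketch, demo_nr_to_get(400) yields 100, while any input of 64 or less yields the floor of 16.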
* bcachefs: Break up io.c | Kent Overstreet | 2023-10-22 | 1 | -8/+0
  More reorganization, this splits up io.c into
   - io_read.c
   - io_misc.c - fallocate, fpunch, truncate
   - io_write.c
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Convert more code to bch_err_msg() | Kent Overstreet | 2023-10-22 | 1 | -4/+3
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Fix for bch2_copygc() spuriously returning -EEXIST | Kent Overstreet | 2023-10-22 | 1 | -1/+3
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Kill BTREE_INSERT_USE_RESERVE | Kent Overstreet | 2023-10-22 | 1 | -1/+1
  Now that we have journal watermarks and alloc watermarks unified, BTREE_INSERT_USE_RESERVE is redundant and can be deleted.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Kill JOURNAL_WATERMARK | Kent Overstreet | 2023-10-22 | 1 | -1/+1
  This unifies JOURNAL_WATERMARK with BCH_WATERMARK; we're working towards specifying watermarks once in the transaction commit path.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Add a missing rhashtable_destroy() call | Kent Overstreet | 2023-10-22 | 1 | -0/+1
  Fixes https://lore.kernel.org/linux-bcachefs/784c3e6a-75bd-e6ca-535a-43b3e1daf643@kernel.dk/T/#mbf7caf005f960018eba23b58795d06c06c947411
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Rename enum alloc_reserve -> bch_watermark | Kent Overstreet | 2023-10-22 | 1 | -1/+1
  This is prep work for consolidating with JOURNAL_WATERMARK.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Convert -ENOENT to private error codes | Kent Overstreet | 2023-10-22 | 1 | -1/+1
  As with previous conversions, replace -ENOENT uses with more informative private error codes.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: bch2_bkey_get_iter() helpers | Kent Overstreet | 2023-10-22 | 1 | -13/+3
  Introduce new helpers for a common pattern:
    bch2_trans_iter_init();
    bch2_btree_iter_peek_slot();
   - bch2_bkey_get_iter_type() returns -ENOENT if it doesn't find a key of the correct type
   - bch2_bkey_get_val_typed() copies the val out of the btree to a (typically stack allocated) variable; it handles the case where the value in the btree is smaller than the current version of the type, zeroing out the remainder.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
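  The size handling described for bch2_bkey_get_val_typed() can be shown in isolation. This is a minimal userspace sketch that assumes only what the message states (copy what is stored, zero what is missing). demo_copy_val is a hypothetical name, and truncating a source that is larger than the destination is an added assumption, not something the message specifies.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy a stored value of src_size bytes into a destination of dst_size
 * bytes. If the stored value is shorter than the destination (for example
 * because it was written by an older version of the type), the trailing
 * bytes of the destination are zeroed so newer fields read as 0.
 *
 * Assumption for this sketch: an oversized source is truncated.
 */
void demo_copy_val(void *dst, size_t dst_size,
		   const void *src, size_t src_size)
{
	size_t n = src_size < dst_size ? src_size : dst_size;

	memcpy(dst, src, n);
	memset((char *)dst + n, 0, dst_size - n);
}
```

  Zeroing the tail means a caller never reads stale stack contents for fields the stored value predates.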
* bcachefs: Mark bch2_copygc() noinline | Kent Overstreet | 2023-10-22 | 1 | -0/+1
  This works around a "stack frame too large" error.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Kill bch2_verify_bucket_evacuated() | Kent Overstreet | 2023-10-22 | 1 | -7/+0
  With backpointers, it's now impossible for bch2_evacuate_bucket() to be completely reliable: it can race with an extent being partially overwritten or split, which needs a new write buffer flush for the backpointer to be seen.
  This shouldn't be a real issue in practice; the previous patch added a new tracepoint so we'll be able to see more easily if it is.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Rhashtable based buckets_in_flight for copygc | Kent Overstreet | 2023-10-22 | 1 | -85/+127
  Previously, copygc used a fifo for tracking buckets in flight - this had the disadvantage of being fixed size, since we pass references to elements into the move code.
  This restructures it to be a hash table and linked list, since with erasure coding we need to be able to pipeline across an arbitrary number of buckets.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
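  The data-structure change in this entry can be sketched in plain C. This is an illustration under stated assumptions, not the bcachefs code: every demo_* name is hypothetical, and a fixed array of chained slots stands in for the kernel's resizable rhashtable. The point it shows is that heap-allocated entries keep stable addresses (so references can be handed to other code), the hash gives a quick "already in flight?" check, and the list preserves submission order.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define DEMO_HASH_SLOTS 64	/* fixed here; a real rhashtable resizes */

struct demo_bucket {
	uint64_t		id;		/* hypothetical bucket identifier */
	struct demo_bucket	*hash_next;	/* chain within one hash slot */
	struct demo_bucket	*list_next;	/* submission order */
};

struct demo_in_flight {
	struct demo_bucket	*slots[DEMO_HASH_SLOTS];
	struct demo_bucket	*first, *last;
	size_t			nr;
};

static struct demo_bucket *demo_lookup(struct demo_in_flight *f, uint64_t id)
{
	struct demo_bucket *b = f->slots[id % DEMO_HASH_SLOTS];

	while (b && b->id != id)
		b = b->hash_next;
	return b;
}

/* Returns the new entry, or NULL if id is already in flight (or on OOM). */
struct demo_bucket *demo_add(struct demo_in_flight *f, uint64_t id)
{
	struct demo_bucket *b;

	if (demo_lookup(f, id))
		return NULL;

	b = calloc(1, sizeof(*b));
	if (!b)
		return NULL;

	b->id = id;
	b->hash_next = f->slots[id % DEMO_HASH_SLOTS];
	f->slots[id % DEMO_HASH_SLOTS] = b;

	if (f->last)
		f->last->list_next = b;
	else
		f->first = b;
	f->last = b;
	f->nr++;
	return b;
}

/* Removes and frees the oldest entry; returns false if nothing is in flight. */
bool demo_retire_oldest(struct demo_in_flight *f)
{
	struct demo_bucket *b = f->first, **p;

	if (!b)
		return false;

	f->first = b->list_next;
	if (!f->first)
		f->last = NULL;

	/* Unlink from its hash chain. */
	for (p = &f->slots[b->id % DEMO_HASH_SLOTS]; *p != b; p = &(*p)->hash_next)
		;
	*p = b->hash_next;

	f->nr--;
	free(b);
	return true;
}
```

  Unlike a fixed-size fifo, nothing here bounds the number of entries, and adding an entry never moves an existing one.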
* bcachefs: Improved copygc wait debugging | Kent Overstreet | 2023-10-22 | 1 | -1/+9
  This just adds a line for how long copygc has been waiting to sysfs copygc_wait, helpful for debugging why copygc isn't running.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Fix an assert in copygc thread shutdown path | Kent Overstreet | 2023-10-22 | 1 | -1/+1
  We're not supposed to have nested (locked) btree_trans on the stack: this means copygc shutdown needs to exit our btree_trans before exiting the move_ctxt, which calls bch2_write().
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: bch2_bucket_is_movable() -> BTREE_ITER_CACHED | Kent Overstreet | 2023-10-22 | 1 | -1/+1
  BTREE_ITER_CACHED should really be the default for cached btrees - this is an easy mistake to make.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Improved copygc pipelining | Kent Overstreet | 2023-10-22 | 1 | -38/+153
  This improves copygc pipelining across multiple buckets: we now track each in flight bucket we're evacuating, with separate moving_contexts.
  This means that whereas previously we had to wait for outstanding moves to complete to ensure we didn't try to evacuate the same bucket twice, we can now just check buckets we want to evacuate against the pending list.
  This also means we can run the verify_bucket_evacuated() check without killing pipelining - meaning it can now always be enabled, not just on debug builds.
  This is going to be important for the upcoming erasure coding work, where moving IOs that are being erasure coded will now skip the initial replication step; instead the IOs will wait on the stripe to complete.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Mark stripe buckets with correct data type | Kent Overstreet | 2023-10-22 | 1 | -3/+7
  Currently, we don't use bucket data type for tracking whether buckets are part of a stripe; parity buckets are BCH_DATA_parity, but data buckets in a stripe are BCH_DATA_user. There's a separate counter, buckets_ec, outside the BCH_DATA_TYPES system for tracking number of buckets on a device that are part of a stripe.
  The trouble with this approach is that it's too coarse grained, and we need better information on fragmentation for debugging copygc.
  With this patch, data buckets in a stripe are now tracked as BCH_DATA_stripe buckets.
  This doesn't yet differentiate between erasure coded and non-erasure coded data in a stripe bucket, nor do we yet track empty data buckets in stripes.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: bch2_copygc_wait_to_text() | Kent Overstreet | 2023-10-22 | 1 | -0/+12
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Fragmentation LRU | Kent Overstreet | 2023-10-22 | 1 | -101/+70
  Now that we have much more efficient updates to the LRU btree, this patch adds a new LRU that indexes buckets by fragmentation. This means copygc no longer has to scan every bucket to find buckets that need to be evacuated.
  Changes:
   - A new field in bch_alloc_v4, fragmentation_lru - this corresponds to the bucket's position in the fragmentation LRU. We add a new field for this instead of calculating it as needed because we may make the fragmentation LRU optional; this field indicates whether a bucket is on the fragmentation LRU. Also, zoned devices will introduce variable bucket sizes; explicitly recording the LRU position will be safer for them.
   - A new copygc path for using the fragmentation LRU instead of scanning every bucket and building up an in-memory heap.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Copygc now uses backpointers | Kent Overstreet | 2023-10-22 | 1 | -206/+30
  Previously, copygc needed to walk the entire extents & reflink btrees to find extents that needed to be moved.
  Now that we have backpointers, this patch implements bch2_evacuate_bucket() in the move code, which copygc now uses for evacuating mostly empty buckets.
  Also, thanks to the new backpointers code, copygc can now move btree nodes.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Better inlining for bch2_alloc_to_v4_mut | Kent Overstreet | 2023-10-22 | 1 | -8/+9
  This separates out the slowpath into a separate function, and inlines bch2_alloc_v4_mut into bch2_trans_start_alloc_update(), the main place it's called.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Convert EROFS errors to private error codes | Kent Overstreet | 2023-10-22 | 1 | -1/+1
  More error code improvements - this gets us more useful error messages.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Suppress -EROFS messages when shutting down | Kent Overstreet | 2023-10-22 | 1 | -1/+1
  This isn't actually an error condition, this just indicates a normal shutdown - no reason for these to be in the log.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Fixes for building in userspace | Kent Overstreet | 2023-10-22 | 1 | -1/+1
   - Marking a non-static function as inline doesn't actually work and is now causing problems - drop that
   - Introduce BCACHEFS_LOG_PREFIX for when we want to prefix log messages with bcachefs (filesystem name)
   - Userspace doesn't have real percpu variables (maybe we can get this fixed someday), put an #ifdef around bch2_disk_reservation_add() fastpath
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>