path: root/fs/bcachefs/btree_update.h
Commit message | Author | Age | Files | Lines
* bcachefs: bch2_btree_insert() - add btree iter flags | Ariel Miculas | 2024-07-14 | 1 | -2/+3
  The commit 65bd44239727 ("bcachefs: bch2_btree_insert_trans() no longer specifies BTREE_ITER_cached") removes BTREE_ITER_cached from bch2_btree_insert_trans, which causes the update_inode function from bcachefs-tools to take a long time (~20s).
  Add an iter_flags parameter to bch2_btree_insert, so the users can specify iter update trigger flags, such as BTREE_ITER_cached.
  Signed-off-by: Ariel Miculas <ariel.miculas@gmail.com>
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Disk space accounting rewrite | Kent Overstreet | 2024-07-14 | 1 | -8/+1
  Main part of the disk accounting rewrite.
  This is a wholesale rewrite of the existing disk space accounting, which relies on percpu counters that are sharded by journal buffer, and rolled up and added to each journal write.
  With the new scheme, every set of counters is a distinct key in the accounting btree; this fixes scaling limitations of the old scheme, where counters took up space in each journal entry and required multiple percpu counters.
  Now, in memory accounting requires a single set of percpu counters - not multiple for each in flight journal buffer - and in the future we'll probably also have counters that don't use in memory percpu counters, they're not strictly required.
  An accounting update is now a normal btree update, using the btree write buffer path. At transaction commit time, we apply accounting updates to the in memory counters, which are percpu counters indexed in an eytzinger tree by the accounting key.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
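The commit message above says in-memory counters are indexed in an eytzinger tree by accounting key. As a rough illustration of that layout only (not the bcachefs implementation), the sketch below stores sorted entries in breadth-first order in a 1-indexed array and applies a delta to the matching counter. All names here (`acct_entry`, `eytz_build`, `eytz_find`, `acct_apply`) are hypothetical, and a plain `int64_t` stands in for a percpu counter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical accounting entry: a key and a plain counter (not percpu). */
struct acct_entry {
	uint64_t key;
	int64_t  counter;
};

/*
 * Copy sorted[0..n) into out[1..n] in eytzinger (breadth-first) order:
 * node k has children 2k and 2k+1. An in-order walk of that implicit
 * tree visits the entries in sorted order. Returns the number of
 * entries consumed so far; call with i = 0, k = 1.
 */
static size_t eytz_build(const struct acct_entry *sorted, size_t n,
			 struct acct_entry *out, size_t i, size_t k)
{
	if (k <= n) {
		i = eytz_build(sorted, n, out, i, 2 * k);
		out[k] = sorted[i++];
		i = eytz_build(sorted, n, out, i, 2 * k + 1);
	}
	return i;
}

/* Find the entry for key, or NULL if it is not present. */
static struct acct_entry *eytz_find(struct acct_entry *tree, size_t n,
				    uint64_t key)
{
	size_t k = 1;

	assert(tree != NULL);
	while (k <= n) {
		if (tree[k].key == key)
			return &tree[k];
		/* Go right (2k+1) if the node's key is smaller, else left (2k). */
		k = 2 * k + (tree[k].key < key);
	}
	return NULL;
}

/* Apply one accounting delta; returns 0 on success, -1 for an unknown key. */
static int acct_apply(struct acct_entry *tree, size_t n,
		      uint64_t key, int64_t delta)
{
	struct acct_entry *e = eytz_find(tree, n, key);

	if (!e)
		return -1;
	e->counter += delta;
	return 0;
}
```

The breadth-first layout keeps the top levels of the search tree close together in memory, which is the usual reason for preferring it over binary search on a sorted array.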
* bcachefs: Accumulate accounting keys in journal replay | Kent Overstreet | 2024-07-14 | 1 | -1/+13
  Until accounting keys hit the btree, they are deltas, not new versions of the existing key; this means we have to teach journal replay to accumulate them.
  Additionally, the journal doesn't track precisely which entries have been flushed to the btree; it only tracks a range of entries that may possibly still need to be flushed.
  That means we need to compare accounting keys against the version in the btree and only flush updates that are newer.
  There's another wrinkle with the write buffer: if the write buffer starts flushing accounting keys before journal replay has finished flushing accounting keys, journal replay will see the version number from the new updates and updates from the journal will be lost.
  To avoid this, journal replay has to flush accounting keys first, and we'll be adding a flag so that write buffer flush knows to hold accounting keys until then.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
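The replay rule described above (accounting keys are deltas, and only deltas newer than the version already in the btree may be applied) can be modelled in a few lines. This is a simplified sketch under stated assumptions, not the kernel code: `acct_key` and `replay_accounting` are hypothetical names, versions are plain integers, and each delta is compared against the btree version as it was when replay started.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical accounting key: an id, a version, and a value or delta. */
struct acct_key {
	uint64_t id;
	uint64_t version;
	int64_t  value;
};

/*
 * Accumulate journal deltas into the btree's copy of one accounting key.
 * Deltas at or below the version the btree held when replay started were
 * already flushed, so they are skipped. Returns how many were applied.
 */
static size_t replay_accounting(struct acct_key *btree,
				const struct acct_key *deltas, size_t n)
{
	uint64_t base;
	size_t applied = 0;

	assert(btree != NULL && (deltas != NULL || n == 0));
	base = btree->version;

	for (size_t i = 0; i < n; i++) {
		if (deltas[i].id != btree->id)
			continue;	/* belongs to a different counter */
		if (deltas[i].version <= base)
			continue;	/* already reflected in the btree */

		btree->value += deltas[i].value;
		if (deltas[i].version > btree->version)
			btree->version = deltas[i].version;
		applied++;
	}
	return applied;
}
```

Comparing against the starting version rather than the running one is what makes the ordering concern in the commit message visible: if something else bumped the btree version mid-replay, later journal deltas would be skipped and lost.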
* bcachefs: metadata version bucket_stripe_sectors | Kent Overstreet | 2024-07-14 | 1 | -8/+0
  New on disk format version for bch_alloc->stripe_sectors and BCH_DATA_unstriped - accounting for unstriped data in stripe buckets.
  Upgrade/downgrade requires regenerating alloc info - but only if erasure coding is in use.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: bch2_trans_commit_flags_to_text() | Kent Overstreet | 2024-05-08 | 1 | -0/+2
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: iter/update/trigger/str_hash flag cleanup | Kent Overstreet | 2024-05-08 | 1 | -6/+6
  Combine iter/update/trigger/str_hash flags into a single enum, and x-macroize them for a to_text() function later.
  These flags are all for a specific iter/key/update context, so it makes sense to group them together - iter/update/trigger flags were already given distinct bits, this cleans up and unifies that handling.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
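For readers unfamiliar with the x-macro pattern this commit refers to, here is a minimal, self-contained sketch: one list macro generates the bit numbers, the flag values, and the name table that a to_text() helper prints from. The flag names (`fast`, `quiet`, `strict`) and every `EXAMPLE_`/`example_` identifier are made up for illustration and are not the bcachefs flags.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Single source of truth: one x() entry per flag (hypothetical names). */
#define EXAMPLE_FLAGS()		\
	x(fast)			\
	x(quiet)		\
	x(strict)

/* Bit numbers: EXAMPLE_FLAG_BIT_fast = 0, ... */
enum example_flag_bits {
#define x(n) EXAMPLE_FLAG_BIT_##n,
	EXAMPLE_FLAGS()
#undef x
	EXAMPLE_FLAG_NR
};

/* Flag values: EXAMPLE_FLAG_fast = 1 << 0, ... */
enum example_flags {
#define x(n) EXAMPLE_FLAG_##n = 1U << EXAMPLE_FLAG_BIT_##n,
	EXAMPLE_FLAGS()
#undef x
};

/* Name table generated from the same list, indexed by bit number. */
static const char * const example_flag_strs[] = {
#define x(n) #n,
	EXAMPLE_FLAGS()
#undef x
};

/* Write the names of all set flags into buf, separated by commas. */
static void example_flags_to_text(char *buf, size_t len, unsigned flags)
{
	size_t pos = 0;

	assert(buf != NULL && len > 0);
	buf[0] = '\0';

	for (unsigned i = 0; i < EXAMPLE_FLAG_NR; i++) {
		int n;

		if (!(flags & (1U << i)))
			continue;
		n = snprintf(buf + pos, len - pos, "%s%s",
			     pos ? "," : "", example_flag_strs[i]);
		if (n < 0 || (size_t)n >= len - pos)
			break;	/* output truncated; stop here */
		pos += (size_t)n;
	}
}
```

Because the enum and the string table are expanded from the same list, adding a flag is a one-line change and the two cannot drift apart.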
* bcachefs: bch2_btree_bit_mod() | Kent Overstreet | 2024-03-13 | 1 | -0/+1
  Provide a non-write buffer version of bch2_btree_bit_mod_buffered(), for the subvolume children btree.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: bch2_btree_bit_mod -> bch2_btree_bit_mod_buffered | Kent Overstreet | 2024-03-13 | 1 | -2/+2
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Clean up btree_trans | Kent Overstreet | 2024-01-01 | 1 | -1/+1
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: btree_insert_entry -> btree_path_idx_t | Kent Overstreet | 2024-01-01 | 1 | -1/+1
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs; bch2_path_put() -> btree_path_idx_t | Kent Overstreet | 2024-01-01 | 1 | -1/+1
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: trans_for_each_update() now declares loop iter | Kent Overstreet | 2024-01-01 | 1 | -3/+1
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: kill btree_trans->wb_updates | Kent Overstreet | 2024-01-01 | 1 | -9/+19
  the btree write buffer path now creates a journal entry directly
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Improve trans->extra_journal_entries | Kent Overstreet | 2024-01-01 | 1 | -3/+20
  Instead of using a darray, we now allocate journal entries for the transaction commit path with our normal bump allocator - with an inlined fastpath, and using btree_transaction_stats to remember how much to initially allocate so as to avoid transaction restarts.
  This is prep work for converting write buffer updates to use this mechanism.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: No need to allocate keys for write buffer | Kent Overstreet | 2024-01-01 | 1 | -1/+6
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Rename BTREE_INSERT flags | Kent Overstreet | 2024-01-01 | 1 | -17/+19
  BTREE_INSERT flags are actually transaction commit flags - rename them for clarity.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: bch_str_hash_flags_t | Kent Overstreet | 2024-01-01 | 1 | -5/+0
  Create a separate enum for str_hash flags - instead of abusing the btree_insert_flags enum - and create a __bitwise typedef for sparse typechecking.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Kill dead BTREE_INSERT flags | Kent Overstreet | 2024-01-01 | 1 | -6/+0
  BTREE_INSERT_NOWAIT and BTREE_INSERT_GC_LOCK_HELD are no longer used, and can be deleted.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Remove duplicate include | Jiapeng Chong | 2023-10-22 | 1 | -1/+0
  ./fs/bcachefs/btree_update.h: journal.h is included more than once.
  Reported-by: Abaci Robot <abaci@linux.alibaba.com>
  Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=6573
  Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Heap allocate btree_trans | Kent Overstreet | 2023-10-22 | 1 | -19/+6
  We're using more stack than we'd like in a number of functions, and btree_trans is the biggest object that we stack allocate.
  But we have to do a heap allocation to initialize it anyways, so there's no real downside to heap allocating the entire thing.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Fix W=12 build errors | Kent Overstreet | 2023-10-22 | 1 | -3/+3
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: BTREE_ID_logged_ops | Kent Overstreet | 2023-10-22 | 1 | -0/+1
  Add a new btree for long running logged operations - i.e. for logging operations that we can't do within a single btree transaction, so that they can be resumed if we crash.
  Keys in the logged operations btree will represent operations in progress, with the state of the operation stored in the value.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: __bch2_btree_insert() -> bch2_btree_insert_trans() | Kent Overstreet | 2023-10-22 | 1 | -1/+1
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Fix assorted checkpatch nits | Kent Overstreet | 2023-10-22 | 1 | -2/+2
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: bch2_trans_update_extent_overwrite() | Kent Overstreet | 2023-10-22 | 1 | -2/+3
  Factor out a new helper, to be used when fsck has to repair overlapping extents.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Move some declarations to the correct header | Kent Overstreet | 2023-10-22 | 1 | -9/+0
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: bch2_btree_bit_mod() | Kent Overstreet | 2023-10-22 | 1 | -0/+2
  New helper for bitset btrees.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: support btree updates of prejournaled keys | Brian Foster | 2023-10-22 | 1 | -0/+2
  Introduce support for prejournaled key updates. This allows a transaction to commit an update for a key that already exists (and is pinned) in the journal. This is required for btree write buffer updates as the current scheme of journaling both on write buffer insertion and write buffer (slow path) flush is unsafe in certain crash recovery scenarios.
  Create a small trans update wrapper to pass along the seq where the key resides into the btree_insert_entry. From there, trans commit passes the seq into the btree insert path where it is used to manage the journal pin for the associated btree leaf.
  Note that this patch only introduces the underlying mechanism and otherwise includes no functional changes.
  Signed-off-by: Brian Foster <bfoster@redhat.com>
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Kill BTREE_INSERT_USE_RESERVE | Kent Overstreet | 2023-10-22 | 1 | -16/+13
  Now that we have journal watermarks and alloc watermarks unified, BTREE_INSERT_USE_RESERVE is redundant and can be deleted.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Kill JOURNAL_WATERMARK | Kent Overstreet | 2023-10-22 | 1 | -2/+2
  This unifies JOURNAL_WATERMARK with BCH_WATERMARK; we're working towards specifying watermarks once in the transaction commit path.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Improve bch2_bkey_make_mut() | Kent Overstreet | 2023-10-22 | 1 | -3/+5
  bch2_bkey_make_mut() now takes the bkey_s_c by reference and points it at the new, mutable key.
  This helps in some fsck paths that may have multiple repair operations on the same key.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Fix corruption with writeable snapshots | Kent Overstreet | 2023-10-22 | 1 | -0/+23
  When partially overwriting an extent in an older snapshot, the existing extent has to be split.
  If the existing extent was overwritten in a different (sibling) snapshot, we have to ensure that the split won't be visible in the sibling snapshot.
  data_update.c already has code for this, bch2_insert_snapshot_writeouts() - we just need to move it into btree_update_leaf.c and change bch2_trans_update_extent() to use it as well.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: bch2_bkey_get_empty_slot() | Kent Overstreet | 2023-10-22 | 1 | -0/+3
  Add a new helper for allocating a new slot in a btree.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: bch2_bkey_make_mut() now calls bch2_trans_update() | Kent Overstreet | 2023-10-22 | 1 | -6/+32
  It's safe to call bch2_trans_update with a k/v pair where the value hasn't been filled out, as long as the key part has been and the value is filled out by transaction commit time.
  This patch folds the bch2_trans_update() call into bch2_bkey_make_mut(), eliminating a bit of boilerplate.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: bch2_bkey_get_mut() now calls bch2_trans_update() | Kent Overstreet | 2023-10-22 | 1 | -2/+31
  It's safe to call bch2_trans_update with a k/v pair where the value hasn't been filled out, as long as the key part has been and the value is filled out by transaction commit time.
  This patch folds the bch2_trans_update() call into bch2_bkey_get_mut(), eliminating a bit of boilerplate.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: bch2_bkey_alloc() now calls bch2_trans_update() | Kent Overstreet | 2023-10-22 | 1 | -9/+14
  It's safe to call bch2_trans_update with a k/v pair where the value hasn't been filled out, as long as the key part has been and the value is filled out by transaction commit time.
  This patch folds the bch2_trans_update() call into bch2_bkey_alloc(), eliminating a bit of boilerplate.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: bch2_bkey_get_mut() improvements | Kent Overstreet | 2023-10-22 | 1 | -29/+72
  - bch2_bkey_get_mut() now handles types increasing in size, allocating a buffer for the type's current size when necessary
  - bch2_bkey_make_mut_typed()
  - bch2_bkey_get_mut() now initializes the iterator, like bch2_bkey_get_iter()
  Also, refactor so that most of the code is in functions - now macros are only used for wrappers.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Move bch2_bkey_make_mut() to btree_update.h | Kent Overstreet | 2023-10-22 | 1 | -0/+43
  It's for doing updates - this is where it belongs, and the next patches will be changing these helpers to use items from btree_update.h.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Rip out code for storing backpointers in alloc keys | Kent Overstreet | 2023-10-22 | 1 | -0/+1
  We don't store backpointers in alloc keys anymore, since we gained the btree write buffer.
  This patch drops support for backpointers in alloc keys, and revs the on disk format version so that we know a fsck is required.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: use reservation for log messages during recovery | Brian Foster | 2023-10-22 | 1 | -0/+1
  If we block on journal reservation attempting to log journal messages during recovery, particularly for the first message(s) before we start doing actual work, chances are the filesystem ends up deadlocked.
  Allow logged messages to use reserved journal space to mitigate this problem. In the worst case where no space is available whatsoever, this at least allows the fs to recognize that the journal is stuck and fail the mount gracefully.
  Signed-off-by: Brian Foster <bfoster@redhat.com>
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: remove unused bch2_trans_log_msg() | Brian Foster | 2023-10-22 | 1 | -1/+0
  Signed-off-by: Brian Foster <bfoster@redhat.com>
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: When shutting down, flush btree node writes last | Kent Overstreet | 2023-10-22 | 1 | -0/+3
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: bch2_btree_insert_nonextent() | Kent Overstreet | 2023-10-22 | 1 | -0/+3
  This adds a new helper to delete some redundant code in bch2_trans_update_extent().
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: let __bch2_btree_insert() pass in flags | Daniel Hill | 2023-10-22 | 1 | -1/+2
  This patch is prep work for the following patch.
  Signed-off-by: Daniel Hill <daniel@gluo.nz>
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Btree write buffer | Kent Overstreet | 2023-10-22 | 1 | -0/+12
  This adds a new method of doing btree updates - a straight write buffer, implemented as a flat fixed size array.
  This is only useful when we don't need to read from the btree in order to do the update, and when reading is infrequent - perfect for the LRU btree.
  This will make LRU btree updates fast enough that we'll be able to use it for persistently indexing buckets by fragmentation, which will be a massive boost to copygc performance.
  Changes:
  - A new btree_insert_type enum, for btree_insert_entries. Specifies btree, btree key cache, or btree write buffer.
  - bch2_trans_update_buffered(): updates via the btree write buffer don't need a btree path, so we need a new update path.
  - Transaction commit path changes: The update to the btree write buffer both mutates global state, and can fail if there isn't currently room. Therefore we do all write buffer updates in the transaction all at once, and also if it fails we have to revert filesystem usage counter changes. If there isn't room we flush the write buffer in the transaction commit error path and retry.
  - A new persistent option, for specifying the number of entries in the write buffer.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
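The "flat fixed size array, flush and retry when full" behaviour described above can be shown with a toy model. This is only a sketch of the idea under simplifying assumptions: the "btree" is a small array indexed by position, there is no journal or locking, and every name (`write_buffer`, `toy_btree`, `wb_insert`, `wb_flush`, `commit_update`, `WB_SIZE`) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define WB_SIZE		4	/* hypothetical buffer capacity */
#define TOY_BTREE_SIZE	16	/* hypothetical key space of the toy btree */

struct wb_key {
	uint64_t pos;
	int64_t  val;
};

/* Flat, fixed-size write buffer: updates are appended without reading the btree. */
struct write_buffer {
	struct wb_key keys[WB_SIZE];
	size_t nr;
};

/* Stand-in for the btree: one value per position, plus a flush counter. */
struct toy_btree {
	int64_t vals[TOY_BTREE_SIZE];
	size_t  flushes;
};

/* Append an update; fails with -1 when the buffer has no room. */
static int wb_insert(struct write_buffer *wb, struct wb_key k)
{
	if (wb->nr == WB_SIZE)
		return -1;
	wb->keys[wb->nr++] = k;
	return 0;
}

/* Apply buffered updates in insertion order, so later updates win. */
static void wb_flush(struct write_buffer *wb, struct toy_btree *b)
{
	for (size_t i = 0; i < wb->nr; i++) {
		assert(wb->keys[i].pos < TOY_BTREE_SIZE);
		b->vals[wb->keys[i].pos] = wb->keys[i].val;
	}
	wb->nr = 0;
	b->flushes++;
}

/* Commit path: try the buffer; if it is full, flush once and retry. */
static int commit_update(struct write_buffer *wb, struct toy_btree *b,
			 struct wb_key k)
{
	if (wb_insert(wb, k) == 0)
		return 0;
	wb_flush(wb, b);
	return wb_insert(wb, k);
}
```

Note that a reader of the toy btree does not see buffered updates until a flush happens, which is why the commit message restricts the technique to btrees where reads are infrequent.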
* bcachefs: Kill trans->flags | Kent Overstreet | 2023-10-22 | 1 | -3/+2
  Recursive transaction commits are occasionally necessary - in particular, for the upcoming btree write buffer's flush path.
  This avoids bugs due to trans->flags being accidentally mutated mid-commit, which can cause c->writes refcount leaks.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Fix bch2_trans_reset_updates() | Kent Overstreet | 2023-10-22 | 1 | -0/+8
  This should have been resetting trans->fs_usage_deltas as well.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Log more messages in the journal | Kent Overstreet | 2023-10-22 | 1 | -1/+2
  This patch
  - Adds a mechanism for queuing up journal entries prior to the journal being started, which will be used for early journal log messages
  - Adds bch2_fs_log_msg() and improves bch2_trans_log_msg(), which now take format strings. bch2_fs_log_msg() can be used before or after the journal has been started, and will use the appropriate mechanism.
  - Deletes the now obsolete bch2_journal_log_msg()
  - And adds more log messages to the recovery path - messages for journal/filesystem started, journal entries being blacklisted, and journal replay starting/finishing.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: bch2_btree_insert_node() no longer uses lock_write_nofail | Kent Overstreet | 2023-10-22 | 1 | -2/+2
  Now that we have an error path plumbed through, there's no need to be using bch2_btree_node_lock_write_nofail().
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: EINTR -> BCH_ERR_transaction_restart | Kent Overstreet | 2023-10-22 | 1 | -1/+0
  Now that we have error codes, with subtypes, we can switch to our own error code for transaction restarts - and even better, a distinct error code for each transaction restart reason: clearer code and better debugging.
  Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>