aboutsummaryrefslogtreecommitdiff
path: root/fs/bcachefs/ec.c (follow)
Commit message (Collapse)AuthorAgeFilesLines
* bcachefs: Make bkey_fsck_err() a wrapper around fsck_err()Kent Overstreet2024-08-131-8/+7
| | | | | | | | | | | | | | | | | | bkey_fsck_err() was added as an interface that looks like fsck_err(), but previously all it did was ensure that the appropriate error counter was incremented in the superblock. This is a cleanup and bugfix patch that converts it to a wrapper around fsck_err(). This is needed to fix an issue with the upgrade path to disk_accounting_v3, where the "silent fix" error list now includes bkey_fsck errors; fsck_err() handles this in a unified way, and since we need to change printing of bkey fsck errors from the caller to the inner bkey_fsck_err() calls, this ends up being a pretty big change. Als,, rename .invalid() methods to .validate(), for clarity, while we're changing the function signature anyways (to drop the printbuf argument). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: ec should not allocate from ro devsKent Overstreet2024-08-071-0/+3
| | | | | | This fixes a device removal deadlock when using erasure coding. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Improved allocator debugging for ecKent Overstreet2024-08-071-11/+20
| | | | | | chasing down a device removal deadlock with erasure coding Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* Merge tag 'mm-nonmm-stable-2024-07-21-15-07' of ↵Linus Torvalds2024-07-211-25/+51
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull non-MM updates from Andrew Morton: - In the series "treewide: Refactor heap related implementation", Kuan-Wei Chiu has significantly reworked the min_heap library code and has taught bcachefs to use the new more generic implementation. - Yury Norov's series "Cleanup cpumask.h inclusion in core headers" reworks the cpumask and nodemask headers to make things generally more rational. - Kuan-Wei Chiu has sent along some maintenance work against our sorting library code in the series "lib/sort: Optimizations and cleanups". - More library maintainance work from Christophe Jaillet in the series "Remove usage of the deprecated ida_simple_xx() API". - Ryusuke Konishi continues with the nilfs2 fixes and clanups in the series "nilfs2: eliminate the call to inode_attach_wb()". - Kuan-Ying Lee has some fixes to the gdb scripts in the series "Fix GDB command error". - Plus the usual shower of singleton patches all over the place. Please see the relevant changelogs for details. * tag 'mm-nonmm-stable-2024-07-21-15-07' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (98 commits) ia64: scrub ia64 from poison.h watchdog/perf: properly initialize the turbo mode timestamp and rearm counter tsacct: replace strncpy() with strscpy() lib/bch.c: use swap() to improve code test_bpf: convert comma to semicolon init/modpost: conditionally check section mismatch to __meminit* init: remove unused __MEMINIT* macros nilfs2: Constify struct kobj_type nilfs2: avoid undefined behavior in nilfs_cnt32_ge macro math: rational: add missing MODULE_DESCRIPTION() macro lib/zlib: add missing MODULE_DESCRIPTION() macro fs: ufs: add MODULE_DESCRIPTION() lib/rbtree.c: fix the example typo ocfs2: add bounds checking to ocfs2_check_dir_entry() fs: add kernel-doc comments to ocfs2_prepare_orphan_dir() coredump: simplify zap_process() selftests/fpu: add missing MODULE_DESCRIPTION() macro compiler.h: simplify data_race() macro build-id: require program headers to be right after ELF header resource: add missing MODULE_DESCRIPTION() ...
| * bcachefs: remove heap-related macros and switch to generic min_heapKuan-Wei Chiu2024-06-241-25/+51
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Drop the heap-related macros from bcachefs and replacing them with the generic min_heap implementation from include/linux. By doing so, code readability is improved by using functions instead of macros. Moreover, the min_heap implementation in include/linux adopts a bottom-up variation compared to the textbook version currently used in bcachefs. This bottom-up variation allows for approximately 50% reduction in the number of comparison operations during heap siftdown, without changing the number of swaps, thus making it more efficient. [visitorckw@gmail.com: fix missing assignment of minimum element] Link: https://lkml.kernel.org/r/20240602174828.1955320-1-visitorckw@gmail.com Link: https://lkml.kernel.org/ioyfizrzq7w7mjrqcadtzsfgpuntowtjdw5pgn4qhvsdp4mqqg@nrlek5vmisbu Link: https://lkml.kernel.org/r/20240524152958.919343-17-visitorckw@gmail.com Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com> Reviewed-by: Ian Rogers <irogers@google.com> Acked-by: Kent Overstreet <kent.overstreet@linux.dev> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Bagas Sanjaya <bagasdotme@gmail.com> Cc: Brian Foster <bfoster@redhat.com> Cc: Ching-Chun (Jim) Huang <jserv@ccns.ncku.edu.tw> Cc: Coly Li <colyli@suse.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Matthew Sakai <msakai@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
* | bcachefs: Fix missing BTREE_TRIGGER_bucket_invalidate flagKent Overstreet2024-07-141-1/+1
| | | | | | | | | | | | This fixes an accounting mismatch for cached data. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* | bcachefs: Convert gc to new accountingKent Overstreet2024-07-141-55/+42
| | | | | | | | | | | | | | | | | | | | Rewrite fsck/gc for the new accounting scheme. This adds a second set of in-memory accounting counters for gc to use; like with other parts of gc we run all trigger in TRIGGER_GC mode, then compare what we calculated to existing in-memory accounting at the end. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* | bcachefs: Disk space accounting rewriteKent Overstreet2024-07-141-11/+15
|/ | | | | | | | | | | | | | | | | | | | | | | | | Main part of the disk accounting rewrite. This is a wholesale rewrite of the existing disk space accounting, which relies on percepu counters that are sharded by journal buffer, and rolled up and added to each journal write. With the new scheme, every set of counters is a distinct key in the accounting btree; this fixes scaling limitations of the old scheme, where counters took up space in each journal entry and required multiple percpu counters. Now, in memory accounting requires a single set of percpu counters - not multiple for each in flight journal buffer - and in the future we'll probably also have counters that don't use in memory percpu counters, they're not strictly required. An accounting update is now a normal btree update, using the btree write buffer path. At transaction commit time, we apply accounting updates to the in memory counters, which are percpu counters indexed in an eytzinger tree by the accounting key. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Check for invalid bucket from bucket_gen(), gc_bucket()Kent Overstreet2024-06-101-6/+20
| | | | | | | Turn more asserts into proper recoverable error paths. Reported-by: syzbot+246b47da27f8e7e7d6fb@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: btree_gc can now handle unknown btreesKent Overstreet2024-05-281-1/+1
| | | | | | | Compatibility fix - we no longer have a separate table for which order gc walks btrees in, and special case the stripes btree directly. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: s/bkey_invalid_flags/bch_validate_flagsKent Overstreet2024-05-091-1/+1
| | | | | | | We're about to start using bch_validate_flags for superblock section validation - it's no longer bkey specific. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: bch2_dev_get_ioref() checks for device not presentKent Overstreet2024-05-091-11/+10
| | | | Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: ptr_stale() -> dev_ptr_stale()Kent Overstreet2024-05-081-2/+2
| | | | Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: ec_validate_checksums() -> bch2_dev_tryget()Kent Overstreet2024-05-081-10/+12
| | | | Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: ob_dev()Kent Overstreet2024-05-081-5/+2
| | | | | | | Wrapper around bch2_dev_have_ref() for open_buckets; we do guarantee that the device an open_bucket points to exists. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Kill bch2_dev_bkey_exists() in backpointer codeKent Overstreet2024-05-081-2/+3
| | | | Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: PTR_BUCKET_POS() now takes bch_devKent Overstreet2024-05-081-22/+26
| | | | Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: bch2_bucket_ref_update() now takes bch_devKent Overstreet2024-05-081-5/+14
| | | | Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: bch2_bkey_drop_ptrs() declares loop iterKent Overstreet2024-05-081-1/+1
| | | | Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: simplify bch2_trans_start_alloc_update()Kent Overstreet2024-05-081-13/+3
| | | | Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: __mark_stripe_bucket() now takes bch_alloc_v4Kent Overstreet2024-05-081-62/+38
| | | | Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: kill bch2_dev_usage_update_m()Kent Overstreet2024-05-081-3/+3
| | | | | | by using bucket_m_to_alloc() more, we can get some nice code cleanup. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: alloc_data_type_set()Kent Overstreet2024-05-081-1/+1
| | | | Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Run bch2_check_fix_ptrs() via triggersKent Overstreet2024-05-081-1/+4
| | | | | | | | | | | | | | | | | Currently, the reflink_p gc trigger does repair as well - turning a reflink_p key into an error key if the reflink_v it points to doesn't exist. This won't work with online check/repair, because the repair path once online will be subject to transaction restarts, but BTREE_TRIGGER_gc is not idempotant - we can't run it multiple times if we get a transaction restart. So we need to split these paths; to do so this patch calls check_fix_ptrs() by a new general path - a new trigger type, BTREE_TRIGGER_check_repair. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: bch2_bucket_ref_update()Kent Overstreet2024-05-081-7/+7
| | | | | | | | | | | If we hit an inconsistency when updating allocation information, we don't want to fail the update if it's for a deletion - only if it's for a new key. Rename check_bucket_ref() -> bucket_ref_update() so we can centralize the logic to do this. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Consolidate mark_stripe_bucket() and trans_mark_stripe_bucket()Kent Overstreet2024-05-081-116/+135
| | | | | | | | This eliminates some duplicated logic, and the gc path now handles stripe updates and deletions - we need this since soon we're bringing back runtime gc. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: mark_stripe_bucket cleanupKent Overstreet2024-05-081-35/+65
| | | | | | | | Start to work on unifying mark_stripe_bucket() and trans_mark_stripe_bucket(); first, clean up all the unnecessary and gratuitious differences. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: bucket_data_type_mismatch()Kent Overstreet2024-05-081-10/+9
| | | | | | | | | | We're working on potentially unifying bch2_check_bucket_ref() and bch2_check_fix_ptrs() - or at least eliminating gratuitious differences. Most immediately, there's a bunch of cleanups to be done regarding BCH_DATA_stripe. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: member helper cleanupsKent Overstreet2024-05-081-5/+5
| | | | | | | | | | | | | | Some renaming for better consistency bch2_member_exists -> bch2_member_alive bch2_dev_exists -> bch2_member_exists bch2_dev_exsits2 -> bch2_dev_exists bch_dev_locked -> bch2_dev_locked bch_dev_bkey_exists -> bch2_dev_bkey_exists new helper - bch2_dev_safe Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: iter/update/trigger/str_hash flag cleanupKent Overstreet2024-05-081-13/+13
| | | | | | | | | | | Combine iter/update/trigger/str_hash flags into a single enum, and x-macroize them for a to_text() function later. These flags are all for a specific iter/key/update context, so it makes sense to group them together - iter/update/trigger flags were already given distinct bits, this cleans up and unifies that handling. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Standardize helpers for printing enum strs with bounds checksKent Overstreet2024-04-131-8/+6
| | | | Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: fix unsafety in bch2_stripe_to_text()Kent Overstreet2024-04-131-21/+25
| | | | | | .to_text() functions need to work on key values that didn't pass .valid Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Improve bch2_fatal_error()Kent Overstreet2024-03-181-3/+3
| | | | | | error messages should always include __func__ Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: kill kvpmalloc()Kent Overstreet2024-03-131-2/+2
| | | | Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: helpers for printing data typesKent Overstreet2024-01-211-2/+2
| | | | | | We need bounds checking since new versions may introduce new data types. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: BTREE_TRIGGER_ATOMICKent Overstreet2024-01-211-1/+1
| | | | | | Add a new flag to be explicit about when we're running atomic triggers. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: unify stripe triggerKent Overstreet2024-01-051-92/+72
| | | | Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: move stripe triggers to ec.cKent Overstreet2024-01-051-0/+321
| | | | Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: bkey_for_each_ptr() now declares loop iterKent Overstreet2024-01-011-1/+0
| | | | Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: for_each_member_device_rcu() now declares loop iterKent Overstreet2024-01-011-9/+6
| | | | Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: for_each_btree_key() now declares loop iterKent Overstreet2024-01-011-11/+4
| | | | Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: bch_err_(fn|msg) check if should printKent Overstreet2024-01-011-16/+8
| | | | Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: btree write buffer now slurps keys from journalKent Overstreet2024-01-011-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previosuly, the transaction commit path would have to add keys to the btree write buffer as a separate operation, requiring additional global synchronization. This patch introduces a new journal entry type, which indicates that the keys need to be copied into the btree write buffer prior to being written out. We switch the journal entry type back to JSET_ENTRY_btree_keys prior to write, so this is not an on disk format change. Flushing the btree write buffer may require pulling keys out of journal entries yet to be written, and quiescing outstanding journal reservations; we previously added journal->buf_lock for synchronization with the journal write path. We also can't put strict bounds on the number of keys in the journal destined for the write buffer, which means we might overflow the size of the preallocated buffer and have to reallocate - this introduces a potentially fatal memory allocation failure. This is something we'll have to watch for, if it becomes an issue in practice we can do additional mitigation. The transaction commit path no longer has to explicitly check if the write buffer is full and wait on flushing; this is another performance optimization. Instead, when the btree write buffer is close to full we change the journal watermark, so that only reservations for journal reclaim are allowed. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Rename for_each_btree_key2() -> for_each_btree_key()Kent Overstreet2024-01-011-2/+2
| | | | Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Kill for_each_btree_key()Kent Overstreet2024-01-011-25/+21
| | | | | | | | | | | | | for_each_btree_key() handles transaction restarts, like for_each_btree_key2(), but only calls bch2_trans_begin() after a transaction restart - for_each_btree_key2() wraps every loop iteration in a transaction. The for_each_btree_key() behaviour is problematic when it leads to holding the SRCU lock that prevents key cache reclaim for an unbounded amount of time - there's no real need to keep it around. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Clean up btree write buffer write ref handlingKent Overstreet2024-01-011-1/+1
| | | | | | | | | | | | | __bch2_btree_write_buffer_flush() now assumes a write ref is already held (as called by the transaction commit path); and the wrappers bch2_write_buffer_flush() and flush_sync() take an explicit write ref. This means internally the write buffer code can always use BTREE_INSERT_NOCHECK_RW, instead of in the previous code passing flags around and hoping the NOCHECK_RW flag was always carried around correctly. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: convert bch_fs_flags to x-macroKent Overstreet2024-01-011-1/+1
| | | | | | | Now we can print out filesystem flags in sysfs, useful for debugging various "what's my filesystem doing" issues. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Rename BTREE_INSERT flagsKent Overstreet2024-01-011-5/+5
| | | | | | | BTREE_INSERT flags are actually transaction commit flags - rename them for clarity. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Guard against insufficient devices to create stripesKent Overstreet2023-11-131-2/+14
| | | | | | | We can't create stripes if we don't have enough devices - this manifested as an integer underflow bug later. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Improve stripe checksum error messageKent Overstreet2023-11-051-8/+13
| | | | | | | We now include the name of the device in the error message - and also increment the number of checksum errors on that device. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>