This PR adds `DeleteRange` to `ethdb.KeyValueWriter`. While range
deletion using an iterator can be really slow, `DeleteRange` is natively
supported by pebble and apparently runs in O(1) time (typically 20-30ms
in my tests for removing hundreds of millions of keys and gigabytes of
data). For leveldb and memorydb an iterator based fallback is
implemented. Note that since the iterator method can be slow and a
database function should not unexpectedly block for a very long time,
the number of deleted keys is limited at 10000 which should ensure that
it does not block for more than a second. ErrTooManyKeys is returned if
the range has only been partially deleted. In this case the caller can
repeat the call until it finally succeeds.
* cmd/geth, ethdb/pebble: polish method naming and code comment
* implement db stat for pebble
* cmd, core, ethdb, internal, trie: remove db property selector
* cmd, core, ethdb: fix function description
---------
Co-authored-by: prpeh <prpeh@proton.me>
Co-authored-by: Gary Rong <garyrong0905@gmail.com>
* all: implement path-based state scheme
* all: edits from review
* core/rawdb, trie/triedb/pathdb: review changes
* core, light, trie, eth, tests: reimplement pbss history
* core, trie/triedb/pathdb: track block number in state history
* trie/triedb/pathdb: add history documentation
* core, trie/triedb/pathdb: address comments from Peter's review
Important changes to list:
- Cache trie nodes by path in clean cache
- Remove root->id mappings when history is truncated
* trie/triedb/pathdb: fallback to disk if unexpect node in clean cache
* core/rawdb: fix tests
* trie/triedb/pathdb: rename metrics, change clean cache key
* trie/triedb: manage the clean cache inside of disk layer
* trie/triedb/pathdb: move journal function
* trie/triedb/path: fix tests
* trie/triedb/pathdb: fix journal
* trie/triedb/pathdb: fix history
* trie/triedb/pathdb: try to fix tests on windows
* core, trie: address comments
* trie/triedb/pathdb: fix test issues
---------
Co-authored-by: Felix Lange <fjl@twurst.com>
Co-authored-by: Martin Holst Swende <martin@swende.se>
This change adds the ability to perform reads from freezer without size limitation. This can be useful in cases where callers are certain that out-of-memory will not happen (e.g. reading only a few elements).
The previous API was designed to behave both optimally and secure while servicing a request from a peer, whereas this change should _not_ be used when an untrusted peer can influence the query size.
Previously freezer has only been used for storing ancient chain data, while obviously it can be used more. This PR unties the chain data and freezer, keep the minimal freezer structure and move all other logic (like incrementally freezing block data) into a separate structure called ChainFreezer.
This PR also extends the database interface by adding a new ancient store function AncientDatadir which can return the root directory of ancient store. The ancient root directory can be used when we want to open some other ancient-stores (e.g. reverse diff freezer).
* cmd,core: add simple legacy receipt converter
core/rawdb: use forEach in migrate
core/rawdb: batch reads in forEach
core/rawdb: make forEach anonymous fn
cmd/geth: check for legacy receipts on node startup
fix err msg
Co-authored-by: rjl493456442 <garyrong0905@gmail.com>
fix log
Co-authored-by: rjl493456442 <garyrong0905@gmail.com>
fix some review comments
add warning to cmd
drop isLegacy fn from migrateTable params
add test for windows rename
test replacing in windows case
* minor fix
* sanity check for tail-deletion
* add log before moving files around
* speed-up hack for mainnet
* fix mainnet check, use networkid instead
* check mainnet genesis
* review fixes
* resume previous migration attempt
* core/rawdb: lint fix
Co-authored-by: Martin Holst Swende <martin@swende.se>
* core/rawdb, cmd, ethdb, eth: implement freezer tail deletion
* core/rawdb: address comments from martin and sina
* core/rawdb: fixes cornercase in tail deletion
* core/rawdb: separate metadata into a standalone file
* core/rawdb: remove unused code
* core/rawdb: add random test
* core/rawdb: polish code
* core/rawdb: fsync meta file before manipulating the index
* core/rawdb: fix typo
* core/rawdb: address comments
This PR adds a new accessor method to the freezer database. This new view offers a consistent interface, guaranteeing that all individual tables (headers, bodies etc) are all on the same number, and that this number is not changes (added/truncated) while the operation is performing.
This change is a rewrite of the freezer code.
When writing ancient chain data to the freezer, the previous version first encoded each
individual item to a temporary buffer, then wrote the buffer. For small item sizes (for
example, in the block hash freezer table), this strategy causes a lot of system calls for
writing tiny chunks of data. It also allocated a lot of temporary []byte buffers.
In the new version, we instead encode multiple items into a re-useable batch buffer, which
is then written to the file all at once. This avoids performing a system call for every
inserted item.
To make the internal batching work, the ancient database API had to be changed. While
integrating this new API in BlockChain.InsertReceiptChain, additional optimizations were
also added there.
Co-authored-by: Felix Lange <fjl@twurst.com>
* core/rawdb: implement sequential reads in freezer_table
* core/rawdb, ethdb: add sequential reader to db interface
* core/rawdb: lint nitpicks
* core/rawdb: fix some nitpicks
* core/rawdb: fix flaw with deferred reads not being performed
* core/rawdb: better documentation
* core, eth: some fixes for freezer
* vendor, core/rawdb, cmd/geth: add db inspector
* core, cmd/utils: check ancient store path forceily
* cmd/geth, common, core/rawdb: a few fixes
* cmd/geth: support windows file rename and fix rename error
* core: support ancient plugin
* core, cmd: streaming file copy
* cmd, consensus, core, tests: keep genesis in leveldb
* core: write txlookup during ancient init
* core: bump database version
* all: freezer style syncing
core, eth, les, light: clean up freezer relative APIs
core, eth, les, trie, ethdb, light: clean a bit
core, eth, les, light: add unit tests
core, light: rewrite setHead function
core, eth: fix downloader unit tests
core: add receipt chain insertion test
core: use constant instead of hardcoding table name
core: fix rollback
core: fix setHead
core/rawdb: remove canonical block first and then iterate side chain
core/rawdb, ethdb: add hasAncient interface
eth/downloader: calculate ancient limit via cht first
core, eth, ethdb: lots of fixes
* eth/downloader: print ancient disable log only for fast sync
This PR is a more advanced form of the dirty-to-clean cacher (#18995),
where we reuse previous database write batches as datasets to uncache,
saving a dirty-trie-iteration and a dirty-trie-rlp-reencoding per block.
* ethdb: add Putter interface and Has method
* ethdb: improve docs and add IdealBatchSize
* ethdb: remove memory batch lock
Batches are not safe for concurrent use.
* core: use ethdb.Putter for Write* functions
This covers the easy cases.
* core/state: simplify StateSync
* trie: optimize local node check
* ethdb: add ValueSize to Batch
* core: optimize HasHeader check
This avoids one random database read get the block number. For many uses
of HasHeader, the expectation is that it's actually there. Using Has
avoids a load + decode of the value.
* core: write fast sync block data in batches
Collect writes into batches up to the ideal size instead of issuing many
small, concurrent writes.
* eth/downloader: commit larger state batches
Collect nodes into a batch up to the ideal size instead of committing
whenever a node is received.
* core: optimize HasBlock check
This avoids a random database read to get the number.
* core: use numberCache in HasHeader
numberCache has higher capacity, increasing the odds of finding the
header without a database lookup.
* core: write imported block data using a batch
Restore batch writes of state and add blocks, tx entries, receipts to
the same batch. The change also simplifies the miner.
This commit also removes posting of logs when a forked block is imported.
* core: fix DB write error handling
* ethdb: use RLock for Has
* core: fix HasBlock comment
* accounts, cmd, eth, ethdb: port logs over to new system
* ethdb: drop concept of cache distribution between dbs
* eth: fix some log nitpicks to make them nicer