This PR adds `DeleteRange` to `ethdb.KeyValueWriter`. While range
deletion using an iterator can be really slow, `DeleteRange` is natively
supported by pebble and apparently runs in O(1) time (typically 20-30ms
in my tests for removing hundreds of millions of keys and gigabytes of
data). For leveldb and memorydb an iterator based fallback is
implemented. Note that since the iterator method can be slow and a
database function should not unexpectedly block for a very long time,
the number of deleted keys is limited at 10000 which should ensure that
it does not block for more than a second. ErrTooManyKeys is returned if
the range has only been partially deleted. In this case the caller can
repeat the call until it finally succeeds.
* cmd/geth, ethdb/pebble: polish method naming and code comment
* implement db stat for pebble
* cmd, core, ethdb, internal, trie: remove db property selector
* cmd, core, ethdb: fix function description
---------
Co-authored-by: prpeh <prpeh@proton.me>
Co-authored-by: Gary Rong <garyrong0905@gmail.com>
This removes text parsing in leveldb metrics collection code. All metrics
can now be accessed through the stats API provided by leveldb.
We also add new gauge-typed metrics that count the number of tables at each level.
---------
Co-authored-by: Exca-DK <Exca-DK@users.noreply.github.com>
Co-authored-by: Gary Rong <garyrong0905@gmail.com>
Co-authored-by: Felix Lange <fjl@twurst.com>
This changes the CI / release builds to use the latest Go version. It also
upgrades golangci-lint to a newer version compatible with Go 1.19.
In Go 1.19, godoc has gained official support for links and lists. The
syntax for code blocks in doc comments has changed and now requires a
leading tab character. gofmt adapts comments to the new syntax
automatically, so there are a lot of comment re-formatting changes in this
PR. We need to apply the new format in order to pass the CI lint stage with
Go 1.19.
With the linter upgrade, I have decided to disable 'gosec' - it produces
too many false-positive warnings. The 'deadcode' and 'varcheck' linters
have also been removed because golangci-lint warns about them being
unmaintained. 'unused' provides similar coverage and we already have it
enabled, so we don't lose much with this change.
This PR adds an addtional API called `NewBatchWithSize` for db
batcher. It turns out that leveldb batch memory allocation is
super inefficient. The main reason is the allocation step of
leveldb Batch is too small when the batch size is large. It can
take a few second to build a leveldb batch with 100MB size.
Luckily, leveldb also offers another API called MakeBatch which can
pre-allocate the memory area. So if the approximate size of batch is
known in advance, this API can be used in this case.
It's needed in new state scheme PR which needs to commit a batch of
trie nodes in a single batch. Implement the feature in a seperate PR.
* eth/protocols/snap: generate storage trie from full dirty snap data
* eth/protocols/snap: get rid of some more dead code
* eth/protocols/snap: less frequent logs, also log during trie generation
* eth/protocols/snap: implement dirty account range stack-hashing
* eth/protocols/snap: don't loop on account trie generation
* eth/protocols/snap: fix account format in trie
* core, eth, ethdb: glue snap packets together, but not chunks
* eth/protocols/snap: print completion log for snap phase
* eth/protocols/snap: extended tests
* eth/protocols/snap: make testcase pass
* eth/protocols/snap: fix account stacktrie commit without defer
* ethdb: fix key counts on reset
* eth/protocols: fix typos
* eth/protocols/snap: make better use of delivered data (#44)
* eth/protocols/snap: make better use of delivered data
* squashme
* eth/protocols/snap: reduce chunking
* squashme
* eth/protocols/snap: reduce chunking further
* eth/protocols/snap: break out hash range calculations
* eth/protocols/snap: use sort.Search instead of looping
* eth/protocols/snap: prevent crash on storage response with no keys
* eth/protocols/snap: nitpicks all around
* eth/protocols/snap: clear heal need on 1-chunk storage completion
* eth/protocols/snap: fix range chunker, add tests
Co-authored-by: Péter Szilágyi <peterke@gmail.com>
* trie: fix test API error
* eth/protocols/snap: fix some further liter issues
* eth/protocols/snap: fix accidental batch reuse
Co-authored-by: Martin Holst Swende <martin@swende.se>
* core/state/snapshot, ethdb: track deletions more accurately
* core/state/snapshot: don't reset the iterator, leveldb's screwy
* ethdb: don't mess with the insert batches for now
This PR introduces:
- db.put to put a value into the database
- db.get to read a value from the database
- db.delete to delete a value from the database
- db.stats to check compaction info from the database
- db.compact to trigger a db compaction
It also moves inspectdb to db.inspect.
* all: freezer style syncing
core, eth, les, light: clean up freezer relative APIs
core, eth, les, trie, ethdb, light: clean a bit
core, eth, les, light: add unit tests
core, light: rewrite setHead function
core, eth: fix downloader unit tests
core: add receipt chain insertion test
core: use constant instead of hardcoding table name
core: fix rollback
core: fix setHead
core/rawdb: remove canonical block first and then iterate side chain
core/rawdb, ethdb: add hasAncient interface
eth/downloader: calculate ancient limit via cht first
core, eth, ethdb: lots of fixes
* eth/downloader: print ancient disable log only for fast sync
* core, eth, trie: bloom filter for trie node dedup during fast sync
* eth/downloader, trie: address review comments
* core, ethdb, trie: restart fast-sync bloom construction now and again
* eth/downloader: initialize fast sync bloom on startup
* eth: reenable eth/62 until we properly remove it
This PR is a more advanced form of the dirty-to-clean cacher (#18995),
where we reuse previous database write batches as datasets to uncache,
saving a dirty-trie-iteration and a dirty-trie-rlp-reencoding per block.
* ethdb: add Putter interface and Has method
* ethdb: improve docs and add IdealBatchSize
* ethdb: remove memory batch lock
Batches are not safe for concurrent use.
* core: use ethdb.Putter for Write* functions
This covers the easy cases.
* core/state: simplify StateSync
* trie: optimize local node check
* ethdb: add ValueSize to Batch
* core: optimize HasHeader check
This avoids one random database read get the block number. For many uses
of HasHeader, the expectation is that it's actually there. Using Has
avoids a load + decode of the value.
* core: write fast sync block data in batches
Collect writes into batches up to the ideal size instead of issuing many
small, concurrent writes.
* eth/downloader: commit larger state batches
Collect nodes into a batch up to the ideal size instead of committing
whenever a node is received.
* core: optimize HasBlock check
This avoids a random database read to get the number.
* core: use numberCache in HasHeader
numberCache has higher capacity, increasing the odds of finding the
header without a database lookup.
* core: write imported block data using a batch
Restore batch writes of state and add blocks, tx entries, receipts to
the same batch. The change also simplifies the miner.
This commit also removes posting of logs when a forked block is imported.
* core: fix DB write error handling
* ethdb: use RLock for Has
* core: fix HasBlock comment
* accounts, cmd, eth, ethdb: port logs over to new system
* ethdb: drop concept of cache distribution between dbs
* eth: fix some log nitpicks to make them nicer