This PR adds an addtional API called `NewBatchWithSize` for db
batcher. It turns out that leveldb batch memory allocation is
super inefficient. The main reason is the allocation step of
leveldb Batch is too small when the batch size is large. It can
take a few second to build a leveldb batch with 100MB size.
Luckily, leveldb also offers another API called MakeBatch which can
pre-allocate the memory area. So if the approximate size of batch is
known in advance, this API can be used in this case.
It's needed in new state scheme PR which needs to commit a batch of
trie nodes in a single batch. Implement the feature in a seperate PR.
* freezer: add readonly flag to table
* freezer: enforce readonly in table repair
* freezer: enforce readonly in newFreezer
* minor fix
* minor
* core/rawdb: test that writing during readonly fails
* rm unused log
* check readonly on batch append
* minor
* Revert "check readonly on batch append"
This reverts commit 2ddb5ec4ba.
* review fixes
* minor test refactor
* attempt at fixing windows issue
* add comment re windows sync issue
* k->kind
* open readonly db for genesis check
Co-authored-by: Martin Holst Swende <martin@swende.se>
This PR reduces the amount of work we do when answering header queries, e.g. when a peer
is syncing from us.
For some items, e.g block bodies, when we read the rlp-data from database, we plug it
directly into the response package. We didn't do that for headers, but instead read
headers-rlp, decode to types.Header, and re-encode to rlp. This PR changes that to keep it
in RLP-form as much as possible. When a node is syncing from us, it typically requests 192
contiguous headers. On master it has the following effect:
- For headers not in ancient: 2 db lookups. One for translating hash->number (even though
the request is by number), and another for reading by hash (this latter one is sometimes
cached).
- For headers in ancient: 1 file lookup/syscall for translating hash->number (even though
the request is by number), and another for reading the header itself. After this, it
also performes a hashing of the header, to ensure that the hash is what it expected. In
this PR, I instead move the logic for "give me a sequence of blocks" into the lower
layers, where the database can determine how and what to read from leveldb and/or
ancients.
There are basically four types of requests; three of them are improved this way. The
fourth, by hash going backwards, is more tricky to optimize. However, since we know that
the gap is 0, we can look up by the parentHash, and stlil shave off all the number->hash
lookups.
The gapped collection can be optimized similarly, as a follow-up, at least in three out of
four cases.
Co-authored-by: Felix Lange <fjl@twurst.com>
This PR fixes a special corner case in transaction indexing.
When the chain is rewound by SetHead to a historical point which is even lower than the transaction indexes tail, then system will report Failed to decode block body error all the time, because the relevant blocks are already deleted.
In order to avoid this "non-critical-but-annoying" issue, we can recap the indexing target to head+1(to is excluded, so it means indexing transactions from 0 to head).
* all: work for eth1/2 transtition
* consensus/beacon, eth: change beacon difficulty to 0
* eth: updates
* all: add terminalBlockDifficulty config, fix rebasing issues
* eth: implemented merge interop spec
* internal/ethapi: update to v1.0.0.alpha.2
This commit updates the code to the new spec, moving payloadId into
it's own object. It also fixes an issue with finalizing an empty blockhash.
It also properly sets the basefee
* all: sync polishes, other fixes + refactors
* core, eth: correct semantics for LeavePoW, EnterPoS
* core: fixed rebasing artifacts
* core: light: performance improvements
* core: use keyed field (f)
* core: eth: fix compilation issues + tests
* eth/catalyst: dbetter error codes
* all: move Merger to consensus/, remove reliance on it in bc
* all: renamed EnterPoS and LeavePoW to ReachTDD and FinalizePoS
* core: make mergelogs a function
* core: use InsertChain instead of InsertBlock
* les: drop merger from lightchain object
* consensus: add merger
* core: recoverAncestors in catalyst mode
* core: fix nitpick
* all: removed merger from beacon, use TTD, nitpicks
* consensus: eth: add docstring, removed unnecessary code duplication
* consensus/beacon: better comment
* all: easy to fix nitpicks by karalabe
* consensus/beacon: verify known headers to be sure
* core: comments
* core: eth: don't drop peers who advertise blocks, nitpicks
* core: never add beacon blocks to the future queue
* core: fixed nitpicks
* consensus/beacon: simplify IsTTDReached check
* consensus/beacon: correct IsTTDReached check
Co-authored-by: rjl493456442 <garyrong0905@gmail.com>
Co-authored-by: Péter Szilágyi <peterke@gmail.com>
This PR offers two more database sub commands for exporting and importing data.
Two exporters are implemented: preimage and snapshot data respectively.
The import command is generic, it can take any data export and import into leveldb.
The data format has a 'magic' for disambiguation, and a version field for future compatibility.
This PR adds a new accessor method to the freezer database. This new view offers a consistent interface, guaranteeing that all individual tables (headers, bodies etc) are all on the same number, and that this number is not changes (added/truncated) while the operation is performing.
* core/types: rm extranous check in test
* core/rawdb: add lightweight types for block logs
* core/rawdb,eth: use lightweight accessor for log filtering
* core/rawdb: add bench for decoding into rlpLogs
This change is a rewrite of the freezer code.
When writing ancient chain data to the freezer, the previous version first encoded each
individual item to a temporary buffer, then wrote the buffer. For small item sizes (for
example, in the block hash freezer table), this strategy causes a lot of system calls for
writing tiny chunks of data. It also allocated a lot of temporary []byte buffers.
In the new version, we instead encode multiple items into a re-useable batch buffer, which
is then written to the file all at once. This avoids performing a system call for every
inserted item.
To make the internal batching work, the ancient database API had to be changed. While
integrating this new API in BlockChain.InsertReceiptChain, additional optimizations were
also added there.
Co-authored-by: Felix Lange <fjl@twurst.com>
* core/rawdb: implement sequential reads in freezer_table
* core/rawdb, ethdb: add sequential reader to db interface
* core/rawdb: lint nitpicks
* core/rawdb: fix some nitpicks
* core/rawdb: fix flaw with deferred reads not being performed
* core/rawdb: better documentation
* eth/protocols/snap: generate storage trie from full dirty snap data
* eth/protocols/snap: get rid of some more dead code
* eth/protocols/snap: less frequent logs, also log during trie generation
* eth/protocols/snap: implement dirty account range stack-hashing
* eth/protocols/snap: don't loop on account trie generation
* eth/protocols/snap: fix account format in trie
* core, eth, ethdb: glue snap packets together, but not chunks
* eth/protocols/snap: print completion log for snap phase
* eth/protocols/snap: extended tests
* eth/protocols/snap: make testcase pass
* eth/protocols/snap: fix account stacktrie commit without defer
* ethdb: fix key counts on reset
* eth/protocols: fix typos
* eth/protocols/snap: make better use of delivered data (#44)
* eth/protocols/snap: make better use of delivered data
* squashme
* eth/protocols/snap: reduce chunking
* squashme
* eth/protocols/snap: reduce chunking further
* eth/protocols/snap: break out hash range calculations
* eth/protocols/snap: use sort.Search instead of looping
* eth/protocols/snap: prevent crash on storage response with no keys
* eth/protocols/snap: nitpicks all around
* eth/protocols/snap: clear heal need on 1-chunk storage completion
* eth/protocols/snap: fix range chunker, add tests
Co-authored-by: Péter Szilágyi <peterke@gmail.com>
* trie: fix test API error
* eth/protocols/snap: fix some further liter issues
* eth/protocols/snap: fix accidental batch reuse
Co-authored-by: Martin Holst Swende <martin@swende.se>
The Append / truncate operations were racy. When a datafile reaches 2Gb, a new file is needed. For this operation, we require a writelock, which is not needed in the 99.99% of all cases where the data does fit in the current head-file.
This transition from readlock to writelock was incorrect, and as the readlock was released, a truncate operation could slip in between, and truncate the data. This would have been fine, however, the Append operation continued writing as if no truncation had occurred, e.g writing item 5 where item 0 should reside.
This PR changes the behaviour, so that if when we run into the situation that a new file is needed, it aborts, and retries, this time with a writelock.
The outcome of the situation described above, running on this PR, would instead be that the Append operation exits with a failure.
This PR introduces:
- db.put to put a value into the database
- db.get to read a value from the database
- db.delete to delete a value from the database
- db.stats to check compaction info from the database
- db.compact to trigger a db compaction
It also moves inspectdb to db.inspect.
This PR implements the following modifications
- Don't shortcut check if block is present, thus avoid disk lookup
- Don't check hash ancestry in early-check (it's still done in parallel checker)
- Don't check time.Now for every single header
Charts and background info can be found here: https://github.com/holiman/headerimport/blob/main/README.md
With these changes, writing 1M headers goes down to from 80s to 62s.
This commit splits the eth package, separating the handling of eth and snap protocols. It also includes the capability to run snap sync (https://github.com/ethereum/devp2p/blob/master/caps/snap.md) , but does not enable it by default.
Co-authored-by: Marius van der Wijden <m.vanderwijden@live.de>
Co-authored-by: Martin Holst Swende <martin@swende.se>
This PR implements unclean shutdown marker. Every time geth boots, it adds a timestamp to a list of timestamps in the database. This list is capped at 10. At a clean shutdown, the timestamp is removed again.
Thus, when geth exits unclean, the marker remains, and at boot up we show the most recent unclean shutdowns to the user, which makes it easier to diagnose root-causes to certain problems.
Co-authored-by: Nagy Salem <me@muhnagy.com>
* core/state/snapshot: introduce snapshot journal version
* core: update the disk layer in an atomic way
* core: persist the disk layer generator periodically
* core/state/snapshot: improve logging
* core/state/snapshot: forcibly ensure the legacy snapshot is matched
* core/state/snapshot: add debug logs
* core, tests: fix tests and special recovery case
* core: polish
* core: add more blockchain tests for snapshot recovery
* core/state: fix comment
* core: add recovery flag for snapshot
* core: add restart after start-after-crash tests
* core/rawdb: fix imports
* core: fix tests
* core: remove log
* core/state/snapshot: fix snapshot
* core: avoid callbacks in SetHead
* core: fix setHead cornercase where the threshold root has state
* core: small docs for the test cases
Co-authored-by: Péter Szilágyi <peterke@gmail.com>
* database: added counters
* Improved stats for ancient db
* Small improvement
* Better message and added percentage while counting receipts
* Fast counting for receipts
* added info message
* Show both receips itemscount from ancient db and counted receipts
* Fixed default case
* Removed counter for receipts in ancient store
* Removed counting of receipts present in leveldb