This PR fixes an issue with blob transaction propagation caused by the blob
transaction txpool rejecting transactions with gapped nonces. The
specific changes are:
- fetch transactions from a peer in the order they were announced, to
minimize nonce gaps (which cause blob txs to be rejected)
- don't wait before fetching blob transactions after an announcement is
received, since they are not broadcast (see the sketch below)
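A minimal sketch of the scheduling idea; the type names and structure here are illustrative, not the actual go-ethereum fetcher code:
```golang
package main

import "fmt"

// Illustrative model of the announcement handling, not the real fetcher.
const blobTxType = 0x03 // EIP-4844 blob transaction type

type announcement struct {
	hash   string
	txType byte
}

type fetcher struct {
	waitlist []announcement // txs waiting for a possible broadcast
	fetching []announcement // txs scheduled for immediate retrieval
}

// announce queues hashes in the order they were announced, so the
// eventual fetch preserves nonce order. Blob transactions are never
// broadcast in full, so they skip the waitlist entirely.
func (f *fetcher) announce(anns []announcement) {
	for _, a := range anns {
		if a.txType == blobTxType {
			f.fetching = append(f.fetching, a) // fetch immediately
		} else {
			f.waitlist = append(f.waitlist, a) // wait for broadcast first
		}
	}
}

func main() {
	f := new(fetcher)
	f.announce([]announcement{{"0xaa", 0x02}, {"0xbb", blobTxType}})
	fmt.Println(len(f.waitlist), len(f.fetching)) // 1 1
}
```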
Testing:
- unit tests updated to reflect that fetch order should always match tx
announcement order
- unit test added to confirm blob transactions are scheduled immediately
for fetching
- running the PR on an eth mainnet full node without incident so far
---------
Signed-off-by: Roberto Bayardo <bayardo@alum.mit.edu>
Co-authored-by: Gary Rong <garyrong0905@gmail.com>
This PR adds the `dns:read` and `dns:edit` permissions to the required
set of permissions checked before deploying an ENR tree to Cloudflare.
These permissions are necessary for a successful publish.
**Background**:
The current logic for `devp2p dns to-cloudflare` checks for `zone:edit`
and `zone:read` permissions. However, when running the command with only
these two permissions, the following error occurs:
```
wrong permissions on zone REMOVED-ZONE: map[#zone:edit:false #zone:read:true]
```
Adding `zone:read` and `zone:edit` to the API token led to a different
error:
```
INFO [08-19|14:06:16.782] Retrieving existing TXT records on pos-nodes.hardfork.dev
Authentication error (10000)
```
This suggested that additional permissions were required. I added
`dns:read`, but encountered another error:
```
INFO [08-19|14:11:42.342] Retrieving existing TXT records on pos-nodes.hardfork.dev
INFO [08-19|14:11:42.851] Updating DNS entries
failed to publish REMOVED.pos-nodes.hardfork.dev: Authentication error (10000)
```
Finally, after adding both `dns:read` and `dns:edit` permissions, the
command executed successfully with the following output:
```
INFO [08-19|14:13:07.677] Checking Permissions on zone REMOVED-ZONE
INFO [08-19|14:13:08.014] Retrieving existing TXT records on pos-nodes.hardfork.dev
INFO [08-19|14:13:08.440] Updating DNS entries
INFO [08-19|14:13:08.440] "Updating pos-nodes.hardfork.dev from \"enrtree-root:v1 e=FSED3EDKEKRDDFMCLP746QY6CY l=FDXN3SN67NA5DKA4J2GOK7BVQI seq=1 sig=Glja2c9RviRqOpaaHR0MnHsQwU76nJXadJwFeiXpp8MRTVIhvL0LIireT0yE3ETZArGEmY5Ywz3FVHZ3LR5JTAE\" to \"enrtree-root:v1 e=AB66M4ULYD5OYN4XFFCPVZRLUM l=FDXN3SN67NA5DKA4J2GOK7BVQI seq=1 sig=H8cqDzu0FAzBplK4g3yudhSaNtszIebc2aj4oDm5a5ZE5PAg-xpCnQgVE_53CsgsqQpalD9byafx_FrUT61sagA\""
INFO [08-19|14:13:16.932] Updated DNS entries new=32 updated=1 untouched=100
INFO [08-19|14:13:16.932] Deleting stale DNS entries
INFO [08-19|14:13:24.663] Deleted stale DNS entries count=31
```
With this PR, the required permissions for deploying an ENR tree to
Cloudflare now include `zone:read`, `zone:edit`, `dns:read`, and
`dns:edit`. The initial check now includes all of the necessary
permissions and indicates in the error message which permissions are
missing:
```
INFO [08-19|14:17:20.339] Checking Permissions on zone REMOVED-ZONE
wrong permissions on zone REMOVED-ZONE: map[#dns_records:edit:false #dns_records:read:false #zone:edit:false #zone:read:true]
```
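For illustration, a minimal sketch of the expanded check; `checkZonePermissions` is a hypothetical helper, with the permission names taken from the log output above:
```golang
package main

import "fmt"

// checkZonePermissions mimics the expanded permission check. The error
// message prints the full map, showing exactly which permissions are missing.
func checkZonePermissions(granted []string, zone string) error {
	required := map[string]bool{
		"#zone:read":        false,
		"#zone:edit":        false,
		"#dns_records:read": false,
		"#dns_records:edit": false,
	}
	for _, p := range granted {
		if _, ok := required[p]; ok {
			required[p] = true
		}
	}
	for _, ok := range required {
		if !ok {
			return fmt.Errorf("wrong permissions on zone %s: %v", zone, required)
		}
	}
	return nil
}

func main() {
	fmt.Println(checkZonePermissions([]string{"#zone:read"}, "REMOVED-ZONE"))
}
```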
enode.Node was recently changed to store a cache of endpoint information. The IP address in the cache is a netip.Addr. I chose that type over net.IP because it has better semantics: netip.Addr is meant to be used as a value type. Copying it does not allocate, it can be compared with ==, and it can be used as a map key.
This PR changes most uses of Node.IP() into Node.IPAddr(), which returns the cached value directly without allocating.
While there are still some public APIs left where net.IP is used, I have converted all code used internally by p2p/discover to the new types. So this does change some public Go API, but hopefully not APIs any external code actually uses.
There weren't supposed to be any semantic differences resulting from this refactoring; however, it does introduce one: in package p2p/netutil we treated the 0.0.0.0/8 network (addresses 0.x.y.z) as LAN, but netip.Addr.IsPrivate() doesn't. The treatment of this particular IP address range is controversial, with some software supporting it and others not. IANA has listed it as special-purpose and invalid as a destination for a long time, so I don't know why I put it into the LAN list. It has now been marked as special in p2p/netutil as well.
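A small illustration of the value-type properties mentioned above, and of the 0.0.0.0/8 distinction, using only the standard library:
```golang
package main

import (
	"fmt"
	"net/netip"
)

func main() {
	// netip.Addr is a value type: comparable with == and usable as a map key.
	a := netip.MustParseAddr("10.0.0.1")
	b := netip.MustParseAddr("10.0.0.1")
	fmt.Println(a == b) // true
	seen := map[netip.Addr]bool{a: true}
	fmt.Println(seen[b]) // true

	// IsPrivate does not cover 0.0.0.0/8, so that special-purpose range
	// has to be handled separately (as p2p/netutil now does).
	zero := netip.MustParsePrefix("0.0.0.0/8")
	fmt.Println(netip.MustParseAddr("0.1.2.3").IsPrivate())    // false
	fmt.Println(zero.Contains(netip.MustParseAddr("0.1.2.3"))) // true
}
```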
Here we clean up internal uses of type discover.node, converting most code to use
enode.Node instead. The discover.node type used to be the canonical representation of
network hosts before ENR was introduced. Most code worked with *node to avoid conversions
when interacting with Table methods. Since *node also contains internal state of Table and
is a mutable type, using *node outside of Table code is prone to data races. It's also
cleaner not having to wrap/unwrap *enode.Node all the time.
discover.node has been renamed to tableNode to clarify its purpose.
While here, we also change most uses of net.UDPAddr into netip.AddrPort. While this is
technically a separate refactoring from the *node -> *enode.Node change, it is more
convenient because *enode.Node handles IP addresses as netip.Addr. The switch to package
netip in discovery would've happened very soon anyway.
The change to netip.AddrPort stops at certain interface points. For example, since package
p2p/netutil has not been converted to use netip.Addr yet, we still have to convert to
net.IP/net.UDPAddr in a few places.
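For reference, the standard library already provides the conversions needed at those interface points; a minimal sketch:
```golang
package main

import (
	"fmt"
	"net"
	"net/netip"
)

func main() {
	ap := netip.MustParseAddrPort("192.0.2.1:30303")

	// Converting back to the old types where an API still requires them:
	udpAddr := net.UDPAddrFromAddrPort(ap) // *net.UDPAddr
	ip := net.IP(ap.Addr().AsSlice())      // net.IP

	fmt.Println(udpAddr, ip)
}
```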
Node discovery periodically revalidates the nodes in its table by sending PING, checking
if they are still alive. I recently noticed some issues with the implementation of this
process, which can cause strange results such as nodes dropping unexpectedly, certain
nodes not getting revalidated often enough, and bad results being returned to incoming
FINDNODE queries.
In this change, the revalidation process is improved with the following logic (a simplified model follows the list):
- We maintain two 'revalidation lists' containing the table nodes, named 'fast' and 'slow'.
- The process chooses random nodes from each list on a randomized interval, the interval being
shorter for the 'fast' list, and performs revalidation for the chosen node.
- Whenever a node is newly inserted into the table, it goes into the 'fast' list.
Once validation passes, it transfers to the 'slow' list. If a request fails, or the
node changes endpoint, it transfers back into 'fast'.
- livenessChecks is incremented by one for successful checks. Unlike the old implementation,
we will not drop the node on the first failing check. Instead, we quickly decay its
livenessChecks and give it another chance.
- The order of nodes in a bucket doesn't matter anymore.
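A simplified model of the two-list scheme, assuming hypothetical names and an illustrative decay factor (the real implementation lives in p2p/discover):
```golang
package main

import (
	"fmt"
	"math/rand"
)

type node struct {
	id             string
	livenessChecks int
}

type revalidator struct {
	fast, slow []*node // new/unstable nodes vs. proven nodes
}

// insert: newly added table nodes always start on the 'fast' list.
func (r *revalidator) insert(n *node) { r.fast = append(r.fast, n) }

// revalidate picks a random node from the chosen list and moves it
// between the lists based on the outcome of the liveness check.
func (r *revalidator) revalidate(fromFast, alive bool) {
	list := &r.slow
	if fromFast {
		list = &r.fast
	}
	if len(*list) == 0 {
		return
	}
	i := rand.Intn(len(*list))
	n := (*list)[i]
	*list = append((*list)[:i], (*list)[i+1:]...)

	if alive {
		n.livenessChecks++         // reward successful checks
		r.slow = append(r.slow, n) // proven nodes are checked less often
	} else {
		n.livenessChecks /= 3      // decay quickly (factor illustrative), don't drop yet
		r.fast = append(r.fast, n) // recheck soon; drop only when the counter runs out
	}
}

func main() {
	r := &revalidator{}
	r.insert(&node{id: "n1"})
	r.revalidate(true, true)
	fmt.Println(len(r.fast), len(r.slow)) // 0 1
}
```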
I am also adding a debug API endpoint to dump the node table content.
Co-authored-by: Martin HS <martin@swende.se>
Package filepath implements utility routines for manipulating filename paths in a way compatible with the target operating system-defined file paths.
Package path implements utility routines for manipulating slash-separated paths.
The path package should only be used for paths separated by forward slashes, such as the paths in URLs.
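A quick illustration of the difference:
```golang
package main

import (
	"fmt"
	"path"
	"path/filepath"
)

func main() {
	// path always uses forward slashes, regardless of OS:
	fmt.Println(path.Join("a", "b", "c")) // "a/b/c" everywhere

	// filepath uses the OS-specific separator:
	fmt.Println(filepath.Join("a", "b", "c")) // "a\b\c" on Windows, "a/b/c" elsewhere
}
```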
Improving two things here:
On hive, where we look at these tests, the Go code comment above the test
is not visible. When there is a failure, it's not obvious what the test is actually
expecting. I have converted the comments into printed log messages to
better explain each test.
Second, I noticed that besu fails some tests because it happens to request
a header when we want it to send transactions. I'm trying the minimal fix here:
serve the headers.
Co-authored-by: lightclient <14004106+lightclient@users.noreply.github.com>
Here we update the eth and snap protocol test suites with a new test chain,
created by the hivechain tool. The new test chain uses proof-of-stake. As such,
tests using PoW block propagation in the eth protocol are removed. The test suite
now connects to the node under test using the engine API in order to make it
accept transactions.
The snap protocol test suite has been rewritten to output test descriptions and
log requests more verbosely.
---------
Co-authored-by: Felix Lange <fjl@twurst.com>
This PR replaces Geth's logger package (a fork of [log15](https://github.com/inconshreveable/log15)) with an implementation using slog, a logging library included as part of the Go standard library as of Go1.21.
Main changes are as follows:
* removes any log handlers that were unused in the Geth codebase.
* JSON, logfmt, and terminal formatters are now slog handlers.
* Verbosity level constants are changed to match slog constant values. Internal translation is done to make this opaque to the user and backwards compatible with existing `--verbosity` and `--vmodule` options.
* `--log.backtraceat` and `--log.debug` are removed.
The external-facing API is largely the same as the existing Geth logger. Logger method signatures remain unchanged.
A small semantic difference is that a `Handler` can only be set once per `Logger` and not changed dynamically. This just means that a new logger must be instantiated every time the handler of the root logger is changed.
----
For users of the `go-ethereum/log` module: if you were using this module for your own project, you will need to change the initialization. If you previously did
```golang
log.Root().SetHandler(log.LvlFilterHandler(log.LvlInfo, log.StreamHandler(os.Stderr, log.TerminalFormat(true))))
```
You now instead need to do
```golang
log.SetDefault(log.NewLogger(log.NewTerminalHandlerWithLevel(os.Stderr, log.LevelInfo, true)))
```
See more about reasoning here: https://github.com/ethereum/go-ethereum/issues/28558#issuecomment-1820606613
During snap-sync, we request ranges of values: either a range of accounts or a range of storage values. For any large trie, e.g. the main account trie or a large storage trie, we cannot fetch everything at once.
Short version: we split it up and request it in multiple stages. To do so, we use an origin field to say "give me all storage key/values where key > 0x20000000000000000". When the server fulfils this, it provides the first key after the origin, let's say 0x2e030000000000000 -- never providing the exact origin. However, the client side needs to be able to verify that 0x2e03.. indeed is the first key after 0x2000.., and therefore the attached proof concerns the origin, not the first key.
So, short-short version: the left-hand side of the proof relates to the origin, and is free-standing from the first leaf.
On the other hand (pun intended), on the right-hand side there's no such 'gap' between "along what path does the proof walk" and the last provided leaf. The proof must prove the last element (unless there are no elements).
Therefore, we can simplify the semantics for trie.VerifyRangeProof by removing an argument. This doesn't make much difference in practice, but makes it so that we can remove some tests. The reason I am raising this is that the upcoming stacktrie-based verifier does not support such fancy features as standalone right-hand borders.
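A hedged sketch of the resulting call shape; the exact trie.VerifyRangeProof signature may differ between versions, and `verifyRange` is just a hypothetical wrapper:
```golang
package snap // hypothetical package

import (
	"github.com/ethereum/go-ethereum/common"
	"github.com/ethereum/go-ethereum/ethdb"
	"github.com/ethereum/go-ethereum/trie"
)

// verifyRange: the left proof edge is anchored at the request origin
// (not at the first returned key), while the right edge must prove the
// last returned element itself, so no separate lastKey argument is
// needed anymore.
func verifyRange(root common.Hash, origin []byte, keys, vals [][]byte, proof ethdb.KeyValueReader) (bool, error) {
	// The boolean return reports whether more entries exist to the
	// right of the verified range.
	return trie.VerifyRangeProof(root, origin, keys, vals, proof)
}
```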
This change addresses an issue in snap sync, specifically that the entire sync process can be halted by an encountered empty storage range.
Currently, on the snap sync client side, the response to an empty (partial) storage range is discarded as a non-delivery. However, it can be a valid response when the particular range requested does not contain any slots.
For instance, consider a large contract where the entire key space is divided into 16 chunks, and there are no available slots in the last chunk [0xf] -> [end]. When the node receives a request for this particular range, the response includes:
- the proof with origin [0xf]
- a nil storage slot set
If we simply discard this response, the finalization of the last range will be skipped, halting the entire sync process indefinitely. The test case TestSyncWithUnevenStorage can reproduce the scenario described above.
In addition, this change also defines the common variables MaxAddress and MaxHash.
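A hedged sketch of accepting such a response on the client side; `acceptStorageRange` and `finalizeChunk` are hypothetical helpers, not the actual sync code:
```golang
package snap // hypothetical package

import (
	"github.com/ethereum/go-ethereum/common"
	"github.com/ethereum/go-ethereum/ethdb"
	"github.com/ethereum/go-ethereum/trie"
)

// acceptStorageRange shows how an empty response can still be valid:
// the proof must then show that no slots exist in [origin, common.MaxHash].
func acceptStorageRange(root, origin common.Hash, keys, vals [][]byte, proof ethdb.KeyValueReader, finalizeChunk func()) error {
	_, err := trie.VerifyRangeProof(root, origin.Bytes(), keys, vals, proof)
	if err != nil {
		return err // bad proof: genuinely reject the delivery
	}
	if len(keys) == 0 {
		// Valid proof of an empty range: finalize this chunk instead of
		// discarding the response, which would stall the sync forever.
		finalizeChunk()
	}
	return nil
}
```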
This is a minor refactor in preparation for changes to the range verifier. This PR contains no intentional functional changes, but moves (and renames) the light.NodeSet type.
This PR makes the tool use the --bootnodes list as the input to devp2p crawl.
The flag will take effect if the input/output.json file is missing or empty.
* core/forkid: skip genesis forks by time
* core/forkid: add comment about skipping non-zero fork times
* core/forkid: skip all time based forks in genesis using loop
* core/forkid: simplify logic for dropping time-based forks
This changes the forkID calculation to ignore time-based forks that occurred before the
genesis block. It's supposed to be done this way because the spec says:
> If a chain is configured to start with a non-Frontier ruleset already in its genesis, that is NOT considered a fork.
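A minimal sketch of the simplified logic, with illustrative timestamps:
```golang
package main

import "fmt"

// skipGenesisTimeForks drops time-based forks that are already active
// at genesis: per the spec quoted above, they are not forks.
func skipGenesisTimeForks(genesisTime uint64, forksByTime []uint64) []uint64 {
	for len(forksByTime) > 0 && forksByTime[0] <= genesisTime {
		forksByTime = forksByTime[1:]
	}
	return forksByTime
}

func main() {
	// A fork at 1600000000 is already active at a genesis of 1700000000:
	fmt.Println(skipGenesisTimeForks(1700000000, []uint64{1600000000, 1800000000}))
	// -> [1800000000]
}
```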
The Go authors updated golang.org/x/exp to change the function signature of the slices sort method.
It's an entire shitshow now because x/exp is not tagged, so everyone's codebase just
picked a new version that some other dep depends on, causing our code to fail building.
This PR updates the dep in our code too and does all the refactorings needed to follow upstream...
The clean trie cache is persisted periodically, so Geth can
quickly warm up the cache on the next restart.
However, this reduces the robustness of the system. Geth holds the
assumption that if a parent trie node is present, then the entire
sub-trie associated with it is also present.
Imagine the scenario that Geth rewinds itself to a past block and
restarts, but finds the root node of a "future state" in the clean
cache and regards that state as present on disk, while in fact it is not.
Another example is the offline pruning tool. Whenever offline pruning
is performed, the clean cache file has to be removed to avoid hitting
the root node of "deleted states" in the clean cache.
All in all, compared with the minor performance gain, system robustness
is something we care about more.
Follow-up to #26697, this makes the crawler less verbose in route53-based scenarios.
It also changes the log level from debug to info for updates, which typically concern the root and can be interesting to see.