docs: FAQ fixes and some info about snap sync (#24812)

* docs: update FAQ.md, fixed spelling and grammar

* docs: updated FAQ.md with info about snap sync

* turn qs into headings

* faq: add toc

* faq: minor

* faq: fixes

Co-authored-by: Sina Mahmoodi <itz.s1na@gmail.com>
pull/24831/head
Aviv 3 years ago committed by GitHub
parent f19b3ccdc0
commit ecab0a497c
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
  1. 68
      docs/_interface/FAQ.md

@ -4,65 +4,71 @@ permalink: docs/faq
sort_key: C sort_key: C
--- ---
**Q.** I noticed my peercount slowly decrease, and now it is at 0. Restarting doesn't get any peers. * TOC
{:toc}
**A.** Check and sync your clock with ntp. [Example](https://askubuntu.com/questions/254826/how-to-force-a-clock-update-using-ntp) `sudo ntpdate -s time.nist.gov` #### I noticed my peercount slowly decreasing, and now it is at 0. Restarting doesn't get any peers.
--- Check and sync your clock with ntp. For example, you can [force a clock update using ntp](https://askubuntu.com/questions/254826/how-to-force-a-clock-update-using-ntp) like so:
**Q.** I would like to run multiple geth instances but got the error "Fatal: blockchain db err: resource temporarily unavailable". ```sh
sudo ntpdate -s time.nist.gov
```
**A.** Geth uses a datadir to store the blockchain, accounts and some additional information. This directory cannot be shared between running instances. If you would like to run multiple instances follow [these](getting-started/private-net) instructions. #### I would like to run multiple geth instances but got the error "Fatal: blockchain db err: resource temporarily unavailable".
--- Geth uses a datadir to store the blockchain, accounts and some additional information. This directory cannot be shared between running instances. If you would like to run multiple instances follow [these](getting-started/private-net) instructions.
**Q.** How do Ethereum syncing work? #### When I try to use the --password command line flag, I get the error "Could not decrypt key with given passphrase" but the password is correct. Why does this error appear?
**A.** The current default mode of sync for Geth is called fast sync. Instead of starting from the genesis block and reprocessing all the transactions that ever occurred (which could take weeks), fast sync downloads the blocks, and only verifies the associated proof-of-works. Downloading all the blocks is a straightforward and fast procedure and will relatively quickly reassemble the entire chain. Especially if the password file was created on Windows, it may have a Byte Order Mark or other special encoding that the go-ethereum client doesn't currently recognize. You can change this behavior with a PowerShell command like:
Many people falsely assume that because they have the blocks, they are in sync. Unfortunately this is not the case, since no transaction was executed, so we do not have any account state available (ie. balances, nonces, smart contract code and data). These need to be downloaded separately and cross checked with the latest blocks. This phase is called the state trie download and it actually runs concurrently with the block downloads; alas it take a lot longer nowadays than downloading the blocks. ```sh
echo "mypasswordhere" | out-file test.txt -encoding ASCII
```
So, what's the state trie? In the Ethereum mainnet, there are a ton of accounts already, which track the balance, nonce, etc of each user/contract. The accounts themselves are however insufficient to run a node, they need to be cryptographically linked to each block so that nodes can actually verify that the account's are not tampered with. This cryptographic linking is done by creating a tree data structure above the accounts, each level aggregating the layer below it into an ever smaller layer, until you reach the single root. This gigantic data structure containing all the accounts and the intermediate cryptographic proofs is called the state trie. Additional details and/or any updates on more robust handling are at <https://github.com/ethereum/go-ethereum/issues/19905>.
Ok, so why does this pose a problem? This trie data structure is an intricate interlink of hundreds of millions of tiny cryptographic proofs (trie nodes). To truly have a synchronized node, you need to download all the account data, as well as all the tiny cryptographic proofs to verify that no one in the network is trying to cheat you. This itself is already a crazy number of data items. The part where it gets even messier is that this data is constantly morphing: at every block (15s), about 1000 nodes are deleted from this trie and about 2000 new ones are added. This means your node needs to synchronize a dataset that is changing 200 times per second. The worst part is that while you are synchronizing, the network is moving forward, and state that you begun to download might disappear while you're downloading, so your node needs to constantly follow the network while trying to gather all the recent data. But until you actually do gather all the data, your local node is not usable since it cannot cryptographically prove anything about any accounts. #### How does Ethereum syncing work?
If you see that you are 64 blocks behind mainnet, you aren't yet synchronized, not even close. You are just done with the block download phase and still running the state downloads. You can see this yourself via the seemingly endless `Imported state entries [...]` stream of logs. You'll need to wait that out too before your node comes truly online. The current default syncing mode used by Geth is called [snap sync](https://github.com/ethereum/devp2p/blob/master/caps/snap.md). Instead of starting from the genesis block and processing all the transactions that ever occurred (which could take weeks), snap sync downloads the blocks, and only verifies the associated proof-of-works. Downloading all the blocks is a straightforward and fast procedure and will relatively quickly reassemble the entire chain.
--- Many people falsely assume that because they have the blocks, they are in sync. Unfortunately this is not the case. Since no transaction was executed, so we do not have any account state available (ie. balances, nonces, smart contract code and data). These need to be downloaded separately and cross-checked with the latest blocks. This phase is called the state trie download phase. Snap sync tries to hasten this process by downloading contiguous chunks of useful state data, instead of doing so one-by-one, as in previous synchronization methods.
**Q: The node just hangs on importing state enties?!** #### So, what's the state trie?
**A**: The node doesn't hang, it just doesn't know how large the state trie is in advance so it keeps on going and going and going until it discovers and downloads the entire thing. In the Ethereum mainnet, there are a ton of accounts already, which track the balance, nonce, etc of each user/contract. The accounts themselves are however insufficient to run a node, they need to be cryptographically linked to each block so that nodes can actually verify that the accounts are not tampered with.
The reason is that a block in Ethereum only contains the state root, a single hash of the root node. When the node begins synchronizing, it knows about exactly 1 node and tries to download it. That node, can refer up to 16 new nodes, so in the next step, we'll know about 16 new nodes and try to download those. As we go along the download, most of the nodes will reference new ones that we didn't know about until then. This is why you might be tempted to think it's stuck on the same numbers. It is not, rather it's discovering and downloading the trie as it goes along. This cryptographic linking is done by creating a tree-like data structure, where each leaf corresponds to an account, and each intermediary level aggregates the layer below it into an ever smaller layer, until you reach a single root. This gigantic data structure containing all the accounts and the intermediate cryptographic proofs is called the state trie.
--- #### Why does the state trie download phase require a special syncing mode?
**Q: I'm stuck at 64 blocks behind mainnet?!** The trie data structure is an intricate interlink of hundreds of millions of tiny cryptographic proofs (trie nodes). To truly have a synchronized node, you need to download all the account data, as well as all the tiny cryptographic proofs to verify that no one in the network is trying to cheat you. This itself is already a crazy number of data items.
*A:* As explained above, you are not stuck, just finished with the block download phase, waiting for the state download phase to complete too. This latter phase nowadays take a lot longer than just getting the blocks. The part where it gets even messier is that this data is constantly morphing: at every block (roughly 13s), about 1000 nodes are deleted from this trie and about 2000 new ones are added. This means your node needs to synchronize a dataset that is changing more than 200 times per second. Until you actually do gather all the data, your local node is not usable since it cannot cryptographically prove anything about any accounts. But while you're syncing the network is moving forward and most nodes on the network keep the state for only a limited number of recent blocks. Any sync algorithm needs to consider this fact.
--- #### What happened to fast sync?
**Q: Why does downloading the state take so long, I have good bandwidth?** Snap syncing was introduced by version [1.10.0](https://blog.ethereum.org/2021/03/03/geth-v1-10-0/) and was adopted as the default mode in version [1.10.4](https://github.com/ethereum/go-ethereum/releases/tag/v1.10.4). Before that, the default was the "fast" syncing mode, which was dropped in version [1.10.14](https://github.com/ethereum/go-ethereum/releases/tag/v1.10.14). Even though support for fast sync was dropped, Geth still serves the relevant `eth` requests to other client implementations still relying on it. The reason being that snap sync relies on an alternative data structure called the [snapshot](https://blog.ethereum.org/2020/07/17/ask-about-geth-snapshot-acceleration/) which not all clients implement.
**A:** State sync is mostly limited by disk IO, not bandwidth. You can read more in the article posted above why snap sync replaced fast sync in Geth. Below is a table taken from the article summarising the benefits:
The state trie in Ethereum contains hundreds of millions of nodes, most of which take the form of a single hash referencing up to 16 other hashes. This is a horrible way to store data on a disk, because there's almost no structure in it, just random numbers referencing even more random numbers. This makes any underlying database weep, as it cannot optimize storing and looking up the data in any meaningful way. ![snap-fast](https://user-images.githubusercontent.com/129561/109820169-6ee0af00-7c3d-11eb-9721-d8484eee4709.png)
Not only is storing the data very suboptimal, but due to the 200 modification / second and pruning of past data, we cannot even download it is a properly pre-processed way to make it import faster without the underlying database shuffling it around too much. The end result is that even a fast sync nowadays incurs a huge disk IO cost, which is too much for a mechanical hard drive. #### When doing a fast sync, the node just hangs on importing state enties?!
--- The node doesn’t hang, it just doesn’t know how large the state trie is in advance so it keeps on going and going and going until it discovers and downloads the entire thing.
**Q: Wait, so I can't run a full node on an HDD?** The reason is that a block in Ethereum only contains the state root, a single hash of the root node. When the node begins synchronizing, it knows about exactly 1 node and tries to download it. That node, can refer up to 16 new nodes, so in the next step, we’ll know about 16 new nodes and try to download those. As we go along the download, most of the nodes will reference new ones that we didn’t know about until then. This is why you might be tempted to think it’s stuck on the same numbers. It is not, rather it’s discovering and downloading the trie as it goes along.
**A:** Unfortunately not. Doing a fast sync on an HDD will take more time than you're willing to wait with the current data schema. Even if you do wait it out, an HDD will not be able to keep up with the read/write requirements of transaction processing on mainnet. During this phase you might see that your node is 64 blocks behind mainnet. You aren't actually synchronized. That's a side-effect of how fast sync works and you need to wait out until all state entries are downloaded.
You however should be able to run a light client on an HDD with minimal impact on system resources. If you wish to run a full node however, an SSD is your only option. #### I have good bandwidth, so why does downloading the state take so long when using fast sync?
State sync is mostly limited by disk IO, not bandwidth.
--- The state trie in Ethereum contains hundreds of millions of nodes, most of which take the form of a single hash referencing up to 16 other hashes. This is a horrible way to store data on a disk, because there's almost no structure in it, just random numbers referencing even more random numbers. This makes any underlying database weep, as it cannot optimize storing and looking up the data in any meaningful way. Snap sync solves this issue by adopting the Snapshot data structure.
**Q: When I try to use the --password command line flag, I get the error "Could not decrypt key with given passphrase" but the password is correct. Why does this error appear?** #### Wait, so I can't use fast sync on an HDD?
**A:** Especially if the password file was created on Windows, it may have a Byte Order Mark or other special encoding that the go-ethereum client doesn't currently recognize. You can change this behavior with a PowerShell command like `echo "mypasswordhere" | out-file test.txt -encoding ASCII`. Additional details and/or any updates on more robust handling are at <https://github.com/ethereum/go-ethereum/issues/19905>. Doing a "fast" sync on an HDD will take more time than you're willing to wait, because the data structures used are not optimized for HDDs. Even if you do wait it out, an HDD will not be able to keep up with the read/write requirements of transaction processing on mainnet. You however should be able to run a light client on an HDD with minimal impact on system resources.

Loading…
Cancel
Save