docs: update EVM tracing docs (#25242)

Improved tracing docs. Added section about native tracing.

Co-authored-by: Sina Mahmoodi <itz.s1na@gmail.com>
2 years ago · 69424d6bcd
parent 8d276f51bf
commit 69424d6bcd
3 changed files with 596 additions and 448 deletions
--- a/docs/_dapp/custom-tracer.md
+++ b/docs/_dapp/custom-tracer.md
@@ -0,0 +1,458 @@
 ---
 title: Custom EVM tracer
 sort_key: B
 ---
 In addition to the default opcode tracer and the built-in tracers, Geth lets users write custom code
 that hooks into EVM events to process the data and return it in a consumable format. Custom tracers can be
 written in either JavaScript or Go. JS tracers are good for quick prototyping and experimentation as well as for
 less intensive applications. Go tracers are more performant but must be compiled together with the Geth source code.
 * TOC
 {:toc}
 ## Custom JavaScript tracing
 Transaction traces include the complete state of the EVM at every point during the transaction execution, which
 can be a very large amount of data. Often, users are only interested in a small subset of that data. JavaScript trace
 filters are available to isolate the useful information. Detailed information about `debug_traceTransaction` and its
 component parts is available in the [reference documentation](/docs/rpc/ns-debug#debug_tracetransaction).
 ### A simple filter
 Filters are JavaScript functions that decide, based on some conditions, which information from the trace to
 keep and which to discard. The following JavaScript function returns only the sequence of opcodes executed by the
 transaction as a comma-separated list. The function could be written directly in the JavaScript console, but it is
 cleaner to write it in a separate reusable file and load it into the console.
 1. Create a file, `filterTrace_1.js`, with this content:
   ```javascript
   tracer = function(tx) {
      return debug.traceTransaction(tx, {tracer:
         '{' +
            'retVal: [],' +
            'step: function(log,db) {this.retVal.push(log.getPC() + ":" + log.op.toString())},' +
            'fault: function(log,db) {this.retVal.push("FAULT: " + JSON.stringify(log))},' +
            'result: function(ctx,db) {return this.retVal}' +
         '}'
      }) // return debug.traceTransaction ...
   }   // tracer = function ...
   ```
 2. Run the [JavaScript console](https://geth.ethereum.org/docs/interface/javascript-console).
 3. Get the hash of a recent transaction from a node or block explorer.
 4. Run this command to load the script:
   ```javascript
   loadScript("filterTrace_1.js")
   ```
 5. Run the tracer from the script. Be patient, it could take a long time.
   ```javascript
   tracer("<hash of transaction>")
   ```
   The bottom of the output looks similar to:
   ```sh
   "3366:POP", "3367:JUMP", "1355:JUMPDEST", "1356:PUSH1", "1358:MLOAD", "1359:DUP1", "1360:DUP3", "1361:ISZERO", "1362:ISZERO",
   "1363:ISZERO", "1364:ISZERO", "1365:DUP2", "1366:MSTORE", "1367:PUSH1", "1369:ADD", "1370:SWAP2", "1371:POP", "1372:POP", "1373:PUSH1",
   "1375:MLOAD", "1376:DUP1", "1377:SWAP2", "1378:SUB", "1379:SWAP1", "1380:RETURN"
   ```
 6. Run this line to get more readable output, with each string on its own line.
   ```javascript
   console.log(JSON.stringify(tracer("<hash of transaction>"), null, 2))
   ```
 More information about the `JSON.stringify` function is available
 [here](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/JSON/stringify). 
 The commands above work by calling the same `debug.traceTransaction` function that was previously
 explained in [basic traces](https://geth.ethereum.org/docs/dapp/tracing), but with a new parameter, `tracer`.
 This parameter takes the JavaScript object formatted as a string. In the case of the trace above, it is:
 ```javascript
 {
   retVal: [],
   step: function(log,db) {this.retVal.push(log.getPC() + ":" + log.op.toString())},
   fault: function(log,db) {this.retVal.push("FAULT: " + JSON.stringify(log))},
   result: function(ctx,db) {return this.retVal}
 }
 ```
 This object has three member functions:
 - `step`, called for each opcode.
 - `fault`, called if there is a problem in the execution.
 - `result`, called to produce the results that are returned by `debug.traceTransaction` after the execution is done.
 In this case, `retVal` is used to store the list of strings to return in `result`.
 The `step` function adds to `retVal` the program counter and the name of the opcode there. Then, in `result`, this
 list is returned to be sent to the caller.
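 Because the tracer is an ordinary JavaScript object, its callbacks can be exercised outside Geth before the object is serialized to a string. The sketch below is only an illustration: `mockLog` is a hypothetical stand-in for the `log` object that Geth passes to `step` (it is not a Geth API), and the three opcodes are invented sample data rather than a real trace.
 ```javascript
 // The tracer object from above, written as a plain object instead of a string.
 var tracer = {
   retVal: [],
   step: function (log, db) { this.retVal.push(log.getPC() + ":" + log.op.toString()); },
   fault: function (log, db) { this.retVal.push("FAULT: " + JSON.stringify(log)); },
   result: function (ctx, db) { return this.retVal; }
 };

 // Hypothetical mock of the log object (not a Geth API): only the two
 // accessors used by step() are provided.
 function mockLog(pc, opName) {
   return {
     getPC: function () { return pc; },
     op: { toString: function () { return opName; } }
   };
 }

 // Invented sample steps, fed to the tracer in execution order.
 var sampleSteps = [mockLog(0, "PUSH1"), mockLog(2, "PUSH1"), mockLog(4, "MSTORE")];
 for (var i = 0; i < sampleSteps.length; i++) {
   tracer.step(sampleSteps[i], null);
 }

 // result() hands back whatever step() accumulated.
 var opcodeList = tracer.result({}, null);
 console.log(opcodeList.join(", "));
 ```
 Running the callbacks this way is a quick check that the object is well formed before it is passed to `debug.traceTransaction`.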
 ### Filtering with conditions
 For actual filtered tracing we need an `if` statement to only log relevant information. For example, to isolate
 the transaction's interaction with storage, the following tracer could be used:
 ```javascript
 tracer = function(tx) {
      return debug.traceTransaction(tx, {tracer:
      '{' +
         'retVal: [],' +
         'step: function(log,db) {' +
         '   if(log.op.toNumber() == 0x54) ' +
         '     this.retVal.push(log.getPC() + ": SLOAD");' +
         '   if(log.op.toNumber() == 0x55) ' +
         '     this.retVal.push(log.getPC() + ": SSTORE");' +
         '},' +
         'fault: function(log,db) {this.retVal.push("FAULT: " + JSON.stringify(log))},' +
         'result: function(ctx,db) {return this.retVal}' +
      '}'
      }) // return debug.traceTransaction ...
 }   // tracer = function ...
 ```
 The `step` function here looks at the opcode number of the op, and only pushes an entry if the opcode is
 `SLOAD` or `SSTORE` ([here is a list of EVM opcodes and their numbers](https://github.com/wolflo/evm-opcodes)).
 We could have used `log.op.toString()` instead, but it is faster to compare numbers rather than strings.
 The output looks similar to this:
 ```javascript
 [
  "5921: SLOAD",
  .
  .
  .
  "2413: SSTORE",
  "2420: SLOAD",
  "2475: SSTORE",
  "6094: SSTORE"
 ]
 ```
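 The two `if` statements can also be expressed as a lookup table keyed by opcode number, which may stay more readable as further opcodes are added. This is a sketch under assumptions rather than part of Geth: `WATCHED` and `mockLog` are hypothetical names, and the sample steps are invented.
 ```javascript
 // Opcode numbers of interest (0x54 = SLOAD, 0x55 = SSTORE), as in the text.
 var WATCHED = { 0x54: "SLOAD", 0x55: "SSTORE" };

 var storageTracer = {
   retVal: [],
   step: function (log, db) {
     var name = WATCHED[log.op.toNumber()];
     // Only record opcodes that are present in the lookup table.
     if (name !== undefined) {
       this.retVal.push(log.getPC() + ": " + name);
     }
   },
   fault: function (log, db) { this.retVal.push("FAULT: " + JSON.stringify(log)); },
   result: function (ctx, db) { return this.retVal; }
 };

 // Hypothetical mock of the log object passed to step() (not a Geth API).
 function mockLog(pc, opNumber) {
   return {
     getPC: function () { return pc; },
     op: { toNumber: function () { return opNumber; } }
   };
 }

 // Invented sample: PUSH1 (0x60), SLOAD (0x54), ADD (0x01), SSTORE (0x55).
 var sample = [mockLog(0, 0x60), mockLog(2, 0x54), mockLog(3, 0x01), mockLog(4, 0x55)];
 sample.forEach(function (log) { storageTracer.step(log, null); });

 var storageOps = storageTracer.result({}, null);
 console.log(storageOps);
 ```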
 ### Stack Information
 The trace above reports the program counter (PC) and whether the program read from storage or wrote to it. 
 That alone isn't particularly useful. To know more, the `log.stack.peek` function can be used to peek 
 into the stack. `log.stack.peek(0)` is the stack top, `log.stack.peek(1)` the entry below it, etc.
 The values returned by `log.stack.peek` are Go `big.Int` objects. By default they are converted to JavaScript 
 floating point numbers, so you need `toString(16)` to get them as hexadecimals, which is how 256-bit values such as
 storage cells and their content are normally represented.
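 The reason for preferring hexadecimal strings can be seen outside the tracer. The sketch below runs in Node.js and uses `BigInt`, which is assumed here only for demonstration and is not assumed to be available inside a Geth tracer. It takes a storage slot from the sample output in this guide and shows that a floating point conversion cannot preserve all 256 bits.
 ```javascript
 // A 256-bit storage slot taken from the sample output in this guide.
 var slotHex = "3f0af0a7a3ed17f5ba6a93e0a2a05e766ed67bf82195d2dd15feead3749a575d";
 var slot = BigInt("0x" + slotHex);

 // A floating point number keeps only about 53 significant bits,
 // so the low-order bits of the slot are lost in the conversion.
 var asFloat = Number(slot);
 var roundTripped = BigInt(asFloat);

 // The hexadecimal string is exact; the float round trip is not.
 console.log("hex is exact:", slot.toString(16) === slotHex);
 console.log("float round trip is exact:", roundTripped === slot);
 ```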
 #### Storage Information
 The function below provides a trace of all the storage operations and their parameters. This gives
 a more complete picture of the program's interaction with storage. 
 ```javascript
 tracer = function(tx) {
      return debug.traceTransaction(tx, {tracer:
      '{' +
         'retVal: [],' +
         'step: function(log,db) {' +
         '   if(log.op.toNumber() == 0x54) ' +
         '     this.retVal.push(log.getPC() + ": SLOAD " + ' +
         '        log.stack.peek(0).toString(16));' +
         '   if(log.op.toNumber() == 0x55) ' +
         '     this.retVal.push(log.getPC() + ": SSTORE " +' +
         '        log.stack.peek(0).toString(16) + " <- " +' +
         '        log.stack.peek(1).toString(16));' +
         '},' +
         'fault: function(log,db) {this.retVal.push("FAULT: " + JSON.stringify(log))},' +
         'result: function(ctx,db) {return this.retVal}' +
      '}'
      }) // return debug.traceTransaction ...
 }   // tracer = function ...
 ```
 The output is similar to:
 ```javascript
 [
  "5921: SLOAD 0",
  .
  .
  .
  "2413: SSTORE 3f0af0a7a3ed17f5ba6a93e0a2a05e766ed67bf82195d2dd15feead3749a575d <- fb8629ad13d9a12456",
  "2420: SLOAD cc39b177dd3a7f50d4c09527584048378a692aed24d31d2eabeddb7f3c041870",
  "2475: SSTORE cc39b177dd3a7f50d4c09527584048378a692aed24d31d2eabeddb7f3c041870 <- 358c3de691bd19",
  "6094: SSTORE 0 <- 1"
 ]
 ```
 #### Operation Results
 One piece of information missing from the function above is the result of an `SLOAD` operation. The
 state we get inside `log` is the state prior to the execution of the opcode, so that value is not
 known yet. For most operations we can figure it out for ourselves, but we don't have access to the
 storage, so here we can't.
 The solution is to have a flag, `afterSload`, which is only true during the step right after an
 `SLOAD`, when we can see the result at the top of the stack.
 ```javascript
 tracer = function(tx) {
      return debug.traceTransaction(tx, {tracer:
      '{' +
         'retVal: [],' +
         'afterSload: false,' +
         'step: function(log,db) {' +
         '   if(this.afterSload) {' +
         '     this.retVal.push("    Result: " + ' +
         '          log.stack.peek(0).toString(16)); ' +
         '     this.afterSload = false; ' +
         '   } ' +
         '   if(log.op.toNumber() == 0x54) {' +
         '     this.retVal.push(log.getPC() + ": SLOAD " + ' +
         '        log.stack.peek(0).toString(16));' +
         '        this.afterSload = true; ' +
         '   } ' +
         '   if(log.op.toNumber() == 0x55) ' +
         '     this.retVal.push(log.getPC() + ": SSTORE " +' +
         '        log.stack.peek(0).toString(16) + " <- " +' +
         '        log.stack.peek(1).toString(16));' +
         '},' +
         'fault: function(log,db) {this.retVal.push("FAULT: " + JSON.stringify(log))},' +
         'result: function(ctx,db) {return this.retVal}' +
      '}'
      }) // return debug.traceTransaction ...
 }   // tracer = function ...
 ```
 The output now contains the result in the line that follows the `SLOAD`. 
 ```javascript
 [
  "5921: SLOAD 0",
  "    Result: 1",
  .
  .
  .
  "2413: SSTORE 3f0af0a7a3ed17f5ba6a93e0a2a05e766ed67bf82195d2dd15feead3749a575d <- fb8629ad13d9a12456",
  "2420: SLOAD cc39b177dd3a7f50d4c09527584048378a692aed24d31d2eabeddb7f3c041870",
  "    Result: 0",
  "2475: SSTORE cc39b177dd3a7f50d4c09527584048378a692aed24d31d2eabeddb7f3c041870 <- 358c3de691bd19",
  "6094: SSTORE 0 <- 1"
 ]
 ```
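 The one-step delay can be checked offline with a hand-made sequence of steps. This is a minimal sketch, assuming a hypothetical `mockLog` helper (not a Geth API) whose `stack` argument lists the stack contents before the opcode runs, top first. The two steps are invented: an `SLOAD` of slot 0, followed by a `POP` (0x50) that sees the loaded value 1 on top of the stack.
 ```javascript
 var resultTracer = {
   retVal: [],
   afterSload: false,
   step: function (log, db) {
     // First report the result of an SLOAD executed in the previous step.
     if (this.afterSload) {
       this.retVal.push("    Result: " + log.stack.peek(0).toString(16));
       this.afterSload = false;
     }
     // Then record the SLOAD itself and arm the flag for the next step.
     if (log.op.toNumber() == 0x54) {
       this.retVal.push(log.getPC() + ": SLOAD " + log.stack.peek(0).toString(16));
       this.afterSload = true;
     }
   },
   fault: function (log, db) { this.retVal.push("FAULT: " + JSON.stringify(log)); },
   result: function (ctx, db) { return this.retVal; }
 };

 // Hypothetical mock of the log object; stack[0] is the stack top.
 function mockLog(pc, opNumber, stack) {
   return {
     getPC: function () { return pc; },
     op: { toNumber: function () { return opNumber; } },
     stack: { peek: function (i) { return stack[i]; } }
   };
 }

 // Invented sample: SLOAD of slot 0, then POP with the loaded value 1 on top.
 resultTracer.step(mockLog(10, 0x54, [0]), null);
 resultTracer.step(mockLog(11, 0x50, [1]), null);

 var sloadTrace = resultTracer.result({}, null);
 console.log(sloadTrace);
 ```
 The order of the two checks matters: testing `afterSload` before testing for `SLOAD` ensures the result is read on the following step rather than on the `SLOAD` step itself.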
 ### Dealing With Calls Between Contracts
 So far the storage has been treated as if there are only 2<sup>256</sup> cells. However, that is not true.
 Contracts can call other contracts, and then the storage involved is the storage of the other contract.
 We can see the address of the current contract in `log.contract.getAddress()`. This value is the execution
 context - the contract whose storage we are using - even when code from another contract is executed (by using
 [`CALLCODE` or `DELEGATECALL`][solidity-delcall]).
 However, `log.contract.getAddress()` returns an array of bytes. To convert this to the familiar hexadecimal
 representation of Ethereum addresses, `this.byte2Hex()` and `this.array2Hex()` can be used.
 ```javascript
 tracer = function(tx) {
      return debug.traceTransaction(tx, {tracer:
      '{' +
         'retVal: [],' +
         'afterSload: false,' +
         'callStack: [],' +
         'byte2Hex: function(byte) {' +
         '  if (byte < 0x10) ' +
         '      return "0" + byte.toString(16); ' +
         '  return byte.toString(16); ' +
         '},' +
         'array2Hex: function(arr) {' +
         '  var retVal = ""; ' +
         '  for (var i=0; i<arr.length; i++) ' +
         '    retVal += this.byte2Hex(arr[i]); ' +
         '  return retVal; ' +
         '}, ' +
         'getAddr: function(log) {' +
         '  return this.array2Hex(log.contract.getAddress());' +
         '}, ' +
         'step: function(log,db) {' +
         '   var opcode = log.op.toNumber();' +
         // SLOAD Result (checked first, so it is read on the step after the SLOAD)
         '   if (this.afterSload) {' +
         '     this.retVal.push("    Result: " + ' +
         '          log.stack.peek(0).toString(16)); ' +
         '     this.afterSload = false; ' +
         '   } ' +
         // SLOAD
         '   if (opcode == 0x54) {' +
         '     this.retVal.push(log.getPC() + ": SLOAD " + ' +
         '        this.getAddr(log) + ":" + ' +
         '        log.stack.peek(0).toString(16));' +
         '        this.afterSload = true; ' +
         '   } ' +
         // SSTORE
         '   if (opcode == 0x55) ' +
         '     this.retVal.push(log.getPC() + ": SSTORE " +' +
         '        this.getAddr(log) + ":" + ' +
         '        log.stack.peek(0).toString(16) + " <- " +' +
         '        log.stack.peek(1).toString(16));' +
         // End of step
         '},' +
         'fault: function(log,db) {this.retVal.push("FAULT: " + JSON.stringify(log))},' +
         'result: function(ctx,db) {return this.retVal}' +
      '}'
      }) // return debug.traceTransaction ...
 }   // tracer = function ...
 ```
 The output is similar to:
 ```javascript
 [
  "423: SLOAD 22ff293e14f1ec3a09b137e9e06084afd63addf9:360894a13ba1a3210667c828492db98dca3e2076cc3735a920a3ca505d382bbc",
  "    Result: 360894a13ba1a3210667c828492db98dca3e2076cc3735a920a3ca505d382bbc",
  "10778: SLOAD 22ff293e14f1ec3a09b137e9e06084afd63addf9:6",
  "    Result: 6",
  .
  .
  .
  "13529: SLOAD f2d68898557ccb2cf4c10c3ef2b034b2a69dad00:8328de571f86baa080836c50543c740196dbc109d42041802573ba9a13efa340",
  "    Result: 8328de571f86baa080836c50543c740196dbc109d42041802573ba9a13efa340",
  "423: SLOAD f2d68898557ccb2cf4c10c3ef2b034b2a69dad00:360894a13ba1a3210667c828492db98dca3e2076cc3735a920a3ca505d382bbc",
  "    Result: 360894a13ba1a3210667c828492db98dca3e2076cc3735a920a3ca505d382bbc",
  "13529: SLOAD f2d68898557ccb2cf4c10c3ef2b034b2a69dad00:b38558064d8dd9c883d2a8c80c604667ddb90a324bc70b1bac4e70d90b148ed4",
  "    Result: b38558064d8dd9c883d2a8c80c604667ddb90a324bc70b1bac4e70d90b148ed4",
  "11041: SSTORE 22ff293e14f1ec3a09b137e9e06084afd63addf9:6 <- 0"
 ]
 ```
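 The two conversion helpers can be tried on their own. The sketch below is an illustration only: it assumes a plain JavaScript array standing in for the byte array returned by `log.contract.getAddress()`, filled with the bytes of the first contract address from the sample output above.
 ```javascript
 // Convert one byte (0-255) to exactly two hexadecimal digits.
 function byte2Hex(byte) {
   return (byte < 0x10 ? "0" : "") + byte.toString(16);
 }

 // Convert an array of bytes to a hexadecimal string.
 function array2Hex(arr) {
   var hex = "";
   for (var i = 0; i < arr.length; i++) {
     hex += byte2Hex(arr[i]);
   }
   return hex;
 }

 // The first contract address from the sample output, as 20 raw bytes.
 var addressBytes = [
   0x22, 0xff, 0x29, 0x3e, 0x14, 0xf1, 0xec, 0x3a, 0x09, 0xb1,
   0x37, 0xe9, 0xe0, 0x60, 0x84, 0xaf, 0xd6, 0x3a, 0xdd, 0xf9
 ];

 var addressHex = array2Hex(addressBytes);
 console.log(addressHex);
 ```
 The padding in `byte2Hex` matters for bytes below 0x10, such as the 0x09 in this address; without it the resulting string would be shorter than 40 characters.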
 ## Other traces
 This tutorial has focused on `debug_traceTransaction()`, which reports information about individual transactions. There are
 also RPC endpoints that provide different information, including tracing the EVM execution within a block, between two blocks,
 for specific `eth_call`s, or for rejected blocks. The full list of trace functions can be explored in the
 [reference documentation][debug-docs].
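 When these endpoints are called over JSON-RPC rather than from the console, the tracer is still passed as a string inside the configuration object. The following sketch only builds a request body for `debug_traceTransaction` and prints it; `buildTraceRequest` is a hypothetical helper, the transaction hash is a placeholder, and sending the request to a node is left out. Other tracing methods would need their own parameter lists, as described in the reference documentation.
 ```javascript
 // Hypothetical helper: wrap a method name and parameters in a JSON-RPC 2.0 envelope.
 function buildTraceRequest(method, params) {
   return JSON.stringify({ jsonrpc: "2.0", id: 1, method: method, params: params });
 }

 // Placeholder; substitute the hash of a real transaction.
 var txHash = "<hash of transaction>";

 // The tracer is sent as a string, exactly as in the console examples.
 var opcodeTracer =
   '{' +
   'retVal: [],' +
   'step: function(log,db) {this.retVal.push(log.getPC() + ":" + log.op.toString())},' +
   'fault: function(log,db) {this.retVal.push("FAULT: " + JSON.stringify(log))},' +
   'result: function(ctx,db) {return this.retVal}' +
   '}';

 var requestBody = buildTraceRequest("debug_traceTransaction", [txHash, { tracer: opcodeTracer }]);
 console.log(requestBody);
 ```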
 ## Custom Go tracing
 Custom tracers can also be made more performant by writing them in Go. The gain in performance mostly comes from the fact that Geth doesn't need
 to interpret JS code and can execute native functions. Geth comes with several built-in [native tracers](https://github.com/ethereum/go-ethereum/tree/master/eth/tracers/native) which can serve as examples. Please note that unlike JS tracers, Go tracing scripts cannot be simply passed as an argument to the API. They will need to be added to and compiled with the rest of the Geth source code.
 In this section a simple native tracer that counts the number of opcodes will be covered. First follow the instructions to [clone and build](install-and-build/installing-geth#build-from-source-code) Geth from source code. Next save the following snippet as a `.go` file and add it to `eth/tracers/native`:
 ```go
 package native
 import (
    "encoding/json"
    "math/big"
    "sync/atomic"
    "time"
    "github.com/ethereum/go-ethereum/common"
    "github.com/ethereum/go-ethereum/core/vm"
    "github.com/ethereum/go-ethereum/eth/tracers"
 )
 func init() {
    // This is how Geth will become aware of the tracer and register it under a given name
    register("opcounter", newOpcounter)
 }
 type opcounter struct {
    env       *vm.EVM
    counts    map[string]int // Store opcode counts
    interrupt uint32         // Atomic flag to signal execution interruption
    reason    error          // Textual reason for the interruption
 }
 func newOpcounter(ctx *tracers.Context) tracers.Tracer {
    return &opcounter{counts: make(map[string]int)}
 }
 // CaptureStart implements the EVMLogger interface to initialize the tracing operation.
 func (t *opcounter) CaptureStart(env *vm.EVM, from common.Address, to common.Address, create bool, input []byte, gas uint64, value *big.Int) {
    t.env = env
 }
 // CaptureState implements the EVMLogger interface to trace a single step of VM execution.
 func (t *opcounter) CaptureState(pc uint64, op vm.OpCode, gas, cost uint64, scope *vm.ScopeContext, rData []byte, depth int, err error) {
    // Skip if tracing was interrupted
    if atomic.LoadUint32(&t.interrupt) > 0 {
        t.env.Cancel()
        return
    }
    name := op.String()
    // Missing keys read as zero, so the counter can be incremented directly.
    t.counts[name]++
 }
 // CaptureEnter is called when EVM enters a new scope (via call, create or selfdestruct).
 func (t *opcounter) CaptureEnter(op vm.OpCode, from common.Address, to common.Address, input []byte, gas uint64, value *big.Int) {}
 // CaptureExit is called when EVM exits a scope, even if the scope didn't
 // execute any code.
 func (t *opcounter) CaptureExit(output []byte, gasUsed uint64, err error) {}
 // CaptureFault implements the EVMLogger interface to trace an execution fault.
 func (t *opcounter) CaptureFault(pc uint64, op vm.OpCode, gas, cost uint64, scope *vm.ScopeContext, depth int, err error) {}
 // CaptureEnd is called after the call finishes to finalize the tracing.
 func (t *opcounter) CaptureEnd(output []byte, gasUsed uint64, _ time.Duration, err error) {}
 func (*opcounter) CaptureTxStart(gasLimit uint64) {}
 func (*opcounter) CaptureTxEnd(restGas uint64) {}
 // GetResult returns the JSON-encoded opcode counts, and any
 // error arising from the encoding or forceful termination (via `Stop`).
 func (t *opcounter) GetResult() (json.RawMessage, error) {
    res, err := json.Marshal(t.counts)
    if err != nil {
        return nil, err
    }
    return res, t.reason
 }
 // Stop terminates execution of the tracer at the first opportune moment.
 func (t *opcounter) Stop(err error) {
    t.reason = err
    atomic.StoreUint32(&t.interrupt, 1)
 }
 ```
 As can be seen, every method of the [EVMLogger interface](https://pkg.go.dev/github.com/ethereum/go-ethereum/core/vm#EVMLogger) needs to be implemented (even if empty). Key parts to notice are the `init()` function, which registers the tracer in Geth; the `CaptureState` hook, where the opcode counts are incremented; and `GetResult`, where the result is serialized and delivered. To test the tracer, first compile the source with `make geth`. Then, in the console, invoke it through the usual API methods by passing in the name it was registered under:
 ```console
 > debug.traceTransaction('0x7ae446a7897c056023a8104d254237a8d97783a92900a7b0f7db668a9432f384', { tracer: 'opcounter' })
 {
    ADD: 4,
    AND: 3,
    CALLDATALOAD: 2,
    ...
 }
 ```
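 For comparison, the same counting logic can be written as a JS tracer object, which is a convenient way to prototype before porting to Go. This is a sketch, not the Geth implementation: `mockLog` is a hypothetical stand-in for the `log` object, and the opcode sequence is invented.
 ```javascript
 // JS counterpart of the Go `opcounter` tracer: counts executed opcodes by name.
 var opCountTracer = {
   counts: {},
   step: function (log, db) {
     var name = log.op.toString();
     // Start at zero for opcodes that have not been seen yet.
     this.counts[name] = (this.counts[name] || 0) + 1;
   },
   fault: function (log, db) {},
   result: function (ctx, db) { return this.counts; }
 };

 // Hypothetical mock of the log object (not a Geth API).
 function mockLog(opName) {
   return { op: { toString: function () { return opName; } } };
 }

 // Invented opcode sequence.
 ["PUSH1", "PUSH1", "ADD", "PUSH1", "AND"].forEach(function (name) {
   opCountTracer.step(mockLog(name), null);
 });

 var opCounts = opCountTracer.result({}, null);
 console.log(opCounts);
 ```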
 [solidity-delcall]:https://docs.soliditylang.org/en/v0.8.14/introduction-to-smart-contracts.html#delegatecall-callcode-and-libraries
 [debug-docs]: /docs/rpc/ns-debug
--- a/docs/_dapp/tracing-filtered.md
+++ b/docs/_dapp/tracing-filtered.md
@@ -1,343 +0,0 @@
 ---
 title: Filtered Tracing
 sort_key: B
 ---
 In the previous section you learned how to create a complete trace. However, those traces can include the complete status of the EVM at every point
 in the execution, which is huge. Usually you are only interested in a small subset of this information. To get it, you can specify a JavaScript filter.
 **Note:** The JavaScript interpreter used by Geth is [duktape](https://duktape.org), which is only up to the
 [ECMAScript 5.1 standard](https://262.ecma-international.org/5.1/). This means we cannot use [arrow functions](https://www.w3schools.com/js/js_arrow_function.asp)
 and [template literals](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Template_literals).
 ## Running a Simple Trace
 1. Create a file, `filterTrace_1.js`, with this content:
   ```javascript
   tracer = function(tx) {
      return debug.traceTransaction(tx, {tracer:
         '{' +
            'retVal: [],' +
            'step: function(log,db) {this.retVal.push(log.getPC() + ":" + log.op.toString())},' +
            'fault: function(log,db) {this.retVal.push("FAULT: " + JSON.stringify(log))},' +
            'result: function(ctx,db) {return this.retVal}' +
         '}'
      }) // return debug.traceTransaction ...
   }   // tracer = function ...
   ```
   We could specify this function directly in the JavaScript console, but it would be unwieldy and difficult
   to edit.
 2. Run the [JavaScript console](https://geth.ethereum.org/docs/interface/javascript-console).
 3. Get the hash of a recent transaction. For example, if you use the Goerli network, you can get such a value
   [here](https://goerli.etherscan.io/).
 4. Run this command to run the script:
   ```javascript
   loadScript("filterTrace_1.js")
   ```
 5. Run the tracer from the script. Be patient, it could take a long time.
   ```javascript
   tracer("<hash of transaction>")
   ```
   The bottom of the output looks similar to:
   ```json
   "3366:POP", "3367:JUMP", "1355:JUMPDEST", "1356:PUSH1", "1358:MLOAD", "1359:DUP1", "1360:DUP3", "1361:ISZERO", "1362:ISZERO",
   "1363:ISZERO", "1364:ISZERO", "1365:DUP2", "1366:MSTORE", "1367:PUSH1", "1369:ADD", "1370:SWAP2", "1371:POP", "1372:POP", "1373:PUSH1",
   "1375:MLOAD", "1376:DUP1", "1377:SWAP2", "1378:SUB", "1379:SWAP1", "1380:RETURN"]
   ```
 6. This output isn't very readable. Run this line to get a more readable output with each string in its own line.
   ```javascript
   console.log(JSON.stringify(tracer("<hash of transaction>"), null, 2))
   ```
   You can read about the `JSON.stringify` function
   [here](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/JSON/stringify). If we just
   return the output we get `\n` for newlines, which is why we need to use `console.log`.
 ### How Does It Work?
 We call the same `debug.traceTransaction` function we use for [basic traces](https://geth.ethereum.org/docs/dapp/tracing), but
 with a new parameter, `tracer`. This parameter is a string that is the JavaScript object we use. In the case of the trace
 above, it is:
 ```javascript
 {
   retVal: [],
   step: function(log,db) {this.retVal.push(log.getPC() + ":" + log.op.toString())},
   fault: function(log,db) {this.retVal.push("FAULT: " + JSON.stringify(log))},
   result: function(ctx,db) {return this.retVal}
 }
 ```
 This object has to have three member functions:
 - `step`, called for each opcode
 - `fault`, called if there is a problem in the execution
 - `result`, called to produce the results that are returned by `debug.traceTransaction` after the execution is done
 It can have additional members. In this case, we use `retVal` to store the list of strings that we'll return in `result`.
 The `step` function here adds to `retVal` the program counter and the name of the opcode there. Then, in `result`, we return this
 list to be sent to the caller.
 ## Actual Filtering
 For actual filtered tracing we need an `if` statement to only log relevant information. For example, if we are interested in
 the transaction's interaction with storage, we might use:
 ```javascript
 tracer = function(tx) {
      return debug.traceTransaction(tx, {tracer:
      '{' +
         'retVal: [],' +
         'step: function(log,db) {' +
         '   if(log.op.toNumber() == 0x54) ' +
         '     this.retVal.push(log.getPC() + ": SLOAD");' +
         '   if(log.op.toNumber() == 0x55) ' +
         '     this.retVal.push(log.getPC() + ": SSTORE");' +
         '},' +
         'fault: function(log,db) {this.retVal.push("FAULT: " + JSON.stringify(log))},' +
         'result: function(ctx,db) {return this.retVal}' +
      '}'
      }) // return debug.traceTransaction ...
 }   // tracer = function ...
 ```
 The `step` function here looks at the opcode number of the op, and only pushes an entry if the opcode is
 `SLOAD` or `SSTORE` ([here is a list of EVM opcodes and their numbers](https://github.com/wolflo/evm-opcodes)).
 We could have used `log.op.toString()` instead, but it is faster to compare numbers rather than strings.
 The output looks similar to this:
 ```javascript
 [
  "5921: SLOAD",
  .
  .
  .
  "2413: SSTORE",
  "2420: SLOAD",
  "2475: SSTORE",
  "6094: SSTORE"
 ]
 ```
 ## Stack Information
 The trace above tells us the program counter (PC) and whether the program read from storage or wrote to it. That
 isn't very useful. To know more, you can use the `log.stack.peek` function to peek into the stack. `log.stack.peek(0)`
 is the stack top, `log.stack.peek(1)` the entry below it, etc. The values returned by `log.stack.peek` are
 Go `big.Int` objects. By default they are converted to JavaScript floating point numbers, so you need
 `toString(16)` to get them as hexadecimals, which is how we normally represent 256-bit values such as
 storage cells and their content.
 ```javascript
 tracer = function(tx) {
      return debug.traceTransaction(tx, {tracer:
      '{' +
         'retVal: [],' +
         'step: function(log,db) {' +
         '   if(log.op.toNumber() == 0x54) ' +
         '     this.retVal.push(log.getPC() + ": SLOAD " + ' +
         '        log.stack.peek(0).toString(16));' +
         '   if(log.op.toNumber() == 0x55) ' +
         '     this.retVal.push(log.getPC() + ": SSTORE " +' +
         '        log.stack.peek(0).toString(16) + " <- " +' +
         '        log.stack.peek(1).toString(16));' +
         '},' +
         'fault: function(log,db) {this.retVal.push("FAULT: " + JSON.stringify(log))},' +
         'result: function(ctx,db) {return this.retVal}' +
      '}'
      }) // return debug.traceTransaction ...
 }   // tracer = function ...
 ```
 This function gives you a trace of all the storage operations, and shows you their parameters. This gives
 you a more complete picture of the program's interaction with storage. The output is similar to:
 ```javascript
 [
  "5921: SLOAD 0",
  .
  .
  .
  "2413: SSTORE 3f0af0a7a3ed17f5ba6a93e0a2a05e766ed67bf82195d2dd15feead3749a575d <- fb8629ad13d9a12456",
  "2420: SLOAD cc39b177dd3a7f50d4c09527584048378a692aed24d31d2eabeddb7f3c041870",
  "2475: SSTORE cc39b177dd3a7f50d4c09527584048378a692aed24d31d2eabeddb7f3c041870 <- 358c3de691bd19",
  "6094: SSTORE 0 <- 1"
 ]
 ```
 ## Operation Results
 One piece of information missing from the function above is the result of an `SLOAD` operation. The
 state we get inside `log` is the state prior to the execution of the opcode, so that value is not
 known yet. For most operations we can figure it out for ourselves, but we don't have access to the
 storage, so here we can't.
 The solution is to have a flag, `afterSload`, which is only true in the opcode right after an
 `SLOAD`, when we can see the result at the top of the stack.
 ```javascript
 tracer = function(tx) {
      return debug.traceTransaction(tx, {tracer:
      '{' +
         'retVal: [],' +
         'afterSload: false,' +
         'step: function(log,db) {' +
         '   if(this.afterSload) {' +
         '     this.retVal.push("    Result: " + ' +
         '          log.stack.peek(0).toString(16)); ' +
         '     this.afterSload = false; ' +
         '   } ' +
         '   if(log.op.toNumber() == 0x54) {' +
         '     this.retVal.push(log.getPC() + ": SLOAD " + ' +
         '        log.stack.peek(0).toString(16));' +
         '        this.afterSload = true; ' +
         '   } ' +
         '   if(log.op.toNumber() == 0x55) ' +
         '     this.retVal.push(log.getPC() + ": SSTORE " +' +
         '        log.stack.peek(0).toString(16) + " <- " +' +
         '        log.stack.peek(1).toString(16));' +
         '},' +
         'fault: function(log,db) {this.retVal.push("FAULT: " + JSON.stringify(log))},' +
         'result: function(ctx,db) {return this.retVal}' +
      '}'
      }) // return debug.traceTransaction ...
 }   // tracer = function ...
 ```
 The output now contains the result in the line that follows the `SLOAD`. We could have also modified the `SLOAD`
 line itself, but that would have been a bit more work.
 ```javascript
 [
  "5921: SLOAD 0",
  "    Result: 1",
  .
  .
  .
  "2413: SSTORE 3f0af0a7a3ed17f5ba6a93e0a2a05e766ed67bf82195d2dd15feead3749a575d <- fb8629ad13d9a12456",
  "2420: SLOAD cc39b177dd3a7f50d4c09527584048378a692aed24d31d2eabeddb7f3c041870",
  "    Result: 0",
  "2475: SSTORE cc39b177dd3a7f50d4c09527584048378a692aed24d31d2eabeddb7f3c041870 <- 358c3de691bd19",
  "6094: SSTORE 0 <- 1"
 ]
 ```
 ## Dealing With Calls Between Contracts
 So far we have treated the storage as if there are only 2^256 cells. However, that is not true. Contracts
 can call other contracts, and then the storage involved is the storage of the other contract. We can see
 the address of the current contract in `log.contract.getAddress()`. This value is the execution context,
 the contract whose storage we are using, even when we use code from another contract (by using
 `CALLCODE` or `DELEGATECALL`).
 However, `log.contract.getAddress()` returns an array of bytes. We use `this.byte2Hex()` and `this.array2Hex()`
 to convert this array to the hexadecimal representation we usually use to identify contracts.
 ```javascript
 tracer = function(tx) {
      return debug.traceTransaction(tx, {tracer:
      '{' +
         'retVal: [],' +
         'afterSload: false,' +
         'callStack: [],' +
         'byte2Hex: function(byte) {' +
         '  if (byte < 0x10) ' +
         '      return "0" + byte.toString(16); ' +
         '  return byte.toString(16); ' +
         '},' +
         'array2Hex: function(arr) {' +
         '  var retVal = ""; ' +
         '  for (var i=0; i<arr.length; i++) ' +
         '    retVal += this.byte2Hex(arr[i]); ' +
         '  return retVal; ' +
         '}, ' +
         'getAddr: function(log) {' +
         '  return this.array2Hex(log.contract.getAddress());' +
         '}, ' +
         'step: function(log,db) {' +
         '   var opcode = log.op.toNumber();' +
         // SLOAD
         '   if (opcode == 0x54) {' +
         '     this.retVal.push(log.getPC() + ": SLOAD " + ' +
         '        this.getAddr(log) + ":" + ' +
         '        log.stack.peek(0).toString(16));' +
         '        this.afterSload = true; ' +
         '   } ' +
         // SLOAD Result
         '   if (this.afterSload) {' +
         '     this.retVal.push("    Result: " + ' +
         '          log.stack.peek(0).toString(16)); ' +
         '     this.afterSload = false; ' +
         '   } ' +
         // SSTORE
         '   if (opcode == 0x55) ' +
         '     this.retVal.push(log.getPC() + ": SSTORE " +' +
         '        this.getAddr(log) + ":" + ' +
         '        log.stack.peek(0).toString(16) + " <- " +' +
         '        log.stack.peek(1).toString(16));' +
         // End of step
         '},' +
         'fault: function(log,db) {this.retVal.push("FAULT: " + JSON.stringify(log))},' +
         'result: function(ctx,db) {return this.retVal}' +
      '}'
      }) // return debug.traceTransaction ...
 }   // tracer = function ...
 ```
 The output is similar to:
 ```javascript
 [
  "423: SLOAD 22ff293e14f1ec3a09b137e9e06084afd63addf9:360894a13ba1a3210667c828492db98dca3e2076cc3735a920a3ca505d382bbc",
  "    Result: 360894a13ba1a3210667c828492db98dca3e2076cc3735a920a3ca505d382bbc",
  "10778: SLOAD 22ff293e14f1ec3a09b137e9e06084afd63addf9:6",
  "    Result: 6",
  .
  .
  .
  "13529: SLOAD f2d68898557ccb2cf4c10c3ef2b034b2a69dad00:8328de571f86baa080836c50543c740196dbc109d42041802573ba9a13efa340",
  "    Result: 8328de571f86baa080836c50543c740196dbc109d42041802573ba9a13efa340",
  "423: SLOAD f2d68898557ccb2cf4c10c3ef2b034b2a69dad00:360894a13ba1a3210667c828492db98dca3e2076cc3735a920a3ca505d382bbc",
  "    Result: 360894a13ba1a3210667c828492db98dca3e2076cc3735a920a3ca505d382bbc",
  "13529: SLOAD f2d68898557ccb2cf4c10c3ef2b034b2a69dad00:b38558064d8dd9c883d2a8c80c604667ddb90a324bc70b1bac4e70d90b148ed4",
  "    Result: b38558064d8dd9c883d2a8c80c604667ddb90a324bc70b1bac4e70d90b148ed4",
  "11041: SSTORE 22ff293e14f1ec3a09b137e9e06084afd63addf9:6 <- 0"
 ]
 ```
 ## Conclusion
 This tutorial only taught the basics of using JavaScript to filter traces. We did not go over access to memory,
 or how to use the `db` parameter to know the state of the chain at the time of execution. All this and more is
 covered [in the reference](https://geth.ethereum.org/docs/rpc/ns-debug#javascript-based-tracing).
 Hopefully with this tool you will find it easier to trace the EVM's behavior and debug thorny contract issues.
--- a/docs/_dapp/tracing.md
+++ b/docs/_dapp/tracing.md
@ -3,24 +3,28 @@ title: EVM Tracing
 sort_key: A
 ---
-There are two different types of transactions in Ethereum: plain value transfers and
+There are two different types of [transactions][transactions] 
-contract executions. A plain value transfer just moves Ether from one account to another
+in Ethereum: simple value transfers and contract executions. A value transfer just 
-and as such is uninteresting from this guide's perspective. If however the recipient of a
+moves Ether from one account to another. If however the recipient of a transaction is 
-transaction is a contract account with associated EVM (Ethereum Virtual Machine)
+a contract account with associated [EVM][evm] (Ethereum Virtual Machine) bytecode - beside 
-bytecode - beside transferring any Ether - the code will also be executed as part of the
+transferring any Ether - the code will also be executed as part of the transaction.
 transaction.
 Having code associated with Ethereum accounts permits transactions to do arbitrarily
 complex data storage and enables them to act on the previously stored data by further
-transacting internally with outside accounts and contracts. This creates an intertwined
+transacting internally with outside accounts and contracts. This creates an interlinked
 ecosystem of contracts, where a single transaction can interact with tens or hundreds of
 accounts.
 The downside of contract execution is that it is very hard to say what a transaction
 actually did. A transaction receipt does contain a status code to check whether execution
-succeeded or not, but there's no way to see what data was modified, nor what external
+succeeded or not, but there is no way to see what data was modified, nor what external
-contracts where invoked. In order to introspect a transaction, we need to trace its
+contracts were invoked. Geth resolves this by re-running transactions locally and collecting
-execution.
+data about precisely what was executed by the EVM. This is known as "tracing" the transaction.
 * TOC
 {:toc}
 ## Tracing prerequisites
@ -29,43 +33,66 @@ reexecute the desired transaction with varying degrees of data collection and ha
 return the aggregated summary for post processing. Reexecuting a transaction however has a
 few prerequisites to be met.
-In order for an Ethereum node to reexecute a transaction, it needs to have available all
+In order for an Ethereum node to reexecute a transaction, all historical state accessed 
-historical state accessed by the transaction:
+by the transaction must be available. This includes:
-
+
- * Balance, nonce, bytecode and storage of both the recipient as well as all internally invoked contracts.
+* Balance, nonce, bytecode and storage of both the recipient as well as all internally invoked contracts.
- * Block metadata referenced during execution of both the outer as well as all internally created transactions.
+* Block metadata referenced during execution of both the outer as well as all internally created transactions.
- * Intermediate state generated by all preceding transactions contained in the same block as the one being traced.
+* Intermediate state generated by all preceding transactions contained in the same block as the one being traced.
-
+
-Depending on your node's mode of synchronization and pruning, different configurations
+This means there are limits on the transactions that can be traced imposed by the synchronization and 
-result in different capabilities:
+pruning configuration of a node.
-
+
- * An **archive** node retaining **all historical data** can trace arbitrary transactions
+* An **archive** node retains **all historical data** back to genesis. It can therefore
-   at any point in time. Tracing a single transaction also entails reexecuting all
+trace arbitrary transactions at any point in the history of the chain. Tracing a single 
-   preceding transactions in the same block.
+transaction requires reexecuting all preceding transactions in the same block.
- * A **full synced** node retaining **all historical data** after initial sync can only
+
-   trace transactions from blocks following the initial sync point. Tracing a single
+* A **full synced** node retains the most recent 128 blocks in memory, so transactions in
-   transaction also entails reexecuting all preceding transactions in the same block.
+that range are always accessible. Full nodes also store occasional checkpoints back to genesis
- * A **fast synced** node retaining only **periodic state data** after initial sync can
+that can be used to rebuild the state at any point on-the-fly. This means older transactions
-   only trace transactions from blocks following the initial sync point. Tracing a single
+can be traced but if there is a large distance between the requested transaction and the most
-   transaction entails reexecuting all preceding transactions **both** in the same block,
+recent checkpoint rebuilding the state can take a long time. Tracing a single
-   as well as all preceding blocks until the previous stored snapshot.
+transaction requires reexecuting all preceding transactions in the same block
- * A **light synced** node retrieving data **on demand** can in theory trace transactions
+**and** all preceding blocks until the previous stored snapshot.
-   for which all required historical state is readily available in the network. In
+
-   practice, data availability is **not** a feasible assumption.
+* A **snap synced** node holds the most recent 128 blocks in memory, so transactions in that
 range are always accessible. However, snap-sync only starts processing from a relatively recent
 block (as opposed to genesis for a full node). Between the initial sync block and the 128 most
 recent blocks, the node stores occasional checkpoints that can be used to rebuild the state on-the-fly.
 This means transactions can be traced back as far as the block that was used for the initial sync.
 Tracing a single transaction requires reexecuting all preceding transactions in the same block,
 **and** all preceding blocks until the previous stored snapshot.
 * A **light synced** node retrieving data **on demand** can in theory trace transactions
 for which all required historical state is readily available in the network. This is because the data
 required to generate the trace is requested from an les-serving full node. In practice, data 
 availability **cannot** be reasonably assumed.
 *There are exceptions to the above rules when running batch traces of entire blocks or
 chain segments. Those will be detailed later.*
 ## Basic traces
-The simplest type of transaction trace that `go-ethereum` can generate are raw EVM opcode
+The simplest type of transaction trace that Geth can generate is the raw EVM opcode
 trace. For every VM instruction the transaction executes, a structured log entry is
 emitted, containing all contextual metadata deemed useful. This includes the *program
 counter*, *opcode name*, *opcode cost*, *remaining gas*, *execution depth* and any
 *occurred error*. The structured logs can optionally also contain the content of the
 *execution stack*, *execution memory* and *contract storage*.
-An example log entry for a single opcode looks like:
+The entire output of a raw EVM opcode trace is a JSON object having a few metadata
 fields: *consumed gas*, *failure status*, *return value*; and a list of *opcode entries*:
 ```json
 {
  "gas":         25523,
  "failed":      false,
  "returnValue": "",
  "structLogs":  []
 }
 ```
 An example log for a single opcode entry has the following format:
 ```json
 {
@ -90,26 +117,12 @@ An example log entry for a single opcode looks like:
 }
 ```
 The entire output of an raw EVM opcode trace is a JSON object having a few metadata
 fields: *consumed gas*, *failure status*, *return value*; and a list of *opcode entries*
 that take the above form:
 ```json
 {
  "gas":         25523,
  "failed":      false,
  "returnValue": "",
  "structLogs":  []
 }
 ```
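Because the trace is plain JSON, it can be post-processed with a few lines of code. The following is a minimal sketch, not taken from the Geth documentation: `summarizeTrace` is a hypothetical helper, the sample values are made up, and the per-entry field names `op`, `gasCost` and `depth` are assumed to match the opcode entry format shown above.

```javascript
// Hypothetical helper: condenses a raw opcode trace into a few totals.
function summarizeTrace(trace) {
  const opcodeCounts = {};
  let maxDepth = 0;
  let totalOpcodeCost = 0;
  for (const entry of trace.structLogs) {
    opcodeCounts[entry.op] = (opcodeCounts[entry.op] || 0) + 1;
    totalOpcodeCost += entry.gasCost;
    if (entry.depth > maxDepth) maxDepth = entry.depth;
  }
  return {
    gas: trace.gas,
    failed: trace.failed,
    steps: trace.structLogs.length,
    maxDepth,
    totalOpcodeCost,
    opcodeCounts,
  };
}

// Made-up trace in the shape described above; the values are illustrative only.
const sampleTrace = {
  gas: 25523,
  failed: false,
  returnValue: "",
  structLogs: [
    { pc: 0, op: "PUSH1", gas: 100, gasCost: 3, depth: 1 },
    { pc: 2, op: "PUSH1", gas: 97, gasCost: 3, depth: 1 },
    { pc: 4, op: "MSTORE", gas: 94, gasCost: 12, depth: 1 },
  ],
};

const summary = summarizeTrace(sampleTrace);
console.log(summary);
```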
 ### Generating basic traces
-To generate a raw EVM opcode trace, `go-ethereum` provides a few [RPC API
+To generate a raw EVM opcode trace, Geth provides a few [RPC API endpoints](/docs/rpc/ns-debug).
-endpoints](../rpc/ns-debug), out of which the most commonly used is
+The most commonly used is [`debug_traceTransaction`](/docs/rpc/ns-debug#debug_tracetransaction).
 [`debug_traceTransaction`](../rpc/ns-debug#debug_tracetransaction).
-In its simplest form, `traceTransaction` accepts a transaction hash as its sole argument,
+In its simplest form, `traceTransaction` accepts a transaction hash as its only argument. It then
 traces the transaction, aggregates all the generated data and returns it as a **large**
 JSON object. A sample invocation from the Geth console would be:
@ -117,83 +130,103 @@ JSON object. A sample invocation from the Geth console would be:
 debug.traceTransaction("0xfc9359e49278b7ba99f59edac0e3de49956e46e530a53c15aa71226b7aa92c6f")
 ```
-The same call can of course be invoked from outside the node too via HTTP RPC. In this
+The same call can also be invoked from outside the node via HTTP RPC (e.g. using Curl). In this
-case, please make sure the HTTP endpoint is enabled via `--http` and the `debug` API
+case, the HTTP endpoint must be enabled in Geth using the `--http` flag and the `debug` API
-namespace exposed via `--http.api=debug`.
+namespace must be exposed using `--http.api=debug`.
 ```
 $ curl -H "Content-Type: application/json" -d '{"id": 1, "method": "debug_traceTransaction", "params": ["0xfc9359e49278b7ba99f59edac0e3de49956e46e530a53c15aa71226b7aa92c6f"]}' localhost:8545
 ```
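The same request can be assembled programmatically. The following is a minimal sketch under stated assumptions: `buildTraceRequest` and `sendTraceRequest` are hypothetical helper names, the `jsonrpc: "2.0"` field is added per the JSON-RPC 2.0 convention (the Curl example omits it), and `sendTraceRequest` assumes a runtime with a global `fetch` (for example Node.js 18 or later) and a node listening on `localhost:8545`.

```javascript
// Hypothetical helper: builds the JSON-RPC body for debug_traceTransaction.
// `options` is the optional second parameter of the tracer call.
function buildTraceRequest(txHash, options) {
  const params = options === undefined ? [txHash] : [txHash, options];
  return { jsonrpc: "2.0", id: 1, method: "debug_traceTransaction", params };
}

// Hypothetical helper: posts the body to a node's HTTP endpoint.
// Not invoked here, since it needs a running node with the debug API exposed.
async function sendTraceRequest(url, body) {
  const response = await fetch(url, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
  });
  return response.json();
}

const txHash = "0xfc9359e49278b7ba99f59edac0e3de49956e46e530a53c15aa71226b7aa92c6f";
const plainRequest = buildTraceRequest(txHash);
const tunedRequest = buildTraceRequest(txHash, { disableStack: true, disableStorage: true });
console.log(JSON.stringify(plainRequest));
console.log(JSON.stringify(tunedRequest));
```

A call such as `sendTraceRequest("http://localhost:8545", plainRequest)` would then return the parsed trace, provided the node retains the required state.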
-Running the above operation on the Rinkeby network (with a node retaining enough history)
+To follow along with this tutorial, transaction hashes can be found from a local Geth node (e.g. by 
-will result in this [trace dump](https://gist.github.com/karalabe/c91f95ac57f5e57f8b950ec65ecc697f).
+attaching a [Javascript console](/docs/interface/javascript-console) and running `eth.getBlock('latest')` 
-
+then passing a transaction hash from the returned block to `debug.traceTransaction()`) or from a block 
-### Tuning basic traces
+explorer (for [Mainnet](https://etherscan.io/) or a [testnet](https://goerli.etherscan.io/)).
-By default the raw opcode tracer emits all relevant events that occur within the EVM while
+It is also possible to configure the trace by passing Boolean (true/false) values for four parameters
-processing a transaction, such as *EVM stack*, *EVM memory* and *updated storage slots*.
+that tweak the verbosity of the trace. By default, the *EVM memory* and *Return data* are not reported 
-Certain use cases however may not need some of these data fields reported. To cater for
+but the *EVM stack* and *EVM storage* are. To report the maximum amount of data:
 those use cases, these massive fields may be omitted using a second *options* parameter
 for the tracer:
-```json
+```shell
-{
+enableMemory: true
-  "disableStack": true,
+disableStack: false
-  "disableMemory": true,
+disableStorage: false
-  "disableStorage": true
+enableReturnData: true
 }
 ```
-Running the previous tracer invocation from the Geth console with the data fields
+An example call, made in the Geth Javascript console, configured to report the maximum amount of data 
-disabled:
+looks as follows:
 ```js
-debug.traceTransaction("0xfc9359e49278b7ba99f59edac0e3de49956e46e530a53c15aa71226b7aa92c6f", {disableStack: true, disableMemory: true, disableStorage: true})
+debug.traceTransaction("0xfc9359e49278b7ba99f59edac0e3de49956e46e530a53c15aa71226b7aa92c6f",{enableMemory: true, disableStack: false, disableStorage: false, enableReturnData: true})
 ```
-Analogously running the filtered tracer from outside the node too via HTTP RPC:
+Running the above operation on the Rinkeby network (with a node retaining enough history)
 will result in this [trace dump](https://gist.github.com/karalabe/c91f95ac57f5e57f8b950ec65ecc697f).
 Alternatively, disabling *EVM Stack*, *EVM Memory*, *Storage* and *Return data* (as demonstrated in the Curl request below)
 results in the following, much shorter, [trace dump](https://gist.github.com/karalabe/d74a7cb33a70f2af75e7824fc772c5b4).
 ```
-$ curl -H "Content-Type: application/json" -d '{"id": 1, "method": "debug_traceTransaction", "params": ["0xfc9359e49278b7ba99f59edac0e3de49956e46e530a53c15aa71226b7aa92c6f", {"disableStack": true, "disableMemory": true, "disableStorage": true}]}' localhost:8545
+$ curl -H "Content-Type: application/json" -d '{"id": 1, "method": "debug_traceTransaction", "params": ["0xfc9359e49278b7ba99f59edac0e3de49956e46e530a53c15aa71226b7aa92c6f", {"disableStack": true, "disableStorage": true}]}' localhost:8545
 ```
 Running the above operation on the Rinkeby network will result in this significantly
 shorter [trace dump](https://gist.github.com/karalabe/d74a7cb33a70f2af75e7824fc772c5b4).
 ### Limits of basic traces
-Although the raw opcode traces we've generated above have their use, this basic way of
+Although the raw opcode traces generated above are useful, having an individual log entry for every single
-tracing is problematic in the real world. Having an individual log entry for every single
+opcode is too low level for most use cases, and will require developers to create additional tools to 
-opcode is too low level for most use cases, and will require developers to create
+post-process the traces. Additionally, a full opcode trace can easily go into the hundreds of
-additional tools to post-process the traces. Additionally, a full opcode trace can easily
+megabytes, making them very resource intensive to get out of the node and process externally.
-go into the hundreds of megabytes, making them very resource intensive to get out of the
+
-node and process externally.
+To avoid those issues, Geth supports running custom JavaScript tracers *within* the Ethereum node, 
 which have full access to the EVM stack, memory and contract storage. This means developers only have to
 gather the data they actually need, and do any processing at the source.
 ## Pruning
 Geth does in-memory state-pruning by default, discarding state entries that it deems
 no longer necessary to maintain. This is configured via the `--gcmode` option. An error 
 message alerting the user that the necessary state is not available is common in EVM tracing on 
 anything other than an archive node.
 ```sh
 Error: required historical state unavailable (reexec=128)
   at web3.js:6365:37(47)
   at send (web3.js:5099:62(35))
   at <eval>:1:23(13)
 ```
 The pruning behaviour, and consequently the state availability and tracing capability of
 a node depends on its sync and pruning configuration. The 'oldest' block after which
 state is immediately available, and before which state is not immediately available, 
 is known as the "pivot block". There are then several possible cases for a trace request
 on a Geth node.
 When tracing a transaction in block `B` on a node whose pivot block is `P`, the outcome depends on the
 node's sync and pruning configuration:
-To avoid all of the previously mentioned issues, `go-ethereum` supports running custom
+1. a fast-sync'd node can regenerate the desired state by replaying blocks from the most recent
-JavaScript tracers *within* the Ethereum node, which have full access to the EVM stack,
+checkpoint between `P` and `B` as long as `P` < `B`. If `P` > `B` there is no available checkpoint 
-memory and contract storage. This permits developers to only gather the data they need,
+and the state cannot be regenerated without replaying the chain from genesis.
 and do any processing **at** the data. Please see the next section for our *custom in-node
 tracers*.
-### Pruning
+2. a fully sync'd node can regenerate the desired state by replaying blocks from the last available
 full state before `B`. A fully sync'd node re-executes all blocks from genesis, so checkpoints are available
 across the entire history of the chain. However, database pruning discards older data, moving `P` to a more
 recent position in the chain. If `P` > `B` there is no available checkpoint and the state cannot be 
 regenerated without replaying the chain from genesis.
-Geth by default does in-memory pruning of state, discarding state entries that it deems are
+3. A fully-sync'd node without pruning (i.e. an archive node configured with `--gcmode=archive`) 
-no longer necessary to maintain. This is configured via the `--gcmode` option. Often,
+does not need to replay anything; it can immediately load any state and serve the request for any `B`.
 people run into the error that state is not available.
-Say you want to do a trace on block `B`. Now there are a couple of cases:
+The time taken to regenerate a specific state increases with the distance between `P` and `B`. If the distance
 between `P` and `B` is large, the regeneration time can be substantial.
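The cases above can be condensed into a small decision function. This is only an illustrative sketch of the rules as described on this page, not code from Geth: `regenerationPlan`, the node labels and the returned strings are hypothetical, and treating `P` equal to `B` as a case where state is available is an assumption.

```javascript
// Hypothetical helper mirroring the cases described above.
// nodeKind: "archive" (no pruning) or "pruned" (any node whose pivot block
// P may lie after the requested block B). Both labels are illustrative.
function regenerationPlan(nodeKind, pivotBlock, targetBlock) {
  if (nodeKind === "archive") {
    // All state is stored, nothing needs to be replayed.
    return "load-state-directly";
  }
  if (pivotBlock > targetBlock) {
    // No checkpoint exists before the target block.
    return "replay-from-genesis";
  }
  // Replay from the most recent checkpoint between P and B.
  return "replay-from-checkpoint";
}

console.log(regenerationPlan("archive", 0, 5000));
console.log(regenerationPlan("pruned", 4000, 5000));
console.log(regenerationPlan("pruned", 6000, 5000));
```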
-1. You have done a fast-sync, pivot block `P` where `P <= B`.
+## Summary
 2. You have done a fast-sync, pivot block `P` where `P > B`. 
 3. You have done a full-sync, with pruning
 4. You have done a full-sync, without pruning (`--gcmode=archive`)
-Here's what happens in each respective case:
+This page covered the concept of EVM tracing and how to generate traces with the default opcode-based tracers using RPC.
 More advanced usage is possible, including using other built-in tracers as well as writing [custom tracing](/docs/dapp/custom-tracer) code in Javascript
 and Go. The API as well as the JS tracing hooks are defined in [the reference](/docs/rpc/ns-debug#debug_traceTransaction).
 1. Geth will regenerate the desired state by replaying blocks from the closest point in
   time before `B` where it has full state. This defaults to `128` blocks max, but you can
   specify more in the actual call `... "reexec":1000 .. }` to the tracer.
 2. Sorry, can't be done without replaying from genesis.   
 3. Same as 1)
 4. Does not need to replay anything, can immediately load up the state and serve the request. 
 [transactions]: https://ethereum.org/en/developers/docs/transactions
 [evm]: https://ethereum.org/en/developers/docs/evm