Looking at the cpu profile of a burntpix benchmark, I noticed that a lot
of time was spent in gas-used, in the interpreter loop. It's an actual
call (not inlined), which explicitly wants to be ignored by tracing
("tracing.GasChangeIgnored"), so it can be safely and simply inlined.
The other change is in `pushX`. These also do a call to
`common.RightPadBytes`. I replaced that by a doing a corresponding `Lsh`
on the `u256` if needed. Note: it's needed only to make the stack output
look right, for fuzzers. It technically doesn't matter what we put
there: if code ends on a pushdata immediate, nothing will consume the
stack element. We could just as well just ignore it, if we didn't care
about fuzzers (which I do).
Seems quite a lot faster on burntpix, according to my runs.
This PR:
```
EVM gas used: 5642735088
execution time: 34.84609475s
allocations: 915683
allocated bytes: 175334088
```
```
EVM gas used: 5642735088
execution time: 36.671958278s
allocations: 915701
allocated bytes: 175340528
```
Master
```
EVM gas used: 5642735088
execution time: 49.349209526s
allocations: 915684
allocated bytes: 175333368
```
```
EVM gas used: 5642735088
execution time: 46.581006598s
allocations: 915681
allocated bytes: 175330728
```
---------
Co-authored-by: Sina M <1591639+s1na@users.noreply.github.com>
Co-authored-by: Felix Lange <fjl@twurst.com>