This commit fixes a bug in parsing indented literal blocks. For example:
test(8)

	```
	This is a block
	```
Prior to this commit, this input failed with a misleading error
message: "Error at 4:3: Cannot deindent in literal block". The
indentation was being parsed at every character, so the parser saw the
`T`, then parsed indentation again. The indentation was 0 (since there
were no tab characters between the `T` and the `h`), but the block
started with an indentation level of 1. 0 < 1, so this would be
considered a dedent, which is not allowed.
This commit introduces a new local variable, `check_indent`, which
controls whether the parser tries to parse indentation or not; now
indentation is only parsed when the last character was a newline. From
my testing this seems to fix the issue - indented literal blocks are now
allowed.
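A minimal sketch of the idea, not scdoc's actual parser: the function name,
signature, and the blank-line exemption are assumptions made for
illustration. Only the `check_indent` flag mirrors this commit. Indentation
is measured once per line, right after a newline, instead of at every
character.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: scans the body of a literal block and returns the
 * 1-based line number of the first dedent, or 0 if there is none.
 * block_indent is the number of tabs the block started with.
 */
int first_dedent_line(const char *body, int block_indent) {
	bool check_indent = true; /* the start of input is the start of a line */
	int line = 1;
	size_t i = 0;
	while (body[i] != '\0') {
		if (check_indent) {
			int indent = 0;
			while (body[i] == '\t') {
				++indent;
				++i;
			}
			/* Blank lines are exempt in this sketch. */
			if (body[i] != '\n' && body[i] != '\0' && indent < block_indent) {
				return line;
			}
			check_indent = false;
			if (body[i] == '\0') {
				break;
			}
		}
		if (body[i] == '\n') {
			/* Only a newline re-arms the indentation check. */
			check_indent = true;
			++line;
		}
		++i;
	}
	return 0;
}
```

With `block_indent` set to 1, a body of "\tThis is a block\n" is accepted,
while a second line that starts without a tab is reported as a dedent.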
This commit makes using invalid characters in the name a fatal error.
Before this patch, "foo | bar(1)" would parse as "foobar(1)". Now it is
a fatal error and parsing stops.
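A rough sketch of the kind of check involved. The function names are
hypothetical, and the allowed set (ASCII alphanumerics, `-`, `_`, `.`) is an
assumption for this example rather than something stated in this message.

```c
#include <ctype.h>
#include <stdbool.h>

/* Assumed allowed set for this sketch: ASCII alphanumerics, '-', '_', '.'. */
bool name_char_ok(unsigned char ch) {
	return (ch < 0x80 && isalnum(ch)) || ch == '-' || ch == '_' || ch == '.';
}

/*
 * Returns the index of the first invalid character in the name, i.e. in
 * the text before the '(' that opens the section, or -1 if the name is
 * valid. A caller would report a fatal error instead of skipping it.
 */
int first_invalid_name_char(const char *preamble) {
	for (int i = 0; preamble[i] != '\0' && preamble[i] != '('; ++i) {
		if (!name_char_ok((unsigned char)preamble[i])) {
			return i;
		}
	}
	return -1;
}
```

For "foo | bar(1)" this reports index 3 (the space), where the old behavior
would have silently dropped the offending characters.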
In the underscore case, the next character is retrieved to check
whether the underscore is at a word break. However, if this character
is UTF8_INVALID, the call to parser_pushch is a no-op, so the loop
continues further than it should. This commit adds a check that
returns if next is UTF8_INVALID.
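To illustrate the failure mode, here is a self-contained model. The `reader`
type and its functions are hypothetical stand-ins for the parser, the
sentinel value is chosen for the sketch, and input is assumed to be ASCII.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define UTF8_INVALID 0x80 /* sentinel value assumed for this sketch */

/* Hypothetical one-character-lookahead reader over an ASCII string. */
struct reader {
	const char *src;
	size_t pos;
	uint32_t pushed; /* character pushed back, if any */
	bool has_pushed;
};

uint32_t reader_getch(struct reader *r) {
	if (r->has_pushed) {
		r->has_pushed = false;
		return r->pushed;
	}
	if (r->src[r->pos] == '\0') {
		return UTF8_INVALID; /* nothing left to read */
	}
	return (unsigned char)r->src[r->pos++];
}

void reader_pushch(struct reader *r, uint32_t ch) {
	if (ch == UTF8_INVALID) {
		return; /* pushing the sentinel back is a no-op */
	}
	r->pushed = ch;
	r->has_pushed = true;
}

/*
 * Called just after an underscore has been read while underlining.
 * Returns true if the underscore ends the underline.
 */
bool underscore_ends_underline(struct reader *r) {
	uint32_t next = reader_getch(r);
	if (next == UTF8_INVALID) {
		return true; /* the added check: stop instead of pushing back */
	}
	reader_pushch(r, next);
	return !isalnum((int)next);
}
```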
Signed-off-by: Brian Ashworth <bosrsf04@gmail.com>
Currently, the first underscore encountered while underlining ends
underlining. As a result, underscores inside underlined words are not
ignored; e.g. _hello_world_ does not parse correctly.
This checks the next character to see if it is still in a word before
ending underlining.
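As a string-level sketch of the rule (the function is hypothetical, and
treating "still in a word" as "next character is alphanumeric" is an
assumption for illustration):

```c
#include <ctype.h>

/*
 * Given text whose first character is the opening underscore, returns
 * the index of the underscore that closes the underline, or -1 if none.
 * An underscore followed by an alphanumeric character is inside a word
 * and is skipped.
 */
int closing_underscore(const char *text) {
	for (int i = 1; text[i] != '\0'; ++i) {
		if (text[i] != '_') {
			continue;
		}
		unsigned char next = (unsigned char)text[i + 1];
		if (!isalnum(next)) {
			return i; /* word break (or end of input): underline ends */
		}
	}
	return -1;
}
```

For "_hello_world_" the inner underscore is skipped and the final one, at
index 12, closes the underline.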
Regardless of standards considerations, if there's any advice
that needs to be hammered into man authors, it's to be concise
and accurate, but not pedantic. As Will Strunk commanded,
"Omit needless words."
The most needless words of all are promotional. No man page
should utter words like "powerful", "extraordinarily versatile",
"user-friendly", or "has a wide range of options".
-- Doug McIlroy[1]
[1] https://lists.gnu.org/archive/html/groff/2018-11/msg00058.html
An empty string will rarely be useful, since with the current state of
the string API the only thing that can be done to it is append a
character. Storing empty strings with a NULL storage pointer creates
unnecessary edge cases in any code handling strings.
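One way to avoid the NULL case, sketched with hypothetical names
(`sketch_str` and friends are not the project's API): allocate storage at
creation time so the data pointer is always a valid, NUL-terminated buffer.

```c
#include <stdlib.h>
#include <string.h>

struct sketch_str {
	char *data; /* never NULL once sketch_str_create has succeeded */
	size_t len, size;
};

struct sketch_str *sketch_str_create(void) {
	struct sketch_str *s = calloc(1, sizeof(*s));
	if (!s) {
		return NULL;
	}
	s->size = 16;
	s->data = malloc(s->size);
	if (!s->data) {
		free(s);
		return NULL;
	}
	s->data[0] = '\0'; /* an empty string is still a valid C string */
	return s;
}

/* Appends one character, growing the buffer as needed. Returns 0 on success. */
int sketch_str_append_ch(struct sketch_str *s, char ch) {
	if (s->len + 1 >= s->size) {
		char *grown = realloc(s->data, s->size * 2);
		if (!grown) {
			return -1;
		}
		s->data = grown;
		s->size *= 2;
	}
	s->data[s->len++] = ch;
	s->data[s->len] = '\0';
	return 0;
}

void sketch_str_free(struct sketch_str *s) {
	if (!s) {
		return;
	}
	free(s->data);
	free(s);
}
```

Callers can then pass `data` straight to functions such as `strcmp` or
`fputs` without first checking for NULL.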
The tables test no longer segfaults.
The environment variable SOURCE_DATE_EPOCH [0] is standardized and can
be used to produce reproducible output. Distributions like Debian will
set this variable before the build and scdoc should use it (instead of
the current date) for any timestamps within the man pages.
[0]: https://reproducible-builds.org/docs/source-date-epoch/
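A minimal sketch of honoring the variable, under assumptions: the function
names are hypothetical, falling back on an invalid value is a choice made
for this example (a tool may prefer to abort), and the date is formatted in
UTC so the result does not depend on the build machine's time zone.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>
#include <time.h>

/*
 * Returns the timestamp to embed: the value of SOURCE_DATE_EPOCH when it
 * is set to a plain decimal number, otherwise the fallback.
 */
time_t build_timestamp(const char *source_date_epoch, time_t fallback) {
	if (source_date_epoch == NULL
			|| !isdigit((unsigned char)*source_date_epoch)) {
		return fallback;
	}
	char *end;
	errno = 0;
	unsigned long long epoch = strtoull(source_date_epoch, &end, 10);
	if (errno != 0 || *end != '\0') {
		return fallback; /* out of range or trailing garbage */
	}
	return (time_t)epoch;
}

/* Formats the timestamp as YYYY-MM-DD in UTC. Returns 0 on failure. */
size_t format_date(time_t when, char *buf, size_t size) {
	struct tm *tm = gmtime(&when);
	if (!tm) {
		return 0;
	}
	return strftime(buf, size, "%Y-%m-%d", tm);
}
```

A caller would typically use
`build_timestamp(getenv("SOURCE_DATE_EPOCH"), time(NULL))` and pass the
result to `format_date`.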