MANDOC(3) | Library Functions Manual | MANDOC(3) |
enum mandoc_esc
mandoc_escape(const char **end, const char **start, int *sz);
const struct man_meta *
man_meta(const struct man *man);
const struct mparse *
man_mparse(const struct man *man);
const struct man_node *
man_node(const struct man *man);
struct mchars *
mchars_alloc(void);
void
mchars_free(struct mchars *p);
char
mchars_num2char(const char *cp, size_t sz);
int
mchars_num2uc(const char *cp, size_t sz);
const char *
mchars_spec2str(const struct mchars *p, const char *cp, size_t sz, size_t *rsz);
int
mchars_spec2cp(const struct mchars *p, const char *cp, size_t sz);
const struct mdoc_meta *
mdoc_meta(const struct mdoc *mdoc);
const struct mdoc_node *
mdoc_node(const struct mdoc *mdoc);
struct mparse *
mparse_alloc(enum mparset type, enum mandoclevel wlevel, mandocmsg msg, void *msgarg);
void
mparse_free(struct mparse *parse);
const char *
mparse_getkeep(const struct mparse *parse);
void
mparse_keep(struct mparse *parse);
enum mandoclevel
mparse_readfd(struct mparse *parse, int fd, const char *fname);
void
mparse_reset(struct mparse *parse);
void
mparse_result(struct mparse *parse, struct mdoc **mdoc, struct man **man);
const char *
mparse_strerror(enum mandocerr);
const char *
mparse_strlevel(enum mandoclevel);
extern const char * const * man_macronames;
extern const char * const * mdoc_argnames;
extern const char * const * mdoc_macronames;
#define ASCII_NBRSP
#define ASCII_HYPH
The following describes a general parse sequence:
The mandoc library also contains routines for translating character strings into glyphs (see mchars_alloc()) and parsing escape sequences from strings (see mandoc_escape()).
The following non-printing characters may be embedded in text strings:
Escape sequences are also passed verbatim into text strings. An escape sequence is a sequence of characters beginning with the backslash (‘\’). To construct human-readable text, these should be intercepted with mandoc_escape() and converted with one of mchars_num2char(), mchars_spec2str(), and so on.
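As a rough illustration of the interception step, the sketch below translates a few simple roff escapes while copying a text string. It is a self-contained stand-in and not the library's mandoc_escape(): the name demo_unescape and the small set of escapes it handles (‘\e’, ‘\-’ and ‘\&’) are assumptions chosen for brevity. A real consumer would call mandoc_escape() to classify each sequence and then one of the mchars functions to convert it.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not part of the mandoc library: copy `in` to
 * `out` (capacity `outsz`, NUL-terminated whenever outsz > 0),
 * translating a handful of simple roff escapes:
 *   \e -> backslash,  \- -> '-',  \& -> nothing (zero-width).
 * Any other escape sequence is copied through unchanged.
 * Returns the number of bytes written, excluding the terminating NUL.
 */
size_t
demo_unescape(const char *in, char *out, size_t outsz)
{
	size_t n = 0;

	if (outsz == 0)
		return 0;

	while (*in != '\0' && n + 1 < outsz) {
		if (in[0] == '\\' && in[1] != '\0') {
			switch (in[1]) {
			case 'e':	/* printable escape character */
				out[n++] = '\\';
				in += 2;
				continue;
			case '-':	/* minus sign */
				out[n++] = '-';
				in += 2;
				continue;
			case '&':	/* zero-width: emit nothing */
				in += 2;
				continue;
			default:	/* not handled here: copy verbatim below */
				break;
			}
		}
		out[n++] = *in++;
	}
	out[n] = '\0';
	return n;
}
```

Called on the string ‘a\-b\&c’, this helper writes ‘a-bc’ into the output buffer. Leaving unrecognised sequences untouched keeps the sketch safe to run on arbitrary input, at the cost of not producing fully human-readable text.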
The AST is composed of struct man_node nodes with element, root and text types as declared by the type field. Each node also provides its parse point (the line, sec, and pos fields), its position in the tree (the parent, child, next and prev fields) and some type-specific data.
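The tree links described above lend themselves to a simple recursive walk. The following is a minimal sketch under stated assumptions: struct demo_node is a hypothetical stand-in that mirrors only the type, parent, child, next and prev fields named in the text, and the real struct man_node declaration differs. The function demo_count_text is likewise invented for illustration.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct man_node; the real type differs. */
enum demo_type { DEMO_ROOT, DEMO_ELEM, DEMO_TEXT };

struct demo_node {
	enum demo_type	  type;
	struct demo_node *parent;
	struct demo_node *child;	/* first child, or NULL */
	struct demo_node *next;		/* next sibling, or NULL */
	struct demo_node *prev;		/* previous sibling, or NULL */
};

/* Count the text nodes in the subtree rooted at n, including n itself. */
size_t
demo_count_text(const struct demo_node *n)
{
	const struct demo_node *c;
	size_t total;

	if (n == NULL)
		return 0;

	total = (n->type == DEMO_TEXT) ? 1 : 0;

	/* Visit each child in document order by following next links. */
	for (c = n->child; c != NULL; c = c->next)
		total += demo_count_text(c);

	return total;
}
```

With the real library, the starting point of such a walk would be the node returned by man_node().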
The tree itself is arranged according to the following normal form, where capitalised non-terminals represent nodes.
The only elements capable of nesting other elements are those with next-line scope as documented in man(7).
The AST is composed of struct mdoc_node nodes with block, head, body, element, root and text types as declared by the type field. Each node also provides its parse point (the line, sec, and pos fields), its position in the tree (the parent, child, nchild, next and prev fields) and some type-specific data, in particular, for nodes generated from macros, the generating macro in the tok field.
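Because nodes generated from macros record that macro in tok, a consumer can search the tree for a particular macro and node type. The sketch below shows one way to do this with a pre-order search. It rests on assumptions: struct sketch_mnode is a hypothetical reduction of struct mdoc_node, tok is modelled as a plain int, and sketch_find_tok is not a library function.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct mdoc_node; the real type differs. */
enum sketch_type { SK_ROOT, SK_BLOCK, SK_HEAD, SK_BODY, SK_ELEM, SK_TEXT };

struct sketch_mnode {
	enum sketch_type	 type;
	int			 tok;	/* generating macro; int is an assumption */
	struct sketch_mnode	*child;	/* first child, or NULL */
	struct sketch_mnode	*next;	/* next sibling, or NULL */
};

/*
 * Pre-order search over n, its following siblings, and all of their
 * descendants.  Returns the first node with the given type and tok,
 * or NULL if there is none.
 */
const struct sketch_mnode *
sketch_find_tok(const struct sketch_mnode *n, enum sketch_type type, int tok)
{
	const struct sketch_mnode *hit;

	for (; n != NULL; n = n->next) {
		if (n->type == type && n->tok == tok)
			return n;
		hit = sketch_find_tok(n->child, type, tok);
		if (hit != NULL)
			return hit;
	}
	return NULL;
}
```

Matching on both type and tok matters because a block and its head and body parts carry the same tok.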
The tree itself is arranged according to the following normal form, where capitalised non-terminals represent nodes.
Of note are the TEXT nodes following the HEAD, BODY and TAIL nodes of the BLOCK production: these refer to punctuation marks. Furthermore, although a TEXT node will generally have a non-zero-length string, in the specific case of ‘.Bd -literal’, an empty line will produce a zero-length string. Multiple body parts are only found in invocations of ‘Bl -column’, where a new body introduces a new phrase.
The mdoc(7) syntax tree accommodates broken block structures as well. The ENDBODY node is available to end the formatting associated with a given block before the physical end of that block. It has a non-null end field, is of the BODY type, has the same tok as the BLOCK it is ending, and has a pending field pointing to that BLOCK's BODY node. It is an indirect child of that BODY node and has no children of its own.
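The properties listed above can be restated as a pair of predicates. This is only a sketch: struct eb_node is a hypothetical reduction of struct mdoc_node, the end and tok fields are modelled as plain ints, and neither function exists in the library.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct mdoc_node; the real type differs. */
enum eb_type { EB_BLOCK, EB_HEAD, EB_BODY, EB_TEXT };

struct eb_node {
	enum eb_type	 type;
	int		 tok;		/* generating macro (assumed int) */
	int		 end;		/* non-zero marks an ENDBODY node */
	struct eb_node	*pending;	/* BODY whose formatting ends here */
	struct eb_node	*child;		/* first child, or NULL */
};

/* An ENDBODY node is a BODY-type node with a non-null end field. */
int
eb_is_endbody(const struct eb_node *n)
{
	return n != NULL && n->type == EB_BODY && n->end != 0;
}

/*
 * Check the remaining properties from the text: no children, and a
 * pending pointer to a BODY node carrying the same tok.
 */
int
eb_endbody_ok(const struct eb_node *n)
{
	if (!eb_is_endbody(n))
		return 0;
	return n->child == NULL &&
	    n->pending != NULL &&
	    n->pending->type == EB_BODY &&
	    n->pending->tok == n->tok;
}
```

A formatter could use the first predicate to decide when to close a block's formatting early, and the second as a sanity check while debugging.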
An ENDBODY node is generated when a block ends while one of its child blocks is still open, like in the following example:
.Ao ao
.Bo bo ac
.Ac bc
.Bc end
This example results in the following block structure:
BLOCK Ao
    HEAD Ao
    BODY Ao
        TEXT ao
        BLOCK Bo, pending -> Ao
            HEAD Bo
            BODY Bo
                TEXT bo
                TEXT ac
                ENDBODY Ao, pending -> Ao
                TEXT bc
TEXT end
Here, the formatting of the ‘Ao’ block extends from TEXT ao to TEXT ac, while the formatting of the ‘Bo’ block extends from TEXT bo to TEXT bc. It renders as follows in -Tascii mode:
<ao [bo ac> bc] end
Support for badly-nested blocks is only provided for backward compatibility with some older mdoc(7) implementations. Using badly-nested blocks is strongly discouraged; for example, the -Thtml and -Txhtml front-ends to mandoc(1) are unable to render them in any meaningful way. Furthermore, behaviour when encountering badly-nested blocks is not consistent across troff implementations, especially when using multiple levels of badly-nested blocks.
October 6, 2013 | NetBSD 7.2 |