.\"	$Id: mandoc_html.3,v 1.17 2018/06/25 16:54:59 schwarze Exp $
.\"
.\" Copyright (c) 2014, 2017, 2018 Ingo Schwarze <schwarze@openbsd.org>
.\"
.\" Permission to use, copy, modify, and distribute this software for any
.\" purpose with or without fee is hereby granted, provided that the above
.\" copyright notice and this permission notice appear in all copies.
.\"
.\" THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
.\" WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
.\" MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
.\" ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
.\" WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
.\" ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
.\" OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
.\"
.Dd $Mdocdate: June 25 2018 $
.Dt MANDOC_HTML 3
.Os
.Sh NAME
.Nm mandoc_html
.Nd internals of the mandoc HTML formatter
.Sh SYNOPSIS
.In "html.h"
.Ft void
.Fn print_gen_decls "struct html *h"
.Ft void
.Fn print_gen_comment "struct html *h" "struct roff_node *n"
.Ft void
.Fn print_gen_head "struct html *h"
.Ft struct tag *
.Fo print_otag
.Fa "struct html *h"
.Fa "enum htmltag tag"
.Fa "const char *fmt"
.Fa ...
.Fc
.Ft void
.Fo print_tagq
.Fa "struct html *h"
.Fa "const struct tag *until"
.Fc
.Ft void
.Fo print_stagq
.Fa "struct html *h"
.Fa "const struct tag *suntil"
.Fc
.Ft void
.Fo print_text
.Fa "struct html *h"
.Fa "const char *word"
.Fc
.Ft char *
.Fo html_make_id
.Fa "const struct roff_node *n"
.Fc
.Ft int
.Fo html_strlen
.Fa "const char *cp"
.Fc
.Sh DESCRIPTION
The mandoc HTML formatter is not a formal library.
However, as it is compiled into more than one program, in particular
.Xr mandoc 1
and
.Xr man.cgi 8 ,
and because it may be security-critical in some contexts,
some documentation is useful to help use it correctly and
to prevent XSS vulnerabilities.
.Pp
The formatter produces HTML output on the standard output.
Since proper escaping is usually required and best taken care of
at one central place, the language-specific formatters
.Po
.Pa *_html.c ,
see
.Sx FILES
.Pc
are not supposed to print directly to
.Dv stdout
using functions like
.Xr printf 3 ,
.Xr putc 3 ,
.Xr puts 3 ,
or
.Xr write 2 .
Instead, they are expected to use the output functions declared in
.Pa html.h
and implemented as part of the main HTML formatting engine in
.Pa html.c .
.Ss Data structures
These structures are declared in
.Pa html.h .
.Bl -tag -width Ds
.It Vt struct html
Internal state of the HTML formatter.
.It Vt struct tag
One entry for the LIFO stack of HTML elements.
Members are
.Fa "enum htmltag tag"
and
.Fa "struct tag *next" .
.El
.Ss Private interface functions
The function
.Fn print_gen_decls
prints the opening
.Ao Pf \&? Ic xml ? Ac
and
.Aq Pf \&! Ic DOCTYPE
declarations required for the current document type.
.Pp
The function
.Fn print_gen_comment
prints the leading comments, usually containing a copyright notice
and license, as an HTML comment.
It is intended to be called right after opening the
.Aq Ic HTML
element.
Pass the first
.Dv ROFFT_COMMENT
node in
.Fa n .
.Pp
The function
.Fn print_gen_head
prints the opening
.Aq Ic META
and
.Aq Ic LINK
elements for the document
.Aq Ic HEAD ,
using the
.Fa style
member of
.Fa h
unless that is
.Dv NULL .
It uses
.Fn print_otag
which takes care of properly encoding attributes,
which is relevant for the
.Fa style
link in particular.
.Pp
The function
.Fn print_otag
prints the start tag of an HTML element with the name
.Fa tag ,
optionally including the attributes specified by
.Fa fmt .
If
.Fa fmt
is the empty string, no attributes are written.
Each letter of
.Fa fmt
specifies one attribute to write.
Most attributes require one
.Vt char *
argument which becomes the value of the attribute.
The arguments have to be given in the same order as the attribute letters.
If an argument is
.Dv NULL ,
the respective attribute is not written.
.Bl -tag -width 1n -offset indent
.It Cm c
Print a
.Cm class
attribute.
This attribute letter can optionally be followed by the modifier letter
.Cm T .
In that case, a
.Cm title
attribute with the same value is also printed.
.It Cm h
Print a
.Cm href
attribute.
This attribute letter can optionally be followed by a modifier letter.
If followed by
.Cm R ,
it formats the link as a local one by prefixing a
.Sq #
character.
If followed by
.Cm I ,
it interprets the argument as a header file name
and generates a link using the
.Xr mandoc 1
.Fl O Cm includes
option.
If followed by
.Cm M ,
it takes two arguments instead of one, a manual page name and
section, and formats them as a link to a manual page using the
.Xr mandoc 1
.Fl O Cm man
option.
.It Cm i
Print an
.Cm id
attribute.
.It Cm \&?
Print an arbitrary attribute.
This format letter requires two
.Vt char *
arguments, the attribute name and the value.
The name must not be
.Dv NULL .
.It Cm s
Print a
.Cm style
attribute.
If present, it must be the last format letter.
It requires two
.Vt char *
arguments.
The first is the name of the style property, the second its value.
.El
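.Pp
The mapping from format letters to arguments can be illustrated with a
minimal, self-contained sketch.
It is not the code from
.Pa html.c :
the names
.Fn sk_attr
and
.Fn sk_otag
and the buffer-based interface are hypothetical,
only the letters
.Cm c ,
.Cm cT ,
.Cm h ,
.Cm hR ,
.Cm i ,
and
.Cm \&?
are modelled, and no HTML encoding is performed.
.Bd -literal -offset indent
```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of html.h: append name="prefix+val". */
static void
sk_attr(char *buf, size_t sz, const char *name,
    const char *prefix, const char *val)
{
	size_t len = strlen(buf);

	assert(len < sz);
	/* The double quotes are passed via %c to keep the format simple. */
	snprintf(buf + len, sz - len, " %s=%c%s%s%c",
	    name, '"', prefix, val, '"');
}

/*
 * Hypothetical model of the fmt handling: write a start tag to buf.
 * Only c, cT, h, hR, i and ? are modelled; nothing is HTML-encoded.
 */
static void
sk_otag(char *buf, size_t sz, const char *elem, const char *fmt, ...)
{
	va_list ap;
	const char *name, *val;
	size_t len;
	char letter, mod;

	snprintf(buf, sz, "<%s", elem);
	va_start(ap, fmt);
	while (*fmt != 0) {
		letter = *fmt++;
		mod = 0;
		/* An optional modifier letter may follow c or h. */
		if ((letter == 'c' && *fmt == 'T') ||
		    (letter == 'h' && *fmt == 'R'))
			mod = *fmt++;
		switch (letter) {
		case 'c':
			name = "class";
			break;
		case 'h':
			name = "href";
			break;
		case 'i':
			name = "id";
			break;
		case '?':
			/* Arbitrary attribute: the name comes first. */
			name = va_arg(ap, const char *);
			break;
		default:
			abort();	/* letter not covered by this sketch */
		}
		val = va_arg(ap, const char *);
		if (val == NULL)
			continue;	/* NULL argument: skip attribute */
		sk_attr(buf, sz, name, mod == 'R' ? "#" : "", val);
		if (mod == 'T')
			sk_attr(buf, sz, "title", "", val);
	}
	va_end(ap);
	len = strlen(buf);
	snprintf(buf + len, sz - len, ">");
}
```
.Ed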
.Pp
.Fn print_otag
uses the private function
.Fn print_encode
to take care of HTML encoding.
If required by the element type, it remembers in
.Fa h
that the element is open.
The function
.Fn print_tagq
is used to close out all open elements up to and including
.Fa until ;
.Fn print_stagq
is a variant to close out all open elements up to but excluding
.Fa suntil .
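.Pp
The difference between closing up to and including
.Fa until
and closing up to but excluding
.Fa suntil
can be sketched on a simplified LIFO stack.
All names starting with
.Li sk_
are hypothetical, the sketch merely counts the elements it closes
instead of printing end tags, and it is not the implementation in
.Pa html.c .
.Bd -literal -offset indent
```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct tag with its two documented members. */
struct sk_tag {
	int		 tag;	/* stands in for enum htmltag */
	struct sk_tag	*next;	/* element opened just before this one */
};

/* Open an element: the most recently opened one is the stack top. */
static struct sk_tag *
sk_open(struct sk_tag **top, int tag)
{
	struct sk_tag *t = malloc(sizeof(*t));

	assert(t != NULL);
	t->tag = tag;
	t->next = *top;
	*top = t;
	return t;
}

/* Close the innermost open element. */
static void
sk_pop(struct sk_tag **top)
{
	struct sk_tag *t = *top;

	*top = t->next;
	free(t);
}

/* In the spirit of print_tagq: close up to and including until. */
static int
sk_close_incl(struct sk_tag **top, const struct sk_tag *until)
{
	int closed = 0, last = 0;

	while (*top != NULL && !last) {
		last = (*top == until);	/* check before the entry is freed */
		sk_pop(top);
		closed++;
	}
	return closed;
}

/* In the spirit of print_stagq: close everything above suntil only. */
static int
sk_close_excl(struct sk_tag **top, const struct sk_tag *suntil)
{
	int closed = 0;

	while (*top != NULL && *top != suntil) {
		sk_pop(top);
		closed++;
	}
	return closed;
}
```
.Ed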
.Pp
The function
.Fn print_text
prints HTML element content.
It uses the private function
.Fn print_encode
to take care of HTML encoding.
If the document has requested a non-standard font, for example using a
.Xr roff 7
.Ic \ef
font escape sequence,
.Fn print_text
wraps
.Fa word
in an HTML font selection element using the
.Fn print_otag
and
.Fn print_tagq
functions.
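.Pp
Why encoding is best done in one central place can be seen in a small sketch.
The function
.Fn sk_encode
is hypothetical and much simpler than the private
.Fn print_encode :
it assumes that replacing
.Sq < ,
.Sq > ,
.Sq & ,
and the double quote is sufficient, and it does not handle
.Xr roff 7
escape sequences at all.
.Bd -literal -offset indent
```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical sketch of central HTML encoding, not the mandoc code:
 * copy src to dst, replacing the four characters that are special in
 * HTML content and attribute values.  Output that does not fit into
 * dst is dropped at a character boundary, never inside an entity.
 */
static void
sk_encode(char *dst, size_t sz, const char *src)
{
	size_t len = 0, n;

	assert(sz > 0);
	dst[0] = 0;
	for (; *src != 0; src++) {
		const char *rep;
		char one[2] = { *src, 0 };

		switch (*src) {
		case '<':
			rep = "&lt;";
			break;
		case '>':
			rep = "&gt;";
			break;
		case '&':
			rep = "&amp;";
			break;
		case '"':
			rep = "&quot;";
			break;
		default:
			rep = one;	/* ordinary character, copied as is */
			break;
		}
		n = strlen(rep);
		if (n >= sz - len)
			break;		/* stop rather than cut an entity */
		memcpy(dst + len, rep, n + 1);
		len += n;
	}
}
```
.Ed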
.Pp
The function
.Fn html_make_id
takes a node containing one or more text children
and returns a newly allocated string containing the concatenation
of the child strings, with blanks replaced by underscores.
If the node
.Fa n
contains any non-text child node,
.Fn html_make_id
returns
.Dv NULL
instead.
The caller is responsible for freeing the returned string.
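.Pp
The contract described above can be sketched as follows.
The structure
.Vt struct sk_child
and the function
.Fn sk_make_id
are hypothetical simplifications rather than the real
.Vt struct roff_node
interface, and the sketch assumes that the child strings are joined
without any separator.
.Bd -literal -offset indent
```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, much simplified stand-in for a child of a roff node. */
struct sk_child {
	int			 is_text;	/* nonzero for a text node */
	const char		*string;	/* used only if is_text */
	const struct sk_child	*next;		/* next sibling or NULL */
};

/* Return a newly allocated id string, or NULL; the caller frees it. */
static char *
sk_make_id(const struct sk_child *first)
{
	const struct sk_child *c;
	size_t len = 0;
	char *id, *cp;

	/* First pass: reject non-text children and measure the result. */
	for (c = first; c != NULL; c = c->next) {
		if (!c->is_text)
			return NULL;
		len += strlen(c->string);
	}
	id = malloc(len + 1);
	if (id == NULL)
		return NULL;

	/* Second pass: concatenate, then turn blanks into underscores. */
	id[0] = 0;
	for (c = first; c != NULL; c = c->next)
		strcat(id, c->string);
	for (cp = id; *cp != 0; cp++)
		if (*cp == ' ')
			*cp = '_';
	return id;
}
```
.Ed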
.Pp
The function
.Fn html_strlen
counts the number of characters in
.Fa cp .
It is used as a crude estimate of the width needed to display a string.
.Pp
The functions
.Fn print_eqn ,
.Fn print_tbl ,
and
.Fn print_tblclose
are not yet documented.
.Sh FILES
.Bl -tag -width mandoc_aux.c -compact
.It Pa main.h
declarations of public functions for use by the main program,
not yet documented
.It Pa html.h
declarations of data types and private functions
for use by language-specific HTML formatters
.It Pa html.c
main HTML formatting engine and utility functions
.It Pa mdoc_html.c
.Xr mdoc 7
HTML formatter
.It Pa man_html.c
.Xr man 7
HTML formatter
.It Pa tbl_html.c
.Xr tbl 7
HTML formatter
.It Pa eqn_html.c
.Xr eqn 7
HTML formatter
.It Pa out.h
declarations of data types and private functions
for shared use by all mandoc formatters,
not yet documented
.It Pa out.c
private functions for shared use by all mandoc formatters
.It Pa mandoc_aux.h
declarations of common mandoc utility functions, see
.Xr mandoc 3
.It Pa mandoc_aux.c
implementation of common mandoc utility functions
.El
.Sh SEE ALSO
.Xr mandoc 1 ,
.Xr mandoc 3 ,
.Xr man.cgi 8
.Sh AUTHORS
.An -nosplit
The mandoc HTML formatter was written by
.An Kristaps Dzonsons Aq Mt kristaps@bsd.lv .
It is maintained by
.An Ingo Schwarze Aq Mt schwarze@openbsd.org ,
who also wrote this manual.