#
# CDDL HEADER START
#
# The contents of this file are subject to the terms of the
# Common Development and Distribution License (the "License").
# You may not use this file except in compliance with the License.
#
# You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
# or http://www.opensolaris.org/os/licensing.
# See the License for the specific language governing permissions
# and limitations under the License.
#
# When distributing Covered Code, include this CDDL HEADER in each
# file and include the License file at usr/src/OPENSOLARIS.LICENSE.
# If applicable, add the following below this CDDL HEADER, with the
# fields enclosed by brackets "[]" replaced with your own identifying
# information: Portions Copyright [yyyy] [name of copyright owner]
#
# CDDL HEADER END
#

#
# Copyright 2009 Sun Microsystems, Inc.  All rights reserved.
# Use is subject to license terms.
#




Notes On Cross Link-Editor Support in libld.so
-----------------------------------------

The Solaris link-editor is used in two contexts:

	1) The standard ld command
	2) Via the runtime linker (ld.so.1), when a program
	   calls dlopen() on a relocatable object (ET_REL).

To support both uses, it is packaged as a sharable library (libld.so).
The ld command is therefore a simple wrapper that uses libld.

libld.so is a cross linker. This means that it can link objects for
a system other than the system running the link-editor (e.g. a link-editor
running on an amd64 system processing sparc objects). Consequently, every
instance of libld.so contains code for building objects for every supported
target. It is not necessary to build libld specifically for the
platform you're targeting. This is possible because we only support
Solaris/ELF, with a small number of platforms, and the additional code
required per target is small.

At initialization, the caller of libld.so specifies the type of objects
being linked. By default, the ld command determines the machine type and
class of the object being generated from the first ELF object processed
from the command line. The -64 and -ztarget options exist to change this
default, which is useful when creating an object entirely from an archive
library or a mapfile. During initialization, the link-editor configures
itself to build an output object of the specified type. This is done via
indirection, using the global ld_targ structure to access code, data, and
constants for the specified target.
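
The indirection described above can be pictured with a small sketch.
This is not the libld code: the Target type here is a drastically
reduced, hypothetical stand-in, and the target names ("aaa", "bbb"),
the field names, and the relocation numbers are all invented for
illustration. Only the name ld_targ is taken from libld.

```c
#include <string.h>

/*
 * Hypothetical, much-reduced stand-in for libld's Target type. The
 * real definition lives in _libld.h and has many more fields.
 */
typedef struct {
	const char	*t_name;		/* target name */
	int		t_got_entsize;		/* a per-target constant */
	int		(*t_is_pcrel)(int rtype); /* per-target code */
} Target;

/* Invented relocation numbering, for illustration only. */
int
is_pcrel_aaa(int rtype)
{
	return (rtype == 2);
}

int
is_pcrel_bbb(int rtype)
{
	return (rtype == 5 || rtype == 6);
}

static const Target targ_aaa = { "aaa", 4, is_pcrel_aaa };
static const Target targ_bbb = { "bbb", 8, is_pcrel_bbb };

/* The one global through which common code reaches target code. */
Target ld_targ;

/* Select a target once, at initialization. Returns 0 on success. */
int
init_target(const char *name)
{
	if (strcmp(name, targ_aaa.t_name) == 0)
		ld_targ = targ_aaa;
	else if (strcmp(name, targ_bbb.t_name) == 0)
		ld_targ = targ_bbb;
	else
		return (-1);
	return (0);
}

/* Common code: no target-specific names appear here. */
int
common_got_size(int nentries)
{
	return (nentries * ld_targ.t_got_entsize);
}
```

Code for both sketch targets is always compiled in; the choice made in
init_target() decides which one the common code reaches.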

There are two types of source files used to build libld.so:

	1) Common code used for all targets
	2) Target-specific code used only when linking for
	   a given target.

All of these files reside in usr/src/cmd/sgs/libld/common. However,
it is easy to see which files belong in each category by examining
the object lists maintained in usr/src/cmd/sgs/libld/Makefile.com.
In addition, the target-specific files usually include the target
in their name (e.g. machrel.sparc.c).

Although the target-dependent and target-independent (common) code are
well separated, they are interdependent. The common code is explicitly
aware of target-specific section types that can occur only for some
targets (e.g. SHT_AMD64_UNWIND). This is not an architecture that allows
for arbitrary target support to be dynamically plugged into an unchanged
platform-independent core. Rather, it is an organization that allows
a common core to support all the targets it knows about in a way that
is understandable and maintainable. A truly pluggable architecture
would be considerably more opaque and complex, and is neither necessary
nor desirable, given the wide commonality between modern computer
architectures.

It is possible to add support for new targets to libld.so. The process
of doing so is largely a matter of examining the files for existing
platforms, studying the ABI for the new target platform, and then
filling in the missing pieces for the new target. The remainder of this
file consists of sections that describe some of the issues and steps
that you will encounter in adding a new target.

-----------------------------------------------------------------------------
The relocation code used by ld is shared by the runtime linker (ld.so.1)
and by the kernel module loader (krtld), and is therefore found under
usr/src/uts. You must add code for a relocation engine to support the
new target. To do so, examine the common header files:

	usr/src/uts/common/krtld/reloc.h
	usr/src/uts/common/krtld/reloc_defs.h

and the existing relocation engines:

	usr/src/uts/intel/amd64/krtld/doreloc.c
	usr/src/uts/intel/ia32/krtld/doreloc.c
	usr/src/uts/sparc/krtld/doreloc.c

The ABI for the target platform will be the primary information
you require. If your new system has attributes not found in an existing
target, you may have to add/modify fields in the Rel_entry struct typedef
(reloc_defs.h), or you may have to add new flags. Either sort of change
may require you to also modify the existing relocation engines, and
undoubtedly the common code in libld.so as well.

When compiled for use by libld, the relocation engine requires an
argument named "bswap". Each relocation engine must be prepared to
swap the data bytes it is operating on. This support allows a link-editor
running on a platform with a different byte order than the target to
build objects for that target. To see how this is implemented, and how
to ifdef that support so it only exists in the libld version of
the engine, examine the code for the existing engines.
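
As a rough illustration of what a "bswap" argument implies, the sketch
below applies a simple 32-bit additive fixup. It is not taken from any
doreloc.c: the function names and the fixup itself are hypothetical,
and the ifdef arrangement used by the real engines is omitted. The
point is only that the stored word must be converted to host order
before arithmetic and back to target order afterwards.

```c
#include <stdint.h>
#include <string.h>

/* Reverse the byte order of a 32-bit value. */
uint32_t
swap32(uint32_t v)
{
	return ((v >> 24) | ((v >> 8) & 0x0000ff00) |
	    ((v << 8) & 0x00ff0000) | (v << 24));
}

/*
 * Hypothetical engine entry point: add "value" to the 32-bit word
 * stored at "off". A non-zero "bswap" means the stored data is in
 * the opposite byte order from the host running this code.
 */
int
do_reloc_add32(unsigned char *off, uint32_t value, int bswap)
{
	uint32_t	word;

	(void) memcpy(&word, off, sizeof (word)); /* may be unaligned */
	if (bswap)
		word = swap32(word);		/* target -> host order */
	word += value;
	if (bswap)
		word = swap32(word);		/* host -> target order */
	(void) memcpy(off, &word, sizeof (word));
	return (1);
}
```

Skipping either swap gives wrong results as soon as the addition
carries from one byte into the next.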

-----------------------------------------------------------------------------
You must create a target subdirectory in usr/src/cmd/sgs/include,
and construct a machdep_XXX.h file (where XXX is the name of the
target). The machdep files for the current platforms can be helpful:

	usr/src/cmd/sgs/include/sparc/machdep_sparc.h
	usr/src/cmd/sgs/include/i386/machdep_x86.h

Note that these files support both the 32-bit and 64-bit versions of
a given platform, so there is only one subdirectory and machdep
file for each platform (e.g. "sparc", instead of "sparc" and "sparcv9").

Once you have created the target machdep_XXX.h file, you must edit:

	usr/src/cmd/sgs/include/machdep.h

and add a #include for your new file to it, surrounded by the
appropriate #ifdef for the target platform.

This two-level structure allows us to #include machdep information
in two different ways:

	1) Code that wants support for the current platform,
	   regardless of which platform that is, can include
	   usr/src/cmd/sgs/include/machdep.h. The runtime linker
	   (ld.so.1) is the primary consumer of this form.

	2) Code that needs to support multiple targets must never
	   include the generic machdep.h from (1) above. Instead,
	   such code explicitly includes the machdep file for the target
	   it is interested in. For example:

		#include <sparc/machdep_sparc.h>

	   libld.so uses this form to build non-native target
	   code.

You will find that most of the constants defined in the target
machdep file end up as initialization for the information that
libld.so accesses via the ld_targ global variable.

-----------------------------------------------------------------------------
Study the definition of the Target typedef in

	usr/src/cmd/sgs/libld/common/_libld.h

This is the type of the ld_targ global variable. Filling in a complete
copy of this definition is the primary task involved in adding support
for a new target to libld.so, so it will be helpful to be familiar with
it before you dive deeper. libld follows two simple rules with regard
to ld_targ and the Target type:

	1) The target-independent common code can only access
	   target-dependent code or data via the ld_targ global
	   variable.

	2) The target-dependent code can access the common
	   code or data freely.

A given link-editor invocation is always for a single target. The choice
of target is made at initialization, and does not change within a
single link. Code for the other targets is present, but is not
accessed.

-----------------------------------------------------------------------------
Files containing the target-specific code to support the new
platform must be added to libld.so. Examine the object lists
in usr/src/cmd/sgs/libld/Makefile.com to see the files for existing
platforms, and read those files to get a sense of what is required.

Among the other files, every platform will have a file named
machrel.XXX.c. This file contains the relocation-related functions,
and it also contains an init function for the target. This init function
is responsible for initializing the ld_targ global variable so that
the common code will use the code and definitions for your
target.

You should start by creating a machrel.XXX.c file for your new
target. Add other files as needed. Be aware that any functions or
variables you place in these target-dependent files must either
be static, or must have names that will not collide with the names
used by the rest of libld.so. The easiest way to do this is to
add a target suffix to the end of all such non-local names
(e.g. foo_sparc() instead of foo()).

The existing code is the documentation for this phase of things: the
best way to determine what a given function should do is to read the
code for other platforms, taking into account the similarities and
differences in the ABI for your new target and those existing ones.

-----------------------------------------------------------------------------
You may find that your new target requires support for new concepts
not found in other targets. A common example of this might be
a new target-specific ELF section type (e.g. SHT_AMD64_UNWIND). Another
might be details involving PIC code and PLT generation (as found for
sparc). It may be necessary to add new fields to the ld_targ global
variable, and to modify the libld.so common code to use these new
fields.

It is a standard convention that NULL function pointers are used to
represent functionality not required by a given target. Although the
common code can consult ld_targ.t_m.m_mach to determine the target it
is linking for, and although there is some code that does this, it
is generally undesirable and unnecessary. Instead, the common code
should test for such pointers, as with this sparc-specific example
from update.c:

	/*
	 * Assign a got offset if necessary.
	 */
	if ((ld_targ.t_mr.mr_assign_got != NULL) &&
	    (*ld_targ.t_mr.mr_assign_got)(ofl, sdp) == S_ERROR)
		return ((Addr)S_ERROR);

It may be tempting to include information in the comment that explains
the target-specific nature of this, and that may even be appropriate.
Consider, however, that a new target may come along with the same feature
later, and if that occurs, your comments will instantly be dated. In general,
the use of ld_targ is a strong hint to the reader that they should go read
the target-specific code referenced to understand what is going on. It is
best to supply comments at the call site that describe the operation
in generic terms (e.g. "assign a got if necessary") instead of in
explicit target terms (e.g. "Assign a sparc got if necessary"). Of
course, some features are extremely target-specific (like amd64 unwind
sections), and it doesn't really help to be obscure in such cases.
This is a judgement call.

If you do add a new field to ld_targ that uses NULL to represent
an optional feature *YOU MUST DOCUMENT IT AS SUCH*. You will find
comments in _libld.h for existing optional fields. It suffices to
add a comment for your new field. In the absence of such a comment,
the common code assumes that all function pointers are safe to call
through (dereference) without first testing them.
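
The convention can be sketched in isolation as follows. Everything in
this fragment is a simplified assumption: Target_mr, the two hook
signatures, and the stand-in S_ERROR value do not match the real
definitions in _libld.h. It shows only the pattern of a documented
optional field that the common code tests before calling through it.

```c
#include <stddef.h>

#define	S_ERROR	(-1)	/* stand-in for libld's error value */

typedef struct {
	/* Required: common code calls through it unconditionally. */
	int	(*mr_reloc_count)(int nsyms);
	/*
	 * Optional: NULL if the target does not need it. Documented
	 * here so that callers know they must test before calling.
	 */
	int	(*mr_assign_got)(int symndx);
} Target_mr;

int
count_simple(int nsyms)
{
	return (nsyms);
}

/* Invented hook: fails for a negative symbol index. */
int
assign_got_x(int symndx)
{
	return (symndx < 0 ? S_ERROR : symndx * 8);
}

const Target_mr targ_with_got = { count_simple, assign_got_x };
const Target_mr targ_without_got = { count_simple, NULL };

/* Common code: returns 0 on success or when the hook is absent. */
int
update_sym(const Target_mr *mr, int symndx)
{
	/* Assign a got offset if necessary. */
	if ((mr->mr_assign_got != NULL) &&
	    ((*mr->mr_assign_got)(symndx) == S_ERROR))
		return (S_ERROR);
	return (0);
}
```

A target that leaves the hook NULL is skipped without the common code
ever asking which target it is linking for.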

-----------------------------------------------------------------------------
Byte swapping is a big issue in cross linking, as the system running
the link-editor may have the opposite byte order from the target. It is
important to know when, and when not, to swap bytes.

If the build system and target have different byte orders, the
FLG_OF1_ENCDIFF bit of the ofl_flags1 field of the output file
descriptor will be set. If this bit is not set, the target and
system byte orders are the same, and no byte swapping
is required.
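
The comparison behind such a flag can be sketched as below. This is an
assumption about the general technique, not the libld implementation:
the function names are hypothetical, and DATA_LSB and DATA_MSB are
local constants that merely mirror the ELFDATA2LSB and ELFDATA2MSB
values from the ELF specification.

```c
#include <stdint.h>

#define	DATA_LSB	1	/* little-endian; mirrors ELFDATA2LSB */
#define	DATA_MSB	2	/* big-endian; mirrors ELFDATA2MSB */

/* Determine the byte order of the machine running this code. */
int
host_data_encoding(void)
{
	const uint16_t	probe = 1;

	/* On a little-endian host the low-order byte is stored first. */
	return ((*(const unsigned char *)&probe == 1) ?
	    DATA_LSB : DATA_MSB);
}

/* Non-zero if data for the target must be byte swapped on this host. */
int
encoding_differs(int target_encoding)
{
	return (host_data_encoding() != target_encoding);
}
```

The result is computed once per link and then consulted wherever raw
target data is read or written.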

libld uses libelf to read and write objects. libelf automatically
swaps bytes for the sections it knows about, such as symbol tables,
relocation records, and the usual ELF plumbing. It is therefore never
necessary for your code to swap the bytes in this data. If you find that
this is not the case, you have probably uncovered a bug in libelf that
you should look into.

The big exception to libelf transparently handling byte swapping is
progbits sections (SHT_PROGBITS). libelf does not understand program
code or data as anything other than a series of byte values, and as such,
cannot do byte swapping for them. If your code examines or modifies
such data, you are responsible for handling the byte swapping required.

The OFL_SWAP_RELOC macros found in _libld.h can be helpful in making such
determinations. You should use these macros instead of writing your own
tests for this, as they have high documentation value. If you find they
don't handle your target, add a new one that does.

GOT and PLT sections are SHT_PROGBITS. You will probably find
that the vast majority of byte swapping you have to handle
concerns the handling of these items.

libld contains generic functions for byte swapping:

	ld_bswap_Word();
	ld_bswap_Xword();

These functions are built on top of the BSWAP_ macros found
in usr/src/cmd/sgs/include/_machelf.h:

	BSWAP_HALF
	BSWAP_WORD
	BSWAP_XWORD

When copying data from one address to another in a cross link environment,
the source and/or destination addresses may not have adequate alignment for
the data type being copied. For example, a sparc platform cannot access
8-byte data types on 4-byte boundaries, but it might need to do so when
linking x86 objects where the alignment of such data can be 4. The
UL_ASSIGN macros can be used to copy potentially unaligned data:

	UL_ASSIGN_HALF
	UL_ASSIGN_WORD
	UL_ASSIGN_XWORD

The UL_ASSIGN_BSWAP macros do unaligned copies, and also perform
byte swapping when the linker host and target byte orders are
different:

	UL_ASSIGN_BSWAP_HALF
	UL_ASSIGN_BSWAP_WORD
	UL_ASSIGN_BSWAP_XWORD
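
The effect of an unaligned copy combined with a byte swap can be
sketched with plain functions. The names below are hypothetical, and
the real UL_ASSIGN macros may be implemented quite differently; this
fragment only demonstrates the behavior, using memcpy() so that neither
address needs any particular alignment.

```c
#include <stdint.h>
#include <string.h>

/* Reverse the byte order of a 32-bit value. */
uint32_t
bswap_word(uint32_t v)
{
	return ((v >> 24) | ((v >> 8) & 0x0000ff00) |
	    ((v << 8) & 0x00ff0000) | (v << 24));
}

/*
 * Copy a 32-bit word from src to dst, neither of which need be
 * aligned, reversing its bytes when "swap" is non-zero.
 */
void
ul_assign_bswap_word(void *dst, const void *src, int swap)
{
	uint32_t	w;

	(void) memcpy(&w, src, sizeof (w));
	if (swap)
		w = bswap_word(w);
	(void) memcpy(dst, &w, sizeof (w));
}
```

Going through a local variable keeps the access legal on hosts that
fault on misaligned loads and stores.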

If you are reading/writing to relocation data, the following
routines understand relocation records and will get/set the
proper amount of data while handling any needed swapping:

	ld_reloc_targval_get()
	ld_reloc_targval_set()

Byte swapping is a fertile area for mistakes. If you're having trouble
getting a successful link in a cross link situation, you should always
try performing the link on a platform with the same byte order as the
target. If that link succeeds, then you are looking at a bug involving
incorrect byte swapping.

-----------------------------------------------------------------------------
As mentioned above, incorrect byte swapping is a common error when
developing libld target-specific code. In addition to trying a build
machine with the same byte order as the target, elfdump can also be a
powerful tool for debugging. The first step with elfdump is to simply
dump everything and read it, looking for obviously bad information:

	% elfdump outobj 2>&1 | more

elfdump tries to do sanity checking on the objects it
displays. Hence, the following command is a common
idiom:

	% elfdump outobj > /dev/null

Any problems with the file that elfdump can detect will be
written to stderr.

-----------------------------------------------------------------------------
Once you have the target-specific code in place, you must modify the
libld initialization code so that it will know how to use it. This
logic is found in

	usr/src/cmd/sgs/libld/common/ldmain.c

in the function ld_init_target().

-----------------------------------------------------------------------------
The ld front end program that uses libld must be modified so that
the "-z target=platform" command line option recognizes your
new target. This code is found in

	usr/src/cmd/sgs/ld/common

The change consists of adding an additional strcasecmp() to the
command line processing for -ztarget.
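
The shape of such a change might look like the sketch below. It is not
the code from the ld front end: the targ_t enumeration and the function
name are hypothetical, and the set of accepted names shown is an
assumption made for illustration.

```c
#include <strings.h>	/* strcasecmp() */

/* Hypothetical result type; the real ld code uses its own values. */
typedef enum { TARG_NONE = 0, TARG_SPARC, TARG_X86 } targ_t;

/* Map the value of "-z target=" onto a target, ignoring case. */
targ_t
parse_ztarget(const char *value)
{
	if (strcasecmp(value, "sparc") == 0)
		return (TARG_SPARC);
	if (strcasecmp(value, "x86") == 0)
		return (TARG_X86);
	/* A new target would add one more strcasecmp() test here. */
	return (TARG_NONE);
}
```

Because the comparison ignores case, "SPARC" and "sparc" select the
same target, and an unrecognized name falls through to TARG_NONE.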

-----------------------------------------------------------------------------
You probably changed things getting your target integrated.
Please update this document to reflect your changes.