Monday, December 10, 2007

Linker and Loader

Runtime Linker, Static Linker and Loader
===============================

The concepts of the runtime linker and the static linker are the same on Solaris and Linux.

There are two linkers:

A static linker (also called the link-editor), which you use to link your application binaries after
compiling your source code into object files.

A dynamic linker (also called the runtime linker or interpreter), which performs the runtime linking of dynamic executables and shared libraries.
The runtime linker plays an important role in the execution of the program.
The dynamic linker is also called the loader (ld.so.1(1) on Solaris), because it loads the shared libraries, and it also loads the executable file if exec() has not already loaded it.

On Open-Solaris the dynamic linker is ld.so.1; it resides at /usr/lib/ld.so.1, which is a soft link (symbolic link) to /lib/ld.so.1.
On Linux the dynamic linker is ld-linux.so.2, which resides at /lib/ld-linux.so.2 (on 32-bit x86; other architectures use a different name).

The first kind of linker is the link-editor (also called the linker, static linker, native linker or program linker, i.e. ld), which you use to link object files into shared objects or executables. The user can either execute the link-editor directly, or it can be invoked by the compiler.

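Please do not invoke the link-editor directly; let the compiler driver invoke it. If you run the linker directly, the .init and .fini sections will not be registered properly, since it is the compiler driver that normally adds the required startup files to the link.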

By default, the static linker makes all applications' symbols global in scope (making them into what is also called "exported symbols"). This means it puts the symbols into the dynamic symbol table of the resulting binary such that other binary modules can access those symbols.

The dynamic relocations that the dynamic linker performs are only necessary for global (also known as external or exported) symbols. The static linker resolves references to local symbols (for example, names of static functions) statically when it links the binary.
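The difference between local and global symbols can be seen in a few lines of C. This is only a minimal sketch; the function names scale and exported_api are hypothetical and chosen purely for illustration.

```c
/* visibility.c - hypothetical names, for illustration only. */

/* 'static' gives internal linkage: this is a local symbol. The static
 * linker resolves calls to it at link time, so the dynamic linker never
 * needs to relocate references to it. */
static int scale(int x)
{
    return x * 2;
}

/* No 'static': external linkage, i.e. a global symbol. When this file is
 * built into a shared object, the symbol normally ends up in the dynamic
 * symbol table so that other modules can bind to it at run time. */
int exported_api(int x)
{
    return scale(x) + 1;
}
```

If this file is built as a shared object (for example with gcc -shared -fPIC), listing its dynamic symbols with nm -D should show exported_api but not scale.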


Linking Stages
============

There are three types of linking on Solaris, Linux and Windows.
1. Static Linking
-------------------
Static linking means that for each function your program calls, the machine code of that function is actually included in the executable file. Function calls are performed by calling the address of this code directly, the same way that the functions of your own program are called.

2. Dynamic Linking
-----------------------
Dynamic linking means that the application uses shared libraries.
The link against these libraries is done at build time of the executable, which records them as dependencies.
The required relocations of data and functions are done at load time and run time by the runtime linker.

3. Runtime Linking
-----------------------
Runtime linking is linking that happens when a program calls a function from a library that it was not linked against at build time. The library is mapped with dlopen() on Solaris, Linux and other UNIX systems, and with LoadLibrary() on Microsoft Windows. Both return a handle that is then passed to a symbol resolution function (dlsym() or GetProcAddress()), which returns a function pointer that can be called directly from the program as if it were any normal function.
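The dlopen()/dlsym() sequence described above can be sketched as follows. The helper name call_unary is hypothetical, and the sketch assumes the looked-up symbol has the signature double f(double).

```c
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical helper: load `lib` at run time, look up `sym`, and call it
 * as double f(double). Returns 0 on success and stores the result in *out;
 * returns -1 if the library or the symbol cannot be found. */
int call_unary(const char *lib, const char *sym, double arg, double *out)
{
    void *handle = dlopen(lib, RTLD_LAZY);   /* map the library */
    if (handle == NULL) {
        fprintf(stderr, "dlopen: %s\n", dlerror());
        return -1;
    }

    dlerror();                               /* clear any stale error */
    void *addr = dlsym(handle, sym);         /* resolve the symbol */
    const char *err = dlerror();
    if (err != NULL || addr == NULL) {
        fprintf(stderr, "dlsym: %s\n", err ? err : "symbol is NULL");
        dlclose(handle);
        return -1;
    }

    /* dlsym() returns void *; converting it to a function pointer with a
     * cast is the conventional idiom on POSIX systems. */
    double (*fn)(double) = (double (*)(double))addr;
    *out = fn(arg);

    dlclose(handle);
    return 0;
}
```

For example, call_unary("libm.so.6", "cos", 0.0, &r) looks up cos in the math library on a typical Linux system; the library file name is an assumption and differs between platforms. On older glibc versions the program also has to be linked with -ldl.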


There are two types of executable
============================
1. Static executable: This type of executable is built using static linking as described above.

What happens when running a static executable:
-----------------------------------------------
The kernel, via exec(), loads the executable and jumps directly to its entry point (ehdr:e_entry), that is, the _start address set by the linker. Static means "no interpreter" (no runtime linker).

2. Dynamic executable:
This type of executable is built using dynamic linking or runtime linking as described above.

What happens when running a dynamic executable:
------------------------------------------------
exec() (part of the kernel) works as the loader and does the following:
----------------------------------------------------------------
1. exec() loads the executable.
2. exec() inspects the executable to find the interpreter name.
3. exec() loads the interpreter(run time linker).
4. exec() jumps to the interpreter.

The interpreter (runtime linker)
--------------------------------
1. Analyses and loads the executable's dependencies.
2. Relocates any loaded objects.
3. Jumps to the executable.

The Interpreter
-------------------

An executable file may have one PT_INTERP program header element. If the executable file does not contain a PT_INTERP segment, it is a static executable. A dynamic executable always contains a PT_INTERP segment, which holds the path of the runtime linker.
During exec, i.e. while loading the executable, the system retrieves a path name from the PT_INTERP segment and creates the initial process image from the interpreter file's segments. That is, instead of using the original executable file's segment images, the system composes a memory image for the interpreter.
It is then the interpreter's responsibility to receive control from the system and provide an environment for the application program.
The interpreter receives control in one of two ways:
-----------------------------------------------------------
1. First, it may receive a file descriptor to read the executable file, positioned at
the beginning. It can use this file descriptor to read and/or map the
executable file's segments into memory.
2. Second, depending on the executable file format, the system may load the executable file into
memory instead of giving the interpreter an open file descriptor.

An interpreter may be either a shared object or an executable file. On Linux and Solaris it is a shared object.
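The lookup that exec() performs to find the interpreter name can be sketched in user space as follows. This is only an illustration, not the kernel's code: the function name elf_interp is hypothetical, the ELF file is assumed to be already read (or mapped) into memory, its byte order is assumed to match the host's, and <elf.h> is assumed to be available as it is on Linux.

```c
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return the interpreter path stored in the PT_INTERP
 * segment of an in-memory ELF image, or NULL if there is none (static
 * executable) or the image looks malformed. */
const char *elf_interp(const unsigned char *img, size_t size)
{
    size_t phoff, phentsize, phnum, i;
    int is64;

    if (size < EI_NIDENT || memcmp(img, ELFMAG, SELFMAG) != 0)
        return NULL;                          /* not an ELF file */

    /* Read the program header table location from the ELF header. */
    if (img[EI_CLASS] == ELFCLASS64 && size >= sizeof(Elf64_Ehdr)) {
        Elf64_Ehdr eh;
        memcpy(&eh, img, sizeof eh);
        phoff = eh.e_phoff; phentsize = eh.e_phentsize; phnum = eh.e_phnum;
        is64 = 1;
    } else if (img[EI_CLASS] == ELFCLASS32 && size >= sizeof(Elf32_Ehdr)) {
        Elf32_Ehdr eh;
        memcpy(&eh, img, sizeof eh);
        phoff = eh.e_phoff; phentsize = eh.e_phentsize; phnum = eh.e_phnum;
        is64 = 0;
    } else {
        return NULL;
    }

    /* Walk the program headers looking for PT_INTERP. */
    for (i = 0; i < phnum; i++) {
        size_t at = phoff + i * phentsize;
        size_t off, len;
        unsigned long type;

        if (is64) {
            Elf64_Phdr ph;
            if (at > size || size - at < sizeof ph)
                return NULL;
            memcpy(&ph, img + at, sizeof ph);
            type = ph.p_type; off = ph.p_offset; len = ph.p_filesz;
        } else {
            Elf32_Phdr ph;
            if (at > size || size - at < sizeof ph)
                return NULL;
            memcpy(&ph, img + at, sizeof ph);
            type = ph.p_type; off = ph.p_offset; len = ph.p_filesz;
        }

        if (type != PT_INTERP)
            continue;

        /* The path must lie inside the image and be NUL-terminated. */
        if (len == 0 || off > size || size - off < len ||
            img[off + len - 1] != '\0')
            return NULL;
        return (const char *)(img + off);
    }
    return NULL;                              /* no PT_INTERP: static */
}
```

A caller would read the whole file into a buffer and pass the buffer and its length; a NULL result on a well-formed file corresponds to the "no interpreter" case described above.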

Sunday, July 29, 2007

ELF

I have worked on ELF, so I am writing down some basic details about it.
Here I will use gcc and elfdump on Open-Solaris for the explanation.
You can do everything I am doing here on Linux using the GNU toolchain and objdump.

What is ELF?
ELF stands for Executable and Linking Format. It is the executable format used nowadays on UNIX-variant platforms such as Linux and Solaris.
In other words, it defines the format of executable binaries used on Linux, Solaris etc., and also of relocatable, shared object and core dump files.

ELF is used by static linkers, dynamic linkers and loaders. For more information on static linkers, dynamic linkers and loaders, please read my following blog:

http://bhushanverma.blogspot.com/search/label/ELF%2FLinker%2FLoader


Executable and Linkable Format is:

Standard library format for object files
Derived from AT&T System V Unix
Later adopted by BSD Unix variants and Linux
Generic name: ELF binaries
Better support for shared libraries than old a.out formats.

ELF describes three types of Object File:

• relocatable file(.o)
• executable file
• shared object file(.so)

--------------

Now let's move from theory to practice.
How we can look inside an ELF file:

Step 1: Create a C source file:
$ cat test.c

int
main()
{
    return 0;
}

Step 2: Build the executable:
$ gcc -o main test.c

Step 3: Now run the following command:
$ elfdump -e main
The following will be displayed:
ELF Header
ei_magic: { 0x7f, E, L, F }
ei_class: ELFCLASS32 ei_data: ELFDATA2LSB
e_machine: EM_ARM e_version: EV_CURRENT
e_type: ET_EXEC
e_flags: 514
e_entry: 0x8360 e_ehsize: 52 e_shstrndx: 14
e_shoff: 0x544 e_shentsize: 40 e_shnum: 17
e_phoff: 0x34 e_phentsize: 32 e_phnum: 5


Let's go through these header elements one by one.

ei_magic: This is the ELF magic number; it tells whether the file is an ELF file or not.
ei_class: This gives the file class, i.e. whether it is a 32-bit or a 64-bit file.
ELFCLASS32: 32-bit file
ELFCLASS64: 64-bit file
ei_data: This gives the endianness, i.e. whether the file is meant for a little-endian or a big-endian machine.
ELFDATA2LSB means little-endian
ELFDATA2MSB means big-endian
e_machine: This gives the architecture the file was built for, i.e. the machine on which this executable will run.
e_version: This member identifies the object file version.
e_type: This gives the type of the file: executable, relocatable object, shared object or core.
Executable -- e_type: ET_EXEC
Shared object (.so) -- e_type: ET_DYN
Relocatable object file (.o) -- e_type: ET_REL
Core file -- e_type: ET_CORE

e_entry: This is the entry point, i.e. the _start address set by the linker. It gives the virtual address to which the system first transfers control to start the process.
e_shoff: This holds the section header table's file offset in bytes. If the file has no section header table, this holds zero.
e_ehsize: This holds the ELF header's size in bytes.
e_shstrndx: This holds the section header table index of the entry associated with the
section name string table. If the file has no section name string table, this member
holds the value SHN_UNDEF.
e_shnum: This holds the number of entries in the section header table.
e_phoff: This holds the program header table's file offset in bytes.
e_phentsize: This holds the size in bytes of one entry in the file's program header table;
all entries are the same size.
e_phnum: This holds the number of entries in the program header table.
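The identification fields described above can also be decoded by hand. The following is a minimal sketch, assuming <elf.h> is available (as on Linux); the helper names are hypothetical and only map the raw values to the names that elfdump prints.

```c
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helpers, for illustration only. */

/* ei_magic: does the buffer start with 0x7f 'E' 'L' 'F'? */
int elf_has_magic(const unsigned char *buf, size_t n)
{
    return n >= SELFMAG && memcmp(buf, ELFMAG, SELFMAG) == 0;
}

/* ei_class: 32-bit or 64-bit file. */
const char *elf_class_name(unsigned char ei_class)
{
    switch (ei_class) {
    case ELFCLASS32: return "ELFCLASS32";
    case ELFCLASS64: return "ELFCLASS64";
    default:         return "unknown";
    }
}

/* ei_data: little-endian or big-endian encoding. */
const char *elf_data_name(unsigned char ei_data)
{
    switch (ei_data) {
    case ELFDATA2LSB: return "ELFDATA2LSB";
    case ELFDATA2MSB: return "ELFDATA2MSB";
    default:          return "unknown";
    }
}

/* e_type: relocatable, executable, shared object or core. */
const char *elf_type_name(unsigned int e_type)
{
    switch (e_type) {
    case ET_REL:  return "ET_REL";
    case ET_EXEC: return "ET_EXEC";
    case ET_DYN:  return "ET_DYN";
    case ET_CORE: return "ET_CORE";
    default:      return "unknown";
    }
}
```

The first EI_NIDENT (16) bytes of a file are the e_ident array, so buf[EI_CLASS] and buf[EI_DATA] can be passed to these helpers directly once the magic number has been checked.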

The above is only one example of reading an ELF file. There is plenty more information inside an ELF file.
There are various tools for working with ELF.
If you want to explore ELF files, use the following tools:
GNU Binutils:
* ar - A utility for creating, modifying and extracting from archives.
* nm - Lists symbols from object files.
* objcopy - Copies and translates object files.
* objdump - Displays information from object files.
* ranlib - Generates an index to the contents of an archive.
* readelf - Displays information from any ELF format object file.
* size - Lists the section sizes of an object or archive file.
* strings - Lists printable strings from files.
* strip - Discards symbols.

Open-Solaris Tools:
* elfdump - Displays information from any ELF format object file.
* nm - Lists symbols from object files.


Views of ELF:
There are two views: the Linking View and the Execution View.
When doing low-level programming, it helps to know that ELF files consist of several sections.
For example,
elfdump -c main shows
----------------------------------------------------------
Section Header[1]: sh_name: .interp
sh_addr: 0x80d4 sh_flags: [ SHF_ALLOC ]
sh_size: 0x11 sh_type: [ SHT_PROGBITS ]
sh_offset: 0xd4 sh_entsize: 0
sh_link: 0 sh_info: 0
sh_addralign: 0x1

Section Header[2]: sh_name: .hash
sh_addr: 0x80e8 sh_flags: [ SHF_ALLOC ]
sh_size: 0x54 sh_type: [ SHT_HASH ]
sh_offset: 0xe8 sh_entsize: 0x4
sh_link: 3 sh_info: 0
sh_addralign: 0x4


....
Section Header[16]: sh_name: .strtab
sh_addr: 0 sh_flags: 0
sh_size: 0xe0 sh_type: [ SHT_STRTAB ]
sh_offset: 0xb0c sh_entsize: 0
sh_link: 0 sh_info: 0
sh_addralign: 0x1
----------------------------------------------------------------

The ELF loader in the kernel does not deal with ELF sections at all; instead it looks at the so-called "program headers".

elfdump -p main shows the program headers as
--------------------------------------------------------------
Program Header[0]:
p_vaddr: 0x8034 p_flags: [ PF_X PF_R ]
p_paddr: 0x8034 p_type: [ PT_PHDR ]
p_filesz: 0xa0 p_memsz: 0xa0
p_offset: 0x34 p_align: 0x4

Program Header[1]:
p_vaddr: 0x80d4 p_flags: [ PF_R ]
p_paddr: 0x80d4 p_type: [ PT_INTERP ]
p_filesz: 0x11 p_memsz: 0x11
p_offset: 0xd4 p_align: 0x1

Program Header[2]:
p_vaddr: 0x8000 p_flags: [ PF_X PF_R ]
p_paddr: 0x8000 p_type: [ PT_LOAD ]
p_filesz: 0x3d0 p_memsz: 0x3d0
p_offset: 0 p_align: 0x8000

Program Header[3]:
p_vaddr: 0x103e0 p_flags: [ PF_W PF_R ]
p_paddr: 0x103e0 p_type: [ PT_LOAD ]
p_filesz: 0xc0 p_memsz: 0xc0
p_offset: 0x3e0 p_align: 0x8000

Program Header[4]:
p_vaddr: 0x103e0 p_flags: [ PF_W PF_R ]
p_paddr: 0x103e0 p_type: [ PT_DYNAMIC ]
p_filesz: 0x98 p_memsz: 0x98
p_offset: 0x3e0 p_align: 0x4


ELF files have two views:
1. Section view (Linking View), defining what goes where in the file.
2. Program view (Execution View), defining the exact mapping of ELF data into the process address space.
A program header table tells the system how to create a process image.
Files used to build a process image (execute a program) must have a program header table;
relocatable files do not need one.
A section header table contains information describing the file's sections.
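To make the execution view a little more concrete, here is a minimal sketch of how two program header fields can be interpreted. The helper names are hypothetical, <elf.h> is assumed to be available, and the 32-bit Elf32_Phdr is used only because the example file above is ELFCLASS32.

```c
#include <elf.h>

/* Hypothetical helper: render p_flags as a permission string such as
 * "r-x", from the PF_R, PF_W and PF_X bits shown by elfdump -p. */
void pflags_str(unsigned int flags, char out[4])
{
    out[0] = (flags & PF_R) ? 'r' : '-';
    out[1] = (flags & PF_W) ? 'w' : '-';
    out[2] = (flags & PF_X) ? 'x' : '-';
    out[3] = '\0';
}

/* Hypothetical helper: for a PT_LOAD segment, p_filesz bytes come from the
 * file and the remaining p_memsz - p_filesz bytes are zero-filled in
 * memory. Returns that zero-filled size, or 0 for other segment types. */
unsigned long zero_fill_size(const Elf32_Phdr *ph)
{
    if (ph->p_type != PT_LOAD || ph->p_memsz <= ph->p_filesz)
        return 0;
    return (unsigned long)(ph->p_memsz - ph->p_filesz);
}
```

For Program Header[2] in the listing above, the flags [ PF_X PF_R ] come out as "r-x", i.e. a read-only, executable mapping; Program Header[3] with [ PF_W PF_R ] comes out as "rw-". Both have p_filesz equal to p_memsz, so nothing is zero-filled.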






Figure: ELF views

Have a nice day! Keep enjoying.

Bhushan