ref: 25e676d87fef3efa5cd4101ce7f955f477564cfc
parent: 08f9c9fbdba14ce22efed411a31d6d536ea1aa2a
author: Roberto E. Vargas Caballero <k0ga@shike2.com>
date: Fri Feb 2 15:55:20 EST 2018
[doc] Add documentation about myro
--- /dev/null
+++ b/doc/myro.txt
@@ -1,0 +1,179 @@
+Object File Format
+------------------
+
+The object file format is designed to be the simplest format that covers
+all the needs of many modern programming languages, with sufficient support
+for hand written assembly. All the types are little endian.
+
+File Format
+-----------
+
+ +== Header ======+
+ | signature | 32 bit
+ +----------------+
+ | format str | 32 bit
+ | |
+ +----------------+
+ | entrypoint | 64 bit
+ | |
+ +----------------+
+ | stringtab size | 64 bit
+ | |
+ +----------------+
+ | section size | 64 bit
+ | |
+ +----------------+
+ | symtab size | 64 bit
+ | |
+ +----------------+
+ | reloctab size | 64 bit
+ | |
+ +== Metadata ====+
+ | strings... |
+ | .... |
+ +----------------+
+ | sections... |
+ | ... |
+ |----------------+
+ | symbols.... |
+ | ... |
+ +----------------+
+ | relocations... |
+ | ... |
+ +== Data ========+
+ | data... |
+ | ... |
+ +================+
+
+The file is composed of three components: The header, the metadata, and
+the data. The header begins with a signature, containing the four bytes
+"uobj", identifying this file as a unified object. It is followed by
+a string offste with a format description (it may be used to indicate
+file format version, architecture, abi, ...) .This is followed by the
+size of the string table, the size of the section table, the size of
+the symbol table, and the size of the relocation table.
+
+Metadata: Strings
+-----------------
+
+The string table directly follows the header. It contains an array of strings.
+Each string is a sequence of bytes terminated by a zero byte. A string may
+contain any characters other than the zero byte. Any reference to a string
+is done using an offset of 32 bits into the string table. If it is needed
+to indicate a "no string" then the value 0FFFFFFFFH may be used.
+
+Metadata: Sections
+------------------
+
+The section table follows the string table. The section table defines where
+data in a program goes.
+
+ +== Sect ========+
+ | str | 32 bit
+ +----------------+
+ | flags | 16 bit
+ +----------------+
+ | fill value | 8 bit
+ +----------------+
+ | aligment | 8 bit
+ +----------------+
+ | offset | 64 bit
+ +----------------+
+ | len | 64 bit
+ +----------------+
+
+All the files must defined at least 5 sections, numbered 1 through 5,
+which are implcitly included in every binary:
+
+ .text SprotRread | SprotWrite | Sload | Sfile | SprotExec
+ .data SprotRread | SprotWrite | Sload | Sfile
+ .bss SprotRread | SprotWrite | Sload
+ .rodata SprotRread | Sload | Sfile
+ .blob Sblod | Sfile
+
+A program may have at most 65,535 sections. Sections have the followign flags;
+
+ SprotRead = 1 << 0
+ SprotWrite = 1 << 1
+ SprotExec = 1 << 2
+ Sload = 1 << 3
+ Sfile = 1 << 4
+ Sabsolute = 1 << 5
+ Sblob = 1 << 6
+
+Blob section. This is not loaded into the program memory. It may be used
+for debug info, tagging the binary, and other similar uses.
+
+Metadata: Symbols
+-----------------
+
+The symbol table follows the string table. The symbol table contains an array
+of symbol defs. Each symbol has the following structure:
+
+
+ +== Sym =========+
+ | str name | 32 bit
+ +----------------+
+ | str type | 32 bit
+ +----------------+
+ | section id | 8 bit
+ +----------------+
+ | flags | 8 bit
+ +----------------+
+ | offset | 64 bit
+ | |
+ +----------------+
+ | len | 64 bit
+ | |
+ +----------------+
+
+A symbol is 24 bytes in size.
+
+The string is an offset into the string table, pointing to the start
+of the string. The kind describes where in the output the data goes
+and what its role is. The offset describes where, relative to the start
+of the data, the symbol begins. The length describes how many bytes it is.
+
+Currently, there's one flag supported:
+
+ 1 << 1: Deduplicate the symbol.
+ 1 << 2: Common storage for the symbol.
+ 1 << 3: external symbol
+ 1 << 4: undefined symbol
+
+Metadata: Relocations
+----------------------
+
+The relocations follow the symbol table. Each relocation has the
+following structure:
+
+
+ +== Reloc =======+
+ | 0 | symbol id | 32 bit
+ | 1 | section id |
+ +----------------+
+ | flags | 8 bit
+ +----------------+
+ | rel size | 8 bit
+ +----------------+
+ | mask size | 8 bit
+ +----------------+
+ | mask shift | 8 bit
+ +----------------+
+ | offset | 64 bit
+ | |
+ +----------------+
+
+Relocations write the appropriate value into the offset requested.
+The offset is relative to the base of the section where the symbol
+is defined.
+
+The flags may be:
+
+ Rabsolute = 1 << 0
+ Roverflow = 1 << 1
+
+Data
+----
+
+It's just data. What do you want?