Introduction To Translators, Compilers, Assemblers, Interpreters
Case study: Translators, Compilers, Assemblers, Interpreters
Translators, compilers, interpreters and assemblers are all software programming tools that convert code into another type of code, but each term has a specific meaning. All of them work in some way towards getting a program written in a programming language translated into machine code that the central processing unit (CPU) can understand. Examples of CPUs include those made by Intel (e.g., x86), AMD (e.g., Athlon APU), NXP (e.g., PowerPC), and many others. It’s important to note that all translators, compilers, interpreters and assemblers are programs themselves.
What are translators?
Translators are software programs that translate programs written in a source language into a target language.
Types of translators:
Assembler: This is a translator for the assembly language of a computer. It converts assembly-language mnemonics into the corresponding machine code.
An assembly language is a low-level programming language, one step above raw machine code. It is specific to a particular computer architecture and hence it is machine dependent.
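As a minimal sketch of what an assembler does, the following Python snippet translates a made-up assembly language into numeric machine words. The mnemonics, the opcode values and the `assemble` function are hypothetical and do not correspond to any real instruction set.

```python
# Hypothetical opcode table for an invented 8-bit machine (not a real CPU).
OPCODES = {"LOAD": 0x01, "ADD": 0x02, "STORE": 0x03, "HALT": 0xFF}

def assemble(source: str) -> list[int]:
    """Translate each 'MNEMONIC [operand]' line into opcode and operand values."""
    machine_code = []
    for line in source.splitlines():
        line = line.split(";")[0].strip()   # drop comments and whitespace
        if not line:
            continue
        parts = line.split()
        mnemonic = parts[0].upper()
        if mnemonic not in OPCODES:
            raise ValueError(f"unknown mnemonic: {mnemonic}")
        machine_code.append(OPCODES[mnemonic])
        if len(parts) > 1:
            machine_code.append(int(parts[1], 0))  # accepts 10, 0x0A, ...
    return machine_code

program = """
LOAD 10     ; load the value at address 10
ADD 11      ; add the value at address 11
STORE 12    ; store the result at address 12
HALT
"""
print(assemble(program))  # prints [1, 10, 2, 11, 3, 12, 255]
```

Because the opcode table is tied to one machine, the same source would need a different assembler for a different processor, which is what "machine dependent" means here.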
Compilers: These are system programs that translate an input program from a high-level language into its low-level or machine-code equivalent, check it for errors, optimize the code and prepare it for execution. Compilers translate programs written in high-level languages such as C++, C#, Java and Kotlin into machine code for the target computer architecture (for languages such as Java, C# and Kotlin, the usual target is the bytecode of a virtual machine rather than a physical CPU). The generated code can later be executed many times, each time against different data.
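The "translate once, execute many times" idea can be sketched with Python's built-in `compile` function. Note the assumption: `compile` produces bytecode for the Python virtual machine rather than native machine code, so this only illustrates the principle, not a native compiler.

```python
source = "result = (x + y) * 2"

# Translate the source text once into a code object (Python bytecode).
code_object = compile(source, "<example>", "exec")

# Execute the already-translated code several times against different data.
results = []
for x, y in [(1, 2), (3, 4), (10, 20)]:
    namespace = {"x": x, "y": y}
    exec(code_object, namespace)
    results.append(namespace["result"])

print(results)  # prints [6, 14, 60]
```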
Another system program, the linker (also known as the linkage editor), works together with the loader to perform some very low-level processing of the object code in order to convert it into a ready-to-run program for the target machine.
Interpreter: It reads a source program written in a high-level programming language, together with the data for this program, and runs the program against the data to produce results.
Attributes that differentiate an interpreter from a compiler:
- It is a system program that looks at and executes programs on a line-by-line basis rather than producing object code.
- Whenever a program has to be executed repeatedly, the source code has to be interpreted every time, in contrast to a compiled program, whose object code is created once and then executed on the target machine.
- The entire source code needs to be present in memory during execution, so more memory is required than for compiled code.
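The attributes above can be illustrated with a toy interpreter. The three-command language (`set`, `add`, `print`) and the `interpret` function are invented for this sketch; the point is only that each line is examined and executed in turn, no object code is produced, and the source must be re-read on every run.

```python
def interpret(source: str, data: dict[str, int]) -> list[int]:
    """Execute a toy language one line at a time; no object code is produced."""
    variables = dict(data)      # the program's input data
    output = []
    for line_number, line in enumerate(source.splitlines(), start=1):
        parts = line.split()
        if not parts:
            continue
        command, args = parts[0], parts[1:]
        if command == "set":        # set NAME VALUE
            variables[args[0]] = int(args[1])
        elif command == "add":      # add NAME OTHER  (NAME += OTHER)
            variables[args[0]] += variables[args[1]]
        elif command == "print":    # print NAME
            output.append(variables[args[0]])
        else:
            raise SyntaxError(f"line {line_number}: unknown command {command!r}")
    return output

program = """
set total 0
add total a
add total b
print total
"""
# The same source text is interpreted again for every run.
print(interpret(program, {"a": 2, "b": 3}))     # prints [5]
print(interpret(program, {"a": 10, "b": 20}))   # prints [30]
```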
Loader: This is a system program that loads the binary code into memory, ready for execution. The loader is responsible for placing the program in main memory every time it is executed.
There are various loading schemes:
- Assemble-and-go loader: The assembler simply places the code into memory and the loader executes a single instruction that transfers control to the starting instruction of the assembled program. In this scheme, some portion of the memory is used by the assembler itself, which would otherwise have been available for the object program.
- Absolute loader: The object code must be loaded at the absolute memory addresses specified in it in order to run. If there are multiple subroutines, the absolute address of each must be specified explicitly.
- Relocating loader: This loader modifies the actual instructions of the program during the process of loading the program so that the effect of the load address is taken into account.
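The relocating scheme can be sketched as follows. The object-code layout, the relocation list and the `load_relocatable` function are assumptions made for illustration; real object formats carry much richer relocation records.

```python
def load_relocatable(code: list[int], relocation_indices: list[int],
                     memory: list[int], load_address: int) -> None:
    """Copy code into memory, adjusting address operands by the load address."""
    for offset, word in enumerate(code):
        if offset in relocation_indices:
            word += load_address    # patch an address that assumed origin 0
        memory[load_address + offset] = word

# Invented object code assembled as if it started at address 0.
# The word at index 1 is an address (it points at the data word at offset 3);
# the other words are opcodes or plain values and must not be changed.
object_code = [0x01, 3, 0xFF, 42]
relocations = [1]

memory = [0] * 16
load_relocatable(object_code, relocations, memory, load_address=8)
print(memory[8:12])  # prints [1, 11, 255, 42]
```

An absolute loader would be the same loop without the patching step, which is why its code only works at one fixed address.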
Linkers: Linking is the process of combining various pieces of code or data together to form a single executable that can be loaded into memory. Linking is of two types:
- Static Linking: All references are resolved at link time, before the program is loaded.
- Dynamic Linking: References to code in external modules are resolved at run time.
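A rough sketch of static linking, under the assumption that each object module is a dictionary holding its code and the symbols it defines (a made-up format, not a real object file layout): the linker concatenates the modules and replaces every symbolic reference with a final address.

```python
def link(modules: list[dict]) -> list[int]:
    """Concatenate modules and replace symbol references with final addresses."""
    symbol_addresses = {}
    base = 0
    # Step 1: give each module a base address and record the symbols it defines.
    for module in modules:
        for name, offset in module["defines"].items():
            symbol_addresses[name] = base + offset
        base += len(module["code"])
    # Step 2: copy the code, resolving every symbolic reference.
    executable = []
    for module in modules:
        for word in module["code"]:
            if isinstance(word, str):           # a reference such as "helper"
                if word not in symbol_addresses:
                    raise NameError(f"undefined symbol: {word}")
                word = symbol_addresses[word]
            executable.append(word)
    return executable

# Hypothetical modules: integers are opcodes, strings are unresolved references.
main_module = {"code": [0x10, "helper", 0xFF], "defines": {"main": 0}}
library_module = {"code": [0x20, 0x30], "defines": {"helper": 0}}

print(link([main_module, library_module]))  # prints [16, 3, 255, 32, 48]
```

With dynamic linking, the lookup in step 2 would instead be deferred until the program is running.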
Lexical analyzer: It breaks the source code text into smaller pieces (tokens), each representing a single atomic unit of the language, for instance a keyword, symbol, name or identifier. This phase is called lexical scanning.
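A minimal lexical scanner can be sketched with Python's `re` module. The token categories and the keyword set below are assumptions chosen for the example, not those of any particular language.

```python
import re

# Assumed token categories for a small expression language.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"[+\-*/=()]"),
    ("SKIP", r"\s+"),
    ("MISMATCH", r"."),
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))
KEYWORDS = {"if", "else", "while"}

def tokenize(text: str) -> list[tuple[str, str]]:
    """Split source text into (kind, lexeme) pairs."""
    tokens = []
    for match in TOKEN_RE.finditer(text):
        kind, lexeme = match.lastgroup, match.group()
        if kind == "SKIP":
            continue                      # whitespace carries no meaning here
        if kind == "MISMATCH":
            raise SyntaxError(f"unexpected character {lexeme!r}")
        if kind == "IDENT" and lexeme in KEYWORDS:
            kind = "KEYWORD"
        tokens.append((kind, lexeme))
    return tokens

print(tokenize("total = price * 2 + tax"))
```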
Syntax analyzer: It identifies the syntactic structure of the source code. This phase is called parsing.
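Parsing can be sketched with a small recursive-descent parser for arithmetic expressions. The grammar, the nested-tuple tree shape `(operator, left, right)` and the `parse` function are assumptions for this example; production compilers use far more elaborate grammars and tree types.

```python
import re

def parse(text: str):
    """Parse an arithmetic expression into a nested-tuple syntax tree.

    Assumed grammar:
        expr   := term (('+' | '-') term)*
        term   := factor (('*' | '/') factor)*
        factor := NUMBER | IDENT | '(' expr ')'
    """
    # Simplification: characters that match no token are silently skipped.
    tokens = re.findall(r"\d+|[A-Za-z_]\w*|[+\-*/()]", text)
    position = 0

    def peek():
        return tokens[position] if position < len(tokens) else None

    def advance():
        nonlocal position
        token = peek()
        position += 1
        return token

    def factor():
        token = advance()
        if token == "(":
            node = expr()
            if advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        if token is None or token in "+-*/)":
            raise SyntaxError(f"unexpected token {token!r}")
        return token

    def term():
        node = factor()
        while peek() in ("*", "/"):
            node = (advance(), node, factor())
        return node

    def expr():
        node = term()
        while peek() in ("+", "-"):
            node = (advance(), node, term())
        return node

    tree = expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree

print(parse("a + b * (c - 2)"))
# prints ('+', 'a', ('*', 'b', ('-', 'c', '2')))
```

The tree shows that `*` binds tighter than `+`, a structural fact that is not visible in the flat token sequence.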
Semantic analyzer: It recognizes the meaning of the program code and starts to prepare for output. Type checking is done in this phase, and most compiler errors show up here.
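Type checking can be sketched as a walk over an expression tree. The tree shape `(operator, left, right)`, the rule that both operands must have the same type, and the `check_types` function are all assumptions for this example; real languages have richer type rules such as implicit conversions.

```python
def check_types(node, symbol_types: dict[str, str]) -> str:
    """Return the type of an expression tree, or raise on a semantic error."""
    if isinstance(node, tuple):
        operator, left, right = node
        left_type = check_types(left, symbol_types)
        right_type = check_types(right, symbol_types)
        if left_type != right_type:
            raise TypeError(
                f"cannot apply {operator!r} to {left_type} and {right_type}")
        return left_type
    if isinstance(node, int):
        return "int"                       # integer literal
    if isinstance(node, float):
        return "float"                     # floating-point literal
    if node not in symbol_types:
        raise NameError(f"undeclared identifier: {node}")
    return symbol_types[node]              # declared type of a variable

tree = ("+", "count", ("*", 2, "price"))
print(check_types(tree, {"count": "int", "price": "int"}))  # prints int
```

The same tree is syntactically valid whatever the declarations are; only this phase notices that declaring `price` as `float` makes `2 * price` ill-typed under the assumed rule.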
Code optimization: The intermediate language representation is transformed into functionally equivalent but faster forms.
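One classic example of such a transformation is constant folding, sketched below on an assumed tuple-based intermediate representation `(operator, left, right)`. The `fold_constants` function is illustrative only; real optimizers apply many such transformations.

```python
import operator

# Operators that are safe to evaluate at compile time in this sketch.
FOLDABLE = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def fold_constants(node):
    """Replace sub-expressions made only of integer constants by their value."""
    if not isinstance(node, tuple):
        return node                         # a constant or a variable name
    op, left, right = node
    left, right = fold_constants(left), fold_constants(right)
    if isinstance(left, int) and isinstance(right, int) and op in FOLDABLE:
        return FOLDABLE[op](left, right)    # computed once, at compile time
    return (op, left, right)

tree = ("+", "x", ("*", 2, ("+", 3, 4)))
print(fold_constants(tree))  # prints ('+', 'x', 14)
```

The folded tree computes the same result for every value of `x` but needs one operation at run time instead of three.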
Code generation: The transformed intermediate language is translated into the output language, usually the native machine language of the target system.
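As a sketch of code generation, the function below turns an assumed tuple-based expression tree into instructions for an invented stack machine. The instruction names (`PUSH`, `LOAD`, `ADD`, `SUB`, `MUL`) are hypothetical; a real code generator would emit instructions of the actual target.

```python
# Hypothetical mapping from source operators to stack-machine instructions.
OPCODE_FOR = {"+": "ADD", "-": "SUB", "*": "MUL"}

def generate(node) -> list[tuple]:
    """Return stack-machine instructions via a post-order walk of the tree."""
    if isinstance(node, tuple):
        op, left, right = node
        # Operands first, then the operator that consumes them.
        return generate(left) + generate(right) + [(OPCODE_FOR[op],)]
    if isinstance(node, int):
        return [("PUSH", node)]             # integer constant
    return [("LOAD", node)]                 # variable name

for instruction in generate(("+", "x", ("*", 2, "y"))):
    print(instruction)
```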
Symbol table manager: The symbol table is a detailed system of lists. The compiler needs a symbol table to record each identifier and collect information about it. The symbol table manager has a FIND function that returns a pointer to the descriptor for an identifier when given its lexeme. Compiler phases use this pointer to read and modify information about the identifier. The FIND function returns a null pointer if there is no record for the given lexeme.
The INSERT function in the symbol table manager inserts a new record into the symbol table when given its lexeme.
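The FIND and INSERT functions described above can be sketched with a dictionary-backed table. The `SymbolTable` class and the descriptor fields are assumptions for illustration; Python's `None` stands in for the null pointer, and a returned dictionary stands in for the pointer to the descriptor.

```python
class SymbolTable:
    """Minimal symbol table keyed by lexeme."""

    def __init__(self):
        self._records = {}

    def find(self, lexeme: str):
        """Return the descriptor for lexeme, or None if there is no record."""
        return self._records.get(lexeme)

    def insert(self, lexeme: str) -> dict:
        """Create a new record for lexeme and return its descriptor."""
        descriptor = {"lexeme": lexeme, "type": None}
        self._records[lexeme] = descriptor
        return descriptor

table = SymbolTable()
print(table.find("count"))              # prints None

table.insert("count")
table.find("count")["type"] = "int"     # a later phase fills in the type
print(table.find("count"))              # prints {'lexeme': 'count', 'type': 'int'}
```

Because `find` returns the descriptor itself rather than a copy, a change made by one compiler phase is visible to every later phase.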
Error handlers: Detection and reporting of errors in source programs is one of the main functions of a compiler.
Analysis phase: In this part, the source program is broken into its constituent pieces and an intermediate representation is created. Analysis is done in 3 phases:
- Lexical analysis
- Syntax analysis
- Semantic analysis
Synthesis: This constructs the desired target program from the intermediate representation. Synthesis is done in 3 phases:
- Intermediate code generation
- Code optimization
- Code generation
Front end: There may be several back ends for different target machines, all of which share the same parser and intermediate-code generator, called the front end.
Back end: The translation from intermediate code to binary is usually done by a separate part of the compiler called the back end.
Pre-processor: A pre-processor is a program that produces input for the compiler. A source program may be divided into modules stored in separate files. The task of collecting the source program is sometimes entrusted to a distinct program called the pre-processor. The pre-processor may also expand macros into source language statements.
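Macro expansion can be sketched as plain text substitution that runs before the compiler sees the source. The `#define NAME value` directive below borrows C's syntax, but the `preprocess` function is a simplified assumption and does not reproduce the behaviour of the real C pre-processor (no function-like macros, no file inclusion).

```python
import re

def preprocess(source: str) -> str:
    """Handle '#define NAME value' lines and expand NAME in later lines."""
    macros = {}
    output_lines = []
    for line in source.splitlines():
        if line.startswith("#define "):
            # Simplification: every macro must have both a name and a value.
            _, name, value = line.split(maxsplit=2)
            macros[name] = value
            continue                        # directives are not passed on
        for name, value in macros.items():
            # \b keeps SIZE from matching inside a longer name like RESIZE.
            line = re.sub(rf"\b{re.escape(name)}\b", lambda match: value, line)
        output_lines.append(line)
    return "\n".join(output_lines)

source = "#define SIZE 10\nint buffer[SIZE];\nint RESIZE;"
print(preprocess(source))
```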
The target architecture of a compiler:
- Real machine: A real machine is a piece of hardware equipment like a microprocessor, capable of executing code. Real machines have the advantage of working very fast, because the code is executed directly by the hardware of the microprocessor.
- Virtual machine: A virtual machine is a software program that acts like a real machine.
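A virtual machine can be sketched as a loop that fetches and executes instructions in software. The stack-based instruction set below (`PUSH`, `LOAD`, `ADD`, `SUB`, `MUL`) and the `run` function are invented for this example and are far simpler than a real virtual machine such as the JVM.

```python
def run(instructions: list[tuple], variables: dict[str, int]) -> int:
    """Execute instructions for a made-up stack machine and return the result."""
    stack = []
    for instruction in instructions:
        opcode = instruction[0]
        if opcode == "PUSH":                # push a constant
            stack.append(instruction[1])
        elif opcode == "LOAD":              # push the value of a variable
            stack.append(variables[instruction[1]])
        elif opcode in ("ADD", "SUB", "MUL"):
            right = stack.pop()
            left = stack.pop()
            if opcode == "ADD":
                stack.append(left + right)
            elif opcode == "SUB":
                stack.append(left - right)
            else:
                stack.append(left * right)
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return stack.pop()

# Hypothetical target code for the expression x + 2 * y.
program = [("LOAD", "x"), ("PUSH", 2), ("LOAD", "y"), ("MUL",), ("ADD",)]
print(run(program, {"x": 1, "y": 5}))  # prints 11
```

Each virtual instruction costs several real instructions in this loop, which is the speed disadvantage compared with code that runs directly on a real machine.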
Multi-pass compiler: This is a compiler that scans the input source and produces a first modified form, then scans the modified form and produces a second modified form, and so on. It makes several passes over the program. The output of the preceding phase is stored in a data structure and used by subsequent phases. It passes through the source code or the abstract syntax tree several times.
One-pass compiler: This is a compiler that passes through the source code of each compilation unit only once.
Differences between one-pass and multi-pass compilers:
|One-pass compiler|Multi-pass compiler|
|---|---|
|Passes through the source code of each compilation unit only once|Processes the source code of a program several times|
|Faster than a multi-pass compiler|Slower than a one-pass compiler|
|Has a limited scope of passes|Has a wide scope of passes|
|They are called narrow compilers|They are called wide compilers|
|Many programming languages cannot be supported by one-pass compilers|Many programming languages, e.g. Java, took off with the concept of multi-pass compilers|