version 0.6.0
Lexer / Parser

Topics

 Source file
 Source file manager.
 
 Lexer
 Lexer.
 
 Parser
 Parser.
 

Detailed Description

This group defines a multi-purpose lexer/parser.

The lexer/parser is organized as follows:

                               keyword list
                                     v
       ┌──────────┐            ┌───────────┐          ┌────────────┐
files  │          │ characters │           │  tokens  │            │ ───────>
─────> │   File   │ ──────────>│   Lexer   │ ───────> │   Parser   │  tokens / actions
       │          │            │           │          │            │ <───────
       └──────────┘            └───────────┘          └────────────┘
      mod_source_file            mod_lexer             mod_parser
                                                       mod_identifier
                                                       mod_scope

The lexer/parser is split into 3 units:

  1. File unit:
    Reads the whole content of a file into a character array. The rank 0 processor reads the file and sends its content to all processors. This unit manages a flow of characters. Each character is identified by its global position in the file and by its row/column coordinates. A row ends after a '\n' (char(10)) character.
  2. Lexer unit:
    Matches a single character or a sequence of characters against a given pattern (a token). For instance, ';' is identified as tk_semicolon. The list of token types is defined in the mod_lexer module. The lexer can manage multiple source files simultaneously.
  3. Parser unit:
    Manages a flow of tokens. The role of the parser is to transform a flow of tokens into a series of actions. The mod_parser module contains routines to organize the flow of tokens. For instance, it contains an identifier list to handle variable declarations and a scope manager.
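
The three units above can be illustrated with a minimal, single-process Python sketch. It is not the library's implementation or API: the names read_chars, lex, parse, Char, Token, tk_keyword, tk_identifier, the keyword list, and the toy "keyword identifier ;" grammar are all assumptions made for illustration. Only tk_semicolon and the row/column rule come from the description above. The global position is 0-based here, which is also an assumption.

```python
from dataclasses import dataclass


@dataclass
class Char:
    value: str
    pos: int  # global position in the file (0-based in this sketch)
    row: int  # 1-based row
    col: int  # 1-based column


@dataclass
class Token:
    kind: str
    text: str
    row: int
    col: int


def read_chars(text):
    """File unit (sketch): tag every character with its position and row/column."""
    chars, row, col = [], 1, 1
    for pos, ch in enumerate(text):
        chars.append(Char(ch, pos, row, col))
        if ch == "\n":  # a row ends after '\n'
            row, col = row + 1, 1
        else:
            col += 1
    return chars


def lex(chars, keywords):
    """Lexer unit (sketch): group characters into tokens, using a keyword list."""
    tokens, i = [], 0
    while i < len(chars):
        c = chars[i]
        if c.value.isspace():
            i += 1
        elif c.value == ";":
            tokens.append(Token("tk_semicolon", ";", c.row, c.col))
            i += 1
        elif c.value.isalpha() or c.value == "_":
            j = i
            while j < len(chars) and (chars[j].value.isalnum() or chars[j].value == "_"):
                j += 1
            word = "".join(ch.value for ch in chars[i:j])
            # Hypothetical token kinds; the real list lives in mod_lexer.
            kind = "tk_keyword" if word in keywords else "tk_identifier"
            tokens.append(Token(kind, word, c.row, c.col))
            i = j
        else:
            raise SyntaxError(f"unexpected {c.value!r} at row {c.row}, column {c.col}")
    return tokens


def parse(tokens):
    """Parser unit (sketch): turn 'keyword identifier ;' into declare actions."""
    actions, declared, i = [], {}, 0
    while i < len(tokens):
        kinds = [t.kind for t in tokens[i:i + 3]]
        if kinds != ["tk_keyword", "tk_identifier", "tk_semicolon"]:
            t = tokens[i]
            raise SyntaxError(f"bad declaration at row {t.row}, column {t.col}")
        type_tok, name_tok = tokens[i], tokens[i + 1]
        if name_tok.text in declared:  # toy identifier list
            raise SyntaxError(f"{name_tok.text!r} is already declared")
        declared[name_tok.text] = type_tok.text
        actions.append(("declare", type_tok.text, name_tok.text))
        i += 3
    return actions


if __name__ == "__main__":
    source = "integer n;\nreal x;"
    print(parse(lex(read_chars(source), {"integer", "real"})))
```

Keeping the row/column on every token lets the parser report errors at the exact source location, which is one plausible reason for the file unit to track coordinates per character.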

For a practical example of the lexer/parser in use, see the OBJ Wavefront reader.