The PPJ compiler uses three configuration files located in the config/ directory to define the language syntax and semantics. These files serve as the authoritative specification for lexical analysis, syntax analysis, and semantic analysis phases. Understanding these configuration files is essential for extending the compiler or modifying the supported language features.
The compiler uses the following configuration files:
- config/lexer_definition.txt: Defines token patterns and lexical rules
- config/parser_definition.txt: Defines context-free grammar productions
- config/semantics_definition.txt: Defines semantic rules and type system constraints
Configuration files are loaded using a hierarchical path resolution strategy:
- Environment Variable Override: Check for a custom path via an environment variable:
  - LEXER_DEFINITION_PATH: Override lexer definition path
  - PARSER_DEFINITION_PATH: Override parser definition path
  - SEMANTICS_DEFINITION_PATH: Override semantics definition path
- Project Root Detection: Automatically detect the project root by searching for:
  - pom.xml file (Maven project marker)
  - config/ directory
- Default Paths: If the project root is found, use:
  - config/lexer_definition.txt
  - config/parser_definition.txt
  - config/semantics_definition.txt
- Fallback: If the project root is not found, use the current directory
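The resolution order above can be sketched in Java as follows. This is a minimal illustration, not the project's LexerConfig or ParserConfig code: the class name ConfigPathResolver and its methods are hypothetical, and it assumes that the project root is the nearest ancestor directory containing either pom.xml or config/, and that the fallback looks for config/ under the current directory.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;

/** Hypothetical sketch of the path resolution strategy; not the project's actual code. */
final class ConfigPathResolver {

    private ConfigPathResolver() {}

    /**
     * @param env      environment variables (System.getenv() in real use)
     * @param envVar   override variable, e.g. "LEXER_DEFINITION_PATH"
     * @param fileName file name under config/, e.g. "lexer_definition.txt"
     * @param startDir directory where the project root search begins
     */
    static Path resolve(Map<String, String> env, String envVar, String fileName, Path startDir) {
        // Environment variable override has the highest priority.
        String override = env.get(envVar);
        if (override != null && !override.isBlank()) {
            return Path.of(override);
        }
        // Default path: config/<fileName> under the detected project root.
        Path root = findProjectRoot(startDir);
        if (root != null) {
            return root.resolve("config").resolve(fileName);
        }
        // Fallback (assumption): config/<fileName> relative to the starting directory.
        return startDir.resolve("config").resolve(fileName);
    }

    /** Walks upward from startDir; assumes either marker is enough to identify the root. */
    static Path findProjectRoot(Path startDir) {
        for (Path dir = startDir.toAbsolutePath().normalize(); dir != null; dir = dir.getParent()) {
            if (Files.isRegularFile(dir.resolve("pom.xml")) || Files.isDirectory(dir.resolve("config"))) {
                return dir;
            }
        }
        return null;
    }
}
```

Passing the environment as a Map rather than calling System.getenv() inside the method keeps the sketch easy to test; a caller would typically invoke it as ConfigPathResolver.resolve(System.getenv(), "LEXER_DEFINITION_PATH", "lexer_definition.txt", Path.of("")).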
Configuration loading is implemented in:
- Lexer: hr.fer.ppj.lexer.config.LexerConfig.getLexerDefinitionPath()
- Parser: hr.fer.ppj.parser.config.ParserConfig.getParserDefinitionPath()
- Semantics: Similar pattern (check semantic analyzer code)
Example:
// Load lexer definition
Path lexerDef = LexerConfig.getLexerDefinitionPath();
// Load parser definition
Path parserDef = ParserConfig.getParserDefinitionPath();
The lexer definition file (config/lexer_definition.txt) uses a custom format with four sections:
- Macro Definitions: {name} pattern
- State Declarations: %X state1 state2 ...
- Token Declarations: %L TOKEN1 TOKEN2 ...
- Lexer Rules: <state>pattern { actions }
See Also: Lexer Configuration Reference
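As a rough illustration of how the four lexer sections can be told apart, the sketch below classifies a single line by its leading characters. It assumes each entry begins on its own line with the prefix shown in the list above; the class LexerDefinitionLines and its Kind enum are hypothetical names, and the real loader may handle cases (such as multi-line actions) that this sketch ignores.

```java
/** Hypothetical line classifier for the lexer definition format; not the project's parser. */
final class LexerDefinitionLines {

    enum Kind { MACRO, STATES, TOKENS, RULE, OTHER }

    private LexerDefinitionLines() {}

    static Kind classify(String line) {
        if (line.startsWith("%X")) {
            return Kind.STATES;   // %X state1 state2 ...
        }
        if (line.startsWith("%L")) {
            return Kind.TOKENS;   // %L TOKEN1 TOKEN2 ...
        }
        if (line.startsWith("{")) {
            return Kind.MACRO;    // {name} pattern
        }
        if (line.startsWith("<")) {
            return Kind.RULE;     // <state>pattern { actions }
        }
        return Kind.OTHER;        // anything else, e.g. an action body or a blank line
    }
}
```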
The parser definition file (config/parser_definition.txt) uses BNF-like notation:
- Non-terminal Declarations: %V <nonterm1> <nonterm2> ...
- Terminal Declarations: %T TOKEN1 TOKEN2 ...
- Synchronization Tokens: %Syn TOKEN1 TOKEN2 ...
- Productions: <nonterm> ::= alternative1 | alternative2 ...
See Also: Parser Configuration Reference
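To make the production notation concrete, here is a small sketch that splits one production into its left-hand side and its alternatives. It assumes a production fits on a single line, that alternatives are separated by | and symbols by whitespace; the class ProductionSketch and the grammar symbols in the usage note are hypothetical and are not taken from parser_definition.txt.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical single-line production parser; not the project's grammar loader. */
final class ProductionSketch {

    final String nonTerminal;
    final List<List<String>> alternatives;

    private ProductionSketch(String nonTerminal, List<List<String>> alternatives) {
        this.nonTerminal = nonTerminal;
        this.alternatives = alternatives;
    }

    static ProductionSketch parse(String line) {
        int sep = line.indexOf("::=");
        if (sep < 0) {
            throw new IllegalArgumentException("Missing '::=' in production: " + line);
        }
        String lhs = line.substring(0, sep).trim();
        List<List<String>> alts = new ArrayList<>();
        // Each alternative becomes a list of whitespace-separated symbols.
        for (String alt : line.substring(sep + 3).split("\\|")) {
            String body = alt.trim();
            List<String> symbols = new ArrayList<>();
            if (!body.isEmpty()) {
                symbols.addAll(Arrays.asList(body.split("\\s+")));
            }
            alts.add(symbols);
        }
        return new ProductionSketch(lhs, alts);
    }
}
```

For example, ProductionSketch.parse("<expr> ::= <expr> PLUS <term> | <term>") yields the non-terminal <expr> with two alternatives, the first holding three symbols and the second holding one.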
The semantics definition file (config/semantics_definition.txt) defines semantic rules:
- Type compatibility rules
- Scope resolution rules
- Semantic constraint specifications
See Also: Semantics Configuration Reference
Configuration files are validated during loading:
- Lexer: Validates macro definitions, state declarations, token declarations, rule syntax
- Parser: Validates grammar syntax, non-terminal/terminal declarations, production format
- Semantics: Validates semantic rule syntax and consistency
Error Handling: Invalid configuration files cause compilation to fail with descriptive error messages.
When multiple configuration sources are available, they are applied in the following order of precedence:
- Environment Variables (highest priority)
- Project Root Configuration Files
- Current Directory Configuration Files (fallback)
To add a new token:
- Add the token name to the %L section in lexer_definition.txt
- Add a pattern rule in the lexer rules section
- Add the token to the %T section in parser_definition.txt (if used in the grammar)
To add a new grammar construct:
- Add the non-terminal to the %V section in parser_definition.txt
- Add productions for the non-terminal
- Update semantic rules in semantics_definition.txt if needed
To add a new semantic rule:
- Add the rule specification to semantics_definition.txt
- Implement rule checking in the semantic analyzer code
- Add tests for the new semantic constraints
Best practices for maintaining the configuration files:
- Version Control: Always commit configuration files to version control
- Documentation: Document non-obvious patterns or rules
- Testing: Test configuration changes with example programs
- Consistency: Ensure token names match between lexer and parser definitions
- Incremental Changes: Make small, incremental changes and test frequently
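The consistency point above (token names matching between the lexer and parser definitions) lends itself to an automated check. The sketch below is one possible approach, assuming that every terminal listed after %T in the parser definition should also appear after %L in the lexer definition, and that declarations are whitespace-separated on lines starting with the directive. The class TokenConsistencyCheck is a hypothetical helper, not part of the compiler.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical cross-check between %L (lexer) and %T (parser) declarations. */
final class TokenConsistencyCheck {

    private TokenConsistencyCheck() {}

    /** Collects the names that follow the given directive (e.g. "%L" or "%T") on any line. */
    static Set<String> declared(List<String> lines, String directive) {
        Set<String> names = new LinkedHashSet<>();
        for (String line : lines) {
            String[] parts = line.trim().split("\\s+");
            if (parts[0].equals(directive)) {
                names.addAll(Arrays.asList(parts).subList(1, parts.length));
            }
        }
        return names;
    }

    /** Returns terminals declared with %T that no %L line declares. */
    static Set<String> undeclaredTerminals(List<String> lexerLines, List<String> parserLines) {
        Set<String> missing = declared(parserLines, "%T");
        missing.removeAll(declared(lexerLines, "%L"));
        return missing;
    }
}
```

In practice the two line lists could come from Files.readAllLines on the resolved definition paths, and a non-empty result could be reported as part of a test run.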
See Also:
- Configuration File Reference: Detailed format specifications
- Configuration Examples: Example configurations and usage patterns
- Lexical Analysis Documentation: How lexer uses configuration
- Syntax Analysis Documentation: How parser uses configuration
- Semantic Analysis Documentation: How semantic analyzer uses configuration
Configuration files are the foundation of the compiler's language specification. Understanding them is essential for compiler maintenance and extension.