The Owen Language Specification 0.1.0
The specification is versioned using Semver 2.0. Parsing Expression Grammar (PEG) is used to define the syntax of Owen. All source code is encoded using UTF-8. Source files use the .owen extension.
1. Compilation Units
compilationUnit = whitespace directives declaration*
directives = namespaceDirective useDirective*
namespaceDirective = namespace qualifiedIdentifier
useDirective = use qualifiedIdentifier
qualifiedIdentifier = identifier (dot identifier)*
declaration = access? ( functionDeclaration / propositionDeclaration / structureDeclaration / unionDeclaration / enumerationDeclaration)
access = public
identifier = !keyword [A-Za-z] [A-Za-z0-9]* whitespace
keyword = namespace / use / public / function / input / output / end / if / else / for / each / in / while / break / structure / proposition / enumeration / of / size / union
namespace = 'namespace' whitespace
use = 'use' whitespace
public = 'public' whitespace
function = 'function' whitespace
input = 'input' whitespace
output = 'output' whitespace
end = 'end' whitespace
if = 'if' whitespace
else = 'else' whitespace
for = 'for' whitespace
each = 'each' whitespace
in = 'in' whitespace
while = 'while' whitespace
break = 'break' whitespace
structure = 'structure' whitespace
proposition = 'proposition' whitespace
enumeration = 'enumeration' whitespace
of = 'of' whitespace
size = 'size' whitespace
union = 'union' whitespace
dot = '.' whitespace
whitespace = (' ' / '\n' / comment)*
comment = '//' (!'\n' .)* '\n'?
The namespaceDirective specifies that all declarations in the compilationUnit are in the given namespace. The useDirective and the namespaceDirective each make all the public declarations in the given namespace available to the compilationUnit.
Nested identifiers cannot be the same as the declaration's identifier.
functionDeclaration = functionSignature statements? end
functionSignature = function identifier (input arguments)? (output type (comma type)*)?
arguments = argument (comma argument)*
argument = type identifier
Declares a function named identifier. The input clause defines the list of arguments that a caller must pass to the function. The arguments are in the same scope as the statements. Functions can be overloaded as long as their sequences of input types differ; otherwise each function identifier must be unique. The output clause lists the types of the values that the function returns, in the order they are returned.
2.1.1 The Main Function
A package can have one function called main. If declared it is the entry point of the program. The main function cannot have any input. The return type of main must be i32.
propositionDeclaration = proposition statements? end
Propositions are nameless functions that return no values. They are run before the main function if they are included using the --test command line argument.
structureDeclaration = structure identifier fields end
fields = field (comma field)*
field = access? type identifier
A structure is a sequence of fields laid out in memory in the order they are lexically declared. Padding may be inserted between fields. The size of the structure is the sum of the sizes of its fields and any padding.
unionDeclaration = union identifier fields end
A union works exactly like a structure except that all fields start at the same address. The size of the union is the size of its largest field.
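The difference between the two layouts can be observed with Python's ctypes module, which follows the layout rules of the host C compiler. This is only an analogy: Owen's own padding rules are not defined in this section, and the names FIELDS, Pair and Overlay are hypothetical.

```python
import ctypes

# Hypothetical field list: a one-byte field followed by a four-byte field.
FIELDS = [("small", ctypes.c_uint8), ("large", ctypes.c_uint32)]


class Pair(ctypes.Structure):
    """Fields are laid out one after another, possibly with padding."""
    _fields_ = FIELDS


class Overlay(ctypes.Union):
    """All fields start at the same address."""
    _fields_ = FIELDS


field_sizes = [ctypes.sizeof(field_type) for _, field_type in FIELDS]

print("sum of field sizes:", sum(field_sizes))
print("structure size:", ctypes.sizeof(Pair))
print("union size:", ctypes.sizeof(Overlay))
print("structure offsets:", Pair.small.offset, Pair.large.offset)
print("union offsets:", Overlay.small.offset, Overlay.large.offset)
```

On a platform that aligns four-byte integers, the structure is larger than the sum of its fields because of padding, while the union is exactly as large as its largest field.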
enumerationDeclaration = enumeration identifier of type enumerationConstants end
enumerationConstants = enumerationConstant (comma enumerationConstant)*
enumerationConstant = identifier (assign integerLiteral)?
The identifier is the name of the enumeration. The type must be an iXX. Each integerLiteral must have the same type as the enumeration's type. If an enumerationConstant omits the integerLiteral, then its value is the value of the previous constant plus 1. If the first constant omits the integerLiteral, then its value is 0.
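The value rule for enumeration constants can be modeled in a few lines of Python. This is a sketch of the rule as stated above, not of any Owen compiler; the function name enumeration_values and the sample constants are hypothetical.

```python
def enumeration_values(constants):
    """Assign a value to every constant.

    `constants` is a list of (name, explicit_value_or_None) pairs in
    declaration order. None means the integerLiteral was omitted.
    """
    values = {}
    previous = -1  # makes the first implicit constant evaluate to 0
    for name, explicit in constants:
        value = previous + 1 if explicit is None else explicit
        values[name] = value
        previous = value
    return values


# red omits its literal, green is given explicitly, blue omits it again.
colors = enumeration_values([("red", None), ("green", 5), ("blue", None)])
print(colors)  # {'red': 0, 'green': 5, 'blue': 6}
```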
statements = statement+
statement = assignmentStatement / ifStatement / forEachStatement / whileStatement / breakStatement / callStatement / returnStatement / assertStatement
Statements are executed in lexical order.
3.1. Assignment Statements
assignmentStatement = expressions assign expressions
assign = '=' whitespace
Assigns the expressions on the right-hand side to the expressions on the left-hand side. Both lists must be equal in length. If an expression on the right-hand side is a call that returns multiple values, they are inserted into the expression list at the point where the function was called. Each expression on the left-hand side must be assignable. If an expression on the left-hand side is an undefined identifier, it is declared as a variable of the same type as the expression being assigned to it.
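The way multiple return values are spliced into the right-hand side can be sketched in Python. The model below represents a multi-value call result as a tuple and the variable scope as a dictionary; both representations, and the names flatten_right_hand_side, assign and divide, are assumptions made for illustration.

```python
def flatten_right_hand_side(values):
    """Splice multi-value call results (tuples) into the expression list."""
    flat = []
    for value in values:
        if isinstance(value, tuple):
            flat.extend(value)  # inserted where the call appeared
        else:
            flat.append(value)
    return flat


def assign(targets, values, variables):
    """Assign each flattened value to the target in the same position."""
    flat = flatten_right_hand_side(values)
    if len(targets) != len(flat):
        raise ValueError("both sides of an assignment must have the same length")
    for name, value in zip(targets, flat):
        variables[name] = value  # declares the variable if it is undefined
    return variables


def divide(a, b):
    """Hypothetical function with two outputs: quotient and remainder."""
    return a // b, a % b


variables = assign(["x", "quotient", "remainder"], [1, divide(7, 2)], {})
print(variables)  # {'x': 1, 'quotient': 3, 'remainder': 1}
```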
3.2. If Statements
ifStatement = if expression statements? (else if expression statements?)* (else statements?)? end
Each expression is evaluated in lexical order until one is true. The statements following the expression are then executed. If none of the expressions are true and the else block is defined then its statements are executed.
3.3. For Each Statements
forEachStatement = for each identifier in range statements? end
range = (expression colon)? expression (colon expression)?
The statements are executed for each value in the range. The identifier is set to the next value in the range and has the same scope as the statements. The range evaluates to either an array or an iXX value.
If the middle expression is an array, then the first and last expressions must be omitted, and the elements of the array are traversed from the first to the last. If the middle expression is not an array, the first expression is the start of the range; if it is omitted, the range starts from 0. The middle expression is the last value in the range, inclusive. The last expression is the value by which the identifier is incremented per iteration. All three expressions must be of the same iXX type.
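The integer form of the range can be modeled as a Python generator. The default start of 0 and the inclusive end come from the text above; the default step of 1 when the last expression is omitted, and the restriction to positive steps, are assumptions of this sketch. The name owen_range is hypothetical.

```python
def owen_range(last, first=0, step=1):
    """Yield first, first + step, ... up to and including last."""
    if step <= 0:
        raise ValueError("this sketch only handles positive steps")
    value = first
    while value <= last:  # the end of the range is inclusive
        yield value
        value += step


print(list(owen_range(3)))                    # [0, 1, 2, 3]
print(list(owen_range(10, first=2, step=4)))  # [2, 6, 10]
```

Note that a range ending at 3 therefore runs the statements four times, unlike Python's built-in range(3).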
3.4. While Statements
whileStatement = while expression statements? end
The expression must be of type bool. If the expression is true, then the statements are executed. After the statements have executed, the expression is evaluated again, and if it is true the statements are executed again. This continues until the expression is false.
3.5. Break Statements
breakStatement = break
The breakStatement stops the execution of the innermost loop in which it appears. Execution resumes after that loop.
3.6. Call Statements
callStatement = callExpression
3.7. Return Statements
returnStatement = return expressions?
Returns control to the function that called the one containing the return statement. If the function containing the return statement doesn't specify any output, then the statement cannot specify any expressions to return, and the function may omit the statement entirely, since in that case control is returned to the caller after the last statement. If output is specified, then all code paths must end with a return statement whose expressions have the same types as the output types.
3.8. Assert Statements
assertStatement = assert expression
The expression must be of type bool. If the expression is true, then nothing happens. If the expression is false, then the current proposition stops execution and a description of the failing assertion is given.
expressions = expression (comma expression)*
expression = notExpression / negateExpression / binaryExpression / primaryExpression / parenthesizedExpression
primaryExpression = callExpression / literalExpression / sizeOfExpression / dotExpression / indexExpression / identifierExpression
literalExpression = floatLiteral / integerLiteral / booleanLiteral / arrayLiteral / structureLiteral
The expressions are evaluated from left to right. If an integer or floating point expression overflows at run time the behaviour is undefined. If the expression is constant and overflows it is a compile time error.
4.1. Unary Expressions
4.1.1. Not Expressions
notExpression = not expression
not = '!' whitespace
The expression must be of type bool. The not operator flips the expression from true to false and vice versa.
4.1.2. Negate Expressions
negateExpression = negate expression
negate = '-' whitespace
The expression must be of type iXX or fXX. The negate operator flips the sign of the value.
4.2. Binary Expressions
binaryExpression = expression binaryOperator expression
binaryOperator = mathOperator / relationalOperator / booleanOperator
mathOperator = ('+' / '-' / '*' / '/' / '%' / '|' / '&' / '<<' / '>>') whitespace
relationalOperator = ('==' / '!=' / '<=' / '>=' / '<' / '>') whitespace
booleanOperator = ('||' / '&&') whitespace
4.2.1. First Precedence
The || operator is logical or. Both operands must be of type bool. If either operand is true, then the expression is true; otherwise it is false. The operator short-circuits.
4.2.2. Second Precedence
The && operator is logical and. Both operands must be of type bool.
4.2.3. Third Precedence
The ==, !=, <, <=, > and >= are the equal, not equal, less than, less than or equal, greater than and greater than or equal operators respectively. Both operands must be of the same iXX or fXX type.
4.2.4. Fourth Precedence
The + and - operators work on iXX and fXX operands. Both operands must be of the same type. The result of the operation is the same type as the operands. The | operator is bitwise or. Both operands must be of the same iXX type. The result of the operation is the same type as the operands.
4.2.5. Fifth Precedence
The *, / and % are the multiply, divide and modulus operators respectively. Both operands must be of the same iXX or fXX type. The << and >> are the left shift and right shift operators respectively. Both operands must be of the same iXX type. The & operator is bitwise and. Both operands must be of the same iXX type.
4.3. Call Expressions
callExpression = identifier leftParenthesis expressions? rightParenthesis
leftParenthesis = '(' whitespace
rightParenthesis = ')' whitespace
Calls the function with the same identifier in scope. The expressions are the input for the function.
4.4.1. Floating Point Literals
floatLiteral = '-'? [0-9]+ '.' [0-9]+ 'f' ( '32' / '64' )
Floating point values are defined as in IEEE 754.
|Type|Range|
|f32|±1.18×10^-38 to ±3.4×10^38|
|f64|±2.23×10^-308 to ±1.80×10^308|
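The limits quoted for f32 and f64 correspond to the smallest positive normal value and the largest finite value of the IEEE 754 binary32 and binary64 formats. The following Python sketch recovers them; it assumes that Python's float is an IEEE 754 binary64 value, which holds on common platforms, and the helper name f32_from_bits is hypothetical.

```python
import struct
import sys


def f32_from_bits(bits):
    """Interpret a 32-bit pattern as an IEEE 754 binary32 value."""
    return struct.unpack(">f", bits.to_bytes(4, "big"))[0]


# Smallest positive normal value: exponent field 1, fraction 0.
f32_min_normal = f32_from_bits(0x00800000)
# Largest finite value: exponent field 254, fraction all ones.
f32_max = f32_from_bits(0x7F7FFFFF)

f64_min_normal = sys.float_info.min
f64_max = sys.float_info.max

print("f32:", f32_min_normal, "to", f32_max)
print("f64:", f64_min_normal, "to", f64_max)
```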
4.4.2. Integer Literals
integerLiteral = '-'? [0-9]+ ('i' / 'u') ( '8' / '16' / '32' / '64' )
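As an illustration of the rule, the sketch below recognizes an integer literal with a regular expression that mirrors the grammar and rejects constants that do not fit the suffix type, in line with the compile time overflow rule for constant expressions. It assumes that the i suffix means signed two's complement and the u suffix means unsigned, which this section does not state; the name parse_integer_literal is hypothetical.

```python
import re

# Mirrors: '-'? [0-9]+ ('i' / 'u') ( '8' / '16' / '32' / '64' )
INTEGER_LITERAL = re.compile(r"(-?[0-9]+)([iu])(8|16|32|64)")


def parse_integer_literal(text):
    """Return (value, type_name) or raise ValueError."""
    match = INTEGER_LITERAL.fullmatch(text)
    if match is None:
        raise ValueError(f"not an integer literal: {text!r}")
    value = int(match.group(1))
    kind = match.group(2)
    bits = int(match.group(3))
    if kind == "i":
        # Assumption: signed types use two's complement ranges.
        low, high = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    else:
        low, high = 0, 2 ** bits - 1
    if not low <= value <= high:
        raise ValueError(f"{text!r} does not fit in {kind}{bits}")
    return value, f"{kind}{bits}"


print(parse_integer_literal("255u8"))   # (255, 'u8')
print(parse_integer_literal("-128i8"))  # (-128, 'i8')
```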
4.4.3. Boolean Literals
booleanLiteral = true / false
Boolean literals are of type bool.
4.4.4. Array Literals
arrayLiteral = type dimensions elements? / elements
dimensions = leftSquareBracket expression? (comma expression?)* rightSquareBracket
elements = leftCurlyBracket element (comma element)* rightCurlyBracket
element = elements / expressions
leftSquareBracket = '[' whitespace
rightSquareBracket = ']' whitespace
The type is the type of the elements in the innermost dimension. Each expression, which must be an integer type, in dimensions specifies the size of each dimension. The maximum size of a dimension is implementation specific. The smallest dimension size is 0.
If only the elements are declared, then the first element of the innermost dimension determines the type of the array. The elements also specify the size of each dimension.
4.4.5. Structure Literals
structureLiteral = structure identifier fieldInitializers end
fieldInitializers = fieldInitializer (comma fieldInitializer)*
fieldInitializer = identifier equal expression
The structureLiteral's identifier is the name of the structure to initialize. The fieldInitializer's identifier is the name of the field to initialize. The expression is the value of the given field. The type of the expression must match the field's type. Fields that are not initialized have undefined values.
4.5. Size of Expressions
sizeOfExpression = size of expression
Yields the size of the expression, in bytes, as a u32. The expression must be a type.
4.6. Dot Expressions
dotExpression = expression dot expression
The left expression's type must be a structure, enumeration or union. The right expression must be a member of that type.
4.7. Index Expressions
indexExpression = expression leftSquareBracket expressions rightSquareBracket
The expression must be an array type. Each expression in expressions must be an iXX. Each value must be at least 0 and less than the size of the dimension it is used to index into. In Debug mode, going out of bounds stops the program and displays a message to the programmer explaining where the error occurred. In Release mode, going out of bounds results in undefined behaviour.
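The Debug mode check can be sketched as a function that validates each index against the size of its dimension. The sketch assumes zero-based indexing with an exclusive upper bound and one index per dimension; the name check_indices and the wording of the messages are hypothetical.

```python
def check_indices(dimension_sizes, indices):
    """Raise IndexError if any index is outside its dimension."""
    if len(indices) != len(dimension_sizes):
        raise IndexError("one index is required per dimension")
    for position, (index, size) in enumerate(zip(indices, dimension_sizes)):
        if not 0 <= index < size:
            raise IndexError(
                f"index {index} is out of bounds for dimension "
                f"{position} of size {size}"
            )


check_indices([2, 3], [1, 2])  # last element of a 2 by 3 array, accepted

try:
    check_indices([2, 3], [1, 3])
except IndexError as error:
    message = str(error)
    print(message)  # index 3 is out of bounds for dimension 1 of size 3
```

A Release build would skip this check entirely, which is why an out of bounds index there has undefined behaviour.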
4.8. Identifier Expressions
identifierExpression = identifier
The identifier can either be a variable, field or enumeration constant.
4.9. Parenthesized Expressions
parenthesizedExpression = leftParenthesis expression rightParenthesis
type = pointerTo* (identifier / functionSignature)
pointerTo = '*' whitespace