Quick Start
This quick start guide is targeted at experienced programmers who want to get started writing Avail as quickly as possible. No special effort has been expended to ensure that the material herein is suitable for those with a limited or nonexistent programming background. Furthermore, the quick start guide assumes that you have already installed and configured an Avail environment suitable for development of personal projects, and that you know how to launch the workbench.
The quick start guide endeavors to use code patterns as often as possible to engage the phenomenal pattern finding abilities of the human brain. For the fastest possible start, I recommend that you study the code samples to get a feel for the language but skim the text as much as possible, concentrating on the text only when samples don't click for you. You may want to read sections that don't contain code samples more carefully, since these sections are likely to answer technical questions about the language at a high level. Some sections end with a quick reference subsection; you can generally skip these subsections completely on a first pass.
Table of Contents
- Modules
- Tokenization
- Parsing
- Types (quick ref)
- Literals (quick ref)
- Constants (quick ref)
- Variables (quick ref)
- Blocks (quick ref)
- Methods (quick ref)
Modules
Modules are where Avail code lives. Modules are organized recursively into roots and packages, and every package is positioned recursively within a root. Modules and packages both end with the .avail file extension, but modules are regular files whereas packages are directories. Each package contains an eponymous module, called the representative, which specifies the exports of that package.
Before you can start coding, you will need to create a root for your own Avail packages and modules. Set the AVAIL_ROOTS environment variable to something like this:
This path contains three (3) module root specifications:
Root | Description |
---|---|
avail | The standard library. |
examples | The standard examples. |
myproject | Your exploratory project, where your own Avail packages and modules go. |
Obviously you can choose a name other than myproject or a path other than $HOME/myproject. There is no need for the last component of your root's source path to match the name of the root. For the rest of this guide, I will assume that you called the root myproject.
Once you have updated AVAIL_ROOTS, relaunch the Avail workbench. You can now populate your myproject root with modules and packages. Use a shell or text editor to create a module called "Exploration.avail" directly inside myproject, then switch focus back to the workbench and hit F5 (or choose the "Refresh" item from the "Build" menu) to update the module view. You should be able to find Exploration now under your myproject root. You will double-click Exploration whenever you want to recompile it.
Anatomy of a Module
Module Header
The module header specifies linkage information for the module. It supports a small variety of keywords, each of which leads a section. The following sections are most likely to be immediately useful:
Keyword | Section | Description |
---|---|---|
Module | module name section | Declares the name of the module. |
Uses | private imports section | Declares the private imports of the module. |
Extends | extended imports section | Declares the re-exported imports of the module. |
Names | introduced names section | Declares all names introduced and exported by the module. |
Entries | entry points section | Declares all entry points of the module. |
Module Body
The module body is where code lives. When you are experimenting with Avail, you will put your code into the body of your Exploration module.
Using the Standard Library
In order to gain access to the native Avail syntax and the extensive functionality of the standard library, be sure to include "Avail" in the private imports section of the module header. This section begins with the keyword Uses.
This is very important — much more important even than including the standard libraries of other programming languages. Since Avail's native syntax is supplied by the standard library, and not built into the compiler, you won't even be able to use literals, define constants, declare variables, or write statements without importing the standard library. So don't forget this critical step.
Tokenization
Module bodies are fully tokenized prior to parsing. The parser operates on tokens to produce unambiguous top-level sends of macros and methods.
There are four (4) types of tokens, described below using regular expressions and Unicode categories:
Token Type | Scanning Rule |
---|---|
keyword | [\p{L}\p{Nl}][\p{Cf}\p{L}\p{Mc}\p{Mn}\p{Nd}\p{Nl}\p{Pc}]* |
string literal | "(?:\\"|[^"])*?" |
nonnegative integer literal | \p{Nd}+ |
operator | \P{Cn} (and does not match a previous rule) |
Keyword Tokens
Keyword tokens may be used as identifiers, i.e., names of constants and variables. Keyword tokens are read greedily.
String Literals
The Avail compiler has built-in support for scanning string literals. String literals are read non-greedily.
Nonnegative Integer Literals
The Avail compiler has built-in support for scanning nonnegative integer literals, e.g., those described by the type whole number. Nonnegative integer literals are read greedily, and can therefore denote arbitrarily large finite values, not just ℤ/(2³²) or ℤ/(2⁶⁴), i.e., the integers modulo 2³² or 2⁶⁴.
Operator Tokens
Every other non-whitespace character is scanned as a single operator token, so ::= produces three (3) tokens, not one (1).
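The four scanning rules can be modeled outside of Avail. The following Python sketch is only an illustration of the table above, not the compiler's scanner: the function names are hypothetical, whitespace is skipped with Python's str.isspace() (an assumption, and broader than \p{Z}, so that tabs and newlines are also skipped), and leading and trailing whitespace are not recorded on the tokens.

```python
import re
import unicodedata

# String literal rule from the table, applied non-greedily.
STRING_LITERAL = re.compile(r'"(?:\\"|[^"])*?"')
# Categories allowed after the first character of a keyword (besides \p{L}).
KEYWORD_REST = {"Cf", "Mc", "Mn", "Nd", "Nl", "Pc"}


def starts_keyword(ch):
    cat = unicodedata.category(ch)
    return cat.startswith("L") or cat == "Nl"


def continues_keyword(ch):
    cat = unicodedata.category(ch)
    return cat.startswith("L") or cat in KEYWORD_REST


def is_digit(ch):
    return unicodedata.category(ch) == "Nd"


def tokenize(source):
    """Return (token type, text) pairs, trying the rules in table order."""
    tokens = []
    i = 0
    while i < len(source):
        ch = source[i]
        if ch.isspace():  # assumption: broader than \p{Z}
            i += 1
            continue
        match = STRING_LITERAL.match(source, i) if ch == '"' else None
        if starts_keyword(ch):
            j = i + 1
            while j < len(source) and continues_keyword(source[j]):
                j += 1  # keywords are read greedily
            tokens.append(("keyword", source[i:j]))
        elif match:
            j = match.end()
            tokens.append(("string literal", match.group()))
        elif is_digit(ch):
            j = i + 1
            while j < len(source) and is_digit(source[j]):
                j += 1  # integer literals are read greedily
            tokens.append(("nonnegative integer literal", source[i:j]))
        elif unicodedata.category(ch) != "Cn":
            j = i + 1  # every other assigned character is its own token
            tokens.append(("operator", ch))
        else:
            raise ValueError(f"unassigned code point at offset {i}")
        i = j
    return tokens


for token in tokenize("x ::= 42;"):
    print(token)
```

In this model, ::= yields three separate operator tokens, and 6.4 yields an integer, an operator, and another integer without any intervening whitespace.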
Whitespace
Whitespace (\p{Z}*) is permitted to appear before or after any token, and must appear between two distinct greedily read tokens of the same type in order to distinguish them as separate tokens.
Intervening whitespace is not necessary to distinguish adjacent tokens of different token types.
Tokens carry their leading ("_'s⁇leading whitespace") and trailing whitespace ("_'s⁇trailing whitespace"), meaning that macros can react to whitespace if they are so inclined. This is what allows 6.4 to be recognized as a literal double, but not 6 . 4. This mechanism allows macros to decide whether whitespace is significant situationally, rather than the compiler enforcing a universal policy.
Comments
Block comments begin with /* and end with */. Unlike in C, C++, or Java, block comments nest, allowing you to comment out sections of code that happen to contain other block comments. Block comments are permitted wherever whitespace would be permitted.
There are no special end-of-line comments.
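Nesting means that a scanner has to count comment depth rather than stop at the first */. The Python sketch below models that rule; the function name is hypothetical, and for brevity it ignores string literals, so a /* inside a string would be misread.

```python
def strip_block_comments(source):
    """Replace each outermost /* ... */ comment with a single space."""
    out = []
    depth = 0  # how many comments are currently open
    i = 0
    while i < len(source):
        pair = source[i:i + 2]
        if pair == "/*":
            depth += 1
            i += 2
        elif pair == "*/" and depth > 0:
            depth -= 1
            i += 2
            if depth == 0:
                out.append(" ")  # a comment stands where whitespace could
        else:
            if depth == 0:
                out.append(source[i])
            i += 1
    if depth != 0:
        raise ValueError("unterminated block comment")
    return "".join(out)


print(strip_block_comments("a /* outer /* inner */ still outer */ b").split())
```

A C-style scanner would end the comment at the first */ and leave "still outer */" behind as code; counting depth removes the whole outer comment.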
Parsing
The compiler parses a module body by attempting to convert sequences of tokens into unambiguous top-level sends of statement-valued macros and ⊤-valued methods. Whenever such a send is recognized, the compiler evaluates it immediately, allowing it to perform its side effects in the environment of the module undergoing compilation. These side effects are how new Avail elements are introduced, like method definitions, macro definitions, semantic restrictions, grammatical restrictions, etc. Once an Avail element has been introduced, it can be used immediately, i.e., in the next top-level statement.
Types
Avail is statically typed, and more strongly typed than any traditional programming language. Every value has a most specific type that is distinct from the most specific type of every other value. This type is called the instance type (referred to as a singleton type in some programming language literature). For example, 5's type and "Hello!"'s type are both instance types.
Explicit type annotations are required for variable declarations and block argument declarations. They are permitted in some other places, like label declarations and return type declarations. Otherwise types are inferred from context.
Every value is an instance of infinitely many types, so Avail trivially supports multiple inheritance. Avail's types are organized into an algebraic lattice, so any two types have a most specific more general type (the type union) and a most general more specific type (the type intersection).
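To make the lattice vocabulary concrete, here is a toy model in Python. It is a sketch under stated assumptions and says nothing about how Avail implements types: the only types are closed integer ranges, the class and function names are hypothetical, and None stands in for an empty intersection. Note that the type union is the most specific range that covers both operands, which can admit values belonging to neither.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    """Toy type: all integers from low to high, inclusive."""
    low: int
    high: int

    def has_instance(self, value):
        return self.low <= value <= self.high

    def is_subtype_of(self, other):
        return other.low <= self.low and self.high <= other.high

    def union(self, other):
        # Most specific range that is more general than both operands.
        return Range(min(self.low, other.low), max(self.high, other.high))

    def intersection(self, other):
        # Most general range that is more specific than both operands.
        low, high = max(self.low, other.low), min(self.high, other.high)
        return Range(low, high) if low <= high else None  # None: no instances


def instance_type(value):
    """The most specific toy type of a value: a range holding only it."""
    return Range(value, value)


small, large = Range(1, 3), Range(7, 9)
print(small.union(large))                        # covers 4..6 as well
print(Range(1, 5).intersection(Range(3, 9)))
print(instance_type(5).is_subtype_of(Range(1, 9)))
```

In this model the value 5 is an instance of every range that contains it, which is the toy counterpart of a value being an instance of infinitely many types.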
Every type is denotable in Avail itself, i.e., there are no types intelligible to the compiler which cannot be mentioned explicitly by an Avail programmer. Additionally, every type is itself a value, whose own type is one of Avail's infinitely many metatypes. Metatypes are organized by the law of metacovariance: given two types A and B, A is a subtype of B iff A's type is a subtype of B's type.
Types Quick Reference
Message | Description |
---|---|
"number" | type shared by all numbers |
"integer" | type shared by all finite integers |
"float" | type shared by all floats |
"double" | type shared by all doubles |
"boolean" | type shared by true and false |
"tuple" | type shared by all tuples |
"string" | type shared by all strings |
"set" | type shared by all sets |
"map" | type shared by all maps |
"_'s⁇type" | get instance type of value |
"_'s⁇instance" | get sole instance of instance type |
"_⊆_" | subtype |
"_∪_" | union |
"_∩_" | intersection |
"_∈_" | instance of |
"_`?→_†" | cast |
Literals
Unlike traditional programming languages which have fixed syntax for a small variety of literal types, Avail's macros permit values of any data type to be literalized. Macros act on phrases at compile time, and can execute arbitrary user code to effect substitutions, thereby allowing any computable value to be literalized. It is therefore not possible to give an exhaustive list of literal formats.
Built-in Literals
String literals and nonnegative integer literals are the only literal formats built into the compiler.
Floating-point Literals
Floating-point literals must include the fractional part. Exponential notation is optional. If the literal ends in f (U+0066), then a single-precision floating-point number is indicated; otherwise, a double-precision floating-point number is indicated.
Negative Numeric Literals
Negative numeric literals are constructed using the macro "-_".
Boolean Literals
The boolean type comprises the values true and false (written thus).
Null Literals?
Avail does not expose a null value for use by a programmer. An unassigned variable can be used for a similar purpose, but it does not travel as conveniently as a traditional null.
Literals Quick Reference
Message | Description |
---|---|
"…#.…#«f»?" | floating-point literal constructor |
"…#.…#…" | floating-point literal constructor (positive exponential notation without sign) |
"…#.…#e|E«+|-»!…#«f»?" | floating-point literal constructor (positive exponential notation with sign) |
"-_" | negative numeric literal constructor |
"true" | value representing truth |
"false" | value representing falsehood |
Constants
Avail features a rich collection of immutable data types and a plethora of persistent operators. It is very natural to bind the results of expressions to constants, thus constant definitions appear much more commonly in Avail code than variable declarations. As the name suggests, constants are written once, upon creation, and never change during their lifetime.
Module Constants
If a constant definition appears as a top-level statement, then a module constant is created. This constant can be referenced from any lexically subsequent statement in the module; it never goes out of scope.
Local Constants
If a constant definition appears within a block expression, then a local constant is created. This constant can be referenced from any lexically subsequent statement or expression in the same block. It goes out of scope after the block expression ends.
Constants Quick Reference
Message | Description |
---|---|
"…::=_;" | constant definition |
Variables
Despite Avail's many functional programming features, Avail is an imperative programming language. As such, it permits the declaration of variables—holders of state that can be written as well as read and can thus change over time.
Variables can be initialized upon declaration. A variable that is not explicitly initialized is deemed unassigned until it is written. Reading from an unassigned variable will raise a cannot-read-unassigned-variable exception.
An unassigned variable can be used for a purpose similar to a traditional null value (which Avail does not support).
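The lifecycle described above (unassigned, written, read, cleared) can be modeled in a few lines of Python. This is only an analogy: the class and exception names are hypothetical stand-ins, and a private sentinel plays the role of the unassigned state so that no ordinary value, None included, is mistaken for it.

```python
class CannotReadUnassignedVariable(Exception):
    """Hypothetical stand-in for cannot-read-unassigned-variable exception."""


_UNASSIGNED = object()  # private sentinel; not a value a program can store


class Variable:
    """Toy model of a variable that may be assigned or unassigned."""

    def __init__(self, value=_UNASSIGNED):
        self._value = value  # optional initialization upon declaration

    def is_assigned(self):
        return self._value is not _UNASSIGNED

    def is_unassigned(self):
        return not self.is_assigned()

    def read(self):
        if self._value is _UNASSIGNED:
            raise CannotReadUnassignedVariable()
        return self._value

    def assign(self, value):
        self._value = value

    def clear(self):
        self._value = _UNASSIGNED  # make the variable unassigned again


counter = Variable()
print(counter.is_unassigned())  # declared without initialization
counter.assign(10)
print(counter.read())
counter.clear()
print(counter.is_assigned())
```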
Module Variables
If a variable declaration appears as a top-level statement, then a module variable is created. This variable can be referenced from any lexically subsequent statement in the module; it never goes out of scope.
Local Variables
If a variable declaration appears within a block expression, then a local variable is created. This variable can be referenced from any lexically subsequent statement or expression in the same block. It goes out of scope after the block expression ends.
Variables Quick Reference
Message | Description |
---|---|
"…:_†;" | variable declaration |
"…:_†:=_;" | variable declaration with initialization |
"…" | variable use (just say its name) |
"…:=_;" | variable assignment |
"_↑is assigned" | true if variable is assigned |
"_↑is unassigned" | true if variable is unassigned |
"Clear_↑" | make variable unassigned |
"_↑++" | increment (integer variables only) |
"_↑--" | decrement (integer variables only) |
"cannot-read-unassigned-variable exception" | raised when reading unassigned variable |
Blocks
Blocks are lexical specifications of functions. A function is a body of code whose execution is deferred until the function is applied (or invoked or executed or called, depending on what terminology is familiar to you).
Blocks without Parameters
Blocks for arity-0 functions (i.e., functions without arguments) are just bodies of code inside left square bracket [ (U+005B) and right square bracket ] (U+005D).
Blocks with Parameters
Block parameters are separated by commas , (U+002C) and precede a vertical line | (U+007C).
Return Types
Blocks infer their return types from their final expressions, but you can also annotate the return type of a block explicitly. This annotation is only necessary if you wish to weaken the return type of the block (because the compiler will always infer the strongest possible type).
The return type of a block that ends with a statement is ⊤, Avail's most general type. A return type of ⊤ means that a block returns control to its caller without producing a value.
⊥ is Avail's most specific type. A return type of ⊥ means that a block never returns control to its caller, i.e., because it always raises an exception, loops forever, switches continuations, etc.
Semicolons
By now you might be confused by the semicolons. When are they needed, and when not?
The body of a block is a series of statements followed by an optional expression. Statements are ⊤-valued, and must end with a semicolon ; (U+003B). The optional final expression produces a value, and must not end with a semicolon, even if ⊥-valued.
Summary: Every "line of code" in a block except the last one definitely ends with a semicolon. If the last "line" is a value-producing expression, then don't put a semicolon after it; otherwise, throw a semicolon in at the end.
Labels
Following the parameters section, an optional label definition is permitted. The label represents a continuation that resumes at the top of the block, i.e., the beginning of the applied function. The return type of the label is optional, but should be included whenever you plan to exit the continuation using the label. If omitted, the label return type is assumed to be ⊥ (because continuations are contravariant by return type).
Defining a label permits use of several transfer-of-control mechanisms:
Message | Description |
---|---|
"Restart_" | Restart the continuation (like traditional continue) |
"Exit_with_" | Exit the continuation with a specific value (usually like traditional return) |
"Exit_" | Exit the continuation without producing a value (like traditional break) |
Lexical Closures and Outers
Blocks behave as lexical closures of bindings declared in outer scopes. These bindings comprise arguments, labels, primitive failure reasons, local and module constants, and local and module variables. The bindings declared in an outer scope and captured by lexical closure are called outers. Mutable outers, i.e., module and local variables, may be rebound (by assignments or other mechanisms).
Blocks in Control Structures
Because blocks are just lexical specifications of functions, they may appear wherever functions may appear. This means that blocks can occur as arguments to standard control structures, like "If_then_" and "While_do_". This makes Avail ideal for higher-order programming.
And because block labels represent continuations, blocks can even be used to implement low-level constructs like loops.
Evaluating a Block Immediately
It is sometimes useful to evaluate a block immediately after closing it. This is especially useful when you just want to introduce a new scope for local constants and variables, or when you want to use a label to exit an inner scope without exiting the outermost block.
Blocks Quick Reference
Message | Description |
---|---|
"\
\|[\
\|««…:_†§‡,»`|»?\
\|«Primitive…#«(…:_†)»?§;»?\
\|«$…«:_†»?;§»?\
\|«_!§»\
\|«_!»?\
\|]\
\|«:_†»?\
\|«^«_†‡,»»?" |
block definition |
"function" | type shared by all functions |
"Restart_" | Restart the continuation (like traditional continue) |
"Exit_with_" | Exit the continuation with a specific value (usually like traditional return) |
"Exit_" | Exit the continuation without producing a value (like traditional break) |
"_(«_‡,»)" | apply a function (with or without arguments) |
Methods
Methods are named operations. Method names are called messages. In traditional languages, messages are typically constrained to be standard identifiers. Avail's story is rather different.
Message Pattern Language
Messages are specified using a pattern language that is commonly understood by the various method definers, like "Method_is_". A message teaches the parser how to recognize that some new syntax represents an invocation of the named method.
The chart below describes the pattern language of messages, omitting some patterns that are only useful for defining macros.
- The Pattern column shows a single pattern abstractly, with metacharacters in blue monospace. Bold patterns are eponymous and descriptive, i.e., keyword stands for a keyword token. Italic patterns are shortened to conserve horizontal space, and elaborated upon in the description.
- The Example Message column shows a message that embeds the pattern at least once.
- The Example Signature column shows the types of the arguments accepted by the method, in their natural order.
- The Example Use column shows an expression that sends the message.
- The Description column tersely describes the feature that is activated by using the pattern.
Throughout the chart, the red arrow pointing downwards then curving rightwards ⤷ (U+2937) indicates the beginning of a blue monospace region that corresponds to the illustrated pattern.
Pattern | Example Message | Example Signature | Example Use | Description |
---|---|---|---|---|
keyword | "⤷integer " | <> | ⤷integer | match a keyword token case-sensitively |
operator | "⤷∞ " | <> | ⤷∞ | match an operator token |
p⁇ | "_⤷occurrences⁇ of_" | <whole number, any> | 7⤷ of 9 OR 7⤷ occurrences of 9 | match 0 or 1 occurrence of a pattern p |
t1| …| tn | "⤷If|if _then_else_" | <boolean, []→⊤, []→⊤> | ⤷If a ≥ 1 then [a] else [0] OR ⤷ if a ≥ 1 then [a] else [0] | match any token t1 .. tn of an alternation |
« p1| …| pn» | "⤷«a lion|a tiger|a bear|oh my» " | <> | ⤷a lion OR ⤷ a tiger OR ⤷ a bear OR ⤷ oh my | match any pattern p1 .. pn of an alternation |
p~ | "⤷«red alert»~" | <> | ⤷red alert OR ⤷ RED ALERT OR ⤷ rEd AlErT OR etc. | match a pattern p case-insensitively |
_ | "Print:⤷_ " | <⤷string > | Print: ⤷"Hello!" | match a type-safe argument expression, pass it as an argument |
_↑ | "`↑⤷_↑ " | <⤷variable > | ↑⤷var | match a variable use, pass the variable itself as an argument |
_† | "_`?→_† " | <any, ⤷type > | n ?→ ⤷[10..20] | match a type expression, evaluate it in the module scope, pass result as an argument |
… | "⤷… ::=_;" | <⤷token , expression phrase> | ⤷x ::= 0; | match a keyword token, pass it as an argument |
…# | "⤷…# .⤷…# «f»?" | <⤷literal token⇒integer , ⤷literal token⇒integer , boolean> | ⤷6 .⤷4 | match a type-safe literal token, pass it as an argument |
…! | "¢⤷…! " | <⤷token > | ¢⤷= | match an operator token, pass it as an argument |
« p» | "sum⤷«_» " | <⤷number+ > | sum ⤷1 2 3 | match n≥0 occurrences of pattern p, pass accumulated arguments as a single tuple |
« p1‡p2» | "{⤷«_‡,» }" | <⤷any* > | {⤷2, 4, 6 } | match n≥0 occurrences of pattern p1 interleaved with n-1 occurrences of pattern p2, pass accumulated arguments as a single tuple |
« p»? | "…#.…#⤷«f»? " | <literal token⇒integer, literal token⇒integer, ⤷boolean > | 6.4⤷ OR 6.4⤷ f | match 0≤n≤1 occurrence of pattern p, pass n=1 as an argument |
« p»# | "⤷«please‡,»# stop" | <⤷whole number > | ⤷ stop OR ⤷ please stop OR ⤷ please, please, please stop | match n occurrences of pattern p, pass n as an argument |
« p1| …| pn»! | "⤷«a lion|a tiger|a bear|oh my»! " | <⤷[1..4] > | ⤷a lion OR ⤷ a tiger OR ⤷ a bear OR ⤷ oh my | match any pattern p1 .. pn of an alternation, pass the pattern's ordinal, e.g., 1 for a lion, 2 for a tiger, etc. |
` m | "Yes⤷`… ⤷`? " | <> | Yes⤷… ⤷? | escape the following metacharacter m, disabling its special meaning (otherwise … and ? each would have had special meaning) |
Method Overriding
The ordering of a method's parameter types is called its signature. Methods can be overridden for any combination of parameter types. The names of the parameters are not considered part of the method's signature.
Method Dispatch
Every argument of a message send participates in the resolution of a method. The method definition selected is the most specific definition available for the supplied combination of arguments. If no method definition is uniquely most specific at runtime, then the dispatch mechanism raises an ambiguous-method-definition exception.
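The selection rule can be sketched in Python, using Python classes in place of Avail types. This is an illustrative model only, not Avail's dispatch implementation: dispatch, AmbiguousMethodDefinition, and the dictionary of definitions keyed by signature are all hypothetical names introduced for the example.

```python
class AmbiguousMethodDefinition(Exception):
    """Hypothetical stand-in for ambiguous-method-definition exception."""


def dispatch(definitions, args):
    """Pick the definition whose signature is most specific for args.

    definitions maps a signature (a tuple of classes) to a definition.
    """
    # A definition applies if every argument is an instance of its parameter.
    applicable = [
        sig for sig in definitions
        if len(sig) == len(args)
        and all(isinstance(arg, cls) for arg, cls in zip(args, sig))
    ]
    if not applicable:
        raise LookupError("no applicable definition")

    def at_least_as_specific(a, b):
        # Every parameter position participates in the comparison.
        return all(issubclass(x, y) for x, y in zip(a, b))

    winners = [
        sig for sig in applicable
        if all(at_least_as_specific(sig, other) for other in applicable)
    ]
    if len(winners) != 1:
        raise AmbiguousMethodDefinition(args)
    return definitions[winners[0]]


definitions = {
    (int, object): "integer first",
    (object, int): "integer second",
}
print(dispatch(definitions, (1, "x")))  # only one definition applies
try:
    dispatch(definitions, (1, 2))       # both apply, neither is most specific
except AmbiguousMethodDefinition:
    print("ambiguous")
definitions[(int, int)] = "both integers"
print(dispatch(definitions, (1, 2)))    # the new definition resolves the tie
```

The tie in the middle call arises because each definition is more specific in one position and less specific in the other; adding a definition that is at least as specific in every position makes the choice unique.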
WORK IN PROGRESS