Semantic Restrictions
Consider an operation called "some element of_"
that accepts a set and answers an arbitrary element. Clearly the parameterization of the set should not impact the meaningfulness of the operation. It is therefore desirable to define the operation with maximum generality: it may accept any set whatsoever and will answer an arbitrary value from it.
someNumbers
is a set comprising only natural numbers, i.e., positive integers. The method accepts any set, and set of natural number
is a subtype of set of any
, so the method is certainly applicable. Intuitively then, an arbitrary element chosen from this set will be a natural number. But the method's return type is any
, so the compiler cannot reach the same conclusion as human intuition. To improve the compiler's inferential capabilities, another method can be introduced that overrides the generic behavior for instances of the subtype.
The ability to define such an override is extremely powerful, but it should set off alarm bells that both methods have the same implementation. This override clearly exists solely for the benefit of the compiler. Without it, the compiler (apparently) cannot deduce that an element arbitrarily chosen from a set of natural numbers can only be a natural number. This is not necessarily so onerous — until one wishes to select an element from a set of string
!
This situation arises naturally and continually when defining and using operations on parameterized types. Were this the only mechanism that Avail provided, then the language would prove grossly unsatisfactory for practical usage. A programmer would be forever duplicating code in order to gain the benefit of type inference. It would be far better if a programmer could define an operation once and for all at the maximum appropriate level of generality, yet any specific usage of the operation could benefit from local information. In order to determine the legality of an operation, the compiler must use knowledge of the types of the arguments of an operation, i.e., a message send, so it would be nice if these types were available for the purpose of ascertaining the type of value produced by the operation.
In Avail, local information can be leveraged at the site of a particular message send through the definition of a semantic restriction — a function that is applied by the compiler to the inferred types of the arguments of the associated method. A single semantic restriction affords several opportunities:
- It can strengthen the result type of a message send if the argument types indicate that such strengthening is both possible and warranted.
- It can reject a message send as illegal if one or more of its arguments are deemed inappropriate based on an assessment of their types.
- It can encode information about the semantics of an operation in such a way that the compiler can mechanically operate upon it. In a sense, it assists the compiler in leveraging the meaning of an operation.
Consider the following redesign of "some element of_"
, and assume that no overrides are defined:
How does this work? When the compiler parses the expression some element of someNumbers
, it determines that it represents a send of the message "some element of_"
to the value of the argument someNumbers
. The argument happens to be a variable use phrase, so its inferred type is set of natural number
. The compiler then identifies which semantic restrictions apply. In this case, only a single semantic restriction exists, and it is applicable for any value that is an instance of (set of any)'s type
. By the law of metacovariance, set of natural number
being a subtype of set of any
necessitates that (set of natural number)'s type
is a subtype of (set of any)'s type
, and therefore set of natural number
is an instance of (set of any)'s type
. The semantic restriction is thus applicable for the value set of natural number
. The compiler then applies the semantic restriction to this value: it runs the user-supplied code on the type inferred for someNumbers
. The function answers natural number
, the element type of set of natural number
. Finally, the compiler computes the type intersection of the declared return type of "some element of_"
— any
— and the value obtained from the semantic restriction — natural number
— to obtain natural number
. This is the type inferred for the expression some element of someNumbers
. It is trivially compatible with the declared type of aNumber
, so the initialization is permissible.
Consideration of the expression some element of someStrings
is analogous. Eventually the compiler invokes the semantic restriction with the value set of string
and it answers string
. So the type inferred for the expression is ultimately string
, meaning that the initialization of aString
is valid. A single implementation of "some element of_"
and a single semantic restriction take the place of the infinitely many overrides previously conjectured to be necessary.
Why does the compiler compute the type intersection of the return type and the semantic restriction's answer? Because multiple semantic restrictions may be specified for the same method and even for the same parameter types. This ensures the modularity of the system. Only the semantic restrictions already defined for a method can be applied to a particular message send. Subsequently loaded modules can introduce new semantic restrictions that will have effect on further message sends.
To make this concrete, consider the following code sample:
Mentally move the initializing declaration of aNumber
to a point below the introduction of the second semantic restriction. When the compiler parses the expression some element of someNumbers
, it applies both semantic restrictions. The first yields the value natural number
and the second yields the value ⊤
. The type intersection of any
, natural number
, and ⊤
is natural number
. The inferred type remains unchanged from before.
The advantage introduced by the new semantic restriction does not pertain to strengthening of the inferred type. Instead it improves the meaningfulness of the operation by utilizing type information local to a particular message send. It is absurd to ask for one of the elements of the empty set because it does not include any elements. The new semantic restriction codifies this by invoking "Reject parse,expected:_"
if the argument necessarily describes a type whose only instance is the empty set. This is possible because the incomplete type set
has a type parameter that constrains the cardinality of instances, and the semantic restriction can query the upper bound of this constraint via "⎡_⎤"
.
Note that the implementation specifies nothing about the lower bound of the cardinality constraint. So, it does not block message sends that might not work correctly, only those that cannot possibly work correctly. But because semantic restrictions are modular, a less tolerant semantic restriction can later be introduced that imposes a burden on the programmer to prove that a set cannot be empty before it can qualify as an argument to "some element of_"
.
The compiler cannot check a semantic restriction for correctness. For example, it is entirely possible to define a semantic restriction on "some element of_"
that always answers string
; this will certainly be wrong whenever the message is sent to a set of natural number
. Why bother with a semantic restriction then? One may as well ask, why bother with programming at all? Semantic restrictions are tremendously convenient and powerful so long as they are coded and tested for correctness.
A semantic restriction allows a programmer to assert that a particular expression yields a type that is stronger than the compiler can otherwise ascertain. In many programming languages, this is accomplished via a type cast, a special syntactic construct that mandates dynamic coercion of a value into a stronger type. Such a cast is necessary for each expression that has an insufficiently strong static type. The run-time value may or may not be checked for conformance to the stronger type. A language that actually checks type casts for validity during program execution is run-time type safe; one that does not is run-time type unsafe.
Avail is run-time type safe. When a message send is strengthened by a semantic restriction, the result produced at execution time is checked for conformance to the type answered by the semantic restriction at compile time. This ensures that the type error will be detected as soon as possible, preventing further corruption. Semantic restrictions are automatically applied by the compiler wherever appropriate, so the programmer need not annotate expressions that must be coerced at run-time, thereby reducing syntactic clutter and incidental code complexity.
Avail is first-order type safe. If every semantic restriction is correct, then the compiler guarantees that the program is completely free of type errors. Thus every value computed by an expression will be an instance of that expression's inferred type. Avail is not second-order type safe, since semantic restrictions cannot be checked for correctness — although their implementations are ordinary code and therefore themselves first-order type-safe.
‹ Type Inference | | | Return to Type System | | | Type Lattice › |