Principles of Nanyx
Simple /= easy
We believe in the creed "Simple and easy are not the same". We prefer a language that gets things right to one that makes things easy. Such a language might take longer to learn in the short run, but its simplicity pays off in the long run.
Inform; don't block
The Nanyx compiler has both warning and error diagnostics but, unlike many other languages, a warning in Nanyx is considered a build failure (with no option to ignore it). We believe that warnings are often ignored or turned off (and lead to arguments within teams as to whether a warning is harmless or not), and that this ultimately leads to code that will cause problems later on. For Nanyx, we believe that any code that appears troublesome or incorrect to the compiler should outright be rejected.
However, Nanyx is unusual in that, while a warning will "fail" the build, the compiler can still produce an executable when building in 'debug' mode. This allows the developer to run or test the program despite the fact that the build has nominally failed. The programmer can then iteratively fix the warnings until the program is warning-free.
This is important in allowing the developer to maintain a tight feedback loop, as even type errors or name errors are warnings rather than errors. Only when building in 'release' mode will such issues block the build.
Note that compiler errors (such as syntax errors) will always cause the compiler to abort compilation.
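The policy described above can be read as a small decision table. The following Python sketch is only an illustration of that policy under our reading of the text, not the Nanyx compiler's implementation; the names BuildResult and build_outcome are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BuildResult:
    succeeded: bool            # did the build nominally pass?
    executable_produced: bool  # is there something to run or test?


def build_outcome(mode: str, errors: int, warnings: int) -> BuildResult:
    """Model the diagnostic policy (hypothetical helper, not a Nanyx API)."""
    if errors > 0:
        # Compiler errors (such as syntax errors) always abort compilation.
        return BuildResult(succeeded=False, executable_produced=False)
    if warnings > 0:
        # Warnings always fail the build, but a 'debug' build can still
        # produce an executable so the program can be run and tested.
        return BuildResult(succeeded=False, executable_produced=(mode == "debug"))
    return BuildResult(succeeded=True, executable_produced=True)
```

For instance, build_outcome("debug", 0, 2) reports a failed build that still has an executable, whereas the same input in "release" mode yields none.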
Everything is an expression
Nanyx is a functional language and embraces the idea that everything should be an expression. Nanyx has no local variable declarations or if-then-else statements; instead, it has let-bindings and if-then-else expressions. However, Nanyx does not take this idea as far as the Scheme languages: it still has declarations, namespaces, and so forth that are not expressions.
Separate pure and impure code
Nanyx supports functional, imperative, and logic programming. The context system of Nanyx cleanly and safely separates pure code from impure code. That is, if a function is pure then the programmer can trust that the function behaves like a mathematical function: it returns the same value when given the same arguments and it has no side-effects.
Correctness above all
Nanyx aims to ensure program correctness and considers it more important than raw performance. Languages such as C and C++ often rely on undefined behaviour to achieve stellar performance, whereas most other languages, including Nanyx, try to eschew undefined behaviour in favor of runtime checks for things that are hard to statically ensure. For example, most languages will dynamically check that array accesses are not out of bounds. The cost is a small performance hit, but in our view the benefit towards correctness is immense. Inspired by Ada, Nanyx aims to offer strong guarantees, ideally ensured statically, but when necessary with dynamic checks.
We believe that a language should make it easy to make illegal states unrepresentable. For example, algebraic data types can be used to precisely define the possible values of a type. In Nanyx, in the future, we want to take this a step further, and allow refinement of some types. For example, to express that some value must not only be an integer, but also that it must fall within a range, e.g. [0-99].
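To make the range example concrete, here is a minimal Python sketch of a value that can only exist if it lies in the range 0 to 99. It is an approximation: Python can only check the range dynamically at construction time, whereas a refinement type would ideally reject out-of-range values statically. The type name Percentile is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Percentile:
    """An integer constrained to the range 0..99 (hypothetical example type)."""
    value: int

    def __post_init__(self) -> None:
        # bool is a subclass of int in Python, so reject it explicitly.
        if not isinstance(self.value, int) or isinstance(self.value, bool):
            raise TypeError("value must be an integer")
        if not 0 <= self.value <= 99:
            raise ValueError(f"value out of range 0..99: {self.value}")
```

Once a Percentile has been constructed, code that receives it does not need to re-check the range, because an out-of-range instance cannot be created through the constructor.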
Nanyx aims to support developer productivity; the ability to do a lot with little ceremony or boilerplate. A hand-crafted C program might run faster than a Nanyx program, but it won't be as short, concise, or expressive as the Nanyx program. Nanyx aims to be a language with powerful constructs and high-level abstractions.
This does not mean that Nanyx is slow.
One language
Nanyx is one programming language. The Nanyx compiler does not have feature flags or compiler plugins that change or extend the semantics of the language. We want to avoid fragmentation in the ecosystem where programs end up being written in different "dialects" of the language. There is one language, now and forever. Of course, we can still expect the language to evolve over time.
Principle of least surprise and the pit of success
We should strive to adhere to the principle of least surprise. That is, we should favor sane defaults, and when there is no immediately obvious default, we should not have a default at all, but force the programmer to be explicit about their intention.
Wherever possible, the easiest thing should also be the safest thing. For example, immutable data structures should be the default, and it is mutable data structures that should require explicit annotation.
Local type inference
The Nanyx type system is based on Hindley-Milner, which supports full type inference. As a design choice, we require (with a warning) all exported members to be annotated with their argument and return types. Type signatures serve as useful documentation and aid program understanding. We believe that requiring type signatures has three distinct advantages that outweigh the disadvantages:
- Type signatures improve program readability and accurately assign blame for type errors
- Type signatures enable parallel type checking
- "Pinning" type signatures of exported members helps avoid the accidental changing of APIs
Syntax vs. Semantics
Syntax is important. Semantics are important. But we should not confuse the two. A syntactic issue should not be resolved by an enrichment of the semantics. For example, extension methods and implicit classes seem to be semantic solutions to (mostly) syntactic issues. Nanyx aims to avoid such pitfalls.
Consistent syntax
Nanyx aims to have consistent and predictable syntax. As an example, we try to have the syntax of types mirror that of expressions:
- A function application is written as f(a, b, c), as is a type application F(A, B, C).
- A function expression is written as {x -> x + 1} whereas a function type is written as Int -> Int.
- A tuple is written as (true, 12345) whereas a tuple type is written as (bool, int).
As much as possible each symbol (or pair of symbols) should have a single meaning. For example, braces {} are always used for function bodies, = is always used for definitions, and so forth.
Human-readable errors and compiler messages
In the spirit of languages like Elm and Rust, Nanyx aims to have human readable and understandable compiler messages, as they should be thought of as the user interface of the compiler.
Messages in Nanyx should be crisp, concise, and clear, getting straight to the point: the error message "Duplicate definition: 'foo'" is better than the error message "The definition 'foo' is defined twice" because the programmer will likely understand the problem as soon as they have scanned the first word. The "80 / 20" rule means that 80% of the time a developer will need only minimal information to understand why a compiler message is being shown; most likely the developer will already have seen the specific error message hundreds of times before. But 20% of the time, the developer will have never or rarely seen the message before and will need more information.
With this in mind, a Nanyx compiler message consists of three components:
- Title: A one sentence summary. The message shown on hover in the code editor.
- Context: A multi-line text that contains all relevant details, including the relevant symbol(s) and program fragment(s).
- Explanation: A description of why the problem occurs and preferably hints as to what can be done to fix it. When relevant, explain how Nanyx differs from other languages.
Regarding style and tone: the language should be friendly or neutral, and written in the passive voice (i.e. not "you did this" or "I couldn't do that"). An error message should not blame the programmer; for example, we should prefer "Unexpected foo" over "Illegal foo", since the latter implies that the programmer did something wrong.
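The three components of a message (title, context, explanation) could be modelled as a small record. The Python sketch below is an assumption about how such a structure might look, not the compiler's actual data model; the class name CompilerMessage, the render method, and the example context and explanation text are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CompilerMessage:
    title: str        # one-sentence summary, shown on hover in the editor
    context: str      # details: relevant symbols and program fragments
    explanation: str  # why the problem occurs and how it might be fixed

    def render(self, verbose: bool = False) -> str:
        """Short form for the common case, full form on request."""
        if not verbose:
            return self.title
        return "\n\n".join([self.title, self.context, self.explanation])


# Illustrative message; the line numbers and wording are invented.
message = CompilerMessage(
    title="Duplicate definition: 'foo'",
    context="'foo' is defined at line 3 and again at line 9.",
    explanation="One of the two definitions can be renamed or removed.",
)
```

The short form serves the 80% case where the title alone is enough; the verbose form serves the 20% case where the developer needs the full context and explanation.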
Closed world assumption
Nanyx requires all code to be available at compile-time. This enables a range of compilation techniques, such as:
- Monomorphization to avoid unnecessary boxing of primitives.
- Aggressive dead code elimination ("tree shaking") to remove unused functions.
- Inlining across namespaces.
- Whole-program analysis.
Nothing is executed before main
In Nanyx, main is the entry point of a program. No (user-defined) code is ever executed before main. No static initializers, no static fields. No class loaders. Main is always first. This makes it easy to reason about startup behavior.
Small but comprehensive standard library
Nanyx has a small standard library with a few common data types, e.g. Option, List, Set, and Map, but for these it offers a comprehensive collection of functionality. For example, the standard library has more than 65 functions for working with lists. We want the standard library to offer a common set of abstractions which are usable by most programs, but not much else.
Declare before use
In Nanyx things must be defined before they can be used. Algebraic data types, functions, local variables, and other programming elements must be declared before they can be used by other program parts. Declarations make it easy to assign blame; we assume declarations to be correct and check every use against its declaration. For example, an algebraic data type declares a set of cases, and the compiler checks that every use refers to one of these cases, and that every case is covered.
No unnecessary declarations
We believe that a programming language should reduce the volume of declarations it requires. Declarations may be useful and are sometimes necessary, but Nanyx aims to minimize its internal dependence on them. To give an example, Nanyx supports extensible records, which permit the usage of flexible and type-safe records without a strict requirement that record types must be declared upfront.
No global state
In Nanyx there is no global shared state. This avoids a plethora of issues, including difficulties with initialization order and race conditions in the presence of concurrency. A Nanyx programmer is free to construct some state in the main function and pass it around, but there is no built-in mechanism to declare global variables. In a real system, the programmer still has to deal with the state of the world, e.g. the state of the file system, the network, and other resources.
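The pattern of constructing state in main and passing it around can be sketched as follows. This is a Python illustration of the idea only; Python itself does allow global variables, and the names AppState and record_visit are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AppState:
    greeting: str
    visits: int


def record_visit(state: AppState) -> AppState:
    """Return a new state instead of mutating shared data."""
    return AppState(greeting=state.greeting, visits=state.visits + 1)


def main() -> AppState:
    state = AppState(greeting="hello", visits=0)  # state is created in main
    state = record_visit(state)                   # and passed on explicitly
    state = record_visit(state)
    return state
```

Every function that depends on the state says so in its parameter list, so there is no hidden initialization order to reason about.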
Share memory by communicating
Nanyx follows the Go mantra: Do not communicate by sharing memory; instead, share memory by communicating. In other words: mutable memory should never be shared between processes. Processes should only share immutable messages (and data structures). We believe this significantly reduces the risk of race conditions.
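As a rough illustration of the mantra, the Python sketch below has one worker that only ever receives and sends immutable tuples over queues. It is an analogy rather than Nanyx code: Python threads stand in for processes here, and Python does not prevent threads from sharing mutable memory, so the discipline is by convention only.

```python
import queue
import threading


def worker(inbox: "queue.Queue", outbox: "queue.Queue") -> None:
    """Receive immutable messages and reply with immutable results."""
    while True:
        message = inbox.get()
        if message is None:  # sentinel: no more work
            break
        name, numbers = message           # (str, tuple of ints)
        outbox.put((name, sum(numbers)))  # the reply is an immutable tuple


inbox = queue.Queue()
outbox = queue.Queue()
thread = threading.Thread(target=worker, args=(inbox, outbox))
thread.start()

inbox.put(("a", (1, 2, 3)))
inbox.put(("b", (10, 20)))
inbox.put(None)
thread.join()

results = [outbox.get(), outbox.get()]
```

Because the worker and the main thread exchange only tuples, neither side can observe a half-updated data structure belonging to the other.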
Bugs are not recoverable errors
We believe in the Midori Error Model; that is, there are two kinds of errors: recoverable errors and program bugs. Recoverable errors are things like illegal user input and network errors: errors that can be anticipated and where there is a chance of recovery. Program bugs, on the other hand, are unanticipated and we cannot expect to recover from them. We should treat these two types of errors differently:
- For recoverable errors, we should enforce that they are checked and handled.
- For program bugs, we should terminate execution as quickly as possible to prevent data corruption and security issues.
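The distinction between the two kinds of errors can be sketched in Python as follows. This is an illustration under assumptions, not Nanyx's error-handling mechanism: the names Ok, Err, parse_int, and average are hypothetical, and Python does not enforce that the returned result is actually checked.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class Ok:
    value: int


@dataclass(frozen=True)
class Err:
    reason: str


def parse_int(text: str) -> Union[Ok, Err]:
    """Recoverable error: bad input is returned as a value to be handled."""
    try:
        return Ok(int(text))
    except ValueError:
        return Err(f"not an integer: {text!r}")


def average(values: List[float]) -> float:
    """Program bug: calling this with no values violates its contract."""
    if not values:
        # Unanticipated state: stop immediately rather than continue
        # with meaningless data.
        raise AssertionError("bug: average() called with no values")
    return sum(values) / len(values)
```

A caller is expected to inspect the result of parse_int and decide how to recover, whereas a failure inside average signals a defect in the calling code that should be fixed rather than handled.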
Fail fast, fail hard
To aid debugging and prevent potentially harmful behaviour, Nanyx aborts execution when an unrecoverable error is encountered. In the presence of concurrency, if a process fails, Nanyx aborts the entire program. This ensures that the outside environment is duly notified and can take corrective action, e.g. to restart the program.
No pre-processor
Nanyx does not have and will not have a pre-processor. Programs that use pre-processing for textual code generation are notoriously difficult to understand and debug. We want to avoid that for Nanyx. Instead, Nanyx may some day have a macro system, but so far there has been little need.
No null value
Nanyx does not have the null value. The null value is now widely considered a mistake and languages such as C#, Dart, Kotlin and Scala are scrambling to adopt mechanisms to ensure non-nullness. In Nanyx, we adopt the standard solution from functional languages which is to represent the absence of a value using the Option type. This solution is simple to understand, works well, and guarantees the absence of dreaded NullPointerExceptions.
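The idea of representing absence with an Option type can be sketched in Python. This is a stand-in for illustration only: the classes Some and Nothing and the helper functions are hypothetical, and Python, unlike Nanyx, still has None and does not force the caller to unwrap the result.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass(frozen=True)
class Some:
    value: int


@dataclass(frozen=True)
class Nothing:
    pass


Option = Union[Some, Nothing]


def find_port(config: Dict[str, int], service: str) -> Option:
    """Absence is an ordinary value, not a null reference."""
    if service in config:
        return Some(config[service])
    return Nothing()


def port_or_default(option: Option, default: int) -> int:
    """The value can only be reached by unwrapping the Some case."""
    if isinstance(option, Some):
        return option.value
    return default
```

The signature of find_port tells the reader that a port may be missing, which a plain integer return type could not express.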
No implicit coercions
In Nanyx, a value of one type is never implicitly coerced or converted into a value of another type. For example:
- No value is ever coerced to a boolean.
- No value is ever coerced to a string.
- Integers and floating-point numbers are never truncated or promoted.
No reflection
Nanyx does not support reflection, i.e. the ability to inspect the structure of the program at run-time. Reflection tends to break the kind of program reasoning that both compilers and humans rely on. At some point in the future, Nanyx might support some notion of compile-time meta programming.
No unused declarations
Inspired by Rust, the Nanyx compiler will reject programs that contain unused declarations. We believe that rejecting such programs will help programmers avoid mistakes where some algebraic data type or function is unintentionally left unused.
No unused variables
Nanyx disallows unused local variables, whether they are introduced by let, introduced by pattern matching, or part of the formal parameters of a function. Research [1] [2] has repeatedly shown that minor mistakes are a common source of bugs, e.g. using the wrong local variable. Disallowing unused local variables helps avoid such mistakes.
No overloading
Nanyx does not support function overloading (using the same name for different functions). Instead, Nanyx encourages the use of meaningful names, e.g. Map.filter and Map.filterWithKey, for functions that share similar functionality.
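The naming convention can be illustrated with two separately named functions over a map. The Python sketch below only mirrors the Map.filter / Map.filterWithKey pair mentioned above; the function names and signatures are assumptions, not the Nanyx standard library.

```python
from typing import Callable, Dict


def filter_values(pred: Callable[[int], bool], m: Dict[str, int]) -> Dict[str, int]:
    """Keep the entries whose value satisfies the predicate."""
    return {k: v for k, v in m.items() if pred(v)}


def filter_with_key(pred: Callable[[str, int], bool], m: Dict[str, int]) -> Dict[str, int]:
    """Keep the entries whose key and value together satisfy the predicate."""
    return {k: v for k, v in m.items() if pred(k, v)}


stock = {"apple": 3, "pear": 0, "plum": 7}
in_stock = filter_values(lambda v: v > 0, stock)
p_in_stock = filter_with_key(lambda k, v: k.startswith("p") and v > 0, stock)
```

The two names make the shape of the expected predicate visible at the call site, with no overload resolution needed to tell them apart.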
No variadic (varargs) functions
Nanyx does not support variadic (varargs) functions. It is not clear to us how a language design can support both currying and variadic functions cleanly. Moreover, it seems that the supposed benefits of variadic functions are not that great in a language which already has concise syntax for list and array literals.
No binary or octal literals
Nanyx does not support binary or octal literals. It is our understanding that these features are rarely used in practice.
Exhaustive pattern matches
The Nanyx compiler enforces that pattern matches handle all cases of an algebraic data type. If a match expression is found to be non-exhaustive, the program is rejected. We believe this encourages more robust code and enables safer refactoring of algebraic data types.
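To show what is at stake, here is a Python sketch of a function over a two-case data type. It is only an approximation of the idea: Python has no compile-time exhaustiveness check, so the final line is a runtime fallback for the situation that an exhaustiveness-checking compiler would reject before the program ever runs. The types Circle and Rectangle are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Circle:
    radius: float


@dataclass(frozen=True)
class Rectangle:
    width: float
    height: float


Shape = Union[Circle, Rectangle]


def area(shape: Shape) -> float:
    if isinstance(shape, Circle):
        return math.pi * shape.radius ** 2
    if isinstance(shape, Rectangle):
        return shape.width * shape.height
    # Reached only if a new case is added to Shape but not handled here.
    raise AssertionError(f"unhandled case: {shape!r}")
```

If a third case were added to Shape, this function would fail only when that case is first encountered at run time; with enforced exhaustiveness, every match that needs updating is reported at compile time instead.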
Timeless design
A few years ago HTML was all the rage. Hence it was only natural that Java adopted HTML-style comments. A bit later, XML was all the rage, hence it was only natural that Scala added support for native XML literals. Today, JSON and Markdown are all the rage, but if history is any guide, we should not add any special support for these to Nanyx.
Built-in documentation
Nanyx supports documentation comments as part of the language. We believe such integration avoids fragmentation of the ecosystem and ultimately leads to better tool support.
Built-in unit tests
Nanyx supports unit tests as part of the language. We believe such integration avoids fragmentation of the ecosystem and ultimately leads to better tool support.
Unification of tuples and records
Nanyx unifies tuples and records. A tuple is simply a record where the fields are numbered 0, 1, 2, ... Nanyx even allows numbered and named fields to be mixed in the same type. This unification simplifies the language and makes it easier to learn and understand. It also means that the programmer is not forced to make wide changes to a codebase simply because they want to add a field name to a tuple.
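The unification can be pictured by modelling a record as a mapping from field labels to values, with a tuple being the special case where the labels are 0, 1, 2, and so on. The Python sketch below is a conceptual model only and says nothing about how Nanyx represents or types records; make_tuple and with_field are hypothetical helpers.

```python
from typing import Any, Dict, Sequence, Union

Label = Union[int, str]
Record = Dict[Label, Any]


def make_tuple(values: Sequence[Any]) -> Record:
    """A tuple is a record whose fields are numbered 0, 1, 2, ..."""
    return {index: value for index, value in enumerate(values)}


def with_field(record: Record, label: Label, value: Any) -> Record:
    """Return a new record with one extra (or replaced) field."""
    return {**record, label: value}


point = make_tuple([3, 4])                      # fields 0 and 1
labelled = with_field(point, "name", "corner")  # numbered and named fields mixed
```

Code that reads fields 0 and 1 of point keeps working on labelled, which is the sense in which adding a field name does not force wide changes elsewhere.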