Artifact [db5468fd66]

Artifact db5468fd668eea10ab8764febea4253f35dc5829:

Wiki page [Ideas] by nem 2014-12-16 11:18:45.
D 2014-12-16T11:18:45.694
L Ideas
N text/x-markdown
P 473419eb051119d1363bcf4c344e6cb4624de2aa
U nem
W 3292
This page collects some thoughts on future directions. 

Firstly, the language should make it simple to represent:

 - Datalog clauses, e.g. `AncestorOf(x, y) :- ParentOf(x, z), AncestorOf(z, y)`
 - Algebraic languages - relational algebra, linear algebra, e.g. [VecTcl](http://auriocus.github.io/VecTcl/)
 - Optional types (not yet decided on this)
 - Domain-specific languages
 - Actions/commands
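
As a rough illustration of what a Datalog clause like the one above has to mean, here is a minimal Python sketch of naive bottom-up evaluation. The base case `AncestorOf(x, y) :- ParentOf(x, y)` and the sample facts are assumptions added for the example; the clause on this page gives only the recursive rule.

```python
def ancestors(parent_of):
    """Naive bottom-up evaluation of AncestorOf over (parent, child) pairs."""
    # Assumed base case (not stated on this page):
    #   AncestorOf(x, y) :- ParentOf(x, y)
    ancestor_of = set(parent_of)
    while True:
        # Recursive clause from the page:
        #   AncestorOf(x, y) :- ParentOf(x, z), AncestorOf(z, y)
        derived = {
            (x, y)
            for (x, z) in parent_of
            for (z2, y) in ancestor_of
            if z == z2
        }
        if derived <= ancestor_of:  # fixpoint: nothing new was derived
            return ancestor_of
        ancestor_of |= derived


# Hypothetical facts, for illustration only.
parent_of = {("alice", "bob"), ("bob", "carol"), ("carol", "dave")}
result = ancestors(parent_of)
```

With these facts the loop reaches a fixpoint after deriving the three indirect pairs, including `("alice", "dave")`.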

I _believe_ that it should be possible to use expressions/functions for most things, and therefore to restrict commands to the top level only. This frees up the `[...]` syntax (command substitution in Tcl) for other purposes.

Down to brass tacks. Like Tcl, SiCL is a command language. A program is a sequence of commands, separated by newlines or semicolons. A command is a sequence of words. A word is an expression, of one of the following forms:

 - Numbers: `123`, `123.4e12` etc.
 - Strings: `foo`, `"foo"`, `'foo'`, `“foo”`, `‘foo’`, `«foo»`, `‹foo›` - may as well allow other quotation characters, especially as they are nestable!
 - Symbols: `%$^$%` - punctuation symbols are also just strings, but the tokenizer breaks on them, so `foo+%$bar` is three tokens: `foo`, `+%$` and `bar`. Use whitespace to separate tokens, or quotes to group them, where needed.
 - Unary/binary operators: `f x`, `x f y` etc. Normally adjacent expressions will be parsed as separate words in the command. However, some strings may be declared as operators (prefix, infix, or postfix). In this case, when encountered, they will be applied to the surrounding arguments as per their precedence, associativity etc. E.g., if `f` is a prefix unary operator then a command like `foo f x y` will be parsed as `foo (f x) y`.
 - Sub-expressions: `(...)`
 - Lists: `(x, y, z)` - note that this is not a binary operator `,` but rather a special syntax. Thus `((x, y), z)`, `(x, (y, z))` and `(x, y, z)` are three different lists.
 - Dicts: `(x: y, z: a)`
 - Blocks of commands `: ... `. These start with a `:` and can only occur as a top-level word (to avoid confusion with dict syntax). I am edging towards indentation syntax for these: either the block is a single command on the same line as the colon, or a newline immediately follows the colon and the indentation of the next line sets the block indent. In the latter case, the first line indented less than that ends the block.
 - Sets: `{x, y, z}` - same as lists but no duplicates allowed.
 - Vectors/Arrays: `[x, y, z]` - same as lists but using VecTcl or some optimised representation. All elements are forced to be of the same type (and are compactly represented).
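
The tokenizer and prefix-operator rules above can be sketched in a few lines of Python. This is only a sketch under assumptions: the regular expression, the function names `tokenize` and `group_prefix`, and the decision to ignore quotes, brackets, precedence and infix/postfix operators are all mine, not part of any SiCL implementation.

```python
import re

# Assumption: a token is a number, a run of word characters, or a run of
# punctuation. Quoted strings and brackets are left out of this sketch.
TOKEN = re.compile(r"\d+(?:\.\d+)?(?:[eE][+-]?\d+)?|\w+|[^\w\s]+")


def tokenize(text):
    """Split text into tokens, breaking at punctuation runs and whitespace."""
    return TOKEN.findall(text)


def group_prefix(words, prefix_ops):
    """Fold each declared prefix operator together with the word after it.

    Precedence, associativity and nested operators are not handled here.
    """
    out = []
    i = 0
    while i < len(words):
        if words[i] in prefix_ops and i + 1 < len(words):
            out.append((words[i], words[i + 1]))  # (operator, argument)
            i += 2
        else:
            out.append(words[i])
            i += 1
    return out


tokens = tokenize("foo+%$bar")
# tokens == ['foo', '+%$', 'bar']

command = group_prefix(tokenize("foo f x y"), {"f"})
# command == ['foo', ('f', 'x'), 'y'], i.e. foo (f x) y
```

A real parser would presumably build the operator table from declarations and use something like precedence climbing, but the grouping of `foo f x y` into `foo (f x) y` is the same idea.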

The set form also provides an alternative syntax for constructing dictionaries: `{x: 1, y: "foo"}`. Possibly this syntax constructs a form based on a balanced tree of some kind (e.g. a red-black tree)?

This syntax also (I think) ensures that SiCL is a superset of JSON, which is quite a handy property.

I quite like the idea of using `#` to distinguish different implementation choices:

 - `[x, y, z]` is something like a linked-list or maybe a rope: worst-case linear access time, but constant-time appends.
 - `#[x, y, z]` - vector/array
 - `{x, y, z}` - balanced tree
 - `#{x, y, z}` - hashed

Possibly, this is all overkill and the built-in Tcl structures with copy-on-write (CoW) and ref-counting are good enough in most cases.
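
To make the trade-offs between the four forms concrete, here is a small Python sketch that maps each literal opener to a standard-library stand-in. Everything here is an assumption for illustration: the `build` function is hypothetical, `deque` stands in for the linked list or rope, a sorted list stands in for the balanced tree, and `array("d", ...)` fixes the element type to a C double.

```python
from array import array
from bisect import insort
from collections import deque


def build(opener, items):
    """Map a (hypothetical) SiCL literal opener to a Python stand-in."""
    if opener == "[":
        # Cheap appends, but linear-time access to elements in the middle.
        return deque(items)
    if opener == "#[":
        # Compact storage; every element must be a number stored as a double.
        return array("d", items)
    if opener == "{":
        # Sorted list standing in for a balanced tree: ordered, no duplicates.
        ordered = []
        for item in items:
            if item not in ordered:
                insort(ordered, item)
        return ordered
    if opener == "#{":
        # Hash-based set: no duplicates, no ordering guarantee.
        return set(items)
    raise ValueError(f"unknown opener: {opener!r}")


linked = build("[", [1, 2, 3])
vector = build("#[", [1, 2, 3])
tree = build("{", [3, 1, 3, 2])
hashed = build("#{", [1, 1, 2])
```

Note that `build("#[", ["a"])` raises a `TypeError`, which mirrors the "all elements of the same type" restriction on vectors.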

Z d65109d73aedd970a96269e9f7e248e1