This page collects some thoughts on future directions.
Firstly, the language should make it simple to represent:
- Datalog clauses, e.g. AncestorOf(x, y) :- ParentOf(x, z), AncestorOf(z, y)
- Algebraic languages - relational algebra, linear algebra, e.g. VecTcl
- Optional types (not yet decided on this)
- Domain-specific languages
- Actions/commands
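To make the Datalog item concrete, here is a rough Python sketch of what the AncestorOf clause computes, using naive bottom-up evaluation. The base rule AncestorOf(x, y) :- ParentOf(x, y) is my assumption (the clause above is only the recursive case), and the function name and sample facts are invented for illustration; nothing here is SiCL syntax.

```python
def ancestors(parent_of):
    """Naive fixpoint evaluation of AncestorOf over (parent, child) pairs."""
    # Assumed base rule (not in the text): AncestorOf(x, y) :- ParentOf(x, y)
    ancestor_of = set(parent_of)
    while True:
        # Recursive rule from the text:
        # AncestorOf(x, y) :- ParentOf(x, z), AncestorOf(z, y)
        derived = {
            (x, y)
            for (x, z) in parent_of
            for (z2, y) in ancestor_of
            if z == z2
        }
        if derived <= ancestor_of:
            # Nothing new was derived, so the fixpoint has been reached.
            return ancestor_of
        ancestor_of |= derived


# Hypothetical sample facts: alice is a parent of bob, bob of carol.
facts = {("alice", "bob"), ("bob", "carol")}
result = ancestors(facts)
```

The point of the wish-list item is that the one-line clause should be expressible directly, without spelling out a loop like this.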
I believe that it should be possible to use expressions/functions for most things and therefore restrict commands to the top level only. This frees the [...] syntax to be co-opted for other purposes.
Down to brass tacks. Like Tcl, SiCL is a command language. A program is a sequence of commands, separated by newlines or semicolons. A command is a sequence of words. A word is an expression, of one of the following forms:
- Numbers: 123, 123.4e12, etc.
- Strings: foo, "foo", 'foo', “foo”, ‘foo’, «foo», ‹foo› - may as well allow other quotation characters, especially as they are nestable!
- Symbols: %$^$% - punctuation symbols are also just strings, but the tokenizer will break on them, so foo+%$bar is three tokens: foo, +%$ and bar. Use whitespace if needed, or quotes to group.
- Unary/binary operators: f x, x f y, etc. Normally adjacent expressions will be parsed as separate words in the command. However, some strings may be declared as operators (prefix, infix, or postfix). In this case, when encountered, they will be applied to the surrounding arguments as per their precedence, associativity etc. E.g., if f is a prefix unary operator then a command like foo f x y will be parsed as foo (f x) y.
- Sub-expressions: (...)
- Lists: (x, y, z) - note that this is not a binary operator, but rather a special syntax. Thus ((x, y), z) is different from (x, (y, z)), which is different from (x, y, z).
- Dicts: (x: y, z: a)
- Blocks of commands: written as : ... These start with a colon (:) and can only occur as a top-level word (to avoid confusion with dict syntax). Edging towards indentation syntax for these - either the block is a single command on the same line as the colon, or there is a newline immediately after the colon and then the indentation of the next line sets the block indent: the first line indented less than that ends the block.
- Sets: {x, y, z} - same as lists but no duplicates allowed.
- Vectors/Arrays: [x, y, z] - same as lists but using VecTcl or some optimised representation. All elements forced to be of the same type (and compactly represented).
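The tokenizing rule for symbols and the prefix-operator example from the list above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the token pattern, the function names and the way operators are "declared" (a plain set) are all made up here, and quotes, brackets, commas, infix/postfix operators and precedence are ignored.

```python
import re

# Hypothetical token pattern: a run of word characters, or a run of
# punctuation symbols. Quotes, brackets and commas are not handled here.
TOKEN = re.compile(r"\w+|[^\w\s]+")


def tokenize(text):
    """Split text into words, breaking wherever punctuation starts or ends."""
    return TOKEN.findall(text)


def group_prefix(words, prefix_ops):
    """Pair each declared prefix operator with the single word after it.

    Simplification: no precedence, no nesting of operators, and no
    infix or postfix operators.
    """
    out = []
    i = 0
    while i < len(words):
        if words[i] in prefix_ops and i + 1 < len(words):
            out.append((words[i], words[i + 1]))  # stands for (f x)
            i += 2
        else:
            out.append(words[i])
            i += 1
    return out


tokens = tokenize("foo+%$bar")
command = group_prefix(tokenize("foo f x y"), {"f"})
```

Here tokens holds the three tokens foo, +%$ and bar, and command has three words, the middle one being the pair for (f x).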
The set form also provides an alternative syntax for constructing dictionaries: {x: 1, y: "foo"}. Possibly this syntax constructs a form based on a balanced tree of some kind (red-black tree)?
This syntax also (I think) ensures that SiCL is a super-set of JSON, which is quite a handy property.
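The indentation rule for blocks described in the list above is easier to judge with something executable, so here is a rough Python sketch of it. Everything in it is an assumption for illustration: the (text, body) tuple representation, the function names, and the shortcut of finding the block colon by tracking bracket depth (quoted strings are ignored).

```python
def indent_of(line):
    """Number of leading spaces on a line."""
    return len(line) - len(line.lstrip(" "))


def top_level_colon(text):
    """Index of the first ':' outside any brackets, or -1 if there is none."""
    depth = 0
    for pos, ch in enumerate(text):
        if ch in "([{":
            depth += 1
        elif ch in ")]}":
            depth -= 1
        elif ch == ":" and depth == 0:
            return pos
    return -1


def parse_block(lines, start, block_indent):
    """Parse commands until a line indented less than block_indent.

    Returns (commands, next_index). Each command is (text, body), where
    body is None for a plain command or a list of commands for a block.
    """
    commands = []
    i = start
    while i < len(lines):
        line = lines[i]
        if not line.strip():
            i += 1  # skip blank lines
            continue
        if indent_of(line) < block_indent:
            break  # first line indented less than the block ends it
        text = line.strip()
        i += 1
        pos = top_level_colon(text)
        if pos == -1:
            commands.append((text, None))
            continue
        head, rest = text[:pos].strip(), text[pos + 1:].strip()
        if rest:
            # Block is a single command on the same line as the colon.
            commands.append((head, [(rest, None)]))
            continue
        # Newline right after the colon: the next line sets the block indent.
        while i < len(lines) and not lines[i].strip():
            i += 1
        body = []
        if i < len(lines) and indent_of(lines[i]) > indent_of(line):
            body, i = parse_block(lines, i, indent_of(lines[i]))
        commands.append((head, body))
    return commands, i


def parse(source):
    commands, _ = parse_block(source.splitlines(), 0, 0)
    return commands


# Hypothetical input; the command names mean nothing in particular.
example = "\n".join([
    "while x:",
    "    foo 1",
    "    if y:",
    "        bar 2",
    "    baz 3",
    "set d (a: 1, b: 2)",
    "when z: quux",
])
tree = parse(example)
```

Note that the colon inside (a: 1, b: 2) is left alone because it is not at the top level, which is the reason the text restricts block colons to top-level words.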
I quite like the idea of using # to distinguish different implementation choices:
- [x, y, z] is something like a linked-list or maybe a rope: worst-case linear access/append time, but constant-time prepend.
- #[x, y, z] - vector/array
- {x, y, z} - balanced tree (b-tree?)
- #{x, y, z} - hashed
Possibly, this is all overkill and the built-in Tcl structures with CoW and ref-counting are good enough in most cases.
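For the linked-list choice, the trade-off mentioned above (constant-time prepend, linear access) looks roughly like this as a Python sketch. The class and function names are invented here and this is an analogy only, not a proposal for how SiCL would actually store lists.

```python
class Cons:
    """One cell of an immutable singly linked list; None is the empty list."""
    __slots__ = ("head", "tail")

    def __init__(self, head, tail=None):
        self.head = head
        self.tail = tail


def prepend(item, lst):
    """Constant time: one new cell that shares the existing list as its tail."""
    return Cons(item, lst)


def nth(lst, index):
    """Linear time: walk index cells from the front."""
    node = lst
    for _ in range(index):
        if node is None:
            raise IndexError("index out of range")
        node = node.tail
    if node is None:
        raise IndexError("index out of range")
    return node.head


def to_list(lst):
    """Copy the cells into a Python list, front to back."""
    out = []
    while lst is not None:
        out.append(lst.head)
        lst = lst.tail
    return out


xs = prepend("x", prepend("y", prepend("z", None)))
ys = prepend("w", xs)  # xs is untouched and shared as the tail of ys
```

Because prepending never copies or mutates the old cells, this representation shares structure freely, which is the property that the #[...] vector form would trade away for compact storage and fast indexing.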