Overview
| Comment: | A few updates |
|---|---|
| SHA1: | 1a694ab3c2f07227ee792d5bb6b894cc |
| User & Date: | andy 2021-12-26 17:42:20.948 |
Context
| Date | Time | Check-in |
|---|---|---|
| 2021-12-27 | 00:40:47 | Begin updating grammar. Leaf check-in: ed36f3fb28 user: andy tags: trunk |
| 2021-12-26 | 17:42:20 | A few updates. check-in: 1a694ab3c2 user: andy tags: trunk |
| 2021-12-26 | 17:42:09 | Roll back one overzealous instance of $$. check-in: 7446a62a1f user: andy tags: trunk |
Changes
Changes to doc/concepts.md.
| ︙ | ︙ | |||
- [Script](#script)
- [Substitution](#substitution)
- [Indexing](#indexing)
- [Expression](#expression)

# <a name="word"></a> Word <a href="#table_of_contents" style="font-size: small">[top]</a>

In the Brush programming language, the fundamental data unit is the word. All of
the following are examples of words:

- value of a variable
- argument to a command
- return value of a command
- a number
- result of an expression
- component of a compound word
- any value at all, anywhere

Conceptually, words are immutable, and all words are strings. Any attempt to
change a word's value only replaces it with a new word. To improve performance,
unshared words can be directly modified in place, and their implementation is
optimized according to type, where a word's type is determined by how the word
is used. These internal details are visible only at the C API level, not the
script level.

The term "word" was chosen because of the analogy to machine words. A machine
word is typically operated on as a unit, yet can be split into its constituent
bits and bytes. A Brush word is likewise typically operated on as a unit, yet
can be divided. The difference between machine words and Brush words is that
machine words are a fixed size whereas Brush words are variable and can in fact
contain other words.

Another analogy is to natural language. Words initially seem to be the atomic
building blocks of sentences, yet upon further examination they are revealed to
be made up of letters, morphemes, syllables, and stems.

## <a name="word_type"></a> Word type <a href="#table_of_contents" style="font-size: small">[top]</a>

All words are strings, but that is simply because strings are the common
denominator between all types. Many specialized types exist. Word type is a
flexible concept, and it varies freely throughout the execution of a program.

Here is a list giving some examples of word types:

- string
- integer
- blob
- real number
- reference
- glob expression
- regular expression
- list
- set
- map
- script

## <a name="string"></a> String <a href="#table_of_contents" style="font-size: small">[top]</a>

A string is a sequence of zero or more Unicode characters. Unicode characters
have 21-bit code points ranging from 0 through hexadecimal `0x1fffff`.

Internally, strings are encoded using UTF-8 with two modifications:

- NUL (code point 0) is represented as the two-byte sequence `0xc0 0x80`
- `0x00` is appended to the end of each string

These two modifications make encoded strings backward compatible with classic
NUL-terminated strings, even if the string contains embedded NULs.

Aside from NUL as described above, denormalized characters are not allowed.
Surrogate pairs and UTF-16 are not used.

## <a name="blob"></a> Blob <a href="#table_of_contents" style="font-size: small">[top]</a>

A blob is a sequence of arbitrary 8-bit bytes. "Blob" is short for "binary large
object", though of course blobs can be any size.

## <a name="reference"></a> Reference <a href="#table_of_contents" style="font-size: small">[top]</a>
| ︙ | ︙ | |||
- vector
- range
- stride

## <a name="list"></a> List <a href="#table_of_contents" style="font-size: small">[top]</a>

A list is a compound word containing zero or more component words in a linear
sequence. Words are addressed by their zero-based numerical index. This is known
as vectored indexing.

## <a name="set"></a> Set <a href="#table_of_contents" style="font-size: small">[top]</a>

A set is a list that is being accessed with the `[set]` commands. Sets provide
fast exact-match searches, known as keyed indexing. Behind the scenes, a critbit
tree is used to optimize access and provide other fast operations such as
sorting, minimum, and maximum.

Sets cannot contain duplicate keys. Should a list containing duplicate words be
accessed via the `[set]` commands, duplicates will be ignored, and only the
final instance of each duplicate word will be treated as a key in the set.

## <a name="map"></a> Map <a href="#table_of_contents" style="font-size: small">[top]</a>

A map is a list containing an even number of words, alternating between key
words and their associated value words. The `[map]` commands and keyed indexing
operators are used to rapidly perform exact-match searches over the key words.

As with sets, if a map contains duplicate keys, only the final (highest-indexed)
instance of any given duplicate key is accessible via `[map]` or keyed indexing
operators. Earlier duplicates will continue to be present in the list and string
representations but will be ignored by all map accesses.
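The duplicate-key rule for sets and maps can be illustrated with a small Python sketch. This is not Brush code and does not reflect the critbit tree mentioned above; it uses an ordinary Python `dict` as a stand-in, and the function names `set_view` and `map_view` are hypothetical. It only shows that the final instance of a duplicate wins while the underlying list is left untouched.

```python
def set_view(words):
    """Map each distinct word to the index of its final instance in the list."""
    last_index = {}
    for index, word in enumerate(words):
        last_index[word] = index  # a later duplicate overwrites the earlier index
    return last_index


def map_view(words):
    """Pair up alternating key and value words; the final duplicate key wins."""
    if len(words) % 2 != 0:
        raise ValueError("a map needs an even number of words")
    view = {}
    for i in range(0, len(words), 2):
        view[words[i]] = words[i + 1]  # a later duplicate overwrites the earlier value
    return view


set_words = ["x", "y", "x"]
map_words = ["a", "1", "b", "2", "a", "3"]

set_result = set_view(set_words)
map_result = map_view(map_words)
```

Here `map_result["a"]` is `"3"` because the pair at indexes 4 and 5 shadows the pair at indexes 0 and 1, yet `map_words` still holds all six words, mirroring the statement that earlier duplicates remain present in the list representation.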
# <a name="object"></a> Object <a href="#table_of_contents" style="font-size: small">[top]</a>

An object is a map with any number of variable keys and an optional attributes
key.

The variable keys associate variable names with references to their value words.
Empty string is not an allowed variable name, nor can variable names begin with
a digit `0-9`. Variable names may consist of ASCII characters `0-9a-zA-Z_` as
well as any non-ASCII characters (i.e. code points 128 and greater).

The attributes key word is empty string, and its value word is a map associating
various attribute names with values. The `type` attribute determines the object
type, and other attributes vary by type. Custom object types can be defined. The
attributes key may be omitted, in which case the object is simply a scope.

## <a name="scope"></a> Scope <a href="#table_of_contents" style="font-size: small">[top]</a>
| ︙ | ︙ | |||
A task is a paused command invocation.

Threads appear to execute simultaneously but (on single-core systems) may
actually be taking turns, preemptively and automatically scheduled by the
operating system. In contrast, tasks expressly take turns and are cooperatively
and manually scheduled by the Brush program.

Tasks are also known as coroutines because they are cooperating subroutines.

## <a name="thread"></a> Thread <a href="#table_of_contents" style="font-size: small">[top]</a>

A thread is a simultaneous execution sequence. Threads are compartmentalized,
and each thread runs in its own interpreter. Thus, threads are a specialized
form of interpreter.
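To make the contrast between cooperatively scheduled tasks and preemptively scheduled threads more concrete, here is a minimal Python sketch in which generators stand in for Brush tasks. It is an analogy only: the names `worker` and `run_round_robin` are hypothetical, and nothing here describes how Brush actually implements or schedules tasks.

```python
from collections import deque


def worker(name, steps, log):
    """A task-like generator that records its progress and then yields."""
    for step in range(steps):
        log.append(f"{name}:{step}")
        yield  # explicitly hand control back to the scheduler


def run_round_robin(tasks):
    """Resume each paused task in turn until all of them have finished."""
    queue = deque(tasks)
    while queue:
        task = queue.popleft()
        try:
            next(task)  # resume the task until its next yield
        except StopIteration:
            continue  # the task finished, so it is not rescheduled
        queue.append(task)


log = []
run_round_robin([worker("a", 2, log), worker("b", 3, log)])
```

Each generator runs only when the scheduling loop resumes it and gives up control only at its own `yield`, which is the sense in which tasks "expressly take turns". No operating system scheduler is involved.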
| ︙ | ︙ | |||
subsequent words are the arguments to the command.
There are numerous supported ways of typing words in a script. Each method is
known as a word constructor. Most word constructors produce a single word, but
the expansion and comment constructors produce multiple or zero words,
respectively. The first character of each word determines its word constructor.
:Syntax |:Type |:Comment
-------------------------------------------------------------------------------
**x** | Bare | Allows substitution, treats whitespace as a delimiter
`"`**x**`"` | Quoted | Allows substitution, inhibits whitespace processing
`{`**x**`}` | Braced | Inhibits both substitution and whitespace processing
`[`**x**`]` | Script | Allows nesting, value is the result of the script
`(`**x**`)` | List | Allows substitution and nesting, preserves word boundaries
`&`**x** | Reference | Creates a reference to a variable
| ︙ | ︙ | |||
within some parts of reference words or other substitutions.
:Syntax |:Type |:Comment
--------------------------------------------------------------------------------
`$`**x** | Simple variable substitution | Literal variable name
`$"`**x**`"` | Computed variable substitution| Name allows nested substitution
`${`**x**`}` | Expression substitution |
`$[`**x**`]` | Script substitution |
`$(`**x y z**`)` | List substitution |
`\`**x** | Backslash substitution | **x** is `[abBefnrtv]`
`\`**x** | Backslash quoting | **x** is `[^abBefnrtuvx0-7\n]`
`\`**nlws** | Line wrap |
`\`**o** | Octal 3-bit character | **o** is `[0-7]`
`\`**oo** | Octal 6-bit character | **o** is `[0-7]`
`\`**Ooo** | Octal 8-bit character | **O** is `[0-3]`, **o** is `[0-7]`
`\x`**hh** | Hexadecimal 8-bit character | **h** is `[0-9a-fA-F]`
`\u`**Hhhhhh** | Hexadecimal 21-bit character| **H** is `[01]`, **h** is `[0-9a-fA-F]`
In the above table, "**nlws**" refers to a newline followed by any number of
non-newline whitespace characters. When `\`**nlws** appears within
`"`quotes`"`, it is replaced with a single space. Otherwise, it is treated as a
word delimiter.
All substitutions starting with `$` permit indexing, described in the next
section.
The backslash substitution replacements are listed below:
:Sequence |:Replacement |:Description
---------------------------------------------------
`\a` | `\x07` | Audible alert
`\b` | `\x08` | Backspace
`\B` | `\x5c` | Backslash
`\e` | `\x1b` | Escape
`\f` | `\x0c` | Form feed
`\n` | `\x0a` | Line feed, a.k.a. newline
`\r` | `\x0d` | Carriage return
`\t` | `\x09` | Horizontal tab
`\v` | `\x0b` | Vertical tab
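
The replacement table above can be sketched as a lookup in Python. This is an illustration rather than the Brush implementation: the names `BACKSLASH_SUBSTITUTIONS` and `substitute_backslashes` are hypothetical. The sketch handles only the single-letter substitutions and backslash quoting, and it deliberately leaves the octal, `\x`, `\u`, and line-wrap forms untouched.

```python
import re

# Single-letter backslash substitutions, copied from the table above.
BACKSLASH_SUBSTITUTIONS = {
    "a": "\x07",  # audible alert
    "b": "\x08",  # backspace
    "B": "\x5c",  # backslash
    "e": "\x1b",  # escape
    "f": "\x0c",  # form feed
    "n": "\x0a",  # line feed
    "r": "\x0d",  # carriage return
    "t": "\x09",  # horizontal tab
    "v": "\x0b",  # vertical tab
}

# Characters that start escape forms this sketch does not attempt to handle.
_UNHANDLED = set("ux01234567\n")

_ESCAPE = re.compile(r"\\(.)", re.DOTALL)


def substitute_backslashes(text):
    """Apply single-letter substitutions and backslash quoting, left to right."""

    def replace(match):
        char = match.group(1)
        if char in BACKSLASH_SUBSTITUTIONS:
            return BACKSLASH_SUBSTITUTIONS[char]
        if char in _UNHANDLED:
            return match.group(0)  # leave octal, hex, Unicode, line wrap as-is
        return char  # backslash quoting: keep the character, drop the backslash

    return _ESCAPE.sub(replace, text)
```

Scanning left to right matters: in the input `\\n`, the first two characters form a quoted backslash, so the following `n` is an ordinary letter rather than the start of a line feed substitution.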
| ︙ | ︙ |