Marpa

313 ancestors of [514199b8ed]

2017-10-15
04:42
[514199b8ed] Leaf: Moving critcl after tcl solves OSX issue with install dependency order. Check if this breaks linux. (user: aku, tags: bad-build-order-trouble)
2017-10-11
05:28
[08e6e9634d] Mark recognizer cons/dest points better (user: aku, tags: trunk)
2017-10-06
22:01
[32c320340a] Equivalent changes in the C runtime. 1. The C runtime already inter-twined tree extraction, valuation and hand-over which was added to the Tcl runtime in the previous commit. 2. Fixed same issue with possible L0 discards after G1 end. 3. Fixed bad assertions in symset, byteset, exposed by 2. (user: aku, tags: trunk)
20:12
[8c6bdade0a] Reworked parser completion handling. Do not pull and save all possible parse trees into memory anymore. Instead eval each tree immediately after extraction and pass the resulting SV to the outer backend. Further a bug fix, tell the lexer about expected terminals (none), so that it can still handle any L0 discards which may occur after the G1 end symbol. I.e. while we are not expecting the G1 token stream to continue the L0 byte stream may still have input to process. TODO: Have to add test cases for this situation, both where only the expected discards occur, and where unexpected actual G1 tokens are present. (user: aku, tags: trunk)
19:36
[bbe2253bdb] Fix in Tcl runtime tracing. Bring necessary variable into scope. This was forgotten when placing various operations into their own methods for clarity. (user: aku, tags: trunk)
19:33
[4f1c755959] Debugging enhancement, show actual semantic values in valuation steps. (user: aku, tags: trunk)
19:31
[5fde5977d2] Big tangle of single package sliced into several packages, each containing just related code. (user: aku, tags: trunk)
2017-10-05
21:51
[d38f475f67] Closed-Leaf: Fix package meta data typo. (user: aku, tags: slice)
21:39
[912cadf759] Reworked naming of the generator packages, and associated namespaces. Searching for plugins, i.e. more generators is now simpler (no special cases to exclude). (user: aku, tags: slice)
18:59
[649487dd0c] Updated marpa-gen to new sliced setup, and filled `list-plugins` in marpa::export::config. Next up, look into renaming packages for nicer structure. Start with exporters. (user: aku, tags: slice)
08:06
[444c10e2e4] Heal fork, complete. (user: aku, tags: slice)
08:05
[2175b86257] Closed-Leaf: Heal fork (user: aku, tags: slice-2)
08:04
[77883b0ffd] Split the remaining pieces into three packages: - C runtime - builtin parser (C runtime) - Low-level C wrapper for Tcl runtime foundation Updated tests to work again. More reshuffling. (user: aku, tags: slice-2)
03:34
[ae36822717] Fix missing requirements in the internal tool to re-create the builtin parser. (user: aku, tags: slice)
2017-10-04
23:49
[30d4d13ed3] Took Tcl runtime out of the tangle. Left tangled are the low-level C wrapper and the C runtime. Some shuffling of parts. Note: Needs Kettle commit [kettle:c0f0b90c04] (kt::local* addition, scan fix, @owns fix) to work. (user: aku, tags: slice)
22:14
[4843e825d1] Detangled precedence rewriting, and the exporters, mostly. Have places using an exporter where we need only part (gc formatting). Structure does not make for nice format/plugin discovery either. (user: aku, tags: slice)
06:18
[575ffcc030] Extricated SLIF semantics, and general literal handling. (user: aku, tags: slice)
05:20
[51f54f5f55] Extricated SLIF container implementation and low-level Tcl utilities from the tangle. (user: aku, tags: slice)
2017-10-03
23:34
[2437f3ebb9] Carved the lowlevel unicode support (table access, char classes, case-folding) out of the tangle and placed into its own package. (user: aku, tags: slice)
22:12
[0b47aba686] Extended `marpa-gen` with short options. (user: aku, tags: trunk)
22:06
[91d8cf0a47] Remove superfluous initializer. (user: aku, tags: trunk)
17:17
[bb205813ee] Removed generated qcs map, and code doing the generation. Made the information static in the `sem_tcl.c` glue. Made a few other functions static in there, renamed. (user: aku, tags: trunk)
05:46
[5c42aa1913] Pull completed base RTC work into mainline, and close. (user: aku, tags: trunk)
05:13
[b5b432d6c4] Closed-Leaf: Bootstrap step 2. Switched builtin slif parser to RTC-based implementation. Tests pass. (user: aku, tags: runtime-c)
00:01
[f0dd695aeb] rtc lexer fixes for single-value SVs. Which are single, and single-element lists are reduced to their element. (user: aku, tags: runtime-c)
2017-10-02
23:58
[7a41f6af9c] Separate lexer results by engine. Updated results. (user: aku, tags: runtime-c)
2017-09-30
22:43
[cd68d425f1] Tweaked clex exporter to match lexing-only mode of RTC, particularly in its use of the C-level result callback. Updated export test results to match. Added testsuite for rtc lexing-only mode matching tlex. Does __not__ pass the latter yet, i.e. tlex/clex differences (quoting in part, values in part). (user: aku, tags: runtime-c)
21:56
[9d189e9390] Commit [aa5c236ec3] was wrong. The file was used to find the export test cases. Tweaked the suites to now look for their result files instead of a separate flag file. (user: aku, tags: runtime-c)
21:06
[eb9b15eb66] Tweaked RTC to enable execution in a lexing-only mode. Triggered when initialized without a G1. (user: aku, tags: runtime-c)
21:05
[29220fb5fb] Run RTC generation through external `critcl` app. Using a separate process is the important point, preventing the differing parsers and lexers from interfering in memory (attempting to multiply define various custom arg/result types.) (user: aku, tags: runtime-c)
21:02
[ae9f6195e3] Fix argument assertions for set link at high end. `n` can and may reach `capacity`. (user: aku, tags: runtime-c)
2017-09-29
22:51
[80b087c9b9] Replaced poking into sv/vector internals with a proper API function, and updated users. (user: aku, tags: runtime-c)
19:32
[aa5c236ec3] Remove bogus result file (user: aku, tags: runtime-c)
18:44
[b4ce1a0d87] Moved requirement for marpa to places where it does not interfere with basic usage. Further changed default name for the output to be derived from the name of the grammar file. (user: aku, tags: runtime-c)
18:42
[e385825b01] Fix missing destruction of helper objects; interfered when used multiple times. (user: aku, tags: runtime-c)
18:41
[47a56e2331] Added tests for cparse and clex exporters. (user: aku, tags: runtime-c)
06:23
[9c90529961] New exporter, clex-critcl. Reduced RTC-based engine, L0 only. Will require modifications to RTC to allow operation without G1 / parser. Renamed the rtc-* exporters to cparse-* Tweaked marpa::fqn utility (varname argument instead of value) and started using it (slif::semantics, exporters). Trouble during the work below shows the need to force objects to FQN form immediately on entry into a method, a later conversion may go wrong depending on context (namespaced procedure, vs global procedure, vs global code). Moved the core testsuites into subdirectory `common`. Reworked and renamed the code essentially implementing a testsuite-specific variant of `bin/marpa-gen`. Updated all users. Factored code out of the zeta-lexer in preparation for use with `clex-*`. (user: aku, tags: runtime-c)
2017-09-28
07:51
[90c7278bc9] Renamed exporter for rtc-based parsers. Further moved common code out of various test suites into separate, shared files. (user: aku, tags: runtime-c)
2017-09-26
23:22
[cdd3d6fd00] Add a few diagrams showing coarse architecture (user: andreask, tags: runtime-c)
22:37
[8741cc1279] Moved testing of the slif semantics into a common core and added testing in conjunction with generated parsers. (user: andreask, tags: runtime-c)
2017-09-20
23:56
[48b47bd193] Bring rtc work to main line. Near-parity. TODO: Specify, generate, and test RTC lexer. Test RTC with semantics and containers. Add engine perf testing (user: aku, tags: trunk)
23:52
[5ee3764313] Pull RTC error generation and associated fixes into the main rtc work. (user: aku, tags: runtime-c)
23:50
[809750bb3e] Closed-Leaf: Fix constructor of parser generated by rtc-critcl. All tests pass. (user: aku, tags: rtc-scratch)
22:51
[efee7dadf9] Get the fix for the memory smash. Tweak to vector handling, do not allocate element array for empty vector. Factored vector expansion into helper function, and avoid iterated expansion. (user: aku, tags: rtc-scratch)
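The vector handling described in this entry (no element array for an empty vector, one expansion helper, no iterated expansion) might look roughly like the sketch below. The type `vec` and the functions `vec_reserve` and `vec_push` are hypothetical names chosen for illustration; they are not taken from the RTC sources.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical vector of opaque element pointers; not the RTC's real type. */
typedef struct {
    int    size;      /* number of elements in use */
    int    capacity;  /* number of allocated slots */
    void** data;      /* stays NULL while nothing has been stored */
} vec;

/* Make room for at least `need` elements. The target capacity is computed
 * up front, so the array is reallocated at most once per call. */
static void vec_reserve(vec* v, int need)
{
    int    cap;
    void** grown;

    assert(need >= 0);
    if (need <= v->capacity) return;      /* covers need == 0: no allocation */

    cap = v->capacity ? v->capacity : 4;
    while (cap < need) cap *= 2;          /* arithmetic only, no allocation */

    grown = (void**) realloc(v->data, (size_t) cap * sizeof(void*));
    assert(grown != NULL);
    v->data     = grown;
    v->capacity = cap;
}

/* Append one element, growing the vector through the shared helper. */
static void vec_push(vec* v, void* element)
{
    vec_reserve(v, v->size + 1);
    v->data[v->size++] = element;
}
```

An empty vector starts as `{0, 0, NULL}` and only acquires storage on the first push or on a reserve with a non-zero request.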
22:46
[8423e46502] Closed-Leaf: Fix memory smash. Code wrote to string[-1] when dealing with an empty vector. Added special-case code avoiding this. (user: aku, tags: rtc-fix-smash)
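The class of bug named in this entry can be illustrated with a small sketch. It assumes the smash came from trimming a trailing separator after rendering a vector into a string, which the commit message does not confirm; `join_ints` is a hypothetical helper, not the RTC's code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Render `n` integers as "1 2 3" into `out` (capacity `cap`).
 * Hypothetical helper for illustration only. */
static void join_ints(char* out, size_t cap, const int* items, int n)
{
    size_t len = 0;
    int    i;

    assert(cap > 0);
    out[0] = '\0';
    for (i = 0; i < n; i++) {
        /* Each element is followed by a single separator blank. */
        int w = snprintf(out + len, cap - len, "%d ", items[i]);
        assert(w > 0 && (size_t) w < cap - len);
        len += (size_t) w;
    }

    /* Special case for the empty vector: without this guard len is 0
     * and the statement below would write to out[-1]. */
    if (len == 0) return;

    out[len - 1] = '\0';   /* drop the trailing separator */
}
```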
22:42
[3b78c7b49c] Added lexer mismatch information to error message, updated expected AST information. (user: andreask, tags: rtc-scratch)
05:01
[e2fa82d9c9] Factored common parser testing into a single file driven by builtin and generated parsers (tcl, rtc). Allow for engine-specific results. Fixed expected error results for tcl to match the <>-bracketing added by [74bf29e7cc]. TODO: Complete errors generation for RTC, and fix up its results. TODO: Hunt the smash. (user: aku, tags: rtc-scratch)
2017-09-19
23:37
[74bf29e7cc] - Tweaked Tcl engine progress report (Now <>-bracketing the lhs symbol). - Tweaked char quoting for RTC, use octal for 127+. To match the chars/bytes in the progress report. - Reworked the progress report to properly align the columns. Uses SV data structures (string, vector) to hold the interim data. (user: andreask, tags: rtc-scratch)
23:36
[906570029c] Crash fix: Skip progress report if there is no lexical recognizer. (user: andreask, tags: rtc-scratch)
21:14
[e46c42dfa0] (Crashing in tests, seg.fault) Added L0 progress reporting to the error message. The differences in the reduced grammar used to operate the engines make it clear that we cannot exactly match their error output. While we can get close, we still need per-engine results here. Redo the zeta to be the tcl engine and then the main testsuite can be made to match whichever engine we use as builtin. (user: andreask, tags: rtc-scratch)
19:35
[f2a0be70b1] Get the latest trunk fix. (user: andreask, tags: rtc-scratch)
19:31
[22a19ba8aa] Tweaked char quoting in error message, and reworked to assemble via a DString. (user: andreask, tags: rtc-scratch)
14:56
[0cecbfe05b] Expanded and tweaked generation of error message. Disabled most of it, have to track down a memory smash. (user: aku, tags: rtc-scratch)
05:22
[d28460b501] Update test broken by the (:space:)-fix committed with [d00f6c9ec5], match results again. (user: aku, tags: trunk)
00:14
[6285fa4457] Draft work on generation of error messages by the RTC - In the Tcl glue code. Raw engines are on their own for now. (user: andreask, tags: rtc-scratch)
2017-09-18
22:14
[3a158ad5ae] Move the Tcl specific includes into a single header to allow easy replacement for other environments. TODO: Document the macros. TODO 2: Create a basic header for libc environments without tracing. (user: andreask, tags: runtime-c)
22:12
[f5e2dc153a] Remove rhs listification in progress reports. (user: andreask, tags: runtime-c)
21:25
[17e57b15d2] Moved the general completion processing out of the template into the RTC. (user: andreask, tags: runtime-c)
20:00
[08dbff3993] New testsuite to test the RTC engine using the SLIF grammar. Currently 29 fails, from a quick look all due to the missing generation of a proper error message. ASTs are ok. No crash (anymore, see sva dup fix [9e306213de]). (user: andreask, tags: runtime-c)
19:58
[815f5a7dcf] Fix argument name mismatch in RTC template, vs Tcl engine. (user: andreask, tags: runtime-c)
19:57
[9e306213de] Fix dup of empty vector, nothing to copy. (user: andreask, tags: runtime-c)
19:47
[690cffca93] Disable inadvertently committed tracing (user: andreask, tags: runtime-c)
2017-09-16
00:08
[6c632a56cb] Reworked the template for rtc-critcl to capture the parser SVs, via the new callback. The Tcl level sees the last SV captured. That is identical to how the Tcl engine behaves. A test run using the slif meta grammar produces a proper Tcl AST, no crash. Known limitation: No proper error handling/value yet. Even so, testing against the suite of grammar examples can commence, using an rtc-*.test analogous to the zeta-*.test (user: aku, tags: runtime-c)
2017-09-15
23:54
[138331137d] Extended rtc main with user-callback to handle the SVs generated at parser level. (user: aku, tags: runtime-c)
23:51
[02454030d1] Implemented conversion of rtc sv structures to Tcl_Obj's (user: aku, tags: runtime-c)
23:50
[472af77556] Moved shorthands for access to SV structures into header for use by other parts. (user: aku, tags: runtime-c)
21:59
[00aeda6759] Fixed issues with `marpatcl_rtc_sva_filter`, to wit: - Reordered testing of the possible cases, were in the wrong order. - Modified filter guard, prevent access to mask array beyond its size. - Added forgotten stepping of destination index in copy-down case. - Added nulling of the removed elements after their release. This enables the next. - Truncation of result now reduced to single assignment of new size (and fixed an off+1 on the size). Tracing tweaks in filter and parser: - Rephrased filter tracing output, added index information. - Further extended to dump basic vector content (in, out, intermediates). - Extended parser valuation traces to show filter masks. - Tweaked spacing for step-nulling to align output with the other types. (user: aku, tags: runtime-c)
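The filter fixes listed in this entry (guarded mask access, stepping the destination index, nulling released slots, truncation by a single size assignment) can be sketched as follows. This is not the real `marpatcl_rtc_sva_filter`: the type `sva`, the function `sva_filter`, and the assumption that the mask is an ascending list of indices to drop are all hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical vector of heap-allocated elements. */
typedef struct {
    int    size;
    void** data;
} sva;

/* Drop the elements whose indices appear in `mask` (ascending, `masklen`
 * entries) and copy the survivors down over the gaps. */
static void sva_filter(sva* v, int masklen, const int* mask)
{
    int src, dst = 0, m = 0;

    for (src = 0; src < v->size; src++) {
        /* Guard: test m against masklen before reading mask[m]. */
        if (m < masklen && mask[m] == src) {
            free(v->data[src]);       /* release the masked element ... */
            v->data[src] = NULL;      /* ... and null its slot */
            m++;
            continue;
        }
        if (dst != src) {             /* copy-down case */
            v->data[dst] = v->data[src];
            v->data[src] = NULL;
        }
        dst++;                        /* step the destination index */
    }

    /* Every slot at or beyond dst is NULL now, so truncation is a
     * single assignment of the new size. */
    v->size = dst;
}
```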
15:20
[7ccc133594] Bring valuation fix and associated crash fixes into the work. (user: aku, tags: runtime-c)
15:19
[614482ab63] Closed-Leaf: Previous commit is fix for issue with valuation. Fixing more issues now exposed: lexer/parser: Fix bad check for exhausted recognizer. lexer/parser: Extend field size for rule/token ids in trace. lexer/parser: Disabled marpa valuation tracer parser: Fix double destroy of the recce. sem value: Enable string generation for null references sv arrays: Fix index/size handling and assertions in copy. sv arrays: Handle nulls when free'ing instances. Known issue: The parser seems to insert NULL into the generated sv at some places. Suspicion is on the sva filter functionality and around that. To be investigated. (user: aku, tags: rtc-trial-val-fix)
2017-09-14
20:48
[12fecaf845] Added "marpa_g_force_valued" used in the Tcl runtime. Check if this fixes the issues with valuation. (user: andreask, tags: rtc-trial-val-fix)
20:46
[73c064c5ef] Comment fix (user: andreask, tags: runtime-c)
2017-08-29
20:02
[94417ce2bc] Implemented the byte range refactorization TODO (sharing definitions of overlaps between ranges). #setup instructions are moderately up, #rules strongly down. (No effect on the issue with value iteration). (user: aku, tags: runtime-c)
2017-08-28
21:44
[1ce7257d39] Added big todo about global byte range optimization to the rtc generator. (user: aku, tags: runtime-c)
21:09
[9f491209c6] Add statistics about rules and rule instructions to the rtc generator output and templates. (user: aku, tags: runtime-c)
20:39
[fa91643e0e] Tweak readable rule output from rtc generator core to show alternates better. (user: aku, tags: runtime-c)
20:15
[43951636ba] Added progress reporting to lexer and parser. Additional runtime data comes out of the grammar setup for this, i.e. mapping from rule ids to lhs and rhs symbols. Testing shows good output for both lexer and parser. Also tells us that our problem with the valuation retrieval is not based in bad recognizer operation. Rules get predicted, processed, and finalized as they should. (user: aku, tags: runtime-c)
20:12
[7c887af758] Added new tracing stream focussed on the gate input, i.e. the processed byte stream. (user: aku, tags: runtime-c)
20:10
[e9168bb09a] Extended the rtc generator core to provide readable rule info for the rule instructions. (user: aku, tags: runtime-c)
2017-08-24
16:24
[79e0734010] Get current state of the rtc work. Mainly to get all the tracing correct here in trunk. (user: aku, tags: trunk)
16:23
[24f27282be] Get fixes from trunk. (user: aku, tags: runtime-c)
16:12
[012110015d] Fix `marpa_` for functions of the binding. Should have been, now is `marpatcl_`. Intermixed some tracing changes from the `runtime-c` branch. (user: aku, tags: trunk)
2017-08-23
04:11
[1f3831b952] Replaced various ifdef trace conditionals with conditional trace macros. (user: aku, tags: runtime-c)
2017-08-22
22:36
[30f37e21a4] Moved old Tcl-level event handling support over to TRACE. (user: aku, tags: runtime-c)
22:36
[d43c89b0f8] Added basic event handling (just printing, via TRACE support). Seeing only e-exhausted from lexer. In line with the Tcl-level engine. (user: aku, tags: runtime-c)
21:08
[10d903f015] Add marpa tracing of valuation. Non-public function to activate. Unclear what parts of the value to print for trace steps. But they appear, which is something. (user: aku, tags: runtime-c)
19:41
[49a76c16bb] Disable all tracing. Need some pre-commit hook to prevent check-in of active tracing. (user: aku, tags: runtime-c)
19:03
[1d28a85391] Merge bugfix to `marpa::version`. (user: aku, tags: runtime-c)
19:03
[828be518d7] rtc critcl generator, template: tweak comments for clarity. Add debugging config commands. (user: aku, tags: runtime-c)
18:58
[16624fbdf9] Assert order, tree, value objects after creation (user: aku, tags: runtime-c)
18:57
[6d217aae8e] Move bocage tracer from fprintf over to TRACE. Assert order, tree, value objects after creation. (user: aku, tags: runtime-c)
2017-08-18
21:26
[8fd67a60d2] Fix marpa::version - remove superfluous arguments copied from check-version. And argument to the API function is an array of 3 ints, not a single int :( Stack smash fixed. (user: aku, tags: trunk)
16:25
[6af1654715] Fill out the tracing where not done yet. Tweak the trace messages somewhat (show types as cast where needed (pointers, mainly)) (user: aku, tags: runtime-c)
2017-08-17
04:17
[150f7ac23a] Another redo of tracing, adapting to the latest work in critcl. (user: aku, tags: runtime-c)
2017-08-04
04:33
[288de21b91] Added code to stringify semantic values, use it in tracing. (user: aku, tags: runtime-c)
2017-08-02
23:04
[848e47f0c7] Fix missing check post parser alternatives, fixed handling of lexeme length (get_lexeme is destructive). (user: aku, tags: runtime-c)
23:00
[19333843f4] Reworked the critcl tracing support and adapted RTC to it. (Support for file/tag based activation of trace streams). General tweaking of trace output (more and more symbolic information). As part of that get_parse() rewritten to a switch. Redone the tag checks in the SV code. (user: aku, tags: runtime-c)
22:43
[e0bfc6da4b] Keep D(isplay)Names as proper list, and fix mishandling of empty list (user: aku, tags: runtime-c)
07:39
[55d69982c2] Fix typo introduced by commit [f7d1fcaad9] (user: aku, tags: runtime-c)
06:54
[f440b08f4b] Pulled unicode CC fix (class :space:) into the RTC work (user: aku, tags: runtime-c)
06:53
[5af0029e73] Extended tracing, print symbol names in various places. (user: aku, tags: runtime-c)
06:51
[17de50b73c] Tweak and extend tracing (print names of symbols accepted in gate) (user: aku, tags: runtime-c)
06:50
[051e2c0880] Extended bytesets with `size()` accessor function. (user: aku, tags: runtime-c)
06:49
[deaf6d68d4] Fix indentation handling for rule code. (user: aku, tags: runtime-c)
06:41
[d00f6c9ec5] Fix definition of CC `space`, match Tcl `string is space`. Found while working with RTC and a test not recognizing \n as a character in [[:space:]], contrary to Tcl `regexp`. (user: aku, tags: trunk)
02:59
[1e9c44d234] Switched to the enhanced critcl::literals, critcl::emap packages, with the ability to provide C-level access to pools and mappings (pre critcl 3.1.17 work). Updated the marpa pool and map definitions to provide Tcl and C level access. Added C-level tracing to lexer. Added missing grammar precompute. Added better libmarpa checking. Fixed handling of the accept symset (sync'd with generator core): - Store terminal symbols (and pseudo-terminals for the discards) - Convert to ACS on entry (marpa_r_alternative) into the recce. - Capacity limited to lexemes + discards now. - Original scheme would need 256 never-used entries for the byte symbols. Fixed mishandling of token extraction in `get_parse`. (user: aku, tags: runtime-c)
2017-07-31
19:24
[c495a7b25d] Added C-level tracing to lexer. Added missing grammar precompute. Added better libmarpa checking. Fixed handling of the accept symset (sync'd with generator core): - Store terminal symbols (and pseudo-terminals for the discards) - Convert to ACS on entry (marpa_r_alternative) into the recce. - Capacity limited to lexemes + discards now. - Original scheme would need 256 never-used entries for the byte symbols. (user: aku, tags: runtime-c)
18:07
[812b8db505] Added C-level tracing to parser and fixed various issues: - Added missing grammar `precompute`. - Added missing recognizer `start_input`. - Added better libmarpa checking. Similarly for the overall rtc object, added C-level tracing, added missing init of `fail` manager and fixed ordering of init for `store` manager. (user: aku, tags: runtime-c)
17:51
[40dbbbd2c8] More C-level tracing, added to `byteset`, `fail`, `gate`, `inbound`, and `symset`. Further extended the `fail` manager with a function to check libmarpa results (TODO: proper texts for the error codes). Used this to extend the `gate` with better libmarpa checking. (user: aku, tags: runtime-c)
01:32
[fce56a94ff] At this point the output from the rtc/critcl generator compiles and loads ok, with functional constructor/destructor methods. (user: aku, tags: runtime-c)
2017-07-30
23:13
[89bfe909c6] Added assertions and traces to rtc grammar setup. Fixed bug in brange handling. (user: aku, tags: runtime-c)
23:12
[7c0e018f80] Oops. Added new generator to overall package. (user: aku, tags: runtime-c)
23:11
[fbd2f11257] Added rtc/critcl generator. Reworked the array emission once more, expose prefix as core configuration parameter for the two generators using it. (user: aku, tags: runtime-c)
2017-07-28
08:37
[268723c107] Redid array formatting support code for more control over the formatting of chunked arrays. Fixed the quirks of the previous tweak. (user: aku, tags: runtime-c)
07:32
[cc5a8891c3] Removed inadvertently committed debug output. (user: aku, tags: runtime-c)
07:30
[7d49a0ee9e] Split the RTC generator into core and generator with asset, same as the others. Tweaked array formatting. Have to work out some quirks introduced by the tweak. (user: aku, tags: runtime-c)
2017-07-27
19:44
[bf01f1bdd9] Updated C runtime work with trunk work (generator refactoring) (user: aku, tags: runtime-c)
19:42
[2af5734d2d] Refactored the tparse and tlex generators. Moved the core operation (essentially identical in both) into a separate package. The generators now only invoke that core to obtain the configuration they then insert into their template. Their difference now only is in the templates they carry. (user: aku, tags: trunk)
17:33
[3dc8b80196] Fix comment typo in bocage Extended rtc generator to check limits imposed by the data structures of spec.h Extended further to handle RHS masking in G1 rules And further to generate a map from rules to lhs symbols Extended the spec structures for this as well. Fixed bug in the array chunker for empty labels (missed prefix, prevented line breaks and dropped separators). Fixed issues in the _S_PER_ generator and coder. Small optimization for rules without data (i.e. no mask, no semantics). Added decoder to spec functions. Tweaked rtc lexer semantics handling (using macros to capture the memoization and make it look more table-like). Used same structure in rtc parser semantic handling. Filled in parser forest handling and tree execution. Added descriptor codes for `end` and `g1end` - Backend support, no frontend support (yet) General fixes. Moved location of code performing G1 token entry somewhat, i.e shift in the lexer/parser API. Added supporting sv(a) functions. Tweaked SV functions. (user: aku, tags: runtime-c)
2017-07-25
16:37
[cab135cf5b] Added fail handling and semantic store. More filling out of lexer re lexeme completion and semantics. Started to fill in parser, and semantics at that level. (user: aku, tags: runtime-c)
16:32
[cf0dffbe55] Tweaked string SVs. copy flag -> own flag. No copying, indicator of ownership -> auto-free or not (user: aku, tags: runtime-c)
16:31
[1692e7393b] Added helpers for string pool access. (user: aku, tags: runtime-c)
16:29
[e98d51c252] Block rtc export until we have limit checks added (user: aku, tags: runtime-c)
2017-07-24
16:40
[edd2833b03] Brought prettification back to work branch for the C engine (user: aku, tags: runtime-c)
16:38
[8b2be68de7] Closed-Leaf: Prettification completed, everything matches, exporter adapted. (user: aku, tags: scratch-2)
2017-07-22
00:28
[c96284ed90] More prettification. (user: andreask, tags: scratch-2)
2017-07-21
21:00
[12641c849b] Cleanup of supporting low-level data structures (byte sets, symbol sets, integer stacks). -- Naming, code structure. TODO: Create critcl interfaces to these, for testing. NOTE: These might also be useful to boost the performance of the Tcl runtime. (user: andreask, tags: scratch-2)
19:53
[31a9980c8b] Merged the differing scratch-work (user: andreask, tags: scratch-2)
05:39
[42aa6c16f4] scratch-2 (user: aku, tags: scratch-2)
05:37
[5be3a63e1c] More compression for the grammar structures: Split length information from string pointers. Removed latter, added string offset in a single large pool string. Boxed the start/stop information byte ranges into a single entry. Added string transform to the pool handler, applied to the lex symbols. Drops all the @-prefixed tags, adds dashes and []-bracketing to ranges and char classes. This reduces the string pool further. Fixed bug in the names for array semantic tags. (user: aku, tags: runtime-c)
00:09
[1353b0033f] Closed-Leaf: scratch (user: andreask, tags: scratch)
2017-07-20
16:03
[b2de58d181] Added a generator targeting the C runtime (RTC). Extended the configuration utility package with the ability to query keys directly. This is needed by the RTC generator to get access to the grammar name, from which it will derive a C identifier (prefix) for the generated structures. Lots of changes to the C grammar specification structures (spec.h) and engine setup (spec.c). Still not complete. Masking (Hiding RHS elements from the semantics) is missing. Revamped the bytecode system used to encode the grammar rules a bit, for smaller structures. Byte ranges are now stored as specified, and expanded at setup-time (into their set of priority rules, alternations of bytes). Similarly limited the size of various things to shorts (16bit). Still more things possible: Separate string lengths from the pointers in the pool, i.e. 2 separate arrays instead of one array using a structure with 6 bytes of padding. Rework pool generation to reduce the amount and size of strings (strip various tags from symbol names, making them shorter, and causing more duplicates, i.e. fewer strings to keep around). (user: aku, tags: runtime-c)
2017-07-14
22:31
[0c5f1333fc] Filling out the gaps. The basic skeleton is now present, modeled like the Tcl runtime (user: aku, tags: runtime-c)
07:51
[84db43e656] Start on the C runtime. (user: aku, tags: runtime-c)
2017-07-13
23:34
[5a1350dba9] More optimizations in the Tcl runtime. The sub lexer structures (inbound, gate, lexer internals) use only simple semantic values (location, char+location). These can be transferred directly, there is no need to go through a store, and every reason to avoid it (lots of per-character method calls to put, get, box and unbox the values). The API changes required testsuite updates. Higher-level output was not changed, and passed. Fixed a bug in the SV handling of the lexer (KnownValue) exposed by the change. Benchmarking using `remeta` (which parses the meta grammar) found an improvement of about 7.5% in bytes read per second. This was calculated by comparing the max speed taken over one thousand runs, with and without going through the store. Further possibilities to consider: - Merging `inbound` into `gate` into `lexer`. (reduce cross-instance method calls) (reduce method calls through inlining). - Read larger chunks from files, channels. (Currently 1K chunks, what about 4K, 8K ?) - Move pieces into C. This can inform a full-C runtime, or even be part of it. (user: aku, tags: trunk)
19:27
[98c21ecd1c] Cleanup: Moved internal doc notes out of the directory for future official docs. (user: aku, tags: trunk)
18:55
[13f7fa67a6] Added caches to the classes `gate` and `lexer`, to boost the performance of method `acceptable`. This method converts from sets of upstream symbols (parser for lexer, lexer for gate) to the sets of local symbols to use in gating. The new caches remember all incoming sets and their conversions. When sets recur the conversion is taken from the cache instead of computed. Benchmarking using `remeta` (which parses the meta grammar) found an improvement of about 2.3% in bytes read per second. This was calculated by comparing the max speed taken over one thousand runs, with and without caching.
<pre>
./build.tcl uninstall install
rm TRACE
for i in $(seq 0 999)
do
    ./bootstrap/remeta X | grep done | tee -a TRACE
done
cat TRACE | projection 7 | maxcol
</pre>
(user: aku, tags: trunk)
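The caching idea in this entry, remembering each incoming symbol set together with its converted form, can be sketched in C as below. The actual caches live in the Tcl classes `gate` and `lexer`; this sketch is only an illustration under several assumptions: symbol sets are modelled as 32-bit bitmasks, the mapping table `local_of` is invented, and the cache has a fixed size, whereas the commit says the real caches remember all incoming sets.

```c
#include <assert.h>
#include <stdint.h>

#define NSYM  8    /* hypothetical number of upstream symbols */
#define SLOTS 64   /* cache slots, must be a power of two */

/* Hypothetical mapping: upstream symbol id -> bitmask of local symbols. */
static const uint32_t local_of[NSYM] = {
    0x01, 0x02, 0x06, 0x08, 0x10, 0x30, 0x40, 0x80
};

typedef struct { int used; uint32_t key, value; } slot;
typedef struct { slot slots[SLOTS]; int conversions; } cache;

/* The uncached conversion: union of the local sets of all members. */
static uint32_t convert(uint32_t upstream)
{
    uint32_t out = 0;
    int s;
    for (s = 0; s < NSYM; s++) {
        if (upstream & (1u << s)) out |= local_of[s];
    }
    return out;
}

/* Cached conversion: a recurring set is answered from the table. */
static uint32_t acceptable(cache* c, uint32_t upstream)
{
    uint32_t i = (upstream * 2654435761u) & (SLOTS - 1);
    int probes;

    for (probes = 0; probes < SLOTS; probes++, i = (i + 1) & (SLOTS - 1)) {
        slot* s = &c->slots[i];
        if (s->used && s->key == upstream) return s->value;   /* hit */
        if (!s->used) {                                       /* miss */
            s->used  = 1;
            s->key   = upstream;
            s->value = convert(upstream);
            c->conversions++;
            return s->value;
        }
    }
    c->conversions++;
    return convert(upstream);   /* table full: compute without caching */
}
```

The `conversions` counter exists only to make the effect observable: a repeated call with the same set leaves it unchanged.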
2017-07-12
22:48
[af6c661e9d] Trajectory and goal: Replace the manually-created boot parser with a generated parser, based on the meta grammar. That parser should regenerate itself, and pass the testsuite. - Fixed issue in the gate which prevented the very character causing trouble from being shown in error messages. The previous character (= last ok) was shown instead. - Extended gate, lexer, parser to provide more information on lexing/parsing failure (progress reports, pre-mismatch stream for current attempt, symbol maps, ...) - Modified the aggregated parser runtime (rt_parse) to use the new information when generating error messages. - Extended boot-parser, container and slif meta-grammar to support the naming of quantified rules. - Extended the export configuration to distinguish grammar (= package name) from debug tags. Latter is derived from the name. - Tweaked comments in the templates for generated lexers and parsers. - Extended the parser generator to insert action and rule names into its output. - Reworked sorting of rules in the parser generator, to prevent the special rules setting actions and names from being torn away from the rules they apply to. - -->> Goal reached <<-- Replaced the manual slif parser with parser generated from the slif meta-grammar. - -->> Milestone achieved <<-- - Fixed bugs and issues in the SLIF metagrammar - TAB is allowed in strings, without escape. - Added proper naming of the rules to match the boot-parser and existing semantics. - Updated the testsuites. - Created apps: - Regeneration of the boot-parser after changes to the slif. - A very basic parser generator. (user: aku, tags: trunk)
2017-07-11
20:30
[536c5e2eb0] Trajectory and goal: Replace the manually-created boot parser with a generated parser, based on the meta grammar. That parser should regenerate itself, and pass the testsuite. - Modified the low-level `parser` engine, split the non-standard rule ordinals placed into names out, as a new custom array action part (`ord`). Further added support for action and name definition through rules. - Modified the handler for builtin semantics to support `ord`. - Modified the SLIF container to support `ord` in the `action` attribute of g1 rules. - Modified `rt_parse` and boot parser, placed the ordinals back in as a regular part of the action, and moved the definition of the semantics out of the high-level runtime into parser definition, as action definition (See new support in the low-level `parser` engine). - Extended the `tparse` exporter to put action and name information into the rules of the generated parser. - Modified the meta grammar to match the boot parser with respect to `ord`. - Updated all dependent tests, and test support (AST formatting now sees the ordinals, and reconstructs the old form of output). (user: aku, tags: trunk)
2017-07-10
22:09
[f7d1fcaad9] Removed GetString, GetSymbol sem core instances from the lexer. Replaced by code using direct access to parse tree and engine state information to generate the semantic value. Updated test suite. Changes in the output for rule. The old output was a bug in the GetString setup, reporting the internal `@START ~ foo` rule, instead of whichever rule for `foo` was matched. General changes in the lexer output, it now reports separate semantic values for all matched symbols. That was also a bug in the previous setup. It worked only because the runtime was geared towards (start,llength,value), which is identical across symbols. With the proper reporting for other keys, like 'rule', this is not the case anymore. The parser class was updated to match this API change. This change is further reflected in the expected results of the tests for lhs, name, rule, and symbol; the testsuite is updated. Some optimization is still done, i.e. it is made sure that identical textual semantic values are stored under the same id as well, i.e. only once. (user: aku, tags: trunk)
2017-07-07
23:48
[ee9edafcbb] Added testing of the lexer pipeline, specifically the handling of the various semantic actions. NOTE: The result for g1start is a known issue with the current semcore/semstd. This is in preparation for the replacement of GetString with a more direct conversion. Some fixes in the existing machinery to have good results. Some bits of the new machinery already present, untested, and incomplete. (user: aku, tags: trunk)
03:57
[e0ce2184fc] Simplified lexer core, replaced GetSymbol handler with simpler direct conversion from parse tree. Updated testsuite. Added some docs about lexeme semantics (user: aku, tags: trunk)
00:11
[37728b785b] Tweaks to the lexer-internal documentation. (user: aku, tags: trunk)
2017-07-06
08:33
[65180c1ee3] Reworked the meta grammar. Removed the ambiguities with exponential blowup of forest size due to a naive specification of string and char class lexemes (vari-length octal and unicode escape sequences). More details in the modified grammar. Updated all dependent files used by the test suites. (user: aku, tags: trunk)
07:50
[ed2fc65f24] Fix a blowup in the boot-parser's handling of posix char classes. Check slif meta grammar for same. (user: aku, tags: trunk)
2017-07-05
22:04
[87701775b3] Moved progress reporting into a mixin (engine_debug.tcl). Added code for the reporting of marpa parse trees (valuation instruction list). Added stream reporting to gate, lexer, and parser. Added progress and forest reporting to lexer and parser. Added capture of complete rule information in the base engine, for the progress report (Semantics might need this as well). Updated the semantics core to match the changes in the engine. Reworked lexer internals with regard to LATM, LTM lexemes, and discards. All of them get an ACS symbol. This means there is a 1:1 mapping between lexeme and discard rules, and the ACS symbols. The difference between LATM and LTM + discard is that the latter group is always acceptable/activated. Note, this means that the first instruction of any parse tree is the token for the ACS symbol, which gives us the matched rule. We do not need the entire parse tree, nor do we have to actually evaluate it. This will allow us to replace the GetSymbol and GetString semantics with simpler code. (user: aku, tags: trunk)
21:45
[fb088da186] When generating classes write the codepoints 1 to 7 as full octal literals. This prevents misinterpretation as back-references by the gate regexp handling these char classes. (user: aku, tags: trunk)
21:41
[566a3da2e2] Ignore comments from the semantics. (user: aku, tags: trunk)
21:40
[8dc5fb4ba2] Fixed bug in boot parser present since beginning: Allow TAB in comments. Dropped duplication of LF. (user: aku, tags: trunk)
21:37
[004994a2f5] Extended error message for bad range (start after end) (user: aku, tags: trunk)
2017-06-26
22:21
[c1b1d00b8e] Removed support for the `bless` adverb from the boot-parser and following stages. Updated tests. Left the cases using bless in the suite, for the parse errors they now cause. (user: aku, tags: trunk)
18:17
[7fbda13fbc] Fixes - Add missing 'create' method in the templates - tparse, tlex backends: - Initialize rules lists - Discards are symbols too - Do not listify the values of character and lexeme maps. The latter is a bool, operation superfluous. The characters are Tcl encoded (char quote tcl) and listifying breaks that. - Sort the words of a list before generating output (All parts are order-independent, order is for human readability and easier search). - tparse backend: - Fix mask generation. Semantics and backend use different data structures for the same information: List of bools over entire rhs, vs list of indices to hide. Updated tests (user: aku, tags: trunk)
2017-06-23
05:18
[225afc2f9d] Created new engine assemblies for lexer-only and parser, using the boot_parser as starting point (tlex, tparse). These are the high-level runtimes assembling the low-level parts into a complete lexer/parser, configurable with grammar information. Created generator backends targeting these two runtimes. Their output are derived classes providing the grammar information missing in the runtime itself. Added testsuites. The APIs of the gate and lexer classes are modified (gate: def, lexer: latm (removed), export) to better suit the SLIF features. For basic testing of the new runtime the boot_parser has been rewritten to target it. The many incremental steps to keep everything running and testable during the rewrite are left out, this commit is an atomic switch-over. Test suite passes. Attention: The lexers and parsers generated by these backends are not tested yet, only the parse runtime (see above). The template classes are gone. The templates are now stored with the generator classes (attached to the file after the code, see new `marpa::asset` utility function for explanations). (user: aku, tags: trunk)
2017-06-21
05:32
[7773074adb] New backend: TLex. Dumps the lexer part of a grammar as the derived class from a tcl-based lexer engine. Extended the grammar container classes with the necessary accessor methods. Note: The engine base class for this still has to be written. Use the bootstrap_parser as the foundational example. (user: aku, tags: trunk)
2017-06-20
04:40
[ac35603331] Added first generator backends: Writing a derived GC container class which loads the grammar on construction. Tweaks to a few serialize methods to strip unwanted string rep, i.e. whitespace/formatting. (user: aku, tags: trunk)
2017-06-16
05:56
[34b21da2aa] Reworked "marpa_scr_rep_from_any" (charclass intrep creation (SCR)) to avoid unnecessary shimmering. The Tcl_ObjType*'s needed for the direct type checks are determined during package context setup, taken from transient Tcl_Obj* created for just that purpose. Switched back to BMP for the supported range, FULL was incorrectly checked in after testing. (user: aku, tags: trunk)
05:15
[0bc6a8787c] Make the semantics work official, pulled into main line. (user: aku, tags: trunk)
05:12
[2b2cf63ab6] Closed-Leaf: Now taking the differences between full and bmp(restricted) unicode support into account. Where needed tests now have separate results, one per supported range/mode. Moved the invocation of the generator tool out of the main marpa file into a utility file. The main marpa file is now a bit more readable. The generated files are now keyed by the range they are for. (user: aku, tags: new-slif-semantics)
2017-06-15
22:01
[63aaa63395] Account for differences between 8.5 and 8.6 (wrong#args messages for special `args`, and handling of x-sequences in input (8.6 properly takes only 2 hex characters, 8.5 takes the last 2 of any hex sequence at least 2 characters long)). (user: aku, tags: new-slif-semantics)
19:21
[63afd69b0b] Enhance robustness: Create the possibly missing directory for generated files before invoking the generator. (user: aku, tags: new-slif-semantics)
06:59
[3853490d5b] Pulled the performance work on char classes and ASBRs into the main development. Passes the tests. No crashes for the entire suite. The new C implementation of <code>2asbr</code> is between 15 and 1500 times faster than the Tcl implementation (i.e. between 1 and 3 orders of magnitude). See below for the known numbers (output from bench/asbr.bench for both branches and manually merged into a single table).
<verbatim>
Range       :: Trunk/Tcl     :: Perf/C     :: Factor Tcl/C
{0}         :: 169.013       :: 10.683     :: 15.8207
{0 9}       :: 1506.600      :: 12.931     :: 116.5107
{0 99}      :: 14781.683     :: 35.007     :: 422.2494
{0 999}     :: 242996.860    :: 274.940    :: 883.8178
{0 9999}    :: 3233832.400   :: 2693.000   :: 1200.8290
{0 99999}   :: 36660675.000  :: 26660.000  :: 1375.1191
{0 999999}  :: 432152848.000 :: 277217.000 :: 1558.8974
{0 1114111} :: 484810715.000 :: 308173.000 :: 1573.1771
</verbatim>
(user: aku, tags: new-slif-semantics)
00:14
[0fd6b229ab] Closed-Leaf: Cleanup of ignore patterns. Workspace and checkout are fully separated. Tcl_ObjTypes for char classes and ASBRs - Fully traced, dump functions. - Fixed memory smash in creation of string rep (moved functionality as macro into critcl::cutil) Now getting UNI_MAX out of newly generated C decl file Simplified adding range to an SCR. - Not extending anymore - Assert and fail when going over the allocated capacity - Reason: All places using it calculated the needed cap beforehand and allocated as much. Thus extending is not required and bounds-checking is good enough. Removed the Tcl implementation of 2asbr, norm- and negate-class. Updated the testsuite. (user: aku, tags: asbr-perf)
2017-06-14
08:00
[7009d4769d] Extended unidata tool to generate its 1st C decls (mode, max). Updated main file to use this. Note that the generated files are now placed into a separate directory just for them. This will be ignored by fossil (user: aku, tags: asbr-perf)
08:00
[75a243e45e] Drop the local helper script for testing from the repo. (user: aku, tags: asbr-perf)
07:55
[9d9ef26c45] Tcl 8.6 support. The `try` package is only required by 8.5 (user: aku, tags: asbr-perf)
2017-06-09
18:00
[e3a30b4075] Ignore more build artifacts. (user: aku, tags: asbr-perf)
08:14
[4beca374ec] Fixed generation of ranges in the creation of string rep for ASBR (user: aku, tags: asbr-perf)
08:13
[bfaa6cee8f] Moved CC validation into helper function. Better and fixed error messages. (user: aku, tags: asbr-perf)
08:12
[967653c919] Fix handling of bmp/full in the unicode tests. Different set of named classes, and skip over tests for full range when configured for bmp. (user: aku, tags: asbr-perf)
08:10
[9e2b1b63d7] Divorced unidata tool from marpa itself. Only place with a Tcl-implementation of 2asbr and related commands now. Slow, but only once, during build setup. And not limited by an installed marpa's range of supported codepoints. (user: aku, tags: asbr-perf)
2017-06-08
03:46
[9ac3318345] Fixed issues in low-level norm and negate. Added tracing, nicer assertions. Disabled Tcl-level implementations. Started on ASBR compile. Incomplete, broken, seg.faulting. (user: aku, tags: asbr-perf)
03:44
[81100a5bf4] Added use of critcl c-utilities. Requires unreleased critcl head. Removed beginnings of local c-utils. Added the ASBR obj type draft. (user: aku, tags: asbr-perf)
03:43
[fc4cb59f20] Put test installation into explicit directory. This prevents auto-removal by kettle. Keeping it is necessary for good C-level debugging (symbols). (user: aku, tags: asbr-perf)
2017-06-06
18:17
[81bbe98e8f] Brought up to date with main working branch (user: aku, tags: asbr-perf)
2017-05-25
18:14
[832991391f] Dropped local "dict-sort". Using "kt dictsort" instead. (user: aku, tags: new-slif-semantics)
18:13
[b6b6d9fa28] Another whitespace change (user: aku, tags: new-slif-semantics)
2017-05-21
02:53
[a54dd5d2eb] The main directory accumulated files to the point where it became cluttered. Restructured, moved each of the various groups of files into their own directory. (user: aku, tags: new-slif-semantics)
01:57
[a4e5621bbf] Replaced fixed slif meta grammar under tests/grammars/ with reference to new slif meta grammar under bootstrap. The new grammar removes the "bless" adverb. It further uses a redone syntax for strings and char classes which encodes more of Tcl's constraints into the syntax. Modified the semantics to handle the possibility of <single quoted name> using Tcl backslash-escapes, as now allowed by the new meta grammar. While this has not yet arrived in the grammar used by "s_parser.tcl", it may. Cleaned the bootstrap/ directory up a bit. (Still have to check and compare the various old meta grammars) (user: aku, tags: new-slif-semantics)
2017-05-19
06:20
[74492e5609] Whitespace cleanup. Squashed irrelevant trailing whitespace in code and test. (user: aku, tags: new-slif-semantics)
06:13
[baddf989f3] Implemented rewrite of rules with precedence into sets of rules without, with precedence and associativity encoded in the structure of the new rules. Extended container and subordinates with methods to support the transformation (retrieval of the necessary information, per-symbol and general). Updated and extended testsuite. (user: aku, tags: new-slif-semantics)
2017-05-18
07:21
[3627fce321] Extended validation in semantics and containers, prevent recursive quantified rules. Added tests. (user: aku, tags: new-slif-semantics)
06:40
[a7630b86a8] Typo fixes in docs for precedence rewrite. (user: aku, tags: new-slif-semantics)
2017-05-17
20:02
[bc099e52e4] Implemented basic container validation. As part of that added some plumbing so that the objects internal to the container have a reference to their collection (container, or grammar, depending on context). Needed as some validations have to reach from the location performing them to other parts of the system to get associated data. Extended the testsuites to invoke the validator (result of ctrace -> gcstate, ser/des, reduction results). Pleased that all of them are good. Related things, to do: Checks for inaccessible symbols, non-generating symbols (not reaching terminals/atoms), removal of same. Further to do, rewriting of precedenced priority rules to non-precedenced form. (user: aku, tags: new-slif-semantics)
05:47
[5bcb523050] Modified reduction to merge identical literals under different symbols into a single definition. Extended and modified the API to pass the merging information up so that users of the removed definitions can be rewritten to refer to the new unified forms. Extended the grammar container and supporting classes to bear the main burden of the rewrite (See method "fixup"). Updated test results. (user: aku, tags: new-slif-semantics)
05:43
[2db8657abe] Fix lexeme rule generation where a single-element RHS was not constructed as a proper list. Updated test results. (user: aku, tags: new-slif-semantics)
2017-05-16
07:13
[0e9dd3b6ed] Reduction choices for a C backend based on bytes and byte-ranges. Fixes for problems missed so far. Extended testsuite. Added the gcr_c result files. Optimizations for character and ASBR reduction to prevent rule chains. (user: aku, tags: new-slif-semantics)
2017-05-12
20:45
[5558558738] Fixes to normalizer: * Added bytes and byte ranges. Updated doc/atom.md. Fixes to reducer: * During a redo of the implementation of rstate the invocation of the normalizer on new pieces was inadvertently dropped. Added it back. * Added the missed handling of %namedclass'es to the Tcl specific reduction rules. Factored the common code into a separate proc (CC-TCL). Fixes to grammar container * Added missing method `remove`. Fixes to the container testsuite (reduce-tcl): * Skip over grammars without L0, or no literals. * With `grammar` unpacking done early not needed for getting the literals. * Varname typo * Added the gcr_tcl result files (user: aku, tags: new-slif-semantics)
08:01
[638a44dd41] C-level implementations of negate- and norm-class. Exposes trouble in my thoughts about SCR, OTSCR and ref-counting. To redo. (user: aku, tags: asbr-perf)
07:49
[759de51b4b] Continued Tcl_ObjType, public constructor and accessor, plus cproc glue. Compiles without error. !Untested! (user: aku, tags: asbr-perf)
07:36
[0c5716a4bc] Added Tcl_ObjType for SCR-based char classes. Compiles without error. !Untested! (user: aku, tags: asbr-perf)
03:59
[2f3cf157f0] Implemented SCR functions. Compiles without error. !Untested! (user: aku, tags: asbr-perf)
00:31
[54eaf08a6a] For more perf started on C-ifying charclass-related functionality. (user: andreask, tags: asbr-perf)
2017-05-11
21:52
[f9ea41e0f0] Removed perf output from the branch. Unwanted here. (user: aku, tags: new-slif-semantics)
21:41
[18fa980ae2] Integrate the perf-work for the ASBR compiler. (user: aku, tags: new-slif-semantics)
21:28
[efdb819361] Rewrite of the ASBR compiler. Replaced use of TclOO with plain procedures directly working on Tcl lists. Speedup ~ 2.5-2.9 (user: andreask, tags: asbr-perf)
16:12
[1da8aac13b] Create new branch named "asbr-perf" (user: andreask, tags: asbr-perf)
06:15
[fba7f2ccdc] Implemented grammar container deserialization (bulk load). Fixed issue in serialization (empty container) Extended testsuite. (user: aku, tags: new-slif-semantics)
04:42
[04381efeca] Updated reduction rules. Implementation of new rules. Extended testsuite. More shortcuts in the structures specifying the expected results. Plus ability to reference larger blocks of content via file names. Also added tests for the internal ccranges proc. (user: aku, tags: new-slif-semantics)
04:29
[7bea4eb850] Slight tweaks to the testsuite support. (user: aku, tags: new-slif-semantics)
04:28
[7580c557d4] Added more unicode accessors: - Check support for a named class - Retrieve list of supported names Variants for both Tcl and Marpa Unicode Data. (user: aku, tags: new-slif-semantics)
2017-05-10
07:50
[393184ff2c] Extended literal handling with normalization phase 2, reduction of complex constructs into simpler pieces glued back together via priority rules. Extended testsuite. (user: aku, tags: new-slif-semantics)
07:46
[8b68b34fc3] Reworked the lowlevel framework for the slif testsuites. Moved the text utilities into their own file and made them more modular. (user: aku, tags: new-slif-semantics)
07:46
[ad5174fc03] Integrated the full/bmp choice into the main marpa code. (user: aku, tags: new-slif-semantics)
07:42
[c907ffa182] Extended unicode tables with information about the supported range. Modified the tooling to allow generation of full and bmp-limited tables. Implemented class negation. Reworked the ASBR and GRAMMAR accessors to resolve byte range references in the result before returning it. Removed the accessors for byte ranges. These are now an implementation detail hidden from the user. Updated testsuites. (user: aku, tags: new-slif-semantics)
2017-04-26
19:27
[69b1ee0e4d] Updated container tests. Another API change (l0 atom -> l0 literal), with corresponding updates to tests. Everything passes again. Next, deconstruction of literals, i.e. normalization phase II. (user: aku, tags: new-slif-semantics)
08:32
[e596035c8a] Integrated new literal handling into semantics. Updated tests. Container API change, breaks container tests. To fix next. (user: aku, tags: new-slif-semantics)
05:45
[ab2cbccd88] 3rd time the charm? Another redo of literal handling. Not yet integrated with semantics. Full tests. Some add-ons in the unicode support to help, with tests. (user: aku, tags: new-slif-semantics)
2017-04-23
04:24
[9eed4cb141] Dropped derived atom classes, replaced with an extended atom base class storing atom type and details. Updated all users. Tests unchanged, and passing. (user: aku, tags: new-slif-semantics)
02:32
[1fc5f75909] Implemented negated char-classes (Finally). Extended (literal-lowering) and updated (meta-grammar) tests. Added missing pieces for various other types of atoms. Noted: * The number of classes for atoms is getting a bit much, especially as they are essentially all the same, just the data stored in them differs per their type. Yet even that is not truly visible in the signatures either. Therefore next: * Collapse the various atom classes into a single class which stores data __and__ type. * Further, write up the rules for the various atoms, their specification details, and the transformation rules to not lose sight during implementation. (user: aku, tags: new-slif-semantics)
2017-04-21
21:04
[d80850b630] Added tests for cornercases in char classes, and the basic compression. (user: aku, tags: new-slif-semantics)
19:01
[ca1a659087] Added basic string deconstruction. Updated tests to match. (user: aku, tags: new-slif-semantics)
07:15
[58d0aa2893] Small simplification in state retrieval, disabled debug code. (user: aku, tags: new-slif-semantics)
06:29
[7d929144bf] Fixed lurking bug in use of the symbol state engine. When creating a :lexeme (or :discard) the literal's symbol has to be recorded with a l0-usage at that point. Forgetting that is no problem if the symbol is for an actual literal, its state is literal either way. However with the deconstruction of literals into simpler pieces such symbols may stand for priority rules as well. For them, with the l0-usage forgotten they look like toplevel L0 symbols, i.e. lexemes, except that the engine also knows that they were not used in G1, thus reports the situation as error. (user: aku, tags: new-slif-semantics)
2017-04-20
23:28
[f81a58af27] Added structuring comments and indentation to test "literal-lowering" (user: aku, tags: new-slif-semantics)
23:07
[5059e70410] Reduced comments generated by literal handling. Tests updated to match. (user: aku, tags: new-slif-semantics)
22:22
[b50e77e610] Added specific test demonstrating how literals are simplified. (user: aku, tags: new-slif-semantics)
22:00
[2292cecda4] Fix indentation nit. (user: aku, tags: new-slif-semantics)
19:18
[85cd4f4c54] First simplification of literals. Single-element strings to characters, ditto for charclass whose element is a character. Updated tests to match. (user: aku, tags: new-slif-semantics)
06:25
[f2fc6e54fd] Moved literal handling out of the semantics into its own helper class and file. Further restructured the code to allow for easier insertion of simplification rules. Tweaks to symbol names, updated the tests to match. (user: aku, tags: new-slif-semantics)
2017-04-19
07:34
[c7040480bb] Comment tweaks for atoms. Fixed (c) year in places. Added atoms for ranges and named cc's. Started on pulling the literal handling out of the semantics into its own class. (user: aku, tags: new-slif-semantics)
2017-04-18
23:19
[8f33a9c72c] Moved l0 literals into their own class, 'literal'. Added character atoms as supported literal type. Updated the tests to match. (user: aku, tags: new-slif-semantics)
19:48
[ac94875686] Reject identical RHS for priority rules. Prevent multi-creation of :discard priority rules for same literal. New tests added. (user: aku, tags: new-slif-semantics)
06:29
[79414a2533] Big set of related changes - Implemented grammar container. - Implemented state dump (serialization) for same - Pretty printer for state dump, for use in tests. - Created testsuite using SLIF ctrace's to drive a container and check the resulting state. - Tweaked various test cases (slif, derived ast, ctrace, container): Removed superfluous :start statements. - Changes to semantics, updated test cases to match. - Moved passing of start symbol to near the end (symbol existence ensured) - Create literal/lexeme definitions only once. (user: aku, tags: new-slif-semantics)
2017-04-11
05:52
[150fa90af9] Dropped "symbol" from container interface. Not needed. Some text tweaks. Updated the tests to match. (user: aku, tags: new-slif-semantics)
2017-04-08
07:09
[7f1e1389a1] Filling in the container. General attribute manager, plus first custom subclasses. (user: aku, tags: new-slif-semantics)
04:51
[281636de75] Reworked the grammar framework a bit to be more scalable, i.e. not requiring a change with every new key. Updated suites to match. Further added suite and placeholders for container testing. Currently the new suite is a full fail (container completely different from the API coming from the semantics). (user: aku, tags: new-slif-semantics)
2017-04-07
22:53
[4b61c1c2e7] Added the slif meta grammar as test case. Note, this grammar has some TODOs to make it play nicer for Tcl. (user: aku, tags: new-slif-semantics)
22:05
[a63be89730] Fixed mishandling of quoted symbol in separator adverb during normalization. (user: aku, tags: new-slif-semantics)
20:42
[46b45891f1] Oops. Forgot to add the new charclass tests. Fixed. (user: aku, tags: new-slif-semantics)
20:41
[993a1e814c] Changed the string/charclass rep conveyed from semantics to container, removed tagging, type can be inferred unambiguously from the value. Updated tests to match. Further fixed ingestion of named (inverted) char classes. Added tests. (user: aku, tags: new-slif-semantics)
19:24
[a546ce312d] Added handling of escaped chars and nocase to charclass operation. Updated tests to match, also tweaked to actually demo nocase where needed. (user: aku, tags: new-slif-semantics)
07:28
[ca988b6b2f] Reworked the handling of escapes, do it first. Simplifies the processing loop coming after. Tweaked the derivation of symbols, slightly different quoting. Updated tests to match. Added testcase for escaped character forms in strings. (user: aku, tags: new-slif-semantics)
06:24
[5b27a4891f] Moved to decimal codepoints in literal specs. Added handling of nocase for chars, strings. Updated tests to match. (user: aku, tags: new-slif-semantics)
05:05
[2d3c378396] Grammar tweak to make handling of discard variants easier. Split the discard rule to inline the variants of <single symbol>. Completely reworked the handling of symbols (g1, l0, lexeme, discard, literals, etc.) Now using a state-machine to keep track of the contexts symbols are found in, and the conclusions about which future contexts are thus still legal. Redid the handling of fixups for `lexeme default` and `discard default`. User-specified settings of "latm" and "discard events" now form an exclusion list of symbols, with the regular list to exclude them from pulled out of the new state tracking. Literal handling now performs normalization and derivation of symbols in the semantics. The container does not know anything about this anymore. FIX BUG: Tcl escapes for strings and char classes are not handled yet, neither by semantics, nor syntax. FIX BUG: Normalization of specs for "nocase" still missing. Updated tests to match. Added tests for all the cases missed by the old tracking scheme. (user: aku, tags: new-slif-semantics)
2017-03-28
04:49
[dcab86d57e] Extended location handling to include span again. Added location info to various errors. Updated tests to match. (user: aku, tags: new-slif-semantics)
03:00
[dbe940e8b8] Moved location management to AST processing, away from the container-to-be. Updated tests to match. (user: aku, tags: new-slif-semantics)
2017-03-25
00:21
[aab73ba179] Tweaked a number of messages, updated tests to match. Split some supporting code into separate files. (user: aku, tags: new-slif-semantics)
2017-03-24
05:40
[3e376473de] Added handling of <inaccessible statement>. Extended testsuite. (user: aku, tags: new-slif-semantics)
00:23
[5c4109d52f] Added handling of g1 parse events. Extended testsuite. (user: aku, tags: new-slif-semantics)
2017-03-23
05:15
[a91d328909] Implemented the defaults for g1 actions/blessings. Extended and updated tests. (user: aku, tags: new-slif-semantics)
2017-03-22
04:39
[b23111ea38] Tests for :discard and <discard default>. (user: aku, tags: new-slif-semantics)
2017-03-21
06:41
[4c3738f282] Replaced LATM class with generic class for fixups. Tests pass. Implemented :discard, default discard handling. No tests for that yet. (user: aku, tags: new-slif-semantics)
05:34
[b3713b521d] Normalized separator/proper and pause/event pairs into single adverbs with aggregated info. Normalized quoted event names. Updated the testsuite to match. (user: aku, tags: new-slif-semantics)
00:26
[7e4778d091] Added tests for :lexeme and its adverbs. (user: aku, tags: new-slif-semantics)
2017-03-16
04:23
[b8b76c92b8] Added test grammars for literal handling (strings and char classes) (user: aku, tags: new-slif-semantics)
02:21
[5cdf9f0023] Completed terminal/lexeme cross-checks. Extended testsuite. Fixed bad tests to not have the newly detected errors. Fixed RHS handling of quantified rules (A <single symbol> is in a list too, and has to be unboxed). (user: aku, tags: new-slif-semantics)
2017-03-15
05:56
[e49929c8c5] Started cross-checking of terminals vs lexemes. Extended testsuite (user: aku, tags: new-slif-semantics)
04:04
[b3e66077de] Updated testsuite to match the work done to the semantic processor. Reorganized into a deeper hierarchy, by topic. (user: aku, tags: new-slif-semantics)
04:03
[2348ad9401] Added lots of adverb handling. Track which symbols are lexemes, non-terminals. Track LATM flagging. Testsuite changes pending. (user: aku, tags: new-slif-semantics)
04:01
[c4c28a4e44] Fix lexer completion handling when eof happens after a lexeme and no partial input. (user: aku, tags: new-slif-semantics)
03:58
[fd822f64a9] Reworked slif testsuites to find test cases regardless of directory hierarchy. Renamed a log method to avoid clash with builtin "trace". (user: aku, tags: new-slif-semantics)
2017-03-03
00:14
[4fd57c5641] Changed container API expected by the semantics. Some simplification. Updated all grammar ctraces to match (user: aku, tags: new-slif-semantics)
2017-03-02
07:31
[85429db1e7] Lots of forward-looking support for unicode char classes and case folding, based on compiling them into grammars based on bytes and byte-ranges. Unicode data files, tools for conversion. Tests for support commands. Placeholder for testing the generated data. Build integration. The generated file (p_unidata.tcl) is excluded from the repository, and is dynamically created as needed. (user: aku, tags: new-slif-semantics)
2017-02-28
06:27
[ed750236ca] Added unicode helper commands (unicode to utf, classes to byte-based dfa/grammar, pretty printing of same), testsuite. (user: aku, tags: new-slif-semantics)
2017-02-24
21:26
[20a670f756] Overhauled error propagation in the system of gate, lexer, parser, and slif-parser. Proper accumulation of context during forwarding to later stages (See "fail"), and pulling of context forward for errors generated by later stages (See "get-context"). (user: andreask, tags: new-slif-semantics)
07:49
[63bd418477] Reordered and renamed test cases. Added cases to complete the various bnf/match rules (prio, empty, quant). (user: aku, tags: new-slif-semantics)
01:29
[af5ad7072e] Restructured testcase hierarchy. (user: andreask, tags: new-slif-semantics)
01:11
[9f93413bb3] Semantics optimization. Declare symbols, strings, charclasses, ... only once. (user: andreask, tags: new-slif-semantics)
00:53
[bb755531e7] More adverb simplification. Folded them into rule creation, as option arguments. (user: andreask, tags: new-slif-semantics)
00:20
[373fd926a6] Simpler adverb processing. First adverb test cases. Filling out adverb semantics. Symbol name normalization added. (user: andreask, tags: new-slif-semantics)
2017-02-23
22:45
[80d30b9f98] Updated adverb/context information. Transcribed to wiki. Reworked the meta grammar and bootstrap parser to encode where which adverbs are allowed into the grammar itself. Updated the AST results. Updated the ctraces for L0 precedence rules (21, 22), where the assoc adverb is currently not allowed anymore. Have to ping JK about my research. Still to do: Simplify the adverb processing code (remove checks, take advantage of the grammar doing this now). This is the next step, together with test cases which cover that part of the semantics. (user: andreask, tags: new-slif-semantics)
06:29
[0247c7130e] AST/Semantics for basic quantified rules. Doc table adverb/context. Incomplete. (user: aku, tags: new-slif-semantics)
01:39
[caa492dfa9] Extended testsuites for slif ast and semantics: Precedence and lexer rules. Dropped location info from adverbs. (user: andreask, tags: new-slif-semantics)
00:50
[77ab1e631b] Started testsuite for slif parser (AST generation), and semantics (AST to container). Reworking the semantics to not know anything about container internals (no getting of internal instances and talking directly to them). (user: andreask, tags: new-slif-semantics)
00:19
[fee050b52a] Changed logging infrastructure to support shorter output, user-injection, and tweaks to log joining. (user: andreask, tags: new-slif-semantics)
00:18
[9c59155fe1] Added "fail" support into the mock parser classes (user: andreask, tags: new-slif-semantics)
00:17
[1a6131c9fa] Hacked in "fail" propagation. (user: andreask, tags: new-slif-semantics)
2017-02-22
18:38
[4b8669f20a] Removed test helper command "dictsort". Provided by kettle, switched to that. (user: andreask, tags: new-slif-semantics)
2017-02-21
22:24
[ccad12f326] Merged testsuite work to the SLIF work. (user: andreask, tags: new-slif-semantics)
22:09
[76f9af32f6] Closed-Leaf: Fixed two typos (user: andreask, tags: testsuite-work)
19:58
[4ff980abf7] Completed testsuite for semantic core (SC). Extended SC API for introspection. Tweaked SC API (mask -> add-mask), and updated user to match (parser). (user: andreask, tags: testsuite-work)
19:56
[b107343366] Update bootstrap code to match changes in the action syntax. (user: andreask, tags: testsuite-work)
17:40
[350d1441b9] Updated copyright years (user: andreask, tags: testsuite-work)
08:14
[c2ff560ee9] Small comment fixes. Start on testsuite for the semantic core class. (user: aku, tags: testsuite-work)
05:37
[9adb3ce48b] Completed testsuite for standard semantic actions. (user: aku, tags: testsuite-work)
05:09
[6242c71caf] Completed testsuite for various support commands. (user: aku, tags: testsuite-work)
04:37
[3200e8c2d0] Added information about the engine and grammar testsuites. (user: aku, tags: testsuite-work)
01:14
[cd052124e2] Cross-check lexer state machine to test cases. Completed lexer testsuite. Tweaks to error reporting, and internals docs. (user: andreask, tags: testsuite-work)
2017-02-20
22:57
[480187bec7] inbound/gate. Extended the internal state machine docs. Cross-checked state-machine vs test cases. Added 3 missing cases found by this check. Fixed testsuite UP references with better names. (user: andreask, tags: testsuite-work)
21:10
[3906d85433] Conversion of up/downstream references to proper names, for gate and inbound. Updated their testsuites to match. (user: andreask, tags: testsuite-work)
21:04
[e328ba5e5b] Parser side conversion of up/downstream references to proper names. Completed testsuite. Fix parser issues found by it. Added proper reporting of parse failure, although without details. (user: andreask, tags: testsuite-work)
21:01
[976b4ca3c3] More conversion of up/downstream references to proper descriptive names (user: andreask, tags: testsuite-work)
21:00
[e5d8724fa9] Added note about sem value id 0 and the problems it can cause (user: andreask, tags: testsuite-work)
20:59
[04c3c24ddc] Change ambiguous up/downstream references to more descriptive names. Added proper argument validation to sym/id converter methods. (user: andreask, tags: testsuite-work)
20:57
[d3aaf61f0f] Fix conditional in helper script (user: andreask, tags: testsuite-work)
17:35
[7a99bf74bf] Added forgotten new support file (user: aku, tags: testsuite-work)
16:06
[f10dc0aa37] Start on parser tests, tweaks to lexer. (user: aku, tags: testsuite-work)
2017-02-18
03:57
[6cf602f879] Fix missing action method in API and sequence. Fixed typo. (user: aku, tags: testsuite-work)
2017-02-17
08:24
[6a33365a78] Added more notes extracted from the SLIF docs (user: aku, tags: testsuite-work)
08:23
[4eb412329b] Added placeholders for missing testsuites. Rename as they are filled (user: aku, tags: testsuite-work)
07:39
[c3377b8b63] Updated location merge a bit, updated users to match. Tweaked lex test support. Removed lots of debug statements, the issue with exhaustion is gone. Currently suspecting a memory corruption issue, the changes made should not really affect the C-level libmarpa. Watch this. (user: aku, tags: testsuite-work)
07:35
[e2a561954b] Cleanup, and rework helper to run from the subdir (user: aku, tags: testsuite-work)
02:31
[f4d54e1f8b] Updated code using location commands. Fix forgotten commit of new helpers required by location code. (user: aku, tags: testsuite-work)
02:29
[82fa146475] Tests for location handling and semstore. Extended introspection, cleanup. (user: aku, tags: testsuite-work)
2016-11-26
00:11
[c76fcab35e] Start on testsuite, typos, some reworking of lower-level innards (user: aku, tags: testsuite-work)
2016-05-22
05:07
[b8a987d343] Fix missing integration of sequencing support. Fix build hash-bang. (user: aku, tags: trunk)
05:03
[215241808d] Partial implementation of parser, semantics, and container for SLIF grammars. (user: aku, tags: new-slif-semantics)
05:00
[aa3974cd80] Tests for "marpa::gate". Separated the logic checking the method call sequences of "inbound" and "gate" from the main implementation into mixin-helper classes. Created new base class for writing sequence-validating state-machines. Added supporting code. (user: aku, tags: trunk)
2016-05-18
05:02
[705231127e] Tests for "marpa::inbound" (user: aku, tags: trunk)
2015-12-17
22:45
[22d985b69b] References to sparse int sets => See notes on data structures for the set of acceptable symbols (user: andreask, tags: trunk)
19:08
[09a7c96443] Save work state to repository (user: andreask, tags: trunk)
2015-12-15
22:47
[6ef17017b0] More annotations. (user: aku, tags: trunk)
2015-11-25
08:17
[a17d6ee154] Modified parser to distinguish semantic rules by adding a sequence number to the lhs. Updated printer and grammar backends to make use of this info, simplify actions. (user: aku, tags: trunk)
08:16
[962491557c] More notes (user: aku, tags: trunk)
08:15
[a8de3b64cf] slif meta grammar fixups re hidden rhs, in code and meta-grammar (user: aku, tags: trunk)
07:05
[4fb7a76f90] Completed adverb processing, done a bit more on the other symbols (user: aku, tags: trunk)
07:03
[67894fdd77] Meta grammar fix, match marpa-slif.tcl (user: aku, tags: trunk)
04:48
[dcd09cf223] Added rhs masking to semantic core, and extended grammar spec to provide the information from the SLIF meta-grammar. Added start for grammar container and ingesting a slif AST. (user: aku, tags: trunk)
2015-11-24
07:03
[76e9a4c732] Moved G1/L0 interface support out of engine to lexer. Simplifies the ACS handling, happens fully at lexer level. Extended lexer to support array descriptor semantics. Extended parser for the same. Properly working for name, start, length, value, values. Not working for g1start, g1length. Unknown for lhs, symbol, rule. The semantics rules are incrementally fed from lexer/parser to attached semantic engines as lexer/parser rules are added. Extended lexer for latm vs ltm matching (LTM ignores signaling via ACS). Incomplete, discarded symbols cannot have this set on a per-lexeme basis yet. Hardwired to LATM. Only exported symbols have this properly done. Modified slif bootstrap parser to set the proper semantics and matching discipline. (user: aku, tags: trunk)
06:51
[863199d7de] Added the std semantics to system (user: aku, tags: trunk)
06:51
[c8a2260b1d] More notes (user: aku, tags: trunk)
06:50
[a45f2378e8] location - New helper 'null', and reworked 'merge' to have 'merge2' and 'merge'. (user: aku, tags: trunk)
06:49
[06be81a3a5] semantic core - More narrative, and methods for extension after construction added. Plus helper commands implementing various standard semantics (user: aku, tags: trunk)
2015-11-23
17:10
[5935cdf72e] First save of ongoing work. WOMM only. (user: aku, tags: trunk)
2015-11-20
07:34
[d9ab3ad9c7] initial empty check-in (user: aku, tags: trunk)