Tcl Source Code: Ticket Change Details

Overview

Artifact ID:	c6f2ba5a0afc9c565097f7279ba9bd84fee5660c749253e140b99c2acfc32b5b
Ticket:	31aa44375de2c87ecb1361b15ca0102d831ad155 Tcl_NumUtfChars regression in default 8.6 build
User & Date:	jan.nijtmans 2020-05-10 13:22:00

Changes

icomment:

As a base for further discussion, I now merged the bug-31aa44375d branch to core-8-6-branch. Testcase encoding-15.5 is now marked as "knownBug" because this is IMHO an important feature: The UtfToUtf encoding is meant to fix all kinds or problems when extenal byte sources find their way into Tcl. A 4-byte UTF-8 sequence is currently illegal in Tcl, but it could be legal to the outside world. Therefore it should be translated to a surrogate pair, so Tcl can handle it as intended. It is possible to fix this using special code in the Utf0toUtf encoder/decoder. This test-case checks for that, and it's broken now. That should be fixed. I'll prepare a solution for that.

So, discussion not over yet, but at least we have a base to talk about.

login: "jan.nijtmans"
mimetype: "text/x-fossil-wiki"