Issues with the Tcl http package
I have found the issues listed below with the Tcl http package and reported some of them to the Tcl bug tracker. However, once I concluded that I needed to write my own implementation of the HTTP protocol, and with no follow-up at all on the issues I had already reported, I stopped investing time in further reports.
Persistent Connections
RFC 2616 (HTTP/1.1) states: "HTTP implementations SHOULD implement persistent connections." HTTP/1.0 originally had no way to reuse a connection. That ability was later bolted on via the Keep-Alive header, but using this header can cause problems, as described in RFC 2068. The preferred mode of operation would therefore be to use persistent connections with HTTP/1.1, but not with HTTP/1.0. This combination is not possible with the Tcl http package: to use persistent connections with HTTP/1.1, the -keepalive option must be set to 1, but that also sends a Keep-Alive header, which opens the door to the problems described in RFC 2068.
Missing Content-Type header
When a web site doesn't return a Content-Type header, the http package defaults to text/html and goes ahead and replaces all occurrences of \r\n with \n. If the content was actually a PNG image, it is irreparably corrupted by this action. This goes against the advice in RFC 2616, which specifies: "If and only if the media type is not given by a Content-Type field, the recipient MAY attempt to guess the media type via inspection of its content and/or the name extension(s) of the URI used to identify the resource. If the media type remains unknown, the recipient SHOULD treat it as type "application/octet-stream"."
The only way to avoid this issue is to add the -binary option to the http::geturl command. But that treats every response as binary, even when a site does provide a Content-Type header. So this becomes yet another responsibility for the script developer to deal with.
Ticket: 338d979f5b
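One way to take on that responsibility is to always fetch with -binary 1 and decide afterwards how to treat the body. The following is only a minimal sketch under stated assumptions, not something the http package provides: the proc names mediaType and fetchBody are made up for illustration, text is assumed to be UTF-8 (a fuller version would honor the charset parameter), and try/finally requires Tcl 8.6 or later.

```tcl
package require http

# Hypothetical helper: return the media type from a header list as produced
# by http::meta, falling back to application/octet-stream as RFC 2616 advises.
proc mediaType {meta} {
    foreach {name value} $meta {
        if {[string equal -nocase $name "Content-Type"]} {
            # Drop parameters such as "; charset=utf-8".
            set type [string trim [lindex [split $value ";"] 0]]
            if {$type ne ""} {
                return [string tolower $type]
            }
        }
    }
    return "application/octet-stream"
}

# Hypothetical wrapper: fetch the raw bytes, then convert only text/* bodies.
# Assumption: text bodies are UTF-8 encoded.
proc fetchBody {url} {
    set token [http::geturl $url -binary 1]
    try {
        set type [mediaType [http::meta $token]]
        set data [http::data $token]
    } finally {
        http::cleanup $token
    }
    if {[string match "text/*" $type]} {
        set data [encoding convertfrom utf-8 $data]
    }
    return [list $type $data]
}
```

With this approach a response without a Content-Type header comes back untouched as bytes, so a PNG image survives intact.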
Content encoding
The http package advertises support for gzip, deflate, and compress content encodings in the Accept-Encoding header of its requests. However, it only handles gzip encoding correctly. If the server responds with either of the other two encodings, the result is a data error exception.
Ticket: a13b9d0ce1
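A possible workaround is to supply an explicit Accept-Encoding header through the -headers option of http::geturl, so that only gzip is requested. This is a sketch resting on an assumption: to my understanding the package omits its own Accept-Encoding line when the caller provides one, but that should be verified against the http version in use. The proc name gzipOnlyHeaders is hypothetical.

```tcl
# Hypothetical helper: return the caller's header list with any existing
# Accept-Encoding entry removed and one that lists only gzip appended.
# Header names are compared case-insensitively.
proc gzipOnlyHeaders {{headers {}}} {
    set result {}
    foreach {name value} $headers {
        if {![string equal -nocase $name "Accept-Encoding"]} {
            lappend result $name $value
        }
    }
    lappend result Accept-Encoding gzip
    return $result
}

# Usage (needs the http package and network access; the URL is a placeholder):
# package require http
# set token [http::geturl http://example.com/ -headers [gzipOnlyHeaders]]
```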
Header field names
As per RFC 2616, header field names are case-insensitive. The http::meta command lists them exactly as received. This makes it much harder to correctly locate the header of interest. Some script writers may simply use the most common notation in a dict get command. That leads to incorrect operation of the application if it ever encounters a web site that uses a different (but perfectly legal) notation.
This is yet another detail that the script developer is burdened with getting right.
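A case-insensitive lookup can be sketched as below. The proc name headerGet is hypothetical and not part of the http package; it only assumes that the header list has the alternating name/value shape returned by http::meta.

```tcl
# Hypothetical helper: look up a header by name without regard to case.
# Returns the value of the first matching header, or $default if none matches.
proc headerGet {meta name {default ""}} {
    foreach {key value} $meta {
        if {[string equal -nocase $key $name]} {
            return $value
        }
    }
    return $default
}
```

For example, headerGet [http::meta $token] Content-Type finds the header whether the server sent it as "Content-Type", "content-type", or "CONTENT-TYPE". Iterating over the pairs, rather than using dict get, also avoids depending on any particular spelling of the key.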
Asynchronous operation
When requesting asynchronous operation by using the -command option, the http::geturl command still blocks during the DNS lookup part of setting up the TCP connection. This can take several seconds when there is a problem reaching a DNS server. While this is really an issue of the socket command used by http::geturl, it was still a problem for my use case.
Clobbering debug information
When debugging a running application that encountered an error and also uses the http package, the errorInfo variable usually contains a string like: 'can not find channel named "sock55ac118d1540" while executing "eof $sock"'. The information about the actual error you are trying to investigate has quickly been clobbered by the http package.
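One way to protect the information about your own error is to snapshot it at the point where the error is caught, instead of reading the global errorInfo variable later. The following is a minimal sketch of that idea; the proc name captureError and the shape of its return value are assumptions made for illustration.

```tcl
# Hypothetical helper: run a script in the caller's scope and, on error,
# return the message together with the stack trace captured at that moment.
proc captureError {script} {
    set code [catch {uplevel 1 $script} result options]
    if {$code == 1} {
        return [dict create status error message $result \
                    trace [dict get $options -errorinfo]]
    }
    return [dict create status ok message $result trace ""]
}

# Usage: the snapshot survives later errors that overwrite ::errorInfo.
set report [captureError {error "database unavailable"}]
catch {eof sock_does_not_exist}   ;# a later, unrelated error
puts [dict get $report trace]
```

The trace stored in the dictionary is a copy taken from the options dictionary that catch fills in, so it is unaffected by whatever the http package, or any other code, does to errorInfo afterwards.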