www

www manual page

Tcl HTTP client library

www - High level HTTP client library for Tcl

SYNOPSIS

package require Tcl 8.6
package require Thread
package require sqlite3
package require tls
package require www 2.4
www subcommand ?arg arg ...?

DESCRIPTION

The www package provides high level commands for working with the HTTP protocol, as defined in RFC 2616. The package supports the GET, POST, HEAD, PUT, and DELETE operations of HTTP/1.1.

The www package implements a single command, www, with several subcommands.

The package automatically handles compression, proxies, cookies, redirects, and authentication so the user of the package can normally just pass the URL and get back the page contents.

For more advanced control, metadata can be obtained, and error cases can be handled using the normal Tcl error handling mechanisms.

DATA COMMANDS

There are a number of subcommands that actually communicate with an HTTP server and send and/or retrieve data. These are referred to as data commands. The contacted HTTP server may take some time to respond. It is therefore usually best to run data commands from inside a coroutine. Outside a coroutine, the command internally uses the vwait command to wait for the response.
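
The coroutine pattern recommended above can be sketched as follows. The slowGet procedure is a hypothetical stand-in for a data command such as www get: it suspends the calling coroutine and is resumed from the event loop, which is how a data command is assumed to behave when run inside a coroutine.

```tcl
# Hypothetical stand-in for a data command such as "www get": it suspends
# the calling coroutine and lets the event loop resume it with the result.
proc slowGet {url} {
    after 50 [list [info coroutine] "body of $url"]
    return [yield]
}

# Code that reads like a blocking call, but only blocks this coroutine.
proc worker {url} {
    set body [slowGet $url]
    set ::result $body
}

# Start the worker; control returns here as soon as the coroutine yields.
coroutine job worker http://example.com/

# The application's event loop keeps running while the request is pending.
vwait ::result
puts $::result
```

With the package loaded, the worker would call www get instead of slowGet; the rest of the structure stays the same.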

COMMAND RESULT

Upon successful execution of a data command, the return value is the body received in the HTTP response. Behind the scenes, the command may have had to perform several HTTP requests to arrive at the final response. For example, a request for an http resource may obtain a 301 (Moved Permanently) response, pointing to an https version of the resource. That may then produce a 401 (Unauthorized) result along with a digest authentication challenge. After the request is repeated, including an authorization response that is constructed based on the challenge and the credentials provided via the -digest option, a 200 (OK) result is finally obtained.

When any fatal problem is encountered while accessing the requested resource, the command throws an error. Possible problems include invalid command options, bad URLs, and unreachable servers, but also HTTP status codes that indicate something other than success. For this last group of errors, the command will produce an error code of WWW CODE, followed by the group of codes (i.e. 3XX for redirection, 4XX for client errors, 5XX for server errors), the actual numerical status code, and the reason string. For example, requesting a resource that does not exist on the server will usually result in an error code WWW CODE 4XX 404 {Not Found}. The result of the command is then the body received in the HTTP response. This is usually a small HTML document explaining the error.
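
Because the error code starts with a fixed prefix, the trap clause of the Tcl try command can select a whole group of status codes. The following is a minimal sketch; fakeGet is a hypothetical stand-in that throws the error code shape described above, so the example runs without a server.

```tcl
# Hypothetical stand-in for a failing data command: it throws the error
# code shape described above for a resource that does not exist.
proc fakeGet {url} {
    throw {WWW CODE 4XX 404 {Not Found}} "<html><body>Not Found</body></html>"
}

# Run a data command prefix and classify the outcome by error code group.
proc fetch {cmd url} {
    try {
        return [list ok [{*}$cmd $url]]
    } trap {WWW CODE 4XX} {body info} {
        # The error code is: WWW CODE 4XX <status> <reason>
        lassign [dict get $info -errorcode] - - group code reason
        return [list client-error $code $reason]
    } trap {WWW CODE 5XX} {body info} {
        return [list server-error [lindex [dict get $info -errorcode] 3]]
    }
}

puts [fetch fakeGet http://example.invalid/missing]
# prints: client-error 404 {Not Found}
```

With the package loaded, the call would be fetch {www get} $url. Errors that are not matched by a trap clause, such as an unreachable server, propagate to the caller unchanged.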

In cases where a response is received from the HTTP server, either success or failure, it may sometimes be useful to access the metadata. This information is available via the return options. Return options can be obtained using the Tcl catch and try commands:

        catch {www get $url} data info
        # or
        try {
            www get $url
        } on ok {data info} {
            puts "Retrieved [dict get $info url]"
        }
    

The return options normally contain the following pieces of information:

url
The URL used to retrieve the returned data. This may differ from the URL specified in the command due to redirects.
uri
The resource on the server that was retrieved.
status
The response status information, both as the full line and split out into version, code, and reason parts.
headers
The response headers. All header names are returned in lowercase. Note that this is a list of alternating header names and their values. It is not a dict, as the same header name may appear multiple times.
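
The difference between a header list and a dict matters when a name repeats. The sketch below uses a made-up header list in the shape described above; with the package, the list would come from [dict get $info headers].

```tcl
# Sample header list in the shape described above (values are made up).
set headers {
    content-type text/html
    set-cookie a=1
    set-cookie b=2
}

# Walk the alternating name/value list; repeated names are all visited.
set cookies {}
foreach {name value} $headers {
    if {$name eq "set-cookie"} {
        lappend cookies $value
    }
}
puts $cookies                         ;# prints: a=1 b=2

# Treating the list as a dict silently keeps only the last duplicate.
puts [dict get $headers set-cookie]   ;# prints: b=2
```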

COMMON OPTIONS

The data commands share a common set of options. These options are explained here, and then referred to by each subcommand for brevity.

-auth data
Specify a username:password to send to the server for basic authorization.
-digest cred
Specify a username:password to use for digest authorization.
-file file
Send the file contents in the body of the request. The -file option may be repeated to send multiple files in a single request.
-handler callback
Invoke callback whenever HTTP data is available. If present, nothing else will be done with the HTTP data.
-headers dict
Specify additional headers to include in the request.
-maxredir count
The maximum number of redirects that will be followed when retrieving a resource. This prevents infinite loops when URLs point to themselves, either directly or indirectly. The default is 20. To disable redirections, set this option to 0. Use -1 to allow infinite redirections (not recommended).
-multipart type
Control whether the data will be sent as a multipart message.
-name string
Specify the name for the item of the following -file or -value option.
-persistent bool
Specify whether an attempt should be made to reuse HTTP/1.1 connections. By default this is true, as defined by RFC 2616. HTTP/1.0 connections are never reused.
-timeout milliseconds
Specifies a time limit for the response to be obtained. If no response is received within this time, the channel is closed and an error is returned to the caller. The default is 30000 (30 seconds).
-type mediatype
Specify the media type of the data in the following body part(s) of the request.
-upgrade dict
A map of names and their implementations for protocol upgrades that will be offered to the server. This can be used by a client to upgrade a connection from HTTP/1.1 to HTTP/2, or an HTTP or HTTPS connection into a WebSocket. See Protocol upgrades for more information.
-value string
Specify a fixed value for a body part. This is similar to the value arguments, but may be used if files and values must be included in a specific order.

REQUEST BODY

HTTP POST, PUT, and DELETE requests may have a body. There are several possible sources of information from which the body may be constructed:

  • The body argument.
  • A single -file option.
  • Additional -file options.
  • One or more key/value pairs.

The following rules apply:

  1. When the -multipart option is specified, its value determines how the body is created. When set to "form-data", key/value pairs are included as individual body parts. Any other value, including the empty string, will result in the key/value pairs being appended as a query string to the URL. If the -multipart option is explicitly set to "", the body will contain the body argument, if specified, or the first file specified via a -file option.
  2. When multiple sources are specified, the body will be created by putting all sources into a multipart MIME data stream. The Content-type header is set to multipart/form-data. The value of the -type option is used as the Content-type for the part containing the data from the body argument, if present.
  3. If the body argument is the only specified source, the data is copied to the message body as is. The Content-type header is set to the value specified by the -type option. It defaults to text/plain.
  4. If only a single -file option is specified, the contents of the file are placed verbatim into the message body. Multiple -file options constitute multiple sources and therefore follow rule #2. The Content-type header is set to the value specified by the -type option. If not specified, the Content-type is inferred from the file extension.
  5. If only key/value pairs are specified, then the result depends on the value specified by the -type option. For application/x-www-form-urlencoded, the data is combined into a query string. For multipart/form-data, the data is placed in a multipart MIME data stream. If no -type option is specified, it defaults to application/x-www-form-urlencoded for the post and put subcommands. For the delete subcommand, the key/value pairs are appended to the URL as a query string, and the message will not have a body.
  6. If no source for a body is specified for the post or put subcommands, an empty body will be sent.
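
The rules above can be illustrated with a few calls. This is a sketch based only on the rules and option descriptions in this section; the procedure names, field names, and values are made up, and the procedures need the www package and a reachable server when they are actually called.

```tcl
# The procedures below need the www package when they are called.
catch {package require www}

# Rule 5: only key/value pairs and no -type option, so the data is sent
# as application/x-www-form-urlencoded.
proc postForm {url} {
    www post $url name Alice city Utrecht
}

# Rule 3: the body argument is the only source; -type sets Content-type.
proc postJson {url} {
    www post -type application/json $url {{"name": "Alice"}}
}

# Rule 4: a single -file option; the file contents become the body and
# the Content-type is inferred from the file extension.
proc putFile {url path} {
    www put -file $path $url
}

# Rule 2: several sources (a file plus a key/value pair) produce a
# multipart/form-data body; -name labels the following -file part.
proc postUpload {url path} {
    www post -name upload -file $path $url comment "holiday photo"
}
```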

The following data commands are available:

www get ?option value ...? url ?key value ...?

Perform an HTTP GET operation. Any key/value arguments are formatted as a query string that is appended to the URL.

The following standard options are supported by the get method:

-auth, -digest, -handler, -headers, -persistent, and -timeout.

www post ?option value ...? url ?body? ?key value ...?

Perform an HTTP POST operation. The request must contain a body part. See Request body for more information.

The following standard options are supported by the post method:

-auth, -digest, -file, -handler, -headers, -persistent, -timeout, and -type.

www head ?option value ...? url ?key value ...?

Perform an HTTP HEAD operation. Any key/value arguments are formatted as a query string that is appended to the URL.

The following standard options are supported by the head method:

-auth, -digest, -handler, -headers, -persistent, and -timeout.

www put ?option value ...? url ?body? ?key value ...?

Perform an HTTP PUT operation. The request must contain a body part. See Request body for more information.

The following standard options are supported by the put method:

-auth, -digest, -file, -handler, -headers, -persistent, -timeout, and -type.

www delete ?option value ...? url ?body? ?key value ...?

Perform an HTTP DELETE operation. The request may contain a body part. See Request body for more information.

The following standard options are supported by the delete method:

-auth, -digest, -file, -handler, -headers, -persistent, -timeout, and -type.

CONTROL COMMANDS

www log prefix

The www log command configures a command prefix for handling log messages. Whenever the library has something to report, the log message is appended as an extra argument to prefix and the resulting command is executed.

Example: Print all log messages to standard output:

    www log puts
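
Since the message is appended to the prefix, the prefix can carry extra arguments of its own. The sketch below collects messages in a list instead of printing them; the collect procedure and the ::wwwlog variable are made-up names, and the last lines simulate the invocation described above rather than calling the library.

```tcl
# Log handler: the variable name comes from the prefix, the log message
# is the argument that the library appends.
proc collect {varname message} {
    upvar #0 $varname log
    lappend log "[clock format [clock seconds] -format %H:%M:%S] $message"
}

set prefix [list collect ::wwwlog]
# With the package loaded, the handler would be installed with:
#     www log $prefix

# Simulate the invocation described above: append the message and run it.
{*}$prefix "example log message"
puts [lindex $::wwwlog end]
```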

www configure ?-accept mimetypes? ?-useragent string? ?-proxy prefix? ?-pipeline boolean? ?-socketcmd prefix?

Configure various aspects of the behavior of the package. The following options are supported:

-accept mimetypes
The Accept header of the request. The default is */*, which means that all types of documents are accepted. Otherwise you can supply a comma-separated list of mime type patterns that you are willing to receive. For example, "image/gif, image/jpeg, text/*".
-useragent string
The value of the User-Agent header in the HTTP request. The default is "Tcl-www/1.0".
-proxy prefix
Control the use of a proxy for sending HTTP requests. See Proxies below for more information.
-pipeline boolean
Control the use of pipelining. Pipelining means that multiple HTTP requests are sent on a single TCP connection without waiting for the corresponding responses. Disabled by default.
-socketcmd prefix
Define a command prefix to use when creating a TCP socket. When necessary, the command is executed in a separate thread with three additional arguments: -async, host, and port. Because the command will run in a separate thread, it has to be self-contained and only rely on standard Tcl commands. For example, to use the socket command of the iocp package, the following code could be used:
        www configure -socketcmd [list apply [list {path args} {
            set ::auto_path $path
            package require iocp_inet
            tailcall socket {*}$args
        }] $auto_path]
    

www cookies subcommand ?arg arg ...?

Manage the cookies related to a url. The cookies subcommand provides the following methods:

www cookies delete url ?cookie ...?
Delete zero or more cookies that have been stored for the specified url. If no cookie is specified, all cookies for the url are deleted.
www cookies get url
Return all cookies applicable for the specified url. The command returns a list of alternating cookie names and their values.
www cookies store url ?cookiespec ...?
Add or overwrite a cookie. The cookiespec is a string defining the cookie name, value, and attributes as it might appear in an HTTP Set-Cookie header. For example:
    id=a3fWa; Expires=Wed, 21 Oct 2015 07:28:00 GMT; HttpOnly

www register scheme port prefix ?secure?

Register a custom HTTP scheme by specifying the scheme name, default port and the command for creating the Tcl channel. The optional secure argument should be a boolean indicating whether the channel may be considered secure. This influences sending of secure cookies.

www certify cainfo ?prefix?

Specify the location of the trusted CA certificates. The location may be a directory containing individual CA certificates, or one or more CA certificates concatenated into a single file.

Invoking this command activates the verification of the certificate returned by the server against a set of trusted certificate authorities. When prefix is not specified, standard checks are performed, unless an empty string is passed for cainfo. In that case, any certificate is accepted.

For finer grained control, a command prefix can be specified. This command will be called at several points during the OpenSSL handshake and may reject or accept the certificate at each point by returning a false or true boolean value, respectively. If the callback does not return a valid boolean value, the result is determined using the standard checks. The callback is invoked by appending two arguments to the provided prefix:

  • A depth argument, which is an integer representing the current depth on the certificate chain, with 0 as the subject certificate and higher values denoting progressively more indirect issuer certificates.
  • A cert argument, containing a list of key-value pairs providing details about the received certificate.
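
A verification callback following this description might look like the sketch below. The maxdepth argument is a made-up extra argument bound in the prefix, the CA bundle path is a placeholder, and the keys inside cert are not relied on because they are not listed here.

```tcl
# Callback for "www certify": depth and cert are appended to the prefix.
# maxdepth is a hypothetical extra argument bound in the prefix itself.
proc checkCert {maxdepth depth cert} {
    if {$depth > $maxdepth} {
        return false    ;# reject certificate chains that are too long
    }
    # Not a valid boolean: per the text, the standard checks then decide.
    return ""
}

set prefix [list checkCert 4]
# With the package loaded (the path is a placeholder):
#     www certify /path/to/ca-bundle.pem $prefix

# Simulate two invocations with an empty certificate description.
puts [{*}$prefix 5 {}]                    ;# prints: false
puts [string length [{*}$prefix 0 {}]]    ;# prints: 0
```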

www cookiedb database

Specify the location of the sqlite3 database to use for storing permanent cookies. The same database may be used by multiple applications using the www package to share cookies. If no database is configured, the package uses a temporary database, which gets deleted when the application terminates.

www header option ?arg ...?

This command provides several operations for working with a list of HTTP headers, such as the lists returned by the different data commands.

www header add headerlistvar name ?-nocase? value ?...?
Add one or more values to a header, if they are not already present. The -nocase option makes the compare operation case insensitive.
www header append headerlistvar name ?value? ?...?
Set a new value for a header in addition to any existing values.
www header exists headerlist name
Check if a header with the specified name exists.
www header get headerlist name ?index? ?-lowercase?
Return the value of the requested header, if any. By default all entries are joined together, separated with a comma and a space. The resulting string is returned. If an index is specified, that is taken as an indication that the header value is defined as a comma-separated list. In that case, a Tcl list is constructed from the individual elements of all entries. The requested index from the resulting list is returned. The special index "all" causes the complete list to be returned. When the -lowercase option is specified, all values are converted to lower case.
www header replace headerlistvar name ?value? ?...?
Set a new value for a header replacing all existing entries. Multiple values are joined together into a comma-separated list. If no values are specified, all entries for the header are removed.

PROXIES

The www package supports the use of a proxy to access resources. The desired method can be configured using the -proxy option of the www configure subcommand. The -proxy option takes a command prefix that will be called before a URL is fetched. The command is invoked with two additional arguments:

url
The URL for the request. This does not include the query string, if any.
host
The host-name derived from the URL (i.e. the string between '://' and the first subsequent ':' or '/').

The command should return a list containing one or more of the following:

DIRECT
Do not use a proxy for this URL.
PROXY server
Use the indicated proxy server via HTTP.
SOCKS server
Use the indicated SOCKS4 proxy server.
HTTP server
Same as PROXY.
HTTPS server
Use the indicated proxy server via HTTPS.
SOCKS4 server
Same as SOCKS.
SOCKS5 server
Use the indicated SOCKS5 proxy server.

The server argument may be specified as "host[:port]". If the port part is omitted, it defaults to 8080 for HTTP(S) proxies, and 1080 for SOCKS([45]) proxies.
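
A custom proxy command following this interface could be written as sketched below. The host names are placeholders, and the sketch assumes that each "PROXY server" entry is returned as a single list element; this section does not state how multiple entries are used.

```tcl
# Proxy selection command: url and host are appended by the package.
proc chooseProxy {url host} {
    # Contact local and internal hosts directly.
    if {$host eq "localhost" || [string match *.corp.example $host]} {
        return [list DIRECT]
    }
    # Everything else: offer an HTTP proxy, then a direct connection.
    return [list "PROXY proxy.example:3128" DIRECT]
}

# With the package loaded, the command would be installed with:
#     www configure -proxy chooseProxy

puts [chooseProxy http://localhost/ localhost]
puts [chooseProxy http://www.example.org/ www.example.org]
```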

PROXY HANDLERS

The www package provides a number of predefined proxy options:

noproxy
Do not use a proxy for any URL.
defaultproxy
Use proxies based on the HTTP_PROXY, HTTPS_PROXY, and NO_PROXY environment variables. The variable names are not case sensitive.
httpproxy server
Use the indicated proxy server via HTTP.
httpsproxy server
Use the indicated proxy server via HTTPS.
socksproxy server
Use the indicated SOCKS4 proxy server.
socks4proxy server
Use the indicated SOCKS4 proxy server.
socks5proxy server
Use the indicated SOCKS5 proxy server.

An additional proxy option is available via the www::proxypac package. This option allows proxies to be selected based on a proxy auto-configuration (PAC) file. See the www::proxypac manual page for more information.

PROTOCOL UPGRADES

HTTP/1.1 connections may be transformed to a different protocol using the protocol upgrade mechanism. The client invites the server to switch to one or more alternative protocols by providing a list of protocol names in descending order of preference. The www package supports this mechanism via the -upgrade option. The option takes a dict that maps protocol names to their implementation. The offered protocols may require additional arguments. These can be provided using the -headers option.

The protocol implementation must be provided in the form of an oo::class. If the server decides to accept the upgrade, the oo::class is mixed into the object command that was created for the HTTP connection. The protocol implementation class must define at least a Startup method that accepts a single argument. This method is invoked immediately after the class is added to the connection object. A dict containing a key-value list of the headers from the server's response is passed to the Startup method.
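
The shape of such a class can be sketched as follows. The EchoProtocol class, the protocol name "echo", and the upgradeInfo method are made up for illustration, and the last lines only simulate what is described above by mixing the class into a plain object.

```tcl
# Hypothetical protocol implementation; only the Startup method and its
# single argument are taken from the description above.
oo::class create EchoProtocol {
    variable upgradeHeaders

    # Invoked right after the class is mixed into the connection object.
    method Startup {headers} {
        set upgradeHeaders $headers
    }

    # Made-up accessor, for demonstration only.
    method upgradeInfo {} {
        return $upgradeHeaders
    }
}

# With the package loaded, the protocol would be offered like this:
#     www get -upgrade [dict create echo EchoProtocol] $url

# Simulation: mix the class into an object and call Startup. The method
# name starts with an upper-case letter, so it is not exported and has to
# be called through the object's "my" command.
set conn [oo::object new]
oo::objdefine $conn mixin EchoProtocol
[info object namespace $conn]::my Startup {upgrade echo connection upgrade}
puts [$conn upgradeInfo]
```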

The www package comes with upgrade implementations for WebSockets and HTTP/2. These can be utilized by loading their respective packages:

    package require www::websocket
    package require www::http2
    
See the manual entries for the packages for details of their use.

SEE ALSO

www::proxypac, www::websocket, www::http2

COPYRIGHT

Copyright © 2020, 2021 Schelte Bron