Artifact a9fa9df4faeeb7e4ac5878adb27d5359fc9bee85fc3908e5b6f9bcccc39740e2:

File www/server.wiki — part of check-in [0bb59100f2] at 2019-08-16 10:27:11 on branch server-docs — Converted the backwards-compatibility sections in www/server.wiki into identified hyperlinks to the new docs, which allows existing external ".../server.wiki#cgi" URLs and such to work without needing the near-empty sections containing only a hyperlink just to anchor the link. (user: wyoung size: 12386)
<title>How To Configure A Fossil Server</title>

<h2>No Server Required</h2>

<blockquote>
Fossil does <em>not</em> require a central server.
Data sharing and synchronization can be entirely peer-to-peer.
Fossil uses [https://en.wikipedia.org/wiki/Conflict-free_replicated_data_type|conflict-free replicated data types]
to ensure that (in the limit) all participating peers see the exact same content.
</blockquote>

<h2>But, A Server Can Be Useful</h2>

<blockquote>
Fossil does not require a server,
but a server does make collaboration easier.
A Fossil server also works well as a complete website for a project.
For example, the [https://www.fossil-scm.org/] website, including the
page you are now reading,
is just a Fossil server displaying the content of the
self-hosting repository for Fossil.

This article is a guide for setting up your own Fossil server.

See "[./aboutcgi.wiki|How CGI Works In Fossil]" for background
information on the underlying CGI technology.
See "[./sync.wiki|The Fossil Sync Protocol]" for information on the
wire protocol used for client/server communication.
</blockquote>

<h2 id="methods">Methods</h2>

<blockquote>
There are four basic ways to set up a Fossil server:

<ol>
  <li>Socket activation:
      <a id="inetd"   href="./server/any/inetd.md">inetd</a>,
      <a id="xinetd"  href="./server/any/xinetd.md">xinetd</a>,
      <a id="stunnel" href="./server/any/stunnel.md">stunnel</a>...
  <li><a id="standalone" href="./server/any/none.md">Stand-alone HTTP server</a>
  <li><a id="scgi" href="./server/any/scgi.md">SCGI</a>
  <li><a  id="cgi" href="./server/any/cgi.md">CGI</a>
</ol>

The HTTP and SCGI options also allow for various sorts of reverse
proxying: Apache, nginx, HAProxy, stunnel (proxy mode), IIS...

Regardless of the method you choose, all can serve either a single repository
or a directory hierarchy containing many repositories with names ending in ".fossil".

We've broken the configuration for each method out into a series of
sub-articles, some of which are OS-specific:
</blockquote>

<table style="margin-left: 6em;">
    <tr>
        <th>&nbsp;</th>
        <th colspan="11" style="background-color: #efefef">Fossil Front-End Program</th>
    </tr>

    <tr>
        <th style="background-color: #e8e8e8; padding: 6px; text-align: right">Host OS</th>
        <th>none</th>
        <th>inetd</th>
        <th>xinetd</th>
        <th>stunnel</th>
        <th>CGI</th>
        <th>SCGI</th>
        <th>althttpd</th>
        <th>nginx</th>
        <th>Apache</th>
        <th>IIS</th>
        <th>OS&nbsp;service</th>
    </tr>

    <tr>
        <th style="background-color: #e8e8e8; padding: 6px; text-align: right">Any</th>
        <td style="text-align: center"><a href="./server/any/none.md">✅</a></td>
        <td style="text-align: center"><a href="./server/any/inetd.md">✅</a></td>
        <td style="text-align: center"><a href="./server/any/xinetd.md">✅</a></td>
        <td style="text-align: center"><a href="./server/any/stunnel.md">✅</a></td>
        <td style="text-align: center"><a href="./server/any/cgi.md">✅</a></td>
        <td style="text-align: center"><a href="./server/any/scgi.md">✅</a></td>
        <td style="text-align: center"><a href="./server/any/althttpd.md">✅</a></td>
        <td style="text-align: center">❌</td>
        <td style="text-align: center">❌</td>
        <td style="text-align: center">❌</td>
        <td style="text-align: center">❌</td>
    </tr>

    <tr>
        <th style="background-color: #e8e8e8; padding: 6px; text-align: right">Debian/Ubuntu</th>
        <td style="text-align: center"><a href="./server/any/none.md">✅</a></td>
        <td style="text-align: center"><a href="./server/any/inetd.md">✅</a></td>
        <td style="text-align: center"><a href="./server/any/xinetd.md">✅</a></td>
        <td style="text-align: center"><a href="./server/any/stunnel.md">✅</a></td>
        <td style="text-align: center"><a href="./server/any/cgi.md">✅</a></td>
        <td style="text-align: center"><a href="./server/any/scgi.md">✅</a></td>
        <td style="text-align: center"><a href="./server/any/althttpd.md">✅</a></td>
        <td style="text-align: center"><a href="./server/debian/nginx.md">✅</a></td>
        <td style="text-align: center">❌</td>
        <td style="text-align: center">❌</td>
        <td style="text-align: center">❌</td>
    </tr>

    <tr>
        <th style="background-color: #e8e8e8; padding: 6px; text-align: right">Windows</th>
        <td style="text-align: center">❌</td>
        <td style="text-align: center">❌</td>
        <td style="text-align: center">❌</td>
        <td style="text-align: center"><a href="./server/windows/stunnel.md">✅</a></td>
        <td style="text-align: center">❌</td>
        <td style="text-align: center">❌</td>
        <td style="text-align: center">❌</td>
        <td style="text-align: center">❌</td>
        <td style="text-align: center">❌</td>
        <td style="text-align: center">❌</td>
        <td style="text-align: center"><a href="./server/windows/service.md">✅</a></td>
    </tr>
</table>

<blockquote>
A check mark in the "Any" row means the method is generic enough to
work on every OS that Fossil is known to run on. The check marks in
the rows below it usually just link to this generic documentation.

There are several widely-deployed socket activation schemes besides the
<tt>inetd</tt>, <tt>xinetd</tt>, and <tt>stunnel</tt> schemes with
documents linked above: Apple’s <tt>launchd</tt>, Linux’s
<tt>systemd</tt>, Solaris’ SMF, etc. We would welcome [./contribute.wiki
| contributions] to cover these as well. We also welcome contributions
to fill gaps (❌) in the table above.
</blockquote>


<h2 id="ext">CGI Server Extensions</h2>

<blockquote>
    In addition to serving Fossil repositories via CGI, Fossil can
    itself [./serverext.wiki | launch other programs via CGI] to
    implement server extensions. Do not confuse these two concepts. This
    extension mechanism works regardless of the method above you choose
    to serve your Fossil repository.
</blockquote>


<h2 id="tls">Securing a repository with TLS</h2>

<blockquote>
  Fossil's built-in HTTP server (e.g. "fossil server") does not support
  TLS, but there are multiple ways to protect your Fossil server with
  TLS. All of this is covered in a separate document, <a
  href="./ssl.wiki">Using TLS-Encrypted Communications with Fossil</a>.
</blockquote>


<h2 id="chroot">The Fossil Chroot Jail</h2>

<blockquote>
If you run Fossil as root in any mode that serves data on the
network, and you're running it on Unix or a compatible OS, Fossil
will drop itself into a [https://en.wikipedia.org/wiki/Chroot |
chroot jail] shortly after starting up. It then drops its root
privileges once it has done everything that requires root access.
The most common reason to run Fossil as root is to let it bind to
TCP port 80 for HTTP service, since normal users are restricted to
ports 1024 and up on OSes where this behavior occurs.

Fossil uses the owner of the Fossil repository file as its new user
ID when dropping root privileges.

When this happens, Fossil needs to have all of its dependencies
inside the chroot jail.  There are several things you typically need
in order to make things work properly:

<ul>
    <li>the repository file(s)

    <li><tt>/dev/null</tt> — create it with <tt>mknod(8)</tt>
    inside the jail directory

    <li><tt>/dev/urandom</tt> — ditto

    <li>any shared libraries your <tt>fossil</tt> binary is linked
    to, such as <tt>/lib/libssl.so</tt>; consider building Fossil as a
    static binary to avoid this
</ul>
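
As a rough illustration of the list above, the following dry-run sketch
only prints the commands one might run as root to populate a jail; it
does not execute them. The jail path <tt>/home/www</tt> is a placeholder,
and the device numbers are the Linux ones, so treat both as assumptions
to check against your own system.

```shell
#!/bin/sh
# Dry-run sketch: print (do not execute) commands that populate a jail.
# Assumptions: Linux device numbers; /home/www is a placeholder jail root.

jail_setup_commands() {
    jail=${1%/}
    echo "mkdir -p $jail/dev"
    # On Linux, /dev/null is character device 1,3 and /dev/urandom is 1,9.
    echo "mknod -m 666 $jail/dev/null c 1 3"
    echo "mknod -m 444 $jail/dev/urandom c 1 9"
    # List shared libraries that would also have to be copied into the jail.
    echo "ldd \$(command -v fossil)"
}

jail_setup_commands /home/www
```

Review the printed commands before running any of them by hand. A
statically linked <tt>fossil</tt> binary makes the <tt>ldd</tt> step
unnecessary.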
</blockquote>

<blockquote>
Fossil does all of this in order to protect the host OS.  By design,
there is no way to bypass it.
</blockquote>


<h2 id="loadmgmt">Managing Server Load</h2>

<blockquote>
A Fossil server is very efficient and normally presents a very light
load on the host.
The Fossil [./selfhost.wiki | self-hosting server] is a 1/24th slice VM at
[http://www.linode.com | Linode.com] hosting 65 other repositories in
addition to Fossil (including some very high-traffic sites such
as [http://www.sqlite.org] and [http://system.data.sqlite.org]), and
it has a typical load of 0.05 to 0.1.  A single HTTP request to Fossil
normally takes less than 10 milliseconds of CPU time to complete, so
requests can arrive at a continuous rate of 20 or more per second
and the CPU can still be mostly idle: at 20 requests per second, that
is under 200 milliseconds of CPU time per second, or about 20% of one
core.

However, there are some Fossil web pages that can consume large
amounts of CPU time, especially on repositories with a large number
of files or with long revision histories.  High CPU usage pages include
[/help?cmd=/zip | /zip], [/help?cmd=/tarball | /tarball],
[/help?cmd=/annotate | /annotate] and others.  On very large repositories,
these pages can take 15 seconds or more of CPU time.
If these kinds of requests arrive too quickly, the load average on the
server can grow dramatically, making the server unresponsive.

Fossil provides two capabilities to help avoid server overload problems
due to excessive requests to expensive pages:

<ol>
  <li><p>An optional cache is available that remembers the 10 most recently
      requested /zip or /tarball pages and returns the precomputed answer
      if the same page is requested again.</p>
  <li><p>Page requests can be configured to fail with a
      [http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.5.4 | "503 Server Overload"]
      HTTP error if an expensive request is received while the host load
      average is too high.</p>
</ol>

Both of these load-control mechanisms are turned off by default, but they
are recommended for high-traffic sites.

The webpage cache is activated by running the
[/help?cmd=cache|fossil cache init] command on the server.  Add a -R
option to name the repository for which to enable caching.  If you run
this command as root, be sure to "chown" the cache database to the
userid of the web server so that the server can write to it.  The cache
database is a separate file in the same directory as the repository,
with the same name but with the suffix changed to ".cache".
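
The naming rule for the cache database can be sketched as a small
helper. This is only an illustration: the repository path and the
<tt>www-data</tt> user are placeholders, the helper assumes the
repository file name has a suffix, and the script prints the commands
instead of running them.

```shell
#!/bin/sh
# Sketch: derive the cache database name and print the setup commands.
# The repository path and the "www-data" user are placeholders.

cache_db_for() {
    # Same directory and base name as the repository, suffix ".cache".
    printf '%s\n' "${1%.*}.cache"
}

cache_setup_commands() {
    repo=$1
    web_user=$2
    echo "fossil cache init -R $repo"
    echo "chown $web_user $(cache_db_for "$repo")"
}

cache_setup_commands /home/www/repos/demo.fossil www-data
```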

To activate the server load control feature,
visit the Admin → Access setup page in the administrative web
interface.  In the "<b>Server Load Average Limit</b>" box,
enter the load average threshold above which "503 Server
Overload" replies will be issued for expensive requests.  On the
self-hosting Fossil server, that value is set to 1.5, but you could easily
set it higher on a multi-core server.
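
The threshold test amounts to a single comparison. The helper below is
a hypothetical sketch of that decision, not Fossil's own code; the name
<tt>should_reject</tt> and the sample values are made up for
illustration.

```shell
#!/bin/sh
# Sketch of the decision only: reject an expensive request when the
# current load average is above the configured limit.

should_reject() {
    # $1 = current load average, $2 = configured limit
    awk -v load="$1" -v limit="$2" 'BEGIN { exit !(load + 0 > limit + 0) }'
}

if should_reject 2.0 1.5; then
    echo "503 Server Overload"
else
    echo "serve the page"
fi
```

With a limit of 1.5, a load of 2.0 is rejected, while the typical load
of 0.05 to 0.1 quoted earlier is well below the threshold.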

The maximum load average can also be set on the command line using
commands like this:
<blockquote><pre>
fossil set max-loadavg 1.5
fossil all set max-loadavg 1.5
</pre></blockquote>

The second form is especially useful for changing the maximum load average
simultaneously on a large number of repositories.

Note that this load-average limiting feature is only available on operating
systems that support the "getloadavg()" API.  Most modern Unix systems have
this interface, but Windows does not, so the feature will not work on Windows.
Note also that Linux implements "getloadavg()" by accessing the "/proc/loadavg"
file in the "proc" virtual file system.  If you are running a Fossil instance
inside a chroot() jail on Linux, you will need to make the "/proc" file
system available inside that jail in order for this feature to work.  On
the [./selfhost.wiki|self-hosting Fossil repositories], this was accomplished
by adding a line to the "/etc/fstab" file that looks like:

<blockquote><pre>
chroot_jail_proc /home/www/proc proc ro 0 0
</pre></blockquote>

The /home/www/proc pathname should be adjusted so that the "/proc" component is
in the root of the chroot jail, of course.

To see if the load-average limiter is functional, visit the [/test_env] page
of the server to view the current load average.  If the value shown is
greater than zero, the load-average limiter can be activated on that
repository.  If it shows exactly "0.0", Fossil is unable to find the load
average (either because it is in a chroot() jail without /proc access, or
because it is running on a system that does not support "getloadavg()")
and so the load-average limiter will not function.
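
On Linux, a similar check can be made from the host before visiting
the page. The sketch below assumes the jail root is
<tt>/home/www</tt> (a placeholder) and simply looks for a readable
<tt>proc/loadavg</tt> file under it; it says nothing about systems
that get the load average some other way.

```shell
#!/bin/sh
# Sketch: report whether a load average is readable under a given root.
# Linux-only assumption: the value comes from <root>/proc/loadavg.

jail_loadavg() {
    file=${1%/}/proc/loadavg
    [ -r "$file" ] || return 1
    # The first field is the 1-minute load average.
    cut -d ' ' -f 1 "$file"
}

if load=$(jail_loadavg /home/www); then
    echo "load average visible in jail: $load"
else
    echo "no /proc/loadavg in jail; the limiter would not function"
fi
```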

</blockquote>