Diff

Differences From Artifact [9128657200]:

To Artifact [023c06fa69]:



Ordinary characters consume a single character of the target and must
match it exactly.

Special characters (and special character sequences) consume zero or
more characters from the target and describe what matches. The special
characters (and sequences) are:

:Pattern |:Effect
---------------------------------------------------------------------
`*`      | Matches any sequence of zero or more characters
`?`      | Matches exactly one character
`[...]`  | Matches one character from the enclosed list of characters
`[^...]` | Matches one character not in the enclosed list

Special character sequences have some additional features:

 *  A range of characters may be specified with `-`, so `[a-d]` matches
    exactly the same characters as `[abcd]`. Ranges reflect Unicode
    code points without any locale-specific collation sequence.
 *  Include `-` in a list by placing it last, just before the `]`.

 *  Note that unlike typical Unix shell globs, wildcards (`*`, `?`,
    and character lists) are allowed to match `/` directory
    separators as well as the initial `.` in the name of a hidden
    file or directory.

Some examples of character lists:

:Pattern |:Effect
---------------------------------------------------------------------
`[a-d]`  | Matches any one of `a`, `b`, `c`, or `d` but not `ä`
`[^a-d]` | Matches exactly one character other than `a`, `b`, `c`, or `d`
`[0-9a-fA-F]` | Matches exactly one hexadecimal digit
`[a-]`   | Matches either `a` or `-`
`[][]`   | Matches either `]` or `[`
`[^]]`   | Matches exactly one character other than `]`
`[]^]`   | Matches either `]` or `^`
`[^-]`   | Matches exactly one character other than `-`
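
The character-list rules above can be tried out with a small translator from globs to regular expressions. This is only a sketch in Python, not Fossil's implementation: the names `glob_to_regex`, `glob_match`, and `_char_class` are hypothetical, and raising `ValueError` for an unterminated list is an assumption made for the example.

```python
import re

def _char_class(body: str, negate: bool) -> str:
    """Build a regex character class from the text between `[` and `]`."""
    parts = []
    k = 0
    while k < len(body):
        # `x-y` is a range only when a character follows the `-`;
        # a trailing `-` is therefore an ordinary list member.
        if k + 2 < len(body) and body[k + 1] == "-":
            parts.append(re.escape(body[k]) + "-" + re.escape(body[k + 2]))
            k += 3
        else:
            parts.append(re.escape(body[k]))
            k += 1
    return "[" + ("^" if negate else "") + "".join(parts) + "]"

def glob_to_regex(pattern: str) -> re.Pattern:
    """Translate a glob into a regular expression (illustrative only)."""
    out = []
    i, n = 0, len(pattern)
    while i < n:
        c = pattern[i]
        i += 1
        if c == "*":
            out.append(".*")          # zero or more characters, `/` included
        elif c == "?":
            out.append(".")           # exactly one character
        elif c == "[":
            j = i
            negate = j < n and pattern[j] == "^"
            if negate:
                j += 1
            start = j
            # A `]` placed first in the list is a literal member.
            if j < n and pattern[j] == "]":
                j += 1
            while j < n and pattern[j] != "]":
                j += 1
            if j >= n:
                # Assumption for this sketch: reject unterminated lists.
                raise ValueError("unterminated character list")
            out.append(_char_class(pattern[start:j], negate))
            i = j + 1
        else:
            out.append(re.escape(c))  # ordinary character, matched exactly
    return re.compile("".join(out), re.DOTALL)

def glob_match(pattern: str, name: str) -> bool:
    """Return True when the whole of `name` matches `pattern`."""
    return glob_to_regex(pattern).fullmatch(name) is not None
```

Python compares range endpoints by code point, which is why `[a-d]` rejects `ä` here, in line with the rule that ranges ignore locale-specific collation.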

White space means the specific ASCII characters TAB, LF, VT, FF, CR,
and SPACE.  Note that this does not include any of the many additional
spacing characters available in Unicode, and specifically does not
include U+00A0 NO-BREAK SPACE.
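
To make the distinction concrete, the following Python sketch spells out that six-character set and contrasts it with Python's own Unicode-aware notion of white space. The names `GLOB_WHITESPACE`, `is_glob_whitespace`, and `trim_glob_whitespace` are hypothetical and exist only for this illustration.

```python
# The six ASCII characters listed above: TAB, LF, VT, FF, CR, and SPACE.
GLOB_WHITESPACE = "\t\n\v\f\r "

def is_glob_whitespace(ch: str) -> bool:
    """True only for a single character from the ASCII set above."""
    return len(ch) == 1 and ch in GLOB_WHITESPACE

def trim_glob_whitespace(text: str) -> str:
    """Strip leading and trailing characters from the ASCII set only."""
    return text.strip(GLOB_WHITESPACE)
```

For comparison, Python's bare `str.strip()` also removes U+00A0, so `"\u00a0*.o".strip()` yields `"*.o"`, while `trim_glob_whitespace("\u00a0*.o")` leaves the no-break space in place.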

Because both LF and CR are white space and leading and trailing spaces

not be a surprise on Unix where all file names are also case
sensitive. However, most Windows file systems are case preserving and
case insensitive. That is, on Windows, the names `ReadMe` and `README`
are names of the same file; on Unix they are different files.

Some example cases:

:Pattern     |:Effect
--------------------------------------------------------------------------------
`README`     | Matches only a file named `README` in the root of the tree. It does not match a file named `src/README` because it does not include any characters that consume (and match) the `src/` part.
`*/README`   | Matches `src/README`. Unlike Unix file globs, it also matches `src/library/README`. However, it does not match the file `README` in the root of the tree.
`*README`    | Matches `src/README`, the file `README` in the root of the tree, `foo/bar/README`, and any other file named `README` in the tree. However, it also matches `A-DIFFERENT-README` and `src/DO-NOT-README`, or any other file whose name ends with `README`.
`src/README` | Matches `src\README` on Windows because all directory separators are rewritten as `/` in the canonical name before the glob is matched. This makes it much easier to write globs that work on both Unix and Windows.
`*.[ch]`     | Matches every C source or header file in the tree at the root or at any depth. Again, this is (deliberately) different from Unix file globs and Windows wild cards.
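
The example cases above can be checked against SQLite's `GLOB` operator, which shares the wildcard semantics described here. The Python sketch below uses the standard `sqlite3` module. The helper names `sqlite_glob` and `canonical` are hypothetical, and `canonical` is only a stand-in for the separator rewriting described above, not Fossil's actual canonicalization code.

```python
import sqlite3

def canonical(name: str) -> str:
    """Stand-in for canonical naming: rewrite `\\` separators as `/`."""
    return name.replace("\\", "/")

def sqlite_glob(pattern: str, name: str) -> bool:
    """Evaluate `name GLOB pattern` in an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    try:
        # In SQLite, the pattern is the right-hand operand of GLOB.
        (matched,) = conn.execute(
            "SELECT ? GLOB ?", (name, pattern)
        ).fetchone()
    finally:
        conn.close()
    return bool(matched)

if __name__ == "__main__":
    cases = [
        ("README", "README"),
        ("README", "src/README"),
        ("*/README", "src/library/README"),
        ("*/README", "README"),
        ("*README", "src/DO-NOT-README"),
        ("src/README", canonical("src\\README")),
        ("*.[ch]", "src/glob.c"),
        ("README", "ReadMe"),
    ]
    for pattern, name in cases:
        print(f"{pattern!r:14} {name!r:22} {sqlite_glob(pattern, name)}")
```

Opening a fresh in-memory database per call keeps the helper self-contained; a real tool would reuse one connection.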


## Where Globs are Used

### Settings that are Globs

These settings are all lists of glob patterns:

:Setting        |:Description
--------------------------------------------------------------------------------
`binary-glob`   | Files that should be treated as binary files for committing and merging purposes
`clean-glob`    | Files that the [`clean`][] command will delete without prompting or allowing undo
`crlf-glob`     | Files in which it is okay to have `CR`, `CR`+`LF` or mixed line endings.  Set to "`*`" to disable CR+LF checking
`crnl-glob`     | Alias for the `crlf-glob` setting
`encoding-glob` | Files that the [`commit`][] command will ignore when issuing warnings about text files that may use another encoding than ASCII or UTF-8.  Set to "`*`" to disable encoding checking
`ignore-glob`   | Files that the [`add`][], [`addremove`][], [`clean`][], and [`extras`][] commands will ignore
`keep-glob`     | Files that the [`clean`][] command will keep

All may be [versioned, local, or global](settings.wiki). Use `fossil
settings` to manage local and global settings, or a file in the
repository's `.fossil-settings/` folder at the root of the tree, named
for each versioned setting.

Using versioned settings for these not only has the advantage that

usually named to correspond to the setting they override, such as
`--ignore` to override the `ignore-glob` setting. These commands are:

 *  [`add`][]
 *  [`addremove`][]
 *  [`changes`][]
 *  [`clean`][]
 *  [`commit`][]
 *  [`extras`][]
 *  [`merge`][]
 *  [`settings`][]
 *  [`status`][]
 *  [`unset`][]

The commands [`tarball`][] and [`zip`][] produce compressed archives of a
specific checkin. They may be further restricted by options that
specify glob patterns that name files to include or exclude rather
than archiving the entire checkin.

The commands [`http`][], [`cgi`][], [`server`][], and [`ui`][], which
implement or support web servers, provide a mechanism to serve some
files as static content, where a list of glob patterns specifies what
content may be served.

[`add`]: /help?cmd=add
[`addremove`]: /help?cmd=addremove
[`changes`]: /help?cmd=changes
[`clean`]: /help?cmd=clean
[`commit`]: /help?cmd=commit
[`extras`]: /help?cmd=extras
[`merge`]: /help?cmd=merge
[`settings`]: /help?cmd=settings
[`status`]: /help?cmd=status
[`unset`]: /help?cmd=unset

[`tarball`]: /help?cmd=tarball


a glob pattern. Find commands and pages in the fossil sources by
looking for comments like `COMMAND: add` or `WEBPAGE: timeline` in
front of the function that implements the command or page in files
`src/*.c`. (Fossil's build system creates the tables used to dispatch
commands at build time by searching the sources for those comments.) A
few starting points:

:File            |:Description
--------------------------------------------------------------------------------
[`src/glob.c`][] | Implementation of glob pattern list loading, parsing, and matching.
[`src/file.c`][] | Implementation of various kinds of canonical names of a file.

[`src/glob.c`]: https://www.fossil-scm.org/index.html/file/src/glob.c
[`src/file.c`]: https://www.fossil-scm.org/index.html/file/src/file.c

The actual pattern matching is implemented in SQL, so the
documentation for `GLOB` and the other string matching operators in
[SQLite](https://sqlite.org/lang_expr.html#like) is useful. Of
course, the SQLite [source code](https://www.sqlite.org/src/artifact?name=9d52522cc8ae7f5c&ln=570-768)
and [test harnesses](https://www.sqlite.org/src/artifact?name=66a2c9ac34f74f03&ln=586-673)
also make entertaining reading.