Overview
| Comment: | Add the complex-requests-from-robots limiter. |
|---|---|
| Downloads: | Tarball | ZIP archive |
| Timelines: | family | ancestors | descendants | both | trunk |
| Files: | files | file ages | folders |
| SHA3-256: | 1a0b3043073b1f2b9274a247df7c6f77 |
| User & Date: | drh 2024-07-26 17:49:16.726 |
References
2024-09-11
| 13:30 | Update tests to account for new settings introduced with [1a0b304307] and [cadfcba32c]. ... (check-in: 6ead7d999e user: andybradford tags: trunk) |
Context
2024-07-27
| 10:20 | A redirect to the honeypot due to robot complex-request detection also sets the "fossil-goto" cookie with the original URL. If a real user proceeds to login, then a redirect to the complex request occurs as soon as the login completes. ... (check-in: aa4159f781 user: drh tags: trunk) |

2024-07-26
| 17:49 | Add the complex-requests-from-robots limiter. ... (check-in: 1a0b304307 user: drh tags: trunk) | |
| 10:49 | When doing a "fossil open URL" such that the repository is first cloned and then opened, leaving the repository as a file in the check-out, make sure the repository pathname in VVAR is relative, so that the entire check-out can be moved without breaking the link to the repository. See [forum:/forumpost/f2f5ff2e35031612|forum thread f2f5ff2e35031612]. ... (check-in: 6e04d9cbd4 user: drh tags: trunk) | |
Changes
Changes to src/cgi.c.
if( i<nUsedQP ){
memmove(aParamQP+i, aParamQP+i+1, sizeof(*aParamQP)*(nUsedQP-i));
}
return;
}
}
}
/*
** Return the number of query parameters. Cookies and environment variables
** do not count. Also, do not count the special QP "name".
*/
int cgi_qp_count(void){
int cnt = 0;
int i;
for(i=0; i<nUsedQP; i++){
if( aParamQP[i].isQP && fossil_strcmp(aParamQP[i].zName,"name")!=0 ) cnt++;
}
return cnt;
}
/*
** Add an environment variable value to the parameter set. The zName
** portion is fixed but a copy is made of zValue.
*/
void cgi_setenv(const char *zName, const char *zValue){
cgi_set_parameter_nocopy(zName, fossil_strdup(zValue), 0);
Changes to src/login.c.
fossil_exit(0);
}
}
fossil_free(zDecode);
return uid;
}
/*
** SETTING: robot-limiter boolean default=off
** If enabled, HTTP requests with one or more query parameters and
** without a REFERER string and without a valid login cookie are
** assumed to be hostile robots and are redirected to the honeypot.
** See also the robot-allow and robot-restrict settings which can
** be used to override the value of this setting for specific pages.
*/
/*
** SETTING: robot-allow width=40 block-text
** The VALUE of this setting is a list of GLOB patterns which match
** pages for which the robot-limiter is overridden to false. If this
** setting is missing or an empty string, then it is assumed to match
** nothing.
*/
/*
** SETTING: robot-restrict width=40 block-text
** The VALUE of this setting is a list of GLOB patterns which match
** pages for which the robot-limiter setting should be enforced.
** In other words, if the robot-limiter is true and this setting either
** does not exist or is empty or matches the current page, then a
** redirect to the honeypot is issued. If this setting exists
** but does not match the current page, then the robot-limiter setting
** is overridden to false.
*/
/*
** Check to see if the current HTTP request is a complex request that
** is coming from a robot and if access should be restricted for such robots.
** For the purposes of this module, a "complex request" is an HTTP
** request with one or more query parameters.
**
** If this routine determines that robots should be restricted, then
** this routine publishes a redirect to the honeypot and exits without
** returning to the caller.
**
** This routine concludes that a complex request is coming from
** a robot if all of the following are true:
**
** * The user is "nobody".
** * The REFERER field of the HTTP header is missing or empty.
** * There are one or more query parameters other than "name".
**
** Robot restrictions are governed by settings.
**
** robot-limiter The restrictions implemented by this routine only
** apply if this setting exists and is true.
**
** robot-allow If this setting exists and the page of the request
**                 matches the comma-separated GLOB list that is the
** value of this setting, then no robot restrictions
** are applied.
**
** robot-restrict If this setting exists then robot restrictions only
** apply to pages that match the comma-separated
** GLOB list that is the value of this setting.
*/
void login_restrict_robot_access(void){
const char *zReferer;
const char *zGlob;
Glob *pGlob;
int go = 1;
if( g.zLogin!=0 ) return;
zReferer = P("HTTP_REFERER");
if( zReferer && zReferer[0]!=0 ) return;
if( !db_get_boolean("robot-limiter",0) ) return;
if( cgi_qp_count()<1 ) return;
zGlob = db_get("robot-allow",0);
if( zGlob && zGlob[0] ){
pGlob = glob_create(zGlob);
go = glob_match(pGlob, g.zPath);
glob_free(pGlob);
if( go ) return;
}
zGlob = db_get("robot-restrict",0);
if( zGlob && zGlob[0] ){
pGlob = glob_create(zGlob);
go = glob_match(pGlob, g.zPath);
glob_free(pGlob);
if( !go ) return;
}
/* If we reach this point, it means we have a situation where we
** want to restrict the activity of a robot.
*/
cgi_redirectf("%R/honeypot");
}
/*
** This routine examines the login cookie to see if it exists and
** is valid. If the login cookie checks out, it then sets global
** variables appropriately.
**
** g.userUid Database USER.UID value. Might be -1 for "nobody"
** g.zLogin Database USER.LOGIN value. NULL for user "nobody"
uid = -1;
zCap = "";
}
login_create_csrf_secret("none");
}
login_set_uid(uid, zCap);
/* Maybe restrict access to robots */
login_restrict_robot_access();
}
/*
** Set the current logged in user to be uid. zCap is precomputed
** (override) capabilities. If zCap==0, then look up the capabilities
** in the USER table.
*/
Changes to src/main.c.
** and the SSH_CONNECTION environment variable is set. Use the --test
** option on interactive sessions to avoid that special processing when
** using this command interactively over SSH. A better solution would be
** to use a different command for "ssh" sync, but we cannot do that without
** breaking legacy.
**
** Options:
** --nobody Pretend to be user "nobody"
** --test Do not do special "sync" processing when operating
** over an SSH link
** --th-trace Trace TH1 execution (for debugging purposes)
** --usercap CAP User capability string (Default: "sxy")
**
*/
void cmd_test_http(void){
const char *zIpAddr; /* IP address of remote client */
const char *zUserCap;
int bTest = 0;
Th_InitTraceLog();
zUserCap = find_option("usercap",0,1);
if( !find_option("nobody",0,0) ){
if( zUserCap==0 ){
g.useLocalauth = 1;
zUserCap = "sxy";
}
login_set_capabilities(zUserCap, 0);
}
bTest = find_option("test",0,0)!=0;
g.httpIn = stdin;
g.httpOut = stdout;
fossil_binary_mode(g.httpOut);
fossil_binary_mode(g.httpIn);
g.zExtRoot = find_option("extroot",0,1);
find_server_repository(2, 0);
g.zReqType = "HTTP";
Changes to src/setup.c.
@ computer is too large. Set the threshold for disallowing expensive
@ computations here. Set this to 0.0 to disable the load average limit.
@ This limit is only enforced on Unix servers. On Linux systems,
@ access to the /proc virtual filesystem is required, which means this limit
@ might not work inside a chroot() jail.
@ (Property: "max-loadavg")</p>
@ <hr>
onoff_attribute("Prohibit robots from issuing complex requests",
"robot-limiter", "rlb", 0, 0);
@ <p> A "complex request" is an HTTP request that has one or more query
@ parameters. Some robots will spend hours juggling around query parameters
@ or even forging fake query parameters in an effort to discover new
@ behavior or to find an SQL injection opportunity or similar. This can
@ waste hours of CPU time and gigabytes of bandwidth on the server. Hence,
@ it is recommended to turn this feature on to stop such nefarious behavior.
@ (Property: robot-limiter)
@
@ <p> When enabled, complex requests from user "nobody" that lack a Referer
@ header are redirected to the honeypot.
@
@ <p> Additional settings below allow positive and negative overrides of
@ this complex request limiter.
@ <p><b>Allow Robots To See These Pages</b> (Property: robot-allow)<br>
textarea_attribute("", 4, 80,
"robot-allow", "rballow", "", 0);
@ <p><b>Apply Robot Restrictions Only On These Pages</b>
@ (Property: robot-restrict)<br>
textarea_attribute("", 4, 80,
"robot-restrict", "rbrestrict", "", 0);
@ <hr>
@ <p><input type="submit" name="submit" value="Apply Changes"></p>
@ </div></form>
db_end_transaction(0);
style_finish_page();
}