Timeline

5 check-ins using file src/doc.c version e3815bcae9

2012-10-31
13:58
fix comment check-in: e1aed25eee user: jan.nijtmans tags: improve_looks_like_binary
12:58
Two more enhancements.
- DOS text files sometimes use Control-Z (0x1a) as an EOF marker, so this byte should be considered text.
- FEFF, FFFE and FFFF are invalid UTF-16 code points (when not used as BOM), so files containing those should be considered binary.
check-in: e3f3c390f1 user: jan.nijtmans tags: improve_looks_like_binary
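
The two rules in this check-in can be illustrated with a minimal sketch. It is not the code from src/doc.c: the function names `byte_is_text` and `utf16_unit_is_binary` are hypothetical, and the set of control characters accepted as text (TAB, LF, FF, CR) is an assumption. Only the treatment of 0x1a, FEFF, FFFE and FFFF comes from the message above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not Fossil's implementation.
** Returns 1 if the byte may appear in a text file, 0 if it suggests binary. */
int byte_is_text(unsigned char c){
  if( c>=0x20 ) return 1;            /* printable ASCII and high bytes */
  switch( c ){
    case '\t': case '\n': case '\f': case '\r':
      return 1;                      /* assumed whitespace whitelist */
    case 0x1a:
      return 1;                      /* Control-Z: DOS EOF marker, counts as text */
    default:
      return 0;                      /* NUL and other control bytes */
  }
}

/* Hypothetical helper, not Fossil's implementation.
** Returns 1 if UTF-16 code unit u at position index (counted in code
** units) marks the content as binary. */
int utf16_unit_is_binary(uint16_t u, size_t index){
  if( u==0xFFFF ) return 1;          /* never valid, not even as a BOM */
  if( u==0xFEFF || u==0xFFFE ){
    return index!=0;                 /* acceptable only as a leading BOM */
  }
  return 0;
}
```

Under these assumptions, `byte_is_text(0x1a)` yields 1, and `utf16_unit_is_binary(0xFEFF, 0)` yields 0 while the same value at any later index yields 1.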
09:15
Fix UTF-16 line length determination: j is counted in characters, not bytes. check-in: 44c6be2ab6 user: jan.nijtmans tags: improve_looks_like_binary
08:43
Enhance looks_like_text():
- Detect line-length overflow earlier, not at the next NL
- Implement the same binary and line-length check for UTF-16 as well

For UTF-16, the line-length limit is set to 2/3 of the line-length limit for other text, because UTF-16 -> UTF-8 conversion can increase the line length (in bytes) by at most 50%. This gu...

check-in: 58702daa55 user: jan.nijtmans tags: improve_looks_like_binary
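
A minimal sketch of the early overflow check and the 2/3 limit described in this check-in follows. It is not the code from src/doc.c: the function names and the `maxLineBytes` parameter are hypothetical. Following the 09:15 check-in, the UTF-16 counter is kept in characters (16-bit code units) rather than bytes.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not Fossil's implementation.
** Returns 1 if no line in z[0..n-1] exceeds maxLineBytes bytes.
** The overflow is reported as soon as the counter passes the limit,
** without waiting for the next NL. */
int bytes_lines_fit(const unsigned char *z, size_t n, size_t maxLineBytes){
  size_t j = 0;                      /* length of the current line in bytes */
  for(size_t i=0; i<n; i++){
    if( z[i]=='\n' ){
      j = 0;
    }else if( ++j>maxLineBytes ){
      return 0;                      /* early exit on overflow */
    }
  }
  return 1;
}

/* Hypothetical helper, not Fossil's implementation.
** Same check for UTF-16 input. The byte limit is reduced to 2/3 because
** a 2-byte UTF-16 code unit can become 3 bytes in UTF-8; the result is
** then divided by 2 because j counts code units, not bytes. */
int utf16_lines_fit(const uint16_t *z, size_t nUnits, size_t maxLineBytes){
  const size_t maxUnits = (maxLineBytes*2/3)/sizeof(uint16_t);
  size_t j = 0;                      /* length of the current line in code units */
  for(size_t i=0; i<nUnits; i++){
    if( z[i]==(uint16_t)'\n' ){
      j = 0;
    }else if( ++j>maxUnits ){
      return 0;
    }
  }
  return 1;
}
```

With `maxLineBytes` set to 12, the UTF-16 limit works out to 8 bytes, or 4 code units. Four code units expand to at most 12 bytes of UTF-8, so a line accepted here still fits the limit for other text after conversion.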
2012-10-30
20:10
Faster determination of binary files, by not only checking for NUL

re-use looks_like_blob

check-in: 0ba08f9d26 user: jan.nijtmans tags: improve_looks_like_binary