mulster

What is this

This small utility performs multi-line replacements in text files. It can also perform in-memory replacements on a list of strings.

Although there are plenty of similar utilities, the mulster may have merits of its own.

It is useful when you need to:

  1. periodically replace groups of lines in some texts;
  2. periodically modify a snapshot of some software without forking it;
  3. share your small modifications with colleagues without resorting to such great tools as GIT, MERCURIAL, FOSSIL etc.;
  4. customize files generated by document generators like Ruff!;
  5. prepare a Tcl source file to be processed by freeWrap;
  6. use multi-line replacement functions in your own software.


Usage

The mulster is called this way:

tclsh mulster.tcl [options] fileini

where

fileini is the name of a file containing the settings for replacements

options are:

  • -infile input-file-name sets the name of the input file to be processed;
  • -outfile output-file-name sets the name of the output file; if -outfile is omitted, the output file name is the same as the input file name;
  • -backup BAK means that the original input files are backed up to the BAK directory (this is the default behavior);
  • -backup 0 means that the original input files are not backed up;
  • -keep 1 means that the input files' attributes and times are preserved in the output files; by default they are not kept;
  • -charset charset sets the charset of the input files, e.g. cp1251;
  • -lineend lineend sets the line ending of the input files, e.g. \r\n (by default, \n);
  • -single 1 means standard 'one string for one string' replacements;
  • -files 1 means that the OUT= lines are names of files whose lines are used as the replacement (e.g. for freeWrap, to replace "source" lines with the contents of Tcl files);
  • -mode sets how the IN= lines are matched against the lines of the input file:
    • -mode 0 (or exact0 or EXACT0) to match exactly, ignoring leading/trailing spaces
    • -mode 1 (or exact or EXACT), default, to match exactly, including all leading/trailing spaces
    • -mode 2 (or glob or GLOB) to match a glob pattern
    • -mode 3 (or regexp, re, REGEXP, RE) to match a regexp pattern
    • -mode regexp-- to match a regexp pattern and call regsub, i.e. to match the regexp of the IN= line and substitute according to the OUT= line
      Note: the IN= line is the pattern to find and the OUT= line is the substitution, both following the regexp syntax accepted in Tcl.
      Note: with regexp and regsub, only one line of each IN=/OUT= block is used.
    • -mode regexp-all to match a regexp pattern and call regsub -all
    • -mode regexp-nocase to match a regexp pattern and call regsub -nocase
    • -mode regexp-expanded to match a regexp pattern and call regsub -expanded
    • the regexp suffixes can be combined, e.g. -mode regexp-all-nocase
  • -- ends the options (what follows is taken as fileini).
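
The four basic matching modes can be pictured with a small Python sketch. It is only an approximation under stated assumptions: the function name line_matches is hypothetical, Python's fnmatch and re stand in for Tcl's glob and regexp matching (their syntaxes differ in details), and the unanchored regexp search is assumed, not taken from the mulster source.

```python
import fnmatch
import re

def line_matches(mode, pattern, line):
    """Approximate the four basic -mode values for a single line."""
    if mode in ("0", "exact0", "EXACT0"):
        # exact match, leading/trailing spaces ignored
        return pattern.strip() == line.strip()
    if mode in ("1", "exact", "EXACT"):
        # exact match, spaces included
        return pattern == line
    if mode in ("2", "glob", "GLOB"):
        # glob match (Python flavour, assumed close to Tcl's)
        return fnmatch.fnmatchcase(line, pattern)
    if mode in ("3", "regexp", "re", "REGEXP", "RE"):
        # regexp match anywhere in the line (assumption)
        return re.search(pattern, line) is not None
    raise ValueError(f"unknown mode: {mode}")

print(line_matches("0", "  proc1 $a ", "proc1 $a"))      # spaces ignored
print(line_matches("1", "  proc1 $a ", "proc1 $a"))      # spaces count
print(line_matches("glob", "proc* $a", "proc1 $a"))
print(line_matches("re", r"^proc\d", "proc1 $a"))
```
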
The -infile, -outfile, -mode, -backup, -keep, -single, -files, -charset and -lineend options can be redefined in fileini, e.g.
...
BACKUP=BAK/new backup dir
KEEP=1
CHARSET=cp1251
LINEEND=\r\n
INFILE=input file2 name
OUTFILE=output file2 name
SINGLE=1
MODE=re
...
so that these options can be set individually for each file or group of files.

The fileini has the following structure:

INFILE=input file name
OUTFILE=output file name

IN=BEGIN
line #1 to find
line #2 to find
...
line #N1 to find
IN=END
OUT=BEGIN
line #1 of replacement
line #2 of replacement
...
line #N2 of replacement
OUT=END

IN=BEGIN(r1,r2)
....
IN=END
OUT=BEGIN
...
OUT=END
...

BACKUP=new backup dir
KEEP=1
CHARSET=charset
LINEEND=line ending

INFILE=input file2 name
OUTFILE=output file2 name

SINGLE=1
MODE=regexp
...

The INFILE= and OUTFILE= set the names of input and output files.

The INFILE= names can be glob patterns, but OUTFILE= may contain only "*" as a wildcard, which is replaced with the corresponding part of the matching INFILE= file name, e.g.:

INFILE=~/DOCS/HTML/*.html
OUTFILE=~/DOCS/HTML/GENERATED/gen_*.html

so that the input ~/DOCS/HTML/name1.html would result in the output ~/DOCS/HTML/GENERATED/gen_name1.html.
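
The mapping from an input name to an output name can be sketched in Python as follows. This is an illustration only: the function output_name is hypothetical, it assumes a single "*" in each pattern, and it does not deal with subdirectories.

```python
import re

def output_name(in_pattern, in_file, out_pattern):
    """Fill the "*" of out_pattern with the text matched by "*" in in_pattern."""
    head, star, tail = in_pattern.partition("*")
    if not star:
        raise ValueError("in_pattern has no '*'")
    # everything between the fixed head and tail is what "*" stands for
    m = re.fullmatch(re.escape(head) + "(.*)" + re.escape(tail), in_file)
    if m is None:
        raise ValueError("in_file does not match in_pattern")
    return out_pattern.replace("*", m.group(1), 1)

name = output_name("~/DOCS/HTML/*.html",
                   "~/DOCS/HTML/name1.html",
                   "~/DOCS/HTML/GENERATED/gen_*.html")
print(name)
```
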

If INFILE= and OUTFILE= are glob patterns, the INFILE= directories are scanned recursively for files matching the input pattern. All output files are created in the corresponding subdirectories of the OUTFILE= root directory.

If the input file name equals the output file name, all modifications are made to that same file.

All strings between the current IN=BEGIN and IN=END are replaced with the strings between the next OUT=BEGIN and OUT=END. The sequence of INFILE=, OUTFILE=, IN=, OUT= is given for each processed file.

The IN=BEGIN(r1,r2) form means that a range of found matches should be processed as follows:

IN=BEGIN(r1,r2) - r1-th match through r2-th one
IN=BEGIN(r1,0)  - r1-th match through the last one
IN=BEGIN(0,r2)  - the same as IN=BEGIN(1,r2)
IN=BEGIN(1,1)   - first match only
IN=BEGIN(0,0)   - all matches; the same as IN=BEGIN
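
The range rules above can be restated as a small Python sketch. The function select_matches is hypothetical, and two details are assumptions that the text does not specify: r2 beyond the number of matches is clamped to the last match, and r1 greater than r2 selects nothing.

```python
def select_matches(match_count, r1=0, r2=0):
    """Return the 1-based ordinals of matches to process for IN=BEGIN(r1,r2)."""
    first = r1 if r1 > 0 else 1              # 0 means "from the first match"
    last = r2 if r2 > 0 else match_count     # 0 means "through the last match"
    return list(range(first, min(last, match_count) + 1))

print(select_matches(5, 2, 4))   # 2nd through 4th match
print(select_matches(5, 2, 0))   # 2nd through the last
print(select_matches(5, 0, 3))   # same as (1,3)
print(select_matches(5, 1, 1))   # first match only
print(select_matches(5, 0, 0))   # all matches
```
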

All strings outside of INFILE=, OUTFILE=, IN=BEGIN through IN=END, OUT=BEGIN through OUT=END, BACKUP=, KEEP=, CHARSET=, LINEEND=, SINGLE=, FILES=, MODE= are ignored (they serve as a sort of comments).

Note: when the mulster comes across a BACKUP=, KEEP=, CHARSET=, LINEEND= or INFILE= option in fileini, it flushes all collected changes to the current output file and begins a new collection of changes for a new input/output. The SINGLE=, FILES= and MODE= options apply to the IN=, OUT= blocks that follow them.

So, the order of options is important:

  1. BACKUP=, KEEP=, CHARSET=, LINEEND= go first if any
  2. INFILE= and OUTFILE= go next
  3. SINGLE=, FILES= and MODE= go next if any (defined for the following IN=,OUT= blocks)
  4. IN=BEGIN and IN=END go next
  5. OUT=BEGIN and OUT=END go next
  6. (3) through (5) can be repeated
  7. (1) through (6) can be repeated
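
To make the structure concrete, here is a rough Python sketch of reading a fileini into options and IN=/OUT= blocks. It is not the mulster's own parser: the function parse_fileini and its return format are hypothetical, and it assumes that keywords start at the beginning of a line with no surrounding spaces.

```python
import re

OPTION_KEYS = ("INFILE", "OUTFILE", "BACKUP", "KEEP", "CHARSET",
               "LINEEND", "SINGLE", "FILES", "MODE")
BEGIN_RE = re.compile(r"IN=BEGIN(?:\((\d+),(\d+)\))?$")

def parse_fileini(text):
    """Return ("option", key, value) and ("block", range, find, repl) items."""
    items = []
    state = None                      # None, "in" or "out"
    find, repl, rng = [], [], (0, 0)
    for line in text.splitlines():
        if state == "in":
            if line == "IN=END":
                state = None
            else:
                find.append(line)
        elif state == "out":
            if line == "OUT=END":
                items.append(("block", rng, find, repl))
                find, repl, rng, state = [], [], (0, 0), None
            else:
                repl.append(line)
        else:
            m = BEGIN_RE.match(line)
            if m:
                # a plain IN=BEGIN is treated as IN=BEGIN(0,0)
                rng = (int(m.group(1) or 0), int(m.group(2) or 0))
                find, state = [], "in"
            elif line == "OUT=BEGIN":
                repl, state = [], "out"
            else:
                key, sep, value = line.partition("=")
                if sep and key in OPTION_KEYS:
                    items.append(("option", key, value))
                # any other line is ignored, like a comment
    return items

sample = """this line is a comment
INFILE=modul1.tcl
OUTFILE=modul2.tcl
MODE=exact
IN=BEGIN(1,1)
proc1 $a $b
proc2 $a2 $b2
IN=END
OUT=BEGIN
proc3 $a $b $a2 $b2
OUT=END
"""
items = parse_fileini(sample)
for item in items:
    print(item)
```
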


Examples

For example, applying the following fileini:

INFILE=modul1.tcl
OUTFILE=modul2.tcl
IN=BEGIN
proc1 $a $b
proc2 $a2 $b2
IN=END
OUT=BEGIN
proc3 $a $b $a2 $b2  ;# <=====REPLACED
OUT=END

... to the modul1.tcl containing:

1st-comm
2nd-comm
proc1 $a $b
proc2 $a2 $b2
next-comm
#... other commands
proc1 $a $b
proc2 $a2 $b2

... we get the modul2.tcl containing:

1st-comm
2nd-comm
proc3 $a $b $a2 $b2  ;# <=====REPLACED
next-comm
#... other commands
proc3 $a $b $a2 $b2  ;# <=====REPLACED
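
The same replacement, done in memory on a list of strings, can be sketched in Python as follows. This illustrates the exact-match case only; the function replace_blocks is hypothetical and is not the mulster's own procedure.

```python
def replace_blocks(lines, find, repl):
    """Replace every run of lines equal to `find` with the lines of `repl`."""
    if not find:
        return list(lines)
    out = []
    i = 0
    n = len(find)
    while i < len(lines):
        if lines[i:i + n] == find:
            out.extend(repl)      # whole block matched: emit the replacement
            i += n
        else:
            out.append(lines[i])  # no match at this position: copy the line
            i += 1
    return out

source = [
    "1st-comm",
    "2nd-comm",
    "proc1 $a $b",
    "proc2 $a2 $b2",
    "next-comm",
    "#... other commands",
    "proc1 $a $b",
    "proc2 $a2 $b2",
]
find = ["proc1 $a $b", "proc2 $a2 $b2"]
repl = ["proc3 $a $b $a2 $b2  ;# <=====REPLACED"]
result = replace_blocks(source, find, repl)
print("\n".join(result))
```
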

Examples of calling:

tclsh mulster.tcl mulster1_ini
tclsh mulster.tcl -mode 0 mulster2_ini
tclsh mulster.tcl -backup ~/BAK mulster3_ini
tclsh mulster.tcl -mode glob -backup 0 -keep 0 mulster4_ini

Please be careful with the -backup 0 option. It is suitable when:

  • none of your input files is the same as its output file;
  • you've made a backup beforehand;
  • you have a nice VCS and are not worried about any data loss.
Otherwise you risk losing data.

The mulster prints out a log of the replacements made (as well as of those not made).

The #-comments of fileini are printed as well. It's good for debugging.

Also in fileini, you can set MODE=debug to print the current modes. Use MODE=exit to stop the processing.


Why

I ran into a suitable case for 'mulstering' some time ago, when it occurred to me to enhance the context action of the Geany IDE. Being neither a member of the Geany team nor one of their contributors or fans, I couldn't insist on this enhancement with a GitHub PR (though I had tried, without success). Forking or cloning Geany from GitHub just to apply my home-made corrections would be overkill and a waste of time, considering the future releases of Geany. Repeat the same manipulations at every new Geany release? I'm too lazy, so this scenario isn't for me.

But now, with the mulster at hand, all I need is:

  1. clone the current Geany;
  2. 'mulster' some of its files to get the desired facility;
  3. 'make install' Geany;
  4. repeat (1) through (3) when new Geany versions are released, all those actions being easily automated with shell commands prepared beforehand (as well as 'fileini').

Nearly the same thing happened with the TKE editor. I have a clone of it and work on it, but being only a contributor (the author is Trevor Williams) I cannot implement some features that are good for me but not acceptable to the author. So I need these facilities implemented 'on the fly', after pulling TKE from its repository. Then, once my changes are made, the mulstered TKE code has to be reverted before pushing my changes to SourceForge.

These pull/push transactions and the accompanying mulsterings are done with one click in a TKE plugin. I only need to check a short log.

There is a whole class of mulster use cases, namely the fine tuning of generated documents. For example, trimmer.html and mulster.html were created initially by the Ruff! documentation generator and processed by mulster afterwards.

The mulster also happens to be useful for freeWrap, which (ideally) needs a single Tcl script to make an executable program. So we take the main Tcl script and replace its "source" commands with the contents of the Tcl files, which in turn have to be processed for their own "source" commands. The difficulty here comes from the peculiar ways of sourcing used in most cases. The mulster resolves this problem nicely.

The mulster.zip archive contains fileini files (mulster-geany, mulster-tke, mulster-ruff, mulster-freewrap) to make the appropriate changes for Geany, TKE, Ruff! and freeWrap.


Source

The source of the mulster utility is described in mulster.html, generated by Ruff! and mulster.


Download

The source of the mulster utility is available here:

Note that mulster is still subject to updates.


See also