mr: whitespace issues with mrconfig reading/writing

Adam Spiers vcs-home at adamspiers.org
Wed Sep 19 15:05:49 CEST 2012


Currently mr trims a single whitespace character from the front of
each continuation line in loadconfig():

        elsif (/^(\w+)\s*=\s*(.*)/) {
                my $parameter=$1;
                my $value=$2;

                # continued value
                while (@lines && $lines[0]=~/^\s(.+)/) {
                        $value.="\n$1";
                        chomp $value;
                        $nextline->();
                }

This is OK for tab lovers who can just use a single hard tab to indent
continuation lines, but not for anyone else.  It seems that there is
an implicit expectation that hard tabs will be used in mrconfig files,
since modifyconfig() rewrites an entire mrconfig file at a time using
the following formatter:

        my $formatfield=sub {
                my $field=shift;
                my @value=split(/\n/, shift);

                return "$field = ".shift(@value)."\n".
                        join("", map { "\t$_\n" } @value);
        };

rather than only modifying the parameters for which a change was
requested.  This will cause superfluous whitespace changes unless tabs
were initially used, and can also introduce trailing whitespace on the
"$field = " line which will be flagged by some editors and utilities
(e.g. git).

If mr is intended to be this opinionated about whitespace, I think it
would be good to document what the expectations are; although it would
be nicer if it wasn't quite as opinionated.

Another issue is that the only way to obtain a blank line in the
middle of your continuation lines is to create a line with a single
hard tab on it.  For example:

        |fixups=
        |<TAB>cat <<EOF >myfile
        |<TAB>first line
        |<TAB>
        |<TAB>third line
        |<TAB>EOF
        |<TAB>
        |<TAB># do some more fixups

where the left margin of the above mrconfig is indicated by the '|'
wall.  Again, in this example, git would complain about trailing
whitespace on the lines after 'first line' and 'EOF', but those two
tabs can't be removed without causing mr to fail to parse them
correctly.

This parsing code is duplicated in modifyconfig() with some tiny
differences:

        elsif (/^(\w+)\s*=\s(.*)/) {
                my $parameter=$1;
                my $value=$2;

                # continued value
                while (@lines && $lines[0]=~/^\s(.+)/) {
                        shift(@lines);
                        $value.="\n$1";
                        chomp $value;
                }

I have coded a solution to the whitespace problem which takes the
whitespace prefix from the first continuation line and removes it from
any subsequent continuation lines.  Ideally modifyconfig() would not
rewrite lines it doesn't intend to modify, which would avoid the need
for it to do any parsing.  But as it currently stands, it contains
duplicated code, so I refactored the line reading logic so that it
could be reused.  Unfortunately the existing use of closures strongly
coupled code with lexically scoped state, which meant that the only
clean way of refactoring this was to extract a new LineReader class.
I don't expect that this solution will be accepted but here it is
anyway for reference:

        https://github.com/aspiers/kitenet-mr/commits/LineReader

Thanks,
Adam


More information about the vcs-home mailing list