Friday 26 July 2013

The first corner: "-n" for echo.

So it seems you weren't totally put off by my first post and have decided to see what other bugs I can accidentally put in to a trivial program! Onwards and upwards...

First things first: the two versions in my previous post were actually not equivalent, as Andrei kindly pointed out to me. While the first, imperative, version prints a space after the last argument, the second version does not! A rookie error from me... According to the echo spec, the arguments should be separated by spaces, but there is no mention of a trailing space. Hence, we need a fixed version of the imperative code:

import std.stdio : write, writeln;
void main(string[] args)
{
    if(args.length > 1)
    {
        foreach(arg; args[1 .. $-1])
        {
            write(arg, ' ');
        }
        write(args[$-1]);
    }
    writeln();
}

There we go. We check whether there are any arguments to print with if(args.length > 1); if there are, we print all the arguments except the last one as before, then print the last one on its own. Note that we could have achieved the same thing with an if statement inside the loop, checking whether we are on the last element and omitting the space. I prefer the explicit separation; sometimes (normally in more complicated cases) your optimiser will too. We finish off with a newline, whether we had any arguments or not. Bug fixed!
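For comparison, here is a rough sketch of the alternative with the check inside the loop. The helper name joinArgs is hypothetical (it is not from the original post); it builds a string instead of writing directly, purely so the result is easy to check.

```d
import std.stdio : writeln;

// Hypothetical helper: joins the arguments with single spaces,
// without a trailing space after the last one.
string joinArgs(string[] args)
{
    string result;
    foreach(i, arg; args)
    {
        result ~= arg;
        // Only add a separator if this is not the last element.
        if(i + 1 < args.length)
        {
            result ~= ' ';
        }
    }
    return result;
}

void main(string[] args)
{
    // args[0] is the program name, so skip it.
    writeln(joinArgs(args[1 .. $]));
}
```

The branch is evaluated on every iteration here, which is the cost the explicit separation in the version above avoids.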


Now things get a bit more interesting. As with many supposedly "simple" command line tools, the devil is in the options. Take a look here to peruse a normal list of options for echo. Let's make a start with the "-n" option. If you pass the argument "-n" before any non-option arguments (anything other than "-n" for us at the moment), echo will not print a newline at the end of its output.

In order to achieve this, we're going to introduce a preliminary loop to detect the option and a boolean variable writeNewline to record the information.

import std.stdio : write, writeln;
void main(string[] args)
{
    args = args[1 .. $];
    bool writeNewline = true;
    
    size_t i = 0;
    foreach(arg; args)
    {
        if(arg == "-n")
        {
            writeNewline = false;
            ++i;
        }
        else
        {
            break;
        }
    }
    args = args[i .. $];
    
    if(args.length > 0)
    {
        write(args[0]);
        foreach(arg; args[1 .. $])
        {
            write(' ', arg);
        }
    }

    if(writeNewline)
    {
        writeln();
    }
}

There's a good few more lines there, but the structure is quite simple. We're making use of slicing in a new way, by assigning to args a slice of its former self: with args = args[1 .. $]; we're slicing off the first element (the program name) and forgetting about it, because it is of no use to us. We declare the boolean variable writeNewline with bool writeNewline = true; initialising it to true as that's the default behaviour of echo.
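To make the slicing step a little more concrete, here is a small sketch using a made-up argument list (the values are illustrative only). It checks that the slice is a view onto the same elements rather than a copy, and that slicing everything away leaves a valid empty slice.

```d
import std.stdio : writeln;

void main()
{
    // Illustrative stand-in for what main might receive.
    string[] args = ["echo", "-n", "hello", "world"];

    // Drop the first element (the program name).
    string[] rest = args[1 .. $];
    writeln(rest);

    assert(rest.length == 3);
    assert(rest[0] == "-n");

    // The slice refers to the same memory as the original array.
    assert(&rest[0] is &args[1]);

    // Slicing from the end to the end gives an empty slice, not an error.
    string[] none = args[$ .. $];
    assert(none.length == 0);
}
```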

Then we loop over the args until we find our first non-option, recording what we find. If arg == "-n"* is true, then we've found the "-n" option, so we set writeNewline = false; and carry on. If we don't find an option, we break; out of the loop. We keep track of how many options we have seen with size_t i = 0;**, incrementing it with ++i; each time. This means we can just chop the option arguments off the beginning of args with args = args[i .. $];.

From there, it's plain sailing. We print any remaining arguments - taking care to get the spaces in the right place - and then print a newline if necessary.
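One way to check the behaviour described above without going through a shell is to pull the option loop out into a function. This is only a sketch: the name skipOptions and the use of an out parameter are my assumptions, not part of the program above.

```d
// Hypothetical helper: returns args with any leading "-n" options removed
// and reports through writeNewline whether a newline should be printed.
string[] skipOptions(string[] args, out bool writeNewline)
{
    writeNewline = true; // echo's default behaviour
    size_t i = 0;
    while(i < args.length && args[i] == "-n")
    {
        writeNewline = false;
        ++i;
    }
    return args[i .. $];
}

void main()
{
    bool newline;

    // A leading "-n" is consumed and suppresses the newline.
    assert(skipOptions(["-n", "hello"], newline) == ["hello"]);
    assert(!newline);

    // A "-n" after the first non-option is just another argument to print.
    assert(skipOptions(["hello", "-n"], newline) == ["hello", "-n"]);
    assert(newline);

    // Only options: nothing left to print and no newline.
    assert(skipOptions(["-n", "-n"], newline).length == 0);
    assert(!newline);
}
```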


Just like last time, here's a more functional flavoured fancy version.

import std.algorithm : find, joiner;
import std.functional : not;
import std.stdio : write;

enum n = 0b1;

ubyte flags;

void main(string[] args)
{
    args[1 .. $]
        .find!(not!option)()
        .joiner(" ")
        .write(flags & n ? "" : "\n");
}

bool option(string arg)
{
    switch(arg)
    {
        case "-n":
            flags |= n;
            return true;
        default:
            return false;
    }
}

The key to understanding this one is in the use of find. In this case, find runs option (well, not!option actually, which is just the logical negation of option) on each argument, checking the return value. As soon as not!option returns true for the first time, it passes that element and everything after it through to joiner***. The only other unusual (to some) feature is the ubyte flags; and its use with the bitwise |= and & operators; these are the same as in C, so there's plenty of info out there. Although this approach might look complicated, you'll see later that it scales very nicely.
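To see what find hands on to the rest of the chain, here is a small sketch that reuses option, n and flags from the program above on a made-up argument list and checks the results with assert. The input values are illustrative only.

```d
import std.algorithm : find;
import std.functional : not;

enum n = 0b1; // bit 0 set means "-n" was seen

ubyte flags;

bool option(string arg)
{
    switch(arg)
    {
        case "-n":
            flags |= n;
            return true;
        default:
            return false;
    }
}

void main()
{
    // No options seen yet, so the bit is clear.
    assert((flags & n) == 0);

    // find skips the leading options and returns the remainder,
    // starting at the first element for which not!option is true.
    auto rest = ["-n", "-n", "a", "-n", "b"].find!(not!option)();
    assert(rest == ["a", "-n", "b"]);

    // As a side effect, option recorded "-n" in flags.
    assert(flags & n);

    // With nothing but options, find returns an empty range.
    assert(["-n"].find!(not!option)().length == 0);
}
```

Note that find stops calling option once it reaches "a", so the later "-n" is left in place as an ordinary argument.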


P.S. Apologies for the long wait for this post, thesis writing is taking up all my time!



*D has built-in string comparison, so we can write things like someString == "blahblah"; very neat!

** size_t is an unsigned integer type that is capable of spanning the address space. It's the same as size_t in C, so there's plenty of info out there to explain more.

*** This tells you what it does, but it's not a good explanation of how. See Ali Çehreli's chapter on ranges and the entry for Range find(alias pred, Range)(Range haystack) in the docs for more info.
