Friday 26 July 2013

The first corner: "-n" for echo.

So it seems you weren't totally put off by my first post and have decided to see what other bugs I can accidentally put into a trivial program! Onwards and upwards...

First things first: the two versions in my previous post were actually not equivalent, as Andrei kindly pointed out to me. While the first, imperative, version prints a space after the last argument, the second version does not! A rookie error from me... According to the echo spec, the arguments should be separated by spaces, but there is no mention of a trailing space. Hence, we need a fixed version of the imperative code:

import std.stdio : write, writeln;
void main(string[] args)
{
    if(args.length > 1)
    {
        foreach(arg; args[1 .. $-1])
        {
            write(arg, ' ');
        }
        write(args[$-1]);
    }
    writeln();
}

There we go. We check whether there are any arguments to print with if(args.length > 1). If there are, we print all the arguments except the last one as before, each followed by a space, and then print the last one on its own. Note that we could have achieved the same thing with an if statement in the loop, checking whether we are on the last element and omitting the space. I prefer the explicit separation; sometimes (normally in more complicated cases) your optimiser will too. We finish off with a newline, whether we had any arguments or not. Bug fixed!
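For comparison, here is a rough sketch of the alternative mentioned above, with the check inside the loop. The helper name echoLine is an assumption made for illustration and is not part of the original program; it builds the output as a string instead of writing it directly, purely so that the spacing is easy to check.

```d
import std.stdio : write;

// Hypothetical helper (not from the original post): joins the parameters
// with single spaces and appends the final newline.
string echoLine(string[] params)
{
    string line;
    foreach(i, param; params)
    {
        line ~= param;
        // add the separator unless this is the last element
        if(i + 1 < params.length)
        {
            line ~= ' ';
        }
    }
    return line ~ '\n';
}

void main(string[] args)
{
    // skip the executable name, exactly as in the version above
    write(echoLine(args[1 .. $]));
}
```

The branch is evaluated on every iteration here, which is the cost that the explicit separation avoids.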


Now things get a bit more interesting. As with many supposedly "simple" command line tools, the devil is in the options. Take a look here to peruse a normal list of options for echo. Let's make a start with the "-n" option. If you pass the argument "-n" before any non-option arguments (anything other than "-n" for us at the moment), echo will not print a newline at the end of its output.

In order to achieve this, we're going to introduce a preliminary loop to detect the option and a boolean variable writeNewline to record the information.

import std.stdio : write, writeln;
void main(string[] args)
{
    args = args[1 .. $];
    bool writeNewline = true;
    
    size_t i = 0;
    foreach(arg; args)
    {
        if(arg == "-n")
        {
            writeNewline = false;
            ++i;
        }
        else
        {
            break;
        }
    }
    args = args[i .. $];
    
    if(args.length > 0)
    {
        write(args[0]);
        foreach(arg; args[1 .. $])
        {
            write(' ', arg);
        }
    }

    if(writeNewline)
    {
        writeln();
    }
}

There's a good few more lines there, but the structure is quite simple. We're making use of slicing in a new way, by assigning to args a slice of its former self: with args = args[1 .. $]; we're slicing off the first element and forgetting about it, because it is of no use to us. We declare the boolean variable writeNewline with bool writeNewline = true; initialising it to true as that's the default behaviour of echo.

Then we loop over the args until we find our first non-option, recording what we find. If arg == "-n"* holds, then we've found the "-n" option, so we set writeNewline = false; and carry on. If we don't find an option, we break; out of the loop. We're keeping track of how many options we have seen with size_t i = 0;**, incrementing it with ++i; for each one. This means we can just chop the option arguments off the beginning of args with args = args[i .. $];.
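To make the edge cases concrete, here is a small sketch that wraps the same scan in a function and checks it against a few inputs. The name skipOptions and its out parameter are assumptions made for illustration; the program above does the same work inline in main.

```d
// Hypothetical helper (not in the original post): performs the same scan as
// the loop above and returns whatever is left once the leading "-n" options
// have been chopped off.
string[] skipOptions(string[] params, out bool writeNewline)
{
    writeNewline = true;
    size_t i = 0;
    foreach(param; params)
    {
        if(param != "-n")
        {
            break;
        }
        writeNewline = false;
        ++i;
    }
    // i may equal params.length here; the slice is then simply empty
    return params[i .. $];
}

void main()
{
    bool writeNewline;

    // two options, then ordinary arguments; the later "-n" is not an option
    assert(skipOptions(["-n", "-n", "hello", "-n"], writeNewline) == ["hello", "-n"]);
    assert(!writeNewline);

    // nothing but options: nothing left to print
    assert(skipOptions(["-n"], writeNewline).length == 0);
    assert(!writeNewline);

    // no options at all: the default behaviour is kept
    assert(skipOptions(["hello"], writeNewline) == ["hello"]);
    assert(writeNewline);
}
```

Note that slicing from the very end of an array is legal and gives an empty slice, which is why the loop needs no special case for "only options".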

From there, it's plain sailing. We print any remaining arguments - taking care to get the spaces in the right place - and then print a newline if necessary.


Just like last time, here's a more functional flavoured fancy version.

import std.algorithm : find, joiner;
import std.functional : not;
import std.stdio : write;

enum n = 0b1;

ubyte flags;

void main(string[] args)
{
    args[1 .. $]
        .find!(not!option)()
        .joiner(" ")
        .write(flags & n ? "" : "\n");
}

bool option(string arg)
{
    switch(arg)
    {
        case "-n":
            flags |= n;
            return true;
        default:
            return false;
    }
}

The key to understanding this one is in the use of find. In this case, find runs option (well, not!option actually, which is just the logical negation of option) on each argument in turn, checking the return value. As soon as not!option returns true for the first time, find stops and passes that argument and everything after it through to joiner***. The only other unusual (to some) feature is the ubyte flags; and its use with the bitwise |= and & operators; these are the same as in C, so there's plenty of info out there. Although this approach might look complicated, you'll see later that it scales very nicely.
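If the behaviour of find is hard to picture, this stripped-down sketch may help. The names isOption and nonOptions are assumptions made for illustration, and the predicate here has no side effects, unlike option above, so that only the searching is on show.

```d
import std.algorithm : find;
import std.functional : not;

// Hypothetical side-effect-free predicate, for illustration only.
bool isOption(string arg)
{
    return arg == "-n";
}

// Hypothetical helper: everything from the first non-option onwards.
string[] nonOptions(string[] params)
{
    return params.find!(not!isOption)();
}

void main()
{
    // find stops at "hello" and returns it together with everything after it,
    // including the later "-n"
    assert(nonOptions(["-n", "-n", "hello", "-n", "world"]) == ["hello", "-n", "world"]);

    // if every argument is an option, nothing is left
    assert(nonOptions(["-n"]).length == 0);
}
```

For an array, what find hands back is just a slice of the original, so nothing is copied along the way.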


P.S. Apologies for the long wait for this post, thesis writing is taking up all my time!



*D has built-in string comparison, so we can write things like someString == "blahblah"; very neat!

** size_t is an unsigned integer type that is capable of spanning the address space. It's the same as size_t in C, so there's plenty of info out there to explain more.

*** This tells you what it does, but it's not a good explanation of how. See Ali Çehreli's chapter on ranges and the entry for Range find(alias pred, Range)(Range haystack) in the docs for more info.

Sunday 7 July 2013

Away from the line: echo in D

edit: spot the bug :) Thanks to Andrei Alexandrescu for pointing out my stupidity.

So... This is going to be a blog about the D programming language. I've been involved in the D community for a little over a year now; with their friendly and patient help I've managed to become quite familiar with the language. I'm not going to attempt to teach you the language in any rigorous manner; for that I would recommend either Ali Çehreli's online book Programming in D (starts from the very basics of programming. Free) or Andrei's tour de force The D Programming Language (for the more experienced reader. Totally worth the money). Instead I'm going to dive into building some simple programs in D, introducing syntax and features in an entirely ad-hoc manner, the meaning of which will hopefully become obvious from context if not from my explanations. Those coming from C or a related language will likely find the whole experience familiar.

So, without further ado, let's write a "real" program!

import std.stdio : write, writeln;
void main(string[] args)
{
    foreach(arg; args[1..$])
    {
        write(arg, ' ');
    }
    writeln();
}

Remind you of anything? Of course, it's the famous utility echo, albeit a very simplified version. We'll build on it later to get a fully-featured implementation. Firstly, let's take a line-by-line look at what we've got so far:

First, we import write and writeln from std.stdio. We could just write import std.stdio; and import the whole module, but it's nice to keep things a little more hygienic.

import std.stdio : write, writeln;

Then comes main, the entry point to the program. Seeing as we're writing echo, we're going to need the arguments to the program, so we're including the optional string[] argument to main to capture them.

void main(string[] args)

We want to print out all the arguments, separated by spaces, barring the first argument, which is always the executable name. To iterate over the arguments we are using a foreach loop, accessing each argument in turn through the variable arg. In order to skip the first argument, we are actually iterating over a slice of args, from index 1 to the end of the array (whose length is automatically provided by $). This isn't affecting args itself, but simply providing us with a narrowed window on to it. For more on slices, see here. We then - for each arg - write the argument followed by a space.

    foreach(arg; args[1..$])
    {
        write(arg, ' ');
    }
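To see that a slice really is just a window, here is a minimal sketch. The helper name withoutProgramName and the sample arguments are assumptions made for illustration.

```d
// Hypothetical helper: the same slice that the foreach loop iterates over.
string[] withoutProgramName(string[] args)
{
    return args[1 .. $];
}

void main()
{
    string[] args = ["./echo", "hello", "world"];
    auto rest = withoutProgramName(args);

    assert(rest == ["hello", "world"]);
    // the original array is untouched...
    assert(args.length == 3);
    // ...and the slice points into the same memory, one element along
    assert(rest.ptr == args.ptr + 1);
}
```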

Finally, we finish things off with a new line, courtesy of writeln.

    writeln();

Hooray! We have a fully working D program that actually does something (vaguely) useful. For those of you who found that a little too obvious, here's an equivalent D program with a lazy functional flavour:

import std.stdio : writeln;
import std.algorithm : joiner;
void main(string[] args)
{
    args[1 .. $].joiner(" ").writeln();
}

If you squint right and don't think too hard, you should be able to get what's happening there (Hint: foo(x) can be rewritten as x.foo()). Next time, we'll look at how we can implement some of the command line options to echo, moving towards a fully compliant implementation.
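For anyone who wants the hint spelled out, here is a small sketch showing the chained call next to the ordinary nested call it stands for. The helper names chained and nested are assumptions made for illustration; equal is only there to compare the lazy result with the expected text.

```d
import std.algorithm : equal, joiner;

// Hypothetical helpers: the same call written in both styles.
auto chained(string[] words)
{
    return words.joiner(" ");   // x.foo(y) ...
}

auto nested(string[] words)
{
    return joiner(words, " ");  // ... is just another way to write foo(x, y)
}

void main()
{
    auto words = ["hello", "world"];

    // joiner is lazy: it yields the characters one at a time rather than
    // building a new string, so we compare element by element
    assert(equal(chained(words), "hello world"));
    assert(equal(nested(words), "hello world"));
}
```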