Monday, September 10, 2012

C++11: Simple Compile-time Calculator With Constexpr

Hi everyone! In this post we'll run through a classical computer science exercise: Building an arithmetic evaluator. This one will only manage +, -, * and /, no parenthesis, infix notation, no space (simple to add, but boring). That may look too simple… Let's add a constraint: computations must be done at compile time.

For the little story, I wanted to code it after one of my teacher told me a story about a project for students that consists in building an arithmetic evaluator (bistromathique) able to deal with very huge number (a lot bigger than 232). The computation to do was stored in a file given to the student before 'make' was run to build its project. A student simply created a rule in his Makefile that makes another program evaluates the input file and redirected the result into the expected binary that only printed it out. Nice cheat right? Unfortunately for him, he was caught, that's why we know the story, but the idea is exploitable. Why not making computation at compile time? A real problem is how to parse this expression.

In this post, I'll first show you how to think the problem to be able to represent it simply. Then, I'll present the available tools to make "beautiful" compile-time expressions with C++11, and finally I'll show and explain the whole code. Let's go!

1 How to evaluate an arithmetic expression?

1.1 Language Theory

There is a lot of different way to evaluate an arithmetic expression. Some people will make it with the Shunting Yard algorithm, or with some tools to generate parser (Bison for example), etc. I wanted a simple way to do it. Bison will not be helpful here, Shunting Yard is too heavy. Should we create a constant Abstract Syntax Tree? Or a constant Stack? It doesn't seem interesting for us to translate it, in term of time (at a development level). The proposed solution is based on Language Theory. The arithmetic language follows a simple grammar. First, you have to know how to read a grammar. A grammar is composed of several rules of this form:

<symbol> ::= expression

<symbol> is called nonterminal. expression is a sequence of symbols. Symbols are a nonterminal or a terminal (a terminal doesn't appear on the left side of a rule. In this post, they appear between double quotes). There could be several possibilities (one expression or another, it is represented as ~expression1 | expression2). More information on Wikipedia. The simple arithmetic grammar used is:

<expr> ::= <T>
<T> ::= <U> "+" <T> | <U> "-" <T> | <U>
<U> ::= <number> "*" <U> | <number> "/" <U> | <number>
<number> ::= <number> <digit> | <digit>
<digit> ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

Well! It seems a little cryptic when you don't know what it is, but you'll see that it's not :). Your expression must follow a path in these rules to be grammatically valid. Otherwise there is an error. Let's see what say these rules: a valid expression follows the rules <T>. <T> means that the expression is a <U> plus <T>, or a <U> minus <T>, or just a <U>. <U> is a <number> times <U>, or a <number> slash <U>, or just a <number>. A <number> is a nonempty list of <digit>. And here it is! You can write an addition, a subtraction, a division and a multiplication.

The rule are selected in function of your expression, it takes (roughly) what matches the more. Here is how it passes through the rules with several examples:

  • 2: take the rule <T>, <U>, <number>, <digit>. Valid.
  • 2+3: take the rule <T>, <U> "+" <T>:
    • expr is 2. <U> becomes <number>, <digit>;
    • expr is 3. <T> becomes <U>, <number>, <digit>. Sill valid.
  • 2+*3: take the rule <T>, <U> "+" <T>:
    • expr is 2. <U> becomes <number>, <digit>;
    • expr is *3. <T> becomes <U>, <number> "*" <digit>:
      • expr is empty. <number> is an error.

    So it is invalid.

I hope these little examples help you to understand how it works if you didn't know. You can ask questions in comments if you want ;-). I'll be pleased to help you. If you spot a mistake, don't hesitate to signal it!

1.2 How to implement it?

You should be convinced (I hope) that these rules would help us to implement simply an arithmetic evaluator. Moreover, since the priority between operators is managed by the recursion (yes, multiplication will occur before addition for example), you can do all the computations on-the-fly. No need to build a Stack, or an Abstract Syntax Tree. I really love this method because it is simple to implement. For people that think that the lack of parenthesis causes its method to be so easy, you're wrong. I let you the job to find where to integrate them (hint: the latter the rule is called, the higher its precedence is. Example, the multiplication is called later than the addition, and so it has a higher precedence).

We just have to represent these rules with functions. It's not really so easy, because if you implement it as this:

evalT(char* expr) { return evalU(head(expr)) - evalT(tail(expr)); }

In this snippet head takes the expression and returns the part before the '-' and tail returns the part after. You'll have this behavior: eval(2-4+2) will return -4. Because it will evaluate 4+2 before applying the '-'. So you must compute one <U> at a time. This way, 2-4+2 will be processed this way: accu = 2 - 4; accu = accu + 2 and accu will be equal to 0 as expected.

So it is really easy to implement, one function for each rule, and you have to loop over each new nonterminal (for <T> it is <U>, for <U> it is <number>…) and apply the right operation. And it will work!

But you may wonder (and it's great!): How can I do an error management? Simply, you can make lexical error (invalid character) when you process the value of your number in the last rule. And you can make a parse error if your last rule received an empty content. That means there is two operators side by side. Finally, all the error management is done in the function that evaluates your number.

Finally, to have a fully featured arithmetic evaluator, you have to find how to put the parenthesis in these recursive functions (hint: error management is also very simple in this case). Once again, you can ask question if you want :)

For having implemented it some time ago, I can say that it may take 3 functions of less than 30 lines each. But now, let's add some fun! Let's see how we can transform this into a compile-time version.

2 Compile-time arithmetic evaluator

2.1 Compile-time tools with C++11

"Constexpr is a 4th sublanguage added to C++" - Andrei Alexandrescu.

In C++03, there was only templates to make computation at compile-time (explicitly I mean. Sometimes compiler can do this kind of optimization alone). But it was really painful. For example, to make something as simple as factorial you had to declare a template class parametrized by an integer, and specialized it when the integer is 0. Here is how it looks:

template <unsigned n>
struct fact
{
  static const int value = n * fact<n - 1>::value;
};

template <>
struct fact<0>
{
  static const int value = 1;
};

The first class fact declares the general case where value is equal to n times n - 1 (it is just written with template). We then had to specialize this class when the integer is 0. That way, when you call fact<5>::value, it will answer 120 as expected. But this approach has some drawbacks. The syntax may look ugly (sweet euphemism…), it is error prone, hard to debug, and so on. I think no one will try to parse and evaluate complex structures with template. If this person exists, I have a deep respect for its courage…

So C++11 comes with a new support for compile time programming: constexpr. It allows to make computation during the compilation ("Because compilation was already too fast" - Herb Sutter). I have already discussed a little about constexpr in a previous post. Let's go a little deeper, according to the standard, there are several constraints that a function must satisfy to be considered as constexpr:

  • Call only constexpr function,
  • Only one return statement,
  • static_assert are fine,
  • All its declaration must be declared constexpr.

I don't write all the conditions here (the standard is long… More on cppreference) but here are the main conditions you should be aware of. First question: how to make real functions with only one return? In fact you have to deal a lot with recursion, and with ternary operator (?:).

Here is the implementation of factorial using constexpr:

constexpr
int fact(int n)
{
  return n == 0 ? 1 : n * fact(n - 1);
}

Personally, I found this version simpler to write and to read. For me this is a great thing to gain in readability. There is a second huge advantage to deal with constexpr instead of template: constexpr can be used in the compile-time and run-time world. It avoids having two (very different) codes to solve the same problem. Some (important) things to note:

  • static_assert can't be used to check the value of your argument since it can be called at run time. If you want to do that, you must pass the argument as template parameter. For a little discussion about that, see this question on stackoverflow.
  • You can't declare an argument of a function constexpr.
  • Prefix and postfix incrementation/decrementation on a variable are forbidden.
  • Constexpr function are implicitly inline.
  • Constexpr non-static method is const (if it is not a constructor of course!).
  • There is an implementation defined (I think) limit for recursion, to change its value, you can use the -fconstexpr-depth= option. I'm not sure if this flag is standard, at least it works with g++-4.6.1, 4.7.1, clang++-3.0 and clang++-3.1.

2.2 How to do error management with constexpr?

This is a tougher question than it seems. I first tried to use static_assert, because I wanted to have an error clear at compile time. But the first point of the list of notes above makes me unable to succeed. So I used exception. Some part are not evaluated. For example, in b ? 1 : 0, the 0 part is evaluated only if b is false. So you can put a throw expression on the false part in a constexpr function. This will cause the function to not respect the definition of a constexpr function and it will fail the compilation. In my opinion, compile-time programming should have compile-time error.

I chose to implement it that way:

constexpr
bool my_static_assert(bool b, const char*)
{
  return b ? true : throw std::exception();
}

If I want to use it only at compile-time (the code above represents that case), I implement it that way. I want to have an explicit error message, that's why there is a second argument. As you can see, I don't use it (reminder: if you don't give a name to an argument, the compiler understand that it will not be used). So, you may wonder why I put that here? To get an helpful error message: Here is an extract of the g++ error message I get if I do a parse error in my program:

$ g++ --std=c++11 -DEXPR="\"3*+9\"" static_eval_expr.cc
static_eval_expr.cc: In function 'int main()':
static_eval_expr.cc:102:32: \
 in constexpr expansion of 'parseT(((const char*)"3*+9"), 0, 1)'
static_eval_expr.cc:96:43:  \
 in constexpr expansion of 'parseT_private(s, s, 0, value, ((int)op))'
........
static_eval_expr.cc:48:60:   in constexpr expansion of \
'my_static_assert((len != 0), \
((const char*)"Parse error: No side by side operators!"))'
static_eval_expr.cc:28:42: erreur: expression \
 '<throw-expression>' is not a constant-expression

(The dot line represents some lines I dropped out to simplify the example, lines finishing by a '\' continue in the line below). I hope it is clear enough. I can't get the exact location, but it is better than just a non understandable compile error right?

Note that I chose to raise std::exception because the error message doesn't give this information clearly, and the main information is contained in the constant string. In all other situations, std::exception is, in my opinion, to avoid. Using clear type is helping the compiler and the programmer. So don't use that in the real world! :)

Now, let's see how works my compile-time version of our arithmetic evaluator!

2.3 How it works

The classical grammar-based algorithm uses recursion, loops, and makes a pointer run through the string. We only have recursion. There is no mutable variable. So we have to figure out how to transform our run-time version (I didn't present the code of the run-time version here because I don't want student of my school to use it for their own project) into a full constant version.

To loop through the string we use recursion. I first implement a compile-time atoi (ascii-to-int). I assume that we use ASCII standard (I want that because in that case: the value of the character '4' subtracted to the character '0' is equal to 4, and the same for every character representing a digit). So let's see how we can do that with some checking. I'll use my_static_assert declared above.

constexpr
bool is_number(const char c)
{
  return c >= '0' && c <= '9';
}

constexpr
int myatoi(const char* s, int len, int value = 0)
{
  return *s && len
    ? (my_static_assert(is_number(*s), "Invalid character!"),
       myatoi(s + 1, len - 1,
             (*s - '0') + value * 10))
    : value;
}

is_number is trivial, it returns true if the character given in argument is a digit and false otherwise. Let's see myatoi now. It takes as first argument the string (we use char* because we can't use std::string since its not constexpr. So we go down one level and we directly work on the pointer). The second argument is the number of characters we must read. Because in our use case, we just send it part of the string to read, and this part is defined by a beginning and a size. The third argument is the current value accumulated. I indent all ternary expression to make it as clear as possible.

If we translate the code above in English, we have: if the current character is not null and we have still some characters to read: check that the current character is a number, if not raise an error. If it is valid, call yourself on the next character, reduce the number of character to read by one, and update the value. If we have finished to read our string, return the accumulated value.

Let's see how we update the value: We read our string from left to right, so we add the first character to value. Then we just have to multiply by 10 the accumulated value each time we move on the right! :)

Now, we have to apply our recursive calls to manage priority between the operators. Remember that we have to find a way to advance your "pointer" to didn't read several time the same part of the string we're working on.

Since all is constant, the way I found is to create a substring defined by the beginning of the substring and its length. Then we call recursively ourselves, with the updated value. So here is how it works: the function that manages the operation '+' and '-' is split into two: there is a function that makes the interface and initialize some variables, and the other one that does the job. So it ends up with two functions per rules. Here is (finally!) the whole code:

#include <iostream>

constexpr int parseT(const char* s, int value = 0, bool op = true);
constexpr int parseU(const char* s, int limit, int value = 1,
                     bool op = true);

constexpr
bool is_number(const char c)
{
  return c >= '0' && c <= '9';
}

constexpr
bool is_valid(const char c)
{
  return is_number(c) || c == '+' || c == '-' || c == '*' || c == '/';
}

constexpr
bool my_static_assert(bool b, const char*)
{
  return b ? true : throw std::exception();
}

constexpr
int myatoi_private(const char* s, int len, int value = 0)
{
  return *s && len
    ? (my_static_assert(is_number(*s), "Invalid character!"),
       myatoi_private(s + 1, len - 1,
                      (*s - '0') + value * 10))
    : value;
}

constexpr
int myatoi(const char* s, int len)
{
  return my_static_assert(len, "No side by side operators!"),
         myatoi_private(s, len);
}

constexpr
int parseU_private(const char* s, int limit, const char* begin, int len,
                   int value, bool mul)
{
  return *s && limit
    ? (*s == '*' || *s == '/'
       ? (mul
          ? parseU(s + 1, limit - 1, value * myatoi(begin, len),
                   *s == '*')
          : parseU(s + 1, limit - 1, value / myatoi(begin, len),
                   *s == '*'))
       : parseU_private(s + 1, limit - 1, begin, len + 1, value, mul))
    : (mul
       ? value * myatoi(begin, len)
       : value / myatoi(begin, len));
}

constexpr
int parseU(const char* s, int limit, int value, bool op)
{
  return parseU_private(s, limit, s, 0, value, op);
}

constexpr
int parseT_private(const char* s, const char* begin, int len,
                   int value, bool add)
{
  return *s
    ? (*s == '+' || *s == '-'
       ? (add
          ? parseT(s + 1, value + parseU(begin, len),
                   *s == '+')
          : parseT(s + 1, value - parseU(begin, len),
                   *s == '+'))
       : parseT_private(s + 1, begin, len + 1, value, add))
    : (add
       ? value + parseU(begin, len)
       : value - parseU(begin, len));
}

constexpr
int parseT(const char* s, int value, bool op)
{
  return parseT_private(s, s, 0, value, op);
}

int main()
{
  constexpr auto e = EXPR;
  constexpr auto res = parseT(e);
  std::cout << res << std::endl;
}

If you want to test the code, copy/paste this in your favorite editor, and compile it like this:

g++(4.6.1): g++ --std=c++0x -DEXPR="\"1+1\"" file.cc
g++(4.7.1 or 4.8): g++ --std=c++11 -DEXPR="\"1+1\"" file.cc
clang++(3.1): clang++ --std=c++11 -DEXPR="\"1+1\"" file.cc

It won't compile with a clang version lower than 3.1. If you have tried another compiler, feel free to report your results in comments :) You can replace 1+1 by any arithmetic expression composed by number, '+', '-', '*', '/'. If your expression is not valid, it won't compile. Note that spaces are forbidden. As said in the beginning, it is not a feature hard to add, but it adds dumb code when I want code to be crystal clear.

Now let's explain a little the code above, starting by the main. EXPR is the variable given on the command line when compiling (with the -D option). parseT takes a const char* and returns an integer. Variables can be declared constexpr, so we declare e and res constexpr. Then we just print the result. Now, let's run through the interesting part of the code.

parseT_private takes as second argument the beginning of the current sub expression we will give to the second level (parseU). So we have to create a parseT function that set the beginning of the expression. This is the same thing for parseU that calls myatoi. The main idea of the recursion is that I read my expression until I get one of the operator of my level (for parseT it is '+' or '-', for parseU it is '*' or '/'). Once it is done, it gives the substring between the beginning and just before the operator to be evaluated by the next level. And this updates the current value. We then run the recursion on the right part of the operator. We have to remember what was the operation before this subexpression to apply it when updating the current value. So at the beginning, value is 0, and the first operation is a '+' (this explains the op = true in parseT).

Let's see some steps of the program to understand how it thinks:

  • 2+2: parseT("2+2") calls parseT_private("2+2", "2+2", 0, 0, true). We read 2, and we arrive to a '+'. So we call parseU on "2", which returns 2 since there is no '*' nor '/' in this expression. We add 2 to the variable value, we remember that we have seen a '+', and we call parseT on the right part of the '+'. We arrive to end of string, so we just call parseU on "2", and we get two. The previous operator was a '+', so we add. 2+2 = 4, as expected!
  • 2*2+2: parseT_private("2*2+2", "2*2+2", 0, 0, true), we still stop at the '+', so parseU receives "2*2" and the variable limit is equal to 3 (to know where stop reading). In parseU_private we are looking for a '*' or a '/'. We get a '*', so we send the left part to myatoi, and we continue our recursion on the right part of the string. We remember that the previous operation was a '*'. We arrive on the limit of our expression (limit is now equal to 0), so we stop, and send 2 to myatoi and we multiply the current value to the return value of it, so we get 4. We are back to parseT_private, and the current value is 4. The rest of the recursion simply add 2 to it. And (finally) we get 6, as expected.

Now, let's develop a little one of the main function of the program, parseT_private. First of all, why this dumb name? It is a toy program, and I just wanted to differentiate it from parseT. So the first idea that came to my mind was just fine. In parseT_private, there are four different cases. The first one is:

  • We are on a '+' or a '-', let's apply the previous operation which was a '+' on the left part of the operator and run the recursion of the right part.
  • We are on a '+' or a '-', let's apply the previous operation which was a '-' on the left part of the operator and run the recursion of the right part.
  • We are not in an operator, we continue to read the string. We increase the size of the current substring.
  • We are arrived to the end. Let's do the last computation.

The code of parseT_private and parseU_private are very similar. They just don't play at the same level (remember the BNF grammar? These levels ;-) ). I found the code above quite elegant, since it solves a problem not so simple, moreover at compile-time, and it does it in less than 100 line of code. These functions just represent the algorithm I have described in the first part of the article with the constexpr constraints. Don't hesitate to ask question in comments if you want more information about a precise part of the code.

You may wonder "how to be sure that the result is computed at compile time?". Two solutions: you can add a static_assert in the main, to check if the result is the one you wanted, or you can use gdb. Let's see how to use our debugger:

$ g++ --std=c++0x -DEXPR="\"3+9-6+5-1+2-12+3*7+3\"" static_eval_expr.cc
$ ./a.out
24
$ gdb -q a.out
Reading symbols from a.out...(no debugging symbols found)...done.
(gdb) disassemble main
Dump of assembler code for function main:
   0x080485f4 <+0>:     push   %ebp
   0x080485f5 <+1>:     mov    %esp,%ebp
   0x080485f7 <+3>:     and    $0xfffffff0,%esp
   0x080485fa <+6>:     sub    $0x20,%esp
   0x080485fd <+9>:     movl   $0x8048772,0x18(%esp)
   0x08048605 <+17>:    movl   $0x18,0x1c(%esp)
   0x0804860d <+25>:    movl   $0x18,0x4(%esp)
   0x08048615 <+33>:    movl   $0x804a040,(%esp)
   0x0804861c <+40>:    call   0x80484c0 <_ZNSolsEi@plt>
   0x08048621 <+45>:    movl   $0x8048530,0x4(%esp)
   0x08048629 <+53>:    mov    %eax,(%esp)
   0x0804862c <+56>:    call   0x8048520 <_ZNSolsEPFRSoS_E@plt>
   0x08048631 <+61>:    mov    $0x0,%eax
   0x08048636 <+66>:    leave
   0x08048637 <+67>:    ret
End of assembler dump.
(gdb) q
$ echo $((0x18))
24

Disassemble shows the assembler code for a function. If we analyze it, we realize that there is no call to parseT. The two calls are due to std::cout and std::endl. If we look at 0x18 (<+17> or <+25>), it corresponds to 24, our result. So all has been computed a compile time, and there is no run time cost. Great isn't?

3 Conclusion

I hope this example of how to use constexpr for making compile-time programs. I don't say that you should make some programs like the one I show in this post, but it can help the compiler to optimize your programs by evaluating code at compile time instead of run time. It also solves certain problem like std::numeric_limits::max() which is a function representing a constant value. How to solve this problem? Putting a constexpr qualifier on this function does the job.

The main aim of this post is to present constexpr to people that didn't know it, to encourage people that have heard of it to take a closer look at this new feature, and finally (at least for me) to see how far I can go with constexpr.

I invite you to share your thoughts about constexpr or this post, your questions or even your own experimentation in the comments!

Monday, September 3, 2012

Customizing Zsh (Part 1): Hooks and RPrompt

This post follows my post on the zsh macros, and explain how to use the zsh hooks "preexec", and how to customize your (right) prompt to give information about something that changes (current branch of git, date, …).

1 Preexec hook

To enable the hooks, the user first has to load the add-zsh-hook function. To achieve that goal, paste the following line:

autoload -U add-zsh-hook

Once it is done, we are able to add and remove a function from a hook. For our case, we suppose we want to add the hook_function to the preexec hook. The following snippet shows how to do that.

hook_function()
{
  echo $1
  echo $2
  echo $3
}

add-zsh-hook preexec hook_function      # Add it to the preexec hook.
# add-zsh-hook -d preexec hook_function # Remove it for this hook.

Adding and removing a function from a hook is done the same way for every hooks.

The preexec hook is ran each time a command is read by the shell and is about to be executed.

Each function run by the preexec hook receives three arguments. The first one is the line as it was written. The second line is the line with alias expanded and truncated with a certain size limit. The third line is the full line with alias expanded. This thread shows an example. For the macros module, I decided to use the third expression because I want to be able to use my aliases in my macros. But that depends of the application you want to write.

2 Interactive prompt

You know that, there is plenty of ways to customize your prompts in Zsh. I'll just present one of them today, some post about the same topic might follow.

What I present today is how to use your RPROMPT to print some information about what you want, and is actualized every time you enter a new command. It is easy to do, here is the first step:

setopt prompt_subst

Maybe you have recognized the beginning of what you have to add to your configuration file to make the zsh macros module working? Well done! Otherwise, it doesn't matter. So what does this little line? According to the man (man zshoptions): "If set, parameter expansion, command substitution and arithmetic expansion are performed in prompts".

Let's see what happens if we don't set this option:

$ msg="Hello"
$ RPROMPT="($msg)"
$                                       (Hello)
$ msg="Goodbye"                         (Hello)
$                                       (Hello)
$ echo $RPROMPT                         (Hello)
(Hello)
$                                       (Hello)

Pretty annoying right? In fact, the shell expands $msg before it is received by RPROMPT, so what happens is simple, it prints the value of what he receives: "Hello" literally. So, let's see what happens if we re-execute the same sequence of commands with the prompt_subst option set?

$ setopt prompt_subst
$ msg="Hello"
$ RPROMPT="($msg)"
$                                       (Hello)
$ msg="Goodbye"                         (Hello)
$                                       (Hello)
$ echo $RPROMPT                         (Hello)
(Hello)
$                                       (Hello)

Here is the most common error (I think) that leads your prompt not to expand your variables. The RPROMPT command doesn't know there is a variable to expand, and you have to prevent your shell to expand it by single-quoting your assignation. This way:

$ RPROMPT='($msg)'
$ echo $RPROMPT                         (Hello)
($msg)
$ msg="Goodbye"                         (Hello)
$                                       (Goodbye)

There is two things to be careful with when you want to have your prompt expanding some content: the option prompt_subst must be set, and the content of the variable RPROMPT contains the thing you want to be expanded each time (Think about single quoting it!).

Now let's see what we can do with it! If you are a module writer, you can use a variable as flag (as I did), or give a function that allows to get information about something (as it's done by zsh to allow user to get vcs information).

For getting your branch in your RPROMPT, I recommend you to read the answer of ko-dos which is very complete. If you just paste the code, it will work. But you know why it uses single quotes, and why there must be the prompt_subst option set. For the zstyle part, I didn't try to understand it. One day, I'll try :)

Let's see how get the time in your right prompt. First, how to get the time only when calling the date command? I read this post to find the right format. It is just date +%T. Now let's apply what we have learn:

$ RPROMPT='$(date +%T)'
$                                       (23:42:00)

Now you are just limited by your needs and by your imagination :) If you make your own custom prompt, please share it in comments. I hope you like it!

Wednesday, August 29, 2012

ZSH Macros

Today, I'll present a module for zsh that I wrote few days ago. The aim of this module is to provide a way to create easily temporary shell scripts, and save their favorites. If you are familiar with Emacs, and if you think about its macros, you're right! I designed this with the macro concept in mind.

I had the idea when working with people who aren't familiar with shell scripts, and who don't want to try it for helping them. The original example was a work-flow with a TODO to update regularly. And yet commands to do were not so difficult:

$ git add TODO
$ git commit -m "Update the TODO."
$ git stash
$ git pull --rebase
$ git stash pop
$ git push

In reality, my coworker didn't plan to stash nor pull, but since this is my story, I can change it a little! :)

I thought that it is easy to write a script to make that works. I just have to copy these lines and paste them in a script. But, on one hand I find this boring, on the other hand someone who is not interested in scripts will never do that. So I had to find a transparent way for the user to have the same result without having the feeling that he plays with scripts. It's here that came the idea to mimic the behavior of the Emacs macros.

Notice that even if the Emacs macros works on text, and the title of this post might be confusing, I don't want to write a tool to enhance the Emacs macro in the Zsh command Line Editor (zle), but I want a way to create simple scripts in a easy and fast way.

1 Zsh macros!

I first tried to create a Perl script that can create scripts, but I realized that there is too many drawbacks (no history, no completion…). So I found an alternative way, fully integrated in zsh: hooks. For people who doesn't know what are hooks: It is a set of functions called at a specific time that allow the user to run their own functions. It is useful for letting the user personalizing a software. As an example, I wrote a git hook to check the log message (I talked about it in a previous post). For this module, I use the preexec hook for achieving my goal. In this post, I'll present my module, why it can be useful for day-to-day usage, and how to use it. In some next posts, I'll show some useful tricks I had to use to make it works.

1.1 Why using it?

Because it allows you to save your time. It is easy to install, easy to learn and easy to use. I realized that sometimes I repeat the same sequence of lines several time. It ends up by having these lines concatenated in one line with a && between them. Pretty ugly, right? But because it is just repeated less than ten times, I don't want to write a script for that because it is faster for me to just reuse my zsh history. But with this module, you just type your sequence of command once, and then you just have to hit macro_execute to get it repeated. Personally, I have aliased this command to e. It is the fastest and cleanest way I know to repeat your work properly.

1.2 How to install it

Glad to see you here! You'll see that it is a good choice :) The first step to install it is to get it from my github (directory zsh_macros). Once it is done, you just have to source the file, in your shell to try it, or in your configuration file if you're sure that it will fit to your needs.

Some things to check: if you have already bound either <ctrl-x><(> or <ctrl-x><)>, I don't bind anything (I don't want to mess up your own configuration!). These bindings are the same than under Emacs. Feel free to adapt these bindings to your own wishes!

You then have to add something in your configuration file:

setopt prompt_subst
RPROMPT='$(zmacros_info_wrapper)'

The first line allows the prompt to perform command substitution in prompts. The second one set your RPROMPT to the value of zmacros_info_wrapper that allows you to know the status of your macro. If you have already assigned something to your RPROMPT, you could simply add it to it.

Once this is done, everything should work fine. If this is not the case, you can either send me a bug report or send me a patch. Now, let's see how use this module.

1.3 How to use zsh macros?

In this part, I assume the bindings are the one originally provided. I think a screenshot may help to figure out how it looks like, I first run it in the shell, to show what you have to do, and what are the result. Between <> are represent the keyboard macros. Do not write it out :).

$ echo foo
foo
$ <ctrl-x><(> echo bar
bar
$ echo baz
baz
$ <ctrl-x><)> e
bar
baz

This screenshot shows how it appears for the user:

The flag on the right (<R$id>) appears right after you type <ctrl-x><(>, and disappear right after you type <ctrl-x><)>. Pretty easy right?

Note that if you don't like key-bindings (Are you a Vim user?), you can call macro-record and macro-end and you'll get the same effects.

Let's go a little deeper: you can have several macro. This module doesn't support nested macros in a same shell, but you can make as many macros as you want. Each macro is associated with an id. This is what is printed on the flag after the R in the prompt. You can run macro-execute with an optional argument that corresponds to the id of the macro you want to run. By default it's the last recorded. Notice that each script has its own file, and there is a master file that track each of them. To add and execute macros, we read and write on this file in /tmp. This way has its advantages and its drawbacks. We have concurrency problems, but since a real user can't make several things in the same time, that should not be a real problem.

The advantages are that a macro recorded in a shell can be used in another one, and you can register two macros at the same time, because the only access to the main file is made when you call macro-record. So recording two macros in two different shells is fine.

All your scripts live as long as your /tmp is not cleaned. If you want to keep a macro for a longer use, it is possible. You just have to call macro-name that will take the macro you asked for (if you give no id, the last one is considered), and copy it in the directory ZMACROS_DIRECTORY. You can set it at the top of the file macros.zsh. Maybe it is a good idea to add this directory to your path, it will allow you call these new functions simply.

This is the features available in this first version of the zsh macros. I planned to add some new ones, but if you have any request, comment, or anything, feel free to comment this post! I'd like to know what would be helpful and what you think about this module.

Wednesday, August 22, 2012

C++11: A generic Singleton

1 Context

Yesterday, I found the solution to a problem that I was unable to solve one year ago. The problem was simple: Creating a generic Singleton. If you don't know what a Singleton is, here is the definition given in the book "Design Pattern" from the Gamma et al.: "Ensure a class only has one instance, and provide a global point of access to it".

I wanted to have a generic singleton because I had to code several singletons in several projects and I hate making the same thing several times. I was unable to succeed in my tries because of the initialization problem. There was no known method to handle the initialization that works for each possible constructors.

A perfect way to change a class into a Singleton would be, in my opinion, something like just inheriting from a base class Singleton. That would be awesome! But I don't think anyone has ever succeed. The best discussion about a generic Singleton I have read is in the Andrei Alexandrescu's book (If you know a better stuff about it, please let me know!). It notably talks about the deletion problem, that is not approached in this article. I just focus on the construction that can be improved thanks to C++11.

The common version is to use the Curiously Recurring Template Pattern (CRTP). So let's start by introducing the CRTP. The concept is simple, we inherit from a template class that takes us as parameter. It allows to make static polymorphism or instance counter… Wikipedia may help for these examples.

I had to say that when I discovered this thing I was amazed. I am passionated by the concept of the template, by the language in the language. One year ago I tried to solve the Singleton with a CRTP (which is the solution used in the C++ Modern Design book), and my specifications was the original one, and I had to call a method "destroySingleton" at the end of my programs. So I have to find a generic constructor that works for all of my Singleton classes and the work is done.

2 First Try

At the beginning, when I had only two classes and two constructors, I though to play on the fact that the template functions not used are not instantiated, so they can be incorrect (no type checking, no name binding… It just must be valid syntactically). Here is an example:

#include <string>

template <class T>
class Singleton
{
public:
  static
  T* get_instance(int v)
  {
    if (!instance_)
      instance_ = new T(v);

    return instance_;
  }

  static
  T* get_instance(std::string s)
  {
    if (!instance_)
      instance_ = new T(s);

    return instance_;
  } // Sounds familiar...

  static
  void destroy_instance()
  {
    delete instance_;
    instance_ = nullptr;
  }

private:
  static T* instance_;
};

template <class T> T* Singleton<T>::instance_ = nullptr;

// The classes that will be singletons:

class Single: public Singleton<Single>
{
public:
  Single(std::string s): s_(s) {}

private:
  std::string s_;
};

class Map: public Singleton<Map>
{
public:
  Map(int scale): scale_(scale) {}

private:
  int scale_;
};

// How to create and destroy them:

int main()
{
  Single* s = Single::get_instance("Forever Alone");
  Map* m = Map::get_instance(42);

  // Use these singletons.

  Map::destroy_instance();
  Single::destroy_instance();
}

When we instantiate Singleton<Single> there is no constructor that takes an int (so the first get_instance is incorrect), and the program still compiles successfully. This is because the function is not called, so not instantiated, so there is no reason to wine for g++.

This version respects one half of the deal to have a singleton, but it is totally non-intrusive in the code of the classes transformed into singleton. To respect the second half ("only one instance"), we have to tweak a little. We have to make the constructors of the derived classes private, and to grant the friendship to the mother class. As a reminder, the friend keyword allows to give access to all of the protected/private methods to the friend class. This way, no user can create directly an object of the derived class, and we respect the specifications.

Now we respect the two part, but we are still unhappy with this. In fact, we have only two classes, and we had to create two get_instance, because we have two different constructors. Since the number of possible combination of arguments to give to the constructor of an object is infinite, our solution isn't suitable. And that's where I was stuck.

3 The Solution

And yesterday, I finally realized that the solution was just in one of my previous post about the C++11 mechanisms that allows to create emplace_back. The solution was the perfect forwarding! In fact, we just have to create a get_instance that forwards the arguments to the constructor. No repetition, fully generic. That's what I wanted to reach! Here is the final version of the Singleton class:

#include <iostream>

template <class T>
class Singleton
{
public:
  template <typename... Args>
  static
  T* get_instance(Args... args)
  {
    if (!instance_)
      {
        instance_ = new T(std::forward<Args>(args)...);
      }

    return instance_;
  }

  static
  void destroy_instance()
  {
    delete instance_;
    instance_ = nullptr;
  }

private:
  static T* instance_;
};

template <class T> T*  Singleton<T>::instance_ = nullptr;

class Map: public Singleton<Map>
{
  friend class Singleton<Map>;
private:
  Map(int size_x, int size_y): size_x_{size_x}, size_y_{size_y} {}

public:
  int size_x_;
  int size_y_;
};

int main()
{
  Map* m = Map::get_instance(4, 5);

  std::cout << m->size_y_ << std::endl; // Outputs 5.

  Map::destroy_instance();
}

As said above, we are not interested in the destruction problem. It is well-covered in the Alexandrescu's book, and I recommend you to read it if you want to see more on the subject. We create a destroy_instance method in the aim to do not leak our instance. The user must call it at the end of the program.

The real novelty of this, is the use of std::forward when creating the object. So we can give to it any class, and it will work. I take the example of a Map (for example for a game) which takes two arguments, but I hope it is clear for everyone that it works with any constructors.

Note that, I didn't write the copy and move constructors private, but they should be. I omit them only to gain some space in my post. For the same reason, I didn't write the class Single in the second example. But it is clear that it works.

Before concluding, I just want to show that the use of the metaprogramming here doesn't lead to a hideous error message by our compiler when not used correctly. Let's assume that we replace the call to get_instance with two arguments, by a call with no arguments. What happens? Here is the answer of g++ (4.6.1):

singleton.cc: In static member function 'static T* \
Singleton<T>::get_instance(Args ...) [with Args = {}, T = Map]':
singleton.cc:53:30:   instantiated from here
singleton.cc:16:9: erreur: no matching function for call to 'Map::Map()'
singleton.cc:16:9: note: candidates are:
singleton.cc:40:3: note: Map::Map(int, int)
singleton.cc:40:3: note:   candidate expects 2 arguments, 0 provided
singleton.cc:35:7: note: constexpr Map::Map(const Map&)
singleton.cc:35:7: note:   candidate expects 1 argument, 0 provided
singleton.cc:35:7: note: constexpr Map::Map(Map&&)
singleton.cc:35:7: note:   candidate expects 1 argument, 0 provided

The message is clear, and there is all the information needed to understand it, and to fix our mistake.

In this post I proposed the use of std::forward for building a generic singleton. I don't talk the deletion problem because for my personal use, I accept to call the destroy method.

But this is just one example to show how the C++11 can help programmers to build new things easily. I hope my article motivates you to experiment new things! Feel free to share your opinion in comments :-)

PS: I just realize that my current solution isn't perfect. In fact, we can't get the instance without giving a valid argument list (valid = there is a constructor that takes this list). So a call to get_instance without argument, which should return the instance leads to an error if there is no constructor that takes no argument. This is not really what we want. A fix would be to separate the initialization and the get_instance. But that doesn't invalidate what I wanted to demonstrate. So it's okay :)

Thursday, August 16, 2012

A wonderful internship

Today I talk about the internship I made from the middle of October (2011) to the middle of January (2012). I'll start by presenting a little the context, and what I have done exactly.

I worked at Aldebaran Robotics on NAO. The goal of my internship was to refactor a blob of code that allows NAO to recharging himself by going to its station. The behavior was developed in Python, and my goal was to build the same thing in C++. Here is the video that shows this behavior:

Pretty cool right? You can guess how excited I was to attack a project like this, and I start by analyzing the existing program to understand what are the keys to make it works in C++. And then, we (my supervisor and I) realized that we could build a better framework to track one or several object(s). We designed an architecture (what are the modules and their responsibilities), and (because my supervisor was great) I was responsible of the whole architecture: what are the objects, how they interact, etc. I develop all by myself and building a set of libraries that allows the user to build a tracker. The aim of the whole project was to being able to propose a set of tracker (a red ball tracker, an Aldebaran logo tracker, …) with as less duplication as possible, and we also wanted to let the user create its own kind of tracker by adding its own little brick of software. This was a success since one of my supervisor developed a face tracker in 20 minutes and 160 lines of code. I created a tool able to track one or two objects. When working with one object, NAO was able to follow an object with its head, or its whole body and staying at a given distance of the object. When there is two objects he is able to go to an exact position in a repair defined by the center of the two objects. A good tracker is also able to detect when it losts its target. Our tool starts by looking at the last place he saw it.

I think it is better if I don't go too much in details about this project since I don't owe its right. But here is what it looks like in video (yes I have asked for the rights for that :P).

In the first part of the video, I show the search of a red ball, and how it finds and follows it.

In the second part, it tracks the (kind of) pie-charts on the box, and go to the left red cross on the floor. He doesn't look at the floor, the red cross is just for us to see that it goes where we asked for!

And the last part was a reproduction of the charging station, just for seeing that we are able to reproduce it with my work. You'll see that it seems to stop. In fact he does because he reaches the first target, and I had to run through my computer to tell him to go to the second!

No more suspense, let's look at the video!

I feel like some personages in cartoons because you see my hand, my foot, my body, but never my head :-P

Feel free to ask if you have any questions about that, I'll be happy to answer.

Tuesday, August 14, 2012

Debugging C++ (Part 4): A print method

And here is the last post of this series about debugging C++. In this one, I develop a good way to use prints for debugging. The main drawback of the prints is that it is hard to maintain. The presented method doesn't have this problem.

I use prints sometimes, when valgrind tells me nothing, and that I can't run gdb for any reason (for example, I use opaque data structure and all the information gdb is able to give is "root node = 2"… Nice! It happens when I work with BDD (Binary Decision Diagram, I think I'll talk about it in the future)). This method works in C++, and is inspired of three things. I'll start by describing these things, and then I show the whole thing.

Akim Demaille, one of my teacher, gave me the base of this method. It consists in using macro to print. Here is the thing:

#define ECHO(content) std::cerr << content << std::endl
#define VAR(v) "`" #v "': " << v

int main(int argc, char *argv[])
{
  int i = 42;
  int j = 51;

  ECHO(VAR(i) << " - " << VAR(j)); // Outputs "`i': 42 - `j': 51"
}

I add quote _`'_ to see what happens when I want to see the result of an operation (i + j for example). Now, let's see how it works. In C or C++, you can make a concatenation by just putting two quoted things side by side. As an example this snippets

std::cout << "42 " "is " "the " "answer." << std::endl;

outputs "42 is the answer.". These method also relies on the stringification. The operator # put in front of a variable in a macro quotes it. It allows to make this. The macro VAR takes a variable, and returns its name quoted, and its value ready for being printed to std::cerr. And ECHO just print it followed by a newline.

Looks simple when you see it, but I found it ingenious. I remember myself making this job by hand so much time… I think this was very cool and a sane base to build something better!

We had a discussion with a friend of mine (Guillaume Delahodde) about a way to decide whether print a thing or not when we want to debug or not without any run-time overhead. And the result of this discussion was a macro that decides to execute the code or not:

#ifndef NDEBUG
# define LINE(content) content
#else
# define LINE(content)
#endif

I already talk about NDEBUG in the first post of this series about simple methods to avoid creating stupid bug, so I assume it is clear why it appears here. The usage of this is simple, put your print code inside LINE and it will be printed only in debug mode. So we just have to change the definition of ECHO a little, and it will only print on debug mode.

#define ECHO(content) LINE(std::cerr << content << std::endl)

Now our prints are only here in debug mode, and it is better than before. We could have stop here, but we could enhance the comfort of the user. I think writing VAR(i) << " - " << VAR(j) very annoying. It corresponds to 25 characters, and I think this is too much. I had the occasion to read the book of Andrei Alexandrescu (Modern C++ Design), and in there it uses a set of macros to add syntactic sugar for the user. It is hand written macros that threat the first argument and call the macro that threats n-1 arguments until 1. I call this macro LVN with N the number of argument. I just wrote it for 2, 3 and 4 arguments since they are the more common. Let's see how it looks like:

#define LV2(first, second) VAR(first) << " - " << VAR(second)
#define LV3(first, second, third) VAR(first) << " - " << LV2(second, third)
#define LV4(first, second, third, fourth) \
  VAR(first) << " - " << LV3(second, third, fourth)

And now we are able to replace our 25 characters to print 2 variables into: LV2(i, j). Much better right? Let's see the whole thing in an example:

#include <iostream>

#ifndef NDEBUG
# define LINE(line) line
#else
# define LINE(line)
#endif

#define ECHO(content) LINE(std::cerr << content << std::endl)
#define VAR(v) "`" #v "': " << v

#define LV2(first, second) VAR(first) << " - " << VAR(second)
#define LV3(first, second, third) VAR(first) << " - " << LV2(second, third)
#define LV4(first, second, third, fourth) \
  VAR(first) << " - " << LV3(second, third, fourth)

int main(int argc, char *argv[])
{
  int i = 4;
  int j = 51;

  std::cerr << "`i': " << i << " - " << " `j': " << j << std::endl;
      // Outputs the same thing as the following methods in a worse way.
  ECHO(VAR(i) << " - " << VAR(j)); // Still outputs "`i': 42 - `j': 51"
  ECHO(LV2(i, j));                 // Also outputs "`i': 42 - `j': 51"
}

The first print has the drawback to be boring to maintain. If you go to the release mode you have to track all of these prints and remove them, if you change the name of a variable you have to change it twice (assuming you're not using a perfect refactoring tool), and it is longer to write. The last version is simple to write, you don't have to haunt it to prevent them from printing, and they introduces no overhead in release mode. We are far from the dummy printf method, and it is cool.

The moral of this series is that there is a lot of debugging methods, and you have to know them for being able to adapt your method to the situation. For example, dmesg is very specific, but the other method were very hard to use in these situations.

I invite you to share yours in the comments. I'm sure that there is a lot of other method out there, just waiting for being learned!

Sunday, August 12, 2012

Debugging C++ (Part 3): dmesg

Welcome in the third post of this series about debugging C++. In here, I will talk about something less usual because it allows to debug after the crash of the program, this method use dmesg. I just present the case where we work with several libraries and your program crashes without any clue on which library is responsible of this, nor how to reproduce this behavior.

1 dmesg

I heard about dmesg when reading the tsuna's blog. Unfortunately I don't have any competence (for now) for reading assembler. But the fact that we can discover the name of the faulty function is helpful. I had to use this when I worked on a robot during my internship (a next post will present that). We worked with libraries and this is what shows my post.

On a robot there is a lot of parameters coming from the miscellaneous sensors, and the execution of the same program depends on a lot of parameters. So it is really hard to reproduce a bug. If the nice "segmentation fault" message appears, how can you debug that? Considering that you can't run valgrind, and running gdb is painful.

dmesg was my solution. I wrote a shell script to make the computation for me. Let's start by creating a dummy library which exports one function that segfault if a null pointer is given, and let be sadistic, we will call it with nullptr.

// file: libprint.hh
#ifndef TMP_LIBPRINT_HH_
# define TMP_LIBPRINT_HH_

int dereference(int* t);

#endif // !TMP_LIBPRINT_HH_


// file: libprint.cc
#include "libprint.hh"

int dereference(int* t)
{
  return *t;
}

// file: main.cc
#include "libprint.hh"

int main()
{
  return dereference(nullptr);
}

We create a libprint.so that contains the dereference function. And we compile the file main into a binary print linked with this library. And oh, surprise! Segmentation fault. Let's start the hunting. We call dmesg, and look at the last line:

[184608.332284] print[31332]: segfault at 0 ip b772e422 sp bf8ad218 error 4 in libprint.so[b772e000+1000]

We need two information: the name of the library that contains the bug, and the address of the faulty instruction in this library. To get the name of the library, we have to take the last field and to remove the part into []. To have the address of the faulty instruction we have to take the value of the instruction pointer (ip), and the value before the + in the last field. And we just have to subtract the value of the second value to the value of ip. If you are wondering why subtracting these two values to know the address of the ip in the library a draw may help.

I hope the picture helped, in fact, this subtraction removes the offset corresponding to the position of the library (address).

The question is how to make this process automatically? First, we can make the assumption that we always run dmesg right after the error, so we can suppose that we can make a call to tail to keep only the last line. But sometimes this assumption isn't correct, so our solution must be able to get a value given in argument. In here we use the shell default value assignment. As a little remainder:

output=$1
output=${output:="default value"}

If an argument is given, output will be equal to its value, otherwise it will be equal to "default value". So we can use it to decide whether we use the first argument of the program or directly call dmesg.

The part of the message before the colon is useless, so we can remove it. Then we have to get the value of the fifth field to get the value associated to ip, and we have to get the last field.

The name of the library and the address where it is mapped in the memory lie in the last field. So we have to cut it in two and we can get the needed information.

All these operations can be made by using only awk and sed.

Once we have the two addresses we just have to make the operation. We use the builtin system of the shell to make the subtract. Beware, they are in hexadecimal! So we must prefix the value by 0x to tell the base to the shell. Now we have the result (in decimal), we want it converted into hexadecimal, we use bc. It is a tool for making numeric computations. And we are grateful, there is a way to make it convert a number from a base to another. The syntax is simple, you have to set the variable obase to 16 (default value is 10). And that's all, remember to append the 0x before the address, because bc won't.

Here is the complete script:

#! /bin/sh

output=$1
output=${output:=`dmesg | tail -1`}
output=`echo $output | sed -e 's/.*: //'`

first=`echo $output | awk '{ print $5; }'`
second=`echo $output | awk '{print $11; }'`

library=`echo $second | sed -e 's/\[.*//'`
second=`echo $second | sed -e 's/.*\[//' -e 's/\+.*//'`

address=`echo $((0x$first - 0x$second))`
address=`echo "obase=16; $address" | bc`

echo "Segmentation fault in $library at: 0x$address."

And the way to use it is simple, just run it just after a segmentation fault when working with a library. Here is what it says about our case.

$ ./dmesg.sh
Segmentation fault in libprint.so at: 0x422.

And now, just run gdb like this (it is how I get with my libprint.so example):

$ gdb libprint.so
...
(gdb) disass 0x422
Dump of assembler code for function _Z11dereferencePi:
   0x0000041c <+0>:     push   %ebp
   0x0000041d <+1>:     mov    %esp,%ebp
   0x0000041f <+3>:     mov    0x8(%ebp),%eax
   0x00000422 <+6>:     mov    (%eax),%eax
   0x00000424 <+8>:     pop    %ebp
   0x00000425 <+9>:     ret
End of assembler dump.
(gdb) ...

If you are fluent with assembler you could read it, or use the meta data given by gdb: "Z11dereferencePi". Oops, I realized that I have forgot to use "-g" when compiling. Not important: we have a mangled symbol. We can use one of the method presented in one of my previous post. And voila, we know that our mistake is in the function dereference(int*). Pretty good when, without this method I was unable to know where it fails, why, and in the impossibility to reproduce it since there is too much parameters. I don't know how I would have done without this method.

I put this script on my github account, so if you want to fork it to enhance it, it is possible.

Hope you liked it!

Friday, August 10, 2012

Debugging C++ (Part 2): Valgrind and gdb

This article is the second post of the "Debugging C++" series. This post is an introduction of Valgrind and gdb. It is intended for people that don't know these because it starts from the beginning and gives some links to the documentation. In the valgrind part, I present the notion of "definitely lost", "indirectly lost", etc. If you are already familiar with these notions, I invite you to go through the second part of this post about gdb. In this second part I start by briefly presenting what is gdb and what it is useful. But the main interest of this section is that I present the notion of reverse-debugging, which allows you to run your program backward. I also present a useful trick when you want to debug a code you wrote (and you can modify) inside an unknown environment (for example, imagine you want to debug your (buggy) malloc implementation and you test it with ls).

Before starting tools like Valgrind or gdb, think about compiling your code in debug mode, and not using optimizations. Otherwise you'll get trouble to debug it. For example with gdb, your cursor will jump from one line to another without following the linearity of the source code. This can be disappointing, this is due to that.

1 Valgrind

Valgrind is a set of tools that contains a memory checker (memcheck), a profiler (users can use two modules: callgrind and cachegrind), a tool for checking the memory consumption (massif), a synchronisation checker (Helgrind). The tools I use the most in this list is Memcheck, and this is all I will talk about in this post. I'm sure there will be other posts on this list of tools in the future. This is just a brief introduction. If you already know valgrind, you should skip this part.

Memcheck helps you to detect any memory corruption. How it looks like? Let's assume you have a program like this:

#include <vector>

int main(int argc, char *argv[])
{
  std::vector<int> v;
  v.reserve(2);

  v[2] = 4;
}

The call to reserve allocates the space for two integers. But we try to write in a unallocated address. So it is a mistake from the programmer, and even if in this example it is trivial that there is a mistake, sometimes it is not, because it is hidden in a lot of code. So, detecting this is not so simple. That's where Memcheck is helpful. Here is what he says about this snippet.

==4641== Invalid write of size 4
==4641==    at 0x40099A: main (test.cc:8)
==4641==  Address 0x5955048 is 0 bytes after a block of size 8 alloc d
==4641==    at 0x4C286E7: operator new(unsigned long)\
 (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==4641==    by 0x400FB3: __gnu_cxx::new_allocator<int>::\
 allocate(unsigned long, void const*) (new_allocator.h:94)
==4641==    by 0x400ECA: std::_Vector_base<int, std::allocator<int> >::\
 _M_allocate(unsigned long) (in /tmp/aout)
==4641==    by 0x400D3B: int* std::vector<int, std::allocator<int> >::\
 _M_allocate_and_copy<std::move_iterator<int*> >(unsigned long, \
 std::move_iterator<int*>, std::move_iterator<int*>) (stl_vector.h:1109)
==4641==    by 0x400AC8: std::vector<int, std::allocator<int> >::\
 reserve(unsigned long) (vector.tcc:76)
==4641==    by 0x400988: main (test.cc:6)

4641 corresponds to the PID of the process. Every line coming from valgrind is formatted like this: ==PID==. Line beginning by a space are just the continuation of the line above. It was just for having the output of valgrind fitting in the page.

Remember to use "-g" when you compile your program, otherwise you'll get the name of the binary instead of the name of the file and the line number.

It starts by giving the name of the violation: "Invalid write" and tells us how many bytes we violate. Here this is 4 (sizeof(int)). Then he tells us where we have made a mistake, and then he shows what is the mistake. We have written after an allocated area. And then he shows the stack of the call that led to allocate this area. It starts by a call to reserve in the main line 6, and so on.

As you can see, the message is clear and it helps finding this kind of mistakes easily. This is a valuable tool for writing bug-free software. Every of your program should work without any error in valgrind. Because even if it doesn't create an error directly, it can lead to very weird error later and this is called (in my school at least^^) a mystical error. Because there is no clue that could help (without valgrind).

This is a simple example just to see how powerful is this tool. Another common use of Memcheck is its ability to detect memory leaks. As an example:

int main()
{
  int* p = new int;

  p = nullptr;
}

This is a trivial case of a memory leak because after the affectation of p to nullptr, there is no more pointer to the area allocated with new. It is lost. This is trivial to detect it, but that could be less trivial. So once again, we can rely on Memcheck to warn us. Let's see what he has to say about this:

==4736== HEAP SUMMARY:
==4736==     in use at exit: 4 bytes in 1 blocks
==4736==   total heap usage: 1 allocs, 0 frees, 4 bytes allocated
==4736==
==4736== 4 bytes in 1 blocks are definitely lost in loss record 1 of 1
==4736==    at 0x4C286E7: operator new(unsigned long) (in \
/usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==4736==    by 0x40060D: main (tests.cc:3)
==4736==
==4736== LEAK SUMMARY:
==4736==    definitely lost: 4 bytes in 1 blocks
==4736==    indirectly lost: 0 bytes in 0 blocks
==4736==      possibly lost: 0 bytes in 0 blocks
==4736==    still reachable: 0 bytes in 0 blocks
==4736==         suppressed: 0 bytes in 0 blocks

I just show the interesting part of the output of valgrind. To get this, I ran valgrind –leak-check=full <program_name>. We start by reading the heap summary, which says that when we exit, 4 bytes are still in use. It indicates the number of allocation and the number of deallocation. We see that the two numbers are not equal, so there is a problem. There is more allocation than free, it's a leak.

At the bottom of the message we can see the kind of leak. There are four kinds of leaks. The full explanation can be found in the documentation, in the section 4.2.7. A short explanation follows.

AA and BB are heap blocks, and C the set of pointer of your program (in reality it is more than that. See the section 4.2.7 cited above). And consider an arrow represent that there is at least one pointer to the first byte of the allocated area, and an arrow and a `?' represents a pointer to a byte inside the allocated area and not the first (this is called interior-pointer in the link given above). No arrow means that the heap block is unreachable. These are simplified examples to understand the concept behind these things.

Once we have understood what represents these names, we can look at the message in the middle that gives the location of each memory blocks that leak. It says definitely lost. So we just override the address of this area, and it appears line 3. Once we know the supposed reason of this leak and its position in our source file, it is easy to fix it.

This is the power of Memcheck. This is the first thing I run when I have a bug. Because even if the error he can reveal is not responsible of the bug, it will be annoying in the future.

A important note about valgrind, since it use its own allocator and deallocator, some bugs may disappear when using Memcheck. It happens to me (at least) once, when I had the output of my program that depends on the state of the computer (state of the heap in fact), which led my algorithm to have different results if I run it several time consecutively. This is generally due to a memory corruption or something like that, so I used Memcheck. And under Memcheck, the program starts to act normally.

An interesting post about valgrind that talks about can be found here.

2 gdb

A second tool I use when I still have a bug after using valgrind is gdb (GNU Debugger).

gdb is a debugger that allows you to walk through your code at run-time and see how the lines you (or your colleagues) wrote influence the program and detect bugs.

I don't want to write a basic tutorial to gdb since there are really tons of them on the Internet. I'd prefer talking about two things that can be useful and not known by everyone. The first one is backward debugging. The second one is a trick to debug a library when you can't run gdb on the program that uses the library. Baptiste Afsa, one of my teacher at Epita, gave me this trick.

2.1 Backward debugging

Why backward debugging? Simply because I facepalmed myself too many time after going too far in my debugging process. I mean going after the critical point after running through trivial code slowly for 5 minutes and being forced to restart the whole thing.

This feature was introduced in 2009 with gdb version 7.0. Documentation is here. To be able to enable this feature, you need to tell gdb to record the state of your program. The official tutorial is here. So I don't have to explain that in details (No I'm not lazy! :) but I don't like duplication and I don't like duplication). I recommend to read the tutorial!

But the thing is that recording introduces an overhead when you use next, continue or whatever. This can slow down the debugging process, but I think it can really help the programmer.

Give it a try, it is helpful.

2.2 The infinite loop trick

This tips allows you to debug the code of a library call by a program that you have difficulties to run with gdb. The case where I use it is when we had to code malloc (free, realloc, calloc too) at school. And to check if it works we used LD_PRELOAD to make binary use our version of malloc instead of the standard version.

Let's assume we run ls with our malloc, how would you debug it?

$ gdb ls
...
Reading symbols from /bin/ls...(no debugging symbols found)...done.
(gdb) break main
Function "main" not defined.

Well… Seems hard right? The solution given by my teacher was to use an infinite loop. How it works:

#include <iostream>

int main()
{
  bool cond = true;

  while (cond)
    continue;

  std::cout << "Out of the infinite loop" << std::endl;
  // Stuff
}

The trick was to add this snippet at the beginning of the code of our function, and to run the program. For example, compile the code above (don't forget -ggdb). This is an infinite loop, yay!

gdb can attach to a running process with its PID, and is able to set the value of a variable when debugging. So the trick is just:

$ pgrep <program_name>
<pid>
$ gdb attach <pid>
...
main () at tests.cc:8
8           continue;
(gdb) set var cond = false
(gdb) n
7         while (cond)
(gdb) n
10        std::cout << "Out of the infinite loop" << std::endl;
(gdb)

By setting the variable cond to false, we are out of the infinite loop, and we have the hand on the program to debug as we want. Impressive right?

I think this trick is useful for desperate situation like the one described above.

This was two great features I wanted to share with you. Do you have features that you want to share?