CSCI 262 Data Structures

Spring 2018

Lab 6: Debug

Goals


Overview


This lab will take you through some simple uses of the gdb debugger, applied to a somewhat silly program based on a previous week's lab. As usual, we ask that you work with a partner or a team. If you end up solo, please raise your hand and we'll pair you with someone else (or make a three-person team if necessary). You are welcome to change partners throughout the semester!


Instructions


Step 0: preliminaries

Download lab06.zip, which contains all the files you need to complete this lab. Create a project on Cloud9 and add the source files from the zip file into the project. Remember you can also obtain the files directly in Cloud9 using

    wget https://cs.mines.edu/Courses/csci262/spring2018/labs/6-debug/lab06.zip

The zip file should contain 3 source files (main.cpp, rwg.cpp, rwg.h), a Makefile, and dictionary.txt. The code should compile with make without any problem. You may notice when compiling that the source files are compiled using the -g flag, indicating that debugging information is included in the resulting executable. The resulting executable will be named lab06.

Step 1: First bug

The lab06 program, when working, should generate some random words from the provided dictionary. Give the program a try now, giving it the dictionary.txt file as input when prompted.

Okay, it obviously didn't work. You can probably look at the code and find the problem in a few minutes, but this lab is intended to help you with using gdb, so let's find the bug that way. Start by running gdb on the lab06 executable (you may need to expand your terminal window a bit before doing this - you'll want to see quite a few lines at a time):

    gdb lab06

After entering the command above you should see a bunch of lines looking something like:

    GNU gdb (Ubuntu 7.11.1-0ubuntu1~16.5) 7.11.1
    Copyright (C) 2016 Free Software Foundation, Inc.
    License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
    This is free software: you are free to change and redistribute it.
    There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
    and "show warranty" for details.
    This GDB was configured as "x86_64-linux-gnu".
    Type "show configuration" for configuration details.
    For bug reporting instructions, please see:
    <http://www.gnu.org/software/gdb/bugs/>.
    Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.
    For help, type "help".
    Type "apropos word" to search for commands related to "word"...
    Reading symbols from lab06...done.
    (gdb) 

The last line, showing (gdb), is what the command prompt in gdb looks like. You can quit out of gdb at any time by entering "q" at a command prompt. Also notice that you can enter "help" at a command prompt to get help on using the debugger, although if you have the gdb quick reference I provided during the lecture on debugging, that may be more helpful (the quick reference can also be found under the course website's Help page, or at: http://cs.mines.edu/Courses/csci262/spring2018/resources/gdb-refcard.pdf).

To start getting familiar with the debugger, let's set a breakpoint on main(), and start single-stepping through lines of code until the program exits. To set the breakpoint, enter:

    break main

at the command prompt. You should get a response saying that a breakpoint has been set at line 9 of file main.cpp.

Now you can run the program using

    run

Almost immediately, the program should halt at the start of main. You should see something like this:

    Breakpoint 1, main () at main.cpp:9
    9	int main() {
    (gdb)

Note that you have a command prompt again, so you can enter gdb commands. In our case, the command we want to enter is

    next

You can also abbreviate most commands in gdb to one or two letters - so n will work as well as next. Go ahead and enter next to get to the next (non empty) line of code. The debugger should show you the next line of code, and return you to a command prompt.

You can repeat the last command in gdb by hitting enter at the command prompt. Go ahead and step through the next few lines using next, or hitting enter. Note that if you step over a line with a cout statement, output is displayed on its own line. When you step over a line with a cin statement, you will be prompted for input as usual. So you can interact with the program in the normal fashion. Go ahead and step until you are prompted for the dictionary file, and enter dictionary.txt as before. Keep stepping with next until you get to line 31 in the code (the line number is shown at the very left on the same line as the displayed code). You should be looking at:

    31		if (generator.is_empty()) {
    (gdb) 

At this time, use the list command to see the code around the current line. By default list displays 10 lines of code, with the current line in the middle. You can get back to a view of just the current line by entering

    frame 0

which tells the debugger to go to the current top of the function call stack and display the current line of code.

When you do list, you can see that the current line of code is critical. If the generator.is_empty() call returns true, the program will exit with the message you got previously when you ran the code. Let's see why generator.is_empty() is returning true, by stepping into the function, rather than over the next line. The next command we've been using steps over; to step into, enter

    step

You should now see a line of code from a different file:

    rwg::is_empty (this=0x7fffffffdd50) at rwg.cpp:47
    47		return _max_length == 0;

If you do list, or simply look at line 47 in rwg.cpp, you'll see that the current line of code is, in fact, the entirety of the is_empty() method. So we can guess that we are about to return true if we step over the next line of code. First, let's use the print command to check a few things. The print command is pretty powerful; it will do its best to display a useful representation of whatever variables or expressions you ask it to print. For example, we can see the value of the member variable _max_length using

    print _max_length

You might know or recall that the C++ keyword this is a pointer to the object in which a method is executing. So, if you want, you can also see a display of the current rwg object by doing print *this (note you need to dereference, or you'll just get the address of the object in memory!). Unfortunately, in this case printing the object is not very helpful, as it will try to recursively print all of the member variables - which includes the map object holding all of the words from dictionary.txt! If you've tried it, just use "q" to stop displaying (you should see a prompt at the bottom of the window telling you you can do this).
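As a reminder of the difference between this and *this, here is a minimal C++ sketch. The counter class and its methods are hypothetical and are not part of the lab code; they only illustrate that this holds an address, while *this is the object stored at that address.

```cpp
// Hypothetical class, for illustration only (not part of lab06).
class counter {
public:
    explicit counter(int start) : _value(start) {}

    // Inside a method, `this` is a pointer to the object the method
    // was called on, so this->_value and _value name the same member.
    int value() const { return this->_value; }

    // Returns the address held in `this`, i.e. where the object lives.
    const counter* address() const { return this; }

    // Dereferencing `this` yields the object itself; returning *this
    // by value makes a copy of the whole object.
    counter copy() const { return *this; }

private:
    int _value;
};
```

For a counter c, c.address() is the same address as &c, which is the kind of value gdb shows for this in a frame header, while print *this corresponds to displaying the members of the object itself.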

Feel free at this point to explore the code in rwg.h to see what member variables exist in the rwg object, and print them. You can also try expressions, such as

    print _words.size()

Unfortunately, there are some limitations. Most annoyingly, gdb cannot "see" inlined methods of classes, which means some methods just can't be used in a print expression. For instance, you will have no luck trying to see the first key of the _words object (a map) using print _words.begin()->first, probably indicating begin() is an inline method of the map class.

Okay, clearly something is wrong in our code, because _max_length is zero, which means we're about to return true, which means our program is going to quit even though we supplied a valid dictionary file. You might want to spend some time at this point looking at rwg.h to see if you can figure out the broad outlines of how the rwg class is supposed to work. Or, I can just tell you that the _max_length member variable is supposed to be storing the length of the longest word stored in the map _words. If you try printing _words you will immediately see plenty of words of varying length! So somehow, the _max_length variable got set incorrectly. (Ignore the fact that this is probably the wrong way to check to see if we have words - just go with it.)
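To see how a check like this can go wrong in general, consider the following minimal sketch. It is not the lab's rwg class: the class name word_bag, its add method, and the track_length switch are all hypothetical, and the sketch simply assumes that words are stored in a map keyed by word length. It shows that an emptiness test based on a separately maintained length can disagree with the container when that length is not kept up to date.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a class like rwg; names are illustrative only.
class word_bag {
public:
    // Stores a word under its length. When track_length is false the
    // cached maximum is deliberately not updated, simulating a bug.
    void add(const std::string& word, bool track_length = true) {
        _words[word.size()].push_back(word);
        if (track_length && word.size() > _max_length) {
            _max_length = word.size();
        }
    }

    // Same shape as the check at rwg.cpp:47: trusts the cached value.
    bool is_empty_by_length() const { return _max_length == 0; }

    // Asks the container directly, so the answer cannot go stale.
    bool is_empty_by_container() const { return _words.empty(); }

private:
    std::map<std::size_t, std::vector<std::string>> _words;
    std::size_t _max_length = 0;
};
```

If a word is added with track_length set to false, is_empty_by_length() still reports an empty bag while is_empty_by_container() does not. That mismatch between a cached value and the data it describes is the kind of symptom the print commands above reveal.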

There are a lot of ways to proceed at this point, including ways to use the debugger, but time is short, so let's cut to the chase. The problem is in the initialize method of rwg. For practice, let's use the debugger to find the offending line of code. You can kill the currently running instance of the program with kill, or simply exit the debugger and restart. Let's do the latter - you need the practice - following the steps above, but don't set a breakpoint on main(). Instead, set a breakpoint on rwg::initialize - to break on a method, you need to include the scope resolution operator giving the class as well as the method name.

Run the code again with this new breakpoint, then use next to start single-stepping through the lines in initialize(). Be sure not to use step, unless you want to step into some gnarly system code! (If you do so accidentally, for now just quit and restart. For reference, the easiest way I know to get out of the system code is to set a breakpoint on the next line of code that you want to go to, and issue the continue command. You can also use the finish command to run to the end of the current function, but you might have to do that multiple times.) Single step until you hit line 39, and you should see the bug.

Fix the bug by editing line 39 of rwg.cpp only (don't fix any other bugs you happen to notice), re-build the project using make, and continue on to the next step.

Step 2: Second bug

Try running the program again. You should get a little farther this time - it will let you specify the number of words you want to generate. Try asking it for 1000 words. You should (unless you get fantastically lucky) get a message like

    terminate called after throwing an instance of 'std::out_of_range'
      what():  map::at

possibly along with some generated words. It is also possible, but less likely, that you will get a segmentation fault. Try a few times to see.

To the debugger! This time, we'll run the program under the debugger and let it stop the code for us, by stopping when the "out_of_range" exception occurs. (Or, you might get the segmentation fault - if so, restart and try again.)

    gdb lab06
    run

You should see something like

    terminate called after throwing an instance of 'std::out_of_range'
      what():  map::at
    dispossessed 
    Program received signal SIGABRT, Aborted.
    0x00007ffff74aa428 in __GI_raise (sig=sig@entry=6)
        at ../sysdeps/unix/sysv/linux/raise.c:54
    54	../sysdeps/unix/sysv/linux/raise.c: No such file or directory.

Note the line number and file name - you are deep in system code at this time. We need to see the function call stack, to see where we are in relation to our code. Use the backtrace command (abbreviation: bt) to see the function call stack. This is going to print out a lot of lines, so unless your terminal window is very large you may have to hit enter a few times to see all of the lines. You should see something like

    #0  0x00007ffff74aa428 in __GI_raise (sig=sig@entry=6)
        at ../sysdeps/unix/sysv/linux/raise.c:54
    #1  0x00007ffff74ac02a in __GI_abort () at abort.c:89
    #2  0x00007ffff7ae484d in __gnu_cxx::__verbose_terminate_handler() ()
       from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
    #3  0x00007ffff7ae26b6 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
    #4  0x00007ffff7ae2701 in std::terminate() ()
       from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
    #5  0x00007ffff7ae2919 in __cxa_throw ()
       from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
    #6  0x00007ffff7b0b2cf in std::__throw_out_of_range(char const*) ()
       from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
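    #7  0x0000000000402c22 in std::map<int, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >, std::less<int>, std::allocator<std::pair<int const, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > > > >::at (this=0x7fffffffdd50, __k=@0x7fffffffdbfc: 26)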
        at /usr/include/c++/5/bits/stl_map.h:523
    #8  0x0000000000402654 in rwg::_get_words[abi:cxx11](int) (
        this=0x7fffffffdd50, len=26) at rwg.cpp:90
    #9  0x00000000004025a1 in rwg::_get_word[abi:cxx11](int) (this=0x7fffffffdd50, 
        len=26) at rwg.cpp:80
    #10 0x00000000004023d2 in rwg::generate (this=0x7fffffffdd50, n=1000, out=...)
        at rwg.cpp:56
    #11 0x00000000004019f6 in main () at main.cpp:42

Most of that isn't useful to us (maybe if you were a deep system programmer it would be). The interesting stuff is near the bottom, where we see our files: rwg.cpp and main.cpp. So we really don't want to be at the top of the function call stack (frame 0), because we can't see anything useful - we want to find the bug in our code. To get to a different call stack "frame", we can either use the command up to go up one frame at a time until we get where we want, or go directly to frame number n using frame n. In my listing, it looks like our problems really begin around frame 8, so:

    frame 8

gets me this output:

    #8  0x0000000000402654 in rwg::_get_words[abi:cxx11](int) (
        this=0x7fffffffdd50, len=26) at rwg.cpp:90
    90		return _words.at(len);

Note that we can see already that we are in the method rwg::_get_words, and that its parameter is len with a value of 26 (in my example - you may have a different number). You can also use print to see the value of len. It appears that we got in trouble executing line 90 in our code. The exception that we got indicates that the problem is we asked for a key that doesn't exist in our map. (The documentation for the map at() method states the reason for the exception: http://www.cplusplus.com/reference/map/map/at/.) So somehow we got the notion that 26 was a valid choice. How did this happen?
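The behavior of at() is easy to reproduce in isolation. The sketch below assumes a map from word length to a list of words, which is an assumption about the shape of _words based on how it is used in this lab; the alias word_map, the function names words_of_length and lookup_throws, and any sample data are hypothetical.

```cpp
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

using word_map = std::map<int, std::vector<std::string>>;

// Returns the words stored under len. map::at throws std::out_of_range
// when the key is absent; it never inserts a new entry.
const std::vector<std::string>& words_of_length(const word_map& words, int len) {
    return words.at(len);
}

// Reports whether a lookup for len would throw, by catching the exception.
bool lookup_throws(const word_map& words, int len) {
    try {
        words_of_length(words, len);
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}
```

With entries only for lengths 3 and 5, lookup_throws(words, 26) returns true. In the lab program nothing catches the exception, so std::terminate is called and the process aborts, which is the SIGABRT that gdb reports.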

We need to go up further and see where _get_words was called, and maybe further than that. Try using a combination of up commands and list commands to see if you can spot where the bug is occurring.

This part is a bit subtle, and may be hard to debug since you didn't write the code and aren't super familiar with the approach taken. So let me summarize: Essentially, the problem occurs in the _get_lengths method, because it uses a too-simple algorithm to determine valid word lengths in the dictionary: it simply chooses values between the minimum and maximum lengths. However, there are some gaps - not all word sizes in this range are represented in the dictionary. The code needs to be fixed to verify that a randomly generated length is actually in the dictionary before adding it to the vector of lengths. Modify the code to do this check, and you'll have fixed the second bug.
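One way to express the check described above is sketched below. This is not the lab's _get_lengths implementation: the function name pick_lengths, its parameters, the word_map alias, and the use of std::mt19937 are assumptions made for illustration. It draws candidate lengths uniformly between the smallest and largest keys and keeps only those that are actually keys in the map.

```cpp
#include <map>
#include <random>
#include <string>
#include <vector>

using word_map = std::map<int, std::vector<std::string>>;

// Hypothetical helper: returns n random lengths, each guaranteed to be
// a key of words. Returns an empty vector if words is empty.
std::vector<int> pick_lengths(const word_map& words, int n, std::mt19937& rng) {
    std::vector<int> lengths;
    if (words.empty()) return lengths;

    // std::map keeps keys sorted, so the first and last keys are the
    // minimum and maximum word lengths.
    int min_len = words.begin()->first;
    int max_len = words.rbegin()->first;
    std::uniform_int_distribution<int> dist(min_len, max_len);

    while (static_cast<int>(lengths.size()) < n) {
        int len = dist(rng);
        // count() is 1 if the key exists and 0 otherwise; skip the gaps.
        if (words.count(len) > 0) {
            lengths.push_back(len);
        }
    }
    return lengths;
}
```

Rejecting and redrawing keeps the code short. An alternative design would be to collect the existing keys into a vector once and pick a random index into it, which avoids repeated draws when the range has many gaps.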

Step 3: Third bug

If you've fixed the bug properly and retry the program, you should (again, unless lucky) get a segmentation fault. The difference between the error we got last time and this time is that the previous error was generated by an exception - a "friendly" kind of error - because we used the at() method of map. In this case, we evidently did not use the at() method of some object; instead, we went out of range and accessed memory we shouldn't have.
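The contrast between the two kinds of failure can be sketched with std::vector, used here only as an example container (the lab text does not say which object is involved in the third bug). The helper names checked_get and guarded_get are hypothetical. The at() method checks the index and throws, while operator[] does no check, so an out-of-range index is undefined behavior and may or may not produce a segmentation fault.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Checked access: returns true and stores the element in out if index
// is valid; returns false if at() throws std::out_of_range.
bool checked_get(const std::vector<int>& v, std::size_t index, int& out) {
    try {
        out = v.at(index);
        return true;
    } catch (const std::out_of_range&) {
        return false;
    }
}

// Unchecked access guarded by an explicit test. Calling v[index] with
// index >= v.size() would be undefined behavior, so we test first.
bool guarded_get(const std::vector<int>& v, std::size_t index, int& out) {
    if (index >= v.size()) return false;
    out = v[index];
    return true;
}
```

Both helpers report failure for an index equal to the vector's size. Without either the at() call or the explicit size test, the same access would read memory the vector does not own.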

Fortunately, gdb is equally adept at halting the program when a segmentation fault occurs. So fire up the debugger once more! Run your program, and let gdb tell you where the error occurs. At this point you know enough gdb to find the bug yourself - you don't need to do much other than examine the stack trace and examine variables and list the code around the point where the error occurred in our code. See if you can find and fix the bug yourself - if not, a TA or classmate can help.


Wrapping Up


Obviously there are a lot of things that gdb can do that we didn't cover here: more advanced and conditional breakpoints, display variables, probes, and much more. You'll have to explore some of this on your own, using the gdb manual or quick reference card, the internet, and your own buggy code :) Also, IDEs other than Cloud9 have their own debuggers, which should have nearly identical features but different ways of accessing them. Hopefully we've at least given you the tools to get a running start.

To complete this assignment, please complete the quiz for this lab in Canvas. The quiz asks you to explain, in a few sentences, the nature and cause of the third (and hopefully final) bug in the random word generator.

NOTE: Just because we fixed every bug we found and the program now runs, we can't be certain we got all the bugs!!! As Edsger W. Dijkstra once said, "Program testing can be used to show the presence of bugs, but never to show their absence!"