Whattup With Walt: Whattup with debugging?

Every time I’ve had a big debugging problem, it was because something was going on in the code that I wasn’t seeing.

I’m sure you’ve had the experience of, after solving a bug, saying, “Oh, of course, that should have been obvious.”

Well, it would have been obvious if you had been able to see it sooner.

Having that experience, over and over again, taught me to make the inner workings of my program as visible as possible.

#1. Write complete, detailed logs

I start with writing complete and detailed logging. My log messages contain the date and time (down to a microsecond time stamp), thread, log level, filename, line number, and the actual message itself. I have multiple logging levels, of course, but an interesting one is the level I call “unexpected.” This is where I log a message when a rarely taken conditional branch is taken, even though it may not be an error.
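Here is a minimal sketch of what assembling such a log line might look like. The names `Level`, `level_name` and `format_log` are hypothetical (they are not from my real logger), and the code assumes a POSIX system for `localtime_r`:

```cpp
#include <chrono>
#include <cstdio>
#include <ctime>
#include <sstream>
#include <string>
#include <thread>

// Hypothetical levels; Unexpected marks a rare branch that is not necessarily an error.
enum class Level { Debug, Info, Unexpected, Error };

inline const char* level_name(Level lv) {
    switch (lv) {
        case Level::Debug:      return "DEBUG";
        case Level::Info:       return "INFO";
        case Level::Unexpected: return "UNEXPECTED";
        case Level::Error:      return "ERROR";
    }
    return "?";
}

// Build one log line: date, time with microseconds, thread id, level, file:line, message.
inline std::string format_log(Level lv, const char* file, int line, const std::string& msg) {
    using namespace std::chrono;
    const auto now = system_clock::now();
    const std::time_t secs = system_clock::to_time_t(now);
    const long long micros =
        duration_cast<microseconds>(now.time_since_epoch()).count() % 1000000;

    std::tm tm_buf{};
    localtime_r(&secs, &tm_buf);  // POSIX; thread-safe variant of localtime
    char stamp[32];
    std::strftime(stamp, sizeof stamp, "%Y-%m-%d %H:%M:%S", &tm_buf);

    char usec[8];
    std::snprintf(usec, sizeof usec, "%06lld", micros);

    std::ostringstream out;
    out << stamp << '.' << usec
        << " [tid " << std::this_thread::get_id() << "] "
        << level_name(lv) << ' ' << file << ':' << line << ' ' << msg;
    return out.str();
}
```

A call such as `format_log(Level::Unexpected, __FILE__, __LINE__, "retry branch taken")` returns the whole line as a string, ready to be written wherever the logs go.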

#2. Structure logs as macros

I structure the logging facility as macros, so that the logs of varying verbosity can be selectively compiled out, if higher performance is needed.
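As a rough sketch of the idea (not my actual macros), the verbosity ceiling can be a compile-time constant, and any level above it expands to nothing. The names `LOG_MAX_LEVEL`, `LOG_EMIT`, `LOG_DEBUG`, `LOG_INFO` and `LOG_ERROR` are made up for this illustration:

```cpp
#include <cstdio>

// Hypothetical numeric levels; a higher number means more verbose.
#define LOG_LEVEL_ERROR 0
#define LOG_LEVEL_INFO  1
#define LOG_LEVEL_DEBUG 2

// Set at build time, e.g. -DLOG_MAX_LEVEL=0 for a performance build.
#ifndef LOG_MAX_LEVEL
#define LOG_MAX_LEVEL LOG_LEVEL_INFO
#endif

// Shared emitter: prefixes the message with its level and source location.
#define LOG_EMIT(tag, ...)                                          \
    do {                                                            \
        std::fprintf(stderr, "%s %s:%d ", tag, __FILE__, __LINE__); \
        std::fprintf(stderr, __VA_ARGS__);                          \
        std::fputc('\n', stderr);                                   \
    } while (0)

#define LOG_ERROR(...) LOG_EMIT("ERROR", __VA_ARGS__)

#if LOG_MAX_LEVEL >= LOG_LEVEL_INFO
#define LOG_INFO(...) LOG_EMIT("INFO", __VA_ARGS__)
#else
#define LOG_INFO(...) do { } while (0)  // compiled out: arguments are never evaluated
#endif

#if LOG_MAX_LEVEL >= LOG_LEVEL_DEBUG
#define LOG_DEBUG(...) LOG_EMIT("DEBUG", __VA_ARGS__)
#else
#define LOG_DEBUG(...) do { } while (0)  // compiled out: arguments are never evaluated
#endif
```

With the default ceiling shown here, `LOG_DEBUG("x = %d", x)` costs nothing at run time, because the preprocessor removes the call and its arguments before the compiler ever sees them. That is also why a log argument should never have side effects the program depends on.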

#3. Check return codes

Always check return codes, and if an error occurs, use a facility to log that error. In the C++ environment, I even have a macro that I wrap around every system call:
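A minimal sketch of what such a wrapper might look like, assuming the usual POSIX convention that a call returns -1 and sets `errno` on failure. The names `SYSCALL` and `check_syscall` are hypothetical:

```cpp
#include <cerrno>
#include <cstdio>
#include <cstring>

// Hypothetical helper: report a failed call with its source text, location and errno.
inline long check_syscall(long result, const char* expr, const char* file, int line) {
    if (result == -1) {
        const int saved = errno;  // save first: fprintf itself may change errno
        std::fprintf(stderr, "%s:%d: %s failed: errno %d (%s)\n",
                     file, line, expr, saved, std::strerror(saved));
        errno = saved;            // leave errno intact for the caller
    }
    return result;                // pass the result through unchanged
}

// Wrap a call that follows the "-1 and errno" convention.
#define SYSCALL(expr) check_syscall((expr), #expr, __FILE__, __LINE__)
```

Used as `long n = SYSCALL(read(fd, buf, sizeof buf));`, the wrapper returns whatever the call returned, so it drops into existing code without changing the logic around it.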

 

 

#4. Throw exceptions when systems fail

I usually throw exceptions when system calls fail if such a failure means the function cannot continue, i.e. things are broke up bad. A few examples from my code:

 

The Exception class takes the error number, looks up the message for me, provides a stack trace, and the file and line where the error occurred, all bundled up in a nice package that gives me just about all the information I need to track down what happened.
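A pared-down sketch of an exception along those lines follows. The names `SysError` and `THROW_SYS` are hypothetical, and the stack trace is left out to keep the sketch portable (on glibc it could be gathered with `backtrace()` from `<execinfo.h>`, or with `<stacktrace>` in C++23):

```cpp
#include <cerrno>
#include <cstring>
#include <stdexcept>
#include <string>

// Hypothetical exception type: carries errno, its text, and the throw site.
class SysError : public std::runtime_error {
public:
    SysError(int err, const char* what_failed, const char* file, int line)
        : std::runtime_error(std::string(what_failed) + " failed: errno " +
                             std::to_string(err) + " (" + std::strerror(err) + ") at " +
                             file + ":" + std::to_string(line)),
          err_(err) {}

    // The raw error number, for callers that want to branch on it.
    int code() const noexcept { return err_; }

private:
    int err_;
};

// Throw with the current errno and the location of the failing call.
#define THROW_SYS(what_failed) throw SysError(errno, what_failed, __FILE__, __LINE__)
```

A typical use would be `if (fd == -1) THROW_SYS("open");`, after which `what()` already contains the message, the error number, and the file and line.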

Wow! Just look at all this stuff that helps you figure out what happened. Without my macro I probably would have been too lazy to check the error code.

#5. Automate output assembly

Then comes assembling all of this information. Forget it, too much work. I make the computer do it. He’s only too glad to produce profuse output:

 

 

Remember, computers are good at drudgework. Make the computer assemble all the information you need, and don’t, I repeat don’t, personally look up what an error code means. Have the computer do that. It’s a lot faster and way less error-prone. The standard “strerror” function does the error code lookup for you. Use it.
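One caveat: `strerror` is not required to be thread-safe. As an alternative sketch (a swapped-in technique, not what my code above uses), the C++ standard library can do the same lookup through `std::generic_category()`. The function name `describe_errno` is hypothetical:

```cpp
#include <cerrno>
#include <string>
#include <system_error>

// Turn an errno value into "errno N: <message>" without a shared static buffer.
inline std::string describe_errno(int err) {
    return "errno " + std::to_string(err) + ": " + std::generic_category().message(err);
}
```

The exact message text comes from the platform, so logs from different systems may word the same error differently. The number stays in the output for that reason.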

#6. Take time to print out variable values

Whenever you are confronted with a difficult debugging task, spend the time necessary to print out variable values, and whatever other information you think you need. It will make the debugging go faster. Don’t spend a lot of time guessing. Instead, expose the things you need to see, whether through logs or through debuggers, network sniffers, and other tools that let you see more than logs alone.
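Printing a value gets a lot less tedious when the computer writes the label for you. A small sketch, with the hypothetical names `SHOW` and `show_value`, that prints an expression's source text next to its value:

```cpp
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical helper: render "name = value" for anything streamable.
template <typename T>
std::string show_value(const char* name, const T& value) {
    std::ostringstream out;
    out << name << " = " << value;
    return out.str();
}

// Print an expression's source text, value and location to stderr.
#define SHOW(expr) \
    (std::cerr << __FILE__ << ':' << __LINE__ << ": " << show_value(#expr, (expr)) << '\n')
```

With `int retries = 3;`, the statement `SHOW(retries + 1);` writes the file and line followed by `retries + 1 = 4`, so there is never any doubt about which value belongs to which variable.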

#7. Use debugging tools

Also, use tools. I had some coworkers spend two weeks trying to figure out why their program was suddenly and randomly exiting. They refused to use a debugger. Once I became aware they were stuck, I put the program into GDB (the GNU Debugger) and let the program run.

Surprise! Within minutes, it “crashed,” and I found out why: SIGCHLD. The debugger told me exactly what was happening. When a child process dies, a SIGCHLD signal is delivered to the parent, and in their program that signal was ending the parent because it was not being caught. My coworkers’ response: “Oh! Of course.” Two weeks down the drain because of a refusal to use a tool (and I can guarantee you will never get that time back). The sooner you see what’s actually going on, the sooner the “duh” light bulb lights up.
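To make child exits visible instead of surprising, a program can catch the signal explicitly and reap the child. A minimal sketch, assuming a POSIX system. The names `on_sigchld`, `install_sigchld_handler` and `g_reaped` are hypothetical and not taken from the program in the story:

```cpp
#include <cerrno>
#include <csignal>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Count of reaped children; sig_atomic_t is safe to write from a handler.
static volatile std::sig_atomic_t g_reaped = 0;

// Reap every child that has exited, without blocking.
extern "C" void on_sigchld(int) {
    const int saved = errno;  // waitpid may clobber errno
    while (waitpid(-1, nullptr, WNOHANG) > 0) {
        g_reaped = g_reaped + 1;
    }
    errno = saved;
}

// Install the handler; returns true on success.
inline bool install_sigchld_handler() {
    struct sigaction sa {};
    sa.sa_handler = on_sigchld;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART | SA_NOCLDSTOP;  // restart interrupted calls, ignore stops
    return sigaction(SIGCHLD, &sa, nullptr) == 0;
}
```

The handler sticks to `waitpid` and a counter because very little is safe to call inside a signal handler. Logging the event is better done from the main loop once it notices the counter has changed.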

#8. Use tcpdump or ngrep

Sometimes the task of debugging network problems seems impossible. But I’ve found that network problems almost always succumb to tcpdump or ngrep, two tools that let you watch the network traffic directly. No more mystery about why this computer isn’t communicating with that one. I see people sitting around guessing, “Maybe there’s a firewall? Maybe the other computer isn’t listening? Maybe the socket is wrong? Maybe it’s buffering? Maybe I have the wrong drivers?” This isn’t Sherlock Holmes, it’s engineering. You’ll have plenty of time to debug hard problems where you just cannot use tools.

Where you can use tools, use them. Otherwise, be prepared to pull all-nighters figuring out what’s wrong with your program.

WALT HOWARD

Walt Howard is a Senior Software Developer on the CORE engineering team at MediaMath. He began his C++ career with version 1.0 of the Microsoft C++ compiler. In his own words, "I was a C programmer and was continually frustrated because C didn’t have certain automatic, housekeeping functions I thought were only sensible, like destructors, templates, and exceptions. When C++ came out it was all my programming dreams come true. I was an instant convert. It was like Bjarne Stroustrup was reading my diary." Walt finds himself in the peculiar position of being a right-brained programmer in a left-brained programmers' world. He values simplicity where most programmers strive for complexity. “I would rather augment a naive, overly simplistic solution than try to undo a tangled up mess caused by complexity.” As for experience, Walt has been involved in software development with evolving techniques and team sizes since before MS-DOS.