Archive for the ‘Nasty Bugs’ Category


To report, or not to report…

Posted by Gwyn Fisher   June 6th, 2011

Creating a source code analysis (SCA) engine is a balancing act, a decision process of where you believe the most value can be found along the spectrum that is the signal-to-noise ratio of the detection process. At one end lies the realm of massive noise and hopefully complete coverage, whilst at the other is the quiet calm of the theoretically useful but ultimately useless realm of no noise, but no signal either.

That may sound counter-intuitive. Shouldn’t a zero noise point on the spectrum be accompanied by an infinitely strong signal? Perhaps in the world of DSP this is true, but in the world of SCA reducing noise comes right along with a reduction in detection capability – it’s unfortunately almost a straight-line correlation.

So if we assume that we’re trying to balance a couple of dials on our theoretical tuner, we might start by reducing or dampening noise – it’s the most obvious place to start, after all. Nobody likes to listen to their favorite FM station through the curtain of hissing and popping that accompanies the act of driving through a major city.  Likewise no developer likes sifting through a long list of bogus detection errors in order to find the hidden gems. But to drag out the analogy, assume that the only way of reducing hiss on your FM signal is to turn down the volume… now you’ve got less hiss, but also less Bruce Springsteen goodness to accompany it.

Balance is what we need here, obviously. Enough Boss to make us ignore the hiss, or to put it in a more SCA-like context, enough interesting bugs to make us ignore the incorrect, or the irrelevant (correct detections on the part of the engine that the developer just doesn’t care about, e.g. low memory conditions in a memory-insensitive environment).

Consider the following simple example that clearly lies “on the line”:

    void foo(char* s, int a)
    {
        char* s1 = s;
        if( a > 0 )
            *s1 = 'a';   // potentially use an uninitialized ‘s1’
    }

    void bar(int m)
    {
        char *s;
        foo(s, m);       // s is not initialized prior to calling ‘foo’
    }

So… to report, or not to report?

Lacking any other information, it is obvious that function ‘foo’ interacts under certain situations (when parameter ‘a’ is positive) with parameter ‘s’ (aliased as local variable ‘s1’). As we have no knowledge about the provenance of parameter ‘s’ when analyzing ‘foo’, however, there’s nothing here to cause a report and so we squirrel away the knowledge of what ‘foo’ does for later use.

When analyzing ‘bar’ we know what ‘foo’ does, and we know we’ve got an uninitialized local pointer, ‘s’. But again we lack the knowledge of which values, or ranges, parameter ‘m’ may take. There is definitely a set of circumstances here in which we know a problem will occur (if parameter ‘m’ is positive), and a set of circumstances in which we know a problem will not occur (if parameter ‘m’ is zero or negative) – this much is encoded in the functional behavior of ‘foo’. But is it a defect, or should we filter out the report in favor of providing only those situations in which we can be “sure” the bug not only exists, but can be proven to be exercised?

There’s the art of balance in a nut-shell, and it revolves around the phrase “lacking any other information.” In the ideal world, lacking any restrictions in terms of time, memory or computing power (or indeed actual from-the-wall power, as we have to worry about now), we might defer all such decisions until we categorically know that a particular data value is passed down the call graph far enough to get to ‘foo’. But in the real world of multi-million LOC projects, that approach simply can’t scale.

And so, calling on balance as our friend, we can bias a localized decision to report or not, given that we know to at least one order of approximation that bad things could happen here. Different engines pronounce that bias differently, leading to one of the greatest divides between prevalent solutions.
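As a rough illustration of what such a bias could look like, here is a minimal sketch. It assumes a hypothetical engine that keeps a small summary per callee; the names `callee_summary`, `call_site_facts`, `report_bias` and `should_report` are invented for this example and do not describe any particular product.

```c
#include <stdbool.h>

/* Hypothetical summary the engine "squirrels away" after analyzing a callee. */
typedef struct {
    bool derefs_param;          /* callee may dereference its pointer parameter */
    bool deref_is_conditional;  /* ...but only on some paths (e.g. when a > 0)  */
} callee_summary;

/* Hypothetical facts known at the call site. */
typedef struct {
    bool arg_uninitialized;     /* the pointer passed in is uninitialized        */
    bool condition_proven;      /* caller provably satisfies the deref condition */
} call_site_facts;

typedef enum {
    BIAS_REPORT_POSSIBLE,       /* report anything that could go wrong      */
    BIAS_REPORT_PROVEN_ONLY     /* report only what can be shown to happen  */
} report_bias;

bool should_report(callee_summary s, call_site_facts f, report_bias bias)
{
    if (!f.arg_uninitialized || !s.derefs_param)
        return false;                       /* nothing to complain about */
    if (!s.deref_is_conditional || f.condition_proven)
        return true;                        /* the bad path is certain   */
    return bias == BIAS_REPORT_POSSIBLE;    /* "on the line": bias decides */
}
```

For the ‘foo’/‘bar’ example, the summary says the dereference is conditional and the call site cannot prove the condition, so the first bias reports the call and the second stays silent.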

Now ask yourself, as the developer, is it a worthy report if you know that 10 levels up the call graph there’s a check on what eventually becomes parameter ‘m’ to ensure that it’s never positive? Perhaps you’d automatically classify this as a false positive and, annoyed at the tool, move onto the next report. Or perhaps, seeing the size of the gap in the call graph, you might just choose to code defensively, initializing ‘s’ to NULL in ‘bar’ and adding guard code to ‘foo’ because, hey, you never know.

And as we’ve all seen so many times over the years, “you never know” might just as well be written “and so it came to pass…”
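The defensive option described above might look like the following sketch. The NULL initializer in ‘bar’ and the guard in ‘foo’ are the only changes to the original example; whether silently returning is the right reaction to a NULL pointer is an assumption that depends on your code base.

```c
#include <stddef.h>

void foo(char *s, int a)
{
    char *s1 = s;
    if (s1 == NULL)     /* guard: refuse to touch a pointer we cannot trust */
        return;
    if (a > 0)
        *s1 = 'a';
}

void bar(int m)
{
    char *s = NULL;     /* initialized, so 'foo' sees a well-defined value */
    foo(s, m);
}
```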


Stack smashing

Posted by Alen Zukich   May 3rd, 2011

A while ago I talked about memory overflows.  Now in this latest instalment, as we look at more interesting bugs, I’ve come across a new example.  Here is a situation described by a customer as “stack smashing”, which occurs when you copy a string of unknown length into a fixed-size buffer.

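#include <stdio.h>
#include <string.h>   /* memcpy, strlen */

void foo(char *v) {
    char buffer[10];
    if (v != NULL)
    {
        /* Bug: copies strlen(v) bytes, however long v is, into a 10-byte buffer */
        memcpy(buffer, v, strlen(v));
    }
}

int main(int argc, char **argv)
{
    foo(argv[1]);                              /* length unknown */
    foo("the longest string you can find");    /* 31 bytes into buffer[10] */
    return 0;
}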

Just like the memory overflow post this is another form of a buffer overflow.  So there you have it, just more terminology to describe bad things in your code.  Gwyn promises to give a follow up to these posts with some details on how this general area (stack corruption, tainted sources, etc.) can be exploited.  Can’t wait to see that.
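One way to take the overflow out of ‘foo’ is to bound the copy by the size of the destination. The sketch below uses a hypothetical helper, `copy_bounded`, which is not part of the original example; it truncates rather than failing, which is an assumption that may not suit every caller.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies at most dst_size - 1 bytes of src into dst
 * and always NUL-terminates. Returns the number of bytes copied. */
size_t copy_bounded(char *dst, size_t dst_size, const char *src)
{
    if (dst == NULL || dst_size == 0)
        return 0;
    if (src == NULL) {
        dst[0] = '\0';
        return 0;
    }
    size_t n = strlen(src);
    if (n > dst_size - 1)
        n = dst_size - 1;       /* truncate instead of overflowing */
    memcpy(dst, src, n);
    dst[n] = '\0';
    return n;
}

void foo(const char *v)
{
    char buffer[10];
    copy_bounded(buffer, sizeof buffer, v);   /* never writes past buffer[9] */
}
```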


Memory overflows

Posted by Alen Zukich   April 12th, 2011

A few years back a customer said they had all kinds of trouble with bugs corrupting their stack.  They asked if source code analysis tools could help find stack corruption, but once we got an example, we found that what they were really looking for was memory overflows.  So what on earth is a memory overflow?  Does that even exist?

Yes, although it is probably not what you’re thinking.  It’s not the same as a memory leak; a memory overflow is really just a form of a buffer overflow.  The impact of a memory overflow is unexpected behavior or program failure.   Take this example:

#include <string.h>   /* memset */
typedef struct s1_ {
   int i;
   int j;
   char arr[10];
}s1;

typedef struct s2_ {
   char b[20];
   char c[40];
}s2;

int main(void)
{
   s1 block1;
   memset(&block1, 0, sizeof(s2));   /* Bug: sizeof(s2) is larger than block1 */
   block1.i =1;
   return 0;
}

Here we have the case of incorrectly using ‘memset’ at line 16 where ‘sizeof(s2)’ is bigger than ‘block1’.  In fact, going back to this customer revealed that the issue was due to memset clearing much more area than intended.  If you’re using static analysis or source code analysis tools then you are probably covered by this.  You will find this type of issue usually in the “buffer overflow” category.
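A minimal sketch of the usual remedy: take the size from the object being cleared rather than naming a type by hand. The wrapper function `clear_s1` is hypothetical and exists only to make the example easy to call.

```c
#include <string.h>

typedef struct s1_ {
    int i;
    int j;
    char arr[10];
} s1;

/* Clears exactly one s1. The size comes from the object itself,
 * so it cannot drift out of step with the type being cleared. */
void clear_s1(s1 *block)
{
    memset(block, 0, sizeof *block);
}
```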

So, are you free of your memory overflows?


Another resource leak

Posted by Alen Zukich   March 1st, 2011

It happened again.  For what seems like the 100th time, someone reports to me that they are seeing a number of false positive reports on the resource leak checker.  For those not familiar with a resource leak, take a look at a previous post.  Although resource leaks apply across most languages, the place where this question keeps coming up seems to always be in Java or C# code.  My last query came from Java code, so we will use that as an example.  Here was a report where the FileInputStream is not closed on exit.

FileInputStream sch = new FileInputStream(...);
FileOutputStream dch = new FileOutputStream(...);   // may throw before the try block is entered
try {
   ...
} catch (...) {
   ...
} finally {
   try {
      sch.close();
   } catch (IOException e) {
      ...
   }
}
...

Do you see this issue?  Clearly, they close the FileInputStream in the finally block.  The problem is that you can still throw an exception when you try to create a FileOutputStream.  The simple fix is to encompass the FileInputStream and FileOutputStream in the try/catch blocks.
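The same shape of bug exists outside Java. As a loose analogy only (this is C, not the customer’s code, and `copy_file` is a hypothetical function), the sketch below acquires two resources in sequence and releases the first one on the path where acquiring the second fails, which is exactly the path the snippet above misses.

```c
#include <stdio.h>

/* Copies src_path to dst_path. Returns 0 on success, -1 on any failure. */
int copy_file(const char *src_path, const char *dst_path)
{
    FILE *src = fopen(src_path, "rb");
    if (src == NULL)
        return -1;

    FILE *dst = fopen(dst_path, "wb");
    if (dst == NULL) {
        fclose(src);            /* second acquisition failed: release the first */
        return -1;
    }

    char buf[4096];
    size_t n;
    int status = 0;
    while ((n = fread(buf, 1, sizeof buf, src)) > 0) {
        if (fwrite(buf, 1, n, dst) != n) {
            status = -1;
            break;
        }
    }
    if (ferror(src))
        status = -1;

    if (fclose(dst) != 0)       /* a failed close can mean lost data */
        status = -1;
    fclose(src);
    return status;
}
```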

Static analysis tools are great for finding bugs like this resource leak.  In this case, the report that a FileInputStream is not closed on exit wasn’t enough information to debug this.  This brings us to a very important discussion with static analysis tools–trace.

Gone are the days where a static analysis tool reports a bug at line X with no context.  Today you get detailed traces.  The trace contains conditions and assignments to give you clues about why the tool is identifying a bug.  In this example below, identifying the “source” and “sink” of any defect will help you trace through the start/end of the control flow.


More importantly in the case of Java, here we have a clear understanding where exceptions are thrown.  This would have helped tremendously for the example above.



Detailed traces are very important and can do many things for individual checkers.  So whatever you do, don’t just look at the bug description, pay attention to the trace!


Dealing with a different type of backlog…your bug backlog

Posted by Todd Landry   February 3rd, 2011

As a product manager, the only backlog I typically care about is my product backlog. Do I have the right stories in there? Do the stories have enough detail? Are they properly prioritized? You know, that kind of stuff. Today, however, I’m going to write about a very different backlog, that is the static analysis defect backlog.

A static analysis backlog is created when you run a static analysis product on your code base for the very first time. Chances are pretty good that the first analysis is going to list a large number of defects, some that are without question real, and some that perhaps are not. Do not freak out! This is the first time that analysis engine has ‘laid eyes’ upon your code and it is going to flex its muscles and show you any weaknesses it believes exist. So how does one deal with this? Here are a few strategies to help you:

1) Don’t boil the ocean. Before you even run that first analysis, don’t have a “wouldn’t it be cool” moment, where you decide to turn on every single rule the analysis engine has. There is a reason why static analysis tools haven’t turned on everything.  They are showing the most accurate and critical issues first.  So unless you have unlimited time and resources, your best bet is to start with a core set of rules and run the analysis based on that set. This core set of rules should include things such as memory/resource leaks, buffer overruns, null pointer dereferences, uninitialized variables, and so on. Add other rules once you have this core set under control.

2) Baseline your defects. Consider that first analysis your baseline and choose to ‘park’ its defects for the time being. Chances are the product that the analysis was run on is one that has already been released to the public, and in good working order. Zero out these defects for now, and start to triage them, which leads into strategy #3.

3) This is going to sound pretty obvious, but when it comes to managing your issue backlog, start by looking at the most critical issues first. These are the ones that are most likely to cause a failure of some sort, so determine if these issues are real, and if so, fix them immediately. Once you’re done with the most critical issues, move to the next level of severity, and continue on that way.

4) Finally, tune your analysis. Any good vendor will allow you to tune your analysis. The benefits of tuning are twofold: you can find code issues that would otherwise go undetected, and you can reduce the number of issues that the engine reports incorrectly in the context of your source code. You should think of ways to give the tool more context about your code base to increase accuracy.
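Strategies 2 and 3 can be pictured as a small filter-and-sort step. The sketch below is an illustration only: it assumes each finding carries a stable identifier and a numeric severity (1 being the most critical), and the `defect` type, the identifier format and `triage_new_defects` are all hypothetical rather than any tool’s real output or API.

```c
#include <stdlib.h>
#include <string.h>

typedef struct {
    const char *id;     /* hypothetical stable identifier for a finding */
    int severity;       /* 1 = most critical */
} defect;

static int is_baselined(const char *id, const char *const *baseline, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(id, baseline[i]) == 0)
            return 1;
    return 0;
}

static int by_severity(const void *a, const void *b)
{
    const defect *x = a, *y = b;
    return (x->severity > y->severity) - (x->severity < y->severity);
}

/* Compacts `found` in place to the defects that are not in the baseline,
 * sorted most critical first. Returns how many remain. */
size_t triage_new_defects(defect *found, size_t n_found,
                          const char *const *baseline, size_t n_baseline)
{
    size_t kept = 0;
    for (size_t i = 0; i < n_found; i++)
        if (!is_baselined(found[i].id, baseline, n_baseline))
            found[kept++] = found[i];
    qsort(found, kept, sizeof found[0], by_severity);
    return kept;
}
```

The parked baseline is not thrown away; it is simply kept out of the day-to-day list until you get around to triaging it.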

If you follow these suggestions, you’ll definitely have a better grasp of your bug backlog, and you’ll be able to execute on reducing that backlog quickly and efficiently. If you don’t, then at some point, you may feel a little like the critter pictured here.

If there are any other strategies you’ve tried to deal with your bug backlog, leave a comment or two. I’d love to hear about them.


Patterns of Bugs

Posted by Brendan Harrison   January 18th, 2011

Nice blog post from Walter Bright over at Dr. Dobb’s on the Patterns of Bugs. He ties together bug patterns, recommended process changes, and the resulting productivity payoff from making these improvements. He recommends a bunch of process changes, including static analysis, code reviews, and coding standards, then goes on to review examples of different bug patterns. A few can be detected with static analysis (coding mistakes as written) but many are errors with the code as intended (something static analysis doesn’t check… that’s what testing is for). His main recommendation seems to be that bugs can often be pattern based, so once a bug is identified, take steps to remove that pattern from your code through process or tool changes.

Patterns of Bugs

“Once the pattern is identified, then look for changes in process that will permanently eliminate that bug. Eliminate enough of the bug patterns, and you should enjoy a substantial increase in productivity.”

His experience is that, over time, this kind of systematic approach to fixing bugs makes developers better.

“I’ve noticed in my decades of writing programs that I just don’t make the kinds of mistakes I used to. Apparently I’ve unconsciously evolved coping strategies to avoid them. Identifying and building such strategies into the process means everyone can benefit from that experience.”

We certainly see that in the area of static analysis bug detection; lots of customers report that their developers make fewer mistakes, defect injection rates go down, and overall productivity is improved.


PM Thoughts on Code Reviews

Posted by Todd Landry   November 9th, 2010

While I may not be the most active Twitter-er in the world, the one thing I have noticed is that there is an awful lot of activity around the term “code review” lately. Since code reviews have become a widely used practice, I thought I would share one of my experiences about code reviews with you, from a product manager perspective.

In my first Agile team, many years ago, it was tabled (in our retrospective meeting after a couple of Sprints) that code reviews should be added to our definition of “Done”.  Let’s just say my initial response was less than enthusiastic… but why was that?  Well, in my opinion (perhaps uneducated on this topic), doing code reviews seemed to add to the time it takes to finish stories, which means fewer stories get done per iteration, which potentially means longer release times, or releases with less functionality than hoped for. This is not something a Product Manager is usually receptive to. After some debate, we put it to a vote where the “yays” defeated the “nays” by a fairly healthy margin (okay, it missed being unanimous by one vote).  So we updated our “Done” criteria and moved into our next Sprint.

Our next couple of sprints went much like our earlier ones; I didn’t really notice any differences. We seemed to have about the same number of stories being started and completed, and I for one was mildly surprised that we were able to maintain the same velocity, even with the extra process of doing code reviews for each story. Curious, I decided to talk to one of the more senior developers about what was going on. He walked me over to our Scrum board and asked me if anything looked different. Nothing jumped out at me initially, until he pointed out that the number of ‘bug’ cards (the dreaded red cards) was significantly lower than in those early iterations. He proceeded to tell me that the code reviews were playing a major role in this. Developers were finding things early and fixing them before passing the code onto the testers, leaving the testers to focus on testing the actual features …crazy, I know.

It really appeared as though the code reviews were producing better code, without actually slowing down the development process. My opinions of code reviews did a complete 180…now they were helping to contribute to better quality code that I could show our customers, without having to sacrifice anything in the way of release delays or velocity degradation. I had become a believer!

I think I have something to Twitter about now…


Multicore exposes more frog versus snake (deadlock) bugs

Posted by Eric Hollebone   September 30th, 2010

Deadlock: frog vs. snake
Photo: David Maitland / National Geographic

Continuing the discussion about the embedded community moving to multicore/heterogeneous hardware from “watch out here comes multicore”, embedded software development teams have historically been shielded from multicore issues. This is due to the specialized functionality of many embedded application classes and the inherent serialized nature of the C language.[1]

Multicore is an ambiguous term for software developers and one they don’t really use; software developers think in terms of threads/processes and concurrency, not how many cores or processors are available on the target. Concurrency is not a new topic either: as Mark Smotherman captured in a history of multithreading, it has been a subject in computer science since its early beginnings in the 1950s.

What has changed is the rapidly increasing use of multicore technologies for embedded devices. One of the prominent software challenges that moving to multicore execution exposes is latent deadlock bugs, which surface once true parallel execution replaces a single core’s task scheduling and context switching.

As an example, consider the following code snippet, which has been paraphrased from a deadlock discovered in a real-world open source multithreaded project.

Can you spot the deadlock?

lock_t lock1, lock2;
int refCount = 0;

void enter() {
   reserve_lock(lock1);
   if( refCount == 0 )
      reserve_lock(lock2);
   release_lock(lock1);
   refCount++;
}

void leave() {
   reserve_lock(lock1);
   refCount--;
   if( refCount == 0 )
      release_lock(lock2);
   release_lock(lock1);
}

To see the answer and understand the conditions that lead to the deadlock, download Klocwork’s whitepaper on Developing software for multicore.
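Without the whitepaper to hand, one plausible reading (mine, and possibly not the whitepaper’s own answer) is that ‘refCount’ is incremented after ‘lock1’ has been released, so a second thread can still see it as zero. The sketch below replays that single interleaving without real threads: locks are modeled as owner ids, and `try_reserve` and `interleaving_deadlocks` are hypothetical names invented for this illustration.

```c
#include <stdbool.h>

#define FREE 0   /* lock owner value meaning "nobody holds it" */

/* Non-blocking stand-in for reserve_lock(): takes the lock only if it is free. */
static bool try_reserve(int *owner, int tid)
{
    if (*owner != FREE)
        return false;
    *owner = tid;
    return true;
}

/*
 * Replays one fixed interleaving of two threads, both calling enter(), and
 * returns true if both end up waiting on a lock the other one holds.
 * count_under_lock1 == false models enter() as posted (refCount++ after
 * releasing lock1); true models a hypothetical variant that increments
 * refCount before releasing lock1.
 */
bool interleaving_deadlocks(bool count_under_lock1)
{
    int lock1 = FREE, lock2 = FREE, refCount = 0;

    /* Thread 1, enter(): takes lock1, sees refCount == 0, takes lock2. */
    try_reserve(&lock1, 1);
    if (refCount == 0)
        try_reserve(&lock2, 1);
    if (count_under_lock1)
        refCount++;
    lock1 = FREE;                       /* release_lock(lock1) */
    /* As posted, thread 1 is preempted here, before refCount++. */

    /* Thread 2, enter(): takes lock1 and tests refCount. */
    try_reserve(&lock1, 2);
    bool t2_waits = (refCount == 0) && !try_reserve(&lock2, 2);
    if (!t2_waits) {
        refCount++;
        lock1 = FREE;
    }

    /* Thread 1 resumes, finishes enter(), then calls leave(). */
    if (!count_under_lock1)
        refCount++;
    bool t1_waits = !try_reserve(&lock1, 1);

    return t1_waits && t2_waits;
}
```

In the posted ordering, thread 2 ends up holding ‘lock1’ while waiting for ‘lock2’, and thread 1 holds ‘lock2’ while waiting for ‘lock1’ in ‘leave’. On a single core the window between the release and the increment is rarely hit; with true parallel execution it is far easier to land in.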

A little about the picture in this post.  I found it when searching for pictures of deadlocks.  The photographer, David Maitland, titled his image “Deadlock” and describes it as a continuing struggle between a Morelet’s tree frog and cat-eyed snake. After three hours in the high-stakes cage match, Maitland said there was no clear winner.



[1] VDC Research, “Next Generation Embedded Hardware Architectures: Driving Onset of Project Delays, Costs Overruns, and Software Development Challenges”, September 2010.


Google offers cash reward for finding bugs in Chrome

Posted by Eric Hollebone   February 5th, 2010

As Google Chrome climbs out of obscurity in the browser market and expands into a light-weight but fully functional OS, security seems to have become a top-of-mind issue over at Chromium headquarters.

In the Chromium Blog, Chris Evans of Chrome Security announced a cash for bugs initiative, paying between 500 and 1337 USD depending on the severity for any previously undiscovered flaw.  I am glad to see Google encouraging the community at large to participate in hardening my current browser of choice.  As Chris points out, Mozilla was one of the first to embark on this type of program, but I am happy to see Chrome following suit.  Me and my online transactions appreciate it.

Hmm.  Maybe I should roll up the sleeves and invoke the “I’m gonna write me a minivan” approach and get the driveway cleared for the armored cash trucks.

But seriously, if you’re interested in helping out and getting a small reward for your efforts, visit the Chromium Security project.


Going Agile Part 4 – Iteration 1: The Good, The Bad, and the Ugly

Posted by Todd Landry   January 19th, 2010

I just couldn’t resist using the classic spaghetti Western as the title for this instalment of my Going Agile series because a) it was an awesome movie, and b) it truly sums up that 1st iteration of ours. My last post was all about the 1st iteration planning meeting, and how it was such an exciting and productive time for our team. We came out of that meeting a little weary, but extremely motivated to get to work. We were also just a tad naive.

The next 2 weeks were a roller coaster as we cut our teeth with Scrum. First the good:

  • Communication: the interaction amongst the team members was definitely improved. If someone needed an answer to something, they immediately sought out help. The team realized that if they didn’t get timely answers, tasks wouldn’t get done. They really didn’t want to say those dreaded 2 words, “nothing finished”, in the daily scrum meeting.
  • Meetings: The daily Scrum meetings were kept short and  sweet as everyone said what tasks they had finished, what they were working on, and if there were any roadblocks in the way. If something required further discussion, a break out meeting with the appropriate people was held.
  • Energy: This was a high performing team to begin with, but there was now a newfound energy and buzz. This was a fun team to be around!

As the title suggests, there certainly was some bad in that first iteration.

  • Testing and documentation: These were the 2 areas that struggled the most in the first iteration (and the next couple as well). They felt that their work was too heavily back loaded, that is, they would receive their stuff too late in the iteration to either test or document properly. Many of the stories were not totally Done because they could not be properly tested or documented in the time available.
  • Defects and bugs: Because testing happened so late in the iteration, many of the bugs they found could not be addressed in that iteration. These bugs would have to be carried over to the next iteration, meaning the number of new stories would have to be reduced.

Now for the ugly.

  • After just a day or so into the iteration, a plethora of unplanned tasks started showing up on the Scrum board for many of the stories. These stories now had many new hours of tasks added to them, and we fell behind very quickly. This leads into the next ugly…
  • The Burndown chart: Talk about a misnomer! We started to affectionately call our chart the burn-up chart, because there was very little down direction going on with it. Our chart would have looked great at a sales meeting, but in our Scrum meeting, not so much.

So as you can see our 1st iteration had its share of warts, and in fact, the next couple did as well. But we didn’t get frustrated. We learned from our mistakes and changed/added things based on those mistakes. The Retrospective meetings were incredibly useful because they made us all take a hard, honest look at what went well, and what didn’t. The next, and last entry in my Going Agile series will look at the Retrospective meeting.