0 post
« Previous 1 / 2 / 3 / 4 / 5 Next »

Posts Tagged ‘Static Analysis’


Get the red out…

Posted by Todd Landry   June 17th, 2009

When I first started at Klocwork, I didn’t really know a lot about source code analysis. I understood the basic concept of how it finds bugs in software, but that is was essentially it. Sure I knew about Memory leaks, but I truly believed that they were only found a day or two before the GA date…at least, that was when our testing team always found them.

In one of my teams prior to joining Klocwork, we used Scrum. We were hard core, with daily 15 minute scrums, retrospective meetings, sprint planning sessions, defining “done”, secret handshakes, the whole 9 yards. We also broke our features down into small tasks, and those tasks were written on cue cards and then stuck to a big wall for all to see. What a great way to see the progress of a sprint. We had green cards for development tasks, blue cards for testing, yellow cards for documentation, and red cards for bugs. I remember how after 2 or 3 days into a sprint, the red cards would start showing up, and developers would then start addressing them. Since one of our team ‘rules’ was each person could only have a single task checked out at one time, our developers had to check-in the green card they were working on in order to tackle a red card. By the end of a sprint there were always a number of red cards left, which by definition, needed to be addressed first in the next sprint. I’m sure you can imagine the enthusiasm of heading into the next sprint knowing there was a wall of red cards to address first.

Anyways, my first few weeks at Klocwork consisted of talking with a lot of people; customers, prospects, etc. These people knew source code analysis, but they only knew the traditional way of source code analysis (SCA), and not the new generation of SCA where developers check their code before they check-in their code. I remember thinking I must be missing something…why is this so hard for these people to understand?  Source code analysis turns a lot of those red cards into green cards.  For more info on how SCA and agile can work together, check out this webinar I recently did…


False positives in modern static analyzers

Posted by Alen Zukich   May 22nd, 2009

In response to Jason’s post about false positives.  First of all there is a general misconception of false positives.  Modern static source code analysis tools have changed the game.  It is not the Lint tool of the past, a focus with deep inter-procedural technology has placed the requirement that static tools today produce more real issues than false reports.

With that said, Jason is right, large code bases never running static analysis will produce a large number of issues no matter how accurate it is.  Even though static analysis tools do provide a number of ways to manage this (and Jason talks about one) it does make sense to put this in your code reviews. You are looking at legacy code but if you are doing code reviews then you must have changed something with that legacy code.  Therefore having those bugs visible to you during the code review could suddenly now apply.


Static analysis and code reviews

Posted by Alen Zukich   May 19th, 2009

Jason certainly hits the nail on the head.  Automation, specifically using static analysis, is key and it should be tightly integrated with your code review. Although we need to be careful where we label source code analysis.  Static source code analysis certainly can find those low level issues such as labeling your local variables correctly, but it goes beyond simple code style issues.

Where static source code analysis can really help is with the deep inter-procedural context that it can provide.  For example, during a code review you go through some code with a number of function calls.  Hopefully you know what each and every function is doing…but do you really?  This is where the deep analysis of static source code tools can help.  It can help you identify that there may be an issue in the code review and that issue happens to show that a function is returning NULL.  Uh oh, potential null pointer dereference on our hands.

Now add code reviews with other static source code technology, such as full source cross reference information, flowcharting, impact analysis for any function/methods and architectural representation to show you the full context of the system.  Now you’re talking powerful.


Findbugs not recognizing exceptions? Java static analysis

Posted by Alen Zukich   May 4th, 2009

We’ve posted previously on some of the differences between Findbugs’ open source Java analysis and commercial Java static analysis tools, specifically on the JSR-305 specification and source code versus byte code analysis topics. Due to these differences, many Java shops will use a commercial Java static analysis tool in conjunction with Findbugs to make sure they’re getting as complete issue detection as possible.

One area that’s been discussed previously is the ability to identify situations of possible null pointer dereference. This peaked my interest and led me to do some benchmarking against a few open source projects to assess the Findbugs analysis on intra- and inter-procedural possible null pointer dereference issues.

The normal assumption is that Findbugs is strong with intra-procedural analysis but unable to provide inter-procedural Java analysis. Here is an inter-procedural example from an open source project called hsqldb:

    protected void directRefreshTree() {

         int[]                  rowCounts;

        ...

            try {

                rowCounts = getRowCounts(tables, schemas);

            } catch (Exception e) {

                 //  Added: (weconsultants)@users

               CommonSwing.errorMessage(e);

            }

            ResultSet col;

            // For each table, build a tree node with interesting info

            for (int i = 0; i < tables.size(); i++) {

                col = null;

                String name;

                try {

                    name   = (String) tables.elementAt(i);

                    if (isOracle && name.startsWith("BIN$")) {

                        continue;

                        // Oracle Recyle Bin tables.

                        // Contains metacharacters which screw up metadata

                        // queries below.

                    }

                    schema = (String) schemas.elementAt(i);

                    String schemaname = "";

                    if (schema != null && showSchemas) {

                        schemaname = schema + '.';

                    }

                    String rowcount = displayRowCounts

                                      ? (" " + DECFMT.format(rowCounts[i]))

                                      : "";

    ...

    }

In the first try block getRowCounts() can have a null value.  How you ask?

    protected int[] getRowCounts(Vector inTable,

                                 Vector inSchema) throws Exception {

        if (!displayRowCounts) {

            return (null);

        }

      ...

    }

Okay, so automatically finding these issues is not going to happen with Findbugs.  I understand that, and that is why there are commercial tools to help with that.

But what really surprised me are the intra-procedural examples that are missed.  Again using examples from hsqldb and a specific class called TransferDb, Findbugs finds one issue. But there are clearly more intra-procedural issues that you would think Findbugs would have found.  Let’s take a look:

    TransferResultSet getData(String statement)

    throws DataAccessPointException {

        ResultSet rsData = null;

        try {

            if (srcStatement != null) {

                srcStatement.close();

            }

            srcStatement = conn.createStatement();

            rsData       = srcStatement.executeQuery(statement);

        } catch (SQLException e) {

            try {

                srcStatement.close();

            } catch (Exception e1) {}

            srcStatement = null;

            rsData       = null;

            throw new DataAccessPointException(e.getMessage());

        }

        return new TransferResultSet(rsData);

    }

This is a conditional situation where “srcStatement” could have a null value (the conditional is false), then you throw an exception with “createStatement()” and dereference later with “srcStatement.close()”.  Why does Findbugs miss this issue?  There are several issues of this type.

Or another situation where you throw an exception with “createStatement()” again.  This example spans multiple lines so you will find it in an attachment.  Here “select_rs” has a value of null and is clearly dereferenced much later on.

       ResultSet         col            = null;

        int               colnum         = 1;

  Statement         stmt           = null;

Source: null here ->  ResultSet         select_rs      = null;

        ResultSetMetaData select_rsmdata = null;

         try {

Exception throw here -> stmt           = conn.createStatement();

            select_rs      = stmt.executeQuery(TTable.Stmts.sSourceSelect);

            select_rsmdata = select_rs.getMetaData();

            col = meta.getColumns(TTable.Stmts.sDatabaseToConvert,

                                  TTable.Stmts.sSchema,

                                  TTable.Stmts.sSourceTable, null);

        } catch (SQLException eSchema) {

            // fredt - second try with null schema

            if (TTable.Stmts.sSchema.equals("")) {

                try {

                    col = meta.getColumns(TTable.Stmts.sDatabaseToConvert,

                                          null, TTable.Stmts.sSourceTable,

                                          null);

                } catch (SQLException eSchema1) {}

            }

        }

        try {

            while (col.next()) {

                String name = Dest.helper.formatIdentifier(col.getString(4));

                int    type        = col.getShort(5);

                String source      = col.getString(6);

                int    column_size = col.getInt(7);

                String DefaultVal  = col.getString(13);

                boolean rsmdata_NoNulls =

                    (select_rsmdata.isNullable(colnum)

                     == java.sql.DatabaseMetaData.columnNoNulls);

                boolean rsmdata_isAutoIncrement = false;

                try {

                    rsmdata_isAutoIncrement =

                        select_rsmdata.isAutoIncrement(colnum);

                } catch (SQLException e) {

                    rsmdata_isAutoIncrement = false;

                }

                int rsmdata_precision = select_rsmdata.getPrecision(colnum);

                int rsmdata_scale     = select_rsmdata.getScale(colnum);

                type = helper.convertFromType(type);

                type = Dest.helper.convertToType(type);

                Integer inttype  = new Integer(type);

                String  datatype = (String) TTable.hTypes.get(inttype);

                if (datatype == null) {

                    datatype = source;

                    tracer.trace("No mapping for type: " + name + " type: "

                                 + type + " source: " + source);

                }

                if (type == Types.NUMERIC) {

                    datatype += "(" + Integer.toString(rsmdata_precision);

                    if (rsmdata_scale > 0) {

                        datatype += "," + Integer.toString(rsmdata_scale);

                    }

                    datatype += ")";

                } else if (type == Types.CHAR) {

                    datatype += "(" + Integer.toString(column_size) + ")";

                } else if (rsmdata_isAutoIncrement) {

                    datatype = "SERIAL";

                }

                if (DefaultVal != null) {

                    if (type == Types.CHAR || type == Types.VARCHAR

                            || type == Types.LONGVARCHAR

                            || type == Types.BINARY || type == Types.DATE

                            || type == Types.TIME

                            || type == Types.TIMESTAMP) {

                        DefaultVal = "\'" + DefaultVal + "\'";

                    }

                    datatype += " DEFAULT " + DefaultVal;

                }

                if (rsmdata_NoNulls) {

                    datatype += " NOT NULL ";

                }

                v.addElement(inttype);

                datatype = helper.fixupColumnDefRead(TTable, select_rsmdata,

                                                     datatype, col, colnum);

                datatype = Dest.helper.fixupColumnDefWrite(TTable,

                        select_rsmdata, datatype, col, colnum);

                create += name + " " + datatype + ",";

                insert += "?,";

                colnum++;

            }

Sink: dereference here-> select_rs.close();

            stmt.close();

            col.close();

        } catch (SQLException e) {

            throw new DataAccessPointException(e.getMessage());

        }

So I’m a little confused as to why these issues are not highlighted.  It seems that this is the exact type of analysis Findbugs does.  Maybe it has to do with not recognizing the possible exceptions?  Does anyone have any ideas?


Build Analysis and Source code analysis must work together

Posted by Alen Zukich   April 25th, 2009

There’s been some recent discussion around using source code analysis (SCA) technology for build clean-up and optimization. I thought it might be useful to try and separate the spin from reality and outline where and how static source code analysis can be used for build optimization.

First, every SCA tool worth its salt does build analysis. Automated discovery of a customer’s build system is a required capability for deep static code analysis. Most users of SCA attempt to discover bugs, security vulnerabilities, and other maintainability problems. Some customers will also leverage the build analysis itself to conduct targeted clean-up of the build. Three common ways that this can be done are:

  1. Trace file analysis – provides visibility of the entire build process to help find issues and inefficiencies in the build that can impact build times, and ultimately developer productivity.
  2. Header file analysis – goal here is to identify inefficient and overly complex include structures that can lead to long build times and bloated system size.
  3. Interface analysis – find low level issues that can cause build failures due to improper API usage.

First, trace file analysis is the process of analyzing, understanding and mapping every process executed during your build process, including all compiler and linker invocations. This kind of analysis is mandatory for a good SCA tool since it is necessary to understand all the full details of how you compile and link code so that detailed models of your system can be generated. The benefit from a build optimization standpoint is that this maps the entire build process and not just your compile and linking process, gives development leads and build managers visibility into the build, any inefficiencies, and where it may just be broken.

The other two types of analysis, header file and interface analysis, is focused on the source code directly but also important to build improvement. Specifically the focus here is on header files themselves and optimizations you can perform. A simple example of this is an include file that is simply never used. Why include it? This adds to the build time, size of the system and not to mention the complexity and maintainability of the system. Finding extra includes is not ground breaking technology but there are various types of issues to look for with header files. Other examples of more complex issues that involve deep analysis are with extra transitive issues or context dependent issues. For example, a missing include with a transitive dependency is a relationship between three or more files. In the following example, the first file, File1.c, includes the second file, header1.h, which, in turn, includes the third file, header2.h. File File1.c uses some symbols from file header2.h, but does not include it directly.

Good practice would have you include header2.h directly, of course developers include header1.h as a means to simply get the build to work. By eliminating instances where, in this example File1.c doesn’t even use anything in header1.h, real reductions in build time, and potentially system size, can be realized. Interface analysis is another type of issue focused on header files again looking for cyclical header files, duplicate header files and a whole slew of other in code issues that can be a nightmare such as multiple definitions or declarations.

Frankly there are limits to what can be done in this area with SCA. There are vendors in the build space such as Electric Cloud or IBM Buildforge who do this for a living and specialize in not only setting up production build environments that scale, but also have tools that complement the kind of optimization described above.


JSR 305: a silver bullet or not a bullet at all?

Posted by Mikhail Ksenzov   March 30th, 2009

JSR-305 is a Java Specification Request intended to improve the effectiveness of static analysis tools operating in Java 5+ environments. The idea here is that one can use special purpose annotations in order to provide static analysis tools with hints regarding the behaviour and side effects of methods.

An example of such annotations can be found in the presentation ‘Annotations for Software Defect Detection’ by William Pugh, who is masterminding the whole spec. Here we go:

 1: void test() {
 2:    if (spec != null) fFragments.add(spec);
 3:    if (isComplete(spec)) fPreferences.add(spec);
 4: }
 6:
 5: boolean isComplete(AnnotationPreferences spec) {
 6:    return spec.getColorPreferenceKey() != null
 7:        && spec.getColorPreferenceValue() != null
 8:        && spec.getTextPreferenceKey() != null
 9:        && spec.getOverviewRulerPreferenceKey() != null;
10: }

What’s wrong with the snippet above? Well, the check for null on line 2 shows that the developer expects that the value of ‘spec’ can potentially be null, but it is still passed to method ‘isComplete’. Later, ‘isComplete’ attempts to dereference the value, which causes a NullPointerException.

According to Dr. Pugh, the best way to detect this issue statically is to force a developer to add the annotation @Nonnull to the method signature like this:

5: boolean isComplete(@Nonnull AnnotationPreferences spec) {

In this way, a basic static analysis tool that can minimally track ‘spec’ as a potential suspect for ‘null’ can issue a warning when the @Nonnull annotation is contradicted (that is, when ‘spec’ is passed to ‘isComplete’ as a parameter).

There are two problems with this approach:

  • it forces the developer to do work that rightly should be performed by a static analysis engine
  • it takes time to write the annotation for static analysis, but it takes even more effort to maintain the annotations and the actual code base in a consistent state.

In reality, the proposals behind JSR-305 exist to enable a tool intended for single function analysis (so-called intra-procedural analysis) to act as if it were performing whole program analysis by requiring the developer to state expected behaviour up front (whether or not that behaviour is actually expressed correctly in the developer’s code).

In contrast, this same scenario is supported by a whole program static analysis tool (so-called inter-procedural analysis) without developer intervention:

  1. First, a complete call-graph of the system is built, and then all the methods are ordered so that called methods are processed prior to callers — such an ordering allows the tool to generate all the necessary information about, in this instance, the method ‘isComplete’ by the time the analysis of the method ‘test’ begins.
  2. During the analysis of ‘isComplete’, the tool records the fact that the incoming argument ‘spec’ is dereferenced.
  3. Next, the method ‘test’ is analyzed. In this method, the variable ‘spec’ is checked for null, so it is tracked as a potential suspect for an exception. Using the information generated about ‘isComplete’ the tool can reliably issue a warning on line 3, since it already knows that ‘isComplete’ dereferences the incoming argument.

So that example applies to a simple unconditional dereference scenario. In more complicated cases, Dr. Pugh proposes to use the annotation parameter ‘when’, with one of the following values:

  • ALWAYS
  • NEVER
  • MAYBE
  • UNKNOWN

For example: ‘@Nonnull(when=When.NEVER)’ means that a value is always null in the given context.

This specification seems to be a compromise between the amount of information provided by a developer to a static analysis tool and the amount of effort a developer has to put into it, a compromise that does not seem to be a particularly good solution here. First of all, the amount of information provided in such a manner is insufficient to provide accurate analysis, and secondly this seems to be too much work for the developer, especially when this work can be avoided.

Let’s examine how conditional value dereferencing is supported by whole program static analysis tools:

 1: void test() {
 2:     entity.qualifiedName = null;
 3:     saveName(entity, false);
 4: }
 5:
 6: boolean    saveName(Entity entity, boolean qualified) {
 7:    String name;
 8:    if (qualified)
 9:        name = entity.qualifiedName.trim();
10:    else
11:        name = entity.name.trim()
12:
13:     save(name);
14: }

In this example, an inter-procedural static analysis tool would first analyze the method ‘saveName’. A good analysis engine should be able to record the fact that this method only dereferences ‘entity.qualifiedName’ if the second parameter, ‘qualified’, is set to ‘true’. This, it would appear, is a deal more detailed than one can practically achieve by adding @Nonnull(when=When.XXX) annotations, even with all the work the annotation implies for the developer.

Next, the method ‘test’ would be analyzed. A good static analysis tool will naturally keep track of ‘entity.qualifiedName’ because of the assignment to ‘null’ on line 2 and its therefore potential use in an exception causing context. However, given that the actual call to ‘saveName’ on line 3 uses ‘false’ as its second argument, such a tool will not issue a warning that would in reality be a false positive, since the knowledge gained from analyzing ‘saveName’ disqualifies any potential warning due to the conditional relationship between arguments.

In summary, JSR-305 proposes a whole roster of interesting ideas for using annotations to enhance static analysis of Java code, and NPE detection seems to be only one aspect of this specification request. In upcoming blog posts, we shall continue the discussion of proposed annotations as well as offering our own ideas about how and when annotations should be used in static analysis.


Resource Leaks in C#

Posted by Alen Zukich   February 3rd, 2009

I’m picking up the theme of the CWE Top 25 today (see posts below for more detail on the list itself, or check out this blog posting for a more exhaustive description) as we run into what is described as CWE-404 all the time in managed code environments.

Although most C/C++ developers recognize explicit resource management as an issue, I’ve recently found out talking to some of our customers that they are totally unaware of the need to worry about such things in Java and especially C#. I even had one customer tell me in the context of C# that the CLR will “take care” of the issue. While it’s perhaps true that eventually the Finalize() method might take care of many resources, professional developers really should take the utmost care in dealing with precious unmanaged resources like file handles, graphics handles, or database connections.

A simple but surprisingly common example in C# illustrates the point:

using System;
using System.IO;

class Blah
{
static void Main()
{
StreamReader r = null;
r = File.OpenText (“blah.txt”);
if (r.EndOfStream) return;
Console.WriteLine (r.ReadToEnd());
}
}

The StreamReader class used here encapsulates an unmanaged resource, the underlying file handle that the O/S allocates to the CLR during the call to OpenText(). As you can see, that unmanaged resource is never released back to the O/S (until such time as the application terminates, of course).

A simple statement:

r.Close();

in a finally block would clean it up.  Calling Close directly is okay but it is crucial to make sure you put that in the finally block.  The most common resource leaks that I see static analysis tools find are the ones where you see a developer has been smart enough to dispose of his resource but there is another path (such as an exception that can be thrown on a method) where the resource is not disposed.

Another easier way to dispose of the resource is to apply the “using” statement, which calls Dispose on an IDisposable object automatically for you. This code:

using (Streamreader r = File.OpenText (“blah.txt”))
{

}

is the same as writing:

Streamreader r = File.OpenText (“blah.txt”);
try
{

}

finally
{
if (r != null)
((IDisposable)r).Dispose();

}

So make sure you know that you need to dispose of certain resources.  In fairness, these can be hard issues to detect, particularly in an inter-procedural context, but good regimented use of code reviews, coupled with a good static analyzer should get you well on your way.


ISV software quality; tortology or oxymoron…

Posted by Gwyn Fisher   December 10th, 2008

It’s kind of bizarre, but in my pre-Klocwork experience of running ISV development groups, from small teams to global enterprises, it never struck me as wrong that we would routinely ship software containing critical bugs. We knew we were doing it. We knew, on some abstract underground never-to-be-admitted layer of our deepest darkest souls, that this was a “bad thing.”

But mostly, we knew that when somebody found a bug we could just send them a patch.

And what’s more, we knew that customers expected this behavior. We got requests to “send us a patch quickly, this is causing me {insert unpleasant business scenario here}.” We had customers telling us that we were a great company to deal with, because we responded to patch requests quickly.

We weren’t the bad kids on the block, we were right up there with the upstanding corporate citizens of the software world!

Why are software companies allowed, perhaps even expected, to mess up on this scale? If we were engineers, we’d be sued. If we were doctors, we’d be in court before you could say something snappy and relevant like “most software sucks.” But for ISV developers, hey, it’s just business.

Talk to developers in the embedded space, or mil/aero, or telecoms, or a bunch of other places where this approach to business isn’t acceptable and it’s like talking to engineers. They understand what their tolerances are, they understand what it takes to make quality software, and they fundamentally understand on a basic level what happens to them and their company if they get it wrong. Inventory turnover is one thing, killing people is something else.

So is it just that ISV software is unimportant? That ISV software won’t ever be put in a situation where it could be life-threatening (or life-enhancing), so it doesn’t matter if it crashes all the time?

Obviously that’s not the case. Software created by ISVs is used all over the world, in all manner of situations, some dire, some not, but at the end of the day, in every situation, there’s a name on that software.

Your name.