0 post
« Previous 1 / 2 / 3 / 4 Next »

Posts Tagged ‘source code analysis’


Static analysis and code reviews

Posted by Alen Zukich   May 19th, 2009

Jason certainly hits the nail on the head.  Automation, specifically using static analysis, is key and it should be tightly integrated with your code review. Although we need to be careful where we label source code analysis.  Static source code analysis certainly can find those low level issues such as labeling your local variables correctly, but it goes beyond simple code style issues.

Where static source code analysis can really help is with the deep inter-procedural context that it can provide.  For example, during a code review you go through some code with a number of function calls.  Hopefully you know what each and every function is doing…but do you really?  This is where the deep analysis of static source code tools can help.  It can help you identify that there may be an issue in the code review and that issue happens to show that a function is returning NULL.  Uh oh, potential null pointer dereference on our hands.

Now add code reviews with other static source code technology, such as full source cross reference information, flowcharting, impact analysis for any function/methods and architectural representation to show you the full context of the system.  Now you’re talking powerful.


Findbugs not recognizing exceptions? Java static analysis

Posted by Alen Zukich   May 4th, 2009

We’ve posted previously on some of the differences between Findbugs’ open source Java analysis and commercial Java static analysis tools, specifically on the JSR-305 specification and source code versus byte code analysis topics. Due to these differences, many Java shops will use a commercial Java static analysis tool in conjunction with Findbugs to make sure they’re getting as complete issue detection as possible.

One area that’s been discussed previously is the ability to identify situations of possible null pointer dereference. This peaked my interest and led me to do some benchmarking against a few open source projects to assess the Findbugs analysis on intra- and inter-procedural possible null pointer dereference issues.

The normal assumption is that Findbugs is strong with intra-procedural analysis but unable to provide inter-procedural Java analysis. Here is an inter-procedural example from an open source project called hsqldb:

    protected void directRefreshTree() {

         int[]                  rowCounts;

        ...

            try {

                rowCounts = getRowCounts(tables, schemas);

            } catch (Exception e) {

                 //  Added: (weconsultants)@users

               CommonSwing.errorMessage(e);

            }

            ResultSet col;

            // For each table, build a tree node with interesting info

            for (int i = 0; i < tables.size(); i++) {

                col = null;

                String name;

                try {

                    name   = (String) tables.elementAt(i);

                    if (isOracle && name.startsWith("BIN$")) {

                        continue;

                        // Oracle Recyle Bin tables.

                        // Contains metacharacters which screw up metadata

                        // queries below.

                    }

                    schema = (String) schemas.elementAt(i);

                    String schemaname = "";

                    if (schema != null && showSchemas) {

                        schemaname = schema + '.';

                    }

                    String rowcount = displayRowCounts

                                      ? (" " + DECFMT.format(rowCounts[i]))

                                      : "";

    ...

    }

In the first try block getRowCounts() can have a null value.  How you ask?

    protected int[] getRowCounts(Vector inTable,

                                 Vector inSchema) throws Exception {

        if (!displayRowCounts) {

            return (null);

        }

      ...

    }

Okay, so automatically finding these issues is not going to happen with Findbugs.  I understand that, and that is why there are commercial tools to help with that.

But what really surprised me are the intra-procedural examples that are missed.  Again using examples from hsqldb and a specific class called TransferDb, Findbugs finds one issue. But there are clearly more intra-procedural issues that you would think Findbugs would have found.  Let’s take a look:

    TransferResultSet getData(String statement)

    throws DataAccessPointException {

        ResultSet rsData = null;

        try {

            if (srcStatement != null) {

                srcStatement.close();

            }

            srcStatement = conn.createStatement();

            rsData       = srcStatement.executeQuery(statement);

        } catch (SQLException e) {

            try {

                srcStatement.close();

            } catch (Exception e1) {}

            srcStatement = null;

            rsData       = null;

            throw new DataAccessPointException(e.getMessage());

        }

        return new TransferResultSet(rsData);

    }

This is a conditional situation where “srcStatement” could have a null value (the conditional is false), then you throw an exception with “createStatement()” and dereference later with “srcStatement.close()”.  Why does Findbugs miss this issue?  There are several issues of this type.

Or another situation where you throw an exception with “createStatement()” again.  This example spans multiple lines so you will find it in an attachment.  Here “select_rs” has a value of null and is clearly dereferenced much later on.

       ResultSet         col            = null;

        int               colnum         = 1;

  Statement         stmt           = null;

Source: null here ->  ResultSet         select_rs      = null;

        ResultSetMetaData select_rsmdata = null;

         try {

Exception throw here -> stmt           = conn.createStatement();

            select_rs      = stmt.executeQuery(TTable.Stmts.sSourceSelect);

            select_rsmdata = select_rs.getMetaData();

            col = meta.getColumns(TTable.Stmts.sDatabaseToConvert,

                                  TTable.Stmts.sSchema,

                                  TTable.Stmts.sSourceTable, null);

        } catch (SQLException eSchema) {

            // fredt - second try with null schema

            if (TTable.Stmts.sSchema.equals("")) {

                try {

                    col = meta.getColumns(TTable.Stmts.sDatabaseToConvert,

                                          null, TTable.Stmts.sSourceTable,

                                          null);

                } catch (SQLException eSchema1) {}

            }

        }

        try {

            while (col.next()) {

                String name = Dest.helper.formatIdentifier(col.getString(4));

                int    type        = col.getShort(5);

                String source      = col.getString(6);

                int    column_size = col.getInt(7);

                String DefaultVal  = col.getString(13);

                boolean rsmdata_NoNulls =

                    (select_rsmdata.isNullable(colnum)

                     == java.sql.DatabaseMetaData.columnNoNulls);

                boolean rsmdata_isAutoIncrement = false;

                try {

                    rsmdata_isAutoIncrement =

                        select_rsmdata.isAutoIncrement(colnum);

                } catch (SQLException e) {

                    rsmdata_isAutoIncrement = false;

                }

                int rsmdata_precision = select_rsmdata.getPrecision(colnum);

                int rsmdata_scale     = select_rsmdata.getScale(colnum);

                type = helper.convertFromType(type);

                type = Dest.helper.convertToType(type);

                Integer inttype  = new Integer(type);

                String  datatype = (String) TTable.hTypes.get(inttype);

                if (datatype == null) {

                    datatype = source;

                    tracer.trace("No mapping for type: " + name + " type: "

                                 + type + " source: " + source);

                }

                if (type == Types.NUMERIC) {

                    datatype += "(" + Integer.toString(rsmdata_precision);

                    if (rsmdata_scale > 0) {

                        datatype += "," + Integer.toString(rsmdata_scale);

                    }

                    datatype += ")";

                } else if (type == Types.CHAR) {

                    datatype += "(" + Integer.toString(column_size) + ")";

                } else if (rsmdata_isAutoIncrement) {

                    datatype = "SERIAL";

                }

                if (DefaultVal != null) {

                    if (type == Types.CHAR || type == Types.VARCHAR

                            || type == Types.LONGVARCHAR

                            || type == Types.BINARY || type == Types.DATE

                            || type == Types.TIME

                            || type == Types.TIMESTAMP) {

                        DefaultVal = "\'" + DefaultVal + "\'";

                    }

                    datatype += " DEFAULT " + DefaultVal;

                }

                if (rsmdata_NoNulls) {

                    datatype += " NOT NULL ";

                }

                v.addElement(inttype);

                datatype = helper.fixupColumnDefRead(TTable, select_rsmdata,

                                                     datatype, col, colnum);

                datatype = Dest.helper.fixupColumnDefWrite(TTable,

                        select_rsmdata, datatype, col, colnum);

                create += name + " " + datatype + ",";

                insert += "?,";

                colnum++;

            }

Sink: dereference here-> select_rs.close();

            stmt.close();

            col.close();

        } catch (SQLException e) {

            throw new DataAccessPointException(e.getMessage());

        }

So I’m a little confused as to why these issues are not highlighted.  It seems that this is the exact type of analysis Findbugs does.  Maybe it has to do with not recognizing the possible exceptions?  Does anyone have any ideas?


Build Analysis and Source code analysis must work together

Posted by Alen Zukich   April 25th, 2009

There’s been some recent discussion around using source code analysis (SCA) technology for build clean-up and optimization. I thought it might be useful to try and separate the spin from reality and outline where and how static source code analysis can be used for build optimization.

First, every SCA tool worth its salt does build analysis. Automated discovery of a customer’s build system is a required capability for deep static code analysis. Most users of SCA attempt to discover bugs, security vulnerabilities, and other maintainability problems. Some customers will also leverage the build analysis itself to conduct targeted clean-up of the build. Three common ways that this can be done are:

  1. Trace file analysis – provides visibility of the entire build process to help find issues and inefficiencies in the build that can impact build times, and ultimately developer productivity.
  2. Header file analysis – goal here is to identify inefficient and overly complex include structures that can lead to long build times and bloated system size.
  3. Interface analysis – find low level issues that can cause build failures due to improper API usage.

First, trace file analysis is the process of analyzing, understanding and mapping every process executed during your build process, including all compiler and linker invocations. This kind of analysis is mandatory for a good SCA tool since it is necessary to understand all the full details of how you compile and link code so that detailed models of your system can be generated. The benefit from a build optimization standpoint is that this maps the entire build process and not just your compile and linking process, gives development leads and build managers visibility into the build, any inefficiencies, and where it may just be broken.

The other two types of analysis, header file and interface analysis, is focused on the source code directly but also important to build improvement. Specifically the focus here is on header files themselves and optimizations you can perform. A simple example of this is an include file that is simply never used. Why include it? This adds to the build time, size of the system and not to mention the complexity and maintainability of the system. Finding extra includes is not ground breaking technology but there are various types of issues to look for with header files. Other examples of more complex issues that involve deep analysis are with extra transitive issues or context dependent issues. For example, a missing include with a transitive dependency is a relationship between three or more files. In the following example, the first file, File1.c, includes the second file, header1.h, which, in turn, includes the third file, header2.h. File File1.c uses some symbols from file header2.h, but does not include it directly.

Good practice would have you include header2.h directly, of course developers include header1.h as a means to simply get the build to work. By eliminating instances where, in this example File1.c doesn’t even use anything in header1.h, real reductions in build time, and potentially system size, can be realized. Interface analysis is another type of issue focused on header files again looking for cyclical header files, duplicate header files and a whole slew of other in code issues that can be a nightmare such as multiple definitions or declarations.

Frankly there are limits to what can be done in this area with SCA. There are vendors in the build space such as Electric Cloud or IBM Buildforge who do this for a living and specialize in not only setting up production build environments that scale, but also have tools that complement the kind of optimization described above.


Java source code vs bytecode analysis

Posted by Alen Zukich   January 6th, 2009

David posted an interesting discussion on the usage of static analysis tools by developers to find security vulnerabilities.  As always the discussion with static analysis tools lean towards the false positive and false negative discussion.  But also David mentions their results are sometimes difficult to understand.  

This is one of the reasons Klocwork switched from a bytecode analysis tool for Java to a source code analysis tool.  As both have their advantages and disadvantages (and I admit I’m very biased here) we have certainly found that we have been able to reduce our false positive rates, find more issues (not to mention add new issues quickly) and provide more details on the results.

The reason for this is that the bytecode started to get noisy in Java 5 and even more so in Java 6.  Specifically one of the great advantages that static analysis tools have today is being able to show you the details of any issue it finds.  For example

public class Source {
     public static Source getInstance() { return null; }
     public int getValue() { return 32; }
     public Source() { getInstance().getValue(); }
}

This is a Null Pointer Exception issue. Although this is a very small example it is an inter-procedural issue and something that is missed by other defect detection tools. The important thing here is to help the developer understand this issue by tracing through the paths that make this happen.  In other words you need to know that getInstance() has the potential to return null and what paths bring us to the actual error.  This can get quite complex where it starts to not only span the same class but different methods in different classes.

Because of this extra noise this could be part of the reason your seeing new Java annotations being introduced.   Many of you may be familiar with the Java annotation that were introduced in Java 5.  This was part of a specification called JSR-250.  There is a new specification called JSR 305 for the purposes of allowing annotations to help assist tools that detect software defects.  Interestingly enough it is Findbugs leading the charge to add this in for Java 7.  Interesting to see if this specification happens.