DEBUGGING WITH GIT BISECT: A Developer’s K9 & best friend

Debugging code is often one of the biggest headaches a programmer can encounter. This is especially true in the embedded systems world, where issues can live anywhere from the lowest level of hardware to the highest level of application code. Code regressions caused by an unknown change somewhere in your commit history are especially difficult to track down. However, thanks to the simplicity and magic of “Git bisect”, regressions will quickly become one of your favorite scenarios to debug. Git bisect tracks down the exact commit that led to your broken repository state.

If you are new to Git code repositories, or version control in general, we can’t recommend it enough as a safer and saner way to develop and control your projects. It’s essential whether coding alone or in teams of almost any size. At the very least, consider Git a map and travel journal that helps you backtrack and get out of the coding “weeds”. Git is one of the most widely adopted version control systems available and is extremely well supported. It can be run locally on your PC and is also available (often for free!) through cloud Git services like GitHub and Bitbucket. We’ve provided some links at the foot of this article that can help you get started.

Git bisect works through “divide and conquer”. It runs a binary search over all commits in a specified range to track down which one broke the current state. To begin, a start point and an end point are established. The start point is a commit where the bug being investigated is known not to exist. The end point, usually the most recent commit, is where the bug is known to exist. Git picks a midpoint commit halfway between the two, and you test it to see whether it fails or works. Based on that result, Git discards the half of the range that cannot contain the culprit, picks a new midpoint in the remaining half, and you test again. This process repeats until the search arrives at two adjacent commits in which the older one works and the newer one fails. That newer commit, the first failing commit in the history, contains the change that introduced the bug, and you are now primed and ready to do some squashing.
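The search described above can be sketched in a few lines of shell. This is only an illustration of the idea, not how Git implements it: the helper name first_bad and the string encoding of the history (oldest commit first, W for working, F for failing) are assumptions made for this example.

```shell
#!/bin/sh
# Hypothetical helper: binary-search a history string for the first failing
# commit. Assumes the first character is W (known good start point) and the
# last character is F (known bad end point).
first_bad() {
    history=$1
    good=1                 # 1-based position of a commit known to work
    bad=${#history}        # 1-based position of a commit known to fail
    while [ $((bad - good)) -gt 1 ]; do
        mid=$(( (good + bad) / 2 ))
        state=$(printf '%s' "$history" | cut -c "$mid")
        if [ "$state" = "F" ]; then
            bad=$mid       # culprit is at or before the midpoint
        else
            good=$mid      # culprit is after the midpoint
        fi
    done
    echo "$bad"            # good and bad are now adjacent
}

first_bad "WWWWWFFFFFFF"   # prints 6
```

The loop stops as soon as the known-good and known-bad positions are neighbors, which is the same stopping condition as the two adjacent commits described above.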

Bisecting your repository is especially useful if you need to test thousands of commits to track down a bug. Since it is a binary search, your bad commit should be found in about log2(n) tests. So if you have 20,000 commits to test, you will find your bug within about 15 checks.
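To check that arithmetic, here is a small sketch that counts how many times a range of n commits can be halved before a single commit remains. The function name bisect_steps is made up for this example, and the count is a worst-case estimate rather than the exact number of steps Git will report.

```shell
#!/bin/sh
# Hypothetical helper: worst-case number of halvings needed to narrow
# n candidate commits down to one, i.e. log2(n) rounded up.
bisect_steps() {
    n=$1
    steps=0
    while [ "$n" -gt 1 ]; do
        n=$(( (n + 1) / 2 ))   # keep the larger half (worst case)
        steps=$((steps + 1))
    done
    echo "$steps"
}

bisect_steps 20000   # prints 15
bisect_steps 1000    # prints 10
```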

Here’s how a Git bisect search narrows down the range, one diagram per step. HEAD is on the left and the first commit is on the right:

F = Fail
W = Working
? = Unknown
        F----?----?----?----?----?----?----?----?----?----?----W
        ^                                                      ^
        |                                                      |
        > HEAD                                                 > First Commit
        F----F----F----F----F----F----F----?----?----?----?----W
        ^                             ^                        ^
        |                             |                        |
        > HEAD                        |                        > First Commit
                                      > First test verifies commit is bad
        F----F----F----F----F----F----F----?----?----W----W----W
        ^                                            ^         ^
        |                                            |         |
        > HEAD                                       |         > First Commit
                                                     > Second test is good
        F----F----F----F----F----F----F----W----W----W----W----W
        ^                                  ^                   ^
        |                                  |                   |
        > HEAD                             |                   > First Commit
                                           > Third test is good
        F----F----F----F----F----F----F----W----W----W----W----W
        ^                             ^                        ^
        |                             |                        |
        > HEAD                        |                        > First Commit
                                      > Done bisecting, this is the first failing commit

Basic Example:

To demonstrate Git bisect, we created a simple Git repository and filled it with several commits, including an error that we marked in the log. Here is the initial list of commits for the project.

        * 8f19922 - (HEAD -> master) Changing application name 
        * 560b505 - Remove extra task 
        * 9c11cea - Extra task 
        * 91b7602 - removing extra print 
        * bbd9308 - Fixing wait 
        * 894eed2 - Adding seconds to wait 
        * 48f32df - The error 
        * 966e73c - spelling error fix 
        * b822b9b - Missed a semicolon 
        * 0be9a28 - Intro printf 
        * d315113 - Initial check-in 

In this example, we have an application where HEAD (which points to the latest commit) has an unknown failure that causes our device to crash. Back at the initial check-in (our first commit), everything was working. A perfect scenario for Git bisect!

To begin, let’s issue the start command and inform Git that our current commit (HEAD) is bad. For this, we will use the commit’s SHA, which is the “hash” or unique commit ID. Then we inform Git that the first commit (d315113) is working. The general Git command line syntax is: git bisect <subcommand> <options>. A full list of the Git bisect subcommands and other details can be found in the Git bisect documentation.

    % git bisect start
    % git bisect bad 8f19922
    % git bisect good d315113
    Bisecting: 4 revisions left to test after this (roughly 2 steps)
    [894eed29909354495d73ebcf80f4abeff70321b0] Adding seconds to wait

Once the bad and good revisions are marked, Git immediately picks the midpoint commit and checks it out. At this point, the programmer should test the code to see whether it is in a failing state. Once the test is run, inform Git of the result by using either ‘git bisect good’, if the bug isn’t present, or ‘git bisect bad’, if it is. In this example, the application is still failing at this commit ID, so it is marked bad.

    % git bisect bad
    Bisecting: 2 revisions left to test after this (roughly 1 step)
    [b822b9b93b0fdd28b9fec9a91e62abe0d0823254] Missed a semicolon
    % git bisect good
    Bisecting: 0 revisions left to test after this (roughly 1 step)
    [48f32dfabcb9b95354e1df1e991104469719fcf1] The error
    % git bisect bad
    Bisecting: 0 revisions left to test after this (roughly 0 steps)
    [966e73c633f29701ba6dbbffb20e8efa3977f76b] spelling error fix
    % git bisect good
    48f32dfabcb9b95354e1df1e991104469719fcf1 is the first bad commit
    commit 48f32dfabcb9b95354e1df1e991104469719fcf1
    Author: Forrest Stanley <fstanley@netburner.com>
    Date:   Fri Jul 20 13:58:31 2018 -0700
        The error
    :100644 100644 7607ff3ff482a4cdfcf26bd144f1c81d2c17a32d 1f629b5ff6d639d2bf09210dc7a4586a2f68db0f M  main.cpp
    %

After repeatedly testing each commit and marking it “good” or “bad”, Git is able to narrow the search down to the exact commit ID that led to the regression. Conveniently, the individual who submitted the offending commit in the example above labeled it “The error” in the commit message, but you probably will not have the luxury of such a descriptive commit message in practice.
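If the good/bad check can be scripted, the marking loop can also be handed to Git with the git bisect run subcommand, which is not part of the walkthrough above. Git runs the given command at each midpoint and treats exit code 0 as good and exit codes 1 through 127 (except 125) as bad. The sketch below builds a throwaway repository to show the idea. The file name, commit messages, and the "BUG" marker are hypothetical stand-ins for a real project and a real test.

```shell
#!/bin/sh
# Sketch: create a small repository with a known bad commit, then let
# "git bisect run" find it. Everything here is a made-up stand-in.
find_culprit() (
    set -e
    cd "$(mktemp -d)"
    git init -q .
    git config user.name "Demo"
    git config user.email "demo@example.com"

    # Commits 1-3 work, commit 4 introduces the bug, commits 5-6 inherit it.
    for i in 1 2 3 4 5 6; do
        if [ "$i" -eq 4 ]; then
            echo "BUG" >> app.txt
        else
            echo "line $i" >> app.txt
        fi
        git add app.txt
        git commit -q -m "commit $i"
    done

    first=$(git rev-list --max-parents=0 HEAD)

    # Shorthand for start + bad + good: the bad commit is listed first.
    git bisect start HEAD "$first" >/dev/null

    # The test exits 0 (good) when the marker is absent, 1 (bad) when present.
    git bisect run sh -c '! grep -q BUG app.txt' >/dev/null

    # After the run, refs/bisect/bad names the first bad commit.
    git log -1 --format=%s refs/bisect/bad

    git bisect reset >/dev/null 2>&1   # return to the original HEAD
)

find_culprit   # prints: commit 4
```

Whether you bisect by hand or with a script, git bisect reset ends the session and checks out the commit you were on before you started.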

What can go wrong:

Sometimes, Git bisect may run into some hiccups. If your repository is frequently in a broken state, for example because some commits do not build, then it may be difficult or almost impossible to test a specific commit and mark it good or bad. Git does have a way around this. If you land on a commit that cannot be tested, you can use the git bisect skip command to mark it as untestable. Git will then select another nearby commit to be checked.
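As a rough illustration of skipping, the sketch below starts a bisect in a throwaway repository, pretends the commit Git checked out cannot be built, and skips it. The repository contents and the function name skip_demo are invented for the example. The only thing it demonstrates is that Git moves on to a different candidate commit.

```shell
#!/bin/sh
# Sketch: show that "git bisect skip" makes Git pick a different commit.
# The repository and its eight commits are hypothetical.
skip_demo() (
    set -e
    cd "$(mktemp -d)"
    git init -q .
    git config user.name "Demo"
    git config user.email "demo@example.com"
    for i in 1 2 3 4 5 6 7 8; do
        echo "line $i" >> app.txt
        git add app.txt
        git commit -q -m "commit $i"
    done

    # HEAD is marked bad, the first commit good; Git checks out a midpoint.
    git bisect start HEAD "$(git rev-list --max-parents=0 HEAD)" >/dev/null
    before=$(git rev-parse HEAD)

    git bisect skip >/dev/null         # mark the current commit untestable
    after=$(git rev-parse HEAD)        # Git has moved to another candidate

    git bisect reset >/dev/null 2>&1
    [ "$before" != "$after" ] && echo "moved"
)

skip_demo   # prints "moved" when Git checked out a different commit
```

Keep in mind that if a skipped commit sits right next to the culprit, Git may only be able to report a short list of candidate commits instead of a single one.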

Git bisect is much less useful if you, or your cohorts, are in the habit of checking in large commits that are made up of several features, fixes, and new bugs. It is important to keep each individual commit incremental with a concise fix or feature applied. As they say in the community: ‘Commit early and often!’ Not only is this a general best practice but it allows for utilities such as Git bisect and continuous integration tests to work better, which helps find those bugs as early as possible.

More info:

Git includes very complete and concise documentation that can be accessed with git help from the command line. To read more about git bisect, typing git help bisect in the command line is the best place to start. You can also reference the following webpages to learn more about this feature or the broader Git ecosystem:
