
Open Source

Codehaus & Ben: Thank You and Good Bye

Sonar - Thu, 03/26/2015 - 20:55

It seems very natural today that SonarQube is hosted at Codehaus, but there was a time when it was not! In fact, joining Codehaus was a big achievement for us; you might even say it was one of the project’s first milestones, because Codehaus didn’t accept just any project. That may seem strange today, when you can get started on GitHub in a matter of minutes, but Codehaus was picky, and just being accepted was a big deal.

It was also a big deal because being accepted by Codehaus gave us access to a full suite of best-of-breed tools: IntelliJ, JProfiler, and Nexus, plus Jira, Confluence, and the rest of the Atlassian suite… This, coupled with the fact that Codehaus took on the burden of hosting and maintaining that infrastructure, allowed us to focus on the SonarQube platform and ecosystem. It enabled us to make what we think is a great product – a product that wouldn’t be what it is today without Codehaus.

The first ticket ever created for the SonarQube (née Sonar) project was SONAR-1, entered in the Codehaus Jira on Dec. 17th, 2007. The project was just under a year old at the time (SonarSource hadn’t even been founded yet). Over the next 7+ years, that ticket was followed by nearly 14,000 more across 42 projects, more than 60,000 emails across two mailing lists, and countless documentation revisions over the many versions of SonarQube and its plugins.

Of course, “Codehaus” really boils down to one guy: Ben Walding, who has been running the 1,000-project forge on his own time and his own dime from the beginning. No matter what was going on in Ben’s life, Codehaus was up. And he wasn’t just “keeping the lights on”, either; Ben always made things not just possible, but easy. So when he told us a couple of months ago that Codehaus was shutting down, it wasn’t really a surprise. In fact, as he said, the writing had been on the wall for a while. But it was saddening. Because no matter how many other options there are today for open source projects, Codehaus will always have a special place in the history of the open source movement and in our hearts.

We’ll announce what Life After Codehaus will look like in May, but in the meantime, we say: Merci beaucoup, Большое спасибо, Heel erg bedankt, Grazie mille, vielen Dank, Suur aitäh, Nagyon köszönöm, and Thank you, Ben. Goodbye to Codehaus, and thank you very much.

Categories: Open Source

The speed of a caravan in the desert

Sonar - Thu, 03/12/2015 - 12:22

“What is the speed of a caravan in the desert?” Language Team Technical Lead Evgeny Mandrikov posed that question recently to make a point about developer tools. The answer is that a caravan moves at the speed of the slowest camel, and the same goes for development: a developer can only work at the speed of her slowest tool.

This is one reason developers want – and smart managers buy – machines with fast processors. We like them not just because we’re gear-head (chip-head?) geeks, but because they get us closer to the ability to work at the speed of thought. But what about the other tools? What about the quality tools?

For the same reason developers want fast processors, fast IDEs, and fast compilers, we also want fast quality tools. Forget the semicolon at the end of a line of Java code and most IDEs will immediately light it up in red. That fast quality feedback lets you code quickly and efficiently, without agonizing over trivial details.

Similarly, fast feedback on code quality lets you mark features “done and dusted”, and move on. That’s why SonarSource offers three different options for pre-commit analysis. That’s also why we advocate (and practice!) Continuous Inspection. Looking at code quality once a month or once a week or just before a release is waaay too late.

Because developers want to work at the speed of thought, not the speed of corporate bureaucracy. And smart managers want that too.

Categories: Open Source

Eating the dog food

Sonar - Wed, 02/25/2015 - 17:36

The SonarQube platform includes an increasingly powerful lineup of tools to manage technical debt. So why don’t you ever see SonarSourcers using Nemo, the official public instance, to manage the debt in the SonarQube code? Because there’s another, bleeding-edge instance where we don’t just manage our own technical debt, we also test our code changes, as soon as possible after they’re made.

Dory (do geeks love a naming convention, or what?) is where we check our code each morning, and mid-morning, and so on, and deal with new issues. In doing so, each one of us gives the UI – and any recent changes to it – a thorough workout. That’s because Dory doesn’t run the newest released version, but the newest milestone build. That means that each algorithm change and UI tweak is closely scrutinized before it gets to you.

The result is that we often iterate many times on any change to get it right. For instance, SonarQube 5.0 introduced a new Issues page with a powerful search mechanism and keyboard shortcuts for issue management. Please don’t think that it sprang fully formed from the head of our UI designer, Stas Vilchik. It’s the result of several months of design, iteration, and Continuous Delivery. First came the bare list of issues, then keyboard shortcuts and inter-issue navigation, then the wrangling over the details. Because we were each using the page on a daily basis, every new change got plenty of attention and lots of feedback. Once we all agreed that the page was both fully functional and highly usable, we moved on.

The same thing happens with new rules. Recently we implemented a new rule in the Java plugin based on FindBugs: “Serializable” classes should have a version id. The changes were made, tested, and approved. Overnight the latest snapshot of the plugin was deployed to Dory, and the next morning the issues page was lit up like a Christmas tree.

We had expected a few new issues, but nothing like the 300+ we got, and we (the Java plugin team and I) weren’t the only ones to notice. We got “feedback” from several folks on the team. So then the investigation began: which issues shouldn’t be there? Well, technically they all belonged: every class that was flagged either implemented Serializable or had a (grand)parent that did. (Subclasses of Serializable classes are Serializable too, so for instance every Exception is Serializable.) Okay, so why didn’t the FindBugs equivalent flag all those classes? Ah, because it has some exclusions.
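To see why every flagged class technically belonged, here is a minimal Java sketch; the class names are hypothetical, not from the SonarQube codebase. Serializability is inherited, so both classes below fall under the rule, and neither declares the serialVersionUID it asks for:

import java.io.Serializable;

// Flagged: implements Serializable directly but declares no serialVersionUID.
class Configuration implements Serializable {
  private String name;
}

// Also flagged: Exception is Serializable (via Throwable), so this subclass is too.
class PluginLoadException extends Exception {
  public PluginLoadException(String message) {
    super(message);
  }
}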

Next came the debate: should we have exclusions too, and if so which ones? In the end, we slightly expanded the FindBugs exclusion list and re-ran the analyses. A few issues remained, and they were all legit. Perfect. Time to move on.

When I first came to SonarSource and was told that the internal SonarQube instance was named Dory, I thought I got it: Nemo and Dory. Haha. Cute. But the more I work on Dory, the more the reality sinks in. We rely on Dory on a daily basis; she’s our guide on the journey. But since our path isn’t necessarily a straight line, it’s a blessing for all of us that she can forget the bad decisions and only retain the good.

Categories: Open Source

SonarQube Java Analyzer: The Only Rule Engine You Need

Sonar - Thu, 02/12/2015 - 17:17

If you have been following the releases of the Java plugin, you might have noticed that we work on two major areas for each release: we improve our semantic analysis of Java, and we provide a lot of new rules.

Another thing you might have noticed, thanks to the tag system introduced by the platform last year, is that we are delivering more and more rules tagged with “bug” and “security”. This is a trend we will keep strengthening in the Java plugin, to provide users with valuable rules that detect real problems in their code, not just formatting or code-convention issues.

What you might wonder then is: where do we get the inspiration for those rules?  Well, for starters, the SonarSource codebase is mostly written in Java, and most SonarSource developers are Java developers. So in analyzing our own codebase we find some patterns that we want to detect, turn those patterns into rules, and provide the rules to our users. But that is not enough, and that is why we are taking inspiration from other rule engines, and more specifically FindBugs. We are in the process of deprecating FindBugs rules by rewriting them using our technology.

Our goal is that at some point in 2015 we’ll stop shipping the FindBugs plugin by default with the platform (we’ll still support it and provide it through the update center) because out of the box, the Java Plugin will provide at least as much (hopefully more!) value as FindBugs.

This might seem pretentious, but there are several reasons we are moving in this direction:

  • This is a move we already made with PMD and Checkstyle (and we are still supporting the sonar-pmd-plugin and sonar-checkstyle-plugin).
  • FindBugs works only at the bytecode level: the analysis only runs on compiled classes. The Sonar Java Plugin works with both sources and bytecode, and is thus able to be more precise in its analysis, eliminating false positives and detecting patterns that cannot be detected by FindBugs.
    For instance, consider the following code run against the Java Plugin rule “Identical expressions should not be used on both sides of a binary operator”, which deprecates multiple FindBugs rules:

    //...
    if (a == a) { // self comparison
      System.out.println("foo");
    }
    if (2 + 1 * 12 == 2 + 1 * 12) { // self comparison
      System.out.println("foo");
    }
    //...
    

The approach used by FindBugs, which relies only on bytecode, cannot detect the second issue, because the compiler evaluates the constant expression and erases the second if, so it is not visible in the bytecode.

  • FindBugs project activity: activity on the project is quite low, so new value does not arrive fast enough to satisfy our users.
  • Documentation: One thing we really value at SonarSource, and that we think has made our products great, is that for each issue we raise we provide a clear explanation of why we raised the issue and an indication of how to fix it. This is something that FindBugs clearly lacks in our view, and we are confident we can offer better value in this area.

As we reimplement the FindBugs rules, our goal is also to remove some useless or outdated rules, merge rules that are close in meaning, and report fewer false positives than FindBugs does.

However, this is going to take some work: we are still one step behind FindBugs regarding an essential part of what makes it valuable, the Control Flow Graph (CFG). Briefly, a CFG allows tracking the value of a variable through the execution paths of your code; one example of its use is detecting a potential NullPointerException without executing the code. This feature is not implemented yet in the SonarQube Java Plugin, but a first version was shipped in the latest version (3.3) of the C/C++ plugin. It’s on the roadmap of the Java plugin to embed this feature and deprecate the FindBugs rules that require it.
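To make that concrete, here is a minimal, hypothetical Java sketch (not taken from any actual rule implementation) of the kind of bug that tracking values along execution paths can catch without running the program:

import java.util.Map;

public class CfgExample {
  // On the path where the map has no entry for "id", name is null and
  // name.trim() throws a NullPointerException. Following the value of
  // "name" through both branches is what makes this detectable statically.
  static String normalizedName(Map<String, String> namesById, String id) {
    String name = namesById.get(id); // may be null
    if (name == null) {
      System.out.println("no name registered for " + id);
    }
    return name.trim(); // possible null dereference when name == null
  }
}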

This rewriting of FindBugs rules has already started, with a huge effort on documenting and specifying them properly. Out of the 423 rules provided by FindBugs we have decided to reimplement 394, and have already specified replacements for 286. At the time of this writing, 157 rules have already been reimplemented using our own technology (so about 40% of the implementable rules).

Don’t get me wrong: FindBugs is a great tool providing a lot of useful feedback to Java developers. But within the year, we will be at a point in the development of the SonarQube Java plugin where we can deliver even better feedback, and detect issues that even FindBugs can’t find.

Categories: Open Source

C/C++/Objective-C: Dark past, bright future

Sonar - Thu, 02/05/2015 - 14:03

We’ve just released version 3.3 of the C/C++/Objective-C plugin, which features increased scope and precision of analysis for C, as well as the detection of real bugs in C code, such as null pointer dereferences and type-related bugs. These improvements were made possible by the addition of semantic analysis and symbolic execution: the analysis not of the structure of your code, but of what the code is actually doing.

Semantic analysis was part of the original goal set for the plugin about three years ago. Of course, the goal was broader than that: develop a static analyser for C++. The analyzer needed to continuously check your code’s conformance with your coding standards and practices, and more importantly detect bugs and vulnerabilities to help you to keep technical debt under control.

At the time, we didn’t think it would be hard, because many languages were already in our portfolio, including Java, COBOL, PL/SQL. Our best engineers, Freddy Mallet and Dinesh Bolkensteyn, were already working on C, the natural predecessor of C++. I joined them, and together we started work on C++. With the benefit of hindsight, I can say that we all were blind. Totally blind. We had no idea what a difficult and ambitious task we had set ourselves.

You see, a static analyzer is a program which is able to precisely understand what another program does. And, roughly speaking, a bug is detected when this understanding is different from what the developer really wanted to write. Huh! Already, the task is complex, but it’s doubly so for C++. Why is automatic analysis of C++ so complicated?

First of all, both C and C++ have the concept of preprocessing. For example consider this code:

struct command commands[] = { cmd(quit), cmd(help) };

One would think that there are two calls of the “cmd” function with the parameters “quit” and “help”. But that might not be the case if just before this line there’s a preprocessing directive:

#define cmd(name) { #name, name ## _command }

That directive completely changes the meaning of the original code, literally turning it into

struct command commands[] = { { "quit", quit_command }, { "help", help_command } };

The existence of the preprocessor complicates analysis on many different levels. Most important, the correct interpretation of preprocessing directives is crucial for the correctness and precision of an analysis. We rewrote our preprocessor implementation from scratch three times before we were satisfied with it. And it’s worth mentioning that among the static analyzers on the market (both commercial and open source), you can easily find tools that don’t do preprocessing at all, or do it only imprecisely.

Let’s move to the next difficulty. I’ve mentioned in the past that C and C++ are hard to parse. It’s time to talk a little bit about why. Roughly speaking, parsing is the process of recognizing language constructions – i.e. seeing what’s a statement, what’s an expression, and so on. Let’s take some example code and try to figure out what it is.

T * a

If this were Java code, the answer would be straightforward: most probably this is a multiplication, and part of a bigger expression. But the answer isn’t that simple for C/C++. In general, the answer is “it depends…” This could indeed be an expression statement, if both “T” and “a” are variables:

int T, a;
T * a;

But it could also be the declaration of variable “a” with a type of pointer to “T”, if “T” is a type:

typedef int T;
T * a;

In other words, the context can completely change the meaning of code. This is called ambiguity.

Like natural languages, the grammars of programming languages can be ambiguous. While the C language has just a few ambiguous constructions, C++ has tons of them. And as you’ve seen, correct parsing is not possible without information about types. But getting that information is a difficulty in and of itself, because it requires semantic analysis of language constructs before you can understand their types and relations. And that’s where it starts to get really complex: to parse we need semantic analysis, and to do semantic analysis we need to parse. A chicken-and-egg problem.

We had hit a wall, and when we looked around, we realized we weren’t alone. Many tools don’t even try to parse, to get information about types, or to distinguish between ambiguous and unambiguous cases.

And then we found GLL, a relatively new generalized parsing technique. It was first published in 2010, and there still aren’t any ready-to-use, publicly available implementations for Java. Implementing a GLL parser wasn’t easy and took us quite a while, but the ROI was high. This parser is able to preserve information about encountered ambiguities without actually resolving them, which allows us to do precise analysis of at least the unambiguous constructions without producing false positives on the ambiguous ones.

The GLL parser was a win-win, and a game changer! After two years of development from the first commit, we released precise preprocessing and parsing in version 2.0 of the C++ Plugin, approximately a year ago.

With the original goal well on the way to being met, we started to dream again, raised our expectations even higher, and were ready to welcome new developers. Today, I still work on the plugin, but it’s maintained primarily by Massimo Paladin and Samuel Mercier. They solved the analysis configuration problem, and added support for Objective-C and Microsoft Component Extensions to the plugin.

Our next goal is to apply semantic analysis and symbolic execution to Objective-C, and of course after that to C++, and to use them to cover more MISRA rules. So this is probably not the end of the story of the difficulties in developing a static analyzer for C/C++/Objective-C: who knows what else we will encounter along the way. But we are not blind anymore; now we know that this is difficult. And based on the past, I can say that we at SonarSource are unstoppable, and that even the most incredible dreams come true! So keep dreaming! And never, ever give up!

Categories: Open Source

Don’t Cross the Beams: Avoiding Interference Between Horizontal and Vertical Refactorings

JUnit Max - Kent Beck - Tue, 09/20/2011 - 03:32

As many of my pair programming partners could tell you, I have the annoying habit of saying “Stop thinking” during refactoring. I’ve always known this isn’t exactly what I meant, because I can’t mean it literally, but I’ve never had a better explanation of what I meant until now. So, apologies y’all, here’s what I wished I had said.

One of the challenges of refactoring is succession–how to slice the work of a refactoring into safe steps and how to order those steps. The two factors complicating succession in refactoring are efficiency and uncertainty. Working in safe steps it’s imperative to take those steps as quickly as possible to achieve overall efficiency. At the same time, refactorings are frequently uncertain–”I think I can move this field over there, but I’m not sure”–and going down a dead-end at high speed is not actually efficient.

Inexperienced responsive designers can get in a state where they try to move quickly on refactorings that are unlikely to work out, get burned, then move slowly and cautiously on refactorings that are sure to pay off. Sometimes they will make real progress, but go try a risky refactoring before reaching a stable-but-incomplete state. Thinking of refactorings as horizontal and vertical is a heuristic for turning this situation around–eliminating risk quickly and exploiting proven opportunities efficiently.

The other day I was in the middle of a big refactoring when I recognized the difference between horizontal and vertical refactorings and realized that the code we were working on would make a good example (good examples are by far the hardest part of explaining design). The code in question selected a subset of menu items for inclusion in a user interface. The original code was ten if statements in a row. Some of the conditions were similar, but none were identical. Our first step was to extract 10 Choice objects, each of which had an isValid method and a widget method.

before:

if (...choice 1 valid...) {
  add($widget1);
}
if (...choice 2 valid...) {
  add($widget2);
}
... 

after:

$choices = array(new Choice1(), new Choice2(), ...);
foreach ($choices as $each)
  if ($each->isValid())
    add($each->widget());

After we had done this, we noticed that the isValid methods had feature envy. Each of them extracted data from an A and a B and used that data to determine whether the choice would be added.

Choice pulls data from A and B

Choice1 isValid() {
  $data1 = $this->a->data1;
  $data2 = $this->a->data2;
  $data3 = $this->a->b->data3;
  $data4 = $this->a->b->data4;
  return ...some expression of data1-4...;
}

We wanted to move the logic to the data.

Choice calls A which calls B

Choice1 isValid() {
  return $this->a->isChoice1Valid();
}
A isChoice1Valid() {
  return ...some expression of data1-2 && $this->b->isChoice1Valid();
}

Succession

Which Choice should we work on first? Should we move logic to A first and then B, or B first and then A? How much do we work on one Choice before moving to the next? What about other refactoring opportunities we see as we go along? These are the kinds of succession questions that make refactoring an art.

Since we only suspected that it would be possible to move the isValid methods to A, it didn’t matter much which Choice we started with. The first question to answer was, “Can we move logic to A?” We picked Choice1. The refactoring worked, so we had code that looked like:

Choice calls A which gets data from B

A isChoice1Valid() {
  $data3 = $this->b->data3;
  $data4 = $this->b->data4;
  return ...some expression of data1-4...;
}

Again we had a succession decision. Do we move part of the logic along to B or do we go on to the next Choice? I pushed for a change of direction, to go on to the next Choice. I had a couple of reasons:

  • The code was already clearly cleaner and I wanted to realize that value if possible by refactoring all of the Choices.
  • One of the other Choices might still be a problem, and the further we went with our current line of refactoring, the more time we would waste if we hit a dead end and had to backtrack.

The first refactoring (move a method to A) is a vertical refactoring. I think of it as moving a method or field up or down the call stack, hence the “vertical” tag. The phase of refactoring where we repeat our success with a bunch of siblings is horizontal, by contrast, because there is no clear ordering between, in our case, the different Choices.

Because we knew that moving the method into A could work, while we were refactoring the other Choices we paid attention to optimization. We tried to come up with creative ways to accomplish the same refactoring safely, but with fewer steps by composing various smaller refactorings in different ways. By putting our heads down and getting through the other nine Choices, we got them done quickly and validated that none of them contained hidden complexities that would invalidate our plan.

Doing the same thing ten times in a row is boring. Halfway through, my partner started getting good ideas about how to move some of the functionality to B. That’s when I told him to stop thinking. I didn’t actually want him to stop thinking; I just wanted him to stay focused on what we were doing. There’s no sense pounding a piton in halfway and then stopping because you see where you want to pound the next one in.

As it turned out, by the time we were done moving logic to A, we were tired enough that resting was our most productive activity. However, we had code in a consistent state (all the implementations of isValid simply delegated to A) and we knew exactly what we wanted to do next.

Conclusion

Not all refactorings require horizontal phases. If you have one big ugly method and you create a Method Object for it, breaking the method into tidy, shiny pieces, you may be working vertically the whole time. However, when you have multiple callers to refactor or multiple implementors to refactor, it’s time to begin paying attention to going back and forth between vertical and horizontal, keeping the two separate, and staying aware of how deep to push the vertical refactorings.

Keeping an index card next to my computer helps me stay focused. When I see the opportunity for a vertical refactoring in the midst of a horizontal phase (or vice versa) I jot the idea down on the card and get back to what I was doing. This allows me to efficiently finish one job before moving onto the next, while at the same time not losing any good ideas. At its best, this process feels like meditation, where you stay aware of your breath and don’t get caught in the spiral of your own thoughts.

Categories: Open Source

My Ideal Job Description

JUnit Max - Kent Beck - Mon, 08/29/2011 - 21:30

September 2014

To Whom It May Concern,

I am writing this letter of recommendation on behalf of Kent Beck. He has been here for three years in a complicated role and we have been satisfied with his performance, so I will take a moment to describe what he has done and what he has done for us.

The basic constraint we faced three years ago was that exploding business opportunities demanded more engineering capacity than we could easily provide through hiring. We brought Kent on board with the premise that he would help our existing and new engineers be more effective as a team. He has enhanced our ability to grow and prosper while hiring at a sane pace.

Kent began by working on product features. This established credibility with the engineers and gave him a solid understanding of our codebase. He wasn’t able to work independently on our most complicated code, but he found small features that contributed and worked with teams on bigger features. He has continued working on features off and on the whole time he has been here.

Over time he shifted much of his programming to tool building. The tools he started have become an integral part of how we work. We also grew comfortable moving him to “hot spot” teams that had performance, reliability, or teamwork problems. He was generally successful at helping these teams get back on track.

At first we weren’t sure about his work-from-home policy. In the end it clearly kept him from getting as much done as he would have had he been on site every day, but it wasn’t an insurmountable problem. He visited HQ frequently enough to maintain key relationships and meet new engineers.

When he asked that research & publication on software design be part of his official duties, we were frankly skeptical. His research has turned into one of the most valuable of his activities. Our engineers have had early access to revolutionary design ideas and design-savvy recruits have been attracted by our public sponsorship of Kent’s blog, video series, and recently-published book. His research also drove much of the tool building I mentioned earlier.

Kent is not always the easiest employee to manage. His short attention span means that sometimes you will need to remind him to finish tasks. If he suddenly stops communicating, he has almost certainly gone down a rat hole and would benefit from a firm reminder to stay connected with the goals of the company. His compensation didn’t really fit into our existing structure, but he was flexible about making that part of the relationship work.

The biggest impact of Kent’s presence has been his personal relationships with individual engineers. Kent has spent thousands of hours pair programming remotely. Engineers he pairs with regularly show a marked improvement in programming skill, engineering intuition, and sometimes interpersonal skills. I am a good example. I came here full of ideas and energy but frustrated that no one would listen to me. From working with Kent I learned leadership skills, patience, and empathy, culminating in my recent promotion to director of development.

I understand Kent’s desire to move on, and I wish him well. If you are building an engineering culture focused on skill, responsibility and accountability, I recommend that you consider him for a position.

 

===============================================

I used the above as an exercise to help try to understand the connection between what I would like to do and what others might see as valuable. My needs are:

  • Predictability. After 15 years as a consultant, I am willing to trade some freedom for a more predictable employer and income. I don’t mind (actually I prefer) that the work itself be varied, but the stress of variability has been amplified by having two kids in college at the same time (& for several more years).
  • Belonging. I have really appreciated feeling part of a team for the last eight months & didn’t know how much I missed it as a consultant.
  • Purpose. I’ve been working since I was 18 to improve the work of programmers, but I also crave a larger sense of purpose. I’d like to be able to answer the question, “Improved programming toward what social goal?”

Categories: Open Source
