Xebia Blog
Software development done right!
Updated: 2 hours 40 min ago

Why Application Release Automation needs a Release and an Operations view

Wed, 02/01/2012 - 02:30

As the interface between Development and Operations, Application Release Management[1] handles information that is highly relevant to your Release and Operations teams. Selecting an Application Release Automation solution that provides insight and analytics from both perspectives is thus a key component of an effective DevOps strategy.

Here, we explain how Deployit’s Infrastructure and new Release Overview features help you achieve this goal.

Continuous Delivery & the Release Perspective

In today’s highly competitive economic environment, the need to bring new features to market quickly, flexibly and reliably is paramount – a goal that is ultimately the aim of the main IT trends Cloud, Agile and DevOps.

Continuous Delivery, which extends Continuous Integration to automatically move applications down the Dev-Test-Acc-Prod delivery pipeline, is a key component of this strategy. To implement it effectively, your ARA solution needs to allow your developers (or, in larger organisations, your release or DevOps teams) to quickly and efficiently answer questions such as:

  • How far is MyApplication down the road to Production?
  • When will MyApplication take the next step down the road?
  • What do I still need to do before that next step can be taken?[2]

Ideally, a release dashboard that answers these questions would also allow you to plan MyApplication’s next step and calculate the estimated go-live date, perhaps even based on an analysis of previous versions of MyApplication.

(Virtual) Environment Management & the Operations Perspective

From an Operations point of view, an individual application is only a small part of the picture. Across your Dev-Test-Acc-Prod landscape, you will need to track all applications vying for these environments, to manage potentially conflicting resource requests, plan environment maintenance activities and the like.

Since these environments are often owned and managed by different teams and certainly have varying service levels, you will also want to limit your view to one or a subset of these environments at a time.

Your Operations or DevOps teams need to know:

  • Which application versions are currently deployed to my environment(s), or were deployed at a certain point in time?
  • Which components do these applications consist of? On which middleware and infrastructure systems are these components deployed?
  • What are the current values of any properties or settings for these components? Which environment-specific customizations have been applied?

Cloud and the on-demand environments it enables will eventually replace the rigid Dev-Test-Acc-Prod distinction[3]. Nevertheless, the ability to present an environment-centric view will still be required, since virtual environments will still be owned by different groups or teams. Indeed, such a perspective will be even more important if you want to effectively combat “virtual sprawl”.

While the coming generations of “true” cloud architectures will hopefully reduce the shared resource conflicts that greatly complicate much of today’s Dev-Test-Acc-Prod management, databases, legacy systems and external payment providers are not likely to disappear anytime soon.

In fact, Facebook, Twitter and other social elements of your future business services may even increase the number of shared resources you need to manage!

Incorporating ARA Data in the Service Delivery Picture

Whilst your ARA solution should be your “go-to” platform for answers about how your applications and environments relate, it is equally important to consider when this data might be more effectively embedded in a broader service delivery picture.

For example, your ARA platform is not a good candidate for providing a release calendar, since it is not aware of much of the information that is relevant in this context, such as CAB[4] meeting schedules, business sign-off dates or operational maintenance windows.

It is thus important to ensure that your ARA solution can make its data accessible via RSS feeds, iCal calendars and other APIs, to enable effective integrations with the rest of your service delivery tooling.

Conclusion

The right Application Release Automation platform gives your Delivery and Operations teams fast, accurate insight into your application environments and delivery pipeline.

Choosing a solution like Deployit with focused Operations and Delivery overviews as well as open APIs for easy integration into your overall Service Delivery dashboards and reports greatly enhances the accessibility and effectiveness of your application release management.

Footnotes

  1. a.k.a. Deployment Automation – choose your favourite ;-)
  2. For instance, certain blocking release conditions, such as test sign-off, may still need to be met.
  3. and have long done so in many forward-looking organisations
  4. Change Advisory Board

Categories: Companies

Agile is unavoidable

Fri, 01/27/2012 - 15:08

Just as in 2010, Xebia conducted its annual survey on the state of Agile in the Netherlands in 2011, and once again the results are striking. Almost 90 percent of the companies that work with Agile say they achieve strongly improved results in their (ICT) projects. The question that immediately comes to my mind with percentages this high is why not everyone gets started with Agile.

In addition, 83 percent of the Dutch companies that have adopted Agile report more job satisfaction and 85 percent report more team motivation. These percentages are considerably higher than last year, when three-quarters of respondents reported more job satisfaction and team motivation. So the people who work Agile thrive on it, which in my opinion is one of the most important reasons for the success of Agile. This often also shows in lower absenteeism due to illness and greater loyalty to the employer.

Other important effects are a shorter time-to-market, according to 79 percent of respondents, and an increase in productivity (66 percent). The survey also shows that cost reduction is a more important effect of working Agile than it was last year (38 percent in 2011 against 21 percent in 2010). Not so strange, of course, in these economically tough times. But it does mean that Agile is increasingly being used as part of a cost-reduction programme. It is then very important to make sure that, in steering (more) on these goals, the Agile philosophy does not perish in the search for cost reduction. If cost reduction is positioned as the primary goal with Agile as the means, there is a good chance of getting bogged down again in ‘old-fashioned’ contract and KPI management, which the past has taught us is not effective.

In the transition to working Agile, company culture is, as it was last year, the most important bottleneck for many organisations (49 percent). That is not a new insight either. But it is something that is underestimated too often, certainly by key people who are themselves part of that existing company culture. Adopting Agile requires a lot of focus and energy over a longer period, and it only really takes hold by being embedded in that very company culture.

Improving the knowledge, skills and above all mindset of Agile will give the roll-out a greater chance of success. The organisations surveyed acknowledge this themselves too: according to 36 percent, a lack of knowledge and skills is the most important limiting factor in the (further) implementation and roll-out of Agile. By investing in Agile knowledge, skills and mindset precisely in these economically challenging times, organisations will be able to achieve a higher return. Improving business results is now, for example, an important reason for many financial institutions in the Netherlands to start working Agile: it lets them tackle their excessive cost structure effectively.

So Agile promises a lot, but it also delivers. Certainly given the current generations Y and Z, who are growing up now and will form the future working population of the Netherlands, there is no escaping (something like) Agile. I regularly speak to (younger) job applicants who find Agile, and the way of working and dealing with each other that goes with it, perfectly normal, and who shudder at a description of the traditional waterfall model and the associated way of working. Attracting young talent and a way of working that has nothing to do with Agile or elements of it are almost at odds with each other. Lasting continuity and long-term success for any company depend to a significant extent on being able to keep attracting and retaining young talent. Becoming or being Agile appears to be an important precondition for this.

I am curious to see when you will board the successful train called Agile!

Categories: Companies

One practice a day…

Wed, 01/25/2012 - 23:22

How do you change the way you live or work? Many people, and companies, seem to think it’s enough to adopt just one or two practices while continuing their old habits. Will this lead you to your desired outcome? Or will you just get frustrated?

The desire to change

Suppose one day, after a particularly bad hangover, you decide to change your life.  You long for a trim body, a balanced spirit, lots of energy and no more headaches. In short, a happy mind in a healthy, good looking body. But how do you get there?

Well, we all know that there are many practices that will help you achieve those goals.  Exercise three times a week, stop smoking, reduce your alcohol intake, and change your eating patterns. Oh, also, get more sleep, less stress, no more pills, and drink herbal tea instead of espressos.

But you hate sports, you love chocolate and smoking is your way to reduce your stress.  So what do you do?

The daily fruit practice

In order to get healthy, you decide that from now on, every day, you will eat a piece of fruit at lunch. It’s pretty easy to do, even though you don’t like fruit very much. So now you are satisfied: you do a healthy thing every day, and you expect to reach your goals (more energy! greater happiness! better looks!) any time soon.

Will you? Of course not! While the daily piece of fruit is a step in the right direction, you will never fully get the benefits you are seeking if you keep up your old habits of smoking, drinking and eating greasy food. And after a while you might even get disappointed and frustrated, and you will decide to stop eating fruit, claiming that “a healthy lifestyle simply doesn’t work for you”.

Doing just one healthy practice, or even a couple of them, isn’t enough. What you really need is to change your mindset. You want to be a healthy person, not do some healthy stuff. Once the healthy mindset becomes your way of life, the practices will simply be the natural way to go. And they will be a lot easier to start with and maintain for a long time.

The daily Agile practice

The same applies to Agile. We see many companies and teams that ‘do’ Agile. They take a couple of practices from an Agile framework such as Scrum, and they figure that that will be enough to give them all the Agile benefits. Some teams even think that if they just do a daily stand-up, they do ‘Agile’. While daily stand-ups, just like that daily piece of fruit, are definitely a good practice, you are still far removed from true agility if the rest of your behaviour consists of old, non-Agile practices.

A company or team only gets the most out of being Agile when its mindset is based on the Agile principles stated in the Agile manifesto, such as continuous learning at all levels in the organization, business and IT working closely together, delivering the highest business value first, welcoming change, and creating an environment where motivated individuals can be creative and self-directing.

So if you are doing ‘some’ Agile, expand your practices. But more importantly, change your mindset. Don’t do Agile, be Agile. And remember:

One practice a day does not keep the old habits away…

Categories: Companies

Installing a nodejs application without your good old internet

Mon, 01/23/2012 - 20:55

While we were building a little server to enable audit logging on our Hadoop cluster (more on that in a future blog post), we needed a way to distribute our application.
This post is about the packaging of this application. The application is built with Node.js, and packaging and dependency management are mostly done with npm (the node package manager).

Of course, installing this application in the production environment should have been as easy as the setup on our own laptops, right? Wrong! On our laptops it was an easy git clone followed by an npm install, and voilà, we had a running application. So how hard could it be to do this on a server at the client? Let me tell you…

This server is not connected to the internet, so the git clone wouldn’t work in the first place. Not really a problem, because it’s a small app and we could just make a tarball and ship it to the server.
Next came all our dependencies. We used a few modules that were listed as dependencies in our package.json file, so that npm install would do its magic.

The npm install magic consists, among other things, of getting the modules from the npm registry, and that fails if you’re not connected to the internet. Searching for a different way to do this, I found that npm has a cache directory and thought I could get the modules from there. This might have worked, but with that solution I would miss the dependencies that these modules themselves depend on. And it would require a messy kind of script.

Browsing the internet didn’t provide me with the right answer, but it led me to the npm pack command. It can pack your module together with its dependencies. The only thing you need to configure for this is a separate array in package.json listing the dependencies to bundle with your app.

So far so good, so I went on and added a bundleDependencies section to my package.json and ran npm pack.
The result was a nice .tgz file containing all the files needed for the application together with all the modules it depended on. At least, the main modules it depended on. My fellow programmers in crime from whom I got these modules hadn’t bothered to add this extra section, so npm had no knowledge of the modules they in turn depended on.

This was easy to solve: just add a correct bundleDependencies section to all package.json files.
Doing this manually sounded like a boring task, and because I love my programming job, I decided to write a program for it.

My obvious choice of programming languages was: awk, grep and sed. Why? Because I can.
Without further ado, here it is:

    awk '/dependencies/,/]|}/' "$file" |
    grep -o '\".*\".*:' |
    sed 's/^.*{//g' |
    sed 's/\"dependencies.*\://g' |
    grep -v -e '^$' |
    uniq |
    sed 's/\"[[:space:]]\:/\",/g' |
    sed 's/\"\:/\",/g' |
    sed '$ s/,/ ]/' |
    sed '1 s/\"/\"bundleDependencies\" \: [ \"/' |
    sed 's/\"/\\"/g' |
    tr -d '\n'

What it does, you ask?
I'll explain line by line:

line1: get the dependencies part (an array or json object) from the file (the package.json file)
ex: "dependencies" : { "express": "0.2.2", "findit" : "0.0.1" }

line2: get only the stuff from that object between the double quotes before the colon (removing the version number part of the dependency)
ex: "express":
"findit":

line3: remove anything preceding the {
line4: remove the original dependencies text
line5: remove any empty lines
line6: remove duplicates

line7 and line8: replace the ": by ", so we can create an array from it
ex: "express",
"findit",

line9: replace the last , by an ] to close the array
ex: "express",
"findit"]

line10: replace the first " by "bundleDependencies" : [ "
ex: "bundleDependencies" : [ "express",
"findit"]

line11: precede all quotes by a backslash so it can be safely used in the sed command to add it to the file

line12: remove all newlines
ex: "bundleDependencies" : [ "express", "findit"]

This is added to the package.json file we are currently processing and if we have finished doing this for all package.json files we can use npm pack to create our tarball.
Works pretty well I might say. But I'm the first to admit this isn't the most readable program ever written.
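
If the pipeline is hard to follow, the same transformation can be sketched in a few lines of Java. This is only an illustration of the steps described above, not the script I actually used: the class name BundleDeps and the method bundleLine are made up for this example, and it assumes a flat dependencies object without nested braces. Like the awk/grep/sed version, it uses regular expressions rather than a real JSON parser.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
class BundleDeps {
    // Body of a flat "dependencies" object: everything between { and the first }
    private static final Pattern DEPS =
            Pattern.compile("\"dependencies\"\\s*:\\s*\\{([^}]*)\\}");
    // A quoted key followed by a colon, e.g. "express":
    private static final Pattern KEY = Pattern.compile("\"([^\"]+)\"\\s*:");

    // Builds the bundleDependencies line from the text of a package.json file.
    // Returns an empty string when there are no dependencies.
    static String bundleLine(String packageJson) {
        Matcher deps = DEPS.matcher(packageJson);
        if (!deps.find()) {
            return "";
        }
        List<String> names = new ArrayList<>();
        Matcher key = KEY.matcher(deps.group(1));
        while (key.find()) {
            String quoted = "\"" + key.group(1) + "\"";
            if (!names.contains(quoted)) {   // drop duplicates, like uniq
                names.add(quoted);
            }
        }
        if (names.isEmpty()) {
            return "";
        }
        return "\"bundleDependencies\" : [ " + String.join(", ", names) + "]";
    }

    public static void main(String[] args) {
        String json = "{ \"name\": \"auditlog\", \"dependencies\" : "
                + "{ \"express\": \"0.2.2\", \"findit\" : \"0.0.1\" } }";
        // prints "bundleDependencies" : [ "express", "findit"]
        System.out.println(bundleLine(json));
    }
}
```

The version numbers are dropped because only quoted strings that are followed by a colon count as keys.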

Of course, when building a Node application you will have node around to help you, so I also created a JavaScript version:

var fs = require('fs')
var findit = require('findit')

// Walk the current directory and handle every package.json that is found
findit.find('.', function(name) {
  if (endsWith(name, 'package.json')) {
    handleFile(name)
  }
}).on('end', bundleApp)

// Add a bundleDependencies array listing every key of the dependencies object
function handleFile(file) {
  fs.readFile(file, function(err, data) {
    if (err) {
      console.log('Not processed ' + file + ' because of the following error: ' + err)
    } else {
      var arr = []
      var packageFile = JSON.parse(data)
      if (packageFile.bundleDependencies) {
        console.log('Bundledeps already present. Skipping')
      } else {
        for (var d in packageFile.dependencies) {
          arr.push(d + "")
        }
        if (arr.length > 0) {
          packageFile['bundleDependencies'] = arr
          // Write the updated file back and report a failed write
          fs.writeFile(file, JSON.stringify(packageFile, null, 4), function(err) {
            if (err) console.log('Could not write ' + file + ': ' + err)
          })
        }
      }
    }
  })
}

function bundleApp() {
  console.log('Finished preparing package.json files for packaging. Now run npm pack to create the full-blown tarball')
  //exercise left for the reader to require('npm') and run the pack command
}

// True when str ends with suffix
function endsWith(str, suffix) {
  return str.indexOf(suffix, str.length - suffix.length) !== -1;
}

Hope somebody can benefit from this in the future.

Categories: Companies

Beware of the timezone

Sun, 01/15/2012 - 04:26

The country of Samoa decided to skip a day: the 30th of December 2011 does not exist in Samoa. This decision was driven by economic reasons. It made me wonder what problems we can run into when developing software for IT systems that depend on timezone information.

Skipping a day sounds so easy. First of all, I was curious to know whether similar situations have occurred in the past and what the software development pitfalls could be. A small investigation showed me that you need to be careful with assumptions regarding regional timezone issues. Let’s have a look at how a programming language like Java handles these kinds of situations.

To demonstrate the consequences of this kind of timezone change and how Java handles it, I have compiled the following example (Apia is the capital of Samoa):
1. TimeZone timeZoneApia = TimeZone.getTimeZone("Pacific/Apia");
2. SimpleDateFormat sdf = new SimpleDateFormat("dd-MM-yyyy HH:mm:ss");
3. sdf.setTimeZone(timeZoneApia);
4. Calendar calendar = new GregorianCalendar(timeZoneApia);
5. calendar.set(2011, 11, 29, 23, 59, 59);
6. System.out.println(sdf.format(calendar.getTime()));
The output of line 6 is, unsurprisingly:
29-12-2011 23:59:59
Now, let’s add the following lines:
7. calendar.add(Calendar.SECOND, 1);
8. System.out.println(sdf.format(calendar.getTime()));
The output of line 8 depends on the Java 7 version you use. If you use the originally released Java 7 or Java 7 update 1, the output will be:
30-12-2011 00:00:00
If you run Java 7 update 2 (or a previous version of Java 7 with an updated timezone configuration), the output will be:
31-12-2011 00:00:00
This example demonstrates how thoroughly Java has implemented timezone awareness, provided its timezone data is up to date.

Java 7 update 1 was released on the 18th of October 2011 and did not yet include the “Samoa case”. However, Oracle provides a tool called Timezone Update Tool to update your Java timezone configuration. The changelog of the tool gives a nice overview of recent timezone changes. I was surprised by the number of timezone updates.
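
The same experiment can be sketched with the java.time API. Note that java.time only arrived in Java 8, so it is not what the example above uses; the class name SamoaSkippedDay and the method oneSecondLater are made up for this illustration, and the result assumes a JVM whose timezone data already includes the 2011 Samoa change.

```java
import java.time.ZoneId;
import java.time.ZonedDateTime;

// Hypothetical demo class, for illustration only.
class SamoaSkippedDay {
    // Returns the moment one second after 29-12-2011 23:59:59 in Apia.
    static ZonedDateTime oneSecondLater() {
        ZoneId apia = ZoneId.of("Pacific/Apia");
        ZonedDateTime lastSecond =
                ZonedDateTime.of(2011, 12, 29, 23, 59, 59, 0, apia);
        // plusSeconds works on the instant time-line, so it crosses the
        // offset change and lands on the next existing wall-clock time.
        return lastSecond.plusSeconds(1);
    }

    public static void main(String[] args) {
        // With up-to-date timezone data the local date is 2011-12-31:
        // the 30th never shows up.
        System.out.println(oneSecondLater().toLocalDate());
    }
}
```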

This made me curious about other, similar cases. I found a number of them and would like to point out one interesting case in the Netherlands, where not a whole day but only a couple of seconds were skipped.

Consider the following example:
1. TimeZone timeZoneAmsterdam = TimeZone.getTimeZone("Europe/Amsterdam");
2. SimpleDateFormat sdf = new SimpleDateFormat("dd-MM-yyyy HH:mm:ss");
3. sdf.setTimeZone(timeZoneAmsterdam);
4. Calendar calendar = new GregorianCalendar(timeZoneAmsterdam);
5. calendar.set(1937, 5, 30, 23, 59, 59);
6. System.out.println(sdf.format(calendar.getTime()));
Obviously, the output of line 6 is:
30-06-1937 23:59:59
Now, let’s do the same trick as with the “Samoa case”. Add the following lines:
7. calendar.add(Calendar.SECOND, 1);
8. System.out.println(sdf.format(calendar.getTime()));
Question: What will be the output of line 8?

The correct answer is:
01-07-1937 00:00:28
The short story behind this: in the early 1900s the Dutch government declared that the official time in the Netherlands was “Amsterdam Time”, defined by the meridian of the “Westertoren” in Amsterdam. This resulted in an offset of 19 minutes and 32 seconds from GMT.

On the 1st of July 1937 the offset from GMT was changed to 20 minutes, creating a 28 second gap. So, officially, the first 28 seconds of July 1st 1937 do not exist in the Netherlands. Probably, in 1937, not many people were worried about the possible IT consequences of such a decision.
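
The arithmetic behind the 28 seconds can be checked with a small sketch that uses only the two offsets quoted above. It deliberately uses fixed offsets instead of the Europe/Amsterdam timezone rules, so it does not depend on the timezone data of your JVM. The class and method names are made up for this illustration, and java.time requires Java 8 or later.

```java
import java.time.OffsetDateTime;
import java.time.ZoneOffset;

// Hypothetical demo class, for illustration only.
class AmsterdamGap {
    // "Amsterdam Time": 19 minutes and 32 seconds ahead of GMT
    static final ZoneOffset AMSTERDAM_TIME = ZoneOffset.ofHoursMinutesSeconds(0, 19, 32);
    // The offset introduced on the 1st of July 1937: 20 minutes ahead of GMT
    static final ZoneOffset NEW_OFFSET = ZoneOffset.ofHoursMinutes(0, 20);

    // Size of the gap between the two offsets, in seconds.
    static int gapSeconds() {
        return NEW_OFFSET.getTotalSeconds() - AMSTERDAM_TIME.getTotalSeconds();
    }

    // One second after 30-06-1937 23:59:59 Amsterdam Time, shown in the new offset.
    static OffsetDateTime firstMomentOfJuly() {
        OffsetDateTime lastSecond =
                OffsetDateTime.of(1937, 6, 30, 23, 59, 59, 0, AMSTERDAM_TIME);
        return lastSecond.plusSeconds(1).withOffsetSameInstant(NEW_OFFSET);
    }

    public static void main(String[] args) {
        System.out.println(gapSeconds());        // prints 28
        System.out.println(firstMomentOfJuly()); // prints 1937-07-01T00:00:28+00:20
    }
}
```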

Conclusion

With this article I wanted to show that, especially in IT, you need to be careful about assumptions regarding calendar and timezone related issues. Hard-to-spot errors may be lurking around the corner without you realizing it. For example, when you construct a Calendar for a date/time/timezone combination that does not exist, the following happens:

  • Java returns a date that is “rounded off” to the nearest valid value in the future (we have seen this in both examples in this article).
  • If you don’t want Java to return the nearest valid value in the future, set the Calendar’s lenient property to false (calendar.setLenient(false)) and you will receive an exception instead.
So, when you are unaware of the interesting variations in the timezone you develop software for, working with calendars and dates may result in unexpected return values or exceptions.
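
A minimal sketch of the non-lenient behaviour follows. Instead of the two cases above it uses an ordinary summer time gap, which is my own choice of example: on the 25th of March 2012 the clocks in Amsterdam jumped from 02:00 to 03:00, so 02:30 does not exist. The class name LenientDemo and the method rejectsGapTime are made up for this illustration.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.TimeZone;

// Hypothetical demo class, for illustration only.
class LenientDemo {
    // Returns true when a non-lenient Calendar refuses a wall-clock time
    // that falls inside the skipped hour.
    static boolean rejectsGapTime() {
        Calendar calendar = new GregorianCalendar(TimeZone.getTimeZone("Europe/Amsterdam"));
        calendar.setLenient(false);
        // 25-03-2012 02:30:00 does not exist in this timezone
        calendar.set(2012, Calendar.MARCH, 25, 2, 30, 0);
        try {
            calendar.getTime(); // the fields are only validated here
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(rejectsGapTime());
    }
}
```

Note that the exception is not thrown by set() itself but by the first call that forces the Calendar to compute its time, such as getTime().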

Categories: Companies

Product Owner Scaling Problems

Fri, 01/13/2012 - 12:45

Scaling the Product Owner (PO) role is tricky business. When you scale up too much within the same context, things become cumbersome. We don’t want to bring back the same centralized, fear-ridden, ineffective decision-making climate that we tried to kill off in the first place. When people have spent so much time and effort bringing back entrepreneurship, they don’t want to create layer upon layer of hierarchical PO/CPO relationships.
So if there is this perceived risk of falling back, why do we actually want to scale the PO role at all?

Here are some reasons I came up with when thinking of a project context(*):

The perceived need for scaling could arise as a reaction to the scope of the project at hand. It is quite common within projects to try to develop different semi-detached functional areas of the product in parallel as much as possible. This comes quite close to how I used to manage projects in the past: whatever could be done in parallel in the same timebox, should be. I was very efficiency-driven back then, but I was also taught and stimulated to think this way.
Within this paradigm it would be preferable for the PO to have some sort of party around him, helping to create the user stories he is in charge of. Creating big teams gives us more capacity to do more work in our timebox. Larger teams and more work create the need for more coordination because of the limited span of control, thus creating the need to scale.

Also, I think we are still very much driven by the paradigm of the chain of command. It is simply comfortable when there is someone who can make the tough decisions for us and mediate between us whenever conflicting interests arise. The need for extra coordination between and over POs in the same context thus fits very naturally with our needs in a growing project context.
Come to think of it, maybe it is no coincidence that the person designated CPO is more often than not the first PO on the project. If we look deep within ourselves, wouldn’t every PO want their first project steps to be so successful that extra means are granted to grow the team, and consequently rise above it to lead the way as CPO? And should we be that CPO, would we ever fully trust another colleague with the work we carefully prepared and poured our blood, sweat and tears into? It would be great if we could somehow keep some sort of supervision over the others, to make sure the project’s success and our hard work aren’t killed off in the next sprint or two. Scaling with a CPO construction would provide the means to fulfill these needs.

Of course the above reasons to scale could be valid, but there are also downsides to them.

Although doing parallel work may sound efficient, validation with your customers may show that the product increment doesn’t fulfill their needs as originally envisioned. Since you have done a lot of parallel work, the risk of waste is also greater. Should you need to radically pivot your product in another direction, it will be much harder due to the already large scale of your operation.
I thus think that scaling towards having different teams work on the same backlog potentially inhibits us from maximizing validated business value: you are pulling forward less important features from the backlog before validating, and thereby knowing, whether you have actually delivered value with your top items. From this angle, it would be wise to consider not scaling.

When scaling from success, we run the risk of slipping into a power-monger mode and forming a team beneath us to gain status and control instead of branching out sideways.
When this happens we are no longer working towards joint company stakeholdership, but are trying to build our own ivory towers. When the CPO leading the POs claims decisive power in certain areas, I believe you are not creating the right entrepreneurial environment, in which decisions are based on arguments and value for the customer and company. As a PO, if you already know you need a decision from the CPO, you may be inclined to use other methods of influence rather than comparable and less subjective measures such as business cases, jeopardizing the quality of decision making and therefore the success of the product.

During my studies, I was taught that “there is no one best way to organize”. I think of this line as a universal truth that, in my opinion, also applies to scaling the PO role. I therefore think that form is less important than the route that takes you there. Knowing that there are risks involved in scaling, and consciously dealing with them, helps you along this route to find a form that works for you.

(*) Scaling is also common in product line contexts (for instance a CPO over a group of POs managing a product family) or within business units where you would have various strategic themes that the CPO would like to have implemented over the same time period.

Categories: Companies

On cloud 3×3

Thu, 12/29/2011 - 15:50

2011 has been an interesting year for cloud computing. Traditionally, cloud computing can be divided into three categories:

  • Software-as-a-Service (SaaS)
  • Infrastructure-as-a-Service (IaaS)
  • Platform-as-a-Service (PaaS)

While SaaS has been around for some time (Salesforce.com started in 1999!), we are seeing an increase in adoption of IaaS and some heavy development in the PaaS world.

Now that 2011 is coming to an end, this is also the time for lists. So here are my 3 top 3’s of cloud computing.

Software-as-a-Service

3. GitHub

Distributed version control systems are gaining traction rapidly, with Git leading the way. GitHub provides Git as a service, and offers unlimited repositories for open source software. It provides an excellent interface that stimulates social coding, which makes it a great incentive for open source development.

2. Dropbox

Well, who doesn’t use Dropbox? Dropbox offers a file sharing service that is easy to use. There are two reasons for the success of Dropbox: it is easy to start with 2GB of free space, and it provides clients for almost all platforms.

1. Google Apps

Google is trying hard to get us all to work in the cloud, and their Google Apps service is their way to do so. They even have a free service for private use. Google Apps provides a full application suite including e-mail, calendar and docs. Gmail is massively adopted and has been a game-changer since its introduction in 2004.

Infrastructure-as-a-Service

3. OpenStack

When talking about IaaS, people immediately think about the Amazon AWS platform. But what if you don’t like their terms-of-service, or simply want to create something similar in your own data center? Enter OpenStack. If there is one DIY IaaS framework that has momentum, it is OpenStack. It is backed by no less than 144 companies, and best of all, it’s open source.

2. jclouds

If all the IaaS frameworks and providers and all their different APIs are giving you a headache, jclouds is the framework for you. The jclouds API provides an abstraction of the different cloud-specific implementations. Currently over 30 providers are supported, including all the usual suspects (Amazon AWS, OpenStack, Azure, etc.).

1. Amazon AWS

The undisputed number 1 of IaaS is of course Amazon AWS. Ever since its introduction in 2002, Amazon AWS has been the reference implementation for IaaS. It sets the standard, and the rest of the IaaS providers are merely trying to catch up. Amazon doesn’t sit still either: it constantly adds new services to its platform (and is slowly growing into a PaaS). It is available around the globe, with data centers on almost every continent.

Platform-as-a-Service

3. Heroku

Heroku is a fully hosted PaaS platform. It supports lots of languages, and it completely hides the infrastructure (servers, instances, etc) from your applications. It has a partnership with Facebook, creating the Heroku Facebook App Package, which enables quick development of Facebook apps. I think it is one of the best examples of a public PaaS.

2. CloudFoundry

CloudFoundry is being developed by VMware. After acquiring SpringSource back in 2009, this is the next logical step for them. CloudFoundry is positioned as the Open PaaS. While most PaaS solutions limit the choice of frameworks and infrastructure services, CloudFoundry tries to be open and extensible. And best of all, you can use the micro edition for development, the private (open source) edition in your own data center and the hosted edition as a public PaaS (or even a hybrid setup).

1. ???

We are seeing lots of development in this area with all different flavors of PaaS platforms and services. We have even built custom PaaS platforms for our customers based on the traditional application servers (JBoss, WebLogic, etc). But there is still lots of work to be done, before full stack solutions will be readily available. So, I think there is no number 1… yet.

And beyond…

2012 looks like it will be a good year for the cloud. I am very curious to see what the PaaS providers are going to bring to the table. Projects like OpenShift, CloudFoundry and Stratos are looking very promising, and I can’t wait to dive into them.

What are your top cloud services? Or which ones do you think will become the next big thing? Feel free to add them to the comments below.

Categories: Companies

Innovative Agile

Fri, 12/23/2011 - 17:01

My motto regarding innovation is: being a first mover is a strategic choice, moving fast isn’t. Agile and scrum can help you move fast, so how can they accommodate innovation?

Getting a view on innovation
When a company fills in a portfolio tool like a Boston Consulting Group (BCG) matrix, it gets a view on its product market combination (PMC) portfolio. You can tell a lot about the company and its business from how a BCG matrix (*) develops over time. One of the most fun things to watch, in my eyes, is the number of question marks turning into stars. A question mark is a PMC in a high-growth, low-share section (i.e. doing something relatively new); a star is a PMC in a high-growth, high-share section (i.e. being successful in doing new stuff).

The number of question marks in the portfolio illustrates the number of PMCs newly launched onto target markets. When you also consider the number of ideas that never become question marks, and turn this into a ratio, you get some idea of how innovative the company is. Add to this the number of question marks turned into stars and you really get a sense of outwardly successful innovation. I distinguish outward and inward innovation, because I believe that seeing a commercial hypothesis proved wrong is at least as valuable as seeing it proved right.
This doesn’t mean that innovation can’t be applied in other quadrants, just that it might not be the smartest thing to do when, for instance, handling dogs. In many cases adding features to cash cow products can be a brilliant strategy. Take razors, for example, where adding more blades, self-adjusting blades and other slight handling improvements constantly extends the product lifecycle.

Innovation flavours
To get an idea of how scrum could accommodate innovation, we first need an overview of the various sorts of innovation out there, such as “additive innovation”. In general there are four forms of innovation a company can venture into:

Incremental innovation: small improvements leading to slightly better results

Additive Innovation: adding product features, customization, new products in existing business lines

Complementary innovation: creating offerings that are new to the current business, but adjacent to current product lines

Radical innovation: doing completely new things, unknown to business and/or target markets.

Using agile to suit innovation
In the following section I will highlight a couple of alternative ways in which you could use agile and scrum mechanics to shape and facilitate these forms of innovation:

Incremental innovation:
1. Spend a retrospective, or a section of one, on product improvement; try to get in one small improvement each sprint. Maybe this will ease the team into providing more input for the backlog.
2. Agree with the team that every sprint every individual team member comes up with at least one idea to improve the product in some way.

Additive innovation:
1. Create spikes in sprints to prototype new features, validate these with customers;
2. Hold demo-like meetings with your target audience; ask them what they think about the product.

Complementary innovation:
1. Keep key options open, have stories worked out in two variations and do validate these paths in spikes and demo meetings;
2. Invest time in finding the appropriate product owner for the job. Market and customer knowledge is important here as the company is going to serve adjacent and different markets than before;
3. Take care that the entire DMU is taken into account in the story map. Also look at internal impact of providing new services and products, for instance customer service needs.

Radical innovation:
1. Make sure the innovation team is freed from all organizational gravity. Pull them away from status quo and peer paradigms;
2. Reserve time for existing teams to work on a free format project. This could be a once every month time box of a day for example. It can be whatever they would like, as long as results are made transparent. Let them use the same social objects as in scrum (boards, graphs, backlogs);
3. Take care that you have means to measure relevant metrics early on. Every addition should increase sales, market share and other relevant metrics. Use retrospectives to find root causes and steer through story map;
4. Keep all options open, incorporate A/B tests, multi-variant tests, prototypes, feature polls and so on. Sprint goals are hypotheses you would like to see validated.

What I love about scrum is that it is so lightweight and adaptable. On the scale from incremental to radical innovation, there is no step at which scrum can’t be adapted to accommodate innovation while continuing to move fast.


The above list is just a brain dump of what I quickly came up with. I am convinced that there are many more creative ways in which we could adapt agile and scrum practices towards innovation. Please view this blog as an open invite to share your thoughts on this subject.

PS: Merry Christmas and a very Happy New Year!

(*)A BCG matrix says nothing about profitability of the PMC, so market share in a growth market could be labeled as a vanity metric. The matrix also builds on the premise that you know a market to put question marks in. Sometimes however you don’t know what your market is going to be. Furthermore, the BCG matrix can be filled in numerous ways depending on how you define for example the market scope.

Categories: Companies

How to walk with spacewalk

Mon, 12/19/2011 - 04:20

Introduction

One of the most common challenges of managing the configuration of servers in your typical DTAP environment is, in my opinion, keeping all the involved hosts at the same level of configuration in terms of installed operating system packages and their configuration files. It really can be a pain. Failure to do so can lead to interesting situations where software produced by the project team does not run or perform on the acceptance and/or production environment, while it was running perfectly on the development and/or test servers.

Of course, there is the possibility of creating one golden virtualized image and passing it around your DTAP environment. However, this can introduce serious issues, for example when the company hosting your acceptance or production environment does not accept, for obvious reasons, an alien virtualized image on their precious server farm. By that time, the project has already been running for several months, the engineer who developed the golden image has left the project, and the documentation turns out to be insufficient to reproduce it.

This is where a Linux systems management solution like Red Hat Satellite can help you out. Since you need a Red Hat subscription for Satellite, this article will discuss the open source alternative called Spacewalk. Spacewalk is an open source Linux systems management solution. It is the upstream community project from which the Red Hat Network Satellite product is derived. Spacewalk manages software content updates for Red Hat derived distributions such as Fedora, CentOS, and Scientific Linux.

With Spacewalk you can deploy Linux systems over and over again, always in the same way (using kickstart), centrally manage the packages to be installed on a system and, last but not least, centrally manage the configuration files for each deployed system.

Sounds cool, I want this too!

So, enough about the theory: how does this actually work? To demonstrate, I have compiled the following cookbook. At the end of this cookbook you will have:

  • A 64 bit CentOS 5.7 server running spacewalk 1.5
  • Deployed a base 64 bit CentOS 6.1 vm using spacewalk
  • Deployed packages on the deployed system using spacewalk
  • Deployed configuration files managed by spacewalk to the deployed server

Prerequisites

Getting the VM up and running

I prefer to keep things lean and mean. For this blog post a minimal 64 bit CentOS 5.7 will be installed using the net installer. The following walkthrough provides you with a VM ready for Spacewalk to be installed.

Start your empty VM, booting from the attached CentOS net installer ISO. During installation, select the defaults for language and keyboard type or change them to whatever suits your environment. The installation method is, of course, HTTP. For the TCP/IP configuration, use whatever your local network needs for internet access.

Select a mirror service from the CentOS website.

Provide the web site name: my.fast.mirror.com
CentOS directory: path/to/5.7/os/x86_64

Click next on the welcome screen, choose to do a fresh install of CentOS.

Partition your disk to suit your needs. Important note regarding partitioning: this article assumes some defaults. Based on those defaults, two locations need sufficient disk space, which you may want to keep in mind while partitioning:

  • /var/satellite (5GB per distro)
  • /u01/app/oracle/oradata/XE (1GB per distro)

Network setup: configure as needed for your vm to fit in your network and to have internet connection.

Finalize the installation by selecting your timezone, entering your root password and deselecting all installation tasks, including the default selected “Desktop – Gnome”. Let the installer do its job; once the system has rebooted you have a fresh base 64 bit CentOS 5.7 VM available.

Preparing the system for Spacewalk

Spacewalk uses a database for its back-end administration; this can be either an Oracle (XE) or a PostgreSQL database. In this article we are going to use the Oracle 11g Express Edition (XE) database together with the Oracle 11g instant client. Transfer the RPMs to your VM and install them (as user root) using the following commands:
yum install --nogpgcheck oracle-xe-11.2.0-1.0.x86_64.rpm
yum install --nogpgcheck oracle-instantclient11.2-basic-11.2.0.2.0.x86_64.rpm
yum install --nogpgcheck oracle-instantclient11.2-sqlplus-11.2.0.2.0.x86_64.rpm
After installation start configuration by:
/etc/init.d/oracle-xe configure
Accept the defaults, choose passwords and specify that oracle-xe should start at boot. To avoid port conflicts later on in the article, it may be a good idea to specify an HTTP port other than the one suggested by default; this article assumes you use port 8888. You should now have a running Oracle XE, which can be checked by executing the following command:
ps -ef | grep pmon
which should be returning something like this:
[root@spacewalk ~]# ps -ef | grep pmon
oracle    1763     1  0 16:21 ?        00:00:00 xe_pmon_XE
root      3739  1957  0 16:56 pts/0    00:00:00 grep pmon
[root@spacewalk ~]#
The next step is to create a tablespace for Spacewalk to store its data. Start by loading the Oracle XE environment settings (note the space between the ‘.’ and the ‘/’):
. /u01/app/oracle/product/11.2.0/xe/bin/oracle_env.sh
Next, start an sqlplus session.
sqlplus sys as sysdba
Create a tablespace as follows:
create bigfile tablespace spacewalk datafile '/u01/app/oracle/oradata/XE/spacewalk.dbf' size 1G autoextend on;
Create a spacewalk database user and grant it the required privileges:
create user spacewalk identified by spacewalk default tablespace spacewalk;
grant dba to spacewalk;
Oracle XE comes with an apex based management console which can be reached at:
http://hostnameOfYourSpacewalkServer:8888/apex/f?p=4950
Navigate your browser to the URL mentioned above and check that the management console shows up. For future reference: Oracle XE can be stopped and started using the following commands:
service oracle-xe stop
service oracle-xe start

Install Spacewalk

Finally we have arrived at the point where Spacewalk is going to be installed. As user root, perform the following commands to acquire the required repositories:
rpm -Uvh http://spacewalk.redhat.com/yum/1.5/RHEL/5/x86_64/spacewalk-repo-1.5-1.el5.noarch.rpm
rpm -Uvh http://download.fedora.redhat.com/pub/epel/5/x86_64/epel-release-5-4.noarch.rpm
rpm -Uvh http://spacewalk.redhat.com/yum/1.5-client/RHEL/5/x86_64/spacewalk-client-repo-1.5-1.el5.noarch.rpm
The next step is to actually install Spacewalk (note: due to the speed of the Spacewalk repos this step may take up to 30 minutes to complete).
yum install spacewalk-oracle
Next, configure spacewalk by issuing the following command:
spacewalk-setup --disconnected
After you provide the setup program with the Oracle SID (XE) and the Spacewalk database username and password, the database is populated. For the rest of the setup program, mostly the defaults can be accepted and/or obvious data can be provided.

For future reference: Spacewalk can be started and stopped using the following commands:
/usr/sbin/spacewalk-service stop
/usr/sbin/spacewalk-service start
Check if the spacewalk server is up and running using the following url (note: you may get a certificate exception upon opening this page):
https://hostnameOfYourSpacewalkServer/
The first time you open this URL, a screen appears that allows you to create an administrative user.

Populate Spacewalk

The goal is to deploy a new machine with an OS, so the obvious next step is to populate Spacewalk with a Red Hat derived Linux distribution of your choice. In this article the 64 bit version of CentOS 6.1 is used.

The first step is to mount the CentOS 6.1 ISOs somewhere on your Spacewalk server. Make sure to get a full distribution ISO, so that required directories such as images/pxeboot exist. A minimal or netinst ISO of a distribution does not, in general, contain these directories. They are used later on in this article; most important at this stage is the content and location of the Packages directory on your distribution’s ISO.

The packages which belong to a distribution are administered in Spacewalk as software channels, so we first have to create a software channel before we can add/upload packages to it.

Create a new software channel by opening the spacewalk console and navigate to:

“Channels” -> “Manage Software Channels” -> “Create New Channel”

Enter a reasonable channel name (this is for display only, this article uses: “CentOS 6.1 – 64 Bit”), a channel label (remember this name for later use, this article uses: “centos6.1-x86_64”) and select the correct architecture (x86_64).

The next step is to populate Spacewalk with the CentOS packages. This process is started by issuing the following command:

rhnpush -v --channel=centos6.1-x86_64 --server=http://localhost --dir=/path/to/Packages

where “/path/to/Packages” is the absolute path of the Packages directory of the mounted iso.

CentOS 6.1 consists of two DVDs; execute the above step for both DVDs.
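
Repeating the upload for both DVDs can be scripted. The following is a minimal sketch, not a definitive procedure: the channel label and server URL are the ones used in this article, but the mount points /mnt/centos61-dvd1 and /mnt/centos61-dvd2 are hypothetical, and the function only prints the commands so you can review them first.

```shell
#!/bin/sh
# Dry-run helper: print one rhnpush command per mounted DVD.
# CHANNEL and SERVER come from this article; the mount points are hypothetical.
CHANNEL="centos6.1-x86_64"
SERVER="http://localhost"

push_commands() {
    # $@: mount points of the ISOs, each containing a Packages directory
    for mount_point in "$@"; do
        printf 'rhnpush -v --channel=%s --server=%s --dir=%s/Packages\n' \
            "$CHANNEL" "$SERVER" "$mount_point"
    done
}

# Print the commands for both DVDs; nothing is uploaded yet.
push_commands /mnt/centos61-dvd1 /mnt/centos61-dvd2
```

Once the printed commands look right, piping the output to sh would run the uploads one after the other.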

The rhnpush process uploads all packages and registers them in Spacewalk. On average, rhnpush processes around 2000 packages per 30 minutes (of course depending on the configuration of your host and VM). CentOS 6.1 contains almost 6200 packages, so it will take around one and a half hours to upload all packages from dvd1 and dvd2 to Spacewalk.
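
The estimate is simple arithmetic and easy to redo for another distribution. A small sketch, assuming the rate quoted above (2000 packages per 30 minutes) also holds on your system, which it may well not; the function name is made up for this example.

```shell
#!/bin/sh
# Rough upload time in whole minutes, at 2000 packages per 30 minutes.
estimate_minutes() {
    # $1: number of packages to upload
    echo $(( $1 * 30 / 2000 ))
}

estimate_minutes 6200   # prints 93, i.e. about one and a half hours
```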

Since we want the deployed Linux system to be able to connect to the Spacewalk server and use its package and configuration management facilities, it is recommended to include the Spacewalk client packages in a Spacewalk channel as well. In this article we will upload the packages directly from the online repository into a child channel of the just created CentOS channel.

In the spacewalk console navigate to:

“Channels” -> “Manage Software Channels” -> “Create New Channel”

Enter a reasonable channel name (this is for display only, this article uses: “Spacewalk Client 1.5 – el6 – 64 Bit”), a channel label (remember this name for later use, this article uses: “swclnt1.5-el6-x86_64”), the correct architecture (x86_64) and the correct parent channel (this article uses: “CentOS 6.1 – 64 Bit”).

Populating spacewalk with the spacewalk client packages directly from the online repository is started by issuing the following command:

spacewalk-repo-sync -c swclnt1.5-el6-x86_64 --url http://spacewalk.redhat.com/yum/1.5-client/RHEL/6/x86_64

The spacewalk client has a dependency on the python-hwdata-1.2-1.el6.noarch.rpm package from the epel repository. Download the python-hwdata-1.2-1.el6.noarch.rpm package from the epel repository ( http://download.fedora.redhat.com/pub/epel/6/x86_64/ ) and upload it to the spacewalk client child channel using the command (assuming you downloaded the rpm to a folder named epel):

rhnpush -v --channel=swclnt1.5-el6-x86_64 --server=http://localhost --dir=epel

Create a distribution

For automating the installation of a Linux system a method called kickstart can be used. First, we have to set up a directory structure on the Spacewalk server based on the content of the CentOS dvd1 ISO. From your CentOS 6.1 dvd1, copy the following directories:

  • images
  • isolinux
  • repodata

to:
/var/distro-trees/centos6.1-x86_64

Next, open the spacewalk console and navigate to the following location:

systems -> kickstart -> distributions -> new distribution

Enter the following parameters for the new distribution:

  • Distribution label: centos6.1-x86_64
  • tree path: /var/distro-trees/centos6.1-x86_64
  • Base Channel: CentOS 6.1 – 64 Bit
  • Installer Generation: Red Hat Enterprise Linux 6

Next step is to create a kickstart profile for the channel and distribution. Open the spacewalk console and navigate to the following location:

systems -> kickstart -> create new kickstart profile

Enter the following parameters for the new kickstart profile:

  • Label: centos61-minimal
  • Base channel: CentOS 6.1 – 64 Bit
  • Kickstartable tree: centos6.1-x86_64
  • Virtualization type: none

To make sure the spacewalk client repository is used during kickstart, navigate to the following location:

systems -> kickstart -> profiles -> centos61-minimal -> operating system

Make sure the child channel swclnt1.5-el6-x86_64 is checked.

Also, have a look at the other tabs to get an idea of the configuration options that are available. Possibly interesting areas are:

  • Software: Adding extra packages or package groups in addition to the base installation. Add a package just by putting its name on a new line; package groups can be added with an @-sign followed by the group name. A package can be excluded with a hyphen (-) followed by the package name.
  • Kickstart details -> Details -> Kernel options: Adding and removing kernel options. You can add a kernel option just by adding its key/value pair to the input field. Removal is done by mentioning the kernel option preceded by an ! and giving it ~ as a value. For example, the value “!text=~ resolution=800x600” in the kernel option box forces the use of the graphical installer (by removing the text kernel option) and sets the screen resolution to 800x600.
  • Kickstart details -> Advanced options: Allows detailed configuration of the kickstarted system. For example, to add a user named weblogic with password weblogic01 during installation, tick the “user” checkbox and add the value “--name=weblogic --password=weblogic01 --plaintext” to the input field.
  • Kickstart details -> Variables: variables are used by adding a key/value pair and referring to it in another tab. For example (it might be a bad example, but it is just to demonstrate the usage), to define the hostname during kickstart, add a key/value pair (hostname=appsrvr1) in the variables tab and refer to it in the Advanced options by adding “--hostname $hostname” to the network text box.

Let’s cobbler

The next step is to create an ISO image to boot a new VM from. Important note: in the next couple of steps we are going to deploy a new Linux virtual machine, and this VM must be able to find the Spacewalk server by its hostname during boot and initial setup. If your virtualization network setup provides a DNS in which the Spacewalk server can be found by its hostname, you can skip the next step. If this is not the case, or if you are unsure, perform the following step to change the Spacewalk hostname to its IP address:

In /etc/rhn/rhn.conf change the value of the parameter cobbler.host to the ip address of the spacewalk server.
In /etc/cobbler/settings change the value of the parameters server and redhat_management_server to the ip-address of the spacewalk server.
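
These two edits can also be scripted. The sketch below rests on assumptions you should check against your own files first: that /etc/rhn/rhn.conf uses "key = value" lines and that /etc/cobbler/settings uses "key: value" lines. The helper names and the IP address are made up for this example, and the demonstration deliberately works on scratch copies instead of the real files under /etc.

```shell
#!/bin/sh
# Sketch: point Spacewalk and cobbler at an IP address instead of a hostname.
# Assumed file formats: "key = value" (rhn.conf) and "key: value" (cobbler settings).

set_conf_value() {   # usage: set_conf_value FILE KEY VALUE
    tmp=$(mktemp)
    # Rewrite the line that starts with KEY, keep all other lines as they are.
    sed "s|^$2[[:space:]]*=.*|$2 = $3|" "$1" > "$tmp" && cat "$tmp" > "$1"
    rm -f "$tmp"
}

set_yaml_value() {   # usage: set_yaml_value FILE KEY VALUE
    tmp=$(mktemp)
    sed "s|^$2:.*|$2: $3|" "$1" > "$tmp" && cat "$tmp" > "$1"
    rm -f "$tmp"
}

# Demonstration on scratch copies, so nothing under /etc is touched.
demo=$(mktemp -d)
printf 'cobbler.host = spacewalk\n' > "$demo/rhn.conf"
printf 'server: spacewalk\nredhat_management_server: spacewalk\n' > "$demo/settings"

SPACEWALK_IP="192.168.1.10"   # hypothetical address of the Spacewalk server
set_conf_value "$demo/rhn.conf" cobbler.host "$SPACEWALK_IP"
set_yaml_value "$demo/settings" server "$SPACEWALK_IP"
set_yaml_value "$demo/settings" redhat_management_server "$SPACEWALK_IP"
cat "$demo/rhn.conf" "$demo/settings"
```

To apply this for real you would call the helpers as root on the two files named above. It is probably also necessary to stop and start Spacewalk afterwards so the new values are picked up.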

On the spacewalk server, run the command (this only needs to be done once):

cobbler get-loaders

Next, start building the iso using the command:

cobbler buildiso

The result of the buildiso command is a file named generated.iso in the directory from where you issued the command.

Let’s kickstart

On your host, create a new virtual machine and provide it with the generated.iso file to boot from. Upon boot you will see a menu allowing you to specify the centos61-minimal setup to be installed.

Select this entry and the setup will install a base 64 bit CentOS 6.1 Linux system. If all goes well, this will happen completely automated, without any user intervention whatsoever. If, during install, you receive messages like “Error downloading kickstart file”, this probably means you have to look into dns issues as described earlier in the article.

Verify that the system registered itself in Spacewalk: it should appear in the Systems tab on the main screen of the Spacewalk web console.

Configuring the client

Now that we have installed a fresh 64 bit CentOS 6.1 Linux vm we have to configure it as a client for spacewalk. Open an ssh session to the newly deployed CentOS 6.1 vm and install the packages rhncfg-client and rhn-check using yum.
yum install -y --nogpgcheck rhncfg-client rhn-check

Managing the configuration of this newly created vm can be done in the following two ways:

  • Deploy new packages to the client
  • Deploy (configuration) files to the client

Deploy new packages to the client

To install a new package from the repository to the new server, go to the spacewalk web console and navigate to the following location:

system -> “your system” -> Software -> Packages -> Install

Select the required package from the repository (for example xauth) and click on “Install Selected Packages”

Next, select “Schedule action as soon as possible” at the confirmation screen and click on “Confirm”

Now, log on to the client and verify the software channels it is subscribed to by executing:

rhn-channel --list

Check that the channel where you made the pending change is in the list. Next, verify that the selected package is not installed yet by executing, on the client:

[root@appsrvr1 ~]# rpm -qa | grep -i xauth
[root@appsrvr1 ~]#

If the package is not installed yet, apply the pending change (installation of the package) by executing:

rhn_check

The client will check for any pending actions (in this case installing the selected package) and execute them. Now, check again to verify that the (xauth) package was installed by executing:

[root@appsrvr1 ~]# rpm -qa | grep -i xauth
xorg-x11-xauth-1.0.2-7.1.el6.x86_64
[root@appsrvr1 ~]#

Deploy (configuration) files to the client

Configuration files of a Linux system are managed in Spacewalk through configuration channels.

First of all, create a new configuration channel. Open the spacewalk web console and navigate to the following location:

Configuration -> Configuration channels -> create new config channel

Enter information to identify the config channel:

Name: My Config Channel
Label: myConfigChannel
Description: My Config Channel

Next step is to populate this channel with files and directories by navigating to the following location:

Select the configuration channel -> add files -> create file

Now you can create files, directories and symlinks, set ownerships and file permissions. In case of creating a file it is possible to add the actual content of the file in the inline editor. Click on “Create Configuration File” to finalize this action.

To deploy this file to the managed linux system, this system must first be subscribed to the config channel. In the spacewalk web console, navigate to the following location:

systems -> “your system” -> configuration -> manage configuration channels -> subscribe to channels

Next, verify if the client is successfully subscribed to the newly created config channel by executing the following command on the client:
[root@appsrvr1 ~]# rhncfg-client channels
Using server name spacewalk
Config channels:
Label Name
----- ----
myConfigChannel My Config Channel
[root@appsrvr1 ~]#

If the channel appears in the output of the previous command you can get those files (or directories) by issuing:

[root@appsrvr1 ~]# rhncfg-client get
Using server name spacewalk
Deploying /opt/oracle
Deploying /opt/oracle/middleware
Deploying /opt/oracle/middleware/jrockit
[root@appsrvr1 ~]#

If you want to verify if there is a delta between your system and the config channel you can do so by executing
rhncfg-client diff

Conclusion

As usual with this kind of system, it takes a lot of effort upfront to set it all up. With this article I hope to help the reader set up a Spacewalk system relatively easily and quickly. Hopefully, the reader will soon realize that managing Linux systems now really is a breeze and that all the effort of setting it up was worth it. In my opinion, as of version 1.5, which is current at the time of writing this article, stability and functionality have increased a lot since I started working with Spacewalk. If you’re looking for a way to manage your Red Hat derived Linux systems, I highly recommend taking a look at Spacewalk.

Categories: Companies

Continuous Delivery for Enterprise Java Applications

Wed, 12/14/2011 - 14:00

How do you set up an environment that supports the continuous delivery of enterprise Java applications? How do you manage the large number of machines that are involved? How do you enable self-service, continuous delivery of applications onto the platform?

In this blog post we describe an open source Java Application Platform as a Service that we created for our customer, using VMware, Red Hat Enterprise Linux, Apache WebServer, JBoss Enterprise Application Platform, JBoss Operations Network, Puppet, Deployit, an F5 Load Balancer and a Layer7 SecureSpan gateway.

Data Center Quality Platform

The customer wanted a data center quality Java Application Platform with the following features:

  • Standard configuration
  • Standardized provisioning
  • Standardized deployment
  • Centralized monitoring
  • Centralized access control
  • Virtual environment
  • Proving technology

Current situation

As the current Java application platform was based on HP-UX on Itanium, the customer was facing high costs for hardware and software licenses, and fading support from software vendors. As all applications ran on an HP Superdome, it was very difficult to add resources to individual applications. In addition, development teams spent too much time taking their software through the development, test and acceptance environments, resulting in slow delivery of software into production. Finally, it was difficult to provide 24×7 availability because all applications were running on a single machine.

Java Application Platform

The following figure illustrates the solution architecture of the Java application platform.

[Figure: JAP solution overview]

In the following paragraphs we will describe the purpose of the most important components.

Dual Data Center – HP

Not shown in the figure is the hardware setup of the platform. It consists of HP blades set up in two data centers at two different locations. This provides the basic infrastructure for 24×7 availability and fault tolerance.

VMware ESX

VMware ESX is deployed on top of the hardware in the dual data center. This provides us with the ability to create virtual machines and provide high availability in case of single server or single site failures. It also allows us to quickly scale up virtual machines and increase the resources assigned to individual virtual machines.

For all machines in the platform we use a single VMware template image. This image is installed with RedHat Enterprise Linux and a puppet client.

Puppet

[Figure: VM template]

Puppet fully automates system management. It is used for the installation of software packages, conformity tests and day-to-day system administration tasks. For every type of node, we have a Puppet plan. When a machine boots, the Puppet agent provisions it with all the necessary software and configuration according to the plan for that machine.

The use of Puppet completely automates and standardizes the configuration, ensures 100% reproducibility of the configuration and is fast. Provisioning of a new machine from the template to full operational mode is done in a matter of minutes.

JBoss EAP

JBoss Enterprise Application Platform is the Enterprise Java application server for all Java applications. The installation and configuration is done by Puppet and uses the official Red Hat RPMs. Puppet configures JBoss to ensure that:

  • JBoss management applications authenticate users against Active Directory, providing a single point of authorization for operations.
  • A JBoss Oracle database schema is automatically provisioned for that specific instance of JBoss, providing persistence for the JBoss server system state.
  • All Business Applications can authenticate users using SAML against the Layer7 Identity provider, providing a single point of authentication and authorization for their customers.
  • The JBoss instance is added to the pool in the F5 Load balancer
  • The application server is added to the Deployit infrastructure inventory, providing the tenants of the platform with the ability to deploy applications to the server.

JBoss application servers are always deployed in multiples of two, where each server of a pair is assigned to a physically different data center location by VMware.

The use of puppet provides us with a fast and reproducible way of provisioning JBoss application servers, allowing for a fast and reliable scale out mechanism for the applications.

JBoss Operations Network

JBoss Operations Network (JON) is used for monitoring all the resources in the platform.   By default, Puppet installs a JON agent on every machine. This agent scans the inventory of the machine and reports it to the JON server.

JON has very good support for high availability and failover. When a JON server machine is added, agents automatically distribute themselves across the servers and fail over if necessary. Each JON server also runs a JON agent, making sure that unavailability of a JON server is also covered.

In JON we created a number of alert templates for different resource types (os, apache, jboss, jon, puppet, etc.) that monitor and report critical conditions on the system. All error messages from the JBoss server logs are reported as incidents.

All alerts and clearing conditions from JBoss Operations Network are reported via SNMP to TNG Unicenter.

Through the use of JBoss Operations Network all machines, servers and resources in the platform are automatically added to the centralized monitoring system.

Deployit

Deployit is used for the automated deployment of applications onto the platform. It automatically deploys all the application components in a stack to the appropriate containers. Deployit:

  • deploys static content and proxy configuration to the apache webservers,
  • deploys enterprise java application components to all individual JBoss servers in the farm,
  • executes SQL scripts to the database,
  • configures the F5 loadbalancers to add or remove servers or applications to the pool,
  • applies environment specific changes to the application configuration.

The deployment plan for a specific application is prepared in close cooperation between the application developer and platform management staff. When the deployment plan is finished, developers can deploy new versions of the application themselves, directly from a build tool or manually. This ensures that solving installation or configuration problems isn’t postponed until the application is installed for production use, but happens at an early stage of development.

The same deployment plan is used for all environments. Authorization can be configured per environment and per application. LDAP is used to authorize software developers to deploy and configure an application for development and testing purposes, while integration specialists can deploy the application in production.

The use of Deployit provides the platform with a fully automated and standardized deployment mechanism, improving the speed of deployment of applications through the development, test and acceptance environments while reducing the number of staff involved and lowering the number of configuration errors.

F5 Load Balancer

The F5 Load Balancer is used to support scalability and failover for the JBoss Application Server farm.

The pools are configured to use a sticky session protocol based upon the JSESSIONID session cookie. If the cookie is not present, round-robin load balancing of the HTTP requests is performed.
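The routing rule described above can be illustrated with a small model. This is only a sketch in Python, not F5 configuration: the `StickyPool` class and the server names are hypothetical, and it assumes that a request carrying a not-yet-seen JSESSIONID is assigned round-robin and remembered from then on.

```python
from itertools import cycle


class StickyPool:
    """Toy model of a pool with JSESSIONID stickiness (hypothetical, not F5 code)."""

    def __init__(self, servers):
        self._round_robin = cycle(list(servers))
        self._sessions = {}  # JSESSIONID value -> server name

    def route(self, cookies):
        """Return the server that should handle a request with these cookies."""
        session_id = cookies.get("JSESSIONID")
        if session_id in self._sessions:
            # Known session: stick to the server that owns it.
            return self._sessions[session_id]
        # No cookie, or an unknown one: fall back to round-robin.
        server = next(self._round_robin)
        if session_id is not None:
            self._sessions[session_id] = server
        return server


pool = StickyPool(["jboss1", "jboss2"])
first = pool.route({})                       # no cookie: round-robin
second = pool.route({})                      # no cookie: next server
sticky = pool.route({"JSESSIONID": "abc"})   # new session, remembered
again = pool.route({"JSESSIONID": "abc"})    # same session, same server
```

Requests without the cookie alternate between the servers, while every request for session `abc` ends up on the same server.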

Puppet adds the JBoss servers to the pool in the F5 Load Balancer.

When a server is scheduled for a restart, the server is taken out of the pool. This ensures that this server does not get any new request, but will still be servicing existing sessions. When the session count in JBoss drops to zero, the server is restarted and restored to the pool.
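The restart procedure above amounts to a small state machine: remove from the pool, wait for the session count to reach zero, restart, restore. The following Python sketch models that flow. The `Farm` class and its method names are hypothetical, and the actual restart is replaced by appending to a list.

```python
class Farm:
    """Toy model of graceful server restarts (hypothetical names)."""

    def __init__(self, servers):
        self.pool = set(servers)                 # servers receiving new requests
        self.sessions = {s: 0 for s in servers}  # active sessions per server
        self.draining = set()                    # out of the pool, waiting for zero sessions
        self.restarts = []                       # stand-in for the real restart action

    def schedule_restart(self, server):
        # Take the server out of the pool; existing sessions keep being served.
        self.pool.discard(server)
        self.draining.add(server)
        self._restart_if_idle(server)

    def session_ended(self, server):
        self.sessions[server] -= 1
        self._restart_if_idle(server)

    def _restart_if_idle(self, server):
        if server in self.draining and self.sessions[server] == 0:
            self.restarts.append(server)
            self.draining.discard(server)
            self.pool.add(server)                # restore to the pool after the restart


farm = Farm(["jboss1", "jboss2"])
farm.sessions["jboss1"] = 2
farm.schedule_restart("jboss1")   # leaves the pool, still has 2 sessions
farm.session_ended("jboss1")      # 1 session left, no restart yet
farm.session_ended("jboss1")      # 0 sessions: restarted and back in the pool
```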

The use of the F5 Load Balancer provides us with the ability to increase and decrease the number of servers in the farm, provide load balancing, fail over and graceful decommissioning of servers in the farm.

Layer7 XML Gateway / Identity Provider

The Layer7 SecureSpan Gateway is used as centralized security policy enforcement point and SAML identity provider.

Layer7 supports multiple authentication methods (Kerberos, digital certificates, username and password) and is able to use multiple identity stores.

Puppet configures all JBoss application servers with SAML support and configures Layer7 as the identity provider: JBoss receives authentication (identity) and authorization (roles) information as a SAML token. The information contained in the token is translated to a standard JEE principal (using a tiny layer of custom code), so all JEE applications can access the authentication and authorization information in a standard way. From a security perspective, it makes no difference whether the JEE application is a web application or provides web services. All application designers have to do is declare the application security roles in accordance with the JEE standard.

The use of Layer7 standardizes the authentication and authorization for all business applications and centralizes access control.

Conclusion

The customer wanted a modern, data-center-quality Java Application Platform to ensure that Java applications could be deployed at lower cost, with high availability and easy scalability.

VMware, the dual data center, Layer7, the F5 Load Balancer and JBoss provide the infrastructure for high availability and scalability of any Java application. The combination of VMware, Puppet and Deployit is the fabric that enables continuous delivery of enterprise Java applications.

Through virtualization and automated provisioning and deployment it has become possible to add a completely new, correctly configured machine to a cluster in a matter of minutes, completely secure and under full monitoring.

Categories: Companies

It’s alive dr. Frankenstein!

Thu, 12/08/2011 - 22:06

A walking skeleton as meant in scrum is not always feasible. That’s the first sentence of one of my previous blogs. This one starts the same but approaches the subject from a different angle. The angle here is that we teach people to make story maps based on personas (the user, the administrator and so on), but we don’t actually take into account that the product has to be bought by someone, or how that decision actually works. This blog post tries to tie complex buying decisions into story mapping, to find the shortest route to a sellable Frankenstein, rather than a mere bag o’ bones.


Imagine buying a new TV. Who has a say in that process? Ask yourself who goes out to buy it? Who is going to tell you it should be suitable for 3d-gaming? Who is going to decide the size, shape and look-and-feel that will determine the fit with the rest of the interior? Who is telling you it’s going to be bought at store x or maybe online?

All these questions could, and probably will, play a role in the consumer buying process of complex products and therefore ultimately decide whether product “a” or competitor “b” is bought. Not all of these questions are necessarily asked and answered by the same person(a). Most buying decisions of a complex nature are made by a so-called decision making unit, or DMU.

Decision Making Unit theory identifies a number of roles in buying decisions, such as the:

- initiator: identifies the problem or need to solve;
- gatekeeper: regulates information for the decision;
- influencer: influences the decision;
- buyer: buys the solution;
- decider: decides what product to buy;
- user: actual user of the product.

We build storymaps based on pragmatic personas, which are primarily users. Selling our product, however, as decision making theory shows us, may mean tactically taking into account the entire decision making unit, not only the users. I encounter a lot of releasable walking product skeletons that are neither shippable nor marketable.
Too thin a walking skeleton means it is releasable, but no one will buy it. Too fat a walking skeleton means putting in too much and missing crucial market windows.

A sellable increment, which if possible is of course also a first release, means fleshing out your skeleton just a bit more, making it more of a sellable Frankenstein than just a bag of bones.

A sellable increment or marketable feature set is, in my opinion, your Kano threshold attributes plus a well thought-out and implemented set of performance attributes, resulting in the product being a viable option in the customer’s consideration set of alternatives. Next to that, we need a good USP, or a unique set of these, setting your product apart from all others. All of these product properties, or a specific set depending on your product marketing strategy, will need to be translated to DMU needs, to be able to hone the product for sales.

My advice would be to start thinking about the DMU and how this works for your product. Get marketing involved and see what they already know about the composition of the DMU. Create some personas and try to involve them somehow in demos and future plans to get the feedback you need.

Categories: Companies

State of the union of html5 in the mobile revolution

Mon, 12/05/2011 - 18:47

Being relatively new to html5 and mobile development, I spotted an excellent opportunity to catch up with the latest trends during the QCon conference in San Francisco, where they offered a wide variety of html5 and mobile tracks.

In this blog I’ll share the insights I gained during the conference. After reading it you should have an overview of the following:

  • where html5 is right now and where it is heading to with regard to mobile development
  • the benefits and drawbacks of html5 for web-apps compared to native apps
  • how to bridge some of the shortcomings of html5 with regard to native apps
  • valuable pointers to resources helping you to get started with html5 mobile development

Overall impression

Having followed several sessions featuring html5 and mobile, it felt like riding on a tremendous highway that is heavily under construction. I got a notion of what the final result is going to look like, but in the meantime we have to drive around obstacles, deal with changing detours, risk ending up in dead-end streets and continuously worry whether the chosen vehicle, our browser, can handle the freshly paved lanes (html5 standards). Nevertheless, it became evident that the web is moving fast and great new possibilities are waiting to be exploited.

From html4 to html5: the paradigm shift

The classic web architecture for html4 applications is based on a document centric approach, where client browsers request html-documents from the server and simply render them. The server handles all the conversational state and business logic whereas the client is nothing more than a dumb render engine.

Html5 on the contrary offers a truly stateful runtime environment. This allows for a shift from a document centric request/response approach to an application centric one, where data is synchronized when there is connectivity rather than continuously requested from the server. In other words, the html5 runtime offers support for rich clients that can operate in a standalone, disconnected fashion.

The classic web architecture is neither particularly outdated nor bad. There are definitely cases where stateless clients make sense. However, by supporting stateful features html5 gets much closer to native apps, which are stateful by nature. Statefulness opens a wide range of new possibilities and applications, all within reach of the web developer. Let’s briefly touch on the stateful features of html5.

Html5 and state

In the section below I will describe the most important stateful features html5 offers, which are crucial for mobile devices to be connectivity agnostic:

Application Cache

The application cache feature of html5 allows for files to be pre-fetched from a server. It is similar to browser caches, with the significant difference that you can fetch any file of choice eagerly. A cache manifest file is used to configure a selection of files that need to be pre-loaded regardless of whether they will be used by the markup or not.

WebStorage

WebStorage is an easy API to store small amounts of textual information locally in a name/value pair format. This offers a wide range of possibilities. Imagine you have a wizard consisting of several pages. All intermediate data can easily be stored by means of the WebStorage API. Furthermore, imagine you have completed the wizard and want to submit the form but there is no connectivity. With WebStorage you simply save the form and show the user a message that the form content will be submitted when connectivity has been reestablished.

Connectivity and Offline/Online Events

The state of the current connectivity can be verified with offline/online events. These events go hand in hand with WebStorage. Instead of relying on connectivity, application state, such as form data, could always be stored locally prior to sending. Before sending we check the connectivity status. In case there is no connectivity we wait for an online event to happen before the state is synchronized with the server.
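The store-first, send-later flow described above can be sketched independently of the browser. The example below is a Python model of the control flow only. In a real web-app the pending list would live in WebStorage and the two handlers would be wired to the browser's online and offline events. The `OfflineQueue` class and its method names are hypothetical.

```python
class OfflineQueue:
    """Model of 'store locally first, synchronize when online' (hypothetical names)."""

    def __init__(self, send):
        self._send = send     # callable that submits one item to the server
        self._pending = []    # stands in for WebStorage
        self.online = False

    def submit(self, form_data):
        # Always store locally prior to sending.
        self._pending.append(form_data)
        if self.online:
            self.flush()

    def on_online(self):
        # Called when connectivity is (re)established.
        self.online = True
        self.flush()

    def on_offline(self):
        self.online = False

    def flush(self):
        # Remove an item only after it has been sent successfully.
        while self._pending:
            self._send(self._pending[0])
            self._pending.pop(0)


sent = []
queue = OfflineQueue(sent.append)
queue.submit({"step": "wizard done"})   # offline: kept locally
queue.on_online()                       # online event: pending data is synchronized
```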

File API

The File API allows reading local files and directly assigning the content to an html element or rendering it on a canvas.

IndexedDB

IndexedDB is a lean object store, which stores data in the form of key(s)/object pairs. All keys are indexed. Queries can only be performed on keys not on the object itself. Compared to the WebStorage API, IndexedDB should facilitate bigger datasets, which can be stored and queried locally.

Current support for html5’s stateful features

Almost all current mobile browsers support ApplicationCache, WebStorage and Connectivity events, whereas the File API is still a work in progress. IndexedDB is heavily under construction, so it is not surprising that almost no mobile browser supports it.

With the stateful html5 features available, a great variety of mobile applications can be built that operate in a standalone mode and therefore no longer depend on connectivity.

Nevertheless, especially older mobile browser versions will lack the above and other html5 features, making your html5 application useless for such device groups. Currently, there are several remedies to cope with this problem.

Dealing with cross-browser/device compatibility issues

Currently, the following remedies are available in order to deal with missing html5 features:

Feature detection

If you want to check beforehand which features a certain browser supports, have a look at CanIUse.

To verify the availability of html5 features at runtime, Modernizr is the best library choice. Modernizr offers a straightforward API through which all html5 traits can be tested. Depending on whether a feature is available, appropriate action can be taken, such as using a polyfill alternative.

Polyfills

Polyfill libraries are fallbacks in case an html5 feature is not supported. A good example is socket.io, which provides support for realtime (push) communication. The html5 answer to push is WebSockets. Because not many mobile browsers support WebSockets yet, socket.io bridges this gap by providing other communication mechanisms, such as long-polling, in case WebSockets are not available.

There is a vast choice of polyfill libraries, which are preferably used in combination with Modernizr. Should Modernizr find that a certain feature is missing, a polyfill counterpart can be loaded.
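The decision logic behind "detect a feature, load a polyfill if it is missing" is simple enough to sketch. The Python function below only models that decision and is not the API of Modernizr or any polyfill library. The function name, the feature names and the mapping are assumptions made for illustration.

```python
def plan_polyfills(required, supported, polyfills):
    """Return the polyfills to load for required features the browser lacks.

    required  -- feature names the app needs
    supported -- feature names detected in the browser
    polyfills -- mapping from feature name to a polyfill that can replace it
    """
    missing = [feature for feature in required if feature not in supported]
    unavailable = [feature for feature in missing if feature not in polyfills]
    if unavailable:
        raise LookupError("no polyfill for: " + ", ".join(unavailable))
    return [polyfills[feature] for feature in missing]


to_load = plan_polyfills(
    required=["websockets", "localstorage"],
    supported={"localstorage"},
    polyfills={"websockets": "socket.io"},
)
```

Here only WebSockets are missing, so `to_load` contains just the socket.io fallback.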

Native look and feel

Html5 does not offer OS native look and feel out of the box. However, there are some great javascript frameworks/libraries available that offer native widgets and touch support.

Pros and Cons of Html5 Web-Apps vs. Native Apps

With the features html5 offers combined with polyfill libraries we are able to write ‘decent’ web-apps, which can run standalone, use some device capabilities etc. However, competing with all the possibilities native apps offer is out of the question. Nevertheless, html5 also offers great advantages native app development can only dream of. Let’s have a closer look at the differences.

The pros and cons of html5 web-apps compared to native apps are as follows:

html5’s Pros
  • One stack to rule them all:
    Use one technology stack (html5, css, javascript) for all platforms.
  • Faster time to market:
    Because only a single technology stack needs to be mastered, expanding to other platforms is not that hard. However, be aware that ‘write once, run everywhere’ is still an illusion for a complex html5 web-app. Various projects have proven that getting a complex html5 web-app to work on all browser platforms can still be a considerable investment.
  • Reuse of skills:
    A single team of web developers can write web-apps for all platforms. Even though it can be expensive to make an html5 web-app run on all platforms, it will probably still be cheaper than a native approach. The reason is that one web developer team can support all platforms instead of having a native developer team for each one. Also costs for maintenance and expansion towards new platforms can be much lower than a native approach.
  • Less Code:
    Having only one stack to master will result in less code. Less code means less maintenance, fewer bugs, less complexity and therefore lower costs.
html5’s Cons
  • Limited access to device functionality:
    Most html5 browsers do not support access to device functionality such as camera, filesystem, contacts, SMS, gyroscope, cross-app messaging etc. Proposals are available but actual implementations are not within reach yet.
  • No monetization and distribution:
    Html5 web-apps cannot be distributed and monetized, which means they cannot be offered in an app-store. For companies who want to make a living with apps this shortcoming is a serious issue.
  • Worse render performance:
    Html5 cannot match native performance since it does not yet have direct access to the hardware render capabilities of the device. Simple renderings, such as scrolling a list, nowadays feel native. Complex renderings in html5, however, can’t compete with their native counterparts.

Depending on what kind of app you want to write, the list of drawbacks can leave you with no other option than going native. When render performance is your killer feature, the odds are bad: native is your only choice.

The limitations imposed by html5 for limited access to device functionality and monetization/distribution, however, can be bridged. The answer is: hybrid apps.

From Web-app to Hybrid app

What are hybrid apps? Hybrid apps are html5 web-apps that do not run in a browser but instead in a native wrapper. A native wrapper can be seen as an embedded browser. Such a wrapper is launched within a native app giving the user the impression that he/she is interacting with a traditional app.

To make this work all that is needed is a very thin native code layer that embeds a headless browser wrapper in the app, which loads the html5 web-app at startup and runs it as a normal browser would do. Moreover, this thin native layer can bridge almost all html5 gaps, providing access to device functionality, such as camera, SMS, contacts etc.

There are several hybrid projects out there: PhoneGap (Apache Callback), AppMobi, NimpleKit etc. The most popular one is probably PhoneGap, which has been donated to Apache. It is now an Apache project called Callback in the incubator phase. In the coming section I’ll touch on PhoneGap in order to give you an idea of its tremendous potential for mobile development with html5.

A popular hybrid app stack: PhoneGap (Apache Callback)

PhoneGap is fully open source. It is not only widely used but also widely supported by major vendors such as Adobe, IBM, Microsoft, RIM etc. By now there are already ~15,000 PhoneGap apps available in app-stores for various platforms.

How it works

Writing applications with PhoneGap is very straightforward. In essence you write a traditional html5 application including a reference to a phonegap js library in your html. Upon completion you can use PhoneGap’s build service, which wraps the html5 web-app in all native wrappers of choice. In addition, it makes the resulting hybrid-app available in the app-store where it can be downloaded on the corresponding device for use or testing.

Development with PhoneGap

For development convenience and short roundtrips, using emulators is desirable. Most major IDEs have a PhoneGap plugin, which allows for quick deployment in the SDK emulator. Other emulators are WebKit and Ripple. I saw the emulators working for XCode and Eclipse. Many more are available here.

During an impressive demonstration it took people from Adobe less than a minute to make a html5 application available on an iPad and an Android device. Weinre was used to remotely debug the application that was running on a real device. The same debugging technique can be used when running the application in the native SDK emulator.

Access to device functionality

Because the html5 standards for accessing device functionality are leaky and still heavily under construction, PhoneGap chose to write their own PhoneGap javascript API, which bridges all gaps. There are APIs for Sensors (GPS, Accelerometer, Compass, Network and Camera), Data (Contacts, Media, File system, Notifications) and Events.

PhoneGap and other JS-Libraries

Besides PhoneGap’s own API to access device functionality, all commonly used javascript APIs can be used, such as JQuery Mobile and Sencha Touch for GUI, or JQuery, XUI and Zepto for DOM. PhoneGap does not impose any restrictions on these third party JS libraries. You only have to make sure that the chosen library works within the embedded native browser.

Access to native code

In case PhoneGap does not bridge all the ‘gaps’ for you it offers a plugin architecture, where you can write your own PhoneGap plugins. With plugins you get direct access to the underlying platform with all the possibilities thereof. However, platform specific knowledge will be required.

So all in all, hybrid approaches such as PhoneGap are a great alternative to native apps. They offer the best of both worlds: a single technology stack, reuse of skills, shorter time to market and access to device specific capabilities.

Overall Conclusion

Polyfills and hybrid apps prove that html5 is still far from being mature. However, it is moving fast. The Html5 objective of delivering a native user experience for all platforms with one technology stack is not reality yet but frameworks such as PhoneGap do a good job in simulating this utopia.

Html5 and mobile have just taken off. Even though the two will still have to grow on each other, the future looks promising with a great variety of new possibilities awaiting us. My advice is: board the ship to witness the web’s next (r)evolution that will rock this planet ;-)

Acknowledgements

The following presentations from QCon have served as input for this blog:

Categories: Companies

Taking Application Release Automation to the Next Level

Tue, 11/29/2011 - 20:54

Whether the driver is Agile, Cloud or DevOps1, or a “plain old” efficiency drive or process improvement initiative, forward-thinking organisations are currently looking for ways to improve their application release processes through automation. In an area where manual activities are still all too common, it’s unsurprising that the initial focus has been on automating the deployment execution – moving all the bits to the right places.

What early adopters have learnt is that, at the enterprise scale, automating release execution quickly introduces a new bottleneck in today’s dynamic IT environments: continuous management of the deployment plan definition. A new generation of application release automation (ARA) tooling avoids this pitfall by leveraging intelligence to automate deployment planning as well as execution.

The State of ARA

Given how strongly our IT industry is dedicated to the automation of processes, it is nothing short of amazing how much of the deployment of our own solutions depends on manual actions coordinated more or less effectively between large groups of people.
Indeed, a recent analyst report noted that the majority of large enterprises were still relying on manual application release processes or on in-house scripting understood by only a small number of specialists, operated as a black box that – hopefully – will do its job and will most likely necessitate a painful troubleshooting session if it doesn’t.

With key IT trends such as Agile, Cloud and DevOps dramatically ramping up the frequency of application releases in order to increase responsiveness to business needs and provide more and better services to customers, it’s clear that this situation cannot continue.

Thinking of today’s common release processes, it is also hardly surprising that the initial drive has been to automate the actual rollout of the application itself: copying the files to the target machines, restarting the servers, running the SQL against the DB etc. Using a defined workflow to organise these activities makes lots of sense: fewer failures, no more missing steps or steps executed in the wrong order, no more typos, better visualization etc.

Lessons from the First Generation

Ironically, using one of these first generation ARA tools at an enterprise scale quickly made it obvious to early adopters how much effort is required to maintain the substantial number of workflow definitions that quickly accumulate to support full deployments, partial upgrades, rollbacks, environment scale-outs etc. across an enterprise application portfolio.

Of course, this is not a new challenge: ask anyone who has had to update 100 build job definitions in a continuous integration tool to change a compilation parameter, or 100 test plans in an automated testing setup to accommodate a different target browser, just how time-consuming and error-prone this type of maintenance is.

It’s not as though these modifications are unique per process. They tend to be systematic changes that reflect changes to the overall deployment strategy and/or context. The issue is that these first-generation tools, where all the deployment intelligence is stored in the power user’s brain, simply do not have enough internal knowledge of the structure of deployment to assist effectively.

The Next Level of Application Release Automation

A new generation of Application Release Automation includes this intelligence. These advanced tools2 no longer require hand-holding by your power users every step of the way, and encode knowledge of deployment best practices and strategies to automate the planning and execution of deployments.3

No pre-provided strategy can be a 100% fit in an enterprise environment, so the strategies must be configurable, of course. Once your power users have fine-tuned them, however, all the individual deployment plans (initial installations, full and partial upgrades, downgrades, undeployments, scale-outs etc., easily hundreds across an application portfolio) are automatically tailored to your scenario.

This becomes even more efficient the fewer deployment strategies are in play, so these tools also motivate and reward increased standardisation of deployment procedures, in itself a valuable business goal. In fact, with a suitable interface and integrations into the development pipeline you essentially have an enterprise Platform as a Service, potentially on a private or hybrid cloud.

Conclusions

With adoption of Application Release Automation rapidly on the increase, a new generation of solutions is appearing that automates deployment planning as well as execution.
Based on the challenges experienced in scaling the first generation of ARA tools to enterprise levels, these next generation solutions are designed to eliminate “continuous expert hand-holding”, promote standardisation and allow organisations to create a “software factory” that continuously delivers business value.

Footnotes

  1. My spelling preference is still for “Devops” since the whole point is, after all, that Dev and Ops are no longer regarded as separate, but hey… ;-)
  2. like XebiaLabs’ Deployit
  3. This advance closely mirrors the development of build frameworks from tools like Ant to today’s industry standards like Maven and on to the next generation of Gradle, SBT and others.

Categories: Companies

Twitter data fun

Tue, 11/29/2011 - 15:46

I made a map of my followers on Twitter. This is not entirely straightforward, as most Twitter users don’t attach geo coordinates to their tweets or profiles. Luckily, many people leave something sensible in the location field of their profile (e.g. ‘Amsterdam’ or ‘London, UK’). You can match this field against a Lucene index of all the cities in the world, which I happen to have. I was able to place 15 out of my grand total of 19 followers on the map.

Followers of @fzk:

Your browser does not get iframes. Go here for the pretty picture: http://www.fritsie.net/maps/fzk_map.html

Why is this important? Read on! Also, somewhere down the line I will explain how to make such a map for your own account.

Note: this is a cross post. You can see the original here: http://waredingen.nl/twitter-data-fun.

Some time ago, someone asked me to go find out how it is possible to obtain substantial amounts of relevant social network data (from legit sources, for money). That sounded like fun, so I went ahead and Googled. Twitter is a global data source. However, you usually use it to attack a local problem, like when shouting to everyone at the same conference where the beer is at or that your slides are online. This works well, as long as you know exactly who you want to reach or follow, using a hash tag or a user’s screen name. As a company, you often try to attack harder problems, like listening to a group of people who are potentially interested in your products. Given the probability that you are not Google or Facebook, that group usually isn’t made up of everybody on the planet with an internet connection. And your specific audience typically doesn’t exactly come with its own hash tag. Now what?

First of all, you need to obtain Twitter data. One option is to go through the Twitter API, but this is limited. It’s mostly meant for use by client applications for a specific user or, at best, harvesting search results through the streaming API. There is the Twitter sample stream, but that gives you just a random 1% of all tweets. Altogether, Twitter’s own API is not meant for gathering a large volume of data. It has rate limiting and limits the number of concurrent things you can do from a single host. If you need real data, there are apparently only two places you can go: Gnip and DataSift. Gnip advertises having data streams from lots of different social media whereas DataSift only has Twitter and one other offering, yet both companies appear to emphasize Twitter. A quick comparison based on their websites reveals that DataSift is a lot more transparent, both in pricing and API, whereas Gnip really wants you to contact them in order to talk about the thing you already wanted to get online. That said, both of them deliver a filtered Twitter stream based on a filter that you get to define yourself. DataSift has its own language for filtering. It allows you to create filters using this language and ‘compile’ them using their REST API. Once compiled, you can get streaming results using that filter. I am guessing Gnip works in a similar way.

Now, here’s the problem. I want to have all the Twitter messages by people that matter to me. But “people that matter to me” is not a predicate in the filter language. The things that you can filter on fall roughly into three categories: content properties, user metadata properties, reach properties.

These filtering options mainly allow you to filter based on properties of users or tweets (content, language, etc.). That’s an obvious monetization strategy, but I think there is not enough value in content alone, as long as you don’t have the option to also use properties of the network graph as a means of filtering. The reach property that you can use is the closest thing here. It is based on the so-called Klout Score of a user. The Klout Score of a user is based on how many unique people that user reaches when shouting on the internet and how likely it is that those people will amplify the message. This is nice, because you can filter on influential people, but the problem remains that this is global, not local.

If you look at the map of my followers, you can see a pattern: the geography of my reach. Below I have the same map for my friend Age (@agemooij). His reach is far greater than mine (with 200+ followers), but also, the geo pattern for his reach is noticeably different.

Followers of @agemooij:

Your browser does not get iframes. Go here for the pretty picture: http://www.fritsie.net/maps/agemooij_map.html

Of course the map is just there because people like maps. The real information is in the pie chart, which is basically a feature vector of the geographic reach. My method for extracting coordinates from location fields is far from sophisticated and will produce erratic results every now and then, so it is probably safer to ignore any country under 1% in the distribution. That said, Age has a portion of his followers in the US, India and UK. This may be a pattern that you’re looking for. It would be nice to be able to filter based on users with a geographic reach similar to Age’s. Also, it should be feasible to do this technically.
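To make the idea of a geographic feature vector concrete, here is a minimal Python sketch. It turns a list of follower countries into a share per country, drops everything under 1%, and compares two such vectors. The function names are hypothetical, and cosine similarity is my own choice of measure for "similar reach", not something the data providers offer.

```python
import math
from collections import Counter


def reach_vector(countries, threshold=0.01):
    """Share of followers per country, ignoring countries under the threshold."""
    counts = Counter(countries)
    total = sum(counts.values())
    if total == 0:
        return {}
    shares = {country: n / total for country, n in counts.items()}
    return {country: share for country, share in shares.items() if share >= threshold}


def similarity(a, b):
    """Cosine similarity between two reach vectors (an assumed measure)."""
    dot = sum(share * b.get(country, 0.0) for country, share in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Made-up follower countries, only for illustration.
followers = ["NL"] * 150 + ["US"] * 49 + ["BR"] * 1
vector = reach_vector(followers)   # BR falls under 1% and is dropped
```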

As a reference, I also created the map for the official, verified account of the prime minister of The Netherlands. His name is Mark. Mark has tens of thousands of followers, so I had to sample because of rate limiting, but I got 2000+ locations, so it is likely to be representative. You see that Mark’s geographic reach consists of only NL, after discarding every category of 1% or lower. Mark’s Klout Score will likely be high, but if you want to look at content that crosses the Atlantic, you’re better off with Age.

Followers of @MinPres:

Your browser does not get iframes. Go here for the pretty picture: http://www.fritsie.net/maps/minpres_map.html

While content-based filtering options let you easily do simple things, like getting all tweets that mention your company name or product name, they aren’t very helpful if you want to look at sentiment amongst specific groups. It would be nice to have a filter that lets you define predicates like ‘people with a mostly Western European network and a Klout Score > 30 and a noticeable political interest’. The first two predicates should be doable. Now the last part of that query is a hard one. Perhaps looking at the links that people share can give us some insight into interests. I will give it a try soon and if it works out to something visible, that’ll be another post.
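As a rough idea of what the first two predicates could look like, here is a Python sketch. It is not DataSift's or Gnip's filter language. The country set, the user record layout and the reading of "mostly" as more than half of the followers are all assumptions.

```python
# Illustrative, incomplete set of country codes; an assumption, not a definition.
WESTERN_EUROPE = {"NL", "BE", "DE", "FR", "GB", "IE", "LU"}


def matches(user, min_region_share=0.5, min_klout=30):
    """True if the user's network is mostly Western European and Klout Score > min_klout.

    user["reach"] maps country code to share of followers,
    user["klout"] is the user's Klout Score (hypothetical record layout).
    """
    region_share = sum(
        share for country, share in user["reach"].items() if country in WESTERN_EUROPE
    )
    return region_share > min_region_share and user["klout"] > min_klout


# Made-up users, only for illustration.
local_politician = {"reach": {"NL": 0.97, "BE": 0.03}, "klout": 70}
global_techie = {"reach": {"US": 0.5, "IN": 0.2, "NL": 0.3}, "klout": 45}
selected = [u for u in (local_politician, global_techie) if matches(u)]
```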

 

Making the map

On to making the map. All the code that you need is here: https://github.com/friso/twitterfun. So clone that. Building the map consists of three steps, one of which is manual (I did it only three times, so when I have to make another one, I’ll automate the whole process).

  1. Extract location fields from a user’s followers
  2. Turn locations into geo coordinates
  3. Make the HTML file with the map and pie chart

For step 1 I use a piece of Python which is here. It takes two arguments: a screen name and a file name. The first is the screen name of the user of interest. The second is a file where the script will write the list of locations it found in the follower list. The script talks to the Twitter REST API to collect the required data. It will first look up the user’s follower IDs and then request the user profiles for each of the IDs. It will look up 30 users at a time, because the API method is limited to some number of users (at least 30) per request. For users with lots of followers, you will run into rate limiting before it finishes. The script will throw an error in that case, because of unexpected content in the response.
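The batching part of step 1 can be sketched in a few lines of Python. This is not the script from the repository: `lookup_profiles` is a hypothetical callable standing in for the actual API request, and the `location` key is an assumption about the shape of a profile.

```python
def batches(ids, size=30):
    """Yield consecutive slices of ids with at most `size` items each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(ids), size):
        yield ids[start:start + size]

def collect_locations(follower_ids, lookup_profiles, batch_size=30):
    """Collect the non-empty location fields of all followers.

    lookup_profiles is a hypothetical function that takes a list of IDs
    and returns one dict per profile; it stands in for the API call.
    """
    locations = []
    for batch in batches(follower_ids, batch_size):
        for profile in lookup_profiles(batch):
            location = (profile.get("location") or "").strip()
            if location:
                locations.append(location)
    return locations

# Fake lookup for illustration: every other follower has a location.
def fake_lookup(batch):
    return [{"location": "Amsterdam" if i % 2 == 0 else ""} for i in batch]

found = collect_locations(list(range(65)), fake_lookup)
```

Keeping the API call behind a plain function argument also makes it easy to add a pause or a retry when rate limiting kicks in, instead of failing on the unexpected response.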

Step 2 is a Java program. It has several command line options, as you can see by browsing the main method. You should run it like:

java <java opts> com.xebia.locations.finder.LocationFinder -i <index location> -f <input file> -v

The Java opts need to put all the required jars on the classpath. This is at the very least Lucene and Commons CLI, but there’s probably more. You’re better off creating a project in your IDE for it and running it from there, or adding a run configuration to the Maven pom.xml. The index location is the location of a Lucene index containing all the cities in the world and their population. I will show you in a minute how to get that. The input file is the file that came out of the Python script. The -v option tells the program to output records in JSON format, which comes in handy later on.

Next to the LocationFinder class, there is an IndexBuilder class. The nice people at MaxMind publish a database containing all the cities in the world and their population as a text file. The IndexBuilder class can read this file and turn it into a Lucene index at a specified location. Run it like this:

java <java opts> com.xebia.locations.locationfinder.IndexBuilder -i <index output dir> -d <maxmind db file>

It will build the index in the directory specified by <index output dir> and use the MaxMind database file <maxmind db file>. It expects text, so extract the gzipped version.

The location finder tool tokenizes the location as entered by the user and does a search against the entire index for anything that is similar to that location. It will fetch at most 350 results and then rank the results based on exact matches for city, country and, in the case of the US, also state. Additionally, it will rank the city with the largest population a bit higher. This makes sure that ‘Amsterdam’ matches Amsterdam in The Netherlands rather than any of the four places named Amsterdam that exist in the US. It will choose the highest ranking result, if it ranks above a certain threshold.
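The ranking idea can be sketched in Python as follows. This is not the Java implementation: the candidate records, the weights, the logarithmic population bonus and the threshold are all assumptions chosen only to illustrate the approach. The Lucene search that produces the candidates is left out.

```python
import math

def score(candidate, tokens):
    """Score one candidate city against the tokens of a location string."""
    points = 0.0
    if candidate["city"].lower() in tokens:
        points += 2.0  # exact city match (weight is an assumption)
    if candidate["country"].lower() in tokens:
        points += 1.0  # exact country match
    state = candidate.get("state")
    if candidate["country"] == "US" and state and state.lower() in tokens:
        points += 1.0  # state only counts for US candidates
    # Small bonus so that larger cities win ties.
    points += 0.1 * math.log10(candidate.get("population", 0) + 1)
    return points

def best_match(location, candidates, threshold=2.0):
    """Return the highest scoring candidate, or None if below the threshold."""
    tokens = set(location.lower().replace(",", " ").split())
    if not candidates:
        return None
    best = max(candidates, key=lambda c: score(c, tokens))
    return best if score(best, tokens) >= threshold else None

# Made-up candidates, as they might come back from the index search.
candidates = [
    {"city": "Amsterdam", "country": "NL", "population": 740000},
    {"city": "Amsterdam", "country": "US", "state": "NY", "population": 18000},
]
```

A simple token match like this misses multi-word city names such as "New York"; a real implementation would need to handle those separately.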

Step 3 is a manual step. The location finder outputs a bunch of JSON objects (one per line). Turn that list of objects into an array by adding a comma at the end of each line and putting square brackets around it, then paste the result into the HTML template over here, replacing the array that’s already there in “var locations”. This gives you the desired HTML with a map and the pie chart. Also, Google Maps requires an API key these days, so you’ll have to provide that at line 11 of the HTML file.
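Turning the one-object-per-line output into an array is easy to automate. Below is a minimal Python sketch of that part only. The field names in the sample are made up, and pasting the result into the template is still left to you.

```python
import json

def json_lines_to_array(lines):
    """Parse one JSON object per line and return them as a JSON array string."""
    records = [json.loads(line) for line in lines if line.strip()]
    return json.dumps(records, indent=2)

def locations_js(lines):
    """Build the JavaScript statement that defines `var locations`."""
    return "var locations = " + json_lines_to_array(lines) + ";"

# Sample output lines; the keys are hypothetical.
sample = [
    '{"lat": 52.37, "lon": 4.89}\n',
    "\n",
    '{"lat": 40.71, "lon": -74.0}\n',
]
snippet = locations_js(sample)
```

Parsing each line first, instead of gluing commas onto the text, has the side effect of catching malformed records before they end up in the HTML.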

I am not going to do a line-by-line walkthrough of the code in this text, because those are boring. You can check out the code and read it for yourself. Enjoy!

Categories: Companies

Organizational causes, inspired by Aristotle

Sat, 11/26/2011 - 15:52

When I start a new consulting job at an organization, I like to ask people how their organization became the organization it is today. Most of the time, people start telling me about the history of their organization or the values and goals they have. People sometimes start telling me about the people who work in the organization. But I have never received an answer that fully answered my question. What made organizations what they are right now? After reading ‘Die Frage nach der Technik’ written by Martin Heidegger (1889-1976), I got an answer that could help me structure all the answers people gave to me.

Heidegger uses the doctrine of the four causes (1) that Aristotle described, which can also be applied to organizations. The first cause is the causa materialis, the material of the organization. In organizations, people, capital and material together form the causa materialis. The second cause is the causa formalis, which is the shape of the organization. This is the organizational structure, the processes of an organization and the way the office rooms are organized. The third cause is the causa finalis, the goals that an organization has. These goals can be at the organizational level (e.g. mission, vision, strategy), or at the personal level (the goals you set for yourself and get from your boss). The last cause is the causa efficiens, which is the activity of being an organization. It is the resultant of the other causae plus the activity of being an organization. The causa efficiens adds the time factor to the equation.

What makes this theory useful? It shows that every organization should pay attention to these four levels in order to stay healthy and balanced. First, an organization has to work to keep its resources healthy, especially the human ones. Note that humans should be treated not merely as a means to an end, but at the same time as an end in themselves. Second, an organization should improve its form, in order to maximize its outcome (not just output!). Third, organizations should define their vision and their goals. It is important to align these goals in such a way that they conflict neither horizontally nor vertically. Too often I have seen managers fighting for resources because their goals conflicted with each other. Sadly, the result is that neither of them reaches their goals. Fourth, organizations should keep their focus on these three causae and continuously improve them. There is no such thing as an ideal organization. Time changes things and you need to adapt to (or even initiate) these changes.

Heidegger and Aristotle make a useful distinction between causes in organizations. This distinction enables us to focus on the right things. Healthy resources require attention. This applies not only to materials, but also to your employees. You have to find highly qualified developers and treat them as real craftsmen. People are able to take far more responsibility than organizations usually give them. The organization should be shaped in such a way that the material (people) can reach their goals in an optimal way. A fundamental condition for this is a well-designed mission, vision and strategy that is aligned both horizontally and vertically. Stephen R. Covey writes in ‘The 8th Habit’ that only four out of eleven people know their goals, and only two of the same eleven actually care. Although most companies do have a mission, vision and strategy, most people cannot answer the question of how their activities contribute to them. Most of all, you have to be committed to doing all of this continuously. Agile and Lean can help with this, enabling your organization to adapt to changes.

Heidegger and Aristotle are both dead, but their thoughts are still invaluable.

(1) Aristotle, Metaphysics, Book 5, section 1013a

Categories: Companies

Easy breezy restful service testing with Dispatch in Scala

Sat, 11/26/2011 - 02:29

For testing a restful service API I was looking for a lean library, which would allow me to test CRUD operations of rest services with as little code as possible.

My search led me to Dispatch, which is a highly compact Scala DSL wrapper around Apache’s reliable HttpClient. This DSL, however, is not very well documented and rather hard to decipher due to its heavy usage of symbolic method names, but it is nevertheless highly appealing once understood.

In this blog I’ll decipher it for you and show how easy it is to test restful services with mere one-liners.

My Prerequisites

Before we dive into Dispatch’s mysterious DSL let’s look at the prerequisites I had for testing my restful service. These will most likely apply to many other rest services as well:

  • Support for CRUD operations with http’s POST, GET, PUT and DELETE methods
  • Support for XML and Json as input and output payloads
  • Support for reading http status codes to test results of erroneous responses like 404 (Not Found)
  • Support for security aspects such as https and basic/digest authentication

Begin with the End in Mind

The goal of this blog is to create a RestClientHelper that offers a set of generic restful client methods and to explain how they’re implemented using Dispatch. With these methods in place we will be able to call and therefore test restful service APIs very easily.

The RestClientHelper will be a trait that offers the following client side CRUD methods:

trait RestClientHelper {

  //Create (POST)
  def create[T](target: String, reqBody: String)(fromRespStr: String => T): T
  def create[T](target: String, reqBody: Elem)(fromRespXml: Elem => T): T 

  //Read (GET)
  def query[T](target: String, params: Seq[(String, String)])(fromRespStr: String => T): (Int, Option[T])
  // named queryXml because an overload differing only in the function type would clash after erasure
  def queryXml[T](target: String, params: Seq[(String, String)])(fromRespXml: Elem => T): (Int, Option[T])

  //Update (PUT)
  def update[T](target: String, reqBody: String)(fromRespStr: String => T): T
  def update[T](target: String, reqBody: Elem)(fromRespXml: Elem => T): T
  def update[T](target: String, reqBody: String): Int
  def update[T](target: String, reqBody: Elem): Int 

  //Delete (DELETE)
  def delete(target: String) :Int
}

As you can see, most methods consist of two parameter lists. The first parameter list represents the rest call input such as target uri and request body or parameter map. The second parameter list is a function that converts the output of a rest service into the desired type. The return type of these methods is T, Int or Tuple2[Int, Option[T]]. The Int always represents the http status code, the Option[T] represents the converted output if the service call was successful.

How to use the RestClientHelper

To get an understanding of how the RestClientHelper trait is supposed to be used, let’s look at an example. Assuming we have a domain object, Person, that provides serialization and deserialization methods for xml, a CRUD call sequence with the RestClientHelper trait would look as follows:

val person = Person("John Doe")

//POST /add.xml
val p = create("add.xml", person.toXml) { Person.fromXml(_) }
assert(p.id != None)

//GET /search.xml?q=John+Doe
val (status, personOpt) = queryXml("search.xml", Seq("q" -> "John Doe")) { Person.fromXml(_) }
assert(status == 200)

//PUT /update.xml
val changedPerson = p.copy(name = "John Who")
val status2 = update("update.xml", changedPerson.toXml)
assert(status2 == 200)

//DELETE /delete/4
val status3 = delete("delete/" + changedPerson.id.get)
assert(status3 == 200)

//GET /search.xml?q=John+Who
val (status4, personOpt2) = queryXml("search.xml", Seq("q" -> "John Who")) { Person.fromXml(_) }
assert(status4 == 404)
assert(personOpt2 == None)

Looking under the hood

Now let’s dig a little deeper and discover how one of these rest client helper methods is actually implemented using Dispatch.

def create[T](target: String, reqBody: String)(fromRespStr: String => T): T = {
  executer(:/(host, port) / target << reqBody >- { fromRespStr })
}

Well, this is all that is needed for invoking a restful service by means of a POST request. If you are new to Dispatch, it helps to untangle this highly compact DSL for a better understanding.

The core classes participating in an http call as shown above are the following:

  • An HttpExecutor (dispatch.HttpExecutor) responsible for executing the http request. A common implementation would be dispatch.Http. In this example it is represented by the executer object.
  • A Request (dispatch.Request) and its DSL wrapper dispatch.RequestVerbs, which are responsible for creating a specific kind of request, e.g. a gzip http post with content-type application/x-www-form-urlencoded.
    The class RequestVerbs contains DSL-ish symbolic method names, which mostly start with a ‘<’ character indicating that something is ‘added’ to a request (left arrow).
    The method << is one of those. It adds a request body to the request and therefore automatically transforms it into a POST request.
  • A Handler (dispatch.Handler) and its DSL wrapper dispatch.HandlerVerbs, which is responsible for handling the result returned by the http call, such as converting it to xml or json.
    The class HandlerVerbs also contains DSL-ish symbolic method names, which mostly start with a ‘>’ character indicating that something is done with the ‘output’ of the http call (right arrow).
    The method >- is one of those. It converts the response received as a String into the desired type.
  • Implicit conversions turn a Request into the corresponding DSL wrapper, RequestVerbs or HandlerVerbs, whenever one of their methods is called.

With those key classes in mind we can rewrite the above http call in a more verbose manner in order to understand the parts the DSL is made of:

def create[T](target: String, reqBody: String)(fromRespStr: String => T): T = {
    val emptyReq:Request = :/(host, port)

    //implicit conversion applied explicitly
    val reqVerbs:RequestVerbs = Request.toRequestVerbs(emptyReq) 

    val configuredReq:Request = reqVerbs./(target).<<(reqBody)

    //implicit conversion applied explicitly
    val handlerVerbs:HandlerVerbs = Request.toHandlerVerbs(configuredReq)

    //The http executer always needs to be called with a handler
    val handler:Handler[T] = handlerVerbs.>-(fromRespStr)

    executer.apply(handler)
}

The executer needs further explanation. As said, Dispatch requires you to use an executer to finally execute an http request. Dispatch offers several types of executers, such as thread-safe and non-thread-safe ones. A thread-safe one that makes use of a shared connection pool could be provided as follows:

  protected lazy val executer = new Http with thread.Safety

The non-thread-safe counterpart could be accessed like this:

  protected def executer = new Http

Various other executers are available. The ones above will most likely be sufficient for most purposes.

To summarize we can conclude that most of the power of Dispatch from a usage point of view lies within the RequestVerbs and HandlerVerbs classes, which allow us to compose a request and decompose its response the way we want it. There is an excellent periodic table of dispatch operators available that lists all the possible Verb methods with a short description.

…and now the nitty-gritty details

With the knowledge we have gained about Dispatch the remaining method implementations of the RestClientHelper should be straightforward. I’ll show each implementation with a short explanation of the most notable Dispatch methods:

Create (POST) with String in- and output

 def create[T](target: String, reqBody: String)(fromRespStr: String => T): T = {
    executer(:/(host, port).POST /  target << reqBody >- { fromRespStr })
  }
  • RequestVerbs: POST
    Create a POST request (the POST method is optional because the << method forces the request to be a POST)
  • RequestVerbs: <<(body:String)
    Post the given String value with text/plain content-type
  • HandlerVerbs: >-[T](block:(String) => T)
    Convert the response body from a String to the desired type

Create (POST) with XML in- and output

  def create[T](target: String, reqBody: Elem)(fromRespXml: Elem => T): T = {
    executer(:/(host, port) /  target << reqBody.toString <> { fromRespXml })
  }
  • RequestVerbs: <<(body:String)
    Post the given String value with text/plain content-type
  • HandlerVerbs: <>[T](block:(Elem) => T)
    Convert the response body from an Elem to the desired type

Read (GET)
The read method is probably the most intriguing one from the whole series. In order to understand it fully, I will provide a detailed explanation:

 def query[T](target: String, params: Seq[(String, String)])(fromRespStr: String => T): (Int, Option[T]) = {
    executer x (:/(host, port) /  target <<? params >:> identity) {
      case (200, _, Some(entity), _) => {
        val respBodyStr = fromInputStream(entity.getContent()).getLines.mkString
        (200, Some(fromRespStr(respBodyStr)))
      }
      case (status, _, _, _) => (status, None)
    }
 }

Now let’s look at the most important Dispatch ingredients for this GET request:

Input processing
The method <<? from RequestVerbs: <<?(params:Traversable[(String, String)]) simply adds query parameters to the request url.

Output processing
To process the query result we are interested in two things: the response code and response body. Both are returned in the form of a Tuple2 (Int, Option[T]), where Int represents the status code and Option[T] the converted object in case the query yielded a result. The question is how can we retrieve both of them, since Dispatch does not offer a symbolic method that does that for us.

First let’s take a closer look at what is actually called: Instead of providing the executer with a handler directly we use the method ‘x’.

  executer x (…)

Why is that? By calling the executer’s apply method (that’s what finally happens under the hood) the handler block is only called in case the response status code is 200 – 204. In all other cases an Exception is thrown. So if we want to intercept all response codes we need to use the executer’s x[T](handler:Handler[T]) method, which executes the handler no matter which response code is returned.

The next question is what kind of handler is passed to the executer’s x method? If we explicitly assigned the first part of the DSL construct to a handler it would look as follows:

  val intermediateHandler = (:/(host, port) /  target <<? params >:> identity)

The ‘magic’ lies in the >:> identity construct. According to the API the HandlerVerbs method >:> accepts a function by which the response headers can be processed. By passing Scala’s Predef identity method to the >:> method, nothing is processed; the headers are passed through unchanged and the resulting handler is of type Handler[Map[String, Set[String]]].

As said, the handler above does not process the result itself. The real processing is done by the case statements, which are the last arguments that are passed to the DSL construct. How do we need to interpret that? For a better understanding, the last statement could be rewritten as follows:

 val realHandler = intermediateHandler.apply {
    case (200, _, Some(entity), _) => {
       val respBodyStr = fromInputStream(entity.getContent()).getLines.mkString
       (200, Some(fromRespStr(respBodyStr)))
    }
    case (status, _, _, _) => (status, None)
 }
 executer.x(realHandler)

What happens is that we construct another handler by calling the apply method of the previously created one. The apply method of every handler accepts a function with the following signature:

    apply(next:(Int, HttpResponse, Option[HttpEntity], () => T) => R)

The Int stands for the response code, the HttpResponse and Option[HttpEntity] give us access to the underlying Apache HttpClient implementation and the argument () => T represents the transformation function, which transforms the response to type T, the type of the Handler itself.

As you probably know a case statement IS a function (PartialFunction), which can be chained and passed – even if chained – as a single PartialFunction to a method. So by providing the handler’s apply method with the above case statements we’re able to define in detail how the low-level result of the http call needs to be processed. In our case this means: retrieve and convert the response body when the response code is 200, otherwise simply return the response code.

Even though this construct might look rather complicated it is a very powerful way to access and process the raw result that the http call returns.

Update (PUT) with String in- and output

def update[T](target: String, reqBody: String)(fromRespStr: String => T): T = {
  executer(:/(host, port).PUT /  target <<< reqBody >- { fromRespStr })
}
  • RequestVerbs: PUT
    Create a PUT request (the PUT method is optional because the <<< method forces the request to be a PUT)
  • RequestVerbs: <<<(body:String)
    Put the given String value with text/plain content-type
  • HandlerVerbs: >-[T](block:(String) => T)
    Convert the response body from a String to the desired type

Update (PUT) with XML in- and output

 def update[T](target: String, reqBody: Elem)(fromRespXml: Elem => T): T = {
  executer(:/(host, port) /  target <<< reqBody.toString <> { fromRespXml })
 }
  • RequestVerbs: <<<(body:String)
    Put the given String value with text/plain content-type
  • HandlerVerbs: <>[T](block:(Elem) => T)
    Convert the response body from an Elem to the desired type

Update (PUT) without response body

def update[T](target: String, reqBody: String): Int = {
    executer x ((:/(host, port) /  target <<< reqBody >:> identity) {
      case (status, _, _, _) => status
    })
}

If we want to send a PUT without expecting a response body but are still interested in the response code we use the same construct as described above in the read sample. The only difference compared to the read example is that we do not retrieve the response body, but simply return the response code.

Delete (DELETE)

def delete(target: String):Int = {
   executer x ((:/(host, port)).DELETE /  target >:> identity)  {
      case (status, _, _, _) => status
   }
}

The delete method does not process a response body. In order to know whether the deletion was successful we are however interested in the response code. Therefore, we again use the construct as in the read example.

Security & Authentication

Finally, let’s explain what we need to do if our restful service is secured with https and/or basic/digest authentication.

For https the RequestVerbs’ secure method should be called on the request, for authentication the as("username", "pwd") method. Therefore, a secure create (POST) call that uses basic authentication would look as follows:

 def create[T](target: String, reqBody: String)(fromRespStr: String => T): T = {
    executer(:/(host, port).secure.as("user", "pwd") /  target << reqBody >- { fromRespStr })
 }

In case security combined with authentication is used, it makes sense for the RestClientHelper trait to provide a method that returns a preconfigured request in order to be DRY:

trait RestClientHelper {
 val host: String
 val port: Int
 protected def username:String = "unknown"
 protected def pwd:String = "unknown"
 private def req = :/(host, port).secure.as(username, pwd)

 def create[T](target: String, reqBody: String)(fromRespStr: String => T): T = {
    executer(req / target << reqBody >- { fromRespStr })
 }
...

Wait, there is more

Dispatch offers various other features that are not mentioned in this blog. For example, Dispatch can directly convert a response body into a Lift Json object by means of the ># method (a good example is provided on the Dispatch site), it can execute http requests in a background thread, and there are various HandlerVerbs and RequestVerbs that we have not covered, such as gzip mode and chaining handlers (check the periodic table for further reference). For testing restful services, however, the stuff covered in this blog should suffice.

The RestClientHelper ready to use

To conclude this blog, here is the source of the RestClientHelper trait, which hopefully makes restful service testing a piece of cake and fun for you (after having Dispatch set up)!

import scala.io.Source._
import scala.xml._
import org.apache.http._
import dispatch._

trait RestClientHelper {
  val host: String
  val port: Int
  val contextRoot: String
  protected val ssl = false
  protected val username = "notdefined"
  protected val pwd = "notdefined"
  protected val executer = new Http with thread.Safety
  private def req = {
    val req = :/(host, port).as(username, pwd) / contextRoot
    if (ssl) req.secure else req
  }

  def query[T](target: String, params: Seq[(String, String)])(fromRespStr: String => T): (Int, Option[T]) = {
    executer x (req / target <<? params >:> identity) {
      case (200, _, Some(entity), _) => {
        val respStr = fromInputStream(entity.getContent()).getLines.mkString
        (200, Some(fromRespStr(respStr)))
      }
      case (status, _, _, _) => (status, None)
    }
  }
  def query[T](target: String)(fromRespStr: String => T): (Int, Option[T]) = {
    query(target, List())(fromRespStr)
  }
  def queryXml[T](target: String)(fromRespXml: Elem => T): (Int, Option[T]) = {
    queryXml(target, List())(fromRespXml)
  }
  def queryXml[T](target: String, params: Seq[(String, String)])(fromRespXml: Elem => T): (Int, Option[T]) = {
    val convertOutput = (s: String) => fromRespXml(XML.loadString(s))
    query(target, params)(convertOutput(_))
  }

  def create[T](target: String, reqBody: String)(fromRespStr: String => T): T = {
    executer(req / target << reqBody >- { fromRespStr })
  }
  def create[T](target: String, reqBody: Elem)(fromRespXml: Elem => T): T = {
    executer(req / target << reqBody.toString <> { fromRespXml })
  }

  def update[T](target: String, reqBody: String)(fromRespStr: String => T): T = {
    executer(req / target <<< reqBody >- { fromRespStr })
  }
  def update[T](target: String, reqBody: Elem)(fromRespXml: Elem => T): T = {
    executer(req / target <<< reqBody.toString <> { fromRespXml })
  }
  def update[T](target: String, reqBody: String): Int = {
    executer x ((req / target <<< reqBody >:> identity) {
      case (status, _, _, _) => status
    })
  }
  def update[T](target: String, reqBody: Elem): Int = {
    update(target, reqBody.toString)
  }

  def delete(target: String): Int = {
    executer x ((req).DELETE / target >:> identity) {
      case (status, _, _, _) => status
    }
  }
}

Possible usage in an object or any other class you want the RestClientHelper to be mixed in:

object MyRestClientHelper extends RestClientHelper {
    val host = "rest.service.host"
    val port = 443
    val contextRoot = "v1"
    override val ssl = true
    override val username = "myusername"
    override val pwd = "secret"
}

Categories: Companies

Sharing Ecosystems

Fri, 11/25/2011 - 13:33

I am convinced that the next blue ocean of agile minds can be found in the creation of sharing ecosystems that are built on shared purpose, trust, intuition and a facilitation of the deeply wired human urge to cooperate as a collective. Understanding that modern day individualism is smothering our effectiveness is a catalyst for our drive to start working together and forming the effectiveness of these systems.

We start to understand that survival of the fittest is, even for humans, less important than survival of a species. Nature learned this a long time ago, with birds flocking to maximize survival rates and schools of fish doing the same. Humans became soloists a long time ago, when everything to support our lives was abundant. The need for teamwork apparently dropped off somewhere and became less important. Only rarely do we see naturally inclined teamwork in action.
Maybe you have seen the show “extreme home makeover” or a localized version of this show. You see a family that struggles in life, and instinctively you feel for them. You want to help this unfortunate family and so do the people on the show. The way in which they pull together and build new homes is absolutely amazing. The families that receive this kind gift are filled with pure joy when they see the new roof over their heads, and the builders become overwhelmed with emotion as well. I love this show because these emotions are real. We are simply wired to react this way. (For more on this subject please watch the documentary “I am” by Tom Shadyac.)

What puzzles me is that if we are wired by nature to enjoy cooperating effectively in a collective, then why do we encounter problems with it in our workplace? I think the main cause of these problems lies in the lack of a shared purpose that motivates us, and of the ability to trust each other in the pursuit of that purpose. Maximizing income, and therefore your own benefits, is not a real purpose that is felt across your organization. This doesn’t work in complex environments like the IT companies we are used to supporting with agile. Real purpose, mastery and autonomy however do, as Dan Pink shows us in his illustrative video on this subject. Others, like my colleague Olav Maassen, recognize this as well, trying to channel mastery into new forms for the workplace and beyond.
Another great example of seeing Dan’s findings in action, in my opinion, is the IDEO initiative openideo.com. People can literally join forces and find solutions that actually help to solve collective real world problems. They do so for free, just because they want to. Talk about working together to change the world! Although you can score points for helping in openideo, showing off your mastery in product and idea development, I think the greater driver here is the urge to help others in a collective fashion. Wouldn’t it be great if we could have an openideo in every company out there? We have so much knowledge, creativity and willingness in-house to collectively bring our companies to the next level. We only need to truly engage our co-workers.
That’s where autonomy comes into play. We should trust each other enough to facilitate more autonomy in the workplace. Provide room to your stake-takers and become genuine stake-sharers en route to joint company stakeholdership. Why not trust one another rather than being fearful of the negative impact on our own stakes? I guess this also has to do with purpose somehow. Idealistic purpose with no sense of realism leads nowhere and maybe even worsens the status quo.

As agile consultants and coaches we try to foster teamwork and consult in change towards sharing ecosystems. I guess we could say we are blessed to have nature on our side. I guess we can also say we need to balance idealism and realism in creating purposeful systems and that this requires a significant level of trust within our clients’ organizations. Although this may sound pretty far away, why not start with people in your direct circle of influence? Try to be as transparent as possible towards your client. Work together and build trust through trust, paving the road to change.

Categories: Companies

Size does matter! Be careful to use velocity as measure for improvement

Thu, 11/24/2011 - 19:00

Imagine you are playing a game of rugby against some black-suited guys who are doing some odd dancing and screaming exercise before you finally get to start playing. You win the game 27 – 3. You can imagine it wasn’t just one beer at the big party after the match and you did not see home before early morning. A year later your team finds itself in the same stadium against the same guys, doing the same little piece of folk dancing, just a little louder than last year. This time you win only 27 – 6. The coach and the crowd are going mad: your team lost half of its performance in just a year’s time! You take a shower, no beers, go home and go to bed early. Measuring the improvement in performance is easy!

Acceleration is a must

Scrum advocates the use of poker cards to estimate work items in story points. Velocity is the total number of story points a team can take on in a sprint.

Next to using velocity for planning, velocity is often promoted as a measure for improvement. Learning, solving impediments, implementing improvements, more fun and better team play all should lead to an increased velocity over sprints. Steady or even declining velocity is a signal of a mediocre, non-improving team.

The theory and actual practice of estimating in story points diverge, which makes drawing performance conclusions from velocity records a tricky business. The question is whether velocity should be used as a measure of improvement at all. This blog explains why you should not, or at least why doing so should be reserved ‘for trained professionals use only’.

How do you like your Story Points?

In a blog post from last summer (June 21 2010, its-effort-not-complexity) Mike Cohn qualifies the relabeling of story points as complexity points as ‘wrong’. He states: “Story points are not about the complexity of developing a feature; they are about the effort required to develop a feature.”

If we follow Mike, story points are a function of complexity, amount of work (‘sheer volume of work’) and expertise (of the team on the domain). Maybe uncertainty should be part of the formula as well. The consequence of this definition is that the estimated effort, and with it the number of story points for a story, goes down when the team gains more expertise, uses smarter solutions or can profit from structural improvements.

So, following Freyr in his first comment on Mike’s blog: “a story last year which was a 5, now has some process improvements and this year is a 2. The velocity of the team stays steady, story points per story decrease.” Let’s call this method A.

This way of determining story points differs from the way I was raised: user stories are estimated relative to a fixed (set of) reference user stories. A story of 5 last year is still a 5. Implemented improvements (whether technical or in teamwork, skills, knowledge or competences) should lead to higher performance, so that we can do more such stories in one sprint. Velocity increases, story points per story stay steady. Let’s call this method B.

Count on Story Points while planning

In Scrum, estimations and velocity records are primarily used for planning purposes. User stories are estimated in story points, and the number of stories planned in a sprint depends on the sum of the story points of these stories and the velocity of the last sprint or two. Story points and velocity records can likewise be used for release planning.

Whether using method A or B a team can properly predict the result of a sprint. And in both cases one can calculate and forecast the number of sprints needed to finish a selected set of user stories in a release.

However, using method A the team will need to deal with new competences leading to lower estimates, even for user stories estimated in ‘the past’. Using method B the team needs to deal with improving velocities when predicting the number of sprints needed for a given release.
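
The release forecast mentioned above boils down to simple arithmetic. The following is a minimal sketch in Scala; the backlog estimates, the two recent velocities and the helper name `sprintsNeeded` are made up for illustration and do not come from a real project.

```scala
// Hypothetical numbers, for illustration only.
val backlogPoints = List(5, 3, 8, 5, 2, 13, 8, 5, 3, 8) // story point estimates of the release backlog
val recentVelocities = List(18, 22)                      // velocities of the last two sprints

// Planning velocity: the average of the recent sprints.
val planningVelocity: Double = recentVelocities.sum.toDouble / recentVelocities.size

// Number of sprints needed, rounded up because a partly filled sprint still has to be run.
def sprintsNeeded(totalPoints: Int, velocity: Double): Int =
  math.ceil(totalPoints / velocity).toInt

val forecast = sprintsNeeded(backlogPoints.sum, planningVelocity) // 60 points at 20 per sprint: 3 sprints
```

The calculation is the same for method A and method B; what differs is which of the two inputs drifts over time, the estimates or the velocity.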

Measuring improvement is a different story

Scrum promises an increase in productivity and performance. A growth factor of 4, even up to 10, is often quoted. Soon after the introduction of Scrum, management requests proof of this growth, especially after they have been baffled by all the formerly hidden impediments in their organisation: Scrum does not solve your problems, but reveals them! This puts a heavy burden on the belief that the promised benefits will be reached.

Very often velocity is used to measure this improvement. This is only valid if we keep the story points for the same kind of stories steady, as in method B. In fact it should only be done if we restrict the formula for estimating story points to size: the size of a piece of work. Jose illustrates this with an example in another comment on Mike’s blog:

Imagine that we have a team of painters and their job is to paint walls. The team sees pictures of the walls and they estimate the wall area to be painted. There are small walls, medium walls and some really big walls. The team estimates each area using relative points. They start painting the walls and after some time, they are able to calculate how many points they can do on each sprint, the velocity. Now the customer can calculate how long the project will take, using the velocity and the product backlog.

As soon as the painters get better tools and paint, or gain in experience and concentration, they will be able to paint bigger walls in the same amount of time.

The problem is that determining size is one of the most difficult challenges in the field of software development. There certainly are environments where this works fine. Recently we coached a team in a Business Intelligence environment who were able to define their work items in predefined steps, and who had derived formulas from experience to calculate the size of the work for each step. Their estimations were remarkably accurate.

For most other teams the thing that comes closest is Function Point Analysis, which is far from easy and not a common competence among Development Team members.

Just about every other team we have seen estimates effort. Better tooling, a higher degree of automation and more competences all lead to less effort and are discounted in the final story points. That is fine for planning, but drawing conclusions from velocity trends based on these story points is a very tricky business, and the trends will understate the growth the team has actually realized.
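
To make the difference concrete, here is a small worked example in Scala. It uses Freyr’s story that went from 5 to 2 points; the throughput figures (4 and 10 stories per sprint) are hypothetical, chosen so that the team finishes 2.5 times as many stories after its improvements.

```scala
// One kind of story: its reference estimate and its re-estimate after improvements (5 -> 2).
case class StoryType(referencePoints: Int, currentPoints: Int)
val story = StoryType(referencePoints = 5, currentPoints = 2)

// Hypothetical throughput: stories of this kind finished per sprint.
val storiesPerSprintLastYear = 4
val storiesPerSprintThisYear = 10

// Method A: points follow effort, so the story is re-estimated at 2 this year.
val velocityALastYear = storiesPerSprintLastYear * story.referencePoints
val velocityAThisYear = storiesPerSprintThisYear * story.currentPoints

// Method B: points follow size, so the story stays a 5.
val velocityBLastYear = storiesPerSprintLastYear * story.referencePoints
val velocityBThisYear = storiesPerSprintThisYear * story.referencePoints

// Growth as it would be read from the velocity records.
val growthA = velocityAThisYear.toDouble / velocityALastYear
val growthB = velocityBThisYear.toDouble / velocityBLastYear
```

With these numbers method A reports a velocity of 20 in both years, while method B goes from 20 to 50. The team improved equally in both cases; only the method B records show it.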

The point of this story: size matters

Story points are calculated using different methods and formulas. If your formula measures anything other than sheer size, you should not use your velocity records as an indicator of performance growth, just as you cannot judge how your performance has developed by simply comparing the scores of two games of rugby. If many other factors influence the scores, you will wrong the team, and you should stick to using these records ‘to start a conversation’ only. Measuring the improvement in performance ain’t easy.

Categories: Companies

Android Package Synergy

Mon, 11/14/2011 - 20:43

Contrary to what I announced in my previous post, this one is neither soon nor on a surprise topic. It is about a general aspect of Android that is, in my opinion, very powerful but often underutilized.

Android apps are not monolithic but rather a collection of components of different kinds. I suspect Android took inspiration from the concept of MIDlet suites in J2ME and believe it expanded on that quite well.
These components (except for providers) can be exposed through intent filters in the package’s manifest and can be used by components in other packages. This allows apps to accomplish tasks together which a single app could never do.

The most important type of component is the Activity and it can be used in several ways:

  • Leaf: view data specified or referred to in the extras of the calling intent.
  • Put: show a form pre-filled with data from the extras, perform an action, then stop or return.
  • Pick: select data from a source owned by another app and return it (or a reference to it) to the calling activity.

As activities might start other activities in other tasks, care must be taken to maintain a consistent activity back stack.
Image Shortcut and Send Text demonstrate usage of these intents (but an abnormal use of content provider)

The type of component most suitable for increasing synergy is perhaps the Content Provider.
A Content Provider is accessed through a content resolver and can be used to search, read and update data belonging to a different package. A component can also register with the Content Provider to be notified of changes to its data set.

To send a message to a different package, you send a broadcast intent, which is picked up by a Broadcast Receiver in that package.
The system also sends a number of broadcasts which apps can listen to.
A broadcast intent can be sent as a normal broadcast, which is delivered asynchronously, or as an ordered broadcast, in which the matching receivers are called one at a time in order of priority so you can get a result from each of them.
In either case a Broadcast Receiver is not the place to do a lot of processing. Usually either a service is started to handle any data or a notification is raised so the user can take action.
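
The priority-ordered delivery described above can be pictured with a small model in plain Scala. This is a sketch of the idea only, not the Android API: `Receiver`, `handle` and `sendOrdered` are hypothetical names, and returning `None` stands in for a receiver that aborts the broadcast.

```scala
import scala.annotation.tailrec

// A plain Scala model, NOT the Android API. All names are hypothetical.
// A receiver gets the result so far and returns the new result, or None to abort.
case class Receiver(name: String, priority: Int, handle: String => Option[String])

// Deliver to the receivers from highest to lowest priority, passing the result along.
def sendOrdered(receivers: List[Receiver], initial: String): String = {
  @tailrec def loop(rest: List[Receiver], result: String): String = rest match {
    case Nil => result
    case receiver :: tail =>
      receiver.handle(result) match {
        case Some(next) => loop(tail, next) // pass the result on to the next receiver
        case None       => result           // aborted: lower priorities never see it
      }
  }
  loop(receivers.sortBy(r => -r.priority), initial)
}

val receivers = List(
  Receiver("logger", 0, msg => Some(msg + " -> logged")),
  Receiver("filter", 10, msg => Some(msg + " -> filtered"))
)
val result = sendOrdered(receivers, "sms") // "sms -> filtered -> logged"
```

Each handler in the model does next to nothing, which mirrors the advice above: a receiver should hand off real work rather than do it itself.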

The final type of component is a Service. As the name suggests it runs in the background, and it comes in the varieties “run once and die” and “create once then run as demanded”. Other components can bind to a service through a Binder, and the interface for communicating with it is described in AIDL.

I think that, apart from selfishness, fear of abuse is a factor holding back synergy.
Although you can create permissions and set those on your components so other apps need to request those permissions before they can use your components, some users might not review the requested permissions when installing a potentially malicious app.

I am currently developing an Android app at a bank which has a component (currently not exposed) that allows a user to transfer money from his account.
If the activity were exposed, it could be used as part of the payment process in any shopping app. The transfer activity would receive the amount and the destination account pre-filled through intent extras set by the shopping app. The transfer activity could return a cookie to the shopping app (similar to the Pick usage above), which the shopping app can verify against the bank’s server independently.

Of course the shopping app shouldn’t have to scan the phone for each banking app, so a common profile of supported component interactions needs to be defined for each type of task. This would allow the shopping app to show an iDeal button that fires an intent matching the filter of any installed banking app’s transfer component in drive-through mode.

‘The whole is greater than the sum of its parts’ Aristotle, Metaphysica

Categories: Companies

Getting the Java out of your Scala, part 2

Sat, 11/12/2011 - 19:09

I’m still trying to get rid of old habits, to shake off my winter hide, so to speak, and create some real Scala instead of ScaVa (i.e. Java with a Scala syntax). If you’re interested you can bear witness to my struggle on GitHub (ShoppingList on GitHub). This story came about because I asked some colleagues for help. We ended up rewriting loops in several ways.
What I’ll show you are some alternatives to classic loops over collections.

The code is attached here. The example shows how to summarize items in a list. The objects in the list are all instances of class Stuff, a simple value container for a String and an Int. The idea is to summarize Stuffs with the same key to produce a new list:

  val items = List(Stuff("A", 1), Stuff("A", 2), Stuff("B", 3), Stuff("B", 4))

should become

  val expectedResult = List(Stuff("A", 3), Stuff("B", 7))

where Stuff is a simple case class with two attributes:

case class Stuff(label: String, number: Int)

Because I’m a long time Java programmer, my first solution looks like this:

  def classicSum = {
    var result: List[Stuff] = List()
    for (item <- items) {
      if (result.size > 0 && item.label == result.head.label) {
        result = Stuff(result.head.label, result.head.number + item.number) :: result.drop(1)
      } else {
        result = item :: result
      }
    }
    assertEquals(expectedResult, result.reverse)
  }
 

Good old looping and a var to collect the result. This works, but it doesn’t say what’s happening. All you see is a loop and some fancy list processing. Note the call to drop(1) allowing you to replace the head of the list with a new instance. I like the drop function and I’ve used it in other programs but here it just obfuscates matters.

The next solution is to go recursive. If you check out the version of ShoppingList as of October/November 2011, you’ll find lots of recursion. My goal at the time was to jam recursion into my head by banning all other forms of looping.
Applied to the list of Stuff instances and the problem at hand, the result is tragic:

  def recursiveSum = {
    import scala.annotation.tailrec // required for the @tailrec annotation
    @tailrec def sum(listOfPairs: List[Stuff], result: List[Stuff]): List[Stuff] = {
      listOfPairs match {
        case Nil => result
        case head :: tail =>
          {
            val currentHead = if (result.size == 0) { Stuff(head.label, 0) } else { result.head }
            val newResult =
              if (currentHead.label == head.label) {
                Stuff(currentHead.label, currentHead.number + head.number) :: result.drop(1)
              } else { head :: result }
            sum(tail, newResult)
          }
      }
    }
    val result = sum(items, List())
    assertEquals(expectedResult, result.reverse)
  }
 

My next version leverages Scala’s collections to drastically reduce the amount of code. The first attempt passes the test but it is cryptic. In the spirit of the red-green-refactor mantra I’ll show it to you anyway:

  def sumByGroupSolution1 = {
    val groupedByLabel = items.groupBy(_.label)
    val result = groupedByLabel map { t => Stuff(t._1, t._2 map { _.number } sum) }
    assertEquals(expectedResult, result)
  }
 

There! Go and parse that if you need to change this code sometime next year (note: having worked with this code for some time now while writing this blog, I must admit that it grows on you; after a while it doesn’t hurt all that much, sort of like a new pair of shoes). This cryptic piece of mal-ware illustrates the use of three powerful methods named groupBy, map and sum.
groupBy is in a sense comparable to SQL’s GROUP BY clause. The Scala version takes a collection and returns a Map. The map’s key is the element used to group by (Stuff.label in my case). The value is a List of the elements that have the same key. In this case the items collection contains Stuff instances. I’ve grouped by the label field, so the type of groupedByLabel is Map[String, List[Stuff]]. On this Map we have to apply a sum function to add up all Stuff instances in each List of Stuffs. sum, however, works on numbers such as Ints and not on Stuffs, so before sum can be applied we have to extract the number field of each Stuff instance. This is done by a map: map {_.number}. The mapping is applied to each Stuff instance in each list of Stuffs held in groupedByLabel.
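
The three steps can be followed one at a time in the sketch below, which spells out the intermediate values for label "A". The names `numbersOfA` and `sumOfA` are introduced here for illustration; `Stuff` and `items` are repeated so the snippet runs on its own.

```scala
case class Stuff(label: String, number: Int)
val items = List(Stuff("A", 1), Stuff("A", 2), Stuff("B", 3), Stuff("B", 4))

// Step 1: groupBy builds a Map from each label to the Stuffs carrying that label.
val groupedByLabel: Map[String, List[Stuff]] = items.groupBy(_.label)

// Step 2: map extracts the number field of every Stuff in one group.
val numbersOfA: List[Int] = groupedByLabel("A").map(_.number) // List(1, 2)

// Step 3: sum adds the extracted numbers up.
val sumOfA: Int = numbersOfA.sum // 3
```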

By factoring out part of the code on line 3 in the example above, we can improve readability:

  def sumByGroupSolution2 = {
    val stuffsGroupedByLabel = items.groupBy(_.label)
    def sumOfStuffsWithTheSameLabel(stuffs: List[Stuff]): Int = stuffs map { _.number } sum
    val result = stuffsGroupedByLabel map { t => Stuff(t._1, sumOfStuffsWithTheSameLabel(t._2)) }
    assertEquals(expectedResult, result)
  }
 

I think this is better because now the result of part of the computation is named. Naming a thing makes it easier to see the algorithm.

The next solution uses more intermediate results. The results of each map or sum operation are stored in a variable that gets a meaningful name. This clarifies the meaning of intermediate results but the amount of code grows.

  @Test
  def sumByGroupSolution5 = {
    val calcSumOfStuff = (stuffs: Seq[Stuff]) => stuffs map { _.number } sum
    val groupedByLabel = items.groupBy { _.label }
    val resultGroupedByLabel = groupedByLabel mapValues { calcSumOfStuff }
    val result = resultGroupedByLabel map { t => Stuff(t._1, t._2) }
    assertEquals(expectedResult, result)
  }

The code shows another Map function named mapValues.

 val resultGroupedByLabel = groupedByLabel mapValues { calcSumOfStuff }

mapValues transforms only the values of a map and leaves the keys as they are. In my case the keys don’t really matter (they’re an artifact of the groupBy call) so I can get away with mapValues rather than map.
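
What mapValues does can be spelled out with an ordinary map over the key/value pairs that passes each key through untouched. The sketch below is such an equivalent, not the code from the post; `sumsByLabel` is a name introduced here. It avoids mapValues itself because, as far as I know, calling it directly on a Map is deprecated from Scala 2.13 on in favour of `.view.mapValues(...)`.

```scala
case class Stuff(label: String, number: Int)
val items = List(Stuff("A", 1), Stuff("A", 2), Stuff("B", 3), Stuff("B", 4))

val calcSumOfStuff = (stuffs: Seq[Stuff]) => stuffs.map(_.number).sum
val groupedByLabel = items.groupBy(_.label)

// Keep every key as it is and transform only the value,
// which is what mapValues(calcSumOfStuff) expresses in a single word.
val sumsByLabel: Map[String, Int] =
  groupedByLabel.map { case (label, stuffs) => label -> calcSumOfStuff(stuffs) }
```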

Naming things was also the inspiration for the next solution. This time I’m not extracting intermediate results; instead I’ve named the variables being manipulated:

  def sumByGroupSolution4 = {
    val groupedByLabel = items.groupBy(_.label)
    val result = groupedByLabel map { case (label, stuffs) => Stuff(label, stuffs map { _.number } sum) }
    assertEquals(expectedResult, result)
  }

Passing a case clause (a pattern-matching anonymous function) as the argument to map makes it possible to introduce two variables that identify the things we’re manipulating. label and stuffs mean more to me than _1 and _2.

Looking back on my struggle I like this solution best. I think it is both concise and easy to read.
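
One caveat applies to all the groupBy variants: the Map that groupBy returns makes no promise about the order of its keys, so comparing the result against a List only works as long as the labels happen to come out in the expected order. A variant that sorts explicitly could look like the sketch below; the function name `sumByLabel` and the final sortBy are additions for illustration, not part of the original solutions.

```scala
case class Stuff(label: String, number: Int)

// Same idea as the last solution, wrapped in a function and sorted by label
// so that the outcome no longer depends on the iteration order of the Map.
def sumByLabel(items: List[Stuff]): List[Stuff] =
  items
    .groupBy(_.label)
    .map { case (label, stuffs) => Stuff(label, stuffs.map(_.number).sum) }
    .toList
    .sortBy(_.label)

val items = List(Stuff("A", 1), Stuff("A", 2), Stuff("B", 3), Stuff("B", 4))
val result = sumByLabel(items) // List(Stuff("A", 3), Stuff("B", 7))
```

Note that grouping also merges items with the same label that are not adjacent in the input, which the loop-based versions earlier in this post do not do.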

Categories: Companies