

Should Agile Equal Being Happy?

Leading Agile - Mike Cottmeyer - Tue, 02/17/2015 - 15:50

Ever had a conversation with someone about what they thought “being” Agile meant?  I was having that conversation today.  The other guy said he was surprised that he wasn’t happier.  I asked him to help me understand what he meant by that.

An Agile team should be happy

Someone, somewhere, convinced this fellow that the Manifesto for Agile Software Development included life, liberty, and the pursuit of happiness.

The reality is, I feel he was misguided, just like all of those other people who think that if you’re on an Agile team then you don’t plan, you don’t test, or you don’t document. The idea that Agile is all teddy bears and rainbows has somehow spread to the far reaches of the Agile community.

When asked if Agile makes me happy, my response was simple.


Being an Agile coach, leading Agile transformations, and helping customers reach their potential does not make me happy. It leaves me with a feeling of satisfaction. Much like mowing my lawn every weekend in summer, it doesn’t make me happy; but when I am done with the task at hand, I look at what I have accomplished and I feel satisfied. Isn’t that a more realistic goal? The pursuit of satisfaction, as it relates to work? Happiness is an emotional state that I reserve for my personal life, when I combine satisfaction from my work with positive emotions in my off-time.

Is the goal of happiness within an Agile team misguided?

I’m interested in your thoughts.

The post Should Agile Equal Being Happy? appeared first on LeadingAgile.

Categories: Blogs

Python/pandas: Column value in list (ValueError: The truth value of a Series is ambiguous.)

Mark Needham - Mon, 02/16/2015 - 23:39

I’ve been using Python’s pandas library while exploring some CSV files and although for the most part I’ve found it intuitive to use, I had trouble filtering a data frame based on checking whether a column value was in a list.

A subset of one of the CSV files I’ve been working with looks like this:

$ cat foo.csv
Foo
1
2
3
4
5
6
7
8
9
10

Loading it into a pandas data frame is reasonably simple:

import pandas as pd
df = pd.read_csv('foo.csv', index_col=False, header=0)
>>> df
0    1
1    2
2    3
3    4
4    5
5    6
6    7
7    8
8    9
9   10

If we want to find the rows which have a value of 1 we’d write the following:

>>> df[df["Foo"] == 1]
0    1

Finding the rows with a value less than 7 is as you’d expect too:

>>> df[df["Foo"] < 7]
0    1
1    2
2    3
3    4
4    5
5    6

Next I wanted to filter out the rows containing odd numbers which I initially tried to do like this:

odds = [i for i in range(1,10) if i % 2 != 0]
>>> odds
[1, 3, 5, 7, 9]
>>> df[df["Foo"] in odds]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/markneedham/projects/neo4j-himym/himym/lib/python2.7/site-packages/pandas/core/", line 698, in __nonzero__
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Unfortunately that doesn’t work and I couldn’t get any of the suggestions from the error message to work either. Luckily pandas has a special isin function for this use case which we can call like this:

>>> df[df["Foo"].isin(odds)]
0    1
2    3
4    5
6    7
8    9

Much better!
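
As a quick follow-up, the same boolean mask can be inverted with ~ if we want the rows whose value is not in the list – a small sketch against the same data frame:

>>> df[~df["Foo"].isin(odds)]
1    2
3    4
5    6
7    8
9   10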

Categories: Blogs

Managing Flow

TV Agile - Mon, 02/16/2015 - 20:45
This presentation shows the impact that wait time has on when the customer receives the product or service, and how, by focusing on wait time, we can improve the flow of products or services to our customers and significantly reduce the time to delivery. Deliver with higher frequency and better quality with example […]
Categories: Blogs

Agile Misconceptions: There Is One Right Approach

Johanna Rothman - Mon, 02/16/2015 - 17:59

I have an article up on called Common Misconceptions about Agile: There Is Only One Approach.

If you read my Design Your Agile Project series, you know I am a fan of determining what approach works when for your organization or project.

Please leave comments over there. Thanks!

Two notes:

  1. If you would like to write an article for, I’m the technical editor. Send me your article and we can go from there.
  2. If you would like more common-sense approaches to agile, sign up for the Influential Agile Leader. We’re leading it in San Francisco and London this year. Early bird pricing ends soon.
Categories: Blogs

What good are story points and velocity in Scrum?

Scrum Breakfast - Mon, 02/16/2015 - 12:10
We use velocity as a measure of how many story points to take into the next sprint. When you take in enough stories, and story points, to reach your average velocity, you can end the sprint planning meeting.

Although this is a common approach, it is exactly how you should not use story points in Scrum. It leads to over-commitment and spillover (started, but unfinished work) at the end of the sprint. Both of these are bad for performance. How should you use story points in planning? How do you create the Forecast? And what do you do if the team runs out of work?

The first thing to remember is that the Development Team is self-organizing. They have exclusive jurisdiction over how much work they take on. The Product Owner has final say over the ordering of items in the backlog, but nobody tells the Development Team how much work to take on! Not the Product Owner, not the ScrumMaster, and certainly not the math!

As a Product Owner, I would use story points to help set medium and long-term expectations on what is really achievable. Wish and probable reality need to be more or less in sync with each other. If the disparity is too big, it's the Product Owner's job to fix the problem, and she has lots of options: less scope, simpler acceptance criteria, more time, more people, pivot, persevere, or even abandon.

As a ScrumMaster, I would use velocity to identify a number of dysfunctions. A wavy burndown chart is often a symptom of stories that are too big, excessive spillover, or poorly understood acceptance criteria (to name the most likely causes). A flattening burn-down chart is often a sign of technical debt. An accelerating burn-down chart may be sign of management pressure to perform (story point inflation). A lack of a burn-down or velocity chart may be a sign of flying blind!

As a member of the Development Team, I would use the estimate in story points to help decide whether stories are ready to take into the sprint. An individual story should represent on average 10% or less of the team's capacity.
How to create the Sprint Forecast

How much work should the team take on in a sprint? As Scrum Master, I would ask the team: can you do the first story? Can you do the first and the second? Can you do the first, the second and the third? Keep asking until the team hesitates. As soon as they hesitate, stop. That is the forecast.

Why should you stop at this point? Taking on more stories will add congestion and slow down the team. Think of the highway at rush hour. Do more cars on the road mean the traffic moves faster? Would be nice.

Why do you even make a forecast? Some projects say, let's just get into a state of flow, and pull work as we are ready to take it. This can work too, but my own experience with that approach has been mixed. It is very easy to lose focus on getting things done and lose the ability to predict what can be done over a longer period of time. So I believe Sprint Forecasts are useful because they help us inspect-and-adapt enroute to our longer term goal.

What about "yesterday's weather"? Can we use the results of the last sprint to reality check the forecast for this sprint? Sure! If your team promised 100 but only delivered 70 or less, this is a sign that they should not commit to more than 70, and quite probably less. I call this "throttling", and it is one of my 12 Tips for Product Owners who want better performance from their Scrum Teams. But yesterday's weather is not a target, it's a sanity check. If it becomes your target, it may be holding you down.
What if the team runs out of work?

On the one hand, this is easy. If the team runs out of work, they can just ask the Product Owner for more. A working agreement can streamline this process, for example: Team, if you run out of work, you can:

  • Take the top item from the product backlog.
  • Contact me (the Product Owner) if you get down to just one ready item in the backlog
  • Implement your top priority improvement to our code ("refactoring")

Implementing improvements from the last retrospective is usually a particularly good idea, unless you are very close to a release. These are investments in productivity that often pay huge dividends, surprisingly quickly!

Categories: Blogs

Want best impact? Change yourself!

Manage Well - Tathagat Varma - Mon, 02/16/2015 - 12:02
A lot of us want to create an impact, especially the ones that come in B-I-G font size. Change the world. Stop global warming. Establish world peace. Find a cancer cure. Stop wars. Leave a legacy that lasts forever. We want to conquer the world with our ideas, our creations, our accomplishments.
Categories: Blogs

Kanban Thinking Workshop in London

AvailAgility - Karl Scotland - Mon, 02/16/2015 - 11:00


I have another public Kanban Thinking workshop coming up in London (March 5-6), in collaboration with Agil8, and to fill the last few places I can offer a discount! Book now, using the code KS25 to get 25% off the standard price, and get two days of fun: discover how to design a kanban system by populating a kanban canvas, and learn how to make system interventions which have a positive impact.

To whet your appetite, here are a couple of photos from a recent workshop. (Click for larger versions).


Categories: Blogs

Early Bird Ends Soon for Influential Agile Leader

Johanna Rothman - Sun, 02/15/2015 - 22:46

If you are a leader for your agile efforts in your organization, you need to consider participating in The Influential Agile Leader. If you are working on how to transition to agile, how to talk about agile, how to help your peers, managers, or teams, you want to participate.

Gil Broza and I designed it to be experiential and interactive. We’re leading the workshop in San Francisco, Mar 31-Apr 1. We’ll be in London April 14-15.

The early bird pricing ends Feb 20.

People who participate see great results, especially when they bring peers/managers from their organization. Sign up now.

Categories: Blogs

Python/scikit-learn: Calculating TF/IDF on How I met your mother transcripts

Mark Needham - Sun, 02/15/2015 - 17:56

Over the past few weeks I’ve been playing around with various NLP techniques to find interesting insights into How I met your mother from its transcripts and one technique that kept coming up is TF/IDF.

The Wikipedia definition reads like this:

tf–idf, short for term frequency–inverse document frequency, is a numerical statistic that is intended to reflect how important a word is to a document in a collection or corpus.

It is often used as a weighting factor in information retrieval and text mining.

The tf-idf value increases proportionally to the number of times a word appears in the document, but is offset by the frequency of the word in the corpus, which helps to adjust for the fact that some words appear more frequently in general.

I wanted to generate a TF/IDF representation of phrases used in the hope that it would reveal some common themes used in the show.

Python’s scikit-learn library gives you two ways to generate the TF/IDF representation:

  1. Generate a matrix of token/phrase counts from a collection of text documents using CountVectorizer and feed it to TfidfTransformer to generate the TF/IDF representation.
  2. Feed the collection of text documents directly to TfidfVectorizer and go straight to the TF/IDF representation skipping the middle man.

I started out using the first approach and hadn’t quite got it working when I realised there was a much easier way!

I have a collection of sentences in a CSV file so the first step is to convert those into a list of documents:

from collections import defaultdict
import csv

episodes = defaultdict(list)
with open("data/import/sentences.csv", "r") as sentences_file:
    reader = csv.reader(sentences_file, delimiter=',')
    for row in reader:
        # column 1 holds the episode id, column 4 the sentence text
        episodes[row[1]].append(row[4])

for episode_id, text in episodes.iteritems():
    episodes[episode_id] = "".join(text)

corpus = []
for id, episode in sorted(episodes.iteritems(), key=lambda t: int(t[0])):
    corpus.append(episode)

corpus contains 208 entries (1 per episode), each of which is a string containing the transcript of that episode. Next it’s time to train our TF/IDF model which is only a few lines of code:

from sklearn.feature_extraction.text import TfidfVectorizer
tf = TfidfVectorizer(analyzer='word', ngram_range=(1,3), min_df = 0, stop_words = 'english')

The most interesting parameter here is ngram_range – we’re telling it to generate 2 and 3 word phrases along with the single words from the corpus.

e.g. if we had the sentence “Python is cool” we’d end up with 6 phrases – ‘Python’, ‘is’, ‘cool’, ‘Python is’, ‘Python is cool’ and ‘is cool’.
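
If you want to sanity check what phrases a sentence turns into, the vectorizer’s build_analyzer() exposes just the tokenising/n-gram step – a quick sketch (note it lowercases by default, and stop words are left off here so all six phrases survive):

from sklearn.feature_extraction.text import TfidfVectorizer

analyze = TfidfVectorizer(analyzer='word', ngram_range=(1, 3)).build_analyzer()
print analyze("Python is cool")
# -> the six phrases: 'python', 'is', 'cool', 'python is', 'is cool', 'python is cool'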

Let’s execute the model against our corpus:

tfidf_matrix = tf.fit_transform(corpus)
feature_names = tf.get_feature_names()
>>> len(feature_names)
498254
>>> feature_names[50:70]
[u'00 does sound', u'00 don', u'00 don buy', u'00 dressed', u'00 dressed blond', u'00 drunkenly', u'00 drunkenly slurred', u'00 fair', u'00 fair tonight', u'00 fall', u'00 fall foliage', u'00 far', u'00 far impossible', u'00 fart', u'00 fart sure', u'00 friends', u'00 friends singing', u'00 getting', u'00 getting guys', u'00 god']

So we’re got nearly 500,000 phrases and if we look at tfidf_matrix we’d expect it to be a 208 x 498254 matrix – one row per episode, one column per phrase:

>>> tfidf_matrix
<208x498254 sparse matrix of type '<type 'numpy.float64'>'
	with 740396 stored elements in Compressed Sparse Row format>

This is what we’ve got although under the covers it’s using a sparse representation to save space. Let’s convert the matrix to dense format to explore further and find out why:

dense = tfidf_matrix.todense()
>>> len(dense[0].tolist()[0])
498254

What I’ve printed out here is the size of one row of the matrix which contains the TF/IDF score for every phrase in our corpus for the 1st episode of How I met your mother. A lot of those phrases won’t have happened in the 1st episode so let’s filter those out:

episode = dense[0].tolist()[0]
phrase_scores = [pair for pair in zip(range(0, len(episode)), episode) if pair[1] > 0]
>>> len(phrase_scores)

There are just under 5,000 phrases used in this episode, roughly 1% of the phrases in the whole corpus. The sparse matrix makes a bit more sense now – if scipy used a dense matrix representation there’d be 493,000 entries with no score, which becomes more significant as the number of documents increases.
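
Incidentally, we don’t strictly have to densify at all – the non-zero (position, score) pairs for one episode can be read straight off the sparse row. A small sketch of that alternative (not from the original post), using scipy’s COO view:

# alternative to the todense() route: the COO view of one row already
# holds only the non-zero column indices and their scores, giving the
# same (position, score) pairs as phrase_scores below
row = tfidf_matrix[0].tocoo()
sparse_phrase_scores = zip(row.col, row.data)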

Next we’ll sort the phrases by score in descending order to find the most interesting phrases for the first episode of How I met your mother:

>>> sorted(phrase_scores, key=lambda t: t[1] * -1)[:5]
[(419207, 0.2625177493269755), (312591, 0.19571419072701732), (267538, 0.15551468983363487), (490429, 0.15227880637176266), (356632, 0.1304175242341549)]

The first value in each tuple is the phrase’s position in our initial vector and also corresponds to the phrase’s position in feature_names which allows us to map the scores back to phrases. Let’s look up a couple of phrases:

>>> feature_names[419207]
u'ted'
>>> feature_names[312591]
u'olives'
>>> feature_names[356632]
u'robin'

Let’s automate that lookup:

sorted_phrase_scores = sorted(phrase_scores, key=lambda t: t[1] * -1)
for phrase, score in [(feature_names[word_id], score) for (word_id, score) in sorted_phrase_scores][:20]:
   print('{0: <20} {1}'.format(phrase, score))
ted                  0.262517749327
olives               0.195714190727
marshall             0.155514689834
yasmine              0.152278806372
robin                0.130417524234
barney               0.124411751867
lily                 0.122924977859
signal               0.103793246466
goanna               0.0981379875009
scene                0.0953423604123
cut                  0.0917336653574
narrator             0.0864622981985
flashback            0.078295921554
flashback date       0.0702825260177
ranjit               0.0693927691559
flashback date robin 0.0585687716814
ted yasmine          0.0585687716814
carl                 0.0582101172888
eye patch            0.0543650529797
lebanese             0.0543650529797

We see all the main characters’ names, which aren’t that interesting – perhaps they should be part of the stop list – but also ‘olives’, which is where the olive theory is first mentioned. I thought olives came up more often, but a quick search for the term suggests it isn’t mentioned again until Episode 9 of Season 9:

$ grep -rni --color "olives" data/import/sentences.csv | cut -d, -f 2,3,4 | sort | uniq -c
  16 1,1,1
   3 193,9,9

‘yasmine’ is also an interesting phrase in this episode but she’s never mentioned again:

$ grep -h -rni --color "yasmine" data/import/sentences.csv
49:48,1,1,1,"Barney: (Taps a woman names Yasmine) Hi, have you met Ted? (Leaves and watches from a distance)."
50:49,1,1,1,"Ted: (To Yasmine) Hi, I'm Ted."
51:50,1,1,1,Yasmine: Yasmine.
53:52,1,1,1,"Yasmine: Thanks, It's Lebanese."
65:64,1,1,1,"[Cut to the bar, Ted is chatting with Yasmine]"
67:66,1,1,1,Yasmine: So do you think you'll ever get married?
68:67,1,1,1,"Ted: Well maybe eventually. Some fall day. Possibly in Central Park. Simple ceremony, we'll write our own vows. But--eh--no DJ, people will dance. I'm not going to worry about it! Damn it, why did Marshall have to get engaged? (Yasmine laughs) Yeah, nothing hotter than a guy planning out his own imaginary wedding, huh?"
69:68,1,1,1,"Yasmine: Actually, I think it's cute."
79:78,1,1,1,"Lily: You are unbelievable, Marshall. No-(Scene splits in half and shows both Lily and Marshall on top arguing and Ted and Yasmine on the bottom mingling)"
82:81,1,1,1,Ted: (To Yasmine) you wanna go out sometime?
85:84,1,1,1,[Cut to Scene with Ted and Yasmine at bar]
86:85,1,1,1,Yasmine: I'm sorry; Carl's my boyfriend (points to bartender)
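
If we did want to push the character names onto the stop list mentioned above, TfidfVectorizer also accepts an explicit list of stop words, so one option (just a sketch – the set of names below is illustrative, not exhaustive) would be to extend the built-in English list:

from sklearn.feature_extraction.text import TfidfVectorizer, ENGLISH_STOP_WORDS

# illustrative set of character names to exclude alongside the standard English stop words
character_names = {"ted", "marshall", "lily", "robin", "barney"}
tf = TfidfVectorizer(analyzer='word', ngram_range=(1, 3), min_df=0,
                     stop_words=list(ENGLISH_STOP_WORDS | character_names))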

It would be interesting to filter out the phrases which don’t occur in any other episode and see what insights we get from doing that. For now though we’ll extract phrases for all episodes and write to CSV so we can explore more easily:

with open("data/import/tfidf_scikit.csv", "w") as file:
    writer = csv.writer(file, delimiter=",")
    writer.writerow(["EpisodeId", "Phrase", "Score"])
    doc_id = 0
    for doc in tfidf_matrix.todense():
        print "Document %d" %(doc_id)
        word_id = 0
        for score in doc.tolist()[0]:
            if score > 0:
                word = feature_names[word_id]
                writer.writerow([doc_id+1, word.encode("utf-8"), score])
            word_id +=1
        doc_id +=1

And finally a quick look at the contents of the CSV:

$ tail -n 10 data/import/tfidf_scikit.csv
208,york apparently laughs,0.012174304095213192
208,york aren,0.012174304095213192
208,york aren supposed,0.012174304095213192
208,young ladies,0.012174304095213192
208,young ladies need,0.012174304095213192
208,young man,0.008437685963000223
208,young man game,0.012174304095213192
208,young stupid,0.011506395106658192
208,young stupid sighs,0.012174304095213192
Categories: Blogs

Diamond Kata - Some Thoughts on Tests as Documentation

Mistaeks I Hav Made - Nat Pryce - Sun, 02/15/2015 - 14:13
Comparing example-based tests and property-based tests for the Diamond Kata, I’m struck by how well property-based tests reduce duplication of test code. For example, in the solutions by Sandro Mancuso and George Dinwiddie, not only do multiple tests exercise the same property with different examples but the tests duplicate assertions. Property-based tests avoid the former by defining generators of input data, but I’m not sure why the latter occurs. Perhaps Seb’s “test recycling” approach would avoid this kind of duplication.

But compared to example-based tests, property-based tests do not work so well as an explanatory overview. Examples convey an overall impression of what the functionality is, but are not good at describing precise details. When reading example-based tests, you have to infer the properties of the code from multiple examples and informal text in identifiers and comments. The property-based tests I wrote for the Diamond Kata specify precise properties of the diamond function, but nowhere is there a test that describes that the function draws a diamond!

There’s a place for both examples and properties. It’s not an either/or decision. However, explanatory examples used for documentation need not be test inputs. If we’re generating inputs for property tests and generating documentation for our software, we can combine the two, and insert generated inputs and calculated outputs into generated documentation.
Categories: Blogs

Sincere Seekers in Search of True Love

Portia Tung - Selfish Programming - Sat, 02/14/2015 - 22:43


Years ago, I made a wish. A wish that one day, I’d be brave enough and mad enough to take part in the movement that is taking the world by storm, or should I say love? I’m, of course, referring to the Free Hugs Campaign started by one man in an attempt to reconnect with humanity.

I first came across “free hugging” during a visit to Helsinki back in December 2008. It was a bitterly cold winter, the kind that made you worry about losing a toe or two if you spent too long stomping the white pavement on your own.

I was wandering around the city after a jam-packed day of Agile training and who did I find beaming with warm smiles and arms wide open towards me but two young women at the train station?

Incredibly, these two young women were offering free hugs. To anyone and everyone.

A Wish Come True

After 6 long years, this random wish of mine finally came true. On Sunday, 18 January 2015, to my great fear and delight, I was offered the chance to give free hugs to the people frequenting Pimlico (home of Tate Britain) on a chilly winter afternoon.

And in spite of the butterflies in my tummy screaming “No!!! Don’t do it!!!”, I knew my time had come. To connect with the rest of humanity like I’ve never dared to but have always longed to do.

Together with a bunch of well-wishing strangers in search of inner peace, I stomped the pavement and offered free hugs to anyone and everyone.

Between us, we hugged over 80 people in under an hour and didn’t get arrested.

For me, the most remarkable takeaway from that experience is that I learned more about what it means to be human in those 60 minutes than I have in my lifetime so far.

I learned that strangers can be kind and generous. That most of us want nothing more than to connect with one another. That we’re all in search of true love and when we find it, what better way to celebrate it than with a hug?

Happy Valentine’s Day!

Categories: Blogs

The Great Love Quotes Collection Revamped

J.D. Meier's Blog - Sat, 02/14/2015 - 21:30

A while back I put together a comprehensive collection of love quotes.   It’s a combination of the wisdom of the ages + modern sages.   In the spirit of Valentine’s Day, I gave it a good revamp.  Here it is:

The Great Love Quotes Collection

It's a serious collection of love quotes and includes lessons from the likes of Lucille Ball, Shakespeare, Socrates, and even The Princess Bride.

How I Organized the Categories for Love Quotes

I organized the quotes into a set of buckets:
Broken Hearts and Loss
Falling in Love
Fear and Love
Fun and Love
Love and Life
Significance and Meaning
The Power of Love
True Love

I think there’s a little something for everyone among the various buckets.   If you walk away with three new quotes that make you feel a little lighter, put a little skip in your step, or help you see love in a new light, then mission accomplished.

Think of Love as Warmth and Connection

If you think of love like warmth and connection, you can create more micro-moments of love in your life.

This might not seem like a big deal, but if you knew all the benefits for your heart, brain, bodily processes, and even your life span, you might think twice.

You might be surprised by how much your career can be limited if you don’t balance connection with conviction. It’s not uncommon to hear about turning points in the careers of developers, program managers, IT leaders, and business leaders that changed their game when they changed their heart.

In fact, on one of the teams I was on, the original mantra was “business before technology”, but people in the halls started to say, “people before business, business before technology” to remind people of what makes business go round.

When people treat each other better, work and life get better.

Love Quotes Help with Insights and Actions

Here are a few of my favorite love quotes from the collection …

“Love is like heaven, but it can hurt like hell.” – Unknown

“Love is not a feeling, it’s an ability.” — Dan in Real Life

“There is a place you can touch a woman that will drive her crazy. Her heart.” — Milk Money

“Hearts will be practical only when they are made unbreakable.”  – The Wizard of Oz

“Things are beautiful if you love them.” – Jean Anouilh

“Life is messy. Love is messier.” – Catch and Release

“To the world you may be just one person, but to one person you may be the world.” – Unknown

For many more quotes, explore The Great Love Quotes Collection.

You Might Also Like

Happiness Quotes Revamped

My Story of Personal Transformation

The Great Leadership Quotes Collection Revamped

The Great Personal Development Quotes Collection Revamped

The Great Productivity Quotes Collection

Categories: Blogs

Changing Behavior by Asking the Right Questions

George Dinwiddie’s blog - Sat, 02/14/2015 - 03:16

My article, Agile Adoption: Changing Behavior by Asking the Right Questions, has been published over on (free registration required). It talks about when managers want change, but don’t want to squeeze the Agile out by force.

Categories: Blogs

Neo4j: Building a topic graph with Prismatic Interest Graph API

Mark Needham - Sat, 02/14/2015 - 01:38

Over the last few weeks I’ve been using various NLP libraries to derive topics for my corpus of How I met your mother episodes without success, and was therefore enthused to see the release of Prismatic’s Interest Graph API.

The Interest Graph API exposes a web service to which you feed a block of text and get back a set of topics, each with an associated score.

It has been trained over the last few years with millions of articles that people share on their social media accounts and in my experience using Prismatic the topics have been very useful for finding new material to read.

The first step is to head to and get an API key which will be emailed to you.

Having done that we’re ready to make some calls to the API and get back some topics.

I’m going to use Python to call the API and I’ve found the requests library the easiest library to use for this type of work. Our call to the API looks like this:

import requests
payload = { 'title': "insert title of article here",
            'body': "insert body of text here",
            'api-token': "insert token sent by email here"}
r = requests.post("", data=payload)  # the Interest Graph endpoint URL goes here

One thing to keep in mind is that the API is rate limited to 20 requests a second so we need to restrict our requests or we’re going to receive error response codes. Luckily I came across an excellent blog post showing how to write a decorator around a function and only allow it to execute at a certain frequency.

To rate limit our calls to the Interest Graph we need to pull the above code into a function and annotate it appropriately:

import time
def RateLimited(maxPerSecond):
    minInterval = 1.0 / float(maxPerSecond)
    def decorate(func):
        lastTimeCalled = [0.0]
        def rateLimitedFunction(*args, **kargs):
            elapsed = time.clock() - lastTimeCalled[0]
            leftToWait = minInterval - elapsed
            if leftToWait > 0:
                time.sleep(leftToWait)
            ret = func(*args, **kargs)
            lastTimeCalled[0] = time.clock()
            return ret
        return rateLimitedFunction
    return decorate

@RateLimited(20)  # keep within the 20 requests/second limit mentioned above
def topics(title, body):
    payload = { 'title': title,
                'body': body,
                'api-token': "insert token sent by email here"}
    r = requests.post("", data=payload)  # same Interest Graph endpoint URL as above
    return r

The text I want to classify is stored in a CSV file – one sentence per line. Here’s a sample:

$ head -n 10 data/import/sentences.csv
2,1,1,1,Scene One
3,1,1,1,[Title: The Year 2030]
4,1,1,1,"Narrator: Kids, I'm going to tell you an incredible story. The story of how I met your mother"
5,1,1,1,Son: Are we being punished for something?
6,1,1,1,Narrator: No
7,1,1,1,"Daughter: Yeah, is this going to take a while?"
8,1,1,1,"Narrator: Yes. (Kids are annoyed) Twenty-five years ago, before I was dad, I had this whole other life."
9,1,1,1,"(Music Plays, Title ""How I Met Your Mother"" appears)"

We’ll also need to refer to another CSV file to get the title of each episode since it isn’t being stored with the sentence:

$ head -n 10 data/import/episodes_full.csv
1,1,/wiki/Pilot,1,"September 19, 2005",1127084400,Pilot,Pamela Fryman,10.94,"Carter Bays,Craig Thomas",68
2,2,/wiki/Purple_Giraffe,1,"September 26, 2005",1127689200,Purple Giraffe,Pamela Fryman,10.40,"Carter Bays,Craig Thomas",63
3,3,/wiki/Sweet_Taste_of_Liberty,1,"October 3, 2005",1128294000,Sweet Taste of Liberty,Pamela Fryman,10.44,"Phil Lord,Chris Miller",67
4,4,/wiki/Return_of_the_Shirt,1,"October 10, 2005",1128898800,Return of the Shirt,Pamela Fryman,9.84,Kourtney Kang,59
5,5,/wiki/Okay_Awesome,1,"October 17, 2005",1129503600,Okay Awesome,Pamela Fryman,10.14,Chris Harris,53
6,6,/wiki/Slutty_Pumpkin,1,"October 24, 2005",1130108400,Slutty Pumpkin,Pamela Fryman,10.89,Brenda Hsueh,62
7,7,/wiki/Matchmaker,1,"November 7, 2005",1131321600,Matchmaker,Pamela Fryman,10.55,"Sam Johnson,Chris Marcil",57
8,8,/wiki/The_Duel,1,"November 14, 2005",1131926400,The Duel,Pamela Fryman,10.35,Gloria Calderon Kellett,46
9,9,/wiki/Belly_Full_of_Turkey,1,"November 21, 2005",1132531200,Belly Full of Turkey,Pamela Fryman,10.29,"Phil Lord,Chris Miller",60

Now we need to get our episode titles and transcripts ready to pass to the topics function. Since we’ve only got ~ 200 episodes we can create a dictionary to store that data:

episodes = {}
with open("data/import/episodes_full.csv", "r") as episodesfile:
    episodes_reader = csv.reader(episodesfile, delimiter=",")
    for episode in episodes_reader:
        episodes[int(episode[0])] = {"title": episode[6], "sentences" : [] }

with open("data/import/sentences.csv", "r") as sentencesfile:
    sentences_reader = csv.reader(sentencesfile, delimiter=",")
    for sentence in sentences_reader:
        # column 1 holds the episode id, column 4 the sentence text
        episodes[int(sentence[1])]["sentences"].append(sentence[4])

>>> episodes[1]["title"]
'Pilot'
>>> episodes[1]["sentences"][:5]
['Pilot', 'Scene One', '[Title: The Year 2030]', "Narrator: Kids, I'm going to tell you an incredible story. The story of how I met your mother", 'Son: Are we being punished for something?']

Now we’re going to loop through each of the episodes, call topics and write the result into a CSV file so we can load it into Neo4j afterwards to explore the data:

import json
with open("data/import/topics.csv", "w") as topicsfile:
    topics_writer = csv.writer(topicsfile, delimiter=",")
    topics_writer.writerow(["EpisodeId", "TopicId", "Topic", "Score"])
    for episode_id, episode in episodes.iteritems():
        tmp = topics(episode["title"], "".join(episode["sentences"])).json()
        print episode_id, tmp
        for topic in tmp['topics']:
            topics_writer.writerow([episode_id, topic["id"], topic["topic"], topic["score"]])

It takes about 10 minutes to run and this is a sample of the output:

$ head -n 10 data/import/topics.csv
1,1163,Dating and Courtship,0.5487490108554022

We’ll use Neo4j’s LOAD CSV command to load the data in:

// make sure the topics exist
LOAD CSV WITH HEADERS FROM "file:///Users/markneedham/projects/neo4j-himym/data/import/topics.csv" AS row
MERGE (topic:Topic {id: TOINT(row.TopicId)})
ON CREATE SET topic.value = row.Topic
// now link the episodes and topics
LOAD CSV WITH HEADERS FROM "file:///Users/markneedham/projects/neo4j-himym/data/import/topics.csv" AS row
MATCH (topic:Topic {id: TOINT(row.TopicId)})
MATCH (episode:Episode {id: TOINT(row.EpisodeId)})
MERGE (episode)-[:TOPIC {score: TOFLOAT(row.Score)}]->(topic)

We’ll assume that the episodes and seasons are already loaded – the commands to load those in are on github.

We can now write some queries against our topic graph. We’ll start simple – show me the topics for an episode:

MATCH (episode:Episode {id: 1})-[r:TOPIC]->(topic)
RETURN topic, r


Let’s say we liked the ‘Puns’ aspect of the Pilot episode and want to find out which other episodes had puns. The following query would let us find those:

MATCH (episode:Episode {id: 1})-[r:TOPIC]->(topic {value: "Puns"})<-[:TOPIC]-(other)
RETURN episode, topic, other


Or maybe we want to find the episode which has the most topics in common:

MATCH (episode:Episode {id: 1})-[:TOPIC]->(topic),
      (otherEpisode:Episode)-[r:TOPIC]->(topic)
WHERE otherEpisode <> episode
RETURN otherEpisode.title as episode, COUNT(r) AS topicsInCommon
ORDER BY topicsInCommon DESC
LIMIT 10
==> +------------------------------------------------+
==> | episode                       | topicsInCommon |
==> +------------------------------------------------+
==> | "Purple Giraffe"              | 6              |
==> | "Ten Sessions"                | 5              |
==> | "Farhampton"                  | 4              |
==> | "The Three Days Rule"         | 4              |
==> | "How I Met Everyone Else"     | 4              |
==> | "The Time Travelers"          | 4              |
==> | "Mary the Paralegal"          | 4              |
==> | "Lobster Crawl"               | 4              |
==> | "The Magician's Code, Part 2" | 4              |
==> | "Slutty Pumpkin"              | 4              |
==> +------------------------------------------------+
==> 10 rows

We could then tweak that query to get the names of those topics:

MATCH (episode:Episode {id: 1})-[:TOPIC]->(topic),
      (otherEpisode:Episode)-[r:TOPIC]->(topic),
      (otherEpisode)-[:IN_SEASON]->(season) // season relationship name assumed from the earlier import
WHERE otherEpisode <> episode
RETURN otherEpisode.title as episode, season.number AS season, COUNT(r) AS topicsInCommon, COLLECT(topic.value)
ORDER BY topicsInCommon DESC
LIMIT 10
==> +-----------------------------------------------------------------------------------------------------------------------------------+
==> | episode                   | season | topicsInCommon | COLLECT(topic.value)                                                        |
==> +-----------------------------------------------------------------------------------------------------------------------------------+
==> | "Purple Giraffe"          | "1"    | 6              | ["Humour","Fiction","Kissing","Dating and Courtship","Flirting","Laughing"] |
==> | "Ten Sessions"            | "3"    | 5              | ["Humour","Puns","Dating and Courtship","Flirting","Laughing"]              |
==> | "How I Met Everyone Else" | "3"    | 4              | ["Humour","Fiction","Dating and Courtship","Laughing"]                      |
==> | "Farhampton"              | "8"    | 4              | ["Humour","Fiction","Kissing","Dating and Courtship"]                       |
==> | "Bedtime Stories"         | "9"    | 4              | ["Humour","Puns","Dating and Courtship","Laughing"]                         |
==> | "Definitions"             | "5"    | 4              | ["Kissing","Dating and Courtship","Flirting","Laughing"]                    |
==> | "Lobster Crawl"           | "8"    | 4              | ["Humour","Dating and Courtship","Flirting","Laughing"]                     |
==> | "Little Boys"             | "3"    | 4              | ["Humour","Puns","Dating and Courtship","Laughing"]                         |
==> | "Wait for It"             | "3"    | 4              | ["Fiction","Puns","Flirting","Laughing"]                                    |
==> | "Mary the Paralegal"      | "1"    | 4              | ["Humour","Dating and Courtship","Flirting","Laughing"]                     |
==> +-----------------------------------------------------------------------------------------------------------------------------------+

Overall 168 (out of 208) of the other episodes have a topic in common with the first episode so perhaps just having a topic in common isn’t the best indication of similarity.

An interesting next step would be to calculate cosine or jaccard similarity between the episodes and store that value in the graph for querying later on.
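
To give a flavour of what that might look like, here’s a minimal sketch (working from the topics.csv generated above) that computes Jaccard similarity over each episode’s set of topic ids; the result could then be written back into the graph as a similarity relationship:

import csv
from collections import defaultdict
from itertools import combinations

# build a set of topic ids per episode from the CSV generated above
episode_topics = defaultdict(set)
with open("data/import/topics.csv", "r") as topicsfile:
    reader = csv.reader(topicsfile, delimiter=",")
    next(reader)  # skip the header row
    for episode_id, topic_id, topic, score in reader:
        episode_topics[episode_id].add(topic_id)

def jaccard(a, b):
    return len(a & b) / float(len(a | b))

similarities = [(e1, e2, jaccard(episode_topics[e1], episode_topics[e2]))
                for e1, e2 in combinations(episode_topics, 2)]
print sorted(similarities, key=lambda t: t[2], reverse=True)[:10]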

I’ve also calculated the most common bigrams across all the transcripts so it would be interesting to see if there are any interesting insights at the intersection of episodes, topics and phrases.

Categories: Blogs

Cross-Platform AutoMapper (again)

Jimmy Bogard - Fri, 02/13/2015 - 17:04

Building cross-platform support for AutoMapper has taken some…interesting twists and turns. First, I supported AutoMapper in Silverlight 3.0 five (!) years ago. I did this with compiler directives.

Next, I got tired of compiler directives, tired of Silverlight, and went back to only supporting .NET 4.

Then in AutoMapper 3.0, I supported multiple platforms via portable class libraries. When that first came out, I started getting reports of exceptions that I didn’t think should ever show up, but there was a problem. MSBuild doesn’t want to copy referenced assemblies that aren’t actually being used, so I’d get issues where you’d reference platform-specific assemblies in a “Core” library, but your “UI” project that referenced “Core” didn’t pull in the platform-specific assembly.

So began a journey to force the platform-specific assembly to get copied over, no matter what. But even that was an issue – I went through several different iterations of this before it finally, reliably worked.

Unless you’re on Xamarin, which doesn’t support using this method (of PowerShell scripts) to run install scripts on Mac.

Then I had a GitHub issue from Microsoft folks asking for CoreCLR support. And with vNext projects, the project itself describes the platforms to support, including all files in the directory. Meaning I wouldn’t be picking and choosing which files should be in the assembly or not. So, we’re back to square one.

A new path

With CoreCLR and the vNext project style that is folder-based rather than scattershot, pick-and-choose file based, I could only get CoreCLR support working by using conditional compiler directives. This was already in AutoMapper in a few places, but mainly in files shared between the platform-specific assemblies. I’ve always had to do a little bit of this:


Not absolutely horrible, but now with CoreCLR, I need to do this everywhere. To keep my sanity, I needed to include every file in every project. Ideally, I could just have the one portable library, but that won’t work until CoreCLR is fully released. With CoreCLR, I wanted to just have one single project that built multiple platforms. vNext class libraries can do this out-of-the-box:


However, I couldn’t move all platforms/frameworks since they’re not all supported in vNext class projects (yet). I still had to have individual projects.

Back when I supported Silverlight 3 for the first time, I abandoned support because it was a huge pain managing multiple projects and identical files. With vNext project files, which just include all files in a folder without any explicit adding, I could have a great experience. I needed that with my other projects. The final project structure looked like this:


In the root PCL project, I’ll do all of the work. Refactoring, coding, anything. All of the platform-specific projects will just include all the source files to compile. To get them to do this, however, meant I needed to modify the project files to include files via wildcard:


My projects automatically include *all* files within folders (I needed to explicitly specify individual folders for whatever reason). With this configuration, my projects now include all files automatically:


I just have to be very careful that when I’m adding files, I only do this in the core PCL project, where files ARE added explicitly. There seems to be strange behavior where, if I add a file manually to a project with wildcard includes, all of the files get explicitly added. Not what I’d like.

Ultimately, this greatly simplified the deployment story as well. Each dependency only includes the one, single assembly:


At the end of the day, this deployment strategy is best for the users. I don’t have to worry about platform-specific extension libraries, GitHub issues about builds breaking in certain environments or the application crashing in cloud platforms.

If I had to do it over again, I’m not unhappy with the middle step I took of platform-specific extension assemblies. It forced me to modularize whereas with pure compiler directives I could have accepted a spaghetti mess of code.

Eventually, I’d like to collapse all into one project, but until it’s supported, this seems to work for everyone involved.

Post Footer automatically generated by Add Post Footer Plugin for wordpress.

Categories: Blogs

CSR, Knowledge, and a Student Conference

Dear Junior
At Omegapoint we have for a long time struggled with the question: How can we contribute towards society? Or, put otherwise, how can we pay back to the society to which we owe so much? To use the business lingo phrase: How do we exercise our Corporate Social Responsibility (CSR)?
There is the obvious non-answer: We already contribute to society through our business. We pay taxes without fussing or hiding. What we do for our customers must be good in a broader sense, otherwise they would not pay for it. So, obviously we contribute.
But our moral standard is somewhat higher than that kind of waterline mark.
Of course there is always the easy way out: Donate money to some beneficiary organisation. And of course we have done so. But it does not feel completely satisfactory. Anyone can donate spare money: should we not rather pay that money out as salaries and let our employees donate at their own discretion? What could we as a company contribute?
We have also done pro bono work: helping NGOs with their websites and support systems, and giving pro bono aid and coaching to local startups. Of course this is better, but we were still not completely satisfied. Lots of other companies could do this; it did not feel unique to us.
Slowly the realisation dawned on us. We define ourselves as a knowledge company with an attitude of openness. The epitome of this is our semi-annual two-day competence conference, affectionately named OPKoKo, where we run presentations, workshops, and discussions - all powered by stunning colleagues and the occasional guest. This is the soul of Omegapoint; this is what we should share.
Sharing with whom became pretty obvious. At Omegapoint almost everyone has studied at a technical university, apart from some remarkable autodidacts. In Sweden, university education has no tuition fee at any university - it is funded through taxes instead. This means that even the best universities have a steady flow of talented young people from all corners of society. Omegapoint as a company is built on this foundation. We should pay back to Swedish academia, and to its education of students in particular.
Once the question was phrased, the answer was simple. This spring we will run our first student conference: Studentkonferens 2015 by Omegapoint Academy.
We have put together a one-day conference with three tracks: programming, security, and testing. We will bring our experts from the field to spend a day presenting, discussing and socialising with the students. 
We have tried to pick subjects that are as relevant to students as possible - not far-away topics like "strategic architectural enterprise business process modelling", but rather stuff they can relate to and hopefully use immediately. We ended up choosing "Craftsmanship" as the theme.
Now, "Craftsmanship" is a term that has been thrown around our community left and right, and I am definitely sceptical to some uses of the word. Still, I think the original message of craftsmanship holds: love the craft, pay attention to what matters, be pragmatic.
Yesterday we announced the conference to the public during the Datatjej2015 conference run by female-CS-student association Datatjej. This is not a coincidence. 
At Omegapoint we always hire on merit. The idea of selecting on non-relevant attributes such as gender or ethnicity is completely foreign to us. However, we also know that women have had (and still have) a harder time in our industry than men. Thus, we want to support initiatives that encourage young women to make a move in our industry. Hence, it makes sense to draw some attention to Datatjej2015 by choosing that event for announcing our own initiative.
Now, this is going to be fun. Also, I have been given the honour of presenting the opening keynote, where I will share my view of Craftsmanship in software development. I am really looking forward to all of it.
PS To give credit where credit is due, I really must praise my colleagues Lotta Hammarström and Daniel Deogun. They were both part of the original idea, and have really worked hard to make it come alive.

Categories: Blogs

The Fun Zone - The Sweet Spot Of Resilient Learning

Agile Thinks and Things - Oana Juncu - Fri, 02/13/2015 - 12:31
LEGO SERIOUS PLAY Training

We all learn all the time, and we enjoy it. Learning is exciting! ... when we succeed. Remember when you first learned to ride a bike? Wasn't it a little bit like a victory when you first succeeded in making the bike move without someone discreetly holding you? The big secret of learning with joy, though, is that we act; we don't play mental games like conceptualisation or arguing. Scientific research has shown that we learn from acting, never from listed concepts. To remember what we learned, the key sweet secret is to have fun during the learning process. Here is what we can do about it.
Let's  Go And Play !
We love challenges. Or don't we? If you think about it a little, there are challenges that we enjoy and others that we fear and want to avoid. What is the difference? You might say some are not important, while others are crucial and may change your life. Think of a challenge from the second category that you faced and that had a good outcome. How did you feel afterwards?
Now think of other "unimportant" challenges that you successfully solved. How did you feel afterwards?
What was the difference in emotions? Relieved vs excited? Powerful vs joyful? Proud vs ... proud?
In the end there is little (or no) difference between the emotional experience of overcoming a high-stakes challenge and that of overcoming a "superficial" little one.
If you were asked to name a type of "low stake challenge" experience, you would probably name "playing games".
A "low stake" set-up is an interesting environment for learning, because we are in a safe zone "by design". As we remember the rewarding feeling of solving a challenge, the best way to create resilient learning is to put ourselves in a "low stake" environment to acquire the desired skills. This environment will be the "learning playground". We are ready to play games to learn.

Good Games For Fun Learning
ATD2014 - Playing for the Next Demo Episode of Our Killing App

Deciding to go play games is not - at least not always - enough! We enjoy playing a game because it is a good game. We learn effectively from a game when it is a good (learning) game.
What is a good game? Here is what game design theory says about good games:

  • Have a goal (an outcome to reach)
  • Have clear rules (if not, players won't know how to reach the outcome)
  • Give participants a sense of control of the game
  • Give a sense of progress in the game.

The Fun Zone Of Learning

Comfort Zone For Learning - From the LSP Certification Training

The sweet spot of fun learning is the right balance between the level of difficulty of the challenge and the skills you have. Did you ever enjoy a game that was way too complex for you from the beginning? Would you play a video game at level 7 when you have never played it before?
But what about starting at level 1 and playing only level 1 over time? Would you have the same "fun level" over time? When we enhance our skills, we expect higher challenges, because challenges that are just above our skill level are the ones that give us the best quality of entertainment.
We have a sense of progress. We learned. We played together. And playing IS collaboration.

The graph of the "Learning Fun Zone" was presented by the Lego Serious Play Trainers during my Facilitation Certification Training. Credits to Jean Semo and Marie-Christine Dupont for it. 

Linked Materials:

Jane McGonigal: Reality is Broken
Categories: Blogs

Can You Mandate Your Agile Transformation?

Leading Agile - Mike Cottmeyer - Fri, 02/13/2015 - 09:00

Well… it depends.

If you view agile as a system of beliefs, or a way of looking at the world, or as a culture your company is expected to adopt…I’d suggest that it’s impossible to mandate an agile transformation. There is no way to force people to believe in something they don’t believe in or to feel something they don’t feel.

If you view agile as a set of practices, or as a way of performing your day-to-day activities, or as a set of ceremonies and artifacts and roles that people are required to perform… I’d suggest that, while probably not impossible to mandate, at best you’ll get malicious compliance if you try.

If you view agile as a system of delivery predicated upon the notion of small cross-functional teams, and you mandate those teams have everything necessary to deliver a working, tested increment of the product… and you mandate the organization gives those teams extreme clarity around what you are asking them to build… and you mandate those teams deliver an increment of the product for inspection every couple of weeks, just so we can make sure they are on the right track, give them feedback, and validate they are making measurable progress against our business goals…

I’d suggest that it’s irresponsible NOT to mandate your agile transformation.

Once you mandate the right kind of agile transformation, now we can explore the wide palette of tools and techniques and practices that make that kind of system work, and we can invite the team to choose the tools and techniques and practices that work best for them in their particular context.

Once you mandate the right kind of agile transformation, and the team has everything they need to be successful, autonomy to make local decisions, and the safety to decide how to do the work and how much work can be done, you can then invite them to change their mind about what they believe.

Mandating an agile transformation and inviting people to participate are not mutually exclusive. We just have to be clear on what’s negotiable and what isn’t.

The post Can You Mandate Your Agile Transformation? appeared first on LeadingAgile.

Categories: Blogs

An open letter about unit tests

George Dinwiddie’s blog - Fri, 02/13/2015 - 04:33

An open letter to a programmer who thinks that code coverage by integration tests eliminates the need for unit tests.

Why do we want tests? Typically, we want functional or acceptance tests to demonstrate that the system, as a whole, functions the way we (or “the business”) want it to function. We want integration tests to demonstrate that the subsystems talk to each other properly, as often errors creep in with differing expectations at these major system boundaries. And we want unit or micro tests to demonstrate that the code works the way the programmer thinks it should work. This is the business view of tests.

From a personal point of view, as a programmer I want tests to make my life easier as a programmer. Tests, particularly unit tests, give me quick feedback when my changes to the code have violated previous expectations of the code. They let me know when I’ve accomplished the functionality that led me to make changes. They let me reorganize the code without worrying about making a mistake, because the tests will immediately alert me if I do. They let me move quickly, because the tests can analyze the holes in my design much more quickly than I can. And they save me from lots of embarrassment that comes from delivering buggy code.

You might think that we only need one level of testing. As long as a line of code is covered by a test, why cover it again? In a perfect world, shouldn’t this be sufficient?

In a perfect world, we wouldn’t need to write tests at all. We’re not in a perfect world, and neither are our tests. Our functional tests can look at the whole system, but has a hard time exercising the edge conditions deep in the code. The fact that a line of code has been exercised by a test does not tell us that the line works properly under various conditions. Our unit tests can easily check the boundaries of a small piece of code, but can’t assure us that the pieces are connected together properly. We need multiple levels of testing to get the assurance we need.

In addition, tests have other attributes. Unit tests run much faster (much less than a second) than integration or functional tests. This lets us run them much more frequently (several times a minute) and quickly get feedback on the effect of our latest code change. They can also diagnose the errors much more precisely than larger scale tests. They can codify our assumptions about the way a piece of code works, so if someone else makes a change that breaks those assumptions it becomes immediately obvious. Done well, unit tests give us confidence to move fast and stay out of the debugger.

If you’re not getting these benefits from unit testing, call me and I’ll work with you on them. Life is too short to struggle with unit tests and not get the value they can offer.

Categories: Blogs