Let's Not Run Towards the Creepy Line Before We Can Walk

Just Give Me a Sign

There is a very funny scene in an old Steve Martin movie, The Man with Two Brains, where our hero, Dr Michael Hfuhruhurr, looks at a painting of his beloved, dead wife, Rebecca, and asks if she is happy with his feelings for the new love in his life, Dolores. An ethereal voice whispers "noooooo", the painting begins to spin round, candlesticks burst into flames and an ungodly wind blows through the room. All the time the voice grows louder and then, when it all stops suddenly, a dishevelled Dr Hfuhruhurr says, "Just give me a sign … any sign … I'll be looking out for it."

Predictably

I saw a recent presentation from, err, let's say one of the big three global technology corporations, which reminded me of this scene. The presenter talked us through a scenario where a young man is presented with an offer to use some loyalty points. In the world of #bigdata, the presenter went on, we will know that he has been dating for a little over eighteen months, so the offer will be personalised towards romantic destinations.

Using recent purchase history, presumably including an engagement ring, the intent of the trip is determined and the couple's experience is further customised. A taxi, rather than a hire car, to their intimate dinner for two, at a restaurant that provides just the right setting for them rather than, say, a young family or a solo business traveller. No piece of data is left unturned, no algorithm unapplied, in creating the perfect weekend for our fictitious couple.

I am not sure I am ready to be second-guessed about major life decisions by businesses that have yet to work out that googlemail.com and gmail.com are the same email suffix.

Creepy Line

"I know," the presenter conceded at one point, "some of you might be concerned that we are crossing 'the creepy line' here." And the room relaxed a little and listened intently to a world where algorithms applied to increasingly personal data ensured that each need was carefully met before the couple even realised for themselves that they needed it.

The story concluded with a marriage proposal and an assertion that all of this is possible in a world of big data, machine learning and predictive analytics.

Second Guessing

Now, I welcome a world where cars are recalled before we experience a breakdown, where risk is assessed and mitigated, and where fraudulent use of my credit card is spotted before any real damage is done to my own or my provider's finances. All good.

However, when it comes to predicting what we will want next, to second-guessing us, I am not so sure. And here's why.

Based on their personalised marketing, this is what I think they actually know about me. They know my email address and that I buy men's clothes. And, err, that's it.

Personalise This

I am a man, no longer in the first blush of youth. In spite of that, I maintain a distant interest in fashion. I care, actually very much, about the clothes I wear, even if those around me might be surprised by that. I apply two universal rules: I don't ever want to buy clothes that my Dad would wear and, even more importantly, that my Son would wear. I tend towards blues but don't want it to be the only colour in my wardrobe, and whilst the trend is towards slim fit trousers (pants for my US pals) I have to check the fit carefully because I have (let's say, strong) calves. I very rarely wear knits, like jolly (but certainly not 'humorous') socks, prefer smart over casual, never go double-breasted and almost always wear a collar, particularly for dinner. As I say, older dude. I shop on-line, actually pretty frequently, but I have to know the store well before I do because I have little time for the rigmarole that comes with returning parcels using a service that seems largely geared to those who live in 1975.

And Here is What You Really Know

I buy from what I believe to be pretty innovative retailers but, based on their personalised marketing, this is what I think they actually know about me. They know my email address and that I buy men's clothes.

They don't even have a firm grasp on the email thing, to be candid. I am not sure I am ready to be second-guessed about major life decisions by businesses that have yet to work out that googlemail.com and gmail.com are the same email suffix, or that offering me an item at a discounted price makes no sense if they don't have it left in my size.
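It really is a small thing to get right. Here is a minimal sketch of the kind of normalisation I mean, in Python; the alias table is my own illustration, not any retailer's actual rule set:

```python
# Map known domain aliases onto a canonical form before matching customer records.
DOMAIN_ALIASES = {
    "googlemail.com": "gmail.com",  # same mailbox, different suffix
}

def canonical_email(address: str) -> str:
    """Lower-case the address and collapse known domain aliases."""
    local, _, domain = address.lower().partition("@")
    return f"{local}@{DOMAIN_ALIASES.get(domain, domain)}"

# Both forms resolve to the same customer.
assert canonical_email("Me@googlemail.com") == canonical_email("me@gmail.com")
```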

Listening

I occasionally send an email or respond to a survey to grumble about an unhelpful website or to call out a particularly helpful assistant who didn't assume I was looking for everything in beige. However, according to Gartner, whilst 95% of businesses collect feedback, only around 35% use the insight that they have collected. A tiny 10% improve their business with this information and only half of that small group circle back to the customer and tell them that they did so.

Our customers talk to each other on social platforms and they comment on their experience of our businesses in their own voice. Many also talk to us directly. They leave reviews and some, though the rates are decreasing, respond to our surveys. More would answer our questions too, if they were asked thoughtfully and if we demonstrated that we did something with their feedback from time to time.

Surveillance and Second Guessing is Creepy. Listening is Human

Privacy is an important issue, and I don't mean to diminish it, but let's forget where the creepy line is for now. Instead, why don't we, as businesses, try something fundamentally human? Why don't we try listening?

So here is my message to the CMOs of those very large and resourceful businesses that currently have my tentative loyalty. Rather than looking for patterns, trends and algorithmic intent in data you have no business being in, you can read this blog, my tweets, my comments or even ask me what I like or what (to quote Spencer Trilby, played by Charlton Heston in True Lies) blows my skirt up. And on that subject, I am not going to wade through 50 questions designed for your departmental silos. Let me tell you in my own words and in my own time.

Or you can continue to wait for a sign and risk me moving on to someone who listened.


Big Data: Indescribably Large

Lost for words

I have been the first to be overly critical of those that define big data solely by size and (absence of) structure. That being said, it is inescapable that data volumes have reached an inflection point. In an article for the Wall Street Journal, Andrew McAfee makes a pretty startling observation. Data has gone from being measured in terabytes to petabytes and exabytes. He explains that in 2012 Cisco announced that its equipment was recording a zettabyte of data. Not startling so far and, in any case, outside of the circle of data geeks, few will have heard of a zettabyte. The more jarring fact is that the next metric for measuring data is the final one. After the zettabyte comes the yottabyte (10 to the power of 24, since you asked) and then that's it. We have literally run out of words to describe how big big data is.
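For reference, here is the ladder we have just run out of; a quick sketch of the decimal prefixes, with yotta the largest defined at the time of writing:

```python
# Decimal (SI) prefixes for bytes; each step is a factor of 1,000.
PREFIXES = [
    ("kilobyte", 10**3),
    ("megabyte", 10**6),
    ("gigabyte", 10**9),
    ("terabyte", 10**12),
    ("petabyte", 10**15),
    ("exabyte", 10**18),
    ("zettabyte", 10**21),
    ("yottabyte", 10**24),  # the last rung on the ladder
]

for name, size in PREFIXES:
    print(f"1 {name} = 10^{len(str(size)) - 1} bytes")
```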

Big v Different

Commentators such as Jeff Jonas and Kenneth Cukier make the point that big is not just big. Big can be different. David Weinberger, one of the authors of the Cluetrain Manifesto, makes a similar point in his book Too Big to Know. He proposes that knowledge has been shaped, perhaps even limited, by its medium. Until the invention of the printing press, only the most important, meticulously researched facts were committed to paper. Even then, the printed medium carried figurative and literal weight.

In describing Big Data in Decision Sourcing, we contrast transactional data with ambient data. Transactional data was limited by traditional data processing, originally in the form of the punch card and latterly the relational database. Ambient data, however, exists all around us. Its size meant that it went unobserved or at least uncaptured. This is what has changed. Affordable and available technology means that signals generated through the internet of things and human social interaction can be captured in digital form, providing new (and different) sources of insight. The relational database limited us to recording invoice lines and account details, whilst new forms of data management allow us to capture every human gesture, comment and click. Meanwhile the machines are logging everything they do.

What's next?

Metric prefixes were last updated in 1991 at the 19th General Conference on Weights and Measures and beyond yotta, we got nothin’. Big Data means disruptive, transformational change in a way that we don’t completely understand today. In fact we don’t even have a name for what comes next. Yet.

Big Data Week: New and Different

Big Data Week

I was able to take part, albeit fleetingly, this week in Big Data Week, a series of events run in 25 cities across the globe. There was a series of evening meetups, hackathons and panels throughout the week in London, with the key event running out of Imperial College on Thursday. Edd Dumbill of O'Reilly Strata chaired, and speakers included the excellent Kenneth Cukier (co-author of Big Data) and Nick Halstead (founder of UK start-up Datasift).

First off, my congratulations go to Stuart Townsend and Andrew Gregson, who pulled together a program that was excellent, free of hype and as grounded as the title 'Putting Data to Work' suggests. Superb.

Because of pressing commitments with the day job, I dipped in and out of what I thought would be the ‘best’ bits and was mostly right with only one session that really missed the mark for me.

Takeaways

My takeaways:

– Big Data projects are still largely about click stream, native internet businesses and one-off mash-up projects (more often public sector and NGO than corporate)

– Big Data has its origins in the web (thank you, Google) and this is where most of the corporate activity remains. 'Mainstream' (whatever that means in the networked age) corporate use is a way off

– We are still largely defining Big Data in technical terms (a good 40% of the group in one session described themselves as 'technical' when polled)

– It is still very early days. Innovation, interest and investment are still high and growing

New and Different Data

The highlight for me was Cukier, who has come closest to providing a satisfactory definition of Big Data. As you may appreciate from previous posts, technical definitions based on volume are, strictly speaking, spot on but leave me a little cold. I attempt to get closer to something more vital (shameless plug alert) in Decision Sourcing (Roberts and Pakkiri, Gower 2013) by describing it as ambient. By describing it as such I am asserting that it is data that has always existed around us (temperature readings, product mentions, consumer comments, buyer behaviour) but has only recently been captured and made available to us as data.

Let me explain. If I abandon my basket in Sainsbury's because I couldn't find the one thing I came in for, it can be seen by the store manager when they close, but not understood. Not so for Amazon. The thermostat in my home is the very definition of 'there or thereabouts', but a Nest captures, stores and learns from accurate readings. When I share how great brunch is at Balthazar around the office, someone may make a mental note. When I do it on Facebook, it's a piece of data to go with the other 1.5 billion that day.

The point is not that big data is big, though it is. It is that it wasn't available to us before, either because it wasn't being captured (social mentions) or because its volume and variety (web clicks, smart meter readings) made it impossible to store and analyse and therefore understand.

As Cukier puts it: sometimes more is not just more. Sometimes it is so much more that it is different. Big Data Week 2013 seemed to be a great success to me. I look forward to New and Different Data Week 2014.

Facebook Graph Search: The Power of the Nodes and Edges

According to Mashable, Facebook Graph Search could be its greatest innovation. I tend to agree. FB have eight years of Big Data (including around two billion new Likes each day) to help us identify products, services and brands that we might need through the experiences of those in our social network.

Actually, a graph consists of only two things: nodes (people) and edges (their relationships). Analysis of these, though, can reveal much. The simplest is a measurement of neighbours: the number of edges and their direction. A node with a large number of inward edges (or indegree) can be thought of as popular. One with a large number of outward edges (or outdegree) as gregarious. If it were possible, Lady Gaga could make a whole boutique full of dresses out of her indegree. Simple analysis of these elements is behind the 'People You May Know' features in LinkedIn, Chatter, Connections, Jive and Facebook, to mention just a few.
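To see how little machinery that takes, here is a minimal sketch of a toy directed graph with indegree, outdegree and a naive 'people you may know' suggestion; it is purely illustrative and obviously not how Facebook or LinkedIn implement it:

```python
from collections import defaultdict

# Toy directed social graph: an edge u -> v means "u follows v".
edges = [
    ("alice", "gaga"), ("bob", "gaga"), ("carol", "gaga"),
    ("gaga", "alice"), ("alice", "bob"), ("bob", "carol"),
]

following = defaultdict(set)  # outward edges per node
followers = defaultdict(set)  # inward edges per node
for u, v in edges:
    following[u].add(v)
    followers[v].add(u)

indegree = {n: len(f) for n, f in followers.items()}    # popularity
outdegree = {n: len(f) for n, f in following.items()}   # gregariousness

def people_you_may_know(node):
    """Suggest nodes two hops away that the node doesn't already follow."""
    candidates = set()
    for friend in following[node]:
        candidates |= following[friend]
    return candidates - following[node] - {node}

print(indegree["gaga"])              # 3 inward edges: popular
print(people_you_may_know("alice"))  # {'carol'}
```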

New Nodes

Of course, the FB Graph includes other types of nodes (businesses, brands, products) and many other types of edge, including the ubiquitous Like. FB also have demographics and psychographics, because we surrender more information about ourselves to FB than we would feel comfortable doing in any other survey, online or offline. We're all concerned about privacy but generally end up somewhere around 'what are you gonna do?'.

These simple elements add up to something very powerful. It's possible not just to find French restaurants in Frimley but those that are preferred by frequent travellers to the Côte. Not just DIY stores nearby but those popular with power tool enthusiasts. Robert Putnam could have found countless examples for his book on the decline of social capital, Bowling Alone. And it is just the beginning. Let's not forget that those edges include 'listened', 'read', 'watched', 'hiked' and 'cooked', to name but a few of the verbs now residing in your Facebook apps list and your personal social graph.
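Here is a minimal sketch of the kind of query this makes possible; the people, places and 'likes' below are invented for illustration, and the real Graph Search is of course far richer:

```python
# Toy graph search: person nodes, place nodes and 'likes' edges between them.
people = {
    "dan":  {"town": "Frimley",   "interests": {"power tools"}},
    "erin": {"town": "Camberley", "interests": {"travel"}},
}
places = {
    "Chez Nous": {"type": "restaurant", "town": "Frimley"},
    "ToolTown":  {"type": "DIY store",  "town": "Frimley"},
}
likes = [("dan", "ToolTown"), ("dan", "Chez Nous"), ("erin", "Chez Nous")]

def places_liked_by(person_filter):
    """Places with at least one 'like' edge from a person matching the filter."""
    return {place for person, place in likes if person_filter(people[person])}

# "DIY stores in Frimley popular with power tool enthusiasts"
hits = {
    p for p in places_liked_by(lambda person: "power tools" in person["interests"])
    if places[p]["type"] == "DIY store" and places[p]["town"] == "Frimley"
}
print(hits)  # {'ToolTown'}
```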

Big Data Breakthrough

This is a Big Data breakthrough for Facebook and puts some distance between them and their competitor, Google. I am not sure that plus'ing is enough of an 'edge' at this stage. And for those that can't see that FB and Google are competitors, remember that there is no revenue in Search. No one actually pays Google to organise the world's information. Nor do we part with our cash for maintaining personal networks on Facebook. There is, however, a group of people willing to pay for connecting people to products they might enjoy: advertisers. In other words, there is revenue in creating new edges between nodes. That's the power of the graph.

Gamification and Gamified Business Intelligence

Gamified

I have become somewhat preoccupied with gamification of late. After the usual reading and research, concluding with some structured study with the Wharton School through the excellent Coursera program, it became apparent that it was less of a diversion than I first thought. Indeed, there is considerable overlap between the aims of gamification and the aims of Business Intelligence.

To understand why, let's start with the definition of gamification from Professor Kevin Werbach, the course lecturer and also the author of 'For the Win', which is:

"The use of game elements and game design techniques in non-game contexts".

It's an excellent, insightful and crisp definition. However, it really only explains the 'what' and not the 'why'. For this, I would refer you to Brian Blau and Brian Burke of Gartner, who extend the definition as:

“The use of game mechanics to drive engagement in non-game business scenarios and to change behaviors in a target audience to achieve business outcomes”

Level 1

Both definitions are about using game elements in a non-game context, but Werbach is being more inclusive whilst Gartner is very specific. For Gartner it's about business, whilst Wharton includes external gamification and gamification for behavioural and social change. The former is gamification as a marketing device, such as Foursquare. The latter is a rich and interesting area that would include Runkeeper and Zamzee, encouraging us to become a little fitter, and OPower which, by comparing our energy usage to our peer group, helps us be more aware of our consumption.

The third Wharton category, internal gamification, has the greatest overlap with Analytics, Business Intelligence and Performance Management. A definition of which can be derived from some minor modifications to the Gartner definition of gamification:

"The use of analytics, business planning and key performance indicators to drive engagement and to change behaviors in a target audience to achieve business outcomes"

Analytic applications are systems, sets of mechanics, to align, engage and improve the performance of the business. They, like a gamified system, are an abstraction. They are a derivation of business activities, not the activities themselves. The numbers, charts and indicators become a new reality, distinct from the business activity from which they are derived. They are, in a sense, gamified systems, but with only a small subset of the rich set of (game) mechanics that might be made available. In fact, I have argued for some time that this subset of mechanics is as woefully inadequate as the user experience and user interface design effort in most corporate analytic applications. We still think that a dashboard is a pretty cool interface.
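To make the overlap concrete, here is a minimal sketch of a KPI wrapped in two of the simplest game mechanics, points and levels; the names, thresholds and numbers are mine and purely illustrative:

```python
from dataclasses import dataclass

@dataclass
class KPI:
    name: str
    target: float
    actual: float

    @property
    def attainment(self) -> float:
        return self.actual / self.target if self.target else 0.0

def points(kpi: KPI) -> int:
    """Points mechanic: 100 points at target, pro rata below, a bonus above."""
    base = round(100 * min(kpi.attainment, 1.0))
    bonus = round(50 * max(kpi.attainment - 1.0, 0.0))
    return base + bonus

def level(on_target_streak: int) -> str:
    """Levels mechanic driven by a streak of consecutive on-target periods."""
    if on_target_streak >= 6:
        return "Gold"
    if on_target_streak >= 3:
        return "Silver"
    return "Bronze"

orders = KPI("Orders shipped on time", target=200, actual=240)
print(points(orders), level(on_target_streak=4))  # 110 Silver
```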

Achievements

Business intelligence can, more often than it should, be driven by whatever data is available. Equally common is to deliver a system that is a marginal improvement on the information system it replaced, but in a new tool or technology. The design pays scant regard to how the information will really be used, and such systems are open to being ignored or even 'gamed'. Measure a sales team on orders and there may be an increase in cancelled orders. Measure baggage handlers on the time it takes the first bag to arrive on a carousel and the second and subsequent bags might wait for the first bag on the next flight.

Internal gamification is designed around a deep understanding of the players (staff, workforce) and their motivations. It draws inspiration from an extensive palette of behavioural (game) mechanics.

Level Up

Business Intelligence, then, could reasonably be defined as an early attempt to gamify the workplace. Sophisticated BI, intended to engage the workforce and align organisational behaviours through carefully designed elements of which analytics and key performance indicators were just a small subset, would be a game that many businesses would find worth playing.

Just Stop with the ‘Big Data is Just’

OK, I get it. You’re sceptical. You’ve seen stuff come and you’ve seen it go. To you big data is just BI, just data, just analytics for the hip kids, just a distraction or just hype and fad.

Except it isn't. Big data is only 'just' analytics in the same way that cloud is 'just' ASP or bureau computing. That is to say, it isn't at all.

It ain't Hadoop either

Others define it in terms of the technology. I get this too. New tech is making it all possible and existing databases have been a barrier. New approaches like Hadoop were borrowed from those that pioneered extracting value from enormous volumes of data. To the traditional data vendors, a terabyte was a big deal. They failed to notice that this was becoming standard in a home PC and that insurgent innovators were capturing, processing and mining mountains of data. They didn't keep up, so others took their lunch money and now they are playing catch-up.

But it would be wrong to define big data in terms of the innovation that allows it to happen. A little like defining fine dining as an activity conducted with knives, forks and a high-quality napkin. It would be the most common mistake of the Big Data muggle.

The end of transaction-oriented business

So if it’s not just ‘just’ and it’s not the technology … what is it?

It's nothing less than a profound change in our approach to data. Historically, businesses managed themselves as a series of transactions. Occasional snapshots, if you will. Only the essential financial and operational interactions between them and their customers: a quotation, an order, a despatch note and, most importantly, an invoice. Early on-line commerce began to change this. Every gesture a customer made on their shopping journey could be captured. An abandoned basket in a supermarket tells the store manager nothing. Online, the same shopping cart could tell us that the delivery times are too long, the accessories were out of stock or that the secure shopping statement was in the wrong place. For the first time, so much data was being generated that 'traditional' analytics started to creak and groan, and most of this type of analysis took place outside of corporate BI. It was 'special' click stream, needed specialised tools, and the BI specialists and vendors shook their heads at its lack of structure. Where were the columns, rows and indexes?
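To make the contrast concrete, here is a minimal sketch of the difference between the transaction-level record and the gesture-level click stream; the event names and fields are invented for illustration:

```python
# The transactional view: one row, after the fact.
invoice = {"order_id": 1001, "customer": "c42", "total": 0.0, "status": "no sale"}

# The ambient / click stream view: every gesture on the journey, as it happens.
events = [
    {"customer": "c42", "event": "search",         "detail": "slim fit trousers"},
    {"customer": "c42", "event": "view_item",      "detail": "sku-123"},
    {"customer": "c42", "event": "add_to_basket",  "detail": "sku-123"},
    {"customer": "c42", "event": "view_delivery",  "detail": "5-7 working days"},
    {"customer": "c42", "event": "abandon_basket", "detail": "at delivery step"},
]

# The transaction says only that nothing was bought; the events suggest why.
abandoned = next(e for e in reversed(events) if e["event"] == "abandon_basket")
print("Basket abandoned", abandoned["detail"])  # Basket abandoned at delivery step
```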

This was just the beginning. Social platforms don't just allow the analysis of shopping behaviours but all behaviours. If a customer comments, complains, compliments or converses in general about you or your brand, it is possible to know. It's no longer hearsay or anecdote, it's available from the blogosphere or the Twitter firehose. It's data.

Another beginning

Actually, that was just the beginning of the beginning. New classes of devices that can generate more data than the most active surfer or shopper are boosting the on-line population. Forget smart meters and the internet fridge, at least for now. Think more about ultra-low-cost devices that remind you to water the yucca, feed the guppy or take your medication. If you forget any of these, particularly the medication, they will probably tell others too. Connected asthma inhalers can provide insight into air quality, and cars can connect with your insurer, who adjusts your premium because your acceleration and braking patterns suggest that you are driving like you are on a track day rather than on the Hanger Lane gyratory. Oh, and my new Pebble watch (when it arrives) will add to the billions of facts, snippets and streams being added to that one big database in the sky. The cloud.

Ambient Data and why Big Data is Big

Big data represents a profound change. In our book Decision Sourcing (Gower, 2013), we refer to it as 'ambient' rather than big data. Ambient because we have always been surrounded by our thoughts, gestures, actions and conversations, but they have never been data before. They were lost (as Rutger Hauer said) 'like tears in rain'.

Today, we are approaching an age where it is possible and practical to know everything that there is to know. Everything that is (to use an arcane legal expression) 'uttered and muttered'. That's what makes it big. Really big. Teradata think a tera is big, but it's just a walk to the shops compared to Big Data.

Oh no, it is not 'just' anything. It is the beginning of the most significant shift in our industry since it began. The complexities are many, the data as varied as it is voluminous, but the prize is knowledge and insight, much of it predictive. Indeed, everything we have done to this point has been in preparation for the age of Big Data.

If Big Data is just anything right now … it’s just the beginning.

Information Curation 1 dot 2

At the end of a jetty on a beach near Benesse House, the big, fat and very cool Kabocha

Dot to Dot: In the Previous Post

In part one we examined how the curatorial process is relevant to the way in which businesses make informed decisions. We examined how Frances Morris, curator of the Kusama exhibition at the Tate Modern in 2012, dealt with abundance, the most pressing issue for those of us handling exponentially increasing data volumes today. We also saw that curation has parallels with analysis: a process that starts with very few assumptions, perhaps an inkling that there is a story to tell, but then becomes more focused as evidence is sifted, examined and understood.

In this, the second part, we look at filtering, relevance and how the curatorial process helps us understand which comes first … data or information.

Relevance not Completeness

As I listened to Morris at the Tate, it was clear that the story she wanted to tell was as much a product of the things she left out as it was the things she included. Morris described how she visited a site on the Japanese island of Naoshima to see an example of Kusama's famous pumpkins. Perched at the end of a pier jutting into the Inland Sea, the pumpkin, she decided, would lose something of the truth if taken out of context. This led to perhaps her most controversial decision amongst Kusama's many fans: not to include one of Kusama's recurring themes in the summer exhibition. The pumpkins, like the most frequently used data, were popular. They were well known and well understood. However, they didn't bring anything new. At the end of the pier, they were relevant and contextual. In an exhibition intended to deliver insight into Kusama's 'eras', the key points at which the artist had reinvented herself, they added nothing new.

Story First

One of the most telling characteristics of Morris's curatorial process was that the story she wanted to tell was not limited by the art. Kusama was a leader in the 1960s New York avant-garde movement. She was outlandish and outspoken, sometimes shocking. Not all of this is obvious from her art, but it was an important thread in Morris's story. To remedy this she chose to exhibit documents and papers that gave Kusama a voice. Clippings, letters and personal artefacts enriched the story. The result was a much more complete picture of an artist whose influence on culture and society had as much to do with her activism, performance art and outrageous 'happenings' as her art.

Sometimes, as analysts, we limit our story by what is in the database or data warehouse. Smart decisions should be informed, but that doesn't mean to the exclusion of other forms of knowledge. That which is anecdotal and tacit, alongside the 'facts', might provide a more complete and accurate picture. Information exists outside of columns and rows.

Joining the Dots

Does the curatorial process deliver insight? Does it ultimately leave its visitors with the 'facts', insofar as we can know them, as they relate to life and art? The test would be Kusama's reaction to Morris's exhibition when she visited for a private viewing before it was opened to the public. It seems the answer is an overwhelming yes. At one point, as Morris walked Kusama around the exhibition, she wept. The collection, which spanned nine decades of an extraordinary life, had struck a deep and personal chord. This visceral reaction was an acknowledgement that it was an essential truth from perhaps the only one who knew, in this case, what the truth really was.

Knowledge does not leap off a computer screen or printed page any more than the life of an artist leaps off a gallery wall. It is a synthesis of data and information. To deliver a report, chart or scorecard is not to deliver knowledge. The job is only part done. The information needs to be socialised, discussed, debated and supplemented with what we know of our customers and products. Neither is the process just 'analysis'. It is one of selecting that which is relevant, excluding that which is not and enriching with the experiences and opinions of those in the business whose expertise is not captured in rows and columns. In a world where we are overwhelmed with information, knowledge and understanding require curation.

The nine decades of Yayoi Kusama at the Tate. 

Frances Morris discusses and explores Yayoi Kusama’s life and work. Taking the audience through her curatorial processes, Morris will map out the exhibition from its origins to completion. The curator will also reflect on her personal journey with Kusama, having had the opportunity to work closely with her over the last three years.