Let’s Not Run Towards the Creepy Line Before We Can Walk

Just Give Me a Sign

There is a very funny scene in an old Steve Martin movie, The Man with Two Brains, where our hero, Dr Michael Hfuhruhurr, looks at a painting of his beloved, dead wife, Rebecca, and asks if she is happy with his feelings for the new love in his life, Dolores. An ethereal voice whispers ‘noooooo’, the painting begins to spin round, candlesticks burst into flames and an ungodly wind blows through the room. All the time the voice grows louder and then, when it all stops suddenly, a dishevelled Dr Hfuhruhurr says, ‘Just give me a sign … any sign … I’ll be looking out for it.’

Predictably

I saw a recent presentation from, err, let’s say one of the big three global technology corporations, which reminded me of this scene. The presenter talked us through a scenario where a young man is presented with an offer to use some loyalty points. In the world of #bigdata, the presenter went on, we will know that he has been dating for a little over eighteen months, so the offer will be personalised towards romantic destinations.

Using recent purchase history, presumably an engagement ring, the intent of the trip is determined and the couple’s experience is further customised. A taxi, rather than a hire car, to their intimate dinner for two, and at a restaurant that provides just the right setting for them rather than, say, a young family or a solo business traveller. No piece of data or algorithm is left unturned to create the perfect weekend for our fictitious couple.

Creepy Line

I know, the presenter conceded at one point, some of you might be concerned that we are crossing ‘the creepy line’ here. And the room relaxed a little and listened intently to a world where algorithms applied to increasingly personal data ensured that each need was carefully met before the couple even realised for themselves that they needed it.

The story concluded with a marriage proposal and an assertion that all of this is possible in a world of big data, machine learning and predictive analytics.

Second Guessing

Now, I welcome a world where cars are recalled before we experience a breakdown, where risk is assessed and mitigated and where fraudulent use of my credit card is spotted before any real damage is done to my own or my provider’s finances. All good.

However, when it comes to predicting what we will want next, to second guessing us, I am not so sure. And here’s why.

Personalise This

I am a man, no longer in the blush of youth. In spite of that, I maintain a distant interest in fashion. I care, actually very much, about the clothes I wear, even if those around me might be surprised by that. I apply two universal rules. I don’t ever want to buy clothes that my dad would wear or, even more importantly, that my son would wear. I tend towards blues but don’t want it to be the only colour in my wardrobe, and whilst the trend is towards slim fit trousers (pants for my US pals) I have to check the fit carefully because I have (let’s say strong) calves. I very rarely wear knits, like jolly (but certainly not ‘humorous’) socks, prefer smart over casual, never go double-breasted and I almost always wear a collar, particularly for dinner. As I say, older dude. I shop on-line, actually pretty frequently, but I have to know the store well before I do because I have little time for the rigmarole that comes with returning parcels using a service that seems largely geared to those that live in 1975.

And Here is What You Really Know

I buy from what I believe to be pretty innovative retailers but, based on their personalised marketing, this is what I think they actually know about me. They know my email address and that I buy men’s clothes. And err, that’s it.

They don’t even have a firm grasp on the email thing, to be candid. I am not sure I am ready to be second guessed about major life decisions by businesses that have yet to work out that googlemail.com and gmail.com are the same email suffix and that offering me an item at a discounted price makes no sense if they don’t have it left in my size.
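For the data-minded, here is roughly what getting the email thing right might look like. It is only a minimal sketch: the alias table, the function names and the decision to lower-case the whole address are my own assumptions for illustration, not anything a particular retailer or mail provider prescribes.

```python
# Hypothetical alias table: treat googlemail.com as another name for gmail.com.
DOMAIN_ALIASES = {"googlemail.com": "gmail.com"}


def normalise_email(address: str) -> str:
    """Return a lower-cased address with known alias domains folded together."""
    local, separator, domain = address.strip().rpartition("@")
    if not separator or not local or not domain:
        raise ValueError(f"not an email address: {address!r}")
    domain = domain.lower()
    # Assumption: lower-casing the local part is acceptable for matching customers.
    return f"{local.lower()}@{DOMAIN_ALIASES.get(domain, domain)}"


def same_customer(a: str, b: str) -> bool:
    """Two addresses count as one customer if they normalise to the same string."""
    return normalise_email(a) == normalise_email(b)
```

With something like this in place, `same_customer("jo@googlemail.com", "Jo@gmail.com")` would be true and I would stop appearing as two people in the same mailing list.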

Listening

I occasionally send an email or respond to a survey to grumble about an unhelpful web site or to call out a particularly helpful assistant that didn’t assume I was looking for everything in beige. However, according to Gartner, whilst 95% of businesses collect feedback, only around 35% use the insight that they have collected. A tiny 10% improve their business with this information and only half of that small group circle back to the customer and tell them that they did so.

Our customers talk to each other on social platforms and they comment on their experience of our businesses in their own voice. Many also talk to us directly. They leave reviews and some, though the rates are decreasing, respond to our surveys. More would answer our questions too, if they were asked thoughtfully and if we demonstrated that we did something with their feedback from time to time.

Surveillance and Second Guessing is Creepy. Listening is Human

Privacy is an important issue, I don’t mean to diminish it, but let’s forget where the creepy line is for now. Instead, why don’t we try something fundamentally human as businesses? Why don’t we try listening?

So, my message to the CMOs of those very large and resourceful businesses that currently have my tentative loyalty: rather than looking for patterns, trends and algorithmic intent in data where you have no place being, you can read this blog, my tweets, my comments or even ask me what I like or what (to quote Spencer Trilby, played by Charlton Heston in True Lies) blows my skirt up. And on that subject, I am not going to wade through 50 questions designed for your departmental silos. Let me tell you in my own words and in my own time.

Or you can continue to wait for a sign and risk me moving on to someone that listened.

Big Data Indescribably Large

Lost for words

I have been the first to be overly critical of those that define big data solely by size and (absence of) structure. That being said, it is inescapable that data volumes have reached an inflection point. In an article for the Wall Street Journal, Andrew McAfee makes a pretty startling observation. Data has gone from being measured in terabytes to petabytes and exabytes. He explains that in 2012 Cisco announced that its equipment was recording a zettabyte of data. Not startling so far and, in any case, outside of the circle of data geeks, few will have heard of a zettabyte. The more jarring fact is that the next metric for measuring data is the final one. After the zettabyte is a yottabyte (10 to the power of 24, as you asked) and then that’s it. We have literally run out of words to describe how big, big data is.
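To put the running-out-of-words problem in concrete terms, here is a small sketch that names a byte count using only the prefixes mentioned in this post. The function name and the output format are my own invention for illustration; the exponents are the standard decimal ones.

```python
# Decimal prefixes named in this post, smallest first: (name, power of ten).
PREFIXES = [("tera", 12), ("peta", 15), ("exa", 18), ("zetta", 21), ("yotta", 24)]


def describe(n_bytes: int) -> str:
    """Express a byte count using the largest listed prefix that fits."""
    best = None
    for name, exponent in PREFIXES:
        if n_bytes >= 10 ** exponent:
            best = (name, exponent)
    if best is None:
        return f"{n_bytes} byte(s)"
    name, exponent = best
    return f"{n_bytes / 10 ** exponent:g} {name}byte(s)"
```

Anything a thousand times larger than a yottabyte can only be described here as `1000 yottabyte(s)`, because the list has no bigger word to reach for.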

Big v Different

Commentators such as Jeff Jonas and Kenneth Cukier make the point that big is not just big. Big can be different. David Weinberger, one of the authors of the Cluetrain Manifesto, makes a similar point in his book Too Big to Know. He proposes that knowledge has been shaped, perhaps even limited, by its medium. Only the most important, meticulously researched facts were committed to paper until the invention of the printing press. Even then, the printed medium carried figurative and literal weight.

In describing Big Data in Decision Sourcing, we contrast transactional data with ambient data. Transactional data was limited by traditional data processing, originally in the form of the punch card and latterly the relational database. Ambient data, however, exists all around us. Its size meant that it went unobserved or at least uncaptured. This is what has changed. Affordable and available technology means that signals generated through the internet of things and human social interaction can be captured in digital form, providing new (and different) sources of insight. The relational database limited us to recording invoice lines and account details whilst new forms of data management allow us to capture every human gesture, comment and click. Meanwhile, the machines are logging everything they do.

What’s next?

Metric prefixes were last updated in 1991 at the 19th General Conference on Weights and Measures and beyond yotta, we got nothin’. Big Data means disruptive, transformational change in a way that we don’t completely understand today. In fact we don’t even have a name for what comes next. Yet.

Big Data Week: New and Different

Big Data Week

I was able to take part, albeit fleetingly, this week in Big Data Week, a series of events run in 25 cities across the globe. There was a series of evening meetups, hackathons and panels throughout the week in London, with the key event running out of Imperial College on Thursday. Edd Dumbill (O’Reilly Strata) chaired and speakers included the excellent Kenneth Cukier (co-author of Big Data) and Nick Halstead (founder of UK start-up DataSift).

First off, my congratulations go to Stuart Townsend and Andrew Gregson, who pulled together a program that was excellent, free of hype and as grounded as the title ‘Putting Data to Work’ suggests. Superb.

Because of pressing commitments with the day job, I dipped in and out of what I thought would be the ‘best’ bits and was mostly right with only one session that really missed the mark for me.

Takeaways

My takeaways:

– Big Data projects are still largely about click stream, native internet businesses and one-off mash-up projects (more often public and NGO than corporate)

– Big Data has its origins in the web (thank you Google) and this is where most of the corporate activity remains. ‘Mainstream’ (whatever that means in the networked age) corporate use is a way off

– We are still largely defining Big Data in technical terms (a good 40% of the group in one session described themselves as ‘technical’ when polled)

– It is still very early days. Innovation, interest and investment are still high and growing

New and Different Data

The highlight for me was Cukier who has come closest to providing a satisfactory definition of Big Data for me. As you may appreciate from previous posts, technical definitions based on volume are, strictly speaking, spot on but leave me a little cold. I attempt to get closer to something more vital (shameless plug alert) in Decision Sourcing (Roberts and Pakkiri, Gower 2013) by describing it as ambient. By describing it as such I am asserting that it is data that has always existed around us (temperature readings, product mentions, consumer comments, buyer behaviour) but it has only recently been captured and made available to us as data.

Let me explain. If I abandon my basket in Sainsbury’s because I couldn’t find the one thing I came in for, it can be seen by the store manager when they close but not understood. Not so for Amazon. The thermostat in my home is the very definition of ‘there or thereabouts’ but a Nest captures, stores and learns from accurate readings. When I share how great brunch is at Balthazar around the office, someone may make a mental note. When I do it on Facebook, it’s a piece of data to go with the other 1.5 billion that day.

The point is not that big data is big, though it is. It is that it wasn’t available to us before, either because it wasn’t being captured (social mentions) or because its volume and variety (web clicks, smart meter readings) made it impossible to store and analyse and therefore understand.

As Cukier puts it, sometimes more is not just more. Sometimes it is so much more that it is different. Big Data Week 2013 seemed to be a great success to me. I look forward to New and Different Data Week 2014.

Big Data: Let’s agree, no more V’s

3Vs

I don’t quite know when it happened but we have recently added another V to the three existing characteristics of big data. Perhaps more. Gartner analyst Doug Laney gave us the first batch: high volume, real-time, rapid change velocity and unstructured variety. This certainly set big data apart and at least partially explained why the old tech combination of columns, rows and SQL was no longer big, strong or fast enough to deal with it. We needed Hadoop, columnar, NoSQL, massively parallel and other innovations to deal with a full three V’s.

4Vs

More recently, veracity has qualified for this somewhat exclusive club. Dealing with notions of sentiment, mentions and sociographics from tweets, Facebook status updates and YouTube comments is an imprecise practice, very unlike traditional data processing where all transactions balance and net out to zero. According to IBM, one in four business leaders do not trust the data that they make decisions on and this new world is unlikely to make them feel any less queasy.

More V’s

A quick search will find other candidate V’s including visualisation. Indeed, one source suggests we are up to six V’s but it is time to stop counting. Whilst classifying and characterising big data in this way is understandable, it is not completely helpful. In fact, according to WhereScape CEO Michael Whitehead, it perpetuates the stereotype of navel gazing IT types. This ever increasing collection of V’s is not strictly true either. Some Big Data is not high volume, some not real time and some might even have a little structure.

It kind of misses the point, as well as missing out another twenty-five letters of the alphabet. Big data is certainly sourced from different places – from web sites, social platforms, machines on the internet of things. It is also certainly plentiful and strange. However, defining it in terms of where it has come from or how it is processed is a technicality. It would offer far more insight to discuss it in terms of how it can be used in retail, insurance and telecommunications.

One V

Indeed, like many others, I can only really get behind one v. V for value. Like all data the test is what you do with it once you have it. If the answer is identify fraud, adjust an insurance premium in real-time, predict climate change patterns or alert a physician that a therapy regime is dangerously out of step then we can see something of value. If the answer is nothing then all that hadoop’ing, nosql’ing, massively parallel’ing and v counting is for idle curiosity.

See also Big Data – Why the 3V’s Don’t Make Sense, What is Big Data? and the Top 5 Myths About Big Data.

Facebook Graph Search: The Power of the Nodes and Edges

According to Mashable, Facebook Graph Search could be its greatest innovation. I tend to agree. FB have eight years of Big Data (including roughly two billion new Likes each day) to help us identify products, services and brands that we might need through the experiences of those in our social network.

Actually, a graph consists of only two things: nodes (people) and edges (their relationships). Analysis of these, though, can reveal much. The simplest is a measurement of neighbours, the number of edges and their direction. A node with a large number of inward edges (or indegree) can be thought of as popular. One with a large number of outward edges, gregarious. If it were possible, Lady Gaga could make a whole boutique full of dresses out of her indegree. Simple analysis of these elements is behind ‘People You May Know’ features in LinkedIn, Chatter, Connections, Jive and Facebook, to mention just a few.
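Counting edges really is that simple. Here is a minimal sketch, assuming a toy list of directed ‘follows’ edges that I have made up for illustration; it says nothing about how any of the platforms above actually store or analyse their graphs.

```python
from collections import Counter

# Hypothetical directed edges: (follower, followed).
EDGES = [
    ("ann", "gaga"),
    ("bob", "gaga"),
    ("cat", "gaga"),
    ("ann", "bob"),
    ("ann", "cat"),
]


def degrees(edges):
    """Count inward edges (indegree) and outward edges (outdegree) per node."""
    indegree, outdegree = Counter(), Counter()
    for source, target in edges:
        outdegree[source] += 1
        indegree[target] += 1
    return indegree, outdegree


indegree, outdegree = degrees(EDGES)
most_popular = indegree.most_common(1)[0][0]      # node with the most inward edges
most_gregarious = outdegree.most_common(1)[0][0]  # node with the most outward edges
```

In this toy data the popular node is `gaga` (three inward edges) and the gregarious one is `ann` (three outward edges).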

New Nodes

Of course, the FB Graph includes other types of nodes (businesses, brands, products) and many other types of edge including the ubiquitous Like. FB also have demographics and psychographics because we surrender more information about ourselves to FB than we would feel comfortable doing in any other survey online or offline. We’re all concerned about privacy but generally end up somewhere around ‘what are you gonna do?’.

These simple elements add up to something very powerful. It’s possible not just to find French restaurants in Frimley but those that are preferred by frequent travellers to the Côte. Not just DIY stores nearby but those popular with power tool enthusiasts. Robert Putnam could have found countless examples for his book on the decline of social capital, Bowling Alone. And it is just the beginning. Let’s not forget that those edges include ‘listened’, ‘read’, ‘watched’, ‘hiked’ and ‘cooked’, to name but a few of the verbs now residing in your Facebook apps list and your personal social graph.
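To show how typed edges turn into that kind of answer, here is a rough sketch of the Frimley question. Everything in it is hypothetical: the people, the places, the verbs and the rule that two visits make a frequent traveller are assumptions of mine, and it is not a description of how Graph Search itself works.

```python
from collections import Counter

# Hypothetical place nodes with a couple of attributes.
PLACES = {
    "Chez Hypothetique": {"kind": "french restaurant", "town": "Frimley"},
    "Le Exemple": {"kind": "french restaurant", "town": "Frimley"},
    "Tool Shed": {"kind": "diy store", "town": "Frimley"},
}

# Hypothetical typed edges: (person, verb, target).
EDGES = [
    ("ann", "visited", "the Côte"),
    ("ann", "visited", "the Côte"),
    ("ann", "like", "Chez Hypothetique"),
    ("bob", "visited", "the Côte"),
    ("bob", "like", "Le Exemple"),
    ("cat", "like", "Tool Shed"),
]


def frequent_visitors(edges, destination, minimum=2):
    """People with at least `minimum` 'visited' edges to the destination."""
    visits = Counter(
        person for person, verb, target in edges
        if verb == "visited" and target == destination
    )
    return {person for person, count in visits.items() if count >= minimum}


def liked_by(edges, people, kind, town):
    """Places of the given kind and town liked by anyone in `people`."""
    return sorted({
        target for person, verb, target in edges
        if verb == "like"
        and person in people
        and PLACES.get(target, {}).get("kind") == kind
        and PLACES.get(target, {}).get("town") == town
    })


travellers = frequent_visitors(EDGES, "the Côte")
recommended = liked_by(EDGES, travellers, "french restaurant", "Frimley")
```

Only `ann` counts as a frequent traveller here, so the one recommendation is the restaurant she likes. Swap the verb and the attributes and the same two steps would answer the DIY store question.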

Big Data Breakthrough

This is a Big Data breakthrough for Facebook and puts some distance between them and their competitor, Google. I am not sure that plus’ing is enough of an ‘edge’ at this stage. And for those that can’t see that FB and Google are competitors, remember that there is no revenue in Search. No one actually pays Google to organise the world’s information. Nor do we part with our cash for maintaining personal networks on Facebook. There is, however, a group of people willing to pay for connecting people to products they might enjoy. Advertisers. In other words, there is revenue in creating new edges between nodes. That’s the power of the graph.

What Has CRM Ever Done for Us?

Actually, the Romans come out rather well when Reg asks the question of a bunch of masked activists in Matthias’s house in ‘The Life of Brian’. The aqueduct was just the beginning. Would CRM fare so well in a contemporary and probably unfunny update of the classic scene?

What has it done for us? Don’t misunderstand me. I use Salesforce, my chosen flavour of CRM, every day. I wouldn’t be without it. Everything I do is captured in those seemingly simple customer, contact and opportunity tabs. However, what has it ever done for me … as a customer?

I have just finished Doc Searls’s latest book, The Intention Economy. It is a jarring book which turns CRM on its head, instead describing a world where software helps customers manage their suppliers rather than the other way round. It manages to be visionary by illustrating with situations which are utterly everyday. As customers, like frogs on slow-boil, we have come to accept the unacceptable. We tolerate what should be intolerable.

For example, Doc makes the point that when he travels by air (not unlike me) he has no special dietary requirements, places few demands on cabin crew, is likely to offer up his seat to accommodate a family or couple travelling together and is willing to pay (a little rather than take out a mortgage) extra to reduce the stress of travelling because the novelty has long since worn off. What his frequent flyer programme knows about him (and mine about me) is the total miles we have travelled and our address. Hmmm.

Yesterday, I received a ‘personal’ note from a high street chain that I used to visit often but haven’t been able to recently. Let’s say it’s a shop for the body. I shop here because I admired their deeply principled founder and her stand on ethical, environmental and social issues. I also like smelling like a satsuma. Mostly though, I shop there because there is a convenient outlet on the Waterloo concourse, my gateway into and out of London. Rather, there was an outlet. It closed down during the station refurbishments and has yet to reappear. The CRM system that delivered the ‘personal’ note to me noted that they hadn’t seen me in a long time and offered me a generous discount to return. So far, so good. However, the featured products were wild rose hand cream, lip butter and a free makeover. I am a modern man and I freely admit that I prefer the smell of citrus fruit to masculine musk but it didn’t seem like a particularly compelling selection, even for me.

And this is a business I respect. At least their CRM had spotted that it had been an unusually long time since the last transaction.

Another on-line retailer that I have been ‘loyal’ to for years has been through a recent CRM upgrade. I now only receive the section of their clothing catalogue for men. They finally understand my gender and no longer assume that my wife and I automatically like the same brand because we pick out curtains together. They worked out that I am a male and that I have different shopping habits to my wife. Big whoop.

This is the reality of CRM and Big Data today. Companies at the top of their game, with the most sophisticated CRM, have worked out households, genders and not much more. And B2B is generally not even close. Much of the direct mail (interruptions) that I receive in my office inbox doesn’t even get my name correct and little, if any, is relevant to my job title or role.

It is true that sophisticated relevance marketing exists. These are the types of systems that can tell when you have started and finished the Atkins diet but they require a level of exclusivity associated with a church service and a gold band rather than the somewhat lighter associations most of us have with our grocers, coffee shops or satsuma scented shower gel supplier.

The Romans did actually give us irrigation, underfloor heating and straight roads but what has CRM ever done for us? We need more than a wallet full of loyalty cards, an iPhone full of apps, licensing terms that we accept without reading and discount vouchers with a redemption date just expired at the time we want to use them. It has a long way to go before it makes good use of all of that data, all those cookies and screens of social analytics. Mostly, CRM needs to respect that unless it is going to make good and positive use of all of that data, customers might tire of waiting, take it all back and start building VRM. The clock is ticking.

Just Stop with the ‘Big Data is Just’

OK, I get it. You’re sceptical. You’ve seen stuff come and you’ve seen it go. To you big data is just BI, just data, just analytics for the hip kids, just a distraction or just hype and fad.

Except it isn’t. Big data is only ‘just’ analytics in the same way that cloud is ‘just’ ASP or bureau. That is to say, it isn’t at all.

It ain’t Hadoop either

Others define it in terms of the technology. I get this too. New tech is making it all possible and existing databases have been a barrier. New approaches like Hadoop were borrowed from those that pioneered extracting value from enormous volumes of data. To the traditional data vendors, a terabyte was a big deal. They failed to notice that this was becoming standard in a home PC and that insurgent innovators were capturing, processing and mining mountains of data. They didn’t keep up, so others had their lunch money and now they are playing catch-up.

But it would be wrong to define big data in terms of the innovation that allows it to happen. A little like defining fine dining as an activity conducted with knives, forks and a high quality napkin. It would be the most common mistake of the Big Data muggle.

The end of transaction oriented business

So if it’s not just ‘just’ and it’s not the technology … what is it?

It’s nothing less than a profound change in our approach to data. Historically, businesses managed themselves as a series of transactions. Occasional snapshots, if you will. Only the essential financial and operational interactions between them and their customers. A quotation, an order, a despatch note and, most importantly, an invoice. Early on-line commerce began to change this. Every gesture a customer made on their shopping journey could be captured. An abandoned basket in a supermarket tells the store manager nothing. Online, the same shopping cart could tell us that the delivery times are too long, the accessories were out of stock or that the secure shopping statement was in the wrong place. For the first time, so much data was being generated that ‘traditional’ analytics started to creak and groan and most of this type of analysis took place outside of corporate BI. It was ‘special’ click stream, needed specialised tools and the BI specialist and vendor shook their heads at its lack of structure. Where were the columns, rows and indexes?

This was just the beginning. Social platforms don’t just allow the analysis of shopping behaviours but all behaviours. If a customer comments, complains, compliments or converses in general about you or your brand, it is possible to know. It’s no longer hearsay or anecdote, it’s available from the blogosphere or the Twitter firehose. It’s data.
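As a rough illustration of what the online basket can tell us, here is a sketch that looks at the last thing each shopper did before walking away. The session ids and event names are hypothetical, and treating the final event as the likely reason for abandonment is a simplifying assumption of mine rather than an established method.

```python
from collections import Counter

# Hypothetical click stream in time order: (session id, event).
EVENTS = [
    ("s1", "add_to_cart"), ("s1", "view_delivery_times"),
    ("s2", "add_to_cart"), ("s2", "view_delivery_times"),
    ("s3", "add_to_cart"), ("s3", "accessory_out_of_stock"),
    ("s4", "add_to_cart"), ("s4", "checkout_complete"),
]


def last_step_before_abandonment(events):
    """Count the final event of every session that never completed checkout."""
    last_event = {}
    for session, event in events:
        last_event[session] = event  # later events overwrite earlier ones
    return Counter(
        event for event in last_event.values() if event != "checkout_complete"
    )


reasons = last_step_before_abandonment(EVENTS)
```

In this made-up stream, two of the three abandoned baskets end on the delivery times page, which is exactly the kind of hint the supermarket manager never gets.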

Another beginning

Actually, that was just the beginning of the beginning. New classes of devices that can generate more data than the most active surfer or shopper are boosting the on-line population. Forget smart meters and the internet fridge, at least for now. Think more about ultra-low cost devices that remind you to water the yucca, feed the guppy or take your medication. If you forget any of these, particularly the medication, they will probably tell others too. Connected asthma inhalers can provide insight into air quality, and cars can connect with your insurers, who adjust your premiums because your acceleration and braking patterns suggest that you are driving like you are on a track day rather than on the Hanger Lane gyratory. Oh, and my new Pebble watch (when it arrives) will add to the billions of facts, snippets and streams being added to that one big database in the sky. The cloud.

Ambient Data and why Big Data is Big

Big data represents a profound change. In our book Decision Sourcing (Gower, 2013), we refer to it as ‘ambient’ rather than big data. Ambient because we have always been surrounded by our thoughts, gestures, actions and conversations but they have never been data before. They were lost (as Rutger said) ‘like tears in rain’.

Today, we are approaching an age where it is possible and practical to know everything that there is to know. Everything that is (to use an arcane legal expression) ‘uttered and muttered’. That’s what makes it big. Really big. Teradata think a Tera is big but it’s just a walk to the shops compared to Big Data.

Oh no, it is not ‘just’ anything. It is the beginning of the most significant shift in our industry since it began. The complexities are many, the data as varied as it is voluminous, but the prize is knowledge and insight, much of it predictive. Indeed, everything we have done to this point has been in preparation for the age of Big Data.

If Big Data is just anything right now … it’s just the beginning.