Sunday, October 31, 2010

Milkshakes

Picture: Michael Perini
Recently, I have been coming across the milkshake story far more often.  The story to which I refer is Clayton Christensen's research involving the purchasers of milkshakes at an unnamed fast food chain.  The story resonated with me when I first heard it a few years ago, and I've often casually thought about how to apply it beyond milkshakes.

For those who haven't heard of the research, the best way to learn about it is from Christensen himself in this 7-minute interview: http://www.youtube.com/watch?v=H3fGwsrXuZw.  The gist is that a particular fast food chain wanted to know how to increase milkshake sales.  They had done traditional market segmentation and focus group studies to no avail, so they wanted to try something different.  A researcher spent one day observing a restaurant and noticed that about 40% of milkshake sales occurred during the morning commute hours, so the next day he asked these morning purchasers some questions to understand why they came in to buy milkshakes.  The most popular response was not that they wanted a sweet dessert or a breakfast replacement; it was that they had long, boring commutes and needed something to stave off the inevitable mid-morning hunger at work.  In Christensen's words, these were parts of the "job" for which the purchasers were "hiring" the milkshake.  A banana didn't perform the job well because it was eaten too fast to reduce boredom; a donut didn't perform the job well because it left the eater hungry by 10am.  The best employee for the job turned out to be a milkshake since it took 20 minutes to consume (reducing boredom) and left the consumer full until lunchtime.  Thus a successful innovation in the milkshake space would be increased viscosity, further reducing boredom by causing the milkshake to last longer.  Improving flavor would have no impact on sales because the customer wasn't hiring for taste; the customer would not be willing to pay for such an unnecessary feature.

I see two excellent benefits to Christensen's jobs-based approach to innovation:
  1. It provides focus for product improvements (or for developing new products) on a small set of important features, increasing both the accuracy and speed of innovation.
  2. It aids in identification of competitors.
The second benefit is a bit less intuitive than the first, but it can be just as valuable in some contexts.  Returning to milkshakes, what would be a competitor for Wendy's milkshake?  One might be tempted to say "a McDonald's milkshake" or "a Burger King milkshake."  While that may be true if you consider only the milkshake market, it is not at all true if you consider what the customer is actually paying for.  The customer is paying for a job that involves reducing boredom on a morning commute and staving off mid-morning hunger.  Thinking in that context, it is clear that many products are competitors for Wendy's milkshakes, including bagels, bananas, donuts, granola bars, and egg sandwiches, to name a few.  If a lot of customers are hiring bananas for this job, you must consider what features of a banana are attractive to the customer and use those to better position your milkshake.  For example, a customer can buy bananas in convenient bunches, so she only has to make one trip per week to hire them; perhaps a pre-packaged milkshake sold as a package of 5 would be successful (indeed, note the prevalence of meal shakes).

Thinking about everyday products in the context of jobs being performed casts them in new lights and makes it easier to think of replacements.  Plus, it's fun to think through why I hire one coffee mug over another when all I want is a cup of tea.

Thursday, October 14, 2010

Universal Translator

In the Star Trek universe, a Universal Translator is a device that completely obliterates language barriers by instantaneously translating all speech to the user's native language.  In essence, it makes it so that everybody appears to be speaking one language.  In the Star Trek: Enterprise series, which takes place in the early years of humans' exploration of inhabited space, the translator is an imperfect device that merely produces a text translation of speech and often mistranslates, especially in the case of unfamiliar languages. The choice of the writers of Star Trek: Enterprise to portray the ancestor of the universal translator as a speech-to-text device is logical given that languages can be written down and text is easier for computers to work with than audio, right?

Earlier today I came across a TechCrunch article that got me thinking about speech-to-speech translation in the modern day.  One commenter mentioned an Android app that does speech-to-speech translation, so I took his tip and tried out Talk To Me Cloud.  The app's algorithm essentially includes three steps: 1) Listen to the Language A speech and translate it to Language A text; 2) Send the Language A text to a server in the cloud to translate it to Language B text; 3) Translate the Language B text to Language B speech. The interface took a minute to figure out, but true to its intention, it listened to my elementary Spanish and said it back to me in English.  Awesome!  But not quite a Universal Translator, or really all that useful.  Yet.
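
The three steps can be sketched in code.  This is only a minimal illustration and not the app's actual implementation: the function names, the stage signatures, and the toy phrasebook below are all hypothetical stand-ins for real speech recognition, translation, and speech synthesis services.

```python
# A sketch of the three-step pipeline. Every name here is hypothetical;
# real stages would call speech recognition, translation, and synthesis services.

def translate_speech(audio, source_lang, target_lang, stt, translate, tts):
    """Chain the three stages: speech -> text -> translated text -> speech."""
    source_text = stt(audio, source_lang)                            # step 1
    target_text = translate(source_text, source_lang, target_lang)  # step 2
    return tts(target_text, target_lang)                             # step 3


# Toy stand-ins so the sketch runs without any external service.
def fake_stt(audio, lang):
    # Pretend the "audio" bytes are already a perfect transcript.
    return audio.decode("utf-8")


PHRASEBOOK = {("es", "en"): {"hola": "hello", "amigo": "friend"}}


def fake_translate(text, source_lang, target_lang):
    # Word-by-word lookup; unknown words pass through unchanged.
    table = PHRASEBOOK.get((source_lang, target_lang), {})
    return " ".join(table.get(word, word) for word in text.split())


def fake_tts(text, lang):
    # Pretend that encoding the text produces spoken audio.
    return text.encode("utf-8")


spoken = translate_speech(b"hola amigo", "es", "en",
                          fake_stt, fake_translate, fake_tts)
```

Because each stage is passed in as a function, an error in any one of them flows straight through to the output, which is worth keeping in mind for the drawbacks listed below.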

The app is plagued by several drawbacks, some within the developer's control and some without:

  1. The visual interface is clunky.  It looks like the developer took the technology behind the app, translated its steps into UI components, and just tossed those onto the screen in linear fashion.  "Let's see... The user has to select the language that needs to be translated, so let's put a drop down box in there.  The user has to see the text of what was said so they know if the speech-to-text worked right, so let's toss a big text box in there.  The user has to tell the app when to start listening, so let's put a microphone button in there, but let's make it REALLY BIG so it's easy to find.  Etc."  The net effect of this is the app feels like a techie tool, not something that opens doors to otherwise impossible conversations.  Make it fun, developer!  I can use this app to ask someone to point me to the nearest restroom in Italian!
  2. The hardware doesn't match up.  The app relies on the phone's mic for the speakerphone to hear the speech it should translate.  The problem is, the speakerphones on current smartphones (at least those that I've dealt with) are pretty poor about picking up meaningful audio outside of a few feet away, and even then only in a quiet setting.  Now imagine using this app for one of its intended purposes: Suppose you are traveling in Germany and you are trying to describe your hotel (you can't recall the name) to a taxi driver.  You have this great app that will let you talk to this driver, so you bring out your phone and, naturally, set it between you two.  The app happily translates about 1 in 3 words, plus whatever is on the radio and the road noise outside.  The only solution available to you is to bring the phone close to you, speak, quickly hand it to the driver who speaks and quickly hands it back to you, and so on.  That's a bit of an awkward interaction.
  3. You have to tell the app which language is being spoken and which language it should be translated into.  That's fine if only you are speaking, but in most cases you're having a conversation in which more than one person speaks, and more than one language needs to be translated.  Luckily, the developer put in a "Swap" button that allows you to swap the input and output languages.  But consider the taxi driver interaction:  Now, not only do you have to pass the phone back and forth, but each time you do, you have to press the Swap button.  And since you have to keep swapping languages, you have to break up the audio recordings, and thus each swap you make also requires a press of the microphone button.  The awkwardness of the conversation just went up a few points.
  4. Speech-to-text technology isn't very accurate.  As you can see from my screenshot, the STT translator got a word wrong ("Tus" became "Chris").  If the first step in the translation process fails, what hope is there that anything intelligible will come out the other end?  There is a reason STT technology hasn't been enormously successful commercially: It's a really tough problem to solve.
  5. Text does not include emphasis or accents.  The usage of text as an intermediary makes sense given current technology (text-to-text translation is getting pretty good), but does away with the benefits of verbal communication, namely emphasis and accent.  The former can alter the meaning of the speech entirely, and the latter can provide insight into the speaker's origins as well as (more practically) change the meaning of the speech due to differences in dialects.  For sure, creating speech recognition technology that could recognize dialects would be very difficult, but it is simply an impossible task for text.
#1 and #2 will likely be addressed within a few years as better phones and better UI interactions are developed, but #3, #4, and #5 are interesting to think about.  I believe that there is a simple solution to #3: Rather than the user telling the app which language to use on each round of speech, the app ought to be able to figure out the language itself after one round.  Put another way, the app ought to be able to figure out which person is speaking after hearing each one once.  Assuming that speakers do not change languages, the identity of the speaker can tell the app which language to translate.  One way to accomplish this would be to use the assumption that the speakers in most conversations have differing pitches of voice.  Another way would be to use the position of the speakers relative to the phone (this would require multiple microphones on the phone, of course, but many phones, such as my Droid X, already have them).
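
Here is a rough sketch of the pitch idea, assuming that some other component (not shown) can already estimate a speaker's average pitch in Hz.  The class name, its methods, and the pitch values are all hypothetical, and real voices overlap far more than this toy example suggests.

```python
class PitchLanguageSwitcher:
    """Guess the spoken language from the speaker's pitch (hypothetical sketch)."""

    def __init__(self):
        # Maps a language code to the pitch (Hz) heard in that speaker's first turn.
        self.profiles = {}

    def enroll(self, language, pitch_hz):
        """Remember the pitch of the speaker who uses this language."""
        self.profiles[language] = pitch_hz

    def detect_language(self, pitch_hz):
        """Return the language whose enrolled pitch is closest to this one."""
        if not self.profiles:
            raise ValueError("no speakers enrolled yet")
        return min(self.profiles,
                   key=lambda lang: abs(self.profiles[lang] - pitch_hz))


switcher = PitchLanguageSwitcher()
switcher.enroll("en", 120.0)   # the traveler's first turn
switcher.enroll("de", 210.0)   # the taxi driver's first turn

# On later turns the app picks the input language without a Swap button.
traveler_turn = switcher.detect_language(130.0)
driver_turn = switcher.detect_language(198.0)
```

After the first round, the user would only need to correct the app on the occasions it guesses wrong, rather than pressing Swap on every turn.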

#5 is a basic problem with text, so some other medium must be used to address it.  Video, scent, and touch would be pretty interesting, but audio seems to be the most practical.  In this case, audio also brings the benefit of reducing the number of translations from three (speech-text-text-speech) to one (speech-speech).  Assuming that any translation is imperfect, the multiplier effect would say that one translation has a much greater potential for high quality than three translations.  Further, direct speech-speech translation would bring the benefits of emphasis and accents, which are vital to verbal communication.  If we truly want a Universal Translator, it needs to be a direct speech-speech translation device.  That would take care of #4 as well.
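
To put rough numbers on the multiplier effect, suppose (purely for illustration) that each translation step independently preserves the speaker's meaning with some probability, and that a direct speech-speech step could be made about as good as any single step in the chain.  Both assumptions, the function name, and the 0.9 figure are mine, not measurements of any real system.

```python
import math


def chained_quality(step_qualities):
    """Probability the meaning survives every step, assuming independent steps."""
    return math.prod(step_qualities)


# Three steps (speech-text, text-text, text-speech) at a made-up 90% each.
three_steps = chained_quality([0.9, 0.9, 0.9])

# One direct speech-speech step at the same made-up 90%.
one_step = chained_quality([0.9])
```

Under those assumptions the three-step chain preserves the meaning only about 73% of the time, versus 90% for the single step.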

Finally, it seems to me that speech-text, text-text, and speech-speech are all different problems.  They may have some underlying similarities such as matching algorithms, but progress in the text-text space may be unhelpful to the progress of the speech-speech space.  Thus I feel that the writers of Star Trek: Enterprise were misguided: The speech-text translation device is not a stepping-stone to the Universal Translator.  If computer scientists and developers focus their energies on direct speech-speech translation, we just might have the UT before Star Trek's predicted late-22nd-century arrival.

Thursday, October 7, 2010

Potential Disruption Chain

While at Napkin Club this evening I fell into a reverie for several minutes.  The trigger?  I saw a keychain with a Kroger loyalty card.

I got to thinking about loyalty cards, specifically about the future of them.  They are predicated on a particular set of technology: that of barcodes, barcode scanners, and computerized registers.  But what is happening as more and more information is moving to mobile devices?  Shoppers are beginning to want to keep loyalty card information on their devices rather than accumulating piles of physical cards.  The problem is that current scanner technology does not work with barcodes displayed on mobile devices. (Just try it: If you have a smartphone, get a keychain app for your phone and load some of your loyalty cards into it.  Now go to your favorite store, load up the loyalty card on the app, and attempt to swipe your phone across the scanner.  I'll bet you and/or the annoyed sales clerk give up after 30 seconds.)  The solution to this problem obviously involves some other protocol through which the mobile device and the scanner can communicate, such as Bluetooth.  Or does it?

In fact, the solution may have nothing to do with mobile devices or scanners at all.  It all goes back to the job that the store is hiring the loyalty card to do.  What is the store trying to do with the loyalty card?  One could argue that the primary purpose is to collect customer purchase data.  That is likely true, but there must be a reason that the store wants to collect such data.  I would posit that the store wants purchase data so that it can find correlations in purchasing behavior and link them with demographic information.  Going further, the goal of such correlations is to help the store optimize which products it offers, where it places them, and the like.  Further still, the primary purpose of such optimizations is to increase the store's sales.  That increase most likely serves to increase profits, which in turn serves the store's goal of maximizing shareholder value.

By recursively considering the real purposes behind each activity or goal, I created a chain of continually expanding goals, all the way to the store's ultimate goal of maximizing shareholder value (one could go even beyond that, but I think that's far enough for the purposes of discussion).  Here's the big idea:  Every one of the links in the chain could potentially be substituted.  The loyalty card could be substituted by a smartphone to fulfill the goal of providing trackable purchase data, the product placement optimizations could be substituted by providing higher margin products to fulfill the goal of increasing sales, and increasing sales could be substituted by reducing costs to fulfill the goal of increasing profits.  In other words, every one of the links in the chain is a potential area for disruption.
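
One way to make the chain concrete is to write it down as data.  The sketch below is just one possible representation: the `Link` class, its field names, and the exact wording of each link are my own shorthand for the loyalty card example, and the placement of the "specialized firm" substitute is a judgment call.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Link:
    activity: str                      # what the store does today
    purpose: str                       # why it does it (the next link up)
    substitutes: List[str] = field(default_factory=list)


chain = [
    Link("use loyalty cards", "collect purchase data",
         ["track purchases via smartphone"]),
    Link("collect purchase data", "correlate purchases with demographics",
         ["buy the data from a specialized firm"]),
    Link("correlate purchases with demographics",
         "optimize product selection and placement"),
    Link("optimize product selection and placement", "increase sales",
         ["offer higher-margin products"]),
    Link("increase sales", "increase profits", ["reduce costs"]),
    Link("increase profits", "maximize shareholder value"),
]


def is_connected(links):
    """True if each link's purpose is the activity of the link above it."""
    return all(a.purpose == b.activity for a, b in zip(links, links[1:]))


def disruption_candidates(links):
    """List (activity, substitute) pairs, bottom of the chain first."""
    return [(link.activity, sub) for link in links for sub in link.substitutes]


candidates = disruption_candidates(chain)
```

Links with an empty `substitutes` list are the ones where nobody has thought of an alternative yet, which makes them a natural prompt for brainstorming.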

Building such a potential disruption chain could be a useful tool for drawing out ideas for innovations.  It is easy to think of innovations in the bottom link (Let's just use this newer, better technology X!), but as you move up the chain, innovations have much greater potential for impact.  Just imagine the store coming up with a new method for obtaining correlated purchase and demographic data, such as simply buying the data from a specialized firm.  The specialized firm may offer much higher-quality analyses, and the store would be able to realize savings in the printing of loyalty cards, scanner equipment, data warehousing, and the performance of its own analyses.  Breaking out the purpose of a particular activity into a potential disruption chain helps such ideas come forth.

A Potential Disruption Chain

Friday, October 1, 2010

Managing an Innovation Portfolio

This week, my Marketing class discussed innovation strategy, driven particularly by George S. Day's article, "Is It Real?  Can We Win?  Is It Worth Doing?", printed in the Harvard Business Review in 2007.  Two key points popped out at me from this article:

  1. Just like in the world of investments, your innovation portfolio should be diversified in terms of risk.
  2. To narrow down the candidates for innovations to pursue, have a team ask the three essential questions that make up the name of the article.
I found the first point particularly interesting.  Day suggested creating a scatter plot of projects, with the x-axis being the newness of the intended market (to the firm) and the y-axis being the newness of the technology (to the firm).  Intuitively, he's suggesting that risk increases as the innovation is either more unfamiliar in terms of technology or more unfamiliar in terms of intended market.  The big idea with this is that a firm ought to be able to draw a trendline roughly equivalent to y=x, where the points are evenly distributed between high risk and low risk.   In other words, the ideal strategy includes incremental innovations, big innovations in terms of new markets, big innovations in terms of new technologies, and big innovations in both new markets and new technologies.
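
As a quick way to check a portfolio against that ideal, the projects can be bucketed into the four categories above.  The sketch below assumes each project has been scored from 0 to 1 on market newness and technology newness and uses 0.5 as the cutoff; the scoring scale, the cutoff, the function names, and the sample projects are all my assumptions rather than anything from Day's article.

```python
from collections import Counter


def quadrant(market_newness, tech_newness, threshold=0.5):
    """Place one project in a category; the 0-1 scale and cutoff are assumptions."""
    new_market = market_newness >= threshold
    new_tech = tech_newness >= threshold
    if new_market and new_tech:
        return "new market and new technology"
    if new_market:
        return "new market"
    if new_tech:
        return "new technology"
    return "incremental"


def portfolio_mix(projects):
    """Count projects per category; projects maps a name to (market, tech) scores."""
    return Counter(quadrant(market, tech) for market, tech in projects.values())


# Hypothetical portfolio: (market newness, technology newness)
projects = {
    "Project A": (0.1, 0.2),
    "Project B": (0.2, 0.1),
    "Project C": (0.8, 0.3),
    "Project D": (0.9, 0.9),
}

mix = portfolio_mix(projects)
```

In this made-up portfolio, two of the four projects are incremental and none is a pure technology bet, which is the sort of gap the scatter plot is meant to expose.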

As Day points out, most firms focus on the lower-left (incremental innovations).  This makes sense in a risk-averse environment, where a misstep could damage a company's credibility.  But what about an environment where missteps are expected or easily swept under the rug?  I would argue that tech fits that category.  Just take a look at recently failed high-risk projects, such as Google Wave (New technology? Yes.  New market for Google?  It was a project collaboration service - certainly a new space for Google).  Did Wave damage Google's credibility?  Not at all.  Could it have been huge?  Definitely.  In fact, tech demands high-risk projects, as evidenced by the common worry among tech executives that their business could be replaced by "two guys in a garage."  Those two guys in a garage are certainly pushing the envelope of new technologies and new markets.  In order for an incumbent to compete, it needs to beat the two guys to those innovations.

Incumbent tech firms certainly need some incremental innovations to continue steady growth with existing products, but complete risk aversion will lead to steady decline.  Technology moves fast; patents are easily worked around; trade secrets are quickly leaked.  The only sure growth strategy is a healthy mix of risk.