In my last blog post, Why BI and Digital Analytics Went Their Separate Ways, I argued that the separation of BI and Digital Analytics has made it increasingly difficult for marketers to achieve their business goals.  Specifically, it has been challenging for marketers to create a robust, actionable customer view and to drive communications to customers in a timely way that maximizes conversion.
Four industry groups are racing to repair the fracture – each from a different point of view.

  1. App Developers control the end-to-end environment, from data collection through analysis to activation.  By standardizing data collection, these vendors can build scalable environments that integrate the marketing analytics stack at very low cost, at the price of limited flexibility.  In eCommerce, Jirafe, Kiss Metrics and eCommera all fall into this category.
  2. Data Platform Providers also control the end-to-end environment, but require more custom development for any individual enterprise solution.  For example, AgilOne and Boxever aim to integrate the marketing analytics stack from tag to action.  A fully implemented solution can, however, take some time to set up and deploy into production.
  3. AdTech providers have developed hybrid business intelligence/digital analytics warehouses to help them integrate the stack themselves.  These solutions will ultimately marginalize point solutions for enterprise-class applications and may eventually scale down into the mid-size company market.  MediaMath, Turn and Marin Software have all developed offerings in this area.
  4. Custom-built integrations based on commercially available tools.  In this case, end users themselves deploy a tag management solution to capture data directly, a tool to map and aggregate the data in a NoSQL environment, and a BI solution to analyze the combined data.
    • The BI platform vendors have begun to facilitate these integrations.  The developer tools (e.g., Jaspersoft and Pentaho) are further down the digital analytics (DA) path but have less momentum, while commercial tools like Birst are beginning to address DA as commercial demand grows.
    • Vendors in the middle of the stack, like Snowplow and Treasure Data, have offerings that facilitate the data collection and NoSQL pieces.
    • Finally, the tag management vendors have begun to offer data capture and activation services, such as Tealium’s Event Store and AudienceStream products.

Depending on the company, one or more of these categories of solutions could be a good fit.

Repairing the Fracture

So, how does one evaluate the solutions on offer?  As always, it depends on your needs, but in general an integrated stack will require seven important elements.  Leaving aside the companies that want to integrate and control the entire stack, new technologies can facilitate all seven.

  1. Standardized data capture.  Capturing weblog data in an integrated way, so that companies no longer need to round-trip data to AdTech vendors before it can enter a Digital Analytics/BI system, can eliminate latency and data reconciliation issues.  Tealium’s Event Store, for example, stores all of a site’s event data and all the information sent to various AdTech vendors.  Using integrated sources of this type reduces the latency, reconciliation and redundancy associated with round-tripping data.
  2. Streaming data transfer.  AWS Kinesis, which was released in November of 2013, can collect data from large numbers of different sources into one location, where the data can be filtered, grouped, aggregated, and manipulated as it is transferred from the source either to S3 or directly into Redshift.  (See our blog on Amazon Kinesis to find out more.)
  3. Data mapping.  For weblog data to be of use in the SQL world, key-value pairs need to be mapped to database columns.  In AWS, EMR can handle this via jobs that parse the query strings and then map them into column-based outputs.  In addition, Redshift now offers new functionality that supports column mapping for JSON files.
  4. Aggregation.  Once the columns are mapped, event data needs to be aggregated into units that help marketers make better decisions.  On the customer side, this may mean aggregating path data for customer website visits into a visit intent segmentation.  (For example, people who browse the career section of the website, click on specific job titles, and then later investigate various product pages and company background pages are not likely to purchase products.  The intent of their visit is more likely company research as part of a job search.)  Traditional SQL databases handle this type of aggregation relatively poorly, but EMR is a great way to do this type of ETL.
  5. Master Data Management.  With all the various IDs created by the AdTech vendors, the legacy production systems inside the company and, increasingly, social data sources, matching becomes an increasingly important issue in this environment.
  6. Warehousing and BI.  Redshift, AWS’ multi-node column database, changes the game for integrating traditional structured data and weblog data.  At $1,000 per terabyte per year and with an ability to manage up to 6-7 petabytes, it eliminates many of the challenges that organizations have had using traditional SQL databases to house these types of data.  Multiple BI tools, including Jaspersoft and Birst, work with Redshift.
  7. Real-time activation.  Once the plumbing is in place, marketers need to react fast to capture the full potential value of each customer.  This means that a middleware decision engine will need to sit between the AdTech vendors and the customer’s data.
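
To make steps 3 and 4 concrete, here is a minimal sketch of mapping key-value pairs to columns and then aggregating a visitor's path into a visit intent.  It is plain Python rather than an EMR job, and the field names (visitor_id, section, page), the column list and the intent rule are all hypothetical, chosen only for illustration; a real segmentation would be far richer.

```python
from collections import defaultdict
from urllib.parse import parse_qs

# Hypothetical column layout for the target table (an assumption for this sketch).
COLUMNS = ["visitor_id", "section", "page"]


def map_event(query_string):
    """Map the key-value pairs of one logged query string onto fixed columns."""
    pairs = parse_qs(query_string)
    # parse_qs returns a list per key; keep the first value, or None if absent.
    return {col: pairs.get(col, [None])[0] for col in COLUMNS}


def visit_intent(sections):
    """Toy rule: any careers page view marks the visit as job research."""
    if "careers" in sections:
        return "job_research"
    if "products" in sections:
        return "purchase_research"
    return "other"


def aggregate_intent(query_strings):
    """Group mapped events by visitor and reduce each path to one intent label."""
    paths = defaultdict(list)
    for qs in query_strings:
        row = map_event(qs)
        if row["visitor_id"] is not None:
            paths[row["visitor_id"]].append(row["section"])
    return {visitor: visit_intent(sections) for visitor, sections in paths.items()}


if __name__ == "__main__":
    log = [
        "visitor_id=a1&section=careers&page=engineer",
        "visitor_id=a1&section=products&page=widget",
        "visitor_id=b2&section=products&page=widget",
    ]
    print(aggregate_intent(log))
```

In this toy log, visitor a1 views a careers page before a product page and is labeled job research, while b2, who only views products, is labeled purchase research.  The same two-stage shape (map each event to columns, then reduce per visitor) is what an ETL job would do at scale.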

Look for big changes in the way that companies analyze their web and customer data in the years to come.  The building blocks are emerging; integrated solutions will soon bridge the chasm between the BI and digital analytics worlds.