It was interesting to hear about the big data alliance announced in mid-February, aimed at accelerating the adoption of solutions based on Hadoop – the best known but still relatively hairy open-source method for distributing, managing and processing very large and often disparate amounts of data. The initial group of companies that signed up includes IBM, GE, Pivotal, Verizon, and Hortonworks. This type of multi-vendor initiative, often undertaken after a new technology has experienced a few years of more hype than actual adoption, is a logical step aimed at encouraging standardization of code, common development platforms to stimulate application development, and certification among vendors and resellers, all targeted at accelerating the formation of a sustainable new high-growth category.
Notably, Cloudera (among others) has elected not to join the alliance – at least not yet. Back in February, Mike Olson, co-founder and chief strategy officer of Cloudera, was quoted as saying that his company believes the new initiative is redundant. His reasoning was that the Apache Software Foundation, the open-source group that has supported Hadoop since its launch in the marketplace, is the right forum for ensuring industry-wide standards for Hadoop. Unfortunately, open-source forums and collaborations do not always foster faster adoption, particularly when they are unfocused or leaderless, though OpenStack has made reasonable progress since Rackspace co-founded it. In the end, pragmatist customers, always the largest bloc of target customers, will undoubtedly opt for what they see as a default choice. So, hopefully, one of the two initiatives will win out over the other.
Cloudera’s view gives voice to the foreseeable tug of war between open-source advocates, who make their money from servicing and supporting the technology, and proprietary vendors, who make most of their money from IP license renewals. The concern among open-source advocates is always that the IBMs, Verizons, GEs, and VMwares of the world will attempt to co-opt Hadoop, as they have in the past with Unix and, to a much lesser degree, Linux. However, these days most vendors acknowledge that the days of purely proprietary technologies winning the adoption and upgrade wars are numbered. After the successful adoption of Unix in the 90s, followed by that of Linux in the 2000s, enterprise customers have shown that they will adopt service-led solutions based on these free, non-proprietary technologies.
One of the major liberating aspects of the cloud era of enterprise computing, which began around 2008 or so, has been to make IT organizations increasingly averse to vendor lock-in because they now have legitimate alternatives in, for example, OpenStack and the Open Compute Project. These days customers want to avoid the future threat of having to pony up millions of dollars for proprietary technologies that are embedded in their data centers but add less and less value in comparison with newer, more open technologies. Thus major players in on-premises computing, such as IBM, CA, and BMC in mainframe software, Oracle and IBM in database management systems, SAP and Oracle in ERP, and VMware in virtualization, are all increasingly vulnerable to the risk of customers opting out of maintenance contract renewals. It doesn’t help them now that, too often, the cost of the renewals has been extortionate, making customers feel more like hostages than clients.
This new big data alliance signals a couple of things:
- A number of leading traditional and younger cloud vendors are anxious about the slower-than-hoped-for adoption of Hadoop by larger corporate and government organizations, due in part to its inherent complexity and in part to the difficulty of figuring out which use cases it suits best.
- They realize that both incumbents and insurgents need to pool the know-how and technology that they each bring to the table in order to formulate credible, repeatable solutions to the business problems that pragmatist customers are anxious to solve.
Assuming that the alliance delivers much more than just a press release, the mere fact that these disparate product and service providers are pooling their efforts may well push big data across the chasm into the enterprise. It is always important to remember that pragmatist customers, more than visionaries or technology enthusiasts, are intensely self-referencing, preferring to make decisions as a herd. Furthermore, this is the set of buyers that eventually determines whether a category gains sustained market adoption within a relatively short time such as 2-3 years, whether it takes another five to ten years (think sales force automation or data warehousing in the 80s and 90s) or – worse – whether it never really takes hold (examples abound, including artificial intelligence and expert systems twenty or thirty years ago).