Hortonworks has unveiled version 2.0 of its DataFlow product in technical preview. Hortonworks DataFlow (HDF), a piece of middleware that manages incoming data streams, is targeted at emerging Internet of Things (IoT) use cases. HDF 2.0 adds support for many – but not yet all – of the products that fall under the umbrella of the Hortonworks Data Platform (HDP), the company's Hadoop mother ship. HDF 2.0 is still very much an embryonic product; however, the new release begins the process of bringing it under formal management and governance.
Making dataflow management more manageable
With its initial release a year ago, HDF took what had been a single-product company in a new direction by providing middleware that manages the flow of incoming data streams. HDF, which is based on the Apache NiFi project, is not necessarily tied to the Hadoop platform, because it can manage the delivery of streaming data to any data target.
Hortonworks introduced HDF to enter the burgeoning market for IoT applications and to spread its bets through product diversification. In its early stages, uptake of HDF has been modest in absolute numbers. However, roughly 30% of those trialing HDF are net new customers of Hortonworks, suggesting that there might be a standalone market that doesn't require the full HDP offering. This could be significant for Hortonworks, given that a prime culprit for its disappointing Q2 numbers (where sales – and expenses – rose 45–50% year on year) has been the long sales cycles associated with the costly land-and-expand strategy that all Hadoop providers must follow. A product that sells more quickly would be useful for Hortonworks.
HDF 2.0 adds the first hooks to the management components of HDP, including Ambari (a management console) and Ranger (data security). It also provides a collaborative front end, similar to Google Docs, so that multiple users can work together on configuring how HDF manages dataflow. On the company's roadmap is support for Atlas, which provides a means for tagging data (or data sources) that can serve as the basis for governance. Also on the roadmap is a new open source subproject, MiNiFi, which could miniaturize NiFi dataflow capabilities for embedding on edge devices – a means for pushing aggregation and flow management of streaming data down to the edge of the network, which will be critical for IoT networks to scale. Clearly, HDF is still very much a work in progress, so it is still early for production applications. However, the tie-in to the management umbrella that Hortonworks already offers for its data platform means that the coast is becoming clear for piloting tangible IoT use cases with production in mind.
Tony Baer, Principal Analyst, Information Management