Ovum view

Summary

Data lakes are not typically the first use case for big data implementations. Instead, they usually represent an advanced stage of evolution, reached once successful use cases have materialized across multiple line organizations and triggered the search for a strategy to manage and govern big data as an enterprise resource. But figuring out where to start with governance can be tough. Zaloni, which provides a solution for helping organizations manage the data lifecycle, is packaging a ready-made starter solution for the Hortonworks Data Platform (HDP) that can help enterprises take that first step.

Getting data lake governance off the ground

At the crux of the matter, data lake governance is about maintaining effective control and visibility over data. Data lake governance borrows from the experience of governing enterprise data warehouses – but the emphasis is on adaptation, because the nature of the data and how it is utilized differs sharply in a data lake. Ultimately, data lake governance encompasses:

  • Managing data inventory, which includes two centers of activity. These are data curation, which is performed by line-of-business users on a self-service basis to prepare their own sets of data; and physical inventory, where IT is ultimately accountable for documenting what data resides in the data lake and ensuring that it is properly secured.

  • Keeping data secured and access controlled.

  • Optimizing the cost and managing integration of the data lake with external data platforms and sources within the enterprise.

Zaloni has developed solutions that target the curation, physical inventory, and security aspects of data lake governance. Its original solution, Bedrock, performs a lifecycle management function that manages data from ingest through discovery, preparation, cataloging, and securing (by integrating with access control solutions and managing data protection). Zaloni also offers Mica, a business-user-oriented tool that provides the self-service front end for data preparation and curation.

Its new offering, "Data Lake in a Box," is a pre-configured implementation of Zaloni Bedrock and Mica on HDP. It is the result of harvesting best practices from Bedrock and Mica engagements with a number of early Zaloni customers. Among the elements in the "Box" are a data ingestion component preconfigured with connectors to popular data sources and a configured metadata exchange framework. While no two data lakes are the same, a prepackaged template provides a jumpstart that can help enterprises get off square one.

Appendix

Further reading

"Developing a Strategy for Data Lake Governance," IT0014-003113 (May 2016)

"On the Radar: Zaloni develops tooling for managing the data lake," IT0014-003090 (December 2015)

Author

Tony Baer, Principal Analyst, Information Management

tony.baer@ovum.com
