

In the race to extract value from big data, bottlenecks in the data-handling process remain commonplace. One of the primary constraints is data preparation, where data must be standardized, cleansed, formatted, and readied for analytics engines.


  • The self-serve model in analytics and data preparation is growing quickly, but most data prep today is still bottlenecked with specialists, such as data scientists.
  • Data preparation, ideally, should begin at the point of data creation rather than later downstream during data handling.
  • With the increased prevalence of the unified, managed “data lake” model, data prep is increasingly poised to become a feature rather than just a tool.

Features and Benefits

  • Assesses the current usability of tools in the data preparation market, and helps identify possible bottlenecks within the enterprise data prep workflow.
  • Evaluates the far-reaching benefits of expanding data preparation capabilities to a wider set of enterprise end users.
  • Analyzes the role of data prep in the scope of holistic enterprise information management strategy and architecture.
  • Identifies possible strategies for spreading the user base of data prep tools and expanding the culture of data literacy among nontechnical users.

Key questions answered

  • What is the current user base of enterprise data prep tools, and how do they fit into information management strategy?
  • How would the enterprise benefit from moving data prep abilities "upstream," closer to the source of data creation?
  • How would more user-friendly data prep tools improve productivity and strengthen the culture of data leverage?

Table of contents


  • Catalyst
  • Ovum view
  • Key messages


  • Recommendations for enterprises
  • Recommendations for vendors

The self-serve model is growing, but most data prep today is still bottlenecked with specialists

  • Many data professionals spend disproportionate amounts of time managing data rather than mining it
  • In an unstructured data world, the prevailing methodology is still semi-structured

Data prep must begin at the creation of data, rather than downstream

  • Data preparation today is an IT bottleneck, not an end-to-end process
  • Moving data prep closer to the original source of data creation lessens the burden on all
  • Democratized data quality and prep could distribute effort for better analytics results

Data prep is beginning to become a feature rather than a tool

  • The trend toward pooling content in the "data lake" is blurring data preparation and data management
  • Increased prevalence of self-serve data prep is conditioning business users to be better data stewards
  • As technology matures, specific use-case tools often become embedded within other platforms


  • Methodology
  • Further reading
  • Author
