The market for self-service data prep is becoming increasingly “platformized.” Standalone data prep tools are gradually adding more data management capabilities, and data management and BI/analytics vendors are adding integrated data prep capabilities to their existing environments. SAS, one of the newest entrants to this market, debuted SAS Data Preparation in December 2017. The product runs on the SAS Viya cloud-enabled, in-memory analytics engine. Self-service data prep is a logical addition to the SAS Viya self-service platform.
Straightforward data prep running on the SAS Viya engine
Previously, those who wanted to prepare data in the SAS environment were limited to SAS Data Loader for Hadoop (which provided self-service data prep on Hadoop/Spark), more conventional data quality tools from SAS or third parties, or handling the process manually. The common thread with these approaches, apart from SAS Data Loader, was that they frequently required IT involvement. This runs counter to SAS’s goals for SAS Viya, which is built around self-service. To recap, SAS Viya is SAS’s emerging, cloud-friendly, in-memory self-service analytics platform that refactors capabilities from across the company’s analytics and data management portfolio. It converges pieces that were formerly available through separate tools. It also breaks down the wall between SAS and non-SAS languages: analytics developed in Jupyter notebooks in languages such as Python or R can run natively in SAS.
SAS Data Preparation joins other tools, such as SAS Visual Investigator, that have been released on Viya. With the introduction of self-service data prep capabilities, business users can now take data prep into their own hands without leaving the SAS ecosystem, reducing the burden on IT while maintaining the governance controls and processing capabilities that the SAS platform offers for data.
SAS Data Preparation provides a toolbox of capabilities that should meet nearly all basic data prep use cases. Offered via a visual, web-based application, SAS Data Preparation is designed first and foremost for business analyst users, and data is loaded and processed in-memory as part of the analytics pipeline. Users can append, join, filter, and transpose data; change case; convert, rename, remove, and split columns; trim whitespace; and create calculated columns, all without leaving the SAS environment. Visual data lineage exploration, useful for understanding where and how data exists within the SAS ecosystem, is also provided. Collaborative project management and data job monitoring and scheduling round out the capabilities that are geared toward business users. SAS Data Quality is also included in the SAS Data Preparation offering and is integrated into the interactive web UI, allowing for parsing, standardization, and matching. For programmers and technical users, SAS Data Quality provides APIs for data quality functions and data profiling.
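For readers less familiar with what these basic operations involve, the following is a minimal sketch in plain Python of a few of them (trim whitespace, change case, rename, join, filter, and calculated column). It is illustrative only: the function names, column names, and sample data are hypothetical and do not correspond to any SAS Data Preparation or SAS Viya API, which exposes these steps through a visual interface.

```python
# Illustrative only: plain-Python stand-ins for a few common data prep steps.
# All names and data here are hypothetical; none of this is a SAS API.

def trim_and_upper(rows, column):
    """Trim surrounding whitespace and upper-case one text column."""
    return [{**row, column: row[column].strip().upper()} for row in rows]

def rename_column(rows, old, new):
    """Rename a column while keeping the other columns in order."""
    return [
        {(new if key == old else key): value for key, value in row.items()}
        for row in rows
    ]

def inner_join(left, right, key):
    """Inner join two row lists on a shared key column."""
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]

def filter_rows(rows, predicate):
    """Keep only the rows for which the predicate is true."""
    return [row for row in rows if predicate(row)]

def add_calculated_column(rows, name, func):
    """Add a new column computed from each row."""
    return [{**row, name: func(row)} for row in rows]

def prepare(customers, orders):
    """Chain the steps: clean, rename, join, filter, then calculate."""
    cleaned = trim_and_upper(customers, "region")
    cleaned = rename_column(cleaned, "cust_id", "customer_id")
    joined = inner_join(orders, cleaned, "customer_id")
    large = filter_rows(joined, lambda r: r["quantity"] * r["unit_price"] >= 100)
    return add_calculated_column(
        large, "total", lambda r: r["quantity"] * r["unit_price"]
    )

customers = [
    {"cust_id": 1, "region": "  emea "},
    {"cust_id": 2, "region": "apac"},
]
orders = [
    {"customer_id": 1, "quantity": 3, "unit_price": 50.0},
    {"customer_id": 2, "quantity": 1, "unit_price": 20.0},
    {"customer_id": 3, "quantity": 9, "unit_price": 99.0},
]

result = prepare(customers, orders)
print(result)
```

Each step returns a new list of rows rather than modifying its input, which mirrors how a visual data prep tool typically presents a plan as a sequence of discrete, reviewable steps.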
However, SAS’s journey to provide self-service data prep has just begun, and some more advanced functionality (particularly guided features powered by machine learning) is still to be developed. Features such as predictive transformations are not currently available; instead, the product focuses on the core functionality that is most familiar to business analyst users. Native data cataloging capabilities have also yet to be added. While machine learning-powered features and data catalog capabilities are on the roadmap for 2018, the company has a game of catch-up to play, particularly when compared to some of the vendors in the space that were traditionally focused on standalone, best-of-breed data prep capabilities. Machine learning will be critical to expanding the user base of SAS’s data prep functionality beyond business analysts and technical users. For now, the strength of the SAS Data Preparation offering is its ability to recruit business users into the data prep process without leaving the SAS ecosystem, a useful quality for any organization looking to make SAS its one-stop shop for analytics and data processing.
“SAS Viya emerges as the bedrock of SAS’s future direction,” IT0004-000458 (June 2017)
“SAS’s investments position it to be an analytics market leader for the years to come,” IT0014-003291 (June 2017)
“Beyond Self-Serve: Expanding the End-User Audience of Data Prep,” IT0014-003213 (January 2017)
Paige Bartley, Senior Analyst, Data and Enterprise Intelligence