
IBM Brings Apache Spark’s Analytics Power to Linux and z Systems Servers


Analytics is increasingly an integral part of day-to-day operations at today’s leading businesses, as evidenced by the 2015 Gartner Executive Summary of the CIO Agenda. Eighty percent of CIOs surveyed stated that backward-looking, passive analysis must shift to forward-looking predictive analytics and active experimentation. Transformation is also occurring through huge growth in mobile and digital channels. The same survey shows that whereas separating analytics and operations had been acceptable in the past, 62 percent of CIOs now feel the need to shift to a more embedded analytics environment.

Previously acceptable response times and delays for analytic insight are no longer viable, as businesses push harder toward real-time and in-transaction analytics. In addition, data science skills are increasingly in demand. One McKinsey Global Institute report states that by 2018, the U.S. could face a shortage of as many as 190,000 people with data science skills. As a result, enterprises are attempting to use analytics in new ways and to transition existing analytic capabilities so they can respond with more flexibility while making the most efficient use of highly valuable data science skills.


Though the demand for more agile analytics across the enterprise is increasing, many of today’s solutions are aligned to specific platforms and tied to inflexible programming models. They require vast data movements into data lakes that quickly become data swamps, and they result in pockets of analytics and insight that need ongoing manual intervention to integrate into coherent analytics solutions.

With all of these impending forces converging, organizations are well poised for a change. The recent growth and adoption of Apache Spark as an analytics framework and platform is timely and helps meet these challenging demands.

What is Apache Spark?

Spark is an open-source, in-memory analytics computing framework offered by the Apache Software Foundation.

Spark offers a unified programming environment and is extremely lightweight. Most importantly, it’s function-rich: it provides libraries for commonly used analytic tasks, including data access, data manipulation and the application of various algorithms. Spark also offers language diversity, supporting Java, Python, Scala and, most recently, R.

From an operations point of view, Spark can run in stand-alone mode or in clustered environments. It isn’t reliant on a specific file system or set of platform technologies, and it can be adapted to many configurations.

In June, IBM announced a major commitment to Spark. The commitment includes plans to put more than 3,500 IBM developers and researchers to work on Spark-related projects worldwide, the contribution of the IBM SystemML machine learning technology to the Spark open-source community, and the intent to offer Spark as a service on IBM Bluemix.

Mythili Venkatakrishnan is an IBM Senior Technical Staff Member and is the z Systems Architecture and Technology Lead for analytics.
