Showing posts with label apachespark. Show all posts
Showing posts with label apachespark. Show all posts

Monday, 12 September 2016

Calling HANA Views from Apache Spark

Open Source Apache Spark  is fast becoming the de facto standard for Big Data processing and analytics. It’s an ‘in-memory’ data processing engine, utilising the distributed computing power of 10’s or even 1000’s of logical linked host machines (cluster). It’s able to crunch through vast quantities of both structured and unstructured data. You can easily scale out your cluster as your data appetite grows.

In addition to this it can also be used as a data federation layer spanning both traditional databases as well as other popular big data platforms, such as Hadoop HDFS, Hadoop Hbase, Cassandra, Amazon Redshift and S3, to name a few.