3 docs tagged with "spark"

Alluxio: Data Orchestration for Analytics and AI

Alluxio is a data orchestration technology for analytics and machine learning in the cloud. It bridges the gap between data-driven applications and storage systems, bringing data closer to compute for faster processing while providing a unified namespace for data access across different storage systems.

Apache Livy: REST API for Apache Spark

Apache Livy is a service for interacting with a Spark cluster over a REST interface. It supports submitting Spark jobs or snippets of Spark code, retrieving results synchronously or asynchronously, and managing Spark contexts, all through the REST API or an RPC client library.
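
As a rough illustration of submitting a code snippet over REST, the sketch below builds (but does not send) two requests against Livy's `/sessions` and `/sessions/{id}/statements` endpoints. The host, the port `8998` (Livy's default), the session id `0`, and the helper `build_request` are assumptions made for the example, not part of any particular deployment.

```python
import json
from urllib import request

# Assumption: a Livy server reachable locally on its default port.
LIVY_URL = "http://localhost:8998"


def build_request(path, payload):
    """Build (without sending) a JSON POST request for a Livy endpoint.

    Hypothetical helper for this sketch; it is not part of Livy.
    """
    return request.Request(
        url=f"{LIVY_URL}{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Request to create an interactive PySpark session.
create_session = build_request("/sessions", {"kind": "pyspark"})

# Request to run a snippet in session 0. The id is a placeholder:
# a real id comes from the response to the create-session call.
run_statement = build_request(
    "/sessions/0/statements",
    {"code": "sc.parallelize(range(100)).sum()"},
)

print(create_session.get_method(), create_session.full_url)
print(run_statement.get_method(), run_statement.full_url)
```

To actually send a request against a running server, pass it to `urllib.request.urlopen` and decode the JSON response; the statement result is then retrieved by polling the statement's URL until it completes.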

Apache Spark: Unified Analytics Engine

Apache Spark is a unified analytics engine for large-scale data processing with built-in modules for streaming, SQL, machine learning, and graph processing. It provides high-level APIs in Java, Scala, Python, and R, and an optimized engine that supports general execution graphs.