Blaze

Run Spark SQL/DataFrame Faster
Blaze is an accelerator for Apache Spark that uses native vectorized execution to speed up SQL/DataFrame queries.
Key Features
Performance
  • Supports most native operators/expressions, with fine-grained fallback.
  • Powered by Rust, about 2x faster on the TPC-DS benchmark.
  • Performs significantly better in production environments.
Production ready
  • Verified on production environments with exabytes of data.
  • Supports complex production scenarios like JSON parsing, UDF/UDTF, etc.
  • Resolved various stability and data consistency issues.
Easy to Use
  • Simple to build and install into Spark.
  • Easy to configure.
  • Full-featured execution metrics.
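
As a rough illustration of what "easy to configure" can look like, the sketch below builds a `spark-submit` command line from a small set of Spark properties. The three property names and class names are assumptions recalled from the project README, not taken from this page, and the helper `spark_submit_args` and the file name `job.py` are hypothetical; check the Configuration page for the authoritative settings.

```python
# Assumed Blaze-related Spark properties; verify each key and class name
# against the Configuration page before using them.
BLAZE_CONF = {
    "spark.blaze.enable": "true",
    "spark.sql.extensions": "org.apache.spark.sql.blaze.BlazeSparkSessionExtension",
    "spark.shuffle.manager": "org.apache.spark.sql.execution.blaze.shuffle.BlazeShuffleManager",
}


def spark_submit_args(conf, app):
    """Hypothetical helper: turn a property dict into spark-submit arguments."""
    args = ["spark-submit"]
    for key, value in conf.items():
        # Each property is passed as a separate --conf key=value pair.
        args += ["--conf", f"{key}={value}"]
    args.append(app)
    return args


if __name__ == "__main__":
    print(" ".join(spark_submit_args(BLAZE_CONF, "job.py")))
```

Keeping the properties in one dictionary makes it easy to switch the accelerator on or off per job without touching the application code.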
Compatibility
  • Compatible with mainline Spark versions.
  • Supports different storage systems such as HDFS and S3.
Ecosystem
  • Supports data lake systems such as Hudi and Paimon.
  • Supports remote shuffle services such as Apache Celeborn.
Community
  • Some cooperators already run Blaze in production.
  • More are researching and evaluating Blaze.

Benchmarks
Blaze has passed all TPC-DS/TPC-H benchmark cases. Compared to Spark 3.5, Blaze runs ~2x faster and saves ~50% of cluster resources. See Benchmark Details.
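
The two figures above fit together under one simple assumption: if a job runs on the same resource allocation and finishes `speedup` times faster, it occupies the cluster for `1/speedup` of the original time. The page does not say how the ~50% figure was measured, so the function below (the name `resource_saving` is hypothetical) is only a back-of-the-envelope sketch of that relationship.

```python
def resource_saving(speedup: float) -> float:
    """Fraction of resource-time saved when the same allocation
    finishes `speedup` times faster (assumes unchanged allocation)."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return 1.0 - 1.0 / speedup


if __name__ == "__main__":
    # A 2x speedup halves the time the allocation is held.
    print(f"{resource_saving(2.0):.0%}")  # prints "50%"
```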

Cooperators
Blaze currently has some users and contributors. You are invited to join the list by emailing blaze@kwai.com.
MIT License | Copyright © 2022- the Blaze community