Configurations for Blaze

Blaze Runtime Parameters

| Parameter | Default | Note |
| --- | --- | --- |
| spark.blaze.enable | true | Enable or disable the Blaze engine. |
| spark.blaze.batchSize | 10000 | Suggested batch size for Arrow batches. |
| spark.blaze.memoryFraction | 0.6 | Suggested fraction of off-heap memory used in native execution. Actual off-heap memory usage is expected to be spark.executor.memoryOverhead * fraction. |
| spark.blaze.tokio.num.worker.threads | 1 | Number of worker threads used in the tokio runtime; set to 0 to use the default available-parallelism value. For CPUs that support hyperthreading, setting this to the number of available physical cores is recommended. |
| spark.blaze.enableInputBatchStatistics | true | Enable extra metrics for input batch statistics. |
| spark.blaze.partialAggSkipping.enable | true | Enable partial aggregate skipping (see https://github.com/blaze-init/blaze/issues/327). |
| spark.blaze.partialAggSkipping.ratio | 0.8 | Partial aggregate skipping ratio. |
| spark.blaze.partialAggSkipping.minRows | 20000 | Minimum number of rows required to trigger partial aggregate skipping. |
| spark.blaze.parquet.enable.pageFiltering | false | Enable Parquet page filtering. |
| spark.blaze.parquet.enable.bloomFilter | false | Enable Parquet bloom filters. |
| spark.blaze.forceShuffledHashJoin | false | Replace all sort-merge joins with shuffled-hash joins. Intended only for special benchmarking. |
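
The note on spark.blaze.memoryFraction reduces to a single multiplication, which the following Python sketch works through. The helper names `parse_size` and `expected_native_memory` are hypothetical and not part of Blaze or Spark; the size parsing is a simplified assumption (binary units, a bare number treated as MiB) and does not cover every format Spark accepts.

```python
import re

# Binary multipliers for the size suffixes handled by this sketch.
_UNITS = {"k": 1024, "m": 1024**2, "g": 1024**3, "t": 1024**4}


def parse_size(text: str) -> int:
    """Parse a size such as '4g', '512mb', or '2048' into bytes.

    Assumption: a bare number is interpreted as MiB.
    """
    match = re.fullmatch(r"(\d+)(?:([kmgt])b?)?", text.strip().lower())
    if match is None:
        raise ValueError(f"unsupported size: {text!r}")
    unit = match.group(2) or "m"
    return int(match.group(1)) * _UNITS[unit]


def expected_native_memory(memory_overhead: str, fraction: float = 0.6) -> int:
    """Return spark.executor.memoryOverhead * fraction, in bytes (rounded down)."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    return int(parse_size(memory_overhead) * fraction)


if __name__ == "__main__":
    # Estimate for a 4 GiB overhead with the default fraction of 0.6.
    estimate = expected_native_memory("4g")
    print(f"{estimate / 1024**3:.2f} GiB")
```

Under these assumptions, a 4 GiB overhead with the default fraction yields roughly 2.4 GiB for native execution. The result is an estimate only, since the table describes the fraction as a suggestion.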

Native Operator Switches

| Parameter | Default |
| --- | --- |
| spark.blaze.enable.scan | true |
| spark.blaze.enable.project | true |
| spark.blaze.enable.filter | true |
| spark.blaze.enable.sort | true |
| spark.blaze.enable.union | true |
| spark.blaze.enable.smj | true |
| spark.blaze.enable.shj | true |
| spark.blaze.enable.bhj | true |
| spark.blaze.enable.bnlj | true |
| spark.blaze.enable.local.limit | true |
| spark.blaze.enable.global.limit | true |
| spark.blaze.enable.take.ordered.and.project | true |
| spark.blaze.enable.aggr | true |
| spark.blaze.enable.expand | true |
| spark.blaze.enable.window | true |
| spark.blaze.enable.generate | true |
| spark.blaze.enable.local.table.scan | true |
| spark.blaze.enable.data.writing | false |
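
When isolating a problem to one native operator, it can help to turn individual switches off at submit time. The sketch below builds `--conf key=value` arguments for spark-submit from the operator names listed above. The helpers `operator_overrides` and `to_submit_args` are hypothetical illustrations, not part of Blaze.

```python
# Operator suffixes taken from the switch table above.
NATIVE_OPERATORS = (
    "scan", "project", "filter", "sort", "union",
    "smj", "shj", "bhj", "bnlj",
    "local.limit", "global.limit", "take.ordered.and.project",
    "aggr", "expand", "window", "generate",
    "local.table.scan", "data.writing",
)


def operator_overrides(disabled):
    """Map operator suffixes to spark.blaze.enable.* = false settings."""
    unknown = sorted(set(disabled) - set(NATIVE_OPERATORS))
    if unknown:
        raise ValueError(f"unknown operator switch(es): {unknown}")
    return {f"spark.blaze.enable.{name}": "false" for name in disabled}


def to_submit_args(conf):
    """Render a config mapping as spark-submit '--conf key=value' arguments."""
    args = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    # Disable the native sort-merge join and window operators.
    print(" ".join(to_submit_args(operator_overrides(["smj", "window"]))))
```

Validating the names against the table catches typos early, since a misspelled key would otherwise be passed along without having any effect on the intended switch.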

Expression/UDF Switches

| Parameter | Default | Note |
| --- | --- | --- |
| spark.blaze.enable.caseconvert.functions | true | Enable converting upper/lower functions to native. In special cases the output may differ from Spark's because of different Unicode versions. |
| spark.blaze.udf.brickhouse.enabled | true | Enable some natively implemented brickhouse UDFs. |
| spark.blaze.udf.UDFJson.enabled | true | Enable natively implemented get_json_object/json_tuple. May introduce inconsistencies in special cases, especially with illegal JSON inputs. |
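
If output that matches Spark exactly matters more than native speed for these functions, the switches above can be turned off cluster-wide. The following sketch renders them as whitespace-separated `key value` lines in the style of a spark-defaults.conf file. The function `render_spark_defaults` is a hypothetical helper, and which switches to disable is an illustrative choice rather than a recommendation from the Blaze project.

```python
# Defaults copied from the expression/UDF switch table above.
EXPRESSION_SWITCHES = {
    "spark.blaze.enable.caseconvert.functions": True,
    "spark.blaze.udf.brickhouse.enabled": True,
    "spark.blaze.udf.UDFJson.enabled": True,
}


def render_spark_defaults(overrides):
    """Merge overrides into the defaults and render 'key value' lines."""
    unknown = sorted(set(overrides) - set(EXPRESSION_SWITCHES))
    if unknown:
        raise ValueError(f"unknown switch(es): {unknown}")
    merged = {**EXPRESSION_SWITCHES, **overrides}
    return "\n".join(f"{key} {str(value).lower()}" for key, value in merged.items())


if __name__ == "__main__":
    # Keep the brickhouse UDFs native, fall back to Spark for case
    # conversion and JSON functions.
    print(render_spark_defaults({
        "spark.blaze.enable.caseconvert.functions": False,
        "spark.blaze.udf.UDFJson.enabled": False,
    }))
```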