spark.sql.cache.serializer
The name of a class that implements org.apache.spark.sql.columnar.CachedBatchSerializer, which is used to translate SQL data into a format that can be cached more efficiently. The underlying API is subject to change, so use with caution. Only one class may be specified, and it must have a no-arg constructor.
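For example, a custom serializer could be plugged in via spark-defaults.conf (the class name below is hypothetical, for illustration only):

    # Hypothetical class; must implement CachedBatchSerializer and have a no-arg constructor
    spark.sql.cache.serializer    com.example.MyCachedBatchSerializer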
spark.sql.catalog.spark_catalog.defaultDatabase
The default database for the session catalog.
spark.sql.event.truncate.length
Threshold of SQL statement length beyond which it will be truncated before being added to the event. Defaults to no truncation. If set to 0, the callsite will be logged instead.
spark.sql.extensions
A comma-separated list of classes that implement Function1[SparkSessionExtensions, Unit], used to configure Spark Session extensions. The classes must have a no-args constructor. If multiple extensions are specified, they are applied in the specified order. Rules and planner strategies are likewise applied in the specified order. For parsers, the last parser is used, and each parser can delegate to its predecessor. In case of function name conflicts, the last registered function name is used.
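As a minimal sketch (package and class names are hypothetical), an extension is simply a function over SparkSessionExtensions; the Scala class below injects a no-op analyzer rule and could be enabled with --conf spark.sql.extensions=com.example.MyExtensions:

    import org.apache.spark.sql.{SparkSession, SparkSessionExtensions}
    import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
    import org.apache.spark.sql.catalyst.rules.Rule

    // Hypothetical analyzer rule that leaves the plan unchanged, for illustration only.
    case class MyRule(session: SparkSession) extends Rule[LogicalPlan] {
      override def apply(plan: LogicalPlan): LogicalPlan = plan
    }

    // Must have a no-args constructor, as spark.sql.extensions requires.
    class MyExtensions extends (SparkSessionExtensions => Unit) {
      override def apply(extensions: SparkSessionExtensions): Unit = {
        extensions.injectResolutionRule(MyRule)
      }
    }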
spark.sql.hive.metastore.barrierPrefixes
A comma-separated list of class prefixes that should explicitly be reloaded for each version of Hive that Spark SQL is communicating with. For example, Hive UDFs that are declared in a prefix that typically would be shared (i.e. org.apache.spark.*).
spark.sql.hive.metastore.jars
Location of the jars that should be used to instantiate the HiveMetastoreClient. This property can be one of four options:
1. "builtin": Use Hive 2.3.9, which is bundled with the Spark assembly when -Phive is enabled. When this option is chosen, spark.sql.hive.metastore.version must be either 2.3.9 or not defined.
2. "maven": Use Hive jars of the specified version downloaded from Maven repositories.
3. "path": Use Hive jars configured by spark.sql.hive.metastore.jars.path in comma-separated format. Both local and remote paths are supported. The provided jars should be the same version as spark.sql.hive.metastore.version.
4. A classpath in the standard format for both Hive and Hadoop. The provided jars should be the same version as spark.sql.hive.metastore.version.
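For example, a spark-defaults.conf sketch that resolves Hive 3.1.3 jars from Maven (the version is chosen for illustration):

    spark.sql.hive.metastore.version  3.1.3
    spark.sql.hive.metastore.jars     maven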
spark.sql.hive.metastore.jars.path
Comma-separated paths of the jars used to instantiate the HiveMetastoreClient. This configuration is useful only when spark.sql.hive.metastore.jars is set to "path". The paths can be in any of the following formats:
1. file://path/to/jar/foo.jar
2. hdfs://nameservice/path/to/jar/foo.jar
3. /path/to/jar/ (a path without a URI scheme follows the URI scheme of fs.defaultFS)
4. [http/https/ftp]://path/to/jar/foo.jar
Note that formats 1, 2, and 3 support wildcards. For example:
1. file://path/to/jar/*,file://path2/to/jar/*/*.jar
2. hdfs://nameservice/path/to/jar/*,hdfs://nameservice2/path/to/jar/*/*.jar
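A sketch of the "path" option combining both properties, assuming the jars live under a hypothetical /opt/hive/lib directory:

    spark.sql.hive.metastore.version    3.1.3
    spark.sql.hive.metastore.jars       path
    spark.sql.hive.metastore.jars.path  file:///opt/hive/lib/*.jar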
spark.sql.hive.metastore.sharedPrefixes
A comma-separated list of class prefixes that should be loaded using the classloader that is shared between Spark SQL and a specific version of Hive. An example of classes that should be shared is JDBC drivers that are needed to talk to the metastore. Other classes that need to be shared are those that interact with classes that are already shared, for example, custom appenders that are used by log4j.
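For example, to add a hypothetical custom JDBC driver package alongside common driver prefixes:

    spark.sql.hive.metastore.sharedPrefixes  com.mysql.jdbc,org.postgresql,com.example.jdbc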
spark.sql.hive.metastore.version
Version of the Hive metastore. Available options are 0.12.0 through 2.3.9 and 3.0.0 through 3.1.3.
spark.sql.hive.thriftServer.singleSession
When set to true, the Hive Thrift server runs in single-session mode: all JDBC/ODBC connections share the temporary views, function registries, SQL configuration, and the current database.
spark.sql.hive.version
The compiled (a.k.a. built-in) Hive version that the Spark distribution is bundled with. Note that this is a read-only conf, used only to report the built-in Hive version. If you want a different metastore client for Spark to call, please refer to spark.sql.hive.metastore.version.
spark.sql.metadataCacheTTLSeconds
Time-to-live (TTL) value for the metadata caches: the partition file metadata cache and the session catalog cache. This configuration only takes effect when set to a positive value (> 0). For it to apply to the partition file metadata cache, 'spark.sql.catalogImplementation' must also be set to hive, 'spark.sql.hive.filesourcePartitionFileCacheSize' must be > 0, and 'spark.sql.hive.manageFilesourcePartitions' must be set to true.
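Putting the prerequisites together, a spark-defaults.conf sketch that enables a one-hour TTL for the partition file metadata cache (the cache size is chosen for illustration):

    spark.sql.catalogImplementation                  hive
    spark.sql.hive.manageFilesourcePartitions        true
    spark.sql.hive.filesourcePartitionFileCacheSize  262144000
    spark.sql.metadataCacheTTLSeconds                3600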
spark.sql.queryExecutionListeners
List of class names implementing QueryExecutionListener that will be automatically added to newly created sessions. The classes should have either a no-arg constructor, or a constructor that expects a SparkConf argument.
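A minimal sketch (the class name is hypothetical) of a listener that logs query durations; it could be registered with --conf spark.sql.queryExecutionListeners=com.example.TimingListener:

    import org.apache.spark.sql.execution.QueryExecution
    import org.apache.spark.sql.util.QueryExecutionListener

    // Hypothetical listener using the no-arg constructor variant.
    class TimingListener extends QueryExecutionListener {
      override def onSuccess(funcName: String, qe: QueryExecution, durationNs: Long): Unit =
        println(s"$funcName succeeded in ${durationNs / 1000000} ms")

      override def onFailure(funcName: String, qe: QueryExecution, exception: Exception): Unit =
        println(s"$funcName failed: ${exception.getMessage}")
    }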
spark.sql.sources.disabledJdbcConnProviderList
Configures a comma-separated list of JDBC connection providers that are disabled.
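For example, to disable two providers (the names are shown for illustration and must match provider names available in your deployment):

    spark.sql.sources.disabledJdbcConnProviderList  db2,mssql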
spark.sql.streaming.streamingQueryListeners
List of class names implementing StreamingQueryListener that will be automatically added to newly created sessions. The classes should have either a no-arg constructor, or a constructor that expects a SparkConf argument.
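A similar sketch for streaming (the class name is hypothetical), registered with --conf spark.sql.streaming.streamingQueryListeners=com.example.ProgressListener:

    import org.apache.spark.sql.streaming.StreamingQueryListener
    import org.apache.spark.sql.streaming.StreamingQueryListener._

    // Hypothetical listener that prints per-batch throughput.
    class ProgressListener extends StreamingQueryListener {
      override def onQueryStarted(event: QueryStartedEvent): Unit =
        println(s"Query ${event.id} started")

      override def onQueryProgress(event: QueryProgressEvent): Unit =
        println(s"Rows/sec: ${event.progress.processedRowsPerSecond}")

      override def onQueryTerminated(event: QueryTerminatedEvent): Unit =
        println(s"Query ${event.id} terminated")
    }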
spark.sql.streaming.ui.enabled
Whether to run the Structured Streaming Web UI for the Spark application when the Spark Web UI is enabled.
spark.sql.streaming.ui.retainedProgressUpdates
The number of progress updates to retain for a streaming query in the Structured Streaming UI.
spark.sql.streaming.ui.retainedQueries
The number of inactive queries to retain in the Structured Streaming UI.
spark.sql.ui.retainedExecutions
Number of executions to retain in the Spark UI.
spark.sql.warehouse.dir
The default location for managed databases and tables (defaults to $PWD/spark-warehouse).
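This is a static configuration, so it must be set before the SparkSession starts; a sketch with a hypothetical path:

    import org.apache.spark.sql.SparkSession

    // Static conf: set before the session is created; the path is an assumption.
    val spark = SparkSession.builder()
      .config("spark.sql.warehouse.dir", "/data/spark-warehouse")
      .enableHiveSupport()
      .getOrCreate()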