pyspark.sql.DataFrameWriter.mode

DataFrameWriter.mode(saveMode: Optional[str]) → pyspark.sql.readwriter.DataFrameWriter

Specifies the behavior when data or table already exists.
Options include:
append: Append contents of this DataFrame to existing data.
overwrite: Overwrite existing data.
error or errorifexists: Throw an exception if data already exists.
ignore: Silently ignore this operation if data already exists.
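The four options can be read as a small decision table keyed on whether data already exists. The following is a minimal pure-Python sketch of that table, not Spark's implementation: the function `apply_save_mode`, the use of a list of dicts to stand in for stored data, and the choice of `FileExistsError` are all assumptions made for illustration (Spark itself raises an `AnalysisException`).

```python
from typing import Dict, List, Optional

Row = Dict[str, object]
MODES = ("append", "overwrite", "error", "errorifexists", "ignore")


def apply_save_mode(
    existing: Optional[List[Row]], new_rows: List[Row], mode: str
) -> List[Row]:
    """Toy model of the documented save modes (hypothetical helper).

    `existing` is None when nothing has been written to the target yet.
    Returns the rows that the target would hold after the write.
    """
    if mode not in MODES:
        raise ValueError(f"unknown save mode: {mode!r}")
    if existing is None:
        # No data yet: every mode simply writes the new rows.
        return list(new_rows)
    if mode == "append":
        return existing + list(new_rows)
    if mode == "overwrite":
        return list(new_rows)
    if mode == "ignore":
        # Existing data wins; the write is silently skipped.
        return existing
    # "error" / "errorifexists": refuse to touch existing data.
    raise FileExistsError("data already exists")


stored = apply_save_mode(None, [{"age": 100}], "overwrite")
stored = apply_save_mode(stored, [{"age": 120}], "append")
stored = apply_save_mode(stored, [{"age": 140}], "ignore")
print(stored)  # [{'age': 100}, {'age': 120}]
```

The sequence at the bottom follows the same overwrite, append, ignore order as the Parquet example in the Examples section, which is why the row written with `ignore` does not appear in the result.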
New in version 1.4.0.
Changed in version 3.4.0: Supports Spark Connect.
Examples
Raise an error when writing to an existing path.
>>> import tempfile
>>> with tempfile.TemporaryDirectory() as d:
...     spark.createDataFrame(
...         [{"age": 80, "name": "Xinrong Meng"}]
...     ).write.mode("error").format("parquet").save(d)
Traceback (most recent call last):
    ...
...AnalysisException: ...
Write Parquet data using several save modes, then read it back.
>>> with tempfile.TemporaryDirectory() as d:
...     # Overwrite the path with a new Parquet file
...     spark.createDataFrame(
...         [{"age": 100, "name": "Hyukjin Kwon"}]
...     ).write.mode("overwrite").format("parquet").save(d)
...
...     # Append another DataFrame into the Parquet file
...     spark.createDataFrame(
...         [{"age": 120, "name": "Takuya Ueshin"}]
...     ).write.mode("append").format("parquet").save(d)
...
...     # Ignore mode: this write is skipped because data already exists
...     spark.createDataFrame(
...         [{"age": 140, "name": "Haejoon Lee"}]
...     ).write.mode("ignore").format("parquet").save(d)
...
...     # Read the Parquet file as a DataFrame.
...     spark.read.parquet(d).show()
+---+-------------+
|age|         name|
+---+-------------+
|120|Takuya Ueshin|
|100| Hyukjin Kwon|
+---+-------------+