pyspark.sql.functions.xpath¶
-
pyspark.sql.functions.
xpath
(xml: ColumnOrName, path: ColumnOrName) → pyspark.sql.column.Column[source]¶ Returns a string array of values within the nodes of xml that match the XPath expression.
New in version 3.5.0.
Examples
>>> df = spark.createDataFrame( ... [('<a><b>b1</b><b>b2</b><b>b3</b><c>c1</c><c>c2</c></a>',)], ['x']) >>> df.select(xpath(df.x, lit('a/b/text()')).alias('r')).collect() [Row(r=['b1', 'b2', 'b3'])]