{"config":{"lang":["en"],"prebuild_index":false,"separator":"[\\s\\-]+"},"docs":[{"location":"","text":"! ! expr - Logical not. % expr1 % expr2 - Returns the remainder after expr1 / expr2 . Examples: > SELECT 2 % 1.8; 0.2 > SELECT MOD(2, 1.8); 0.2 & expr1 & expr2 - Returns the result of bitwise AND of expr1 and expr2 . Examples: > SELECT 3 & 5; 1 * expr1 * expr2 - Returns expr1 * expr2 . Examples: > SELECT 2 * 3; 6 + expr1 + expr2 - Returns expr1 + expr2 . Examples: > SELECT 1 + 2; 3 - expr1 - expr2 - Returns expr1 - expr2 . Examples: > SELECT 2 - 1; 1 / expr1 / expr2 - Returns expr1 / expr2 . It always performs floating point division. Examples: > SELECT 3 / 2; 1.5 > SELECT 2L / 2L; 1.0 < expr1 < expr2 - Returns true if expr1 is less than expr2 . Arguments: expr1, expr2 - the two expressions must be the same type or can be cast to a common type, and must be a type that can be ordered. For example, map type is not orderable, so it is not supported. For complex types such as array/struct, the data types of fields must be orderable. Examples: > SELECT 1 < 2; true > SELECT 1.1 < '1'; false > SELECT to_date('2009-07-30 04:17:52') < to_date('2009-07-30 04:17:52'); false > SELECT to_date('2009-07-30 04:17:52') < to_date('2009-08-01 04:17:52'); true > SELECT 1 < NULL; NULL <= expr1 <= expr2 - Returns true if expr1 is less than or equal to expr2 . Arguments: expr1, expr2 - the two expressions must be the same type or can be cast to a common type, and must be a type that can be ordered. For example, map type is not orderable, so it is not supported. For complex types such as array/struct, the data types of fields must be orderable. 
Examples: > SELECT 2 <= 2; true > SELECT 1.0 <= '1'; true > SELECT to_date('2009-07-30 04:17:52') <= to_date('2009-07-30 04:17:52'); true > SELECT to_date('2009-07-30 04:17:52') <= to_date('2009-08-01 04:17:52'); true > SELECT 1 <= NULL; NULL <=> expr1 <=> expr2 - Returns the same result as the EQUAL(=) operator for non-null operands, but returns true if both are null, false if one of them is null. Arguments: expr1, expr2 - the two expressions must be the same type or can be cast to a common type, and must be a type that can be used in equality comparison. Map type is not supported. For complex types such as array/struct, the data types of fields must be orderable. Examples: > SELECT 2 <=> 2; true > SELECT 1 <=> '1'; true > SELECT true <=> NULL; false > SELECT NULL <=> NULL; true = expr1 = expr2 - Returns true if expr1 equals expr2 , or false otherwise. Arguments: expr1, expr2 - the two expressions must be the same type or can be cast to a common type, and must be a type that can be used in equality comparison. Map type is not supported. For complex types such as array/struct, the data types of fields must be orderable. Examples: > SELECT 2 = 2; true > SELECT 1 = '1'; true > SELECT true = NULL; NULL > SELECT NULL = NULL; NULL == expr1 == expr2 - Returns true if expr1 equals expr2 , or false otherwise. Arguments: expr1, expr2 - the two expressions must be the same type or can be cast to a common type, and must be a type that can be used in equality comparison. Map type is not supported. For complex types such as array/struct, the data types of fields must be orderable. Examples: > SELECT 2 == 2; true > SELECT 1 == '1'; true > SELECT true == NULL; NULL > SELECT NULL == NULL; NULL > expr1 > expr2 - Returns true if expr1 is greater than expr2 . Arguments: expr1, expr2 - the two expressions must be the same type or can be cast to a common type, and must be a type that can be ordered. For example, map type is not orderable, so it is not supported. 
For complex types such as array/struct, the data types of fields must be orderable. Examples: > SELECT 2 > 1; true > SELECT 2 > '1.1'; true > SELECT to_date('2009-07-30 04:17:52') > to_date('2009-07-30 04:17:52'); false > SELECT to_date('2009-07-30 04:17:52') > to_date('2009-08-01 04:17:52'); false > SELECT 1 > NULL; NULL >= expr1 >= expr2 - Returns true if expr1 is greater than or equal to expr2 . Arguments: expr1, expr2 - the two expressions must be the same type or can be cast to a common type, and must be a type that can be ordered. For example, map type is not orderable, so it is not supported. For complex types such as array/struct, the data types of fields must be orderable. Examples: > SELECT 2 >= 1; true > SELECT 2.0 >= '2.1'; false > SELECT to_date('2009-07-30 04:17:52') >= to_date('2009-07-30 04:17:52'); true > SELECT to_date('2009-07-30 04:17:52') >= to_date('2009-08-01 04:17:52'); false > SELECT 1 >= NULL; NULL ^ expr1 ^ expr2 - Returns the result of bitwise exclusive OR of expr1 and expr2 . Examples: > SELECT 3 ^ 5; 2 abs abs(expr) - Returns the absolute value of the numeric value. Examples: > SELECT abs(-1); 1 acos acos(expr) - Returns the inverse cosine (a.k.a. arc cosine) of expr , as if computed by java.lang.Math.acos . Examples: > SELECT acos(1); 0.0 > SELECT acos(2); NaN add_months add_months(start_date, num_months) - Returns the date that is num_months after start_date . Examples: > SELECT add_months('2016-08-31', 1); 2016-09-30 Since: 1.5.0 aggregate aggregate(expr, start, merge, finish) - Applies a binary operator to an initial state and all elements in the array, and reduces this to a single state. The final state is converted into the final result by applying a finish function. Examples: > SELECT aggregate(array(1, 2, 3), 0, (acc, x) -> acc + x); 6 > SELECT aggregate(array(1, 2, 3), 0, (acc, x) -> acc + x, acc -> acc * 10); 60 Since: 2.4.0 and expr1 and expr2 - Logical AND. 
approx_count_distinct approx_count_distinct(expr[, relativeSD]) - Returns the estimated cardinality by HyperLogLog++. relativeSD defines the maximum estimation error allowed. approx_percentile approx_percentile(col, percentage [, accuracy]) - Returns the approximate percentile value of numeric column col at the given percentage. The value of percentage must be between 0.0 and 1.0. The accuracy parameter (default: 10000) is a positive numeric literal which controls approximation accuracy at the cost of memory. Higher value of accuracy yields better accuracy, 1.0/accuracy is the relative error of the approximation. When percentage is an array, each value of the percentage array must be between 0.0 and 1.0. In this case, returns the approximate percentile array of column col at the given percentage array. Examples: > SELECT approx_percentile(10.0, array(0.5, 0.4, 0.1), 100); [10.0,10.0,10.0] > SELECT approx_percentile(10.0, 0.5, 100); 10.0 array array(expr, ...) - Returns an array with the given elements. Examples: > SELECT array(1, 2, 3); [1,2,3] array_contains array_contains(array, value) - Returns true if the array contains the value. Examples: > SELECT array_contains(array(1, 2, 3), 2); true array_distinct array_distinct(array) - Removes duplicate values from the array. Examples: > SELECT array_distinct(array(1, 2, 3, null, 3)); [1,2,3,null] Since: 2.4.0 array_except array_except(array1, array2) - Returns an array of the elements in array1 but not in array2, without duplicates. Examples: > SELECT array_except(array(1, 2, 3), array(1, 3, 5)); [2] Since: 2.4.0 array_intersect array_intersect(array1, array2) - Returns an array of the elements in the intersection of array1 and array2, without duplicates. Examples: > SELECT array_intersect(array(1, 2, 3), array(1, 3, 5)); [1,3] Since: 2.4.0 array_join array_join(array, delimiter[, nullReplacement]) - Concatenates the elements of the given array using the delimiter and an optional string to replace nulls. 
If no value is set for nullReplacement, any null value is filtered. Examples: > SELECT array_join(array('hello', 'world'), ' '); hello world > SELECT array_join(array('hello', null ,'world'), ' '); hello world > SELECT array_join(array('hello', null ,'world'), ' ', ','); hello , world Since: 2.4.0 array_max array_max(array) - Returns the maximum value in the array. NULL elements are skipped. Examples: > SELECT array_max(array(1, 20, null, 3)); 20 Since: 2.4.0 array_min array_min(array) - Returns the minimum value in the array. NULL elements are skipped. Examples: > SELECT array_min(array(1, 20, null, 3)); 1 Since: 2.4.0 array_position array_position(array, element) - Returns the (1-based) index of the first occurrence of element in the array, as a long. Examples: > SELECT array_position(array(3, 2, 1), 1); 3 Since: 2.4.0 array_remove array_remove(array, element) - Removes all elements equal to element from array. Examples: > SELECT array_remove(array(1, 2, 3, null, 3), 3); [1,2,null] Since: 2.4.0 array_repeat array_repeat(element, count) - Returns the array containing element count times. Examples: > SELECT array_repeat('123', 2); [\"123\",\"123\"] Since: 2.4.0 array_sort array_sort(array) - Sorts the input array in ascending order. The elements of the input array must be orderable. Null elements will be placed at the end of the returned array. Examples: > SELECT array_sort(array('b', 'd', null, 'c', 'a')); [\"a\",\"b\",\"c\",\"d\",null] Since: 2.4.0 array_union array_union(array1, array2) - Returns an array of the elements in the union of array1 and array2, without duplicates. Examples: > SELECT array_union(array(1, 2, 3), array(1, 3, 5)); [1,2,3,5] Since: 2.4.0 arrays_overlap arrays_overlap(a1, a2) - Returns true if a1 contains at least one non-null element that is also present in a2. If the arrays have no common element, both are non-empty, and either of them contains a null element, null is returned; otherwise false is returned. 
Examples: > SELECT arrays_overlap(array(1, 2, 3), array(3, 4, 5)); true Since: 2.4.0 arrays_zip arrays_zip(a1, a2, ...) - Returns a merged array of structs in which the N-th struct contains all N-th values of input arrays. Examples: > SELECT arrays_zip(array(1, 2, 3), array(2, 3, 4)); [{\"0\":1,\"1\":2},{\"0\":2,\"1\":3},{\"0\":3,\"1\":4}] > SELECT arrays_zip(array(1, 2), array(2, 3), array(3, 4)); [{\"0\":1,\"1\":2,\"2\":3},{\"0\":2,\"1\":3,\"2\":4}] Since: 2.4.0 ascii ascii(str) - Returns the numeric value of the first character of str . Examples: > SELECT ascii('222'); 50 > SELECT ascii(2); 50 asin asin(expr) - Returns the inverse sine (a.k.a. arc sine) of expr , as if computed by java.lang.Math.asin . Examples: > SELECT asin(0); 0.0 > SELECT asin(2); NaN assert_true assert_true(expr) - Throws an exception if expr is not true. Examples: > SELECT assert_true(0 < 1); NULL atan atan(expr) - Returns the inverse tangent (a.k.a. arc tangent) of expr , as if computed by java.lang.Math.atan . Examples: > SELECT atan(0); 0.0 atan2 atan2(exprY, exprX) - Returns the angle in radians between the positive x-axis of a plane and the point given by the coordinates ( exprX , exprY ), as if computed by java.lang.Math.atan2 . Arguments: exprY - coordinate on y-axis exprX - coordinate on x-axis Examples: > SELECT atan2(0, 0); 0.0 avg avg(expr) - Returns the mean calculated from values of a group. base64 base64(bin) - Converts the argument from a binary bin to a base 64 string. Examples: > SELECT base64('Spark SQL'); U3BhcmsgU1FM bigint bigint(expr) - Casts the value expr to the target data type bigint . bin bin(expr) - Returns the string representation of the long value expr represented in binary. Examples: > SELECT bin(13); 1101 > SELECT bin(-13); 1111111111111111111111111111111111111111111111111111111111110011 > SELECT bin(13.3); 1101 binary binary(expr) - Casts the value expr to the target data type binary . 
bit_length bit_length(expr) - Returns the bit length of string data or number of bits of binary data. Examples: > SELECT bit_length('Spark SQL'); 72 boolean boolean(expr) - Casts the value expr to the target data type boolean . bround bround(expr, d) - Returns expr rounded to d decimal places using HALF_EVEN rounding mode. Examples: > SELECT bround(2.5, 0); 2.0 cardinality cardinality(expr) - Returns the size of an array or a map. The function returns -1 if its input is null and spark.sql.legacy.sizeOfNull is set to true. If spark.sql.legacy.sizeOfNull is set to false, the function returns null for null input. By default, the spark.sql.legacy.sizeOfNull parameter is set to true. Examples: > SELECT cardinality(array('b', 'd', 'c', 'a')); 4 > SELECT cardinality(map('a', 1, 'b', 2)); 2 > SELECT cardinality(NULL); -1 cast cast(expr AS type) - Casts the value expr to the target data type type . Examples: > SELECT cast('10' as int); 10 cbrt cbrt(expr) - Returns the cube root of expr . Examples: > SELECT cbrt(27.0); 3.0 ceil ceil(expr) - Returns the smallest integer not smaller than expr . Examples: > SELECT ceil(-0.1); 0 > SELECT ceil(5); 5 ceiling ceiling(expr) - Returns the smallest integer not smaller than expr . Examples: > SELECT ceiling(-0.1); 0 > SELECT ceiling(5); 5 char char(expr) - Returns the ASCII character having the binary equivalent to expr . If n is larger than 256 the result is equivalent to chr(n % 256) Examples: > SELECT char(65); A char_length char_length(expr) - Returns the character length of string data or number of bytes of binary data. The length of string data includes the trailing spaces. The length of binary data includes binary zeros. Examples: > SELECT char_length('Spark SQL '); 10 > SELECT CHAR_LENGTH('Spark SQL '); 10 > SELECT CHARACTER_LENGTH('Spark SQL '); 10 character_length character_length(expr) - Returns the character length of string data or number of bytes of binary data. The length of string data includes the trailing spaces. 
The length of binary data includes binary zeros. Examples: > SELECT character_length('Spark SQL '); 10 > SELECT CHAR_LENGTH('Spark SQL '); 10 > SELECT CHARACTER_LENGTH('Spark SQL '); 10 chr chr(expr) - Returns the ASCII character having the binary equivalent to expr . If n is larger than 256 the result is equivalent to chr(n % 256). Examples: > SELECT chr(65); A coalesce coalesce(expr1, expr2, ...) - Returns the first non-null argument if one exists. Otherwise, null. Examples: > SELECT coalesce(NULL, 1, NULL); 1 collect_list collect_list(expr) - Collects and returns a list of non-unique elements. collect_set collect_set(expr) - Collects and returns a set of unique elements. concat concat(col1, col2, ..., colN) - Returns the concatenation of col1, col2, ..., colN. Examples: > SELECT concat('Spark', 'SQL'); SparkSQL > SELECT concat(array(1, 2, 3), array(4, 5), array(6)); [1,2,3,4,5,6] Note: Concat logic for arrays is available since 2.4.0. concat_ws concat_ws(sep, [str | array(str)]+) - Returns the concatenation of the strings separated by sep . Examples: > SELECT concat_ws(' ', 'Spark', 'SQL'); Spark SQL conv conv(num, from_base, to_base) - Convert num from from_base to to_base . Examples: > SELECT conv('100', 2, 10); 4 > SELECT conv(-10, 16, -10); -16 corr corr(expr1, expr2) - Returns Pearson coefficient of correlation between a set of number pairs. cos cos(expr) - Returns the cosine of expr , as if computed by java.lang.Math.cos . Arguments: expr - angle in radians Examples: > SELECT cos(0); 1.0 cosh cosh(expr) - Returns the hyperbolic cosine of expr , as if computed by java.lang.Math.cosh . Arguments: expr - hyperbolic angle Examples: > SELECT cosh(0); 1.0 cot cot(expr) - Returns the cotangent of expr , as if computed by 1/java.lang.Math.tan . Arguments: expr - angle in radians Examples: > SELECT cot(1); 0.6420926159343306 count count(*) - Returns the total number of retrieved rows, including rows containing null. 
count(expr[, expr...]) - Returns the number of rows for which the supplied expression(s) are all non-null. count(DISTINCT expr[, expr...]) - Returns the number of rows for which the supplied expression(s) are unique and non-null. count_min_sketch count_min_sketch(col, eps, confidence, seed) - Returns a count-min sketch of a column with the given eps, confidence and seed. The result is an array of bytes, which can be deserialized to a CountMinSketch before usage. Count-min sketch is a probabilistic data structure used for cardinality estimation using sub-linear space. covar_pop covar_pop(expr1, expr2) - Returns the population covariance of a set of number pairs. covar_samp covar_samp(expr1, expr2) - Returns the sample covariance of a set of number pairs. crc32 crc32(expr) - Returns a cyclic redundancy check value of the expr as a bigint. Examples: > SELECT crc32('Spark'); 1557323817 cube cume_dist cume_dist() - Computes the position of a value relative to all values in the partition. current_database current_database() - Returns the current database. Examples: > SELECT current_database(); default current_date current_date() - Returns the current date at the start of query evaluation. Since: 1.5.0 current_timestamp current_timestamp() - Returns the current timestamp at the start of query evaluation. Since: 1.5.0 date date(expr) - Casts the value expr to the target data type date . date_add date_add(start_date, num_days) - Returns the date that is num_days after start_date . Examples: > SELECT date_add('2016-07-30', 1); 2016-07-31 Since: 1.5.0 date_format date_format(timestamp, fmt) - Converts timestamp to a value of string in the format specified by the date format fmt . Examples: > SELECT date_format('2016-04-08', 'y'); 2016 Since: 1.5.0 date_sub date_sub(start_date, num_days) - Returns the date that is num_days before start_date . 
Examples: > SELECT date_sub('2016-07-30', 1); 2016-07-29 Since: 1.5.0 date_trunc date_trunc(fmt, ts) - Returns timestamp ts truncated to the unit specified by the format model fmt . fmt should be one of [\"YEAR\", \"YYYY\", \"YY\", \"MON\", \"MONTH\", \"MM\", \"DAY\", \"DD\", \"HOUR\", \"MINUTE\", \"SECOND\", \"WEEK\", \"QUARTER\"] Examples: > SELECT date_trunc('YEAR', '2015-03-05T09:32:05.359'); 2015-01-01 00:00:00 > SELECT date_trunc('MM', '2015-03-05T09:32:05.359'); 2015-03-01 00:00:00 > SELECT date_trunc('DD', '2015-03-05T09:32:05.359'); 2015-03-05 00:00:00 > SELECT date_trunc('HOUR', '2015-03-05T09:32:05.359'); 2015-03-05 09:00:00 Since: 2.3.0 datediff datediff(endDate, startDate) - Returns the number of days from startDate to endDate . Examples: > SELECT datediff('2009-07-31', '2009-07-30'); 1 > SELECT datediff('2009-07-30', '2009-07-31'); -1 Since: 1.5.0 day day(date) - Returns the day of month of the date/timestamp. Examples: > SELECT day('2009-07-30'); 30 Since: 1.5.0 dayofmonth dayofmonth(date) - Returns the day of month of the date/timestamp. Examples: > SELECT dayofmonth('2009-07-30'); 30 Since: 1.5.0 dayofweek dayofweek(date) - Returns the day of the week for date/timestamp (1 = Sunday, 2 = Monday, ..., 7 = Saturday). Examples: > SELECT dayofweek('2009-07-30'); 5 Since: 2.3.0 dayofyear dayofyear(date) - Returns the day of year of the date/timestamp. Examples: > SELECT dayofyear('2016-04-09'); 100 Since: 1.5.0 decimal decimal(expr) - Casts the value expr to the target data type decimal . decode decode(bin, charset) - Decodes the first argument using the second argument character set. Examples: > SELECT decode(encode('abc', 'utf-8'), 'utf-8'); abc degrees degrees(expr) - Converts radians to degrees. Arguments: expr - angle in radians Examples: > SELECT degrees(3.141592653589793); 180.0 dense_rank dense_rank() - Computes the rank of a value in a group of values. The result is one plus the previously assigned rank value. 
Unlike the function rank, dense_rank will not produce gaps in the ranking sequence. double double(expr) - Casts the value expr to the target data type double . e e() - Returns Euler's number, e. Examples: > SELECT e(); 2.718281828459045 element_at element_at(array, index) - Returns element of array at given (1-based) index. If index < 0, accesses elements from the last to the first. Returns NULL if the index exceeds the length of the array. element_at(map, key) - Returns value for given key, or NULL if the key is not contained in the map Examples: > SELECT element_at(array(1, 2, 3), 2); 2 > SELECT element_at(map(1, 'a', 2, 'b'), 2); b Since: 2.4.0 elt elt(n, input1, input2, ...) - Returns the n -th input, e.g., returns input2 when n is 2. Examples: > SELECT elt(1, 'scala', 'java'); scala encode encode(str, charset) - Encodes the first argument using the second argument character set. Examples: > SELECT encode('abc', 'utf-8'); abc exists exists(expr, pred) - Tests whether a predicate holds for one or more elements in the array. Examples: > SELECT exists(array(1, 2, 3), x -> x % 2 == 0); true Since: 2.4.0 exp exp(expr) - Returns e to the power of expr . Examples: > SELECT exp(0); 1.0 explode explode(expr) - Separates the elements of array expr into multiple rows, or the elements of map expr into multiple rows and columns. Examples: > SELECT explode(array(10, 20)); 10 20 explode_outer explode_outer(expr) - Separates the elements of array expr into multiple rows, or the elements of map expr into multiple rows and columns. Examples: > SELECT explode_outer(array(10, 20)); 10 20 expm1 expm1(expr) - Returns exp( expr ) - 1. Examples: > SELECT expm1(0); 0.0 factorial factorial(expr) - Returns the factorial of expr . expr is [0..20]. Otherwise, null. Examples: > SELECT factorial(5); 120 filter filter(expr, func) - Filters the input array using the given predicate. 
Examples: > SELECT filter(array(1, 2, 3), x -> x % 2 == 1); [1,3] Since: 2.4.0 find_in_set find_in_set(str, str_array) - Returns the index (1-based) of the given string ( str ) in the comma-delimited list ( str_array ). Returns 0, if the string was not found or if the given string ( str ) contains a comma. Examples: > SELECT find_in_set('ab','abc,b,ab,c,def'); 3 first first(expr[, isIgnoreNull]) - Returns the first value of expr for a group of rows. If isIgnoreNull is true, returns only non-null values. first_value first_value(expr[, isIgnoreNull]) - Returns the first value of expr for a group of rows. If isIgnoreNull is true, returns only non-null values. flatten flatten(arrayOfArrays) - Transforms an array of arrays into a single array. Examples: > SELECT flatten(array(array(1, 2), array(3, 4))); [1,2,3,4] Since: 2.4.0 float float(expr) - Casts the value expr to the target data type float . floor floor(expr) - Returns the largest integer not greater than expr . Examples: > SELECT floor(-0.1); -1 > SELECT floor(5); 5 format_number format_number(expr1, expr2) - Formats the number expr1 like '#,###,###.##', rounded to expr2 decimal places. If expr2 is 0, the result has no decimal point or fractional part. expr2 also accept a user specified format. This is supposed to function like MySQL's FORMAT. Examples: > SELECT format_number(12332.123456, 4); 12,332.1235 > SELECT format_number(12332.123456, '##################.###'); 12332.123 format_string format_string(strfmt, obj, ...) - Returns a formatted string from printf-style format strings. Examples: > SELECT format_string(\"Hello World %d %s\", 100, \"days\"); Hello World 100 days from_json from_json(jsonStr, schema[, options]) - Returns a struct value with the given jsonStr and schema . 
Examples: > SELECT from_json('{\"a\":1, \"b\":0.8}', 'a INT, b DOUBLE'); {\"a\":1, \"b\":0.8} > SELECT from_json('{\"time\":\"26/08/2015\"}', 'time Timestamp', map('timestampFormat', 'dd/MM/yyyy')); {\"time\":\"2015-08-26 00:00:00.0\"} Since: 2.2.0 from_unixtime from_unixtime(unix_time, format) - Returns unix_time in the specified format . Examples: > SELECT from_unixtime(0, 'yyyy-MM-dd HH:mm:ss'); 1970-01-01 00:00:00 Since: 1.5.0 from_utc_timestamp from_utc_timestamp(timestamp, timezone) - Given a timestamp like '2017-07-14 02:40:00.0', interprets it as a time in UTC, and renders that time as a timestamp in the given time zone. For example, 'GMT+1' would yield '2017-07-14 03:40:00.0'. Examples: > SELECT from_utc_timestamp('2016-08-31', 'Asia/Seoul'); 2016-08-31 09:00:00 Since: 1.5.0 get_json_object get_json_object(json_txt, path) - Extracts a json object from path . Examples: > SELECT get_json_object('{\"a\":\"b\"}', '$.a'); b greatest greatest(expr, ...) - Returns the greatest value of all parameters, skipping null values. Examples: > SELECT greatest(10, 9, 2, 4, 3); 10 grouping grouping_id hash hash(expr1, expr2, ...) - Returns a hash value of the arguments. Examples: > SELECT hash('Spark', array(123), 2); -1321691492 hex hex(expr) - Converts expr to hexadecimal. Examples: > SELECT hex(17); 11 > SELECT hex('Spark SQL'); 537061726B2053514C hour hour(timestamp) - Returns the hour component of the string/timestamp. Examples: > SELECT hour('2009-07-30 12:58:59'); 12 Since: 1.5.0 hypot hypot(expr1, expr2) - Returns sqrt( expr1 **2 + expr2 **2). Examples: > SELECT hypot(3, 4); 5.0 if if(expr1, expr2, expr3) - If expr1 evaluates to true, then returns expr2 ; otherwise returns expr3 . Examples: > SELECT if(1 < 2, 'a', 'b'); a ifnull ifnull(expr1, expr2) - Returns expr2 if expr1 is null, or expr1 otherwise. Examples: > SELECT ifnull(NULL, array('2')); [\"2\"] in expr1 in(expr2, expr3, ...) - Returns true if expr1 equals any of expr2, expr3, .... Arguments: expr1, expr2, expr3, ... 
- the arguments must be same type. Examples: > SELECT 1 in(1, 2, 3); true > SELECT 1 in(2, 3, 4); false > SELECT named_struct('a', 1, 'b', 2) in(named_struct('a', 1, 'b', 1), named_struct('a', 1, 'b', 3)); false > SELECT named_struct('a', 1, 'b', 2) in(named_struct('a', 1, 'b', 2), named_struct('a', 1, 'b', 3)); true initcap initcap(str) - Returns str with the first letter of each word in uppercase. All other letters are in lowercase. Words are delimited by white space. Examples: > SELECT initcap('sPark sql'); Spark Sql inline inline(expr) - Explodes an array of structs into a table. Examples: > SELECT inline(array(struct(1, 'a'), struct(2, 'b'))); 1 a 2 b inline_outer inline_outer(expr) - Explodes an array of structs into a table. Examples: > SELECT inline_outer(array(struct(1, 'a'), struct(2, 'b'))); 1 a 2 b input_file_block_length input_file_block_length() - Returns the length of the block being read, or -1 if not available. input_file_block_start input_file_block_start() - Returns the start offset of the block being read, or -1 if not available. input_file_name input_file_name() - Returns the name of the file being read, or empty string if not available. instr instr(str, substr) - Returns the (1-based) index of the first occurrence of substr in str . Examples: > SELECT instr('SparkSQL', 'SQL'); 6 int int(expr) - Casts the value expr to the target data type int . isnan isnan(expr) - Returns true if expr is NaN, or false otherwise. Examples: > SELECT isnan(cast('NaN' as double)); true isnotnull isnotnull(expr) - Returns true if expr is not null, or false otherwise. Examples: > SELECT isnotnull(1); true isnull isnull(expr) - Returns true if expr is null, or false otherwise. Examples: > SELECT isnull(1); false java_method java_method(class, method[, arg1[, arg2 ..]]) - Calls a method with reflection. 
Examples: > SELECT java_method('java.util.UUID', 'randomUUID'); c33fb387-8500-4bfa-81d2-6e0e3e930df2 > SELECT java_method('java.util.UUID', 'fromString', 'a5cf6c42-0c85-418f-af6c-3e4e5b1328f2'); a5cf6c42-0c85-418f-af6c-3e4e5b1328f2 json_tuple json_tuple(jsonStr, p1, p2, ..., pn) - Returns a tuple like the function get_json_object, but it takes multiple names. All the input parameters and output column types are string. Examples: > SELECT json_tuple('{\"a\":1, \"b\":2}', 'a', 'b'); 1 2 kurtosis kurtosis(expr) - Returns the kurtosis value calculated from values of a group. lag lag(input[, offset[, default]]) - Returns the value of input at the offset th row before the current row in the window. The default value of offset is 1 and the default value of default is null. If the value of input at the offset th row is null, null is returned. If there is no such offset row (e.g., when the offset is 1, the first row of the window does not have any previous row), default is returned. last last(expr[, isIgnoreNull]) - Returns the last value of expr for a group of rows. If isIgnoreNull is true, returns only non-null values. last_day last_day(date) - Returns the last day of the month which the date belongs to. Examples: > SELECT last_day('2009-01-12'); 2009-01-31 Since: 1.5.0 last_value last_value(expr[, isIgnoreNull]) - Returns the last value of expr for a group of rows. If isIgnoreNull is true, returns only non-null values. lcase lcase(str) - Returns str with all characters changed to lowercase. Examples: > SELECT lcase('SparkSql'); sparksql lead lead(input[, offset[, default]]) - Returns the value of input at the offset th row after the current row in the window. The default value of offset is 1 and the default value of default is null. If the value of input at the offset th row is null, null is returned. If there is no such an offset row (e.g., when the offset is 1, the last row of the window does not have any subsequent row), default is returned. least least(expr, ...) 
- Returns the least value of all parameters, skipping null values. Examples: > SELECT least(10, 9, 2, 4, 3); 2 left left(str, len) - Returns the leftmost len ( len can be string type) characters from the string str . If len is less than or equal to 0, the result is an empty string. Examples: > SELECT left('Spark SQL', 3); Spa length length(expr) - Returns the character length of string data or number of bytes of binary data. The length of string data includes the trailing spaces. The length of binary data includes binary zeros. Examples: > SELECT length('Spark SQL '); 10 > SELECT CHAR_LENGTH('Spark SQL '); 10 > SELECT CHARACTER_LENGTH('Spark SQL '); 10 levenshtein levenshtein(str1, str2) - Returns the Levenshtein distance between the two given strings. Examples: > SELECT levenshtein('kitten', 'sitting'); 3 like str like pattern - Returns true if str matches pattern, null if any arguments are null, false otherwise. Arguments: str - a string expression pattern - a string expression. The pattern is a string which is matched literally, with the exception of the following special symbols: _ matches any one character in the input (similar to . in posix regular expressions) % matches zero or more characters in the input (similar to .* in posix regular expressions) The escape character is '\\'. If an escape character precedes a special symbol or another escape character, the following character is matched literally. It is invalid to escape any other character. Since Spark 2.0, string literals are unescaped in our SQL parser. For example, in order to match \"\\abc\", the pattern should be \"\\\\abc\". When SQL config 'spark.sql.parser.escapedStringLiterals' is enabled, it falls back to Spark 1.6 behavior regarding string literal parsing. For example, if the config is enabled, the pattern to match \"\\abc\" should be \"\\abc\". Examples: > SELECT '%SystemDrive%\\Users\\John' like '\\%SystemDrive\\%\\\\Users%' true Note: Use RLIKE to match with standard regular expressions. 
ln ln(expr) - Returns the natural logarithm (base e) of expr . Examples: > SELECT ln(1); 0.0 locate locate(substr, str[, pos]) - Returns the position of the first occurrence of substr in str after position pos . The given pos and return value are 1-based. Examples: > SELECT locate('bar', 'foobarbar'); 4 > SELECT locate('bar', 'foobarbar', 5); 7 > SELECT POSITION('bar' IN 'foobarbar'); 4 log log(base, expr) - Returns the logarithm of expr with base . Examples: > SELECT log(10, 100); 2.0 log10 log10(expr) - Returns the logarithm of expr with base 10. Examples: > SELECT log10(10); 1.0 log1p log1p(expr) - Returns log(1 + expr ). Examples: > SELECT log1p(0); 0.0 log2 log2(expr) - Returns the logarithm of expr with base 2. Examples: > SELECT log2(2); 1.0 lower lower(str) - Returns str with all characters changed to lowercase. Examples: > SELECT lower('SparkSql'); sparksql lpad lpad(str, len, pad) - Returns str , left-padded with pad to a length of len . If str is longer than len , the return value is shortened to len characters. Examples: > SELECT lpad('hi', 5, '??'); ???hi > SELECT lpad('hi', 1, '??'); h ltrim ltrim(str) - Removes the leading space characters from str . ltrim(trimStr, str) - Removes the leading characters of str that are contained in trimStr . Arguments: str - a string expression trimStr - the characters to trim; the default value is a single space Examples: > SELECT ltrim(' SparkSQL '); SparkSQL > SELECT ltrim('Sp', 'SSparkSQLS'); arkSQLS map map(key0, value0, key1, value1, ...) - Creates a map with the given key/value pairs. Examples: > SELECT map(1.0, '2', 3.0, '4'); {1.0:\"2\",3.0:\"4\"} map_concat map_concat(map, ...) - Returns the union of all the given maps. Examples: > SELECT map_concat(map(1, 'a', 2, 'b'), map(2, 'c', 3, 'd')); {1:\"a\",2:\"c\",3:\"d\"} Since: 2.4.0 map_from_arrays map_from_arrays(keys, values) - Creates a map with a pair of the given key/value arrays. 
All elements in keys should not be null Examples: > SELECT map_from_arrays(array(1.0, 3.0), array('2', '4')); {1.0:\"2\",3.0:\"4\"} Since: 2.4.0 map_from_entries map_from_entries(arrayOfEntries) - Returns a map created from the given array of entries. Examples: > SELECT map_from_entries(array(struct(1, 'a'), struct(2, 'b'))); {1:\"a\",2:\"b\"} Since: 2.4.0 map_keys map_keys(map) - Returns an unordered array containing the keys of the map. Examples: > SELECT map_keys(map(1, 'a', 2, 'b')); [1,2] map_values map_values(map) - Returns an unordered array containing the values of the map. Examples: > SELECT map_values(map(1, 'a', 2, 'b')); [\"a\",\"b\"] max max(expr) - Returns the maximum value of expr . md5 md5(expr) - Returns an MD5 128-bit checksum as a hex string of expr . Examples: > SELECT md5('Spark'); 8cde774d6f7333752ed72cacddb05126 mean mean(expr) - Returns the mean calculated from values of a group. min min(expr) - Returns the minimum value of expr . minute minute(timestamp) - Returns the minute component of the string/timestamp. Examples: > SELECT minute('2009-07-30 12:58:59'); 58 Since: 1.5.0 mod expr1 mod expr2 - Returns the remainder after expr1 / expr2 . Examples: > SELECT 2 mod 1.8; 0.2 > SELECT MOD(2, 1.8); 0.2 monotonically_increasing_id monotonically_increasing_id() - Returns monotonically increasing 64-bit integers. The generated ID is guaranteed to be monotonically increasing and unique, but not consecutive. The current implementation puts the partition ID in the upper 31 bits, and the lower 33 bits represent the record number within each partition. The assumption is that the data frame has less than 1 billion partitions, and each partition has less than 8 billion records. The function is non-deterministic because its result depends on partition IDs. month month(date) - Returns the month component of the date/timestamp. 
Examples: > SELECT month('2016-07-30'); 7 Since: 1.5.0 months_between months_between(timestamp1, timestamp2[, roundOff]) - If timestamp1 is later than timestamp2 , then the result is positive. If timestamp1 and timestamp2 are on the same day of month, or both are the last day of month, time of day will be ignored. Otherwise, the difference is calculated based on 31 days per month, and rounded to 8 digits unless roundOff=false. Examples: > SELECT months_between('1997-02-28 10:30:00', '1996-10-30'); 3.94959677 > SELECT months_between('1997-02-28 10:30:00', '1996-10-30', false); 3.9495967741935485 Since: 1.5.0 named_struct named_struct(name1, val1, name2, val2, ...) - Creates a struct with the given field names and values. Examples: > SELECT named_struct(\"a\", 1, \"b\", 2, \"c\", 3); {\"a\":1,\"b\":2,\"c\":3} nanvl nanvl(expr1, expr2) - Returns expr1 if it's not NaN, or expr2 otherwise. Examples: > SELECT nanvl(cast('NaN' as double), 123); 123.0 negative negative(expr) - Returns the negated value of expr . Examples: > SELECT negative(1); -1 next_day next_day(start_date, day_of_week) - Returns the first date which is later than start_date and named as indicated. Examples: > SELECT next_day('2015-01-14', 'TU'); 2015-01-20 Since: 1.5.0 not not expr - Logical not. now now() - Returns the current timestamp at the start of query evaluation. Since: 1.5.0 ntile ntile(n) - Divides the rows for each window partition into n buckets ranging from 1 to at most n . nullif nullif(expr1, expr2) - Returns null if expr1 equals to expr2 , or expr1 otherwise. Examples: > SELECT nullif(2, 2); NULL nvl nvl(expr1, expr2) - Returns expr2 if expr1 is null, or expr1 otherwise. Examples: > SELECT nvl(NULL, array('2')); [\"2\"] nvl2 nvl2(expr1, expr2, expr3) - Returns expr2 if expr1 is not null, or expr3 otherwise. Examples: > SELECT nvl2(NULL, 2, 1); 1 octet_length octet_length(expr) - Returns the byte length of string data or number of bytes of binary data. 
Examples: > SELECT octet_length('Spark SQL'); 9 or expr1 or expr2 - Logical OR. parse_url parse_url(url, partToExtract[, key]) - Extracts a part from a URL. Examples: > SELECT parse_url('http://spark.apache.org/path?query=1', 'HOST') spark.apache.org > SELECT parse_url('http://spark.apache.org/path?query=1', 'QUERY') query=1 > SELECT parse_url('http://spark.apache.org/path?query=1', 'QUERY', 'query') 1 percent_rank percent_rank() - Computes the percentage ranking of a value in a group of values. percentile percentile(col, percentage [, frequency]) - Returns the exact percentile value of numeric column col at the given percentage. The value of percentage must be between 0.0 and 1.0. The value of frequency should be positive integral percentile(col, array(percentage1 [, percentage2]...) [, frequency]) - Returns the exact percentile value array of numeric column col at the given percentage(s). Each value of the percentage array must be between 0.0 and 1.0. The value of frequency should be positive integral percentile_approx percentile_approx(col, percentage [, accuracy]) - Returns the approximate percentile value of numeric column col at the given percentage. The value of percentage must be between 0.0 and 1.0. The accuracy parameter (default: 10000) is a positive numeric literal which controls approximation accuracy at the cost of memory. Higher value of accuracy yields better accuracy, 1.0/accuracy is the relative error of the approximation. When percentage is an array, each value of the percentage array must be between 0.0 and 1.0. In this case, returns the approximate percentile array of column col at the given percentage array. Examples: > SELECT percentile_approx(10.0, array(0.5, 0.4, 0.1), 100); [10.0,10.0,10.0] > SELECT percentile_approx(10.0, 0.5, 100); 10.0 pi pi() - Returns pi. Examples: > SELECT pi(); 3.141592653589793 pmod pmod(expr1, expr2) - Returns the positive value of expr1 mod expr2 . 
Examples: > SELECT pmod(10, 3); 1 > SELECT pmod(-10, 3); 2 posexplode posexplode(expr) - Separates the elements of array expr into multiple rows with positions, or the elements of map expr into multiple rows and columns with positions. Examples: > SELECT posexplode(array(10,20)); 0 10 1 20 posexplode_outer posexplode_outer(expr) - Separates the elements of array expr into multiple rows with positions, or the elements of map expr into multiple rows and columns with positions. Examples: > SELECT posexplode_outer(array(10,20)); 0 10 1 20 position position(substr, str[, pos]) - Returns the position of the first occurrence of substr in str after position pos . The given pos and return value are 1-based. Examples: > SELECT position('bar', 'foobarbar'); 4 > SELECT position('bar', 'foobarbar', 5); 7 > SELECT POSITION('bar' IN 'foobarbar'); 4 positive positive(expr) - Returns the value of expr . pow pow(expr1, expr2) - Raises expr1 to the power of expr2 . Examples: > SELECT pow(2, 3); 8.0 power power(expr1, expr2) - Raises expr1 to the power of expr2 . Examples: > SELECT power(2, 3); 8.0 printf printf(strfmt, obj, ...) - Returns a formatted string from printf-style format strings. Examples: > SELECT printf(\"Hello World %d %s\", 100, \"days\"); Hello World 100 days quarter quarter(date) - Returns the quarter of the year for date, in the range 1 to 4. Examples: > SELECT quarter('2016-08-31'); 3 Since: 1.5.0 radians radians(expr) - Converts degrees to radians. Arguments: expr - angle in degrees Examples: > SELECT radians(180); 3.141592653589793 rand rand([seed]) - Returns a random value with independent and identically distributed (i.i.d.) uniformly distributed values in [0, 1). Examples: > SELECT rand(); 0.9629742951434543 > SELECT rand(0); 0.8446490682263027 > SELECT rand(null); 0.8446490682263027 Note: function is non-deterministic in general case. randn randn([seed]) - Returns a random value with independent and identically distributed (i.i.d.) 
values drawn from the standard normal distribution. Examples: > SELECT randn(); -0.3254147983080288 > SELECT randn(0); 1.1164209726833079 > SELECT randn(null); 1.1164209726833079 Note: function is non-deterministic in general case. rank rank() - Computes the rank of a value in a group of values. The result is one plus the number of rows preceding or equal to the current row in the ordering of the partition. The values will produce gaps in the sequence. reflect reflect(class, method[, arg1[, arg2 ..]]) - Calls a method with reflection. Examples: > SELECT reflect('java.util.UUID', 'randomUUID'); c33fb387-8500-4bfa-81d2-6e0e3e930df2 > SELECT reflect('java.util.UUID', 'fromString', 'a5cf6c42-0c85-418f-af6c-3e4e5b1328f2'); a5cf6c42-0c85-418f-af6c-3e4e5b1328f2 regexp_extract regexp_extract(str, regexp[, idx]) - Extracts a group that matches regexp . Examples: > SELECT regexp_extract('100-200', '(\\\\d+)-(\\\\d+)', 1); 100 regexp_replace regexp_replace(str, regexp, rep) - Replaces all substrings of str that match regexp with rep . Examples: > SELECT regexp_replace('100-200', '(\\\\d+)', 'num'); num-num repeat repeat(str, n) - Returns the string which repeats the given string value n times. Examples: > SELECT repeat('123', 2); 123123 replace replace(str, search[, replace]) - Replaces all occurrences of search with replace . Arguments: str - a string expression search - a string expression. If search is not found in str , str is returned unchanged. replace - a string expression. If replace is not specified or is an empty string, nothing replaces the string that is removed from str . Examples: > SELECT replace('ABCabc', 'abc', 'DEF'); ABCDEF reverse reverse(array) - Returns a reversed string or an array with reverse order of elements. Examples: > SELECT reverse('Spark SQL'); LQS krapS > SELECT reverse(array(2, 1, 4, 3)); [3,4,1,2] Note: Reverse logic for arrays is available since 2.4.0.
Since: 1.5.0 right right(str, len) - Returns the rightmost len ( len can be string type) characters from the string str ,if len is less or equal than 0 the result is an empty string. Examples: > SELECT right('Spark SQL', 3); SQL rint rint(expr) - Returns the double value that is closest in value to the argument and is equal to a mathematical integer. Examples: > SELECT rint(12.3456); 12.0 rlike str rlike regexp - Returns true if str matches regexp , or false otherwise. Arguments: str - a string expression regexp - a string expression. The pattern string should be a Java regular expression. Since Spark 2.0, string literals (including regex patterns) are unescaped in our SQL parser. For example, to match \"\\abc\", a regular expression for regexp can be \"^\\abc$\". There is a SQL config 'spark.sql.parser.escapedStringLiterals' that can be used to fallback to the Spark 1.6 behavior regarding string literal parsing. For example, if the config is enabled, the regexp that can match \"\\abc\" is \"^\\abc$\". Examples: When spark.sql.parser.escapedStringLiterals is disabled (default). > SELECT '%SystemDrive%\\Users\\John' rlike '%SystemDrive%\\\\Users.*' true When spark.sql.parser.escapedStringLiterals is enabled. > SELECT '%SystemDrive%\\Users\\John' rlike '%SystemDrive%\\Users.*' true Note: Use LIKE to match with simple string pattern. rollup round round(expr, d) - Returns expr rounded to d decimal places using HALF_UP rounding mode. Examples: > SELECT round(2.5, 0); 3.0 row_number row_number() - Assigns a unique, sequential number to each row, starting with one, according to the ordering of rows within the window partition. rpad rpad(str, len, pad) - Returns str , right-padded with pad to a length of len . If str is longer than len , the return value is shortened to len characters. Examples: > SELECT rpad('hi', 5, '??'); hi??? > SELECT rpad('hi', 1, '??'); h rtrim rtrim(str) - Removes the trailing space characters from str . 
rtrim(trimStr, str) - Removes the trailing string which contains the characters from the trim string from the str Arguments: str - a string expression trimStr - the trim string characters to trim, the default value is a single space Examples: > SELECT rtrim(' SparkSQL '); SparkSQL > SELECT rtrim('LQSa', 'SSparkSQLS'); SSpark schema_of_json schema_of_json(json[, options]) - Returns schema in the DDL format of JSON string. Examples: > SELECT schema_of_json('[{\"col\":0}]'); array<struct<col:int>> Since: 2.4.0 second second(timestamp) - Returns the second component of the string/timestamp. Examples: > SELECT second('2009-07-30 12:58:59'); 59 Since: 1.5.0 sentences sentences(str[, lang, country]) - Splits str into an array of array of words. Examples: > SELECT sentences('Hi there! Good morning.'); [[\"Hi\",\"there\"],[\"Good\",\"morning\"]] sequence sequence(start, stop, step) - Generates an array of elements from start to stop (inclusive), incrementing by step. The type of the returned elements is the same as the type of argument expressions. Supported types are: byte, short, integer, long, date, timestamp. The start and stop expressions must resolve to the same type. If start and stop expressions resolve to the 'date' or 'timestamp' type then the step expression must resolve to the 'interval' type, otherwise to the same type as the start and stop expressions. Arguments: start - an expression. The start of the range. stop - an expression. The end the range (inclusive). step - an optional expression. The step of the range. By default step is 1 if start is less than or equal to stop, otherwise -1. For the temporal sequences it's 1 day and -1 day respectively. If start is greater than stop then the step must be negative, and vice versa. 
Examples: > SELECT sequence(1, 5); [1,2,3,4,5] > SELECT sequence(5, 1); [5,4,3,2,1] > SELECT sequence(to_date('2018-01-01'), to_date('2018-03-01'), interval 1 month); [2018-01-01,2018-02-01,2018-03-01] Since: 2.4.0 sha sha(expr) - Returns a sha1 hash value as a hex string of the expr . Examples: > SELECT sha('Spark'); 85f5955f4b27a9a4c2aab6ffe5d7189fc298b92c sha1 sha1(expr) - Returns a sha1 hash value as a hex string of the expr . Examples: > SELECT sha1('Spark'); 85f5955f4b27a9a4c2aab6ffe5d7189fc298b92c sha2 sha2(expr, bitLength) - Returns a checksum of SHA-2 family as a hex string of expr . SHA-224, SHA-256, SHA-384, and SHA-512 are supported. Bit length of 0 is equivalent to 256. Examples: > SELECT sha2('Spark', 256); 529bc3b07127ecb7e53a4dcf1991d9152c24537d919178022b2c42657f79a26b shiftleft shiftleft(base, expr) - Bitwise left shift. Examples: > SELECT shiftleft(2, 1); 4 shiftright shiftright(base, expr) - Bitwise (signed) right shift. Examples: > SELECT shiftright(4, 1); 2 shiftrightunsigned shiftrightunsigned(base, expr) - Bitwise unsigned right shift. Examples: > SELECT shiftrightunsigned(4, 1); 2 shuffle shuffle(array) - Returns a random permutation of the given array. Examples: > SELECT shuffle(array(1, 20, 3, 5)); [3,1,5,20] > SELECT shuffle(array(1, 20, null, 3)); [20,null,3,1] Note: function is non-deterministic. Since: 2.4.0 sign sign(expr) - Returns -1.0, 0.0 or 1.0 as expr is negative, 0 or positive. Examples: > SELECT sign(40); 1.0 signum signum(expr) - Returns -1.0, 0.0 or 1.0 as expr is negative, 0 or positive. Examples: > SELECT signum(40); 1.0 sin sin(expr) - Returns the sine of expr , as if computed by java.lang.Math.sin . Arguments: expr - angle in radians Examples: > SELECT sin(0); 0.0 sinh sinh(expr) - Returns hyperbolic sine of expr , as if computed by java.lang.Math.sinh . Arguments: expr - hyperbolic angle Examples: > SELECT sinh(0); 0.0 size size(expr) - Returns the size of an array or a map. 
The function returns -1 if its input is null and spark.sql.legacy.sizeOfNull is set to true. If spark.sql.legacy.sizeOfNull is set to false, the function returns null for null input. By default, the spark.sql.legacy.sizeOfNull parameter is set to true. Examples: > SELECT size(array('b', 'd', 'c', 'a')); 4 > SELECT size(map('a', 1, 'b', 2)); 2 > SELECT size(NULL); -1 skewness skewness(expr) - Returns the skewness value calculated from values of a group. slice slice(x, start, length) - Subsets array x starting from index start (or starting from the end if start is negative) with the specified length. Examples: > SELECT slice(array(1, 2, 3, 4), 2, 2); [2,3] > SELECT slice(array(1, 2, 3, 4), -2, 2); [3,4] Since: 2.4.0 smallint smallint(expr) - Casts the value expr to the target data type smallint . sort_array sort_array(array[, ascendingOrder]) - Sorts the input array in ascending or descending order according to the natural ordering of the array elements. Null elements will be placed at the beginning of the returned array in ascending order or at the end of the returned array in descending order. Examples: > SELECT sort_array(array('b', 'd', null, 'c', 'a'), true); [null,\"a\",\"b\",\"c\",\"d\"] soundex soundex(str) - Returns Soundex code of the string. Examples: > SELECT soundex('Miller'); M460 space space(n) - Returns a string consisting of n spaces. Examples: > SELECT concat(space(2), '1'); 1 spark_partition_id spark_partition_id() - Returns the current partition id. split split(str, regex) - Splits str around occurrences that match regex . Examples: > SELECT split('oneAtwoBthreeC', '[ABC]'); [\"one\",\"two\",\"three\",\"\"] sqrt sqrt(expr) - Returns the square root of expr . Examples: > SELECT sqrt(4); 2.0 stack stack(n, expr1, ..., exprk) - Separates expr1 , ..., exprk into n rows. Examples: > SELECT stack(2, 1, 2, 3); 1 2 3 NULL std std(expr) - Returns the sample standard deviation calculated from values of a group. 
stddev stddev(expr) - Returns the sample standard deviation calculated from values of a group. stddev_pop stddev_pop(expr) - Returns the population standard deviation calculated from values of a group. stddev_samp stddev_samp(expr) - Returns the sample standard deviation calculated from values of a group. str_to_map str_to_map(text[, pairDelim[, keyValueDelim]]) - Creates a map after splitting the text into key/value pairs using delimiters. Default delimiters are ',' for pairDelim and ':' for keyValueDelim . Examples: > SELECT str_to_map('a:1,b:2,c:3', ',', ':'); map(\"a\":\"1\",\"b\":\"2\",\"c\":\"3\") > SELECT str_to_map('a'); map(\"a\":null) string string(expr) - Casts the value expr to the target data type string . struct struct(col1, col2, col3, ...) - Creates a struct with the given field values. substr substr(str, pos[, len]) - Returns the substring of str that starts at pos and is of length len , or the slice of byte array that starts at pos and is of length len . Examples: > SELECT substr('Spark SQL', 5); k SQL > SELECT substr('Spark SQL', -3); SQL > SELECT substr('Spark SQL', 5, 1); k substring substring(str, pos[, len]) - Returns the substring of str that starts at pos and is of length len , or the slice of byte array that starts at pos and is of length len . Examples: > SELECT substring('Spark SQL', 5); k SQL > SELECT substring('Spark SQL', -3); SQL > SELECT substring('Spark SQL', 5, 1); k substring_index substring_index(str, delim, count) - Returns the substring from str before count occurrences of the delimiter delim . If count is positive, everything to the left of the final delimiter (counting from the left) is returned. If count is negative, everything to the right of the final delimiter (counting from the right) is returned. The function substring_index performs a case-sensitive match when searching for delim . 
Examples: > SELECT substring_index('www.apache.org', '.', 2); www.apache sum sum(expr) - Returns the sum calculated from values of a group. tan tan(expr) - Returns the tangent of expr , as if computed by java.lang.Math.tan . Arguments: expr - angle in radians Examples: > SELECT tan(0); 0.0 tanh tanh(expr) - Returns the hyperbolic tangent of expr , as if computed by java.lang.Math.tanh . Arguments: expr - hyperbolic angle Examples: > SELECT tanh(0); 0.0 timestamp timestamp(expr) - Casts the value expr to the target data type timestamp . tinyint tinyint(expr) - Casts the value expr to the target data type tinyint . to_date to_date(date_str[, fmt]) - Parses the date_str expression with the fmt expression to a date. Returns null with invalid input. By default, it follows casting rules to a date if the fmt is omitted. Examples: > SELECT to_date('2009-07-30 04:17:52'); 2009-07-30 > SELECT to_date('2016-12-31', 'yyyy-MM-dd'); 2016-12-31 Since: 1.5.0 to_json to_json(expr[, options]) - Returns a JSON string with a given struct value Examples: > SELECT to_json(named_struct('a', 1, 'b', 2)); {\"a\":1,\"b\":2} > SELECT to_json(named_struct('time', to_timestamp('2015-08-26', 'yyyy-MM-dd')), map('timestampFormat', 'dd/MM/yyyy')); {\"time\":\"26/08/2015\"} > SELECT to_json(array(named_struct('a', 1, 'b', 2))); [{\"a\":1,\"b\":2}] > SELECT to_json(map('a', named_struct('b', 1))); {\"a\":{\"b\":1}} > SELECT to_json(map(named_struct('a', 1),named_struct('b', 2))); {\"[1]\":{\"b\":2}} > SELECT to_json(map('a', 1)); {\"a\":1} > SELECT to_json(array((map('a', 1)))); [{\"a\":1}] Since: 2.2.0 to_timestamp to_timestamp(timestamp[, fmt]) - Parses the timestamp expression with the fmt expression to a timestamp. Returns null with invalid input. By default, it follows casting rules to a timestamp if the fmt is omitted. 
Examples: > SELECT to_timestamp('2016-12-31 00:12:00'); 2016-12-31 00:12:00 > SELECT to_timestamp('2016-12-31', 'yyyy-MM-dd'); 2016-12-31 00:00:00 Since: 2.2.0 to_unix_timestamp to_unix_timestamp(expr[, pattern]) - Returns the UNIX timestamp of the given time. Examples: > SELECT to_unix_timestamp('2016-04-08', 'yyyy-MM-dd'); 1460041200 Since: 1.6.0 to_utc_timestamp to_utc_timestamp(timestamp, timezone) - Given a timestamp like '2017-07-14 02:40:00.0', interprets it as a time in the given time zone, and renders that time as a timestamp in UTC. For example, 'GMT+1' would yield '2017-07-14 01:40:00.0'. Examples: > SELECT to_utc_timestamp('2016-08-31', 'Asia/Seoul'); 2016-08-30 15:00:00 Since: 1.5.0 transform transform(expr, func) - Transforms elements in an array using the function. Examples: > SELECT transform(array(1, 2, 3), x -> x + 1); [2,3,4] > SELECT transform(array(1, 2, 3), (x, i) -> x + i); [1,3,5] Since: 2.4.0 translate translate(input, from, to) - Translates the input string by replacing the characters present in the from string with the corresponding characters in the to string. Examples: > SELECT translate('AaBbCc', 'abc', '123'); A1B2C3 trim trim(str) - Removes the leading and trailing space characters from str . 
trim(BOTH trimStr FROM str) - Remove the leading and trailing trimStr characters from str trim(LEADING trimStr FROM str) - Remove the leading trimStr characters from str trim(TRAILING trimStr FROM str) - Remove the trailing trimStr characters from str Arguments: str - a string expression trimStr - the trim string characters to trim, the default value is a single space BOTH, FROM - these are keywords to specify trimming string characters from both ends of the string LEADING, FROM - these are keywords to specify trimming string characters from the left end of the string TRAILING, FROM - these are keywords to specify trimming string characters from the right end of the string Examples: > SELECT trim(' SparkSQL '); SparkSQL > SELECT trim('SL', 'SSparkSQLS'); parkSQ > SELECT trim(BOTH 'SL' FROM 'SSparkSQLS'); parkSQ > SELECT trim(LEADING 'SL' FROM 'SSparkSQLS'); parkSQLS > SELECT trim(TRAILING 'SL' FROM 'SSparkSQLS'); SSparkSQ trunc trunc(date, fmt) - Returns date with the time portion of the day truncated to the unit specified by the format model fmt . fmt should be one of [\"year\", \"yyyy\", \"yy\", \"mon\", \"month\", \"mm\"] Examples: > SELECT trunc('2009-02-12', 'MM'); 2009-02-01 > SELECT trunc('2015-10-27', 'YEAR'); 2015-01-01 Since: 1.5.0 ucase ucase(str) - Returns str with all characters changed to uppercase. Examples: > SELECT ucase('SparkSql'); SPARKSQL unbase64 unbase64(str) - Converts the argument from a base 64 string str to a binary. Examples: > SELECT unbase64('U3BhcmsgU1FM'); Spark SQL unhex unhex(expr) - Converts hexadecimal expr to binary. Examples: > SELECT decode(unhex('537061726B2053514C'), 'UTF-8'); Spark SQL unix_timestamp unix_timestamp([expr[, pattern]]) - Returns the UNIX timestamp of current or specified time. Examples: > SELECT unix_timestamp(); 1476884637 > SELECT unix_timestamp('2016-04-08', 'yyyy-MM-dd'); 1460041200 Since: 1.5.0 upper upper(str) - Returns str with all characters changed to uppercase. 
Examples: > SELECT upper('SparkSql'); SPARKSQL uuid uuid() - Returns an universally unique identifier (UUID) string. The value is returned as a canonical UUID 36-character string. Examples: > SELECT uuid(); 46707d92-02f4-4817-8116-a4c3b23e6266 Note: function is non-deterministic. var_pop var_pop(expr) - Returns the population variance calculated from values of a group. var_samp var_samp(expr) - Returns the sample variance calculated from values of a group. variance variance(expr) - Returns the sample variance calculated from values of a group. weekday weekday(date) - Returns the day of the week for date/timestamp (0 = Monday, 1 = Tuesday, ..., 6 = Sunday). Examples: > SELECT weekday('2009-07-30'); 3 Since: 2.4.0 weekofyear weekofyear(date) - Returns the week of the year of the given date. A week is considered to start on a Monday and week 1 is the first week with >3 days. Examples: > SELECT weekofyear('2008-02-20'); 8 Since: 1.5.0 when CASE WHEN expr1 THEN expr2 [WHEN expr3 THEN expr4]* [ELSE expr5] END - When expr1 = true, returns expr2 ; else when expr3 = true, returns expr4 ; else returns expr5 . Arguments: expr1, expr3 - the branch condition expressions should all be boolean type. expr2, expr4, expr5 - the branch value expressions and else value expression should all be same type or coercible to a common type. Examples: > SELECT CASE WHEN 1 > 0 THEN 1 WHEN 2 > 0 THEN 2.0 ELSE 1.2 END; 1 > SELECT CASE WHEN 1 < 0 THEN 1 WHEN 2 > 0 THEN 2.0 ELSE 1.2 END; 2 > SELECT CASE WHEN 1 < 0 THEN 1 WHEN 2 < 0 THEN 2.0 END; NULL window xpath xpath(xml, xpath) - Returns a string array of values within the nodes of xml that match the XPath expression. Examples: > SELECT xpath('<a><b>b1</b><b>b2</b><b>b3</b><c>c1</c><c>c2</c></a>','a/b/text()'); ['b1','b2','b3'] xpath_boolean xpath_boolean(xml, xpath) - Returns true if the XPath expression evaluates to true, or if a matching node is found. 
Examples: > SELECT xpath_boolean('<a><b>1</b></a>','a/b'); true xpath_double xpath_double(xml, xpath) - Returns a double value, the value zero if no match is found, or NaN if a match is found but the value is non-numeric. Examples: > SELECT xpath_double('<a><b>1</b><b>2</b></a>', 'sum(a/b)'); 3.0 xpath_float xpath_float(xml, xpath) - Returns a float value, the value zero if no match is found, or NaN if a match is found but the value is non-numeric. Examples: > SELECT xpath_float('<a><b>1</b><b>2</b></a>', 'sum(a/b)'); 3.0 xpath_int xpath_int(xml, xpath) - Returns an integer value, or the value zero if no match is found, or a match is found but the value is non-numeric. Examples: > SELECT xpath_int('<a><b>1</b><b>2</b></a>', 'sum(a/b)'); 3 xpath_long xpath_long(xml, xpath) - Returns a long integer value, or the value zero if no match is found, or a match is found but the value is non-numeric. Examples: > SELECT xpath_long('<a><b>1</b><b>2</b></a>', 'sum(a/b)'); 3 xpath_number xpath_number(xml, xpath) - Returns a double value, the value zero if no match is found, or NaN if a match is found but the value is non-numeric. Examples: > SELECT xpath_number('<a><b>1</b><b>2</b></a>', 'sum(a/b)'); 3.0 xpath_short xpath_short(xml, xpath) - Returns a short integer value, or the value zero if no match is found, or a match is found but the value is non-numeric. Examples: > SELECT xpath_short('<a><b>1</b><b>2</b></a>', 'sum(a/b)'); 3 xpath_string xpath_string(xml, xpath) - Returns the text contents of the first xml node that matches the XPath expression. Examples: > SELECT xpath_string('<a><b>b</b><c>cc</c></a>','a/c'); cc year year(date) - Returns the year component of the date/timestamp. Examples: > SELECT year('2016-07-30'); 2016 Since: 1.5.0 zip_with zip_with(left, right, func) - Merges the two given arrays, element-wise, into a single array using function. 
If one array is shorter, nulls are appended at the end to match the length of the longer array, before applying function. Examples: > SELECT zip_with(array(1, 2, 3), array('a', 'b', 'c'), (x, y) -> (y, x)); [{\"y\":\"a\",\"x\":1},{\"y\":\"b\",\"x\":2},{\"y\":\"c\",\"x\":3}] > SELECT zip_with(array(1, 2), array(3, 4), (x, y) -> x + y); [4,6] > SELECT zip_with(array('a', 'b', 'c'), array('d', 'e', 'f'), (x, y) -> concat(x, y)); [\"ad\",\"be\",\"cf\"] Since: 2.4.0 | expr1 | expr2 - Returns the result of bitwise OR of expr1 and expr2 . Examples: > SELECT 3 | 5; 7 ~ ~ expr - Returns the result of bitwise NOT of expr . Examples: > SELECT ~ 0; -1","title":"Functions"},{"location":"#_1","text":"! expr - Logical not.","title":"!"},{"location":"#_2","text":"expr1 % expr2 - Returns the remainder after expr1 / expr2 . Examples: > SELECT 2 % 1.8; 0.2 > SELECT MOD(2, 1.8); 0.2","title":"%"},{"location":"#_3","text":"expr1 & expr2 - Returns the result of bitwise AND of expr1 and expr2 . Examples: > SELECT 3 & 5; 1","title":"&amp;"},{"location":"#_4","text":"expr1 * expr2 - Returns expr1 * expr2 . Examples: > SELECT 2 * 3; 6","title":"*"},{"location":"#_5","text":"expr1 + expr2 - Returns expr1 + expr2 . Examples: > SELECT 1 + 2; 3","title":"+"},{"location":"#-","text":"expr1 - expr2 - Returns expr1 - expr2 . Examples: > SELECT 2 - 1; 1","title":"-"},{"location":"#_6","text":"expr1 / expr2 - Returns expr1 / expr2 . It always performs floating point division. Examples: > SELECT 3 / 2; 1.5 > SELECT 2L / 2L; 1.0","title":"/"},{"location":"#_7","text":"expr1 < expr2 - Returns true if expr1 is less than expr2 . Arguments: expr1, expr2 - the two expressions must be same type or can be casted to a common type, and must be a type that can be ordered. For example, map type is not orderable, so it is not supported. For complex types such array/struct, the data types of fields must be orderable. 
Examples: > SELECT 1 < 2; true > SELECT 1.1 < '1'; false > SELECT to_date('2009-07-30 04:17:52') < to_date('2009-07-30 04:17:52'); false > SELECT to_date('2009-07-30 04:17:52') < to_date('2009-08-01 04:17:52'); true > SELECT 1 < NULL; NULL","title":"&lt;"},{"location":"#_8","text":"expr1 <= expr2 - Returns true if expr1 is less than or equal to expr2 . Arguments: expr1, expr2 - the two expressions must be same type or can be casted to a common type, and must be a type that can be ordered. For example, map type is not orderable, so it is not supported. For complex types such array/struct, the data types of fields must be orderable. Examples: > SELECT 2 <= 2; true > SELECT 1.0 <= '1'; true > SELECT to_date('2009-07-30 04:17:52') <= to_date('2009-07-30 04:17:52'); true > SELECT to_date('2009-07-30 04:17:52') <= to_date('2009-08-01 04:17:52'); true > SELECT 1 <= NULL; NULL","title":"&lt;="},{"location":"#_9","text":"expr1 <=> expr2 - Returns same result as the EQUAL(=) operator for non-null operands, but returns true if both are null, false if one of the them is null. Arguments: expr1, expr2 - the two expressions must be same type or can be casted to a common type, and must be a type that can be used in equality comparison. Map type is not supported. For complex types such array/struct, the data types of fields must be orderable. Examples: > SELECT 2 <=> 2; true > SELECT 1 <=> '1'; true > SELECT true <=> NULL; false > SELECT NULL <=> NULL; true","title":"&lt;=&gt;"},{"location":"#_10","text":"expr1 = expr2 - Returns true if expr1 equals expr2 , or false otherwise. Arguments: expr1, expr2 - the two expressions must be same type or can be casted to a common type, and must be a type that can be used in equality comparison. Map type is not supported. For complex types such array/struct, the data types of fields must be orderable. 
Examples: > SELECT 2 = 2; true > SELECT 1 = '1'; true > SELECT true = NULL; NULL > SELECT NULL = NULL; NULL","title":"="},{"location":"#_11","text":"expr1 == expr2 - Returns true if expr1 equals expr2 , or false otherwise. Arguments: expr1, expr2 - the two expressions must be same type or can be casted to a common type, and must be a type that can be used in equality comparison. Map type is not supported. For complex types such array/struct, the data types of fields must be orderable. Examples: > SELECT 2 == 2; true > SELECT 1 == '1'; true > SELECT true == NULL; NULL > SELECT NULL == NULL; NULL","title":"=="},{"location":"#_12","text":"expr1 > expr2 - Returns true if expr1 is greater than expr2 . Arguments: expr1, expr2 - the two expressions must be same type or can be casted to a common type, and must be a type that can be ordered. For example, map type is not orderable, so it is not supported. For complex types such array/struct, the data types of fields must be orderable. Examples: > SELECT 2 > 1; true > SELECT 2 > '1.1'; true > SELECT to_date('2009-07-30 04:17:52') > to_date('2009-07-30 04:17:52'); false > SELECT to_date('2009-07-30 04:17:52') > to_date('2009-08-01 04:17:52'); false > SELECT 1 > NULL; NULL","title":"&gt;"},{"location":"#_13","text":"expr1 >= expr2 - Returns true if expr1 is greater than or equal to expr2 . Arguments: expr1, expr2 - the two expressions must be same type or can be casted to a common type, and must be a type that can be ordered. For example, map type is not orderable, so it is not supported. For complex types such array/struct, the data types of fields must be orderable. 
Examples: > SELECT 2 >= 1; true > SELECT 2.0 >= '2.1'; false > SELECT to_date('2009-07-30 04:17:52') >= to_date('2009-07-30 04:17:52'); true > SELECT to_date('2009-07-30 04:17:52') >= to_date('2009-08-01 04:17:52'); false > SELECT 1 >= NULL; NULL","title":"&gt;="},{"location":"#_14","text":"expr1 ^ expr2 - Returns the result of bitwise exclusive OR of expr1 and expr2 . Examples: > SELECT 3 ^ 5; 2","title":"^"},{"location":"#abs","text":"abs(expr) - Returns the absolute value of the numeric value. Examples: > SELECT abs(-1); 1","title":"abs"},{"location":"#acos","text":"acos(expr) - Returns the inverse cosine (a.k.a. arc cosine) of expr , as if computed by java.lang.Math.acos . Examples: > SELECT acos(1); 0.0 > SELECT acos(2); NaN","title":"acos"},{"location":"#add_months","text":"add_months(start_date, num_months) - Returns the date that is num_months after start_date . Examples: > SELECT add_months('2016-08-31', 1); 2016-09-30 Since: 1.5.0","title":"add_months"},{"location":"#aggregate","text":"aggregate(expr, start, merge, finish) - Applies a binary operator to an initial state and all elements in the array, and reduces this to a single state. The final state is converted into the final result by applying a finish function. Examples: > SELECT aggregate(array(1, 2, 3), 0, (acc, x) -> acc + x); 6 > SELECT aggregate(array(1, 2, 3), 0, (acc, x) -> acc + x, acc -> acc * 10); 60 Since: 2.4.0","title":"aggregate"},{"location":"#and","text":"expr1 and expr2 - Logical AND.","title":"and"},{"location":"#approx_count_distinct","text":"approx_count_distinct(expr[, relativeSD]) - Returns the estimated cardinality by HyperLogLog++. relativeSD defines the maximum estimation error allowed.","title":"approx_count_distinct"},{"location":"#approx_percentile","text":"approx_percentile(col, percentage [, accuracy]) - Returns the approximate percentile value of numeric column col at the given percentage. The value of percentage must be between 0.0 and 1.0. 
The accuracy parameter (default: 10000) is a positive numeric literal which controls approximation accuracy at the cost of memory. Higher value of accuracy yields better accuracy, 1.0/accuracy is the relative error of the approximation. When percentage is an array, each value of the percentage array must be between 0.0 and 1.0. In this case, returns the approximate percentile array of column col at the given percentage array. Examples: > SELECT approx_percentile(10.0, array(0.5, 0.4, 0.1), 100); [10.0,10.0,10.0] > SELECT approx_percentile(10.0, 0.5, 100); 10.0","title":"approx_percentile"},{"location":"#array","text":"array(expr, ...) - Returns an array with the given elements. Examples: > SELECT array(1, 2, 3); [1,2,3]","title":"array"},{"location":"#array_contains","text":"array_contains(array, value) - Returns true if the array contains the value. Examples: > SELECT array_contains(array(1, 2, 3), 2); true","title":"array_contains"},{"location":"#array_distinct","text":"array_distinct(array) - Removes duplicate values from the array. Examples: > SELECT array_distinct(array(1, 2, 3, null, 3)); [1,2,3,null] Since: 2.4.0","title":"array_distinct"},{"location":"#array_except","text":"array_except(array1, array2) - Returns an array of the elements in array1 but not in array2, without duplicates. Examples: > SELECT array_except(array(1, 2, 3), array(1, 3, 5)); [2] Since: 2.4.0","title":"array_except"},{"location":"#array_intersect","text":"array_intersect(array1, array2) - Returns an array of the elements in the intersection of array1 and array2, without duplicates. Examples: > SELECT array_intersect(array(1, 2, 3), array(1, 3, 5)); [1,3] Since: 2.4.0","title":"array_intersect"},{"location":"#array_join","text":"array_join(array, delimiter[, nullReplacement]) - Concatenates the elements of the given array using the delimiter and an optional string to replace nulls. If no value is set for nullReplacement, any null value is filtered. 
Examples: > SELECT array_join(array('hello', 'world'), ' '); hello world > SELECT array_join(array('hello', null ,'world'), ' '); hello world > SELECT array_join(array('hello', null ,'world'), ' ', ','); hello , world Since: 2.4.0","title":"array_join"},{"location":"#array_max","text":"array_max(array) - Returns the maximum value in the array. NULL elements are skipped. Examples: > SELECT array_max(array(1, 20, null, 3)); 20 Since: 2.4.0","title":"array_max"},{"location":"#array_min","text":"array_min(array) - Returns the minimum value in the array. NULL elements are skipped. Examples: > SELECT array_min(array(1, 20, null, 3)); 1 Since: 2.4.0","title":"array_min"},{"location":"#array_position","text":"array_position(array, element) - Returns the (1-based) index of the first occurrence of element in the array, as a long. Examples: > SELECT array_position(array(3, 2, 1), 1); 3 Since: 2.4.0","title":"array_position"},{"location":"#array_remove","text":"array_remove(array, element) - Removes all elements equal to element from array. Examples: > SELECT array_remove(array(1, 2, 3, null, 3), 3); [1,2,null] Since: 2.4.0","title":"array_remove"},{"location":"#array_repeat","text":"array_repeat(element, count) - Returns an array containing element count times. Examples: > SELECT array_repeat('123', 2); [\"123\",\"123\"] Since: 2.4.0","title":"array_repeat"},{"location":"#array_sort","text":"array_sort(array) - Sorts the input array in ascending order. The elements of the input array must be orderable. Null elements will be placed at the end of the returned array. Examples: > SELECT array_sort(array('b', 'd', null, 'c', 'a')); [\"a\",\"b\",\"c\",\"d\",null] Since: 2.4.0","title":"array_sort"},{"location":"#array_union","text":"array_union(array1, array2) - Returns an array of the elements in the union of array1 and array2, without duplicates.
Examples: > SELECT array_union(array(1, 2, 3), array(1, 3, 5)); [1,2,3,5] Since: 2.4.0","title":"array_union"},{"location":"#arrays_overlap","text":"arrays_overlap(a1, a2) - Returns true if a1 contains at least one non-null element that is also present in a2. If the arrays have no common element, both are non-empty, and either of them contains a null element, null is returned; otherwise false is returned. Examples: > SELECT arrays_overlap(array(1, 2, 3), array(3, 4, 5)); true Since: 2.4.0","title":"arrays_overlap"},{"location":"#arrays_zip","text":"arrays_zip(a1, a2, ...) - Returns a merged array of structs in which the N-th struct contains all N-th values of input arrays. Examples: > SELECT arrays_zip(array(1, 2, 3), array(2, 3, 4)); [{\"0\":1,\"1\":2},{\"0\":2,\"1\":3},{\"0\":3,\"1\":4}] > SELECT arrays_zip(array(1, 2), array(2, 3), array(3, 4)); [{\"0\":1,\"1\":2,\"2\":3},{\"0\":2,\"1\":3,\"2\":4}] Since: 2.4.0","title":"arrays_zip"},{"location":"#ascii","text":"ascii(str) - Returns the numeric value of the first character of str . Examples: > SELECT ascii('222'); 50 > SELECT ascii(2); 50","title":"ascii"},{"location":"#asin","text":"asin(expr) - Returns the inverse sine (a.k.a. arc sine) of expr , as if computed by java.lang.Math.asin . Examples: > SELECT asin(0); 0.0 > SELECT asin(2); NaN","title":"asin"},{"location":"#assert_true","text":"assert_true(expr) - Throws an exception if expr is not true. Examples: > SELECT assert_true(0 < 1); NULL","title":"assert_true"},{"location":"#atan","text":"atan(expr) - Returns the inverse tangent (a.k.a. arc tangent) of expr , as if computed by java.lang.Math.atan . Examples: > SELECT atan(0); 0.0","title":"atan"},{"location":"#atan2","text":"atan2(exprY, exprX) - Returns the angle in radians between the positive x-axis of a plane and the point given by the coordinates ( exprX , exprY ), as if computed by java.lang.Math.atan2 .
Arguments: exprY - coordinate on y-axis exprX - coordinate on x-axis Examples: > SELECT atan2(0, 0); 0.0","title":"atan2"},{"location":"#avg","text":"avg(expr) - Returns the mean calculated from values of a group.","title":"avg"},{"location":"#base64","text":"base64(bin) - Converts the argument from a binary bin to a base 64 string. Examples: > SELECT base64('Spark SQL'); U3BhcmsgU1FM","title":"base64"},{"location":"#bigint","text":"bigint(expr) - Casts the value expr to the target data type bigint .","title":"bigint"},{"location":"#bin","text":"bin(expr) - Returns the string representation of the long value expr represented in binary. Examples: > SELECT bin(13); 1101 > SELECT bin(-13); 1111111111111111111111111111111111111111111111111111111111110011 > SELECT bin(13.3); 1101","title":"bin"},{"location":"#binary","text":"binary(expr) - Casts the value expr to the target data type binary .","title":"binary"},{"location":"#bit_length","text":"bit_length(expr) - Returns the bit length of string data or number of bits of binary data. Examples: > SELECT bit_length('Spark SQL'); 72","title":"bit_length"},{"location":"#boolean","text":"boolean(expr) - Casts the value expr to the target data type boolean .","title":"boolean"},{"location":"#bround","text":"bround(expr, d) - Returns expr rounded to d decimal places using HALF_EVEN rounding mode. Examples: > SELECT bround(2.5, 0); 2.0","title":"bround"},{"location":"#cardinality","text":"cardinality(expr) - Returns the size of an array or a map. The function returns -1 if its input is null and spark.sql.legacy.sizeOfNull is set to true. If spark.sql.legacy.sizeOfNull is set to false, the function returns null for null input. By default, the spark.sql.legacy.sizeOfNull parameter is set to true. 
Examples: > SELECT cardinality(array('b', 'd', 'c', 'a')); 4 > SELECT cardinality(map('a', 1, 'b', 2)); 2 > SELECT cardinality(NULL); -1","title":"cardinality"},{"location":"#cast","text":"cast(expr AS type) - Casts the value expr to the target data type type . Examples: > SELECT cast('10' as int); 10","title":"cast"},{"location":"#cbrt","text":"cbrt(expr) - Returns the cube root of expr . Examples: > SELECT cbrt(27.0); 3.0","title":"cbrt"},{"location":"#ceil","text":"ceil(expr) - Returns the smallest integer not smaller than expr . Examples: > SELECT ceil(-0.1); 0 > SELECT ceil(5); 5","title":"ceil"},{"location":"#ceiling","text":"ceiling(expr) - Returns the smallest integer not smaller than expr . Examples: > SELECT ceiling(-0.1); 0 > SELECT ceiling(5); 5","title":"ceiling"},{"location":"#char","text":"char(expr) - Returns the ASCII character having the binary equivalent to expr . If expr is larger than 256 the result is equivalent to chr( expr % 256). Examples: > SELECT char(65); A","title":"char"},{"location":"#char_length","text":"char_length(expr) - Returns the character length of string data or number of bytes of binary data. The length of string data includes the trailing spaces. The length of binary data includes binary zeros. Examples: > SELECT char_length('Spark SQL '); 10 > SELECT CHAR_LENGTH('Spark SQL '); 10 > SELECT CHARACTER_LENGTH('Spark SQL '); 10","title":"char_length"},{"location":"#character_length","text":"character_length(expr) - Returns the character length of string data or number of bytes of binary data. The length of string data includes the trailing spaces. The length of binary data includes binary zeros. Examples: > SELECT character_length('Spark SQL '); 10 > SELECT CHAR_LENGTH('Spark SQL '); 10 > SELECT CHARACTER_LENGTH('Spark SQL '); 10","title":"character_length"},{"location":"#chr","text":"chr(expr) - Returns the ASCII character having the binary equivalent to expr .
If expr is larger than 256 the result is equivalent to chr( expr % 256). Examples: > SELECT chr(65); A","title":"chr"},{"location":"#coalesce","text":"coalesce(expr1, expr2, ...) - Returns the first non-null argument if one exists; otherwise, null. Examples: > SELECT coalesce(NULL, 1, NULL); 1","title":"coalesce"},{"location":"#collect_list","text":"collect_list(expr) - Collects and returns a list of non-unique elements.","title":"collect_list"},{"location":"#collect_set","text":"collect_set(expr) - Collects and returns a set of unique elements.","title":"collect_set"},{"location":"#concat","text":"concat(col1, col2, ..., colN) - Returns the concatenation of col1, col2, ..., colN. Examples: > SELECT concat('Spark', 'SQL'); SparkSQL > SELECT concat(array(1, 2, 3), array(4, 5), array(6)); [1,2,3,4,5,6] Note: Concat logic for arrays is available since 2.4.0.","title":"concat"},{"location":"#concat_ws","text":"concat_ws(sep, [str | array(str)]+) - Returns the concatenation of the strings separated by sep . Examples: > SELECT concat_ws(' ', 'Spark', 'SQL'); Spark SQL","title":"concat_ws"},{"location":"#conv","text":"conv(num, from_base, to_base) - Converts num from from_base to to_base . Examples: > SELECT conv('100', 2, 10); 4 > SELECT conv(-10, 16, -10); -16","title":"conv"},{"location":"#corr","text":"corr(expr1, expr2) - Returns the Pearson coefficient of correlation between a set of number pairs.","title":"corr"},{"location":"#cos","text":"cos(expr) - Returns the cosine of expr , as if computed by java.lang.Math.cos . Arguments: expr - angle in radians Examples: > SELECT cos(0); 1.0","title":"cos"},{"location":"#cosh","text":"cosh(expr) - Returns the hyperbolic cosine of expr , as if computed by java.lang.Math.cosh . Arguments: expr - hyperbolic angle Examples: > SELECT cosh(0); 1.0","title":"cosh"},{"location":"#cot","text":"cot(expr) - Returns the cotangent of expr , as if computed by 1/java.lang.Math.tan .
Arguments: expr - angle in radians Examples: > SELECT cot(1); 0.6420926159343306","title":"cot"},{"location":"#count","text":"count(*) - Returns the total number of retrieved rows, including rows containing null. count(expr[, expr...]) - Returns the number of rows for which the supplied expression(s) are all non-null. count(DISTINCT expr[, expr...]) - Returns the number of rows for which the supplied expression(s) are unique and non-null.","title":"count"},{"location":"#count_min_sketch","text":"count_min_sketch(col, eps, confidence, seed) - Returns a count-min sketch of a column with the given eps, confidence and seed. The result is an array of bytes, which can be deserialized to a CountMinSketch before usage. Count-min sketch is a probabilistic data structure used for cardinality estimation using sub-linear space.","title":"count_min_sketch"},{"location":"#covar_pop","text":"covar_pop(expr1, expr2) - Returns the population covariance of a set of number pairs.","title":"covar_pop"},{"location":"#covar_samp","text":"covar_samp(expr1, expr2) - Returns the sample covariance of a set of number pairs.","title":"covar_samp"},{"location":"#crc32","text":"crc32(expr) - Returns a cyclic redundancy check value of the expr as a bigint. Examples: > SELECT crc32('Spark'); 1557323817","title":"crc32"},{"location":"#cube","text":"","title":"cube"},{"location":"#cume_dist","text":"cume_dist() - Computes the position of a value relative to all values in the partition.","title":"cume_dist"},{"location":"#current_database","text":"current_database() - Returns the current database. Examples: > SELECT current_database(); default","title":"current_database"},{"location":"#current_date","text":"current_date() - Returns the current date at the start of query evaluation. Since: 1.5.0","title":"current_date"},{"location":"#current_timestamp","text":"current_timestamp() - Returns the current timestamp at the start of query evaluation.
Since: 1.5.0","title":"current_timestamp"},{"location":"#date","text":"date(expr) - Casts the value expr to the target data type date .","title":"date"},{"location":"#date_add","text":"date_add(start_date, num_days) - Returns the date that is num_days after start_date . Examples: > SELECT date_add('2016-07-30', 1); 2016-07-31 Since: 1.5.0","title":"date_add"},{"location":"#date_format","text":"date_format(timestamp, fmt) - Converts timestamp to a value of string in the format specified by the date format fmt . Examples: > SELECT date_format('2016-04-08', 'y'); 2016 Since: 1.5.0","title":"date_format"},{"location":"#date_sub","text":"date_sub(start_date, num_days) - Returns the date that is num_days before start_date . Examples: > SELECT date_sub('2016-07-30', 1); 2016-07-29 Since: 1.5.0","title":"date_sub"},{"location":"#date_trunc","text":"date_trunc(fmt, ts) - Returns timestamp ts truncated to the unit specified by the format model fmt . fmt should be one of [\"YEAR\", \"YYYY\", \"YY\", \"MON\", \"MONTH\", \"MM\", \"DAY\", \"DD\", \"HOUR\", \"MINUTE\", \"SECOND\", \"WEEK\", \"QUARTER\"] Examples: > SELECT date_trunc('YEAR', '2015-03-05T09:32:05.359'); 2015-01-01 00:00:00 > SELECT date_trunc('MM', '2015-03-05T09:32:05.359'); 2015-03-01 00:00:00 > SELECT date_trunc('DD', '2015-03-05T09:32:05.359'); 2015-03-05 00:00:00 > SELECT date_trunc('HOUR', '2015-03-05T09:32:05.359'); 2015-03-05 09:00:00 Since: 2.3.0","title":"date_trunc"},{"location":"#datediff","text":"datediff(endDate, startDate) - Returns the number of days from startDate to endDate . Examples: > SELECT datediff('2009-07-31', '2009-07-30'); 1 > SELECT datediff('2009-07-30', '2009-07-31'); -1 Since: 1.5.0","title":"datediff"},{"location":"#day","text":"day(date) - Returns the day of month of the date/timestamp. Examples: > SELECT day('2009-07-30'); 30 Since: 1.5.0","title":"day"},{"location":"#dayofmonth","text":"dayofmonth(date) - Returns the day of month of the date/timestamp. 
Examples: > SELECT dayofmonth('2009-07-30'); 30 Since: 1.5.0","title":"dayofmonth"},{"location":"#dayofweek","text":"dayofweek(date) - Returns the day of the week for date/timestamp (1 = Sunday, 2 = Monday, ..., 7 = Saturday). Examples: > SELECT dayofweek('2009-07-30'); 5 Since: 2.3.0","title":"dayofweek"},{"location":"#dayofyear","text":"dayofyear(date) - Returns the day of year of the date/timestamp. Examples: > SELECT dayofyear('2016-04-09'); 100 Since: 1.5.0","title":"dayofyear"},{"location":"#decimal","text":"decimal(expr) - Casts the value expr to the target data type decimal .","title":"decimal"},{"location":"#decode","text":"decode(bin, charset) - Decodes the first argument using the second argument character set. Examples: > SELECT decode(encode('abc', 'utf-8'), 'utf-8'); abc","title":"decode"},{"location":"#degrees","text":"degrees(expr) - Converts radians to degrees. Arguments: expr - angle in radians Examples: > SELECT degrees(3.141592653589793); 180.0","title":"degrees"},{"location":"#dense_rank","text":"dense_rank() - Computes the rank of a value in a group of values. The result is one plus the previously assigned rank value. Unlike the function rank, dense_rank will not produce gaps in the ranking sequence.","title":"dense_rank"},{"location":"#double","text":"double(expr) - Casts the value expr to the target data type double .","title":"double"},{"location":"#e","text":"e() - Returns Euler's number, e. Examples: > SELECT e(); 2.718281828459045","title":"e"},{"location":"#element_at","text":"element_at(array, index) - Returns element of array at given (1-based) index. If index < 0, accesses elements from the last to the first. Returns NULL if the index exceeds the length of the array. 
element_at(map, key) - Returns value for given key, or NULL if the key is not contained in the map Examples: > SELECT element_at(array(1, 2, 3), 2); 2 > SELECT element_at(map(1, 'a', 2, 'b'), 2); b Since: 2.4.0","title":"element_at"},{"location":"#elt","text":"elt(n, input1, input2, ...) - Returns the n -th input, e.g., returns input2 when n is 2. Examples: > SELECT elt(1, 'scala', 'java'); scala","title":"elt"},{"location":"#encode","text":"encode(str, charset) - Encodes the first argument using the second argument character set. Examples: > SELECT encode('abc', 'utf-8'); abc","title":"encode"},{"location":"#exists","text":"exists(expr, pred) - Tests whether a predicate holds for one or more elements in the array. Examples: > SELECT exists(array(1, 2, 3), x -> x % 2 == 0); true Since: 2.4.0","title":"exists"},{"location":"#exp","text":"exp(expr) - Returns e to the power of expr . Examples: > SELECT exp(0); 1.0","title":"exp"},{"location":"#explode","text":"explode(expr) - Separates the elements of array expr into multiple rows, or the elements of map expr into multiple rows and columns. Examples: > SELECT explode(array(10, 20)); 10 20","title":"explode"},{"location":"#explode_outer","text":"explode_outer(expr) - Separates the elements of array expr into multiple rows, or the elements of map expr into multiple rows and columns. Examples: > SELECT explode_outer(array(10, 20)); 10 20","title":"explode_outer"},{"location":"#expm1","text":"expm1(expr) - Returns exp( expr ) - 1. Examples: > SELECT expm1(0); 0.0","title":"expm1"},{"location":"#factorial","text":"factorial(expr) - Returns the factorial of expr . expr is [0..20]. Otherwise, null. Examples: > SELECT factorial(5); 120","title":"factorial"},{"location":"#filter","text":"filter(expr, func) - Filters the input array using the given predicate. 
Examples: > SELECT filter(array(1, 2, 3), x -> x % 2 == 1); [1,3] Since: 2.4.0","title":"filter"},{"location":"#find_in_set","text":"find_in_set(str, str_array) - Returns the index (1-based) of the given string ( str ) in the comma-delimited list ( str_array ). Returns 0 if the string was not found or if the given string ( str ) contains a comma. Examples: > SELECT find_in_set('ab','abc,b,ab,c,def'); 3","title":"find_in_set"},{"location":"#first","text":"first(expr[, isIgnoreNull]) - Returns the first value of expr for a group of rows. If isIgnoreNull is true, returns only non-null values.","title":"first"},{"location":"#first_value","text":"first_value(expr[, isIgnoreNull]) - Returns the first value of expr for a group of rows. If isIgnoreNull is true, returns only non-null values.","title":"first_value"},{"location":"#flatten","text":"flatten(arrayOfArrays) - Transforms an array of arrays into a single array. Examples: > SELECT flatten(array(array(1, 2), array(3, 4))); [1,2,3,4] Since: 2.4.0","title":"flatten"},{"location":"#float","text":"float(expr) - Casts the value expr to the target data type float .","title":"float"},{"location":"#floor","text":"floor(expr) - Returns the largest integer not greater than expr . Examples: > SELECT floor(-0.1); -1 > SELECT floor(5); 5","title":"floor"},{"location":"#format_number","text":"format_number(expr1, expr2) - Formats the number expr1 like '#,###,###.##', rounded to expr2 decimal places. If expr2 is 0, the result has no decimal point or fractional part. expr2 also accepts a user-specified format. This is supposed to function like MySQL's FORMAT. Examples: > SELECT format_number(12332.123456, 4); 12,332.1235 > SELECT format_number(12332.123456, '##################.###'); 12332.123","title":"format_number"},{"location":"#format_string","text":"format_string(strfmt, obj, ...) - Returns a formatted string from printf-style format strings.
Examples: > SELECT format_string(\"Hello World %d %s\", 100, \"days\"); Hello World 100 days","title":"format_string"},{"location":"#from_json","text":"from_json(jsonStr, schema[, options]) - Returns a struct value with the given jsonStr and schema . Examples: > SELECT from_json('{\"a\":1, \"b\":0.8}', 'a INT, b DOUBLE'); {\"a\":1, \"b\":0.8} > SELECT from_json('{\"time\":\"26/08/2015\"}', 'time Timestamp', map('timestampFormat', 'dd/MM/yyyy')); {\"time\":\"2015-08-26 00:00:00.0\"} Since: 2.2.0","title":"from_json"},{"location":"#from_unixtime","text":"from_unixtime(unix_time, format) - Returns unix_time in the specified format . Examples: > SELECT from_unixtime(0, 'yyyy-MM-dd HH:mm:ss'); 1970-01-01 00:00:00 Since: 1.5.0","title":"from_unixtime"},{"location":"#from_utc_timestamp","text":"from_utc_timestamp(timestamp, timezone) - Given a timestamp like '2017-07-14 02:40:00.0', interprets it as a time in UTC, and renders that time as a timestamp in the given time zone. For example, 'GMT+1' would yield '2017-07-14 03:40:00.0'. Examples: > SELECT from_utc_timestamp('2016-08-31', 'Asia/Seoul'); 2016-08-31 09:00:00 Since: 1.5.0","title":"from_utc_timestamp"},{"location":"#get_json_object","text":"get_json_object(json_txt, path) - Extracts a json object from path . Examples: > SELECT get_json_object('{\"a\":\"b\"}', '$.a'); b","title":"get_json_object"},{"location":"#greatest","text":"greatest(expr, ...) - Returns the greatest value of all parameters, skipping null values. Examples: > SELECT greatest(10, 9, 2, 4, 3); 10","title":"greatest"},{"location":"#grouping","text":"","title":"grouping"},{"location":"#grouping_id","text":"","title":"grouping_id"},{"location":"#hash","text":"hash(expr1, expr2, ...) - Returns a hash value of the arguments. Examples: > SELECT hash('Spark', array(123), 2); -1321691492","title":"hash"},{"location":"#hex","text":"hex(expr) - Converts expr to hexadecimal. 
Examples: > SELECT hex(17); 11 > SELECT hex('Spark SQL'); 537061726B2053514C","title":"hex"},{"location":"#hour","text":"hour(timestamp) - Returns the hour component of the string/timestamp. Examples: > SELECT hour('2009-07-30 12:58:59'); 12 Since: 1.5.0","title":"hour"},{"location":"#hypot","text":"hypot(expr1, expr2) - Returns sqrt( expr1 **2 + expr2 **2). Examples: > SELECT hypot(3, 4); 5.0","title":"hypot"},{"location":"#if","text":"if(expr1, expr2, expr3) - If expr1 evaluates to true, then returns expr2 ; otherwise returns expr3 . Examples: > SELECT if(1 < 2, 'a', 'b'); a","title":"if"},{"location":"#ifnull","text":"ifnull(expr1, expr2) - Returns expr2 if expr1 is null, or expr1 otherwise. Examples: > SELECT ifnull(NULL, array('2')); [\"2\"]","title":"ifnull"},{"location":"#in","text":"expr1 in(expr2, expr3, ...) - Returns true if expr1 equals any of expr2, expr3, ... . Arguments: expr1, expr2, expr3, ... - the arguments must be the same type. Examples: > SELECT 1 in(1, 2, 3); true > SELECT 1 in(2, 3, 4); false > SELECT named_struct('a', 1, 'b', 2) in(named_struct('a', 1, 'b', 1), named_struct('a', 1, 'b', 3)); false > SELECT named_struct('a', 1, 'b', 2) in(named_struct('a', 1, 'b', 2), named_struct('a', 1, 'b', 3)); true","title":"in"},{"location":"#initcap","text":"initcap(str) - Returns str with the first letter of each word in uppercase. All other letters are in lowercase. Words are delimited by white space. Examples: > SELECT initcap('sPark sql'); Spark Sql","title":"initcap"},{"location":"#inline","text":"inline(expr) - Explodes an array of structs into a table. Examples: > SELECT inline(array(struct(1, 'a'), struct(2, 'b'))); 1 a 2 b","title":"inline"},{"location":"#inline_outer","text":"inline_outer(expr) - Explodes an array of structs into a table.
Examples: > SELECT inline_outer(array(struct(1, 'a'), struct(2, 'b'))); 1 a 2 b","title":"inline_outer"},{"location":"#input_file_block_length","text":"input_file_block_length() - Returns the length of the block being read, or -1 if not available.","title":"input_file_block_length"},{"location":"#input_file_block_start","text":"input_file_block_start() - Returns the start offset of the block being read, or -1 if not available.","title":"input_file_block_start"},{"location":"#input_file_name","text":"input_file_name() - Returns the name of the file being read, or empty string if not available.","title":"input_file_name"},{"location":"#instr","text":"instr(str, substr) - Returns the (1-based) index of the first occurrence of substr in str . Examples: > SELECT instr('SparkSQL', 'SQL'); 6","title":"instr"},{"location":"#int","text":"int(expr) - Casts the value expr to the target data type int .","title":"int"},{"location":"#isnan","text":"isnan(expr) - Returns true if expr is NaN, or false otherwise. Examples: > SELECT isnan(cast('NaN' as double)); true","title":"isnan"},{"location":"#isnotnull","text":"isnotnull(expr) - Returns true if expr is not null, or false otherwise. Examples: > SELECT isnotnull(1); true","title":"isnotnull"},{"location":"#isnull","text":"isnull(expr) - Returns true if expr is null, or false otherwise. Examples: > SELECT isnull(1); false","title":"isnull"},{"location":"#java_method","text":"java_method(class, method[, arg1[, arg2 ..]]) - Calls a method with reflection. Examples: > SELECT java_method('java.util.UUID', 'randomUUID'); c33fb387-8500-4bfa-81d2-6e0e3e930df2 > SELECT java_method('java.util.UUID', 'fromString', 'a5cf6c42-0c85-418f-af6c-3e4e5b1328f2'); a5cf6c42-0c85-418f-af6c-3e4e5b1328f2","title":"java_method"},{"location":"#json_tuple","text":"json_tuple(jsonStr, p1, p2, ..., pn) - Returns a tuple like the function get_json_object, but it takes multiple names. All the input parameters and output column types are string. 
Examples: > SELECT json_tuple('{\"a\":1, \"b\":2}', 'a', 'b'); 1 2","title":"json_tuple"},{"location":"#kurtosis","text":"kurtosis(expr) - Returns the kurtosis value calculated from values of a group.","title":"kurtosis"},{"location":"#lag","text":"lag(input[, offset[, default]]) - Returns the value of input at the offset th row before the current row in the window. The default value of offset is 1 and the default value of default is null. If the value of input at the offset th row is null, null is returned. If there is no such offset row (e.g., when the offset is 1, the first row of the window does not have any previous row), default is returned.","title":"lag"},{"location":"#last","text":"last(expr[, isIgnoreNull]) - Returns the last value of expr for a group of rows. If isIgnoreNull is true, returns only non-null values.","title":"last"},{"location":"#last_day","text":"last_day(date) - Returns the last day of the month which the date belongs to. Examples: > SELECT last_day('2009-01-12'); 2009-01-31 Since: 1.5.0","title":"last_day"},{"location":"#last_value","text":"last_value(expr[, isIgnoreNull]) - Returns the last value of expr for a group of rows. If isIgnoreNull is true, returns only non-null values.","title":"last_value"},{"location":"#lcase","text":"lcase(str) - Returns str with all characters changed to lowercase. Examples: > SELECT lcase('SparkSql'); sparksql","title":"lcase"},{"location":"#lead","text":"lead(input[, offset[, default]]) - Returns the value of input at the offset th row after the current row in the window. The default value of offset is 1 and the default value of default is null. If the value of input at the offset th row is null, null is returned. If there is no such offset row (e.g., when the offset is 1, the last row of the window does not have any subsequent row), default is returned.","title":"lead"},{"location":"#least","text":"least(expr, ...) - Returns the least value of all parameters, skipping null values.
Examples: > SELECT least(10, 9, 2, 4, 3); 2","title":"least"},{"location":"#left","text":"left(str, len) - Returns the leftmost len ( len can be string type) characters from the string str . If len is less than or equal to 0, the result is an empty string. Examples: > SELECT left('Spark SQL', 3); Spa","title":"left"},{"location":"#length","text":"length(expr) - Returns the character length of string data or number of bytes of binary data. The length of string data includes the trailing spaces. The length of binary data includes binary zeros. Examples: > SELECT length('Spark SQL '); 10 > SELECT CHAR_LENGTH('Spark SQL '); 10 > SELECT CHARACTER_LENGTH('Spark SQL '); 10","title":"length"},{"location":"#levenshtein","text":"levenshtein(str1, str2) - Returns the Levenshtein distance between the two given strings. Examples: > SELECT levenshtein('kitten', 'sitting'); 3","title":"levenshtein"},{"location":"#like","text":"str like pattern - Returns true if str matches pattern, null if any arguments are null, false otherwise. Arguments: str - a string expression pattern - a string expression. The pattern is a string which is matched literally, with the exception of the following special symbols: _ matches any one character in the input (similar to . in posix regular expressions) % matches zero or more characters in the input (similar to .* in posix regular expressions) The escape character is '\\'. If an escape character precedes a special symbol or another escape character, the following character is matched literally. It is invalid to escape any other character. Since Spark 2.0, string literals are unescaped in our SQL parser. For example, in order to match \"\\abc\", the pattern should be \"\\\\abc\". When SQL config 'spark.sql.parser.escapedStringLiterals' is enabled, it falls back to Spark 1.6 behavior regarding string literal parsing. For example, if the config is enabled, the pattern to match \"\\abc\" should be \"\\abc\".
Examples: > SELECT '%SystemDrive%\\Users\\John' like '\\%SystemDrive\\%\\\\Users%' true Note: Use RLIKE to match with standard regular expressions.","title":"like"},{"location":"#ln","text":"ln(expr) - Returns the natural logarithm (base e) of expr . Examples: > SELECT ln(1); 0.0","title":"ln"},{"location":"#locate","text":"locate(substr, str[, pos]) - Returns the position of the first occurrence of substr in str after position pos . The given pos and return value are 1-based. Examples: > SELECT locate('bar', 'foobarbar'); 4 > SELECT locate('bar', 'foobarbar', 5); 7 > SELECT POSITION('bar' IN 'foobarbar'); 4","title":"locate"},{"location":"#log","text":"log(base, expr) - Returns the logarithm of expr with base . Examples: > SELECT log(10, 100); 2.0","title":"log"},{"location":"#log10","text":"log10(expr) - Returns the logarithm of expr with base 10. Examples: > SELECT log10(10); 1.0","title":"log10"},{"location":"#log1p","text":"log1p(expr) - Returns log(1 + expr ). Examples: > SELECT log1p(0); 0.0","title":"log1p"},{"location":"#log2","text":"log2(expr) - Returns the logarithm of expr with base 2. Examples: > SELECT log2(2); 1.0","title":"log2"},{"location":"#lower","text":"lower(str) - Returns str with all characters changed to lowercase. Examples: > SELECT lower('SparkSql'); sparksql","title":"lower"},{"location":"#lpad","text":"lpad(str, len, pad) - Returns str , left-padded with pad to a length of len . If str is longer than len , the return value is shortened to len characters. Examples: > SELECT lpad('hi', 5, '??'); ???hi > SELECT lpad('hi', 1, '??'); h","title":"lpad"},{"location":"#ltrim","text":"ltrim(str) - Removes the leading space characters from str . 
ltrim(trimStr, str) - Removes the leading characters of str that appear in trimStr . Arguments: str - a string expression trimStr - the characters to trim; the default value is a single space Examples: > SELECT ltrim(' SparkSQL '); SparkSQL > SELECT ltrim('Sp', 'SSparkSQLS'); arkSQLS","title":"ltrim"},{"location":"#map","text":"map(key0, value0, key1, value1, ...) - Creates a map with the given key/value pairs. Examples: > SELECT map(1.0, '2', 3.0, '4'); {1.0:\"2\",3.0:\"4\"}","title":"map"},{"location":"#map_concat","text":"map_concat(map, ...) - Returns the union of all the given maps. Examples: > SELECT map_concat(map(1, 'a', 2, 'b'), map(2, 'c', 3, 'd')); {1:\"a\",2:\"c\",3:\"d\"} Since: 2.4.0","title":"map_concat"},{"location":"#map_from_arrays","text":"map_from_arrays(keys, values) - Creates a map with a pair of the given key/value arrays. No element of keys may be null. Examples: > SELECT map_from_arrays(array(1.0, 3.0), array('2', '4')); {1.0:\"2\",3.0:\"4\"} Since: 2.4.0","title":"map_from_arrays"},{"location":"#map_from_entries","text":"map_from_entries(arrayOfEntries) - Returns a map created from the given array of entries. Examples: > SELECT map_from_entries(array(struct(1, 'a'), struct(2, 'b'))); {1:\"a\",2:\"b\"} Since: 2.4.0","title":"map_from_entries"},{"location":"#map_keys","text":"map_keys(map) - Returns an unordered array containing the keys of the map. Examples: > SELECT map_keys(map(1, 'a', 2, 'b')); [1,2]","title":"map_keys"},{"location":"#map_values","text":"map_values(map) - Returns an unordered array containing the values of the map. Examples: > SELECT map_values(map(1, 'a', 2, 'b')); [\"a\",\"b\"]","title":"map_values"},{"location":"#max","text":"max(expr) - Returns the maximum value of expr .","title":"max"},{"location":"#md5","text":"md5(expr) - Returns an MD5 128-bit checksum as a hex string of expr . 
Examples: > SELECT md5('Spark'); 8cde774d6f7333752ed72cacddb05126","title":"md5"},{"location":"#mean","text":"mean(expr) - Returns the mean calculated from values of a group.","title":"mean"},{"location":"#min","text":"min(expr) - Returns the minimum value of expr .","title":"min"},{"location":"#minute","text":"minute(timestamp) - Returns the minute component of the string/timestamp. Examples: > SELECT minute('2009-07-30 12:58:59'); 58 Since: 1.5.0","title":"minute"},{"location":"#mod","text":"expr1 mod expr2 - Returns the remainder after expr1 / expr2 . Examples: > SELECT 2 mod 1.8; 0.2 > SELECT MOD(2, 1.8); 0.2","title":"mod"},{"location":"#monotonically_increasing_id","text":"monotonically_increasing_id() - Returns monotonically increasing 64-bit integers. The generated ID is guaranteed to be monotonically increasing and unique, but not consecutive. The current implementation puts the partition ID in the upper 31 bits, and the lower 33 bits represent the record number within each partition. The assumption is that the data frame has less than 1 billion partitions, and each partition has less than 8 billion records. The function is non-deterministic because its result depends on partition IDs.","title":"monotonically_increasing_id"},{"location":"#month","text":"month(date) - Returns the month component of the date/timestamp. Examples: > SELECT month('2016-07-30'); 7 Since: 1.5.0","title":"month"},{"location":"#months_between","text":"months_between(timestamp1, timestamp2[, roundOff]) - If timestamp1 is later than timestamp2 , then the result is positive. If timestamp1 and timestamp2 are on the same day of month, or both are the last day of month, time of day will be ignored. Otherwise, the difference is calculated based on 31 days per month, and rounded to 8 digits unless roundOff=false. 
Examples: > SELECT months_between('1997-02-28 10:30:00', '1996-10-30'); 3.94959677 > SELECT months_between('1997-02-28 10:30:00', '1996-10-30', false); 3.9495967741935485 Since: 1.5.0","title":"months_between"},{"location":"#named_struct","text":"named_struct(name1, val1, name2, val2, ...) - Creates a struct with the given field names and values. Examples: > SELECT named_struct(\"a\", 1, \"b\", 2, \"c\", 3); {\"a\":1,\"b\":2,\"c\":3}","title":"named_struct"},{"location":"#nanvl","text":"nanvl(expr1, expr2) - Returns expr1 if it's not NaN, or expr2 otherwise. Examples: > SELECT nanvl(cast('NaN' as double), 123); 123.0","title":"nanvl"},{"location":"#negative","text":"negative(expr) - Returns the negated value of expr . Examples: > SELECT negative(1); -1","title":"negative"},{"location":"#next_day","text":"next_day(start_date, day_of_week) - Returns the first date which is later than start_date and named as indicated. Examples: > SELECT next_day('2015-01-14', 'TU'); 2015-01-20 Since: 1.5.0","title":"next_day"},{"location":"#not","text":"not expr - Logical not.","title":"not"},{"location":"#now","text":"now() - Returns the current timestamp at the start of query evaluation. Since: 1.5.0","title":"now"},{"location":"#ntile","text":"ntile(n) - Divides the rows for each window partition into n buckets ranging from 1 to at most n .","title":"ntile"},{"location":"#nullif","text":"nullif(expr1, expr2) - Returns null if expr1 equals to expr2 , or expr1 otherwise. Examples: > SELECT nullif(2, 2); NULL","title":"nullif"},{"location":"#nvl","text":"nvl(expr1, expr2) - Returns expr2 if expr1 is null, or expr1 otherwise. Examples: > SELECT nvl(NULL, array('2')); [\"2\"]","title":"nvl"},{"location":"#nvl2","text":"nvl2(expr1, expr2, expr3) - Returns expr2 if expr1 is not null, or expr3 otherwise. Examples: > SELECT nvl2(NULL, 2, 1); 1","title":"nvl2"},{"location":"#octet_length","text":"octet_length(expr) - Returns the byte length of string data or number of bytes of binary data. 
Examples: > SELECT octet_length('Spark SQL'); 9","title":"octet_length"},{"location":"#or","text":"expr1 or expr2 - Logical OR.","title":"or"},{"location":"#parse_url","text":"parse_url(url, partToExtract[, key]) - Extracts a part from a URL. Examples: > SELECT parse_url('http://spark.apache.org/path?query=1', 'HOST') spark.apache.org > SELECT parse_url('http://spark.apache.org/path?query=1', 'QUERY') query=1 > SELECT parse_url('http://spark.apache.org/path?query=1', 'QUERY', 'query') 1","title":"parse_url"},{"location":"#percent_rank","text":"percent_rank() - Computes the percentage ranking of a value in a group of values.","title":"percent_rank"},{"location":"#percentile","text":"percentile(col, percentage [, frequency]) - Returns the exact percentile value of numeric column col at the given percentage. The value of percentage must be between 0.0 and 1.0. The value of frequency should be a positive integer. percentile(col, array(percentage1 [, percentage2]...) [, frequency]) - Returns the exact percentile value array of numeric column col at the given percentage(s). Each value of the percentage array must be between 0.0 and 1.0. The value of frequency should be a positive integer.","title":"percentile"},{"location":"#percentile_approx","text":"percentile_approx(col, percentage [, accuracy]) - Returns the approximate percentile value of numeric column col at the given percentage. The value of percentage must be between 0.0 and 1.0. The accuracy parameter (default: 10000) is a positive numeric literal which controls approximation accuracy at the cost of memory. A higher value of accuracy yields better accuracy; 1.0/accuracy is the relative error of the approximation. When percentage is an array, each value of the percentage array must be between 0.0 and 1.0. In this case, returns the approximate percentile array of column col at the given percentage array. 
Examples: > SELECT percentile_approx(10.0, array(0.5, 0.4, 0.1), 100); [10.0,10.0,10.0] > SELECT percentile_approx(10.0, 0.5, 100); 10.0","title":"percentile_approx"},{"location":"#pi","text":"pi() - Returns pi. Examples: > SELECT pi(); 3.141592653589793","title":"pi"},{"location":"#pmod","text":"pmod(expr1, expr2) - Returns the positive value of expr1 mod expr2 . Examples: > SELECT pmod(10, 3); 1 > SELECT pmod(-10, 3); 2","title":"pmod"},{"location":"#posexplode","text":"posexplode(expr) - Separates the elements of array expr into multiple rows with positions, or the elements of map expr into multiple rows and columns with positions. Examples: > SELECT posexplode(array(10,20)); 0 10 1 20","title":"posexplode"},{"location":"#posexplode_outer","text":"posexplode_outer(expr) - Separates the elements of array expr into multiple rows with positions, or the elements of map expr into multiple rows and columns with positions. Examples: > SELECT posexplode_outer(array(10,20)); 0 10 1 20","title":"posexplode_outer"},{"location":"#position","text":"position(substr, str[, pos]) - Returns the position of the first occurrence of substr in str after position pos . The given pos and return value are 1-based. Examples: > SELECT position('bar', 'foobarbar'); 4 > SELECT position('bar', 'foobarbar', 5); 7 > SELECT POSITION('bar' IN 'foobarbar'); 4","title":"position"},{"location":"#positive","text":"positive(expr) - Returns the value of expr .","title":"positive"},{"location":"#pow","text":"pow(expr1, expr2) - Raises expr1 to the power of expr2 . Examples: > SELECT pow(2, 3); 8.0","title":"pow"},{"location":"#power","text":"power(expr1, expr2) - Raises expr1 to the power of expr2 . Examples: > SELECT power(2, 3); 8.0","title":"power"},{"location":"#printf","text":"printf(strfmt, obj, ...) - Returns a formatted string from printf-style format strings. 
Examples: > SELECT printf(\"Hello World %d %s\", 100, \"days\"); Hello World 100 days","title":"printf"},{"location":"#quarter","text":"quarter(date) - Returns the quarter of the year for date, in the range 1 to 4. Examples: > SELECT quarter('2016-08-31'); 3 Since: 1.5.0","title":"quarter"},{"location":"#radians","text":"radians(expr) - Converts degrees to radians. Arguments: expr - angle in degrees Examples: > SELECT radians(180); 3.141592653589793","title":"radians"},{"location":"#rand","text":"rand([seed]) - Returns a random value with independent and identically distributed (i.i.d.) uniformly distributed values in [0, 1). Examples: > SELECT rand(); 0.9629742951434543 > SELECT rand(0); 0.8446490682263027 > SELECT rand(null); 0.8446490682263027 Note: function is non-deterministic in general case.","title":"rand"},{"location":"#randn","text":"randn([seed]) - Returns a random value with independent and identically distributed (i.i.d.) values drawn from the standard normal distribution. Examples: > SELECT randn(); -0.3254147983080288 > SELECT randn(0); 1.1164209726833079 > SELECT randn(null); 1.1164209726833079 Note: function is non-deterministic in general case.","title":"randn"},{"location":"#rank","text":"rank() - Computes the rank of a value in a group of values. The result is one plus the number of rows preceding or equal to the current row in the ordering of the partition. The values will produce gaps in the sequence.","title":"rank"},{"location":"#reflect","text":"reflect(class, method[, arg1[, arg2 ..]]) - Calls a method with reflection. Examples: > SELECT reflect('java.util.UUID', 'randomUUID'); c33fb387-8500-4bfa-81d2-6e0e3e930df2 > SELECT reflect('java.util.UUID', 'fromString', 'a5cf6c42-0c85-418f-af6c-3e4e5b1328f2'); a5cf6c42-0c85-418f-af6c-3e4e5b1328f2","title":"reflect"},{"location":"#regexp_extract","text":"regexp_extract(str, regexp[, idx]) - Extracts a group that matches regexp . 
Examples: > SELECT regexp_extract('100-200', '(\\\\d+)-(\\\\d+)', 1); 100","title":"regexp_extract"},{"location":"#regexp_replace","text":"regexp_replace(str, regexp, rep) - Replaces all substrings of str that match regexp with rep . Examples: > SELECT regexp_replace('100-200', '(\\\\d+)', 'num'); num-num","title":"regexp_replace"},{"location":"#repeat","text":"repeat(str, n) - Returns the string which repeats the given string value n times. Examples: > SELECT repeat('123', 2); 123123","title":"repeat"},{"location":"#replace","text":"replace(str, search[, replace]) - Replaces all occurrences of search with replace . Arguments: str - a string expression search - a string expression. If search is not found in str , str is returned unchanged. replace - a string expression. If replace is not specified or is an empty string, nothing replaces the string that is removed from str . Examples: > SELECT replace('ABCabc', 'abc', 'DEF'); ABCDEF","title":"replace"},{"location":"#reverse","text":"reverse(array) - Returns a reversed string or an array with reverse order of elements. Examples: > SELECT reverse('Spark SQL'); LQS krapS > SELECT reverse(array(2, 1, 4, 3)); [3,4,1,2] Note: Reverse logic for arrays is available since 2.4.0. Since: 1.5.0","title":"reverse"},{"location":"#right","text":"right(str, len) - Returns the rightmost len characters from the string str ( len can be string type). If len is less than or equal to 0, the result is an empty string. Examples: > SELECT right('Spark SQL', 3); SQL","title":"right"},{"location":"#rint","text":"rint(expr) - Returns the double value that is closest in value to the argument and is equal to a mathematical integer. Examples: > SELECT rint(12.3456); 12.0","title":"rint"},{"location":"#rlike","text":"str rlike regexp - Returns true if str matches regexp , or false otherwise. Arguments: str - a string expression regexp - a string expression. The pattern string should be a Java regular expression. 
Since Spark 2.0, string literals (including regex patterns) are unescaped in our SQL parser. For example, to match \"\\abc\", a regular expression for regexp can be \"^\\\\abc$\". There is a SQL config 'spark.sql.parser.escapedStringLiterals' that can be used to fall back to the Spark 1.6 behavior regarding string literal parsing. For example, if the config is enabled, the regexp that can match \"\\abc\" is \"^\\abc$\". Examples: When spark.sql.parser.escapedStringLiterals is disabled (default). > SELECT '%SystemDrive%\\Users\\John' rlike '%SystemDrive%\\\\Users.*' true When spark.sql.parser.escapedStringLiterals is enabled. > SELECT '%SystemDrive%\\Users\\John' rlike '%SystemDrive%\\Users.*' true Note: Use LIKE to match with simple string pattern.","title":"rlike"},{"location":"#rollup","text":"","title":"rollup"},{"location":"#round","text":"round(expr, d) - Returns expr rounded to d decimal places using HALF_UP rounding mode. Examples: > SELECT round(2.5, 0); 3.0","title":"round"},{"location":"#row_number","text":"row_number() - Assigns a unique, sequential number to each row, starting with one, according to the ordering of rows within the window partition.","title":"row_number"},{"location":"#rpad","text":"rpad(str, len, pad) - Returns str , right-padded with pad to a length of len . If str is longer than len , the return value is shortened to len characters. Examples: > SELECT rpad('hi', 5, '??'); hi??? > SELECT rpad('hi', 1, '??'); h","title":"rpad"},{"location":"#rtrim","text":"rtrim(str) - Removes the trailing space characters from str . 
rtrim(trimStr, str) - Removes the trailing characters of str that appear in trimStr . Arguments: str - a string expression trimStr - the characters to trim; the default value is a single space Examples: > SELECT rtrim(' SparkSQL '); SparkSQL > SELECT rtrim('LQSa', 'SSparkSQLS'); SSpark","title":"rtrim"},{"location":"#schema_of_json","text":"schema_of_json(json[, options]) - Returns the schema of a JSON string in DDL format. Examples: > SELECT schema_of_json('[{\"col\":0}]'); array<struct<col:int>> Since: 2.4.0","title":"schema_of_json"},{"location":"#second","text":"second(timestamp) - Returns the second component of the string/timestamp. Examples: > SELECT second('2009-07-30 12:58:59'); 59 Since: 1.5.0","title":"second"},{"location":"#sentences","text":"sentences(str[, lang, country]) - Splits str into an array of arrays of words. Examples: > SELECT sentences('Hi there! Good morning.'); [[\"Hi\",\"there\"],[\"Good\",\"morning\"]]","title":"sentences"},{"location":"#sequence","text":"sequence(start, stop, step) - Generates an array of elements from start to stop (inclusive), incrementing by step. The type of the returned elements is the same as the type of argument expressions. Supported types are: byte, short, integer, long, date, timestamp. The start and stop expressions must resolve to the same type. If start and stop expressions resolve to the 'date' or 'timestamp' type then the step expression must resolve to the 'interval' type, otherwise to the same type as the start and stop expressions. Arguments: start - an expression. The start of the range. stop - an expression. The end of the range (inclusive). step - an optional expression. The step of the range. By default step is 1 if start is less than or equal to stop, otherwise -1. For temporal sequences it is 1 day and -1 day respectively. If start is greater than stop then the step must be negative, and vice versa. 
Examples: > SELECT sequence(1, 5); [1,2,3,4,5] > SELECT sequence(5, 1); [5,4,3,2,1] > SELECT sequence(to_date('2018-01-01'), to_date('2018-03-01'), interval 1 month); [2018-01-01,2018-02-01,2018-03-01] Since: 2.4.0","title":"sequence"},{"location":"#sha","text":"sha(expr) - Returns a sha1 hash value as a hex string of the expr . Examples: > SELECT sha('Spark'); 85f5955f4b27a9a4c2aab6ffe5d7189fc298b92c","title":"sha"},{"location":"#sha1","text":"sha1(expr) - Returns a sha1 hash value as a hex string of the expr . Examples: > SELECT sha1('Spark'); 85f5955f4b27a9a4c2aab6ffe5d7189fc298b92c","title":"sha1"},{"location":"#sha2","text":"sha2(expr, bitLength) - Returns a checksum of SHA-2 family as a hex string of expr . SHA-224, SHA-256, SHA-384, and SHA-512 are supported. Bit length of 0 is equivalent to 256. Examples: > SELECT sha2('Spark', 256); 529bc3b07127ecb7e53a4dcf1991d9152c24537d919178022b2c42657f79a26b","title":"sha2"},{"location":"#shiftleft","text":"shiftleft(base, expr) - Bitwise left shift. Examples: > SELECT shiftleft(2, 1); 4","title":"shiftleft"},{"location":"#shiftright","text":"shiftright(base, expr) - Bitwise (signed) right shift. Examples: > SELECT shiftright(4, 1); 2","title":"shiftright"},{"location":"#shiftrightunsigned","text":"shiftrightunsigned(base, expr) - Bitwise unsigned right shift. Examples: > SELECT shiftrightunsigned(4, 1); 2","title":"shiftrightunsigned"},{"location":"#shuffle","text":"shuffle(array) - Returns a random permutation of the given array. Examples: > SELECT shuffle(array(1, 20, 3, 5)); [3,1,5,20] > SELECT shuffle(array(1, 20, null, 3)); [20,null,3,1] Note: function is non-deterministic. Since: 2.4.0","title":"shuffle"},{"location":"#sign","text":"sign(expr) - Returns -1.0, 0.0 or 1.0 as expr is negative, 0 or positive. Examples: > SELECT sign(40); 1.0","title":"sign"},{"location":"#signum","text":"signum(expr) - Returns -1.0, 0.0 or 1.0 as expr is negative, 0 or positive. 
Examples: > SELECT signum(40); 1.0","title":"signum"},{"location":"#sin","text":"sin(expr) - Returns the sine of expr , as if computed by java.lang.Math.sin . Arguments: expr - angle in radians Examples: > SELECT sin(0); 0.0","title":"sin"},{"location":"#sinh","text":"sinh(expr) - Returns hyperbolic sine of expr , as if computed by java.lang.Math.sinh . Arguments: expr - hyperbolic angle Examples: > SELECT sinh(0); 0.0","title":"sinh"},{"location":"#size","text":"size(expr) - Returns the size of an array or a map. The function returns -1 if its input is null and spark.sql.legacy.sizeOfNull is set to true. If spark.sql.legacy.sizeOfNull is set to false, the function returns null for null input. By default, the spark.sql.legacy.sizeOfNull parameter is set to true. Examples: > SELECT size(array('b', 'd', 'c', 'a')); 4 > SELECT size(map('a', 1, 'b', 2)); 2 > SELECT size(NULL); -1","title":"size"},{"location":"#skewness","text":"skewness(expr) - Returns the skewness value calculated from values of a group.","title":"skewness"},{"location":"#slice","text":"slice(x, start, length) - Subsets array x starting from index start (or starting from the end if start is negative) with the specified length. Examples: > SELECT slice(array(1, 2, 3, 4), 2, 2); [2,3] > SELECT slice(array(1, 2, 3, 4), -2, 2); [3,4] Since: 2.4.0","title":"slice"},{"location":"#smallint","text":"smallint(expr) - Casts the value expr to the target data type smallint .","title":"smallint"},{"location":"#sort_array","text":"sort_array(array[, ascendingOrder]) - Sorts the input array in ascending or descending order according to the natural ordering of the array elements. Null elements will be placed at the beginning of the returned array in ascending order or at the end of the returned array in descending order. 
Examples: > SELECT sort_array(array('b', 'd', null, 'c', 'a'), true); [null,\"a\",\"b\",\"c\",\"d\"]","title":"sort_array"},{"location":"#soundex","text":"soundex(str) - Returns Soundex code of the string. Examples: > SELECT soundex('Miller'); M460","title":"soundex"},{"location":"#space","text":"space(n) - Returns a string consisting of n spaces. Examples: > SELECT concat(space(2), '1'); 1","title":"space"},{"location":"#spark_partition_id","text":"spark_partition_id() - Returns the current partition id.","title":"spark_partition_id"},{"location":"#split","text":"split(str, regex) - Splits str around occurrences that match regex . Examples: > SELECT split('oneAtwoBthreeC', '[ABC]'); [\"one\",\"two\",\"three\",\"\"]","title":"split"},{"location":"#sqrt","text":"sqrt(expr) - Returns the square root of expr . Examples: > SELECT sqrt(4); 2.0","title":"sqrt"},{"location":"#stack","text":"stack(n, expr1, ..., exprk) - Separates expr1 , ..., exprk into n rows. Examples: > SELECT stack(2, 1, 2, 3); 1 2 3 NULL","title":"stack"},{"location":"#std","text":"std(expr) - Returns the sample standard deviation calculated from values of a group.","title":"std"},{"location":"#stddev","text":"stddev(expr) - Returns the sample standard deviation calculated from values of a group.","title":"stddev"},{"location":"#stddev_pop","text":"stddev_pop(expr) - Returns the population standard deviation calculated from values of a group.","title":"stddev_pop"},{"location":"#stddev_samp","text":"stddev_samp(expr) - Returns the sample standard deviation calculated from values of a group.","title":"stddev_samp"},{"location":"#str_to_map","text":"str_to_map(text[, pairDelim[, keyValueDelim]]) - Creates a map after splitting the text into key/value pairs using delimiters. Default delimiters are ',' for pairDelim and ':' for keyValueDelim . 
Examples: > SELECT str_to_map('a:1,b:2,c:3', ',', ':'); map(\"a\":\"1\",\"b\":\"2\",\"c\":\"3\") > SELECT str_to_map('a'); map(\"a\":null)","title":"str_to_map"},{"location":"#string","text":"string(expr) - Casts the value expr to the target data type string .","title":"string"},{"location":"#struct","text":"struct(col1, col2, col3, ...) - Creates a struct with the given field values.","title":"struct"},{"location":"#substr","text":"substr(str, pos[, len]) - Returns the substring of str that starts at pos and is of length len , or the slice of byte array that starts at pos and is of length len . Examples: > SELECT substr('Spark SQL', 5); k SQL > SELECT substr('Spark SQL', -3); SQL > SELECT substr('Spark SQL', 5, 1); k","title":"substr"},{"location":"#substring","text":"substring(str, pos[, len]) - Returns the substring of str that starts at pos and is of length len , or the slice of byte array that starts at pos and is of length len . Examples: > SELECT substring('Spark SQL', 5); k SQL > SELECT substring('Spark SQL', -3); SQL > SELECT substring('Spark SQL', 5, 1); k","title":"substring"},{"location":"#substring_index","text":"substring_index(str, delim, count) - Returns the substring from str before count occurrences of the delimiter delim . If count is positive, everything to the left of the final delimiter (counting from the left) is returned. If count is negative, everything to the right of the final delimiter (counting from the right) is returned. The function substring_index performs a case-sensitive match when searching for delim . Examples: > SELECT substring_index('www.apache.org', '.', 2); www.apache","title":"substring_index"},{"location":"#sum","text":"sum(expr) - Returns the sum calculated from values of a group.","title":"sum"},{"location":"#tan","text":"tan(expr) - Returns the tangent of expr , as if computed by java.lang.Math.tan . 
Arguments: expr - angle in radians Examples: > SELECT tan(0); 0.0","title":"tan"},{"location":"#tanh","text":"tanh(expr) - Returns the hyperbolic tangent of expr , as if computed by java.lang.Math.tanh . Arguments: expr - hyperbolic angle Examples: > SELECT tanh(0); 0.0","title":"tanh"},{"location":"#timestamp","text":"timestamp(expr) - Casts the value expr to the target data type timestamp .","title":"timestamp"},{"location":"#tinyint","text":"tinyint(expr) - Casts the value expr to the target data type tinyint .","title":"tinyint"},{"location":"#to_date","text":"to_date(date_str[, fmt]) - Parses the date_str expression with the fmt expression to a date. Returns null with invalid input. By default, it follows casting rules to a date if the fmt is omitted. Examples: > SELECT to_date('2009-07-30 04:17:52'); 2009-07-30 > SELECT to_date('2016-12-31', 'yyyy-MM-dd'); 2016-12-31 Since: 1.5.0","title":"to_date"},{"location":"#to_json","text":"to_json(expr[, options]) - Returns a JSON string with a given struct value Examples: > SELECT to_json(named_struct('a', 1, 'b', 2)); {\"a\":1,\"b\":2} > SELECT to_json(named_struct('time', to_timestamp('2015-08-26', 'yyyy-MM-dd')), map('timestampFormat', 'dd/MM/yyyy')); {\"time\":\"26/08/2015\"} > SELECT to_json(array(named_struct('a', 1, 'b', 2))); [{\"a\":1,\"b\":2}] > SELECT to_json(map('a', named_struct('b', 1))); {\"a\":{\"b\":1}} > SELECT to_json(map(named_struct('a', 1),named_struct('b', 2))); {\"[1]\":{\"b\":2}} > SELECT to_json(map('a', 1)); {\"a\":1} > SELECT to_json(array((map('a', 1)))); [{\"a\":1}] Since: 2.2.0","title":"to_json"},{"location":"#to_timestamp","text":"to_timestamp(timestamp[, fmt]) - Parses the timestamp expression with the fmt expression to a timestamp. Returns null with invalid input. By default, it follows casting rules to a timestamp if the fmt is omitted. 
Examples: > SELECT to_timestamp('2016-12-31 00:12:00'); 2016-12-31 00:12:00 > SELECT to_timestamp('2016-12-31', 'yyyy-MM-dd'); 2016-12-31 00:00:00 Since: 2.2.0","title":"to_timestamp"},{"location":"#to_unix_timestamp","text":"to_unix_timestamp(expr[, pattern]) - Returns the UNIX timestamp of the given time. Examples: > SELECT to_unix_timestamp('2016-04-08', 'yyyy-MM-dd'); 1460041200 Since: 1.6.0","title":"to_unix_timestamp"},{"location":"#to_utc_timestamp","text":"to_utc_timestamp(timestamp, timezone) - Given a timestamp like '2017-07-14 02:40:00.0', interprets it as a time in the given time zone, and renders that time as a timestamp in UTC. For example, 'GMT+1' would yield '2017-07-14 01:40:00.0'. Examples: > SELECT to_utc_timestamp('2016-08-31', 'Asia/Seoul'); 2016-08-30 15:00:00 Since: 1.5.0","title":"to_utc_timestamp"},{"location":"#transform","text":"transform(expr, func) - Transforms elements in an array using the function. Examples: > SELECT transform(array(1, 2, 3), x -> x + 1); [2,3,4] > SELECT transform(array(1, 2, 3), (x, i) -> x + i); [1,3,5] Since: 2.4.0","title":"transform"},{"location":"#translate","text":"translate(input, from, to) - Translates the input string by replacing the characters present in the from string with the corresponding characters in the to string. Examples: > SELECT translate('AaBbCc', 'abc', '123'); A1B2C3","title":"translate"},{"location":"#trim","text":"trim(str) - Removes the leading and trailing space characters from str . 
trim(BOTH trimStr FROM str) - Remove the leading and trailing trimStr characters from str trim(LEADING trimStr FROM str) - Remove the leading trimStr characters from str trim(TRAILING trimStr FROM str) - Remove the trailing trimStr characters from str Arguments: str - a string expression trimStr - the trim string characters to trim, the default value is a single space BOTH, FROM - these are keywords to specify trimming string characters from both ends of the string LEADING, FROM - these are keywords to specify trimming string characters from the left end of the string TRAILING, FROM - these are keywords to specify trimming string characters from the right end of the string Examples: > SELECT trim(' SparkSQL '); SparkSQL > SELECT trim('SL', 'SSparkSQLS'); parkSQ > SELECT trim(BOTH 'SL' FROM 'SSparkSQLS'); parkSQ > SELECT trim(LEADING 'SL' FROM 'SSparkSQLS'); parkSQLS > SELECT trim(TRAILING 'SL' FROM 'SSparkSQLS'); SSparkSQ","title":"trim"},{"location":"#trunc","text":"trunc(date, fmt) - Returns date with the time portion of the day truncated to the unit specified by the format model fmt . fmt should be one of [\"year\", \"yyyy\", \"yy\", \"mon\", \"month\", \"mm\"] Examples: > SELECT trunc('2009-02-12', 'MM'); 2009-02-01 > SELECT trunc('2015-10-27', 'YEAR'); 2015-01-01 Since: 1.5.0","title":"trunc"},{"location":"#ucase","text":"ucase(str) - Returns str with all characters changed to uppercase. Examples: > SELECT ucase('SparkSql'); SPARKSQL","title":"ucase"},{"location":"#unbase64","text":"unbase64(str) - Converts the argument from a base 64 string str to a binary. Examples: > SELECT unbase64('U3BhcmsgU1FM'); Spark SQL","title":"unbase64"},{"location":"#unhex","text":"unhex(expr) - Converts hexadecimal expr to binary. Examples: > SELECT decode(unhex('537061726B2053514C'), 'UTF-8'); Spark SQL","title":"unhex"},{"location":"#unix_timestamp","text":"unix_timestamp([expr[, pattern]]) - Returns the UNIX timestamp of current or specified time. 
Examples: > SELECT unix_timestamp(); 1476884637 > SELECT unix_timestamp('2016-04-08', 'yyyy-MM-dd'); 1460041200 Since: 1.5.0","title":"unix_timestamp"},{"location":"#upper","text":"upper(str) - Returns str with all characters changed to uppercase. Examples: > SELECT upper('SparkSql'); SPARKSQL","title":"upper"},{"location":"#uuid","text":"uuid() - Returns a universally unique identifier (UUID) string. The value is returned as a canonical UUID 36-character string. Examples: > SELECT uuid(); 46707d92-02f4-4817-8116-a4c3b23e6266 Note: function is non-deterministic.","title":"uuid"},{"location":"#var_pop","text":"var_pop(expr) - Returns the population variance calculated from values of a group.","title":"var_pop"},{"location":"#var_samp","text":"var_samp(expr) - Returns the sample variance calculated from values of a group.","title":"var_samp"},{"location":"#variance","text":"variance(expr) - Returns the sample variance calculated from values of a group.","title":"variance"},{"location":"#weekday","text":"weekday(date) - Returns the day of the week for date/timestamp (0 = Monday, 1 = Tuesday, ..., 6 = Sunday). Examples: > SELECT weekday('2009-07-30'); 3 Since: 2.4.0","title":"weekday"},{"location":"#weekofyear","text":"weekofyear(date) - Returns the week of the year of the given date. A week is considered to start on a Monday and week 1 is the first week with >3 days. Examples: > SELECT weekofyear('2008-02-20'); 8 Since: 1.5.0","title":"weekofyear"},{"location":"#when","text":"CASE WHEN expr1 THEN expr2 [WHEN expr3 THEN expr4]* [ELSE expr5] END - When expr1 = true, returns expr2 ; else when expr3 = true, returns expr4 ; else returns expr5 . Arguments: expr1, expr3 - the branch condition expressions should all be boolean type. expr2, expr4, expr5 - the branch value expressions and else value expression should all be same type or coercible to a common type. 
Examples: > SELECT CASE WHEN 1 > 0 THEN 1 WHEN 2 > 0 THEN 2.0 ELSE 1.2 END; 1 > SELECT CASE WHEN 1 < 0 THEN 1 WHEN 2 > 0 THEN 2.0 ELSE 1.2 END; 2 > SELECT CASE WHEN 1 < 0 THEN 1 WHEN 2 < 0 THEN 2.0 END; NULL","title":"when"},{"location":"#window","text":"","title":"window"},{"location":"#xpath","text":"xpath(xml, xpath) - Returns a string array of values within the nodes of xml that match the XPath expression. Examples: > SELECT xpath('<a><b>b1</b><b>b2</b><b>b3</b><c>c1</c><c>c2</c></a>','a/b/text()'); ['b1','b2','b3']","title":"xpath"},{"location":"#xpath_boolean","text":"xpath_boolean(xml, xpath) - Returns true if the XPath expression evaluates to true, or if a matching node is found. Examples: > SELECT xpath_boolean('<a><b>1</b></a>','a/b'); true","title":"xpath_boolean"},{"location":"#xpath_double","text":"xpath_double(xml, xpath) - Returns a double value, the value zero if no match is found, or NaN if a match is found but the value is non-numeric. Examples: > SELECT xpath_double('<a><b>1</b><b>2</b></a>', 'sum(a/b)'); 3.0","title":"xpath_double"},{"location":"#xpath_float","text":"xpath_float(xml, xpath) - Returns a float value, the value zero if no match is found, or NaN if a match is found but the value is non-numeric. Examples: > SELECT xpath_float('<a><b>1</b><b>2</b></a>', 'sum(a/b)'); 3.0","title":"xpath_float"},{"location":"#xpath_int","text":"xpath_int(xml, xpath) - Returns an integer value, or the value zero if no match is found, or a match is found but the value is non-numeric. Examples: > SELECT xpath_int('<a><b>1</b><b>2</b></a>', 'sum(a/b)'); 3","title":"xpath_int"},{"location":"#xpath_long","text":"xpath_long(xml, xpath) - Returns a long integer value, or the value zero if no match is found, or a match is found but the value is non-numeric. 
Examples: > SELECT xpath_long('<a><b>1</b><b>2</b></a>', 'sum(a/b)'); 3","title":"xpath_long"},{"location":"#xpath_number","text":"xpath_number(xml, xpath) - Returns a double value, the value zero if no match is found, or NaN if a match is found but the value is non-numeric. Examples: > SELECT xpath_number('<a><b>1</b><b>2</b></a>', 'sum(a/b)'); 3.0","title":"xpath_number"},{"location":"#xpath_short","text":"xpath_short(xml, xpath) - Returns a short integer value, or the value zero if no match is found, or a match is found but the value is non-numeric. Examples: > SELECT xpath_short('<a><b>1</b><b>2</b></a>', 'sum(a/b)'); 3","title":"xpath_short"},{"location":"#xpath_string","text":"xpath_string(xml, xpath) - Returns the text contents of the first xml node that matches the XPath expression. Examples: > SELECT xpath_string('<a><b>b</b><c>cc</c></a>','a/c'); cc","title":"xpath_string"},{"location":"#year","text":"year(date) - Returns the year component of the date/timestamp. Examples: > SELECT year('2016-07-30'); 2016 Since: 1.5.0","title":"year"},{"location":"#zip_with","text":"zip_with(left, right, func) - Merges the two given arrays, element-wise, into a single array using function. If one array is shorter, nulls are appended at the end to match the length of the longer array, before applying function. Examples: > SELECT zip_with(array(1, 2, 3), array('a', 'b', 'c'), (x, y) -> (y, x)); [{\"y\":\"a\",\"x\":1},{\"y\":\"b\",\"x\":2},{\"y\":\"c\",\"x\":3}] > SELECT zip_with(array(1, 2), array(3, 4), (x, y) -> x + y); [4,6] > SELECT zip_with(array('a', 'b', 'c'), array('d', 'e', 'f'), (x, y) -> concat(x, y)); [\"ad\",\"be\",\"cf\"] Since: 2.4.0","title":"zip_with"},{"location":"#_15","text":"expr1 | expr2 - Returns the result of bitwise OR of expr1 and expr2 . Examples: > SELECT 3 | 5; 7","title":"|"},{"location":"#_16","text":"~ expr - Returns the result of bitwise NOT of expr . Examples: > SELECT ~ 0; -1","title":"~"}]}