Get Even More Visitors To Your Blog, Upgrade To A Business Listing >>

Hive Built-in Functions

Introduction to Hive Built-in Functions

Functions are built to perform different analytical requirements and operations like mathematical, logical, arithmetic and relational, on huge datasets and tables. Functions are used when we have to re-use similar operations multiple times. In this advanced BigData world, to handle and process huge datasets, writing MapReduce programs becomes difficult. So, Hive Query Language  (HQL) was introduced on top of Apache Hadoop in October 2010, using which we can easily run queries, similar to SQLs. HQLs are further internally converted into Mapreduce programs to yield the outputs. In this topic, we are going to discuss Hive Built-in Functions

Hive functions are broadly categorized into two :

  1. Built-in Functions.
  2. User-Defined Functions.

In this article, we are going to discuss built-in functions in hive

Built-In Functions of Hive

Built-in Functions are predefined and readily available to use in Hive. They are categorized as below:

1. String Functions

  • Used for string manipulations and transformation.
Function Name Return Type Description Example
substr(string, int, int) String It returns a string from the specified start position to specified length substr(‘Hive query ‘,5,5) results in ‘query’
rtrim(string X) String It returns a string without right leading spaces rtrim(‘Hello     ‘) results in ‘Hello’
ltrim(string X) String It returns a string without left leading spaces ltrim(‘      Hello’) results in ‘Hello’
reverse(string X) String It returns a reversed string of X reverse(‘hello’) results in ‘olleh’
rpad(string X, int , string ) String It returns a right padded string with ‘pad’ to a length of rpad(‘Hello ‘,3,’hi ‘) results in ‘Hello hi hi hi ‘
lpad(string X, int , string ) String It returns a left padded string with ‘pad’ to a length of lpad(‘Hello’,3,’ Hi ‘) results in ‘ Hi Hi Hi Hello’
split(string X,string pat) Array It returns an array of strings after splitting the string around the specified pattern split(“A,B,C,D”,”,”) results in [‘A’,B’,’C’,’D’]
length(string X) Integer It returns length of string X length(‘Bigdata’) results in 7
concat(string X,String Y) String It returns a concatenated string of both X and Y concat(‘Hello’,’World’) results in ‘HelloWorld’

2. Mathematical Functions

  • Used for applying mathematical operations like rounding, ceiling, flooring, etc. instead of creating UDFs.
Function Name Return Type Description Example
round(double X) Double It returns the rounded off value of X round(29.5) results in 30
round(double X,Int d) Double It returns the rounded off value of X to d decimal places round(29.3456,2) results in 29.34
ceil(double X) Double It returns a minimum BIGINT value that is equal to or greater than X ceil(20.5555) results in 21
floor(double X) Double It returns a maximum BIGINT value that is equal to or greater than X ceil(20.5555) results in 20
rand() Double It returns random numbers between 0 to 9 rand() results in rand(0-9)
abs(double X) Double It returns the absolute value of the number X abs(-25) results in 25
pow(double X,double Y) Double It returns X value raised to the Y power pow(2,3) results in 8
exp(double X) Double It returns the exponent value of X exp(2)results in 7.389

 3. Conditional Functions

  • Used to check an expression for True or False and returns corresponding results.
Function Name Return Type Description Example
isnull(X) Boolean It returns TRUE if X is NULL else false isnull(‘NULL’) returns in TRUE
isnotnull(X) Boolean It returns TRUE if X is not NULL else false isnotnull(‘NULL’) returns in FALSE
nvl(arg X,arg Y) String It returns arg Y if arg X is NULL else return arg X nvl(‘NULL’,’Value is Null’) results in ‘Value is Null’
if(Boolean testCondition, T valueTrue, T valueFalseOrNull) T It returns True when test condition is true and False or Null otherwise if (2=2,’True’,’False’)
coalesce(X,Y) (Any) It returns the first non-null values from the list{I have a doubt on this—will it return the very first value or all non null values?} coalesce(null,null,null,1,5,null,null,6) results in 1

4. Date Functions

  • Used for performing date manipulations and date type conversions
Function Name Return Type Description Example
current_date Date It returns current date of query execution current_date() returns today’s date
unix_timestamp() BigInt It returns current unix timestamp of query execution in seconds unix_timestamp() returns current unix timestamp
year(string date) Int It returns year part of a date year(‘1994-11-24’) returns 1994
quarter(string date) Int It returns the quarter of the year of a date quarter(‘1994-11-24’) returns 4
month(string date) Int It returns the month part of a date month(‘1994-11-24’) returns 11
hour(string date) Int It returns the hour part of the timestamp hour(‘1994-11-24 12:45:23’) returns 12
minute(string date) Int It returns the minute part of the timestamp year(‘1994-11-24 12:45:23’) returns 45
to_date() String It returns date part of timestamp string to_date() returns date part
date_sub(string date, int days) String It returns subtraction of number of days to the date date_sub(‘1994-11-24’,20) returns ‘1994-11-04’
months_between(date1,date2) Double It returns the number of months between date1 and date2{months or no of days?} months_between(‘1994-11-24′,’1994-11-04’) returns 20

5. Collection Function

  • Used for transforming and retrieving parts of collection types like map, array, etc.
Function Name Return Type Description Example
size(map) Int It returns the total number of elements in the map size([‘a’:2,’b’:6]) returns 2
size(array) Int It returns the total number of elements in the array size([1,3,4,7]) returns 4
array_contains(array,value) Boolean It returns true if the array contains the value array_contains([1,3,9,8,5,4,7],5) returns TRUE
map_keys(map) Array It returns an unordered array of keys of the map map_keys([‘a’:2,’b’:6,’c’:1]) returns [‘b’,’c’,’a’]
map_valuess(map) Array It returns an unordered array of values of the map map_keys([‘a’:2,’b’:6,’c’:1]) returns [1,6,2]
sort_array(array) Array It returns a sorted array of the given array sort_array([1,3,9,8,5,4,7]) returns [1,3,4,5,7,8,9]

Conclusion

So far we have discussed different built-in functions in Hive. Compared to MapReduce, Hive is more convenient and time-saving. Anyone who has basic SQL knowledge can easily write HQLs, rather than complicated MapReduce programs for data processing.

Recommended Articles

This is a guide to Hive Built-in Functions. Here we discuss the Built-in Functions which are predefined and readily available to use in Hive. You may also look at the following article to learn more –

  1. Hive String Functions
  2. Hive Function
  3. Hive Commands
  4. Hive Architecture

The post Hive Built-in Functions appeared first on EDUCBA.



This post first appeared on Best Online Training & Video Courses | EduCBA, please read the originial post: here

Share the post

Hive Built-in Functions

×

Subscribe to Best Online Training & Video Courses | Educba

Get updates delivered right to your inbox!

Thank you for your subscription

×