Tuesday, March 30, 2021

hive with clause

the subquery twice or once) but it works. Here is the fiddle

WITH subquery as
       ( 
SELECT
    First.deal_id
    ,COALESCE( First.exchange, Second.exchange ) as exchange_id
    ,COALESCE( First.publisher, Second.publisher ) as pub_id
    ,COUNT (DISTINCT(case when Second.user_id is null then First.user_id else null END)) AS Incremental
    ,SUM (First.imps) AS First_imps
    ,SUM (Second.imps) AS Second_imps
    FROM
        (
            SELECT
                a.deal_id
                ,a.exchange
                ,a.publisher
                ,a.user_id
                ,1 AS imps
            FROM
                T1 a 
            WHERE
                a.deal_id >= 1
        ) First 
        FULL OUTER JOIN (
            SELECT
                a.exchange
                ,a.publisher
                ,a.user_id
                ,1 AS imps
            FROM
                T1 a
            WHERE
            a.deal_id = 0
        ) Second
        ON (
            First.exchange = Second.exchange
            AND First.publisher = Second.publisher
            AND First.user_id = Second.user_id
        )
        GROUP BY
        First.deal_id
        ,COALESCE( First.exchange, Second.exchange )
        ,COALESCE( First.publisher, Second.publisher )
        )

SELECT
deal.deal_id,
deal.exchange_id,
deal.pub_id,
sum(deal.incremental),
sum(deal.first_imps),
sum(coalesce(deal.second_imps, 0) + coalesce(oa.second_imps,0))
FROM 
subquery deal
LEFT JOIN 
subquery oa 
ON (deal.exchange_id = oa.exchange_id
AND deal.pub_id = oa.pub_id
AND oa.deal_id is null)
WHERE deal.deal_id is not null
GROUP BY
deal.deal_id,
deal.exchange_id,
deal.pub_id
;

hive nvl, colease, when, if

Hadoop Hive supports the various Conditional functions such as IF, CASE, COALESCE, NVL, DECODE etc. You can use these function for testing equality, comparison operators and check if value is null. Following diagram shows various Hive Conditional Functions:

Hadoop Hive Conditional Functions

Hive Conditional Functions

Below table describes the various Hive conditional functions:

Conditional FunctionDescription
IF(boolean testCondition, T valueTrue, T valueFalseOrNull);This is the one of best Hive Conditional Functions and is similar to the IF statements in other programming languages. The IF Hive Conditional functions tests an expression and returns a corresponding result depending on whether the result is true, false or null.
isnull( a )Returns true if a is NULL and false otherwise.
isnotnull ( a )This function is a negation of isnull function. Returns true if a is not NULL and false otherwise.
CASE WHEN a THEN b [WHEN c THEN d]… [ELSE e] ENDFunction Tests whether any of a sequence of expressions is true, and returns a corresponding result for the first true expression. You can use OR, IN, REGEXP in the CASE expressions.
CASE a WHEN b THEN c [WHEN d THEN e]… [ELSE f] ENDFunction Tests whether any of a sequence of expressions is true, and returns a corresponding result for the first true expression. You can use OR, IN, REGEXP in the CASE expressions.
NVL(arg1, arg2)This function will test if expression is null, it’ll return expression if result is not null otherwise second argument is returned.
coalesce(value1, value 2, …)Returns the first non-null value for list of values provided as arguments.
decode(<expr>, <search1>,<result1>, …<search N>, <result N>, <default>)Compares an expression to one or more possible values, and returns a corresponding result when a match is found.

Hive Conditional Functions Examples

Hive IF condition function example

select IF(1=1,'TRUE','FALSE') as IF_CONDITION_TEST;

Hive ISNULL condition function example

select isnull( NULL );

Hive ISNOTNULL condition function example

select isnotnull( NULL );

Hive CASE conditional function examples

select case x 
when 1 then 'one' 
when 2 then 'two' 
when 0 then 'zero' 
else 'out of range' 
end from t1;

select case 
when dayname(now()) in ('Saturday','Sunday') then 'result undefined on weekends' 
when x > y then 'x greater than y' 
when x = y then 'x and y are equal' 
when x is null or y is null then 'one of the columns is null' 
else null end 
from t1;

Hive NVL conditional function examples

select nvl(null,'value is null');

Hive COALESCE conditional function examples

select coalesce(null,'a',null,'b');

Hive DECODE conditional function examples

SELECT event, decode(day_of_week, 1, "Monday", 2, "Tuesday", 3, "Wednesday", 4, "Thursday", 5, "Friday", 6, "Saturday", 7, "Sunday", "Unknown day") FROM calendar;