How can I use another value when selected column value was NULL in BigQuery?
BigQuery(BQ) is very useful for data analysis or processing.
It is good at handling huge data. It returns summary result in short time.
But data is not always clean.
Sometimes it contains NULL in specific column.
In that case we should use another value.
Then how can we use alternative value ?
So today I introduce about "How to use alternative column value instead of null in BigQuery".
Prepare data
First, we have to prepare data.
Load this CSV file into table null_sample
.
col1,col2,col3
a,b,c
,b,c
,,c
a,b,
a,,
Then we can see table like below.
Row | col1 | col2 | col3 |
---|---|---|---|
1 | null | b | c |
2 | null | null | c |
3 | a | b | c |
4 | a | b | null |
5 | a | null | null |
How to use alternative column value instead of null in BigQuery
There are 2 methods to use another value instaed of NULL.
- IFNULL
- COALESCE
I will introduce each of them.
IFNULL
First you may imagine IF
function.
There is similar function IFNULL
that is for NULL values.
In order to use IFNULL()
, you can set column that you want to check NULL as first parameter.
And set alternative value as second parameter.
SELECT
col1,
col2,
col3,
IFNULL(col1,col2) as f
FROM test.null_sample
Result is like below.
Row | col1 | col2 | col3 | f |
---|---|---|---|---|
1 | null | b | c | b |
2 | null | null | c | null |
3 | a | b | c | a |
4 | a | b | null | a |
5 | a | null | null | a |
As you see, in row 1 and 2, function returned col2 value.
Then there is a problem.
If we want to use col3 value when col1 and col2 are NULL, what should we do ?
In that case we use nested IFNULL
.
SELECT
col1,
col2,
col3,
IFNULL(col1,IFNULL(col2,col3)) as f
FROM test.null_sample
Row | col1 | col2 | col3 | f |
---|---|---|---|---|
1 | null | b | c | b |
2 | null | null | c | c |
3 | a | b | c | a |
4 | a | b | null | a |
5 | a | null | null | a |
Nested structure is hard to read.
So there is another solution.
COALESCE
In order to use another value instead of NULL, we can also use COALESCE
.
In COALESCE
, it accepts multiple parameters.
And it returns first not-null-value.
So it doesn't require nested structure.
SELECT
col1,
col2,
col3,
COALESCE(col1,col2,col3) as f
FROM test.null_sample
Row | col1 | col2 | col3 | f |
---|---|---|---|---|
1 | null | b | c | b |
2 | null | null | c | c |
3 | a | b | c | a |
4 | a | b | null | a |
5 | a | null | null | a |
This SQL is better to read.
Conclusion
Today I explained about "How to use alternative column value instead of null in BigQuery".
There are 2 methods to use another value instead of NULL.
- IFNULL
- COALESCE
We can use IFNULL
like IF
function. But it can check only 1 expression.
COALESCE
accepts 2 or more parameters and returns first non-null-value.
So it is useful to check multiple columns.