Spark SQL NOT operator and Null-aware predicate sub-queries cannot be used in nested conditions

The following Spark SQL query works fine:
((country IN (FROM medium_countries) ) AND (country IN (FROM big_countries))) AND EMAIL IS NOT NULL
and the following one works fine:
FALSE = ((country IN (FROM medium_countries)) AND (country IN (FROM big_countries))) AND EMAIL IS NOT NULL
but when I add the NOT operator, like:
NOT ((country IN (FROM medium_countries)) AND (country IN (FROM big_countries))) AND EMAIL IS NOT NULL
it fails with the following error:
Exception in thread "main" org.apache.spark.sql.AnalysisException: Null-aware predicate sub-queries cannot be used in nested conditions: (NOT (country#22 IN (list#99 ) && country#22 IN (list#100 )) && isnotnull(EMAIL#20));;
Filter (NOT (country#22 IN (list#99 ) && country#22 IN (list#100 )) && isnotnull(EMAIL#20))
: :- SubqueryAlias `medium_countries`
: : +- Project [value#6 AS country#8]
: : +- LocalRelation [value#6]
: +- SubqueryAlias `big_countries`
: +- Project [value#1 AS country#3]
: +- LocalRelation [value#1]
+- SubqueryAlias `users`
+- Project [name#19, email#20, phone#21, country#22, monotonically_increasing_id() AS UniqueID#27L]
+- Project [_1#14 AS name#19, _2#15 AS email#20, _3#16 AS phone#21, _4#17 AS country#22]
+- LocalRelation [_1#14, _2#15, _3#16, _4#17]
Could you please explain why NOT is not working there?
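For context, the "null-aware" wording in the message presumably refers to SQL's three-valued logic, where IN against a list containing NULL can evaluate to NULL rather than FALSE. The sketch below is plain Python, not Spark: the helper names (sql_in, sql_and, sql_not, naive_in) and the sample country values are hypothetical, and None stands in for SQL NULL. It only illustrates how a negated IN can give a different answer once NULLs are involved; it does not claim to show how Spark plans the query.

```python
# None stands in for SQL NULL (unknown). All names here are illustrative only.

def sql_not(a):
    # NOT NULL is NULL; otherwise ordinary negation.
    return None if a is None else (not a)

def sql_and(a, b):
    # FALSE dominates; otherwise any NULL makes the result NULL.
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None
    return True

def sql_in(value, rows):
    # SQL semantics of `value IN (subquery)` with NULL handling.
    rows = list(rows)
    if not rows:
        return False          # IN over an empty set is FALSE
    if value is None:
        return None           # NULL compared with anything is unknown
    if value in rows:
        return True
    if any(r is None for r in rows):
        return None           # no match, but a NULL row might have matched
    return False

def naive_in(value, rows):
    # Two-valued "does a matching row exist" check that ignores NULLs.
    return value is not None and value in [r for r in rows if r is not None]

def filter_keeps(predicate):
    # A WHERE clause keeps a row only when the predicate is TRUE.
    return predicate is True

# Hypothetical sample data: one list contains a NULL.
medium_countries = ["DE", None]
big_countries = ["DE", "US"]
country = "US"

three_valued = sql_not(sql_and(sql_in(country, medium_countries),
                               sql_in(country, big_countries)))
two_valued = not (naive_in(country, medium_countries)
                  and naive_in(country, big_countries))

# three_valued is None, so the row would be dropped by a filter;
# two_valued is True, so a NULL-unaware evaluation would keep it.
```

Without the NOT, a NULL result and a FALSE result are both simply filtered out, so the distinction does not change which rows survive. Under the NOT, the two cases diverge, which may be why the analyzer treats that form specially.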
apache-spark apache-spark-sql
edited Nov 10 at 18:21
asked Nov 10 at 17:15


alexanoid