子查询_DML_SQL 手册_分布式关系型数据库 DRDS_阿里云技术文档

与 MySQL 子查询的差异

在 MySQL 支持的子查询基础上，增加了以下限制，使用时需要注意。

支持 WHERE 条件中的子查询以及列子查询，不支持HAVING 子句中的子查询，JOIN ON 条件中的子查询；

等号操作行符的标量子查询(The Subquery as Scalar Operand)不支持 ROW 语法。例如：

支持：select * from tb1 where id in (select id from tb2)
支持：select * from tb1 where (id, name) in (select id, name from tb2)  
支持：select * from tb1 where row(id, name) in (select id, name from tb2)
支持：select * from tb1 where row(id, name) not in (select id, name from tb2)    
不支持：select * from tb1 where row(id, name) = (select id, name from tb2)

目前仅支持 SELECT 语句中的子查询。 DELETE 语句中的子查询，目前不支持。例如：
```
DELETE FROM t1 WHERE ROW(c1,c2) IN (
SELECT c1, c2 FROM t2
);
```

其它 INSERT, UPDATE, SET 中的子查询皆不支持。此处不再举出示例。

效率

DRDS 子查询部分只能转化为比较低效的 APPLY 执行器执行，在实际使用中请尽量避免以下类型的子查询。

WHERE 条件中 OR 与子查询共存时，执行效率会依外表数据情况大幅降低。例如：

高效：select * from tb1 where id in (select id from tb2)
高效：select * from tb1 where id in (select id from tb2) and id>3
低效：select * from tb1 where id in (select id from tb2) or  id>3

关联子查询(Correlated Subqueries)的关联项中带函数或非等号运算符。例如：

高效：select * from tb1 a where id in
      (select id from tb2 b where a.name=b.name)
低效：select * from tb1 a where id in
      (select id from tb2 b where UPPER(a.name)=b.name)
低效：select * from tb1 a where id in
      (select id from tb2 b where a.decimal_test=abs(b.decimal_test))
低效：select * from tb1 a where id in
      (select id from tb2 b where a.name!=b.name)
低效：select * from tb1 a where id in
      (select id from tb2 b where a.name>=b.name)

关联子查询(Correlated Subqueries)关联项与其它条件的逻辑运算符为 OR。例如：

高效：select * from tb1 a where id in
      (select id from tb2 b where a.name=b.name 
                                  and b.date_test<'2015-12-02')
低效：select * from tb1 a where id in
      (select id from tb2 b where a.name=b.name 
                                  or b.date_test<'2015-12-02')
低效：select * from tb1 a where id in
      (select id from tb2 b where a.name=b.name 
                                  or b.date_test=a.date_test)

标量子查询(The Subquery as Scalar Operand)带关联项。例如：

高效：select * from tb1 a where id >
      (select id from tb2 b where b.date_test<'2015-12-02')
低效：select * from tb1 a where id >
      (select id from tb2 b where a.name=b.name 
                                  and b.date_test<'2015-12-02')

跨嵌套层的子查询关联项。例如：

SQL 多层嵌套，但每层子查询关联项仅与次级上层关联，此类高效。

高效：select * from tb1 a where id in(select id from tb2 b 
    where a.name=b.name and 
    exists (select name from tb3 c where b.address=c.address))

SQL 多层嵌套，但是表 c 的子查询关联项中与表 a 的列进行了关联，此类低效。

低效：select * from tb1 a where id in(select id from tb2 b 
    where a.name=b.name and 
    exists (select name from tb3 c where a.address=c.address))

子查询中包含 GROUP BY，请确保 GROUP BY 的分组列包含关联项。例如：

SQL 子查询中包含聚合函数和关联项，关联项 b.pk 被包含中分组列(pk)中，此类 SQL 高效。

高效：select * from tb1 a where exists 
(select pk from tb2 b 
            where a.pk=b.pk and  b.date_test='2003-04-05' 
            group by pk);

SQL 子查询中包含聚合函数和关联项，关联项 b.date_test 不被包含中分组列(pk)中，此类 SQL 低效。

低效：select * from tb1 a where exists 
(select pk from tb2 b 
            where a.date_test=b.date_test and b.date_test='2003-04-05' 
            group by pk);

目前支持的子查询类别

Comparisons Using Subqueries

最常见的子查询类别，语法为：

non_subquery_operand comparison_operator (subquery)
comparison_operator ： =  >  <  >=  <=  <>  !=  <=> like

例如：

... WHERE 'a' = (SELECT column1 FROM t1)

注意：目前仅支持子查询在比较运算符的右边。

Subqueries with ANY, ALL, IN/NOT IN, EXISTS/NOT EXISTS

语法：

operand comparison_operator ANY (subquery)
operand comparison_operator ALL (subquery)
operand IN (subquery)
operand NOT IN (subquery)
operand EXISTS (subquery)
operand NOT EXISTS (subquery)
comparison_operator:=  >  <  >=  <=  <>  !=

ANY : 如果子查询返回的任意一行满足 ANY 前的表达式，返回 TRUE，否则返回 FALSE。ALL：如果子查询返回所有行都满足 ALL 前的表达式，返回 TRUE，否则返回 FALSE。IN : IN 在子查询前使用时，与 =ANY 是等价的。例如：

SELECT s1 FROM t1 WHERE s1 = ANY (SELECT s1 FROM t2);
SELECT s1 FROM t1 WHERE s1 IN    (SELECT s1 FROM t2);

NOT IN：NOT IN 在子查询前使用时，与 <>ALL 是等价的。例如：

SELECT s1 FROM t1 WHERE s1 <> ALL (SELECT s1 FROM t2);
SELECT s1 FROM t1 WHERE s1 NOT IN (SELECT s1 FROM t2);

EXISTS/NOT EXISTS：如果子查询返回任意行，EXISTS 子查询结果为 TRUE，NOT EXISTS 子查询结果为 FALSE。如果返回为空，则结果反过来。例如：

SELECT column1 FROM t1 WHERE EXISTS (SELECT * FROM t2);

如果子查询中包含任意行，即使只包含 NULL 的行值，WHERE 条件也会返回 TRUE。

Row Subqueries

Row Subqueries 支持以下表达式：

=  >  <  >=  <=  <>  !=  <=>

例如：

SELECT * FROM t1
  WHERE (col1,col2) = (SELECT col3, col4 FROM t2 WHERE id = 10);
SELECT * FROM t1
  WHERE ROW(col1,col2) = (SELECT col3, col4 FROM t2 WHERE id = 10);

以上两个 SQL 是等价的。只有满足以下条件时，t1 表的数据行才会返回：

子查询 (SELECT col3, col4 FROM t2 WHERE id = 10 ) 返回的仅有一行记录。返回多行会报错。
子查询返回的 col3, col4 与主表的 col1, col2 相等。

Correlated Subqueries

Correlated Subqueries 是指子查询中包含对外层查询表的引用。例如：

SELECT * FROM t1
  WHERE column1 = ANY (SELECT column1 FROM t2
                       WHERE t2.column2 = t1.column2);

示例 SQL 子查询中并没有包含 t1 表及其列名 column2，此时会向上一层寻找 t1 表的引用。

注意：将 Correlated Subqueries 改写为 JOIN 有可能会提高其性能。

Derived Tables (Subqueries in the FROM Clause)

Derived Tables 是指在 FROM 子句中的子查询：

SELECT ... FROM (subquery) [AS] tbl_name ...

例如：

CREATE TABLE t1 (s1 INT, s2 CHAR(5), s3 FLOAT);
INSERT INTO t1 VALUES (1,'1',1.0);
INSERT INTO t1 VALUES (2,'2',2.0);
SELECT sb1,sb2,sb3
  FROM (SELECT s1 AS sb1, s2 AS sb2, s3*2 AS sb3 FROM t1) AS sb
  WHERE sb1 > 1;

查询的结果为：2, '2', 4

从场景举例：假如现在需要分组数据 SUM 后的平均值，直接使用以下 SQL 无法得到想要的结果。

SELECT AVG(SUM(column1)) FROM t1 GROUP BY column1;

此时可使用 Derived Tables 拿到需要的信息：

SELECT AVG(sum_column1)
  FROM (SELECT SUM(column1) AS sum_column1
        FROM t1 GROUP BY column1) AS t1;

注意：

Derived Tables 必须拥有一个别名。
Derived Tables 可以返回一个标量，列，行或表。
Derived Tables 不可以成为 Correlated Subqueries，即不能包含子查询外部表的引用。

子查询_DML_SQL 手册_分布式关系型数据库 DRDS

子查询