Feature: #104631 - Add UNION Clause support to the QueryBuilder

See forge#104631

Description

The UNION clause is used to combine the result sets of two or more SELECT statements, which all database vendors support, each with their own specific variations.

However, there is a commonly shared subset that works across all of them:

SELECT column_name(s) FROM table1
WHERE ...

UNION <ALL | DISTINCT>

SELECT column_name(s) FROM table2
WHERE ...

ORDER BY ...
LIMIT x OFFSET y
Copied!

with shared requirements:

  • Each SELECT must return the same fields in number, naming and order.
  • Each SELECT must not have ORDER BY, expect MySQL allowing it to be used as sub query expression encapsulated in parentheses.

Generic UNION clause support has been contributed to Doctrine DBAL and is included since Release 4.1.0 which introduces two new API method on the \QueryBuilder:

  • union(string|QueryBuilder $part) to create first UNION query part
  • addUnion(string|QueryBuilder $part, UnionType $type = UnionType::DISTINCT) to add additional UNION (ALL|DISTINCT) query parts with the selected union query type.

TYPO3 decorates the Doctrine DBAL \QueryBuilder to provide for most API methods automatic quoting of identifiers and values and to apply database restrictions automatically for SELECT queries.

The Doctrine DBAL API has been adopted now to provide the same surface for the TYPO3 \TYPO3\CMS\Core\Database\Query\QueryBuilder and the intermediate \TYPO3\CMS\Core\Database\Query\ConcreteQueryBuilder to make it easier to create UNION clause queries. The API on both methods allows to provide dedicated QueryBuilder instances or direct queries as strings in case it is needed.

In queries containing subqueries, only named placeholders (such as :username) can be used and must be registered on the outermost QueryBuilder object, similar to advanced query creation with SUB QUERIES.

UnionType::DISTINCT and UnionType::ALL

Each subsequent part needs to be defined either as UNION DISTINCT or UNION ALL which could have not so obvious effects.

For example, using UNION ALL for all parts in between except for the last one would generate larger result sets first, but discards duplicates when adding the last result set. On the other side, using UNION ALL tells the query optimizer not to scan for duplicates and remove them at all which can be a performance improvement - if you can deal with duplicates it can be ensured that each part does not produce same outputs.

Example: Compose a UNION clause query

Custom service class using a UNION query to retrieve data.
use TYPO3\CMS\Core\Database\Connection;
use TYPO3\CMS\Core\Database\ConnectionPool;
use TYPO3\CMS\Core\Database\Query\QueryBuilder;

final readonly MyService {
  public function __construct(
    private ConnectionPool $connectionPool,
  ) {}

  public function executeUnionQuery(
    int $pageIdOne,
    int $pageIdTwo,
  ): ?array {
    $connection = $this->connectionPool->getConnectionForTable('pages');
    $unionQueryBuilder = $connection->createQueryBuilder();
    $firstPartQueryBuilder = $connection->createQueryBuilder();
    $firstPartQueryBuilder->getRestrictions()->removeAll();
    $secondPartQueryBuilder = $connection->createQueryBuilder();
    $secondPartQueryBuilder->getRestrictions()->removeAll();
    $expr = $unionQueryBuilder->expr();

    $firstPartQueryBuilder
      ->select('uid', 'pid', 'title')
      ->from('pages')
      ->where(
        $expr->eq(
          'pages.uid',
          $unionQueryBuilder->createNamedParameter($pageIdOne),
      );
    $secondPartQueryBuilder
      ->select('uid', 'pid', 'title')
      ->from('pages')
      ->where(
        $expr->eq(
          'pages.uid',
          $unionQueryBuilder->createNamedParameter($pageIdOne),
      );

      return $unionQueryBuilder
        ->union($firstPartQueryBuilder)
        ->addUnion($secondPartQueryBuilder, UnionType::DISTINCT)
        ->orderBy('uid', 'ASC')
        ->executeQuery()
        ->fetchAllAssociative();
  }
}
Copied!

This would create the following query for MySQL with $pageIdOne = 100 and $pageIdTwo = 10:

    (SELECT `uid`, `pid`, `title` FROM pages WHERE `pages`.`uid` = 100)
UNION
    (SELECT `uid`, `pid`, `title` FROM pages WHERE `pages`.`uid` = 10)
ORDER BY `uid` ASC
Copied!

Impact

Extension authors can use the new QueryBuilder methods to build more advanced queries.