Important: #105310 - Create CHAR and BINARY as fixed-length columns 

See forge#105310

Description 

TYPO3 parses ext_tables.sql files into a Doctrine DBAL object schema to define a virtual database scheme, enriched with \TYPO3\CMS\Core\Schema\DefaultTcaSchema information for TCA managed tables and fields.

Fixed- and variable-length variants have been parsed in the past, but failed to flag the column as $fixed = true for the fixed-length database field types CHAR and BINARY. This resulted in the wrong creation of these columns as VARCHAR and VARBINARY, which is now corrected.

ext_tables.sql created as (before) created as (now)
CHAR(10) VARCHAR(10) CHAR(10)
VARCHAR(10) VARCHAR(10) VARCHAR(10)
BINARY(10) VARBINARY(10) BINARY(10)
VARBINARY(10) VARBINARY(10) VARBINARY(10)

Not all relational database management systems (RDBMS) behave the same way for fixed-length columns. Implementation differences need to be respected to ensure consistent query and data behaviour across all supported database systems.

Fixed-length CHAR 

Key difference between CHAR and VARCHAR

The main difference between CHAR and VARCHAR is how the database stores character data. CHAR, which stands for character, is a fixed-length data type. It always reserves a specific amount of storage space for each value, regardless of whether the actual data occupies that space entirely. For example, if a column is defined as CHAR(10) and the word apple is stored inside of it, it will still occupy 10 characters (not just 5). Unused characters are padded with spaces.

On the other hand, VARCHAR, short for variable character, is a variable-length data type. It only uses as much storage space as needed to store the actual data without padding. Thus, storing the word apple in a VARCHAR(10) column will only occupy 5 characters.

The main difference between PostgreSQL and MySQL, MariaDB or SQLite is that PostgreSQL also returns the padded spaces for values that do not fill the full defined length (for example, apple[space][space][space][space] [space]).

In addition, these padded spaces are respected in query conditions, sorting or calculations (such as concat()). These differences make a significant impact and must be considered when using CHAR fields.

Rule of thumb for fixed-length CHAR columns

  • Use only with guaranteed fixed-length values to avoid padding.
  • For 255 or more characters, VARCHAR or TEXT must be used.

More hints for fixed-length CHAR columns

  • Ensure that stored values are fixed-length (non-space characters), for example by using hash algorithms that produce fixed-length identifiers.
  • Ensure that query statements use trim or rightPad within WHERE, HAVING or SELECT operations when values are not guaranteed to be fixed-length.

Example of differences in behaviour of fixed-length CHAR types 

The following examples illustrate how different relational database management systems handle fixed-length CHAR values, and why the behaviour must be carefully considered when storing or querying such data.

Creating a fixed-length field 
Example ext_tables.sql defining a fixed-length tt_content field
CREATE TABLE `tt_content` (
    `some_label` CHAR(10) DEFAULT '' NOT NULL,
);
Copied!
Inserting example data 

Two example rows are added below: one value fits exactly 10 characters, the other uses only 6 characters.

Adding two example rows
<?php

use TYPO3\CMS\Core\Database\ConnectionPool;
use TYPO3\CMS\Core\Utility\GeneralUtility;

$queryBuilder = GeneralUtility::makeInstance(ConnectionPool::class)
    ->getConnectionForTable('tt_content');
// adding a value with 10 chars
$queryBuilder->insert(
    'tt_content',
    [
        'some_label' => 'some-label',
    ],
    [
        'some_label' => Connection::PARAM_STR,
    ],
);
// adding a value with only 6 chars
$queryBuilder->insert(
    'tt_content',
    [
        'some_label' => 'label1',
    ],
    [
        'some_label' => Connection::PARAM_STR,
    ],
);
Copied!
Retrieving the records 
Get all records from table
<?php

use TYPO3\CMS\Core\Database\ConnectionPool;
use TYPO3\CMS\Core\Utility\GeneralUtility;

$queryBuilder = GeneralUtility::makeInstance(ConnectionPool::class)
  ->getQueryBuilderForTable('tt_content');
$rows = $queryBuilder
    ->select('uid', 'some_label')
    ->from('tt_content')
    ->executeQuery()
    ->fetchAllAssociative();
Copied!
Result differences across platforms 

The returned values differ depending on the database platform.

Result rows MySQL, MariaDB or SQLite
<?php

$rows = [
    [
        'uid' => 1,
        'some_label' => 'some-label',
    ],
    [
        'uid' => 2,
        'some_label' => 'label1',
    ],
];
Copied!
Result rows with PostgreSQL
<?php

$rows = [
    [
        'uid' => 1,
        'some_label' => 'some-label',
    ],
    [
        'uid' => 2,
        // PostgreSQL applies the fixed length to the value directly,
        // filling it up with spaces
        'some_label' => 'label1    ',
    ],
];
Copied!
Result rows difference between database platforms (commented)
 <?php

 $rows = [
     [
         'uid' => 1,
         'some_label' => 'some-label',
     ],
     [
         'uid' => 2,
-        'some_label' => 'label1',      // MySQL, MariaDB, SQLite
+        'some_label' => 'label1    ',  // PostgreSQL
     ],
 ];
Copied!
Querying trimmed versus padded values 

Using a trimmed value in a WHERE clause can match the row, but the returned value will differ depending on the database platform.

Retrieve with trimmed value
use TYPO3\CMS\Core\Database\ConnectionPool;
use TYPO3\CMS\Core\Utility\GeneralUtility;

$queryBuilder = GeneralUtility::makeInstance(ConnectionPool::class)
  ->getQueryBuilderForTable('tt_content');
$rows = $queryBuilder
    ->select('uid', 'some_label')
    ->from('tt_content')
    ->where(
        $queryBuilder->eq(
            'some_label',
            $queryBuilder->createNamedParameter('label1'), // trimmed value!
        ),
    )
    ->executeQuery()
    ->fetchAllAssociative();

// $rows contains the record for
// PostgreSQL: $rows = [['uid' => 2, 'some_label' => 'label1    ']];
// Others....: $rows = [['uid' => 2, 'some_label' => 'label1']];
Copied!
Enforcing trimmed values in queries 
Retrieve with enforced trimmed value.
<?php

use Doctrine\DBAL\Platforms\TrimMode;
use TYPO3\CMS\Core\Database\ConnectionPool;
use TYPO3\CMS\Core\Utility\GeneralUtility;

$queryBuilder = GeneralUtility::makeInstance(ConnectionPool::class)
  ->getQueryBuilderForTable('tt_content');
$rows = $queryBuilder
    ->select('uid')
    ->addSelectLiteral(
        $queryBuilder->expr()->as(
            $queryBuilder->expr()->trim(
                'fixed_title',
                TrimMode::TRAILING,
                ' '
            ),
            'fixed_title',
        ),
    )
    ->from('tt_content')
    ->where(
        $queryBuilder->eq(
            'some_label',
            $queryBuilder->createNamedParameter('label1'),
        ),
    )
    ->executeQuery()
    ->fetchAllAssociative();

// $rows contains the record for
// PostgreSQL: $rows = [['uid' => 2, 'some_label' => 'label1']];
// Others....: $rows = [['uid' => 2, 'some_label' => 'label1']];
// and ensures the same content across all supported database systems.
Copied!
Querying space-padded values in PostgreSQL 
Retrieve with space-padded value for PostgreSQL does not retrieve the record
<?php

use TYPO3\CMS\Core\Database\ConnectionPool;
use TYPO3\CMS\Core\Utility\GeneralUtility;

// PostgreSQL specific query!

$queryBuilder = GeneralUtility::makeInstance(ConnectionPool::class)
  ->getQueryBuilderForTable('tt_content');
$rows = $queryBuilder
    ->select('uid', 'some_label')
    ->from('tt_content')
    ->where(
        $queryBuilder->eq(
            'some_label',
            $queryBuilder->createNamedParameter('label1    '), // untrimmed value!
        ),
    )
    ->executeQuery()
    ->fetchAllAssociative();

// $rows === []
Copied!

Additional tools for consistent behaviour 

Additional ExpressionBuilder methods can be used to ensure consistent behaviour across all supported platforms:

  • \TYPO3\CMS\Core\Database\Query\Expression\ExpressionBuilder::trim()
  • \TYPO3\CMS\Core\Database\Query\Expression\ExpressionBuilder::rightPad()

Recommendation 

CHAR and BINARY fields can be used for storage or performance adjustments, but only when composed data and queries account for the differences between database systems.

Otherwise, the safe bet is to consistently use VARCHAR and VARBINARY column types.