Attention

TYPO3 v8 reached its end-of-life on March 31st, 2020 and is no longer maintained by the community.

You can order Extended Long Term Support (ELTS) here: TYPO3 ELTS.

Character sets

All TYPO3 CMS websites use UTF-8 as their character set. Using UTF-8 gives you consistent data storage and lets you store any glyph from any language without having to think about charsets.

Charset in frontend (advanced)

UTF-8 is also used in the frontend automatically, and this is the recommended setting. However, it is possible to change the charset settings in TypoScript.

config.metaCharset defines the character set of the HTML output. If it is set to a value other than UTF-8, all content is converted to that charset before output, although it is still processed internally in UTF-8. This is useful for special cases such as Japanese websites that use e.g. "shift-jis" for content delivery.

If config.metaCharset is not UTF-8, GET / POST data is automatically converted from config.metaCharset to UTF-8.
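The round trip described above can be illustrated with a small Python sketch. This is not TYPO3 code: the constant META_CHARSET and the helpers to_internal and to_output are hypothetical names chosen for the illustration, and "shift_jis" is Python's codec name, standing in for whatever value config.metaCharset holds.

```python
# Illustrative sketch only; it mimics the conversion idea, not TYPO3's implementation.
META_CHARSET = "shift_jis"  # assumed stand-in for the config.metaCharset value


def to_internal(raw: bytes, meta_charset: str = META_CHARSET) -> str:
    """Decode incoming GET/POST bytes from the delivery charset."""
    return raw.decode(meta_charset)


def to_output(text: str, meta_charset: str = META_CHARSET) -> bytes:
    """Encode internally processed text into the delivery charset."""
    return text.encode(meta_charset)


# Bytes as a browser would submit them from a Shift_JIS page.
incoming = "日本語".encode("shift_jis")

internal = to_internal(incoming)   # handled as Unicode (UTF-8) internally
delivered = to_output(internal)    # converted back just before output

print(len(internal.encode("utf-8")), "bytes as UTF-8")
print(len(delivered), "bytes as Shift_JIS")
```

The same text occupies a different number of bytes in each charset, which is why the conversion has to happen at both the input and the output boundary.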

Database field lengths

The TYPO3 CMS Core is compatible with UTF-8.

You might, however, face the problem that the database field lengths of some extensions must be extended. For example, each Chinese glyph takes three bytes. So if a field is a varchar(10) whose length is counted in bytes and an author enters 10 Chinese glyphs, only the first 3 glyphs will be stored (since they take up 9 bytes). UTF-8 is tricky in this respect because all ASCII chars take only 1 byte, European special chars typically take up 2 and most Asian glyphs take up 3. Some glyphs, such as emoji and rare CJK characters, take 4 bytes, which is the maximum for a single UTF-8 character.
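To make the arithmetic concrete, here is a minimal Python sketch that measures the UTF-8 size of a few sample characters and shows what fits into a 10-byte limit. The helper names utf8_len and truncate_to_bytes are made up for this example, and the truncation only models a byte-limited field; it does not reproduce how any particular database behaves.

```python
def utf8_len(text: str) -> int:
    """Return the number of bytes the text occupies in UTF-8."""
    return len(text.encode("utf-8"))


def truncate_to_bytes(text: str, max_bytes: int) -> str:
    """Keep only the leading whole characters that fit into max_bytes of UTF-8."""
    kept = []
    used = 0
    for ch in text:
        size = utf8_len(ch)
        if used + size > max_bytes:
            break
        kept.append(ch)
        used += size
    return "".join(kept)


# One sample character per size class.
samples = {"ASCII": "a", "European": "\u00e9", "Chinese": "中", "Emoji": "😀"}
for label, ch in samples.items():
    print(label, utf8_len(ch))

# Ten Chinese glyphs need 30 bytes; only three of them fit into 10 bytes.
print(truncate_to_bytes("一二三四五六七八九十", 10))
```

A field sized for ASCII text therefore holds only about a third as many Chinese glyphs when its limit is expressed in bytes.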

For more information on how to set up the database to use UTF-8, see the MySQL reference manual:

http://dev.mysql.com/doc/refman/5.7/en/charset-unicode.html