Character sets

All TYPO3 websites use UTF-8 as their character set. UTF-8 gives you consistent data storage and lets you store any glyph from any language without worrying further about character sets.

Database field lengths

The TYPO3 Core is compatible with UTF-8.

However, you might find that the database field lengths of some extensions need to be extended. For example, each Chinese glyph takes three bytes in UTF-8. If a field's length is counted in bytes, as it is when UTF-8 data is stored in a column that is not itself declared with a UTF-8 character set, then a varchar(10) holds only the first 3 of 10 Chinese glyphs an author enters (they require 9 bytes, and a fourth would not fit). UTF-8 is tricky in this respect: ASCII characters need only 1 byte, European special characters usually need 2, and most Asian characters need 3, while some characters, such as emoji and rarer CJK glyphs, need 4 bytes, which is the maximum in UTF-8.
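
The byte counts can be checked directly. The following is a minimal Python sketch, not TYPO3 code: the helper names utf8_len and truncate_to_bytes are hypothetical, and the truncation only imitates what a field limited to a number of bytes would do. Real database behavior depends on the column's character set and the server's SQL mode.

```python
def utf8_len(text: str) -> int:
    """Return the number of bytes the text occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))


def truncate_to_bytes(text: str, max_bytes: int) -> str:
    """Keep only the whole characters that fit into max_bytes of UTF-8.

    Hypothetical stand-in for a byte-limited field: the encoded text is
    cut at max_bytes and any incomplete trailing character is dropped.
    """
    encoded = text.encode("utf-8")[:max_bytes]
    return encoded.decode("utf-8", errors="ignore")


# One sample character per category mentioned above.
samples = {"ASCII": "a", "European": "ä", "Chinese": "中", "emoji": "😀"}
byte_lengths = {name: utf8_len(char) for name, char in samples.items()}

# Ten Chinese glyphs squeezed into a 10-byte field.
stored = truncate_to_bytes("中" * 10, 10)

if __name__ == "__main__":
    for name, size in byte_lengths.items():
        print(f"{name}: {size} byte(s)")
    print(f"glyphs stored in 10 bytes: {len(stored)}")
```

Running the same check against the longest values an extension is expected to store gives a rough idea of how much a byte-limited field would need to grow.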