InboundEmail import fails with long UTF-8 subject (multibyte truncation bug)

  • dimyy
    Active Community Member
    • Jun 2018
    • 599

    #1

    EspoCRM version: 9.3 (also affects earlier versions)
    DB: MariaDB

    Summary: Importing inbound emails with long UTF-8 subjects (e.g. Cyrillic, Chinese, Arabic) fails with the SQL error SQLSTATE[22007]: Incorrect string value, because substr() truncates by bytes and can cut a multibyte character in half.

    Steps to reproduce:
    1. Configure InboundEmail
    2. Receive an email whose subject is longer than 127 multibyte characters (e.g. Cyrillic, where each letter takes 2 bytes in UTF-8, so 128 letters already exceed 255 bytes)
    3. The email import fails

    Error from log:
    SQLSTATE[22007]: Invalid datetime format: 1366 Incorrect string value: '\xD1'
    for column `email`.`name` at row 1


    Root cause:

    application/Espo/Core/Mail/Importer/DefaultImporter.php, lines 423–424 (https://github.com/espocrm/espocrm/b...ter.php#L423):
    if (strlen($subject) > self::SUBJECT_MAX_LENGTH) {
        $subject = substr($subject, offset: 0, length: self::SUBJECT_MAX_LENGTH);
    }


    strlen() returns the byte count, not the character count. A 150-character Cyrillic subject is ~300 bytes, which exceeds SUBJECT_MAX_LENGTH (255) even though the character count does not. substr() then truncates at byte 255, which may land in the middle of a multibyte UTF-8 sequence (e.g. \xD1\x83 → only \xD1 remains). The database rejects the resulting invalid UTF-8 string.

    Note: VARCHAR(255) in MySQL/MariaDB holds up to 255 characters, not bytes, so the original 150-character string would fit without any truncation.
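
The byte-versus-character mismatch described above can be reproduced outside EspoCRM. The following is a minimal standalone sketch, not the importer code: MAX_LENGTH is a stand-in for SUBJECT_MAX_LENGTH (taken as 255, per the post), the subject is made up, and the mbstring extension is assumed to be installed.

```php
<?php
// Standalone reproduction sketch; assumes the mbstring extension is loaded.
// MAX_LENGTH is a hypothetical stand-in for SUBJECT_MAX_LENGTH (255 per the post).
const MAX_LENGTH = 255;

// 150 copies of Cyrillic "у" (U+0443), which is the 2-byte sequence D1 83 in UTF-8.
$subject = str_repeat("\u{0443}", 150);

$byteLength = strlen($subject);              // counts bytes
$charLength = mb_strlen($subject, 'UTF-8');  // counts characters

// Byte-based cut: 255 is odd, so the last 2-byte character is split.
$byteCut = substr($subject, 0, MAX_LENGTH);

// Character-based cut: 150 characters is under the limit, nothing is removed.
$charCut = mb_substr($subject, 0, MAX_LENGTH, 'UTF-8');

// Is each result still valid UTF-8?
$byteCutValid = mb_check_encoding($byteCut, 'UTF-8');
$charCutValid = mb_check_encoding($charCut, 'UTF-8');

// Hex value of the dangling last byte left by the byte-based cut.
$lastByte = bin2hex(substr($byteCut, -1));

printf("bytes=%d chars=%d\n", $byteLength, $charLength);
printf("byte cut valid: %s, last byte: %s\n", var_export($byteCutValid, true), $lastByte);
printf("char cut valid: %s\n", var_export($charCutValid, true));
```

The dangling lead byte left by the byte-based cut is the same \xD1 that appears in the logged error.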

    Suggested fix:
    if (mb_strlen($subject) > self::SUBJECT_MAX_LENGTH) {
        $subject = mb_substr($subject, 0, self::SUBJECT_MAX_LENGTH);
    }


    This counts and truncates by characters rather than bytes, so the result stays valid UTF-8 (provided mbstring's internal encoding is UTF-8, which is the PHP default).
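
As an illustration of the suggested fix, here is a small sketch of the same logic as a free-standing function. truncateSubject() is a hypothetical helper, not an EspoCRM method, and passing 'UTF-8' explicitly is an assumption made so that the behaviour does not depend on the configured mb_internal_encoding().

```php
<?php
// Hypothetical helper for illustration only; not part of EspoCRM.
// Assumes the mbstring extension is loaded and the input is valid UTF-8.
function truncateSubject(string $subject, int $maxLength): string
{
    // Compare by character count, not byte count.
    if (mb_strlen($subject, 'UTF-8') <= $maxLength) {
        return $subject;
    }

    // Cut on a character boundary so no multibyte sequence is split.
    return mb_substr($subject, 0, $maxLength, 'UTF-8');
}

// 300 Cyrillic characters (600 bytes): long enough to need truncation.
$long = str_repeat("\u{0443}", 300);
$truncated = truncateSubject($long, 255);

// A short ASCII subject passes through unchanged.
$unchanged = truncateSubject('Hello', 255);

printf(
    "chars=%d bytes=%d valid=%s\n",
    mb_strlen($truncated, 'UTF-8'),
    strlen($truncated),
    var_export(mb_check_encoding($truncated, 'UTF-8'), true)
);
```

The truncated value holds 255 characters in 510 bytes, which is consistent with the note above that the column limit is counted in characters.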
  • yuri
    EspoCRM product developer
    • Mar 2014
    • 9680

    #2
    Thanks for the report and the investigation. https://github.com/espocrm/espocrm/issues/3586
