(2.0.0b4) iconv_int_utf8() does not properly convert to utf8


Post by fivenote » Tue Apr 08, 2008 2:10 pm

My system does not have iconv(), so getid3 loads its replacement module: module.lib.iconv_replacement.php.

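I was getting garbage characters when converting tags to UTF-8. I looked at iconv_int_utf8() and found that the bit math does not match the UTF-8 spec at http://linux.die.net/man/7/utf8: in the three- and four-byte cases the continuation bytes are ORed with 0xC0 instead of 0x80, and the shifted values are not masked down to the bits that belong in each byte.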
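To make the failure concrete, here is a small sketch that hand-encodes the middle byte of one three-byte character, U+20AC (the euro sign), both ways. The code point and the variable names are chosen only for illustration; the correct UTF-8 encoding of U+20AC is E2 82 AC.

```php
<?php
// Three-byte layout: 1110bbbb 10bbbbbb 10bbbbbb
$charval = 0x20AC;

// Middle byte as computed before the fix: no mask, 0xC0 marker.
$unmasked = ($charval >> 6) | 0xC0;          // 0x82 | 0xC0 = 0xC2

// Middle byte masked to six bits, with the 10xxxxxx continuation marker.
$masked = (($charval >> 6) & 0x3F) | 0x80;   // 0x02 | 0x80 = 0x82

printf("unmasked middle byte: 0x%02X\n", $unmasked);
printf("masked middle byte:   0x%02X\n", $masked);
```

The unmasked version produces E2 C2 AC, which is not a valid UTF-8 sequence because 0xC2 is a lead byte rather than a continuation byte.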

To get it working, I changed the iconv_int_utf8() function from...


    public static function iconv_int_utf8($charval) {
        if ($charval < 128) {
            // 0bbbbbbb
            $newcharstring = chr($charval);
        } elseif ($charval < 2048) {
            // 110bbbbb 10bbbbbb
            $newcharstring  = chr(($charval >> 6) | 0xC0);    
            $newcharstring .= chr(($charval & 0x3F) | 0x80);
        } elseif ($charval < 65536) {
            // 1110bbbb 10bbbbbb 10bbbbbb
            $newcharstring  = chr(($charval >> 12) | 0xE0);   
            $newcharstring .= chr(($charval >>  6) | 0xC0);   
            $newcharstring .= chr(($charval & 0x3F) | 0x80);
        } else {
            // 11110bbb 10bbbbbb 10bbbbbb 10bbbbbb
            $newcharstring  = chr(($charval >> 18) | 0xF0);   
            $newcharstring .= chr(($charval >> 12) | 0xC0);   
            $newcharstring .= chr(($charval >>  6) | 0xC0);   
            $newcharstring .= chr(($charval & 0x3F) | 0x80);
        }
        return $newcharstring;
    }
to...


    public static function iconv_int_utf8($charval) {
        if ($charval < 128) {
            // 0bbbbbbb
            $newcharstring = chr($charval);
        } elseif ($charval < 2048) {
            // 110bbbbb 10bbbbbb
            $newcharstring  = chr((($charval >> 6) & 0x1F) | 0xC0);    
            $newcharstring .= chr(($charval & 0x3F) | 0x80);
        } elseif ($charval < 65536) {
            // 1110bbbb 10bbbbbb 10bbbbbb
            $newcharstring  = chr((($charval >> 12) & 0x0F) | 0xE0);   
            $newcharstring .= chr((($charval >>  6) & 0x3F) | 0x80);   
            $newcharstring .= chr(($charval & 0x3F) | 0x80);
        } else {
            // 11110bbb 10bbbbbb 10bbbbbb 10bbbbbb
            $newcharstring  = chr((($charval >> 18) & 0x07) | 0xF0);   
            $newcharstring .= chr((($charval >> 12) & 0x3F) | 0x80);   
            $newcharstring .= chr((($charval >>  6) & 0x3F) | 0x80);   
            $newcharstring .= chr(($charval & 0x3F) | 0x80);
        }
        return $newcharstring;
    }
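A quick way to check the corrected arithmetic is to compare its output against byte sequences that are known in advance. The sketch below uses a standalone copy of the corrected logic; the function name utf8_from_codepoint() and the choice of test code points are hypothetical and not part of getid3.

```php
<?php
// Standalone copy of the corrected bit math, for testing only.
function utf8_from_codepoint($charval) {
    if ($charval < 128) {
        // 0bbbbbbb
        return chr($charval);
    } elseif ($charval < 2048) {
        // 110bbbbb 10bbbbbb
        return chr((($charval >> 6) & 0x1F) | 0xC0)
             . chr(($charval & 0x3F) | 0x80);
    } elseif ($charval < 65536) {
        // 1110bbbb 10bbbbbb 10bbbbbb
        return chr((($charval >> 12) & 0x0F) | 0xE0)
             . chr((($charval >> 6) & 0x3F) | 0x80)
             . chr(($charval & 0x3F) | 0x80);
    }
    // 11110bbb 10bbbbbb 10bbbbbb 10bbbbbb
    return chr((($charval >> 18) & 0x07) | 0xF0)
         . chr((($charval >> 12) & 0x3F) | 0x80)
         . chr((($charval >> 6) & 0x3F) | 0x80)
         . chr(($charval & 0x3F) | 0x80);
}

// One code point per branch, with its well-known UTF-8 encoding in hex.
$expected = array(
    0x41    => '41',        // LATIN CAPITAL LETTER A
    0xE9    => 'c3a9',      // LATIN SMALL LETTER E WITH ACUTE
    0x20AC  => 'e282ac',    // EURO SIGN
    0x1D11E => 'f09d849e',  // MUSICAL SYMBOL G CLEF
);

foreach ($expected as $codepoint => $hex) {
    $actual = bin2hex(utf8_from_codepoint($codepoint));
    printf("U+%04X: %s (%s)\n", $codepoint, $actual, $actual === $hex ? 'ok' : 'MISMATCH');
}
```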
Can this fix be included in the next getid3 release?

Thank you.
