[1.8.5-20110218] - ASF files - last byte trimmed from tags

Postby jthomerson » Mon Jun 20, 2011 12:28 pm

In my ASF (WMV) files, the last character of each tag is being lost. I tracked this down: the last byte (a 0x00 null byte) is being stripped, which causes an iconv error when the string is later converted from UTF-16LE.

The solution is to remove line 1190 from getid3.php (the line containing: "$value = (is_string($value) ? trim($value) : $value);"). The ironic thing about that line of code is that just two lines later, the author includes the warning "// do not trim!! Unicode characters will get mangled if trailing nulls are removed!"
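For anyone wondering why that one line does the damage: PHP's trim() strips NUL bytes by default, along with spaces, tabs and newlines. The snippet below is only a minimal sketch of that effect on a short hand-built UTF-16LE string; the sample string and the variable names are made up for illustration and are not taken from getID3.

```php
<?php
// "ASL)" encoded by hand as UTF-16LE: each ASCII character is followed by a 0x00 byte.
$utf16 = "A\x00S\x00L\x00)\x00";

// trim() with no second argument also strips "\0", so the high byte
// of the final ")" is removed and the string ends up with an odd length.
$trimmed = trim($utf16);

echo bin2hex($utf16), "\n";   // 410053004c002900
echo bin2hex($trimmed), "\n"; // 410053004c0029
echo strlen($utf16), ' -> ', strlen($trimmed), "\n"; // 8 -> 7
```

The odd byte count after trimming is what leaves iconv with half a character at the end of the input.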

To reproduce this, download the file http://download.jw.org/files/media_books/lr_ASL_01.wmv and extract its tags with getID3.
The title in the file is "Teacher 01-Why Jesus Was a Great Teacher (ASL)" but the ASF title tag extracted by getID3 will be "Teacher 01-Why Jesus Was a Great Teacher (ASL" (notice the missing closing parenthesis).

Code:
array(5) {
  [0]=>string(41) "$comment_name = 'asf'; $tag_key = 'title'"
  [1]=>string(110) "before line 1190: Teacher 01-Why Jesus Was a Great Teacher (ASL)"
  [2]=>string(306) "before line 1190 (hex bytes): 54 00 65 00 61 00 63 00 68 00 65 00 72 00 20 00 30 00 31 00 2d 00 57 00 68 00 79 00 20 00 4a 00 65 00 73 00 75 00 73 00 20 00 57 00 61 00 73 00 20 00 61 00 20 00 47 00 72 00 65 00 61 00 74 00 20 00 54 00 65 00 61 00 63 00 68 00 65 00 72 00 20 00 28 00 41 00 53 00 4c 00 29 00 "
  [3]=>string(108) "after line 1190: Teacher 01-Why Jesus Was a Great Teacher (ASL)"
  [4]=>string(302) "after line 1190 (hex bytes): 54 00 65 00 61 00 63 00 68 00 65 00 72 00 20 00 30 00 31 00 2d 00 57 00 68 00 79 00 20 00 4a 00 65 00 73 00 75 00 73 00 20 00 57 00 61 00 73 00 20 00 61 00 20 00 47 00 72 00 65 00 61 00 74 00 20 00 54 00 65 00 61 00 63 00 68 00 65 00 72 00 20 00 28 00 41 00 53 00 4c 00 29 "
}


As the dump above shows, the string still displays correctly at this point, but the trim removed the second byte of the last character ")" (it should be "29 00", since the string is encoded in UTF-16LE). Later, the code uses iconv to convert the string, and iconv gives the following errors:

Code:
Notice: /qa_trunk/lib/getid3/getid3.lib.php line 902 - iconv() [function.iconv]: Detected an incomplete multibyte character in input string
Notice: /qa_trunk/lib/getid3/getid3.lib.php line 902 - iconv() [function.iconv]: Detected an illegal character in input string


At this point, the last parenthesis is gone. This happens on multiple tags - I'm just using the title as an example.
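If trimming is still wanted, one way to avoid this class of problem is to decode first and trim afterwards, since in UTF-8 a NUL byte can only be U+0000 and never half of another character. The following is a rough sketch of that idea only; utf16le_to_trimmed_utf8() is a hypothetical helper written for this post, not a getID3 function, and I am not claiming it is how v1.9.0 fixes the issue.

```php
<?php
// Hypothetical helper (not part of getID3): decode UTF-16LE first, trim afterwards.
function utf16le_to_trimmed_utf8(string $raw): string
{
    // An odd byte count means a 16-bit code unit has been cut in half.
    if (strlen($raw) % 2 !== 0) {
        throw new InvalidArgumentException('Odd byte length: truncated UTF-16LE input');
    }

    $utf8 = iconv('UTF-16LE', 'UTF-8', $raw);
    if ($utf8 === false) {
        throw new RuntimeException('iconv() could not convert the input');
    }

    // Trimming is safe here: trailing U+0000 padding and whitespace are removed
    // without touching the bytes of any other character.
    return trim($utf8);
}

$title = "(\x00A\x00S\x00L\x00)\x00"; // "(ASL)" in UTF-16LE
echo utf16le_to_trimmed_utf8($title), "\n"; // (ASL)
```

The length check is there so that a string already damaged by an earlier trim is reported instead of being silently shortened.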

Re: [1.8.5-20110218] - ASF files - last byte trimmed from ta

Postby James Heinrich » Mon Jun 20, 2011 2:50 pm

Thanks for the report. It has already been fixed in v1.9.0 (which is going through final quality control and should be out within a day or two).

Cross-referencing bug reports for my own tracking:
viewtopic.php?t=1136
James Heinrich
getID3() v1 developer
 

