Multiple TCON tags get lost in 2.0.0b5

Joined:Sun Nov 01, 2009 10:07 am
Are you a spambot?:no
Multiple TCON tags get lost in 2.0.0b5

Post by quack23 » Sun Nov 01, 2009 11:20 am

I'm using getid3() as shipped in Ampache. My MP3 collection is tagged with ExFalso using the Mutagen tagging library. The library sets multiple tags in the id3v2.4.0 TCON tag format, as multiple NULL-separated free-form strings (not numerical IDs).

I noticed that my files were imported correctly in Ampache except for the ones tagged with multiple genres. These files were all imported with a genre of "Blues" (not one of the genres with which these files were tagged). Chasing it down, I discovered a couple of bugs in the TCON/genre handling in module.tags.id3v2.php -- specifically ParseID3v2GenreString().

First, when converting 2.4 genres to 2.3 parentheses-wrapped style, the while loop terminates before appending and parentheses-wrapping the final chunk. So that chunk gets lost.

Second, when parsing that parentheses-wrapped string, each element is passed to getid3_id3v1::LookupGenreName without checking whether the element is_numeric or not. Since getid3_id3v1::LookupGenreName doesn't perform any input checking, a non-numeric string can possibly be passed to intval() which will return 0 -- which is a valid entry in the lookup table. So calling this on a freeform genre "Foo" will result in replacing "Foo" with "Blues".

This patch fixes both issues as simply as possible (it's two lines and a comment). I've completed preliminary testing on it but I don't have a wide range of tags with which to test. I know a lot of tagging tools simply refuse to set multiple TCON values, so if anyone has multi-genre 2.3 or 2.4 tags they can test with, it would be appreciated.

(careful, lines aren't wrapped quite right...)

Code: Select all

--- module.tag.id3v2.php.orig   2009-11-01 05:54:44.000000000 -0500
+++ module.tag.id3v2.php        2009-11-01 05:57:40.000000000 -0500
@@ -1918,6 +1918,7 @@
                                $genre_string .= '('.substr($unprocessed, 0, $end_pos).')';
                                $unprocessed = substr($unprocessed, $end_pos + 1);
+            $genre_string .= '('.$unprocessed.')'; // catch the last piece
         } elseif (eregi('^([0-9]+|CR|RX)$', $genre_string)) {
                // some tagging program (including some that use TagLib) fail to include null byte after numeric genre
@@ -1940,7 +1941,8 @@
                 $element      = substr($genre_string, $start_pos + 1, $end_pos - ($start_pos + 1));
                 $genre_string = substr($genre_string, 0, $start_pos).substr($genre_string, $end_pos + 1);

-                if (getid3_id3v1::LookupGenreName($element)) { // $element is a valid genre id/abbreviation
+                // check that $element seems numeric before passing to getid3_id3v1::LookupGenreName
+                if (is_numeric($element) AND getid3_id3v1::LookupGenreName($element)) { // $element is a valid genre id/abbreviation

                     if (empty($return_array['genre']) || !in_array(getid3_id3v1::LookupGenreName($element), $return_array['genre'])) { // avoid duplicate entires
                         $return_array['genre'][] = getid3_id3v1::LookupGenreName($element);