Forum OpenACS Development: Re: primary key in cr_mime_types

Collapse
Posted by Gustaf Neumann on
i had a problem with uploading mp3 files into xowiki. firefox 2.0 uploads .mp3 files with a content_type audio/mpeg (as listed by iana) where the actual cr_mime_types table contains two different entries for mp3 files.


dotlrn-test5=# select * from cr_mime_types where file_extension = 'mp3';
label | mime_type | file_extension
-------------+--------------+----------------
Audio MPEG | audio/x-mpeg | mp3
Audio - MP3 | audio/mp3 | mp3
(2 rows)

In the current code for the content repository inserting an entry with a mime_type of audio/mpeg triggers a foreign key violation on table "cr_revisions" with foreign key constraint "cr_revisions_mime_type_ref".

my quick fix was

insert into cr_mime_types (label, mime_type, file_extension) select 'Audio - MP3','audio/mpg', 'mp3' from dual where not exists (select 1 from cr_mime_types where mime_type = 'audio/mpg');

is there a better way?

Collapse
Posted by Mark Aufflick on
It's been a while (nearly 3 years), but I don't recall anything breaking when I removed that unique constraint and added one on extension.

By leaving the extension column allowing nulls there is no problem with multiple null values even with a unique constraint.

Collapse
Posted by Gustaf Neumann on
there is no unique constraint on cr_mime_types. adding the value does work (the error "goes away"), but the mess increases. A result will be, that there are mp3 files in the cr, which might have one of the three values
  • audio/mp3
  • audio/mpeg
  • audio/x-mpeg
in the database. When OpenACS delivers these files, some of these might have e.g. the first, some other the second mime-type. A browser will most likely behave different on those, the user has no idea, why....

When an e.g. mp3 file is stored in the static area, the mime-type might be resolved via ns_guesstype, maybe coming up with a forth variant...

actually, while iana recommends for mp3 audio/mpeg, mac OS X returns "audio/mpg", what i have used in the statement above. the mess is actually a real word mess.

so, for delivery, a unique mapping would be good. This could be achieved e.g. via

  • a unique constraint in the DB + a mapping table for incoming files that maps different incoming mime-types to a canonical, or
  • using ns_guesstype always for delivery, no matter what the stored mime_type is.

The mess is not yet to big. we have multiple mime_types for .mp3, .cdf and .wav

     4 | mp3
     2 | cdf
     2 | wav

i think they can be standardized:

  • mp3:
    • we have: audio/x-mpeg, audio/mp3, audio/mpg, audio/mpeg
    • IANA: audio/mpeg, Apache: audio/mpeg, mac: file -i audio/mpg
  • wav:
      we have: audio/x-wav, audio/wav
    • Apache: audio/x-wav, mac: file -i audio/x-wav
  • cdf
    • we have: application/x-cdf, application/x-netcdf
    • Apache: application/x-netcdf

The only problem seems to be mp3, where indeed multiple ideas exist, trend towards audio/mpeg. I would say that handling this case in the file upload form via a configured ns_guesstype should be the way to go...

Collapse
Posted by Jon Griffin on
It also appears that some pdf files also have the problem.
I am not sure how firefox determines the mime type, but I am getting a CR error when uploading .pdf, the sql shows:

application/unknown

which isn't in the table.