Florian Kainz
2011-04-14 22:13:45 UTC
Hi,
At ILM we want to implement a workflow where a computer graphics artist
can bring up an OpenEXR image of, say, a scene from Rango on his or her
screen, point to a pixel, and find out that the object seen at that
pixel is called "Beans/dress/button3."
This will require storing per-pixel object identifiers in an OpenEXR file.
In order to avoid re-inventing the wheel, I would like to find out if any
OpenEXR users have done something like this already. If you have done it,
would you be willing to share how you did it?
Also, is there any interest in per-pixel object identifiers outside ILM?
Florian
Unless somebody has a better idea, we'll probably do something like this:
Add a channel called objectID, of type UINT, to the image.
If the image has multiple views, then add an objectID channel
to every view.
Add an attribute called objectID to the header. The type of the
attribute is a map from unsigned integers to lists of strings.
If the image has multiple views, then add multiple attributes
with names of the form <view name>.objectID.
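As a small illustration of the naming rule above, here is a sketch in Python. The helper name object_id_attribute_name and the view names "left" and "right" are only examples I made up for this note; nothing here calls the OpenEXR library.

```python
def object_id_attribute_name(view_name=None):
    """Return the header attribute name for a view.

    view_name=None stands for an image without multiple views, where the
    attribute is simply called objectID. (Hypothetical helper, not part of
    any OpenEXR API.)
    """
    if view_name is None:
        return "objectID"
    return f"{view_name}.objectID"


# Example view names, assumed for illustration only.
attribute_names = [object_id_attribute_name(v) for v in ("left", "right")]
```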
For each unsigned integer value that occurs in one or more pixels
in the objectID channel, a corresponding map entry in the objectID
attribute contains a list of all objects that are visible in those
pixels. More than one object may be visible in a given pixel because
of transparency, motion blur, reflections, or anti-aliasing.
To find out which object or objects cover a given pixel, application
software first looks up the value stored in the objectID channel for
that pixel, then it looks up the corresponding list of object names
in the objectID attribute.
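The lookup described above could be sketched roughly like this in Python. Plain lists and dicts stand in for the objectID channel and the objectID attribute; the names object_id_channel, object_id_attribute and objects_at_pixel, the pixel values, and the object name "Beans/dress" are all invented for illustration, and no OpenEXR API is used.

```python
# Stand-in for the objectID channel: rows of UINT pixel values (2 rows, 3 columns).
object_id_channel = [
    [0, 1, 1],
    [0, 2, 3],
]

# Stand-in for the objectID header attribute: pixel value -> list of object names.
# A pixel value can map to several names when more than one object is visible.
object_id_attribute = {
    0: [],
    1: ["Beans/dress/button3"],
    2: ["Beans/dress"],
    3: ["Beans/dress", "Beans/dress/button3"],
}


def objects_at_pixel(x, y):
    """Return the names of all objects visible at pixel (x, y)."""
    pixel_value = object_id_channel[y][x]            # step 1: read the channel
    return object_id_attribute.get(pixel_value, [])  # step 2: look up the names
```

Pointing at pixel (1, 0) would then report "Beans/dress/button3", and pixel (2, 1) would report both objects.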
In images with lots of transparency or motion blur any given object
name may occur multiple times in the map from unsigned integers to
string lists. In order to save disk space the objectID attribute
could be compressed by using a two-stage lookup, where the attribute
contains two maps, one from pixel values to lists of integer object
identifiers, and one from object identifiers to object names.
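A rough Python sketch of the two-stage lookup, again with plain dicts instead of a real attribute type. The names pixel_value_to_object_ids, object_id_to_name and names_for_pixel_value, the integer identifiers 7 and 8, and the object name "Beans/dress" are assumptions made up for this example.

```python
# Stage 1: pixel value -> list of integer object identifiers.
pixel_value_to_object_ids = {
    0: [],
    1: [7],
    2: [8],
    3: [7, 8],
}

# Stage 2: object identifier -> object name. Each name is stored only once,
# no matter how many pixel values refer to it.
object_id_to_name = {
    7: "Beans/dress/button3",
    8: "Beans/dress",
}


def names_for_pixel_value(pixel_value):
    """Resolve a pixel value to object names via both maps."""
    object_ids = pixel_value_to_object_ids.get(pixel_value, [])
    return [object_id_to_name[object_id] for object_id in object_ids]
```

The saving comes from the first map holding short integers instead of repeated strings; whether that is worth the extra indirection probably depends on how long the object names are and how often they repeat.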
With 32-bit UINT pixel values this scheme could run out of object
identifiers for images with more than four Gigapixels, but in VFX
production images that large are rare.