[Ssc-dev] Storing Obscura metadata in JPEG files

Mon Dec 5 13:58:40 EST 2011

Thanks, Andrew.  My responses inline...

///////////////////////
Harlo Holmes
guardianproject.info

On Fri, Dec 2, 2011 at 10:27 AM, Andrew Senior <andrew.senior at gmail.com> wrote:

> OK. That looks good - so you'd like a get/put for a string buffer? (plus a
> hash of that buffer? ) Can you gzip the string buffer and supply a block of
> bytes + length?

Ok, agreed: gzip'ed data + length.
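
A minimal sketch of what that hand-off could look like, assuming the app
serializes the metadata as a JSON string first. The function names
pack_metadata and unpack_metadata are hypothetical and not part of the
JpegRedaction library:

```python
import gzip
import json


def pack_metadata(metadata):
    """Serialize a metadata object to JSON, gzip it, and return (bytes, length)."""
    raw = json.dumps(metadata, sort_keys=True).encode("utf-8")
    blob = gzip.compress(raw)
    return blob, len(blob)


def unpack_metadata(blob, length):
    """Reverse pack_metadata: gunzip the first `length` bytes and parse the JSON."""
    raw = gzip.decompress(blob[:length])
    return json.loads(raw.decode("utf-8"))
```

The explicit length mirrors the "block of bytes + length" shape of a C/C++
call; on the Python side it is redundant with len(blob).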

> I'll put these calls into the library. Your hashes are string
> representations of hexadecimal numbers?
>

Yes, I think that'll suffice as strings.
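
As a rough illustration of a hash carried as a hexadecimal string: the
hash algorithm has not been settled in this thread, so SHA-256 below is
only an assumption, and hex_digest and matches are hypothetical names.

```python
import hashlib


def hex_digest(buffer):
    """Return the hash of a byte buffer as a lowercase hexadecimal string.

    SHA-256 is an assumption; the thread does not name an algorithm.
    """
    return hashlib.sha256(buffer).hexdigest()


def matches(buffer, expected_hex):
    """Compare a buffer against a stored hex string, ignoring letter case."""
    return hex_digest(buffer) == expected_hex.lower()
```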

>
> For signing you'll want some more calls at a lower level into the JPEG
> file?
> Say calls returning buffers containing the raw data and the length?
> Now, the redaction region information needs to also have raw buffers of
> reversibility information, to be accessed by the library, so I was thinking
> we'd put that in a different data structure to be parsed by the library? At
> the moment it's structured differently: as image strips rather than as
> rectangular regions. Signatures of regions that overlap might prove a
> little tricky.
>

I think I understand, but I'm not too sure: by "image strips," do you mean
that the rectangular image region matrix is flattened by rows in the output?
If so, this is OK (though I think I may be missing the point!)

>
> Perhaps it's best if we store multiple APPn segments - one for the
> metadata block, and one for the redaction information. Each being up to
> 64KB, and with different (but similar) string headers e.g. "ObscuraMeta"
> and "ObscuraRedaction"
>

Understood.  Let's do this!
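
Here is a rough sketch of the two-segment layout, to check my
understanding. A JPEG APPn segment is the marker bytes 0xFF, 0xE0+n, then
a two-byte big-endian length that counts itself, then the data. Everything
else here is an assumption: the value of n is not decided yet (7 below is
just a placeholder), the header strings are null-terminated by analogy
with "Exif" and "JFIF", and build_app_segment and insert_segments are
hypothetical names, not library calls.

```python
import struct

SOI = b"\xff\xd8"
MAX_BODY = 0xFFFF - 2  # the 16-bit length field counts its own two bytes


def build_app_segment(n, header, payload):
    """Build one APPn segment: marker, length, null-terminated header, payload."""
    if not 0 <= n <= 15:
        raise ValueError("n must be in 0..15")
    body = header + b"\x00" + payload
    if len(body) > MAX_BODY:
        raise ValueError("segment too large; split the payload across segments")
    return b"\xff" + bytes([0xE0 + n]) + struct.pack(">H", len(body) + 2) + body


def insert_segments(jpeg, segments):
    """Insert segments after any APPn segments that directly follow SOI."""
    if jpeg[:2] != SOI:
        raise ValueError("not a JPEG stream")
    pos = 2
    # Skip existing leading APPn segments (JFIF, EXIF, XMP, ...) untouched.
    while pos + 4 <= len(jpeg) and jpeg[pos] == 0xFF and 0xE0 <= jpeg[pos + 1] <= 0xEF:
        (length,) = struct.unpack(">H", jpeg[pos + 2:pos + 4])
        pos += 2 + length
    return jpeg[:pos] + b"".join(segments) + jpeg[pos:]


APP_N = 7  # placeholder; the actual 'n' is still to be chosen
meta_segment = build_app_segment(APP_N, b"ObscuraMeta", b"gzipped metadata")
redaction_segment = build_app_segment(APP_N, b"ObscuraRedaction", b"region data")
```

Keeping the existing APP0/APP1 segments ahead of ours should leave
EXIF/XMP readable by other viewers, which is the point of picking a
separate segment.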

>
> Thanks,
> Andrew
>
> On Thu, Dec 1, 2011 at 12:51 PM, Harlo Holmes <harlo at guardianproject.info> wrote:
>
>> Hi Andrew,
>>
>> Thank you so much for this!
>>
>> First off, I finished up the metadata spec, and have committed it here:
>>
>>
>> https://github.com/guardianproject/SecureSmartCam/commit/861b26d01f57c7d60cb078d82747cb5a4979bee2
>>
>>
>> It's a .json file that you can reference for the required fields at this
>> time (the values are sample data that indicate the data types).  I will also
>> include an XML representation of the same structure; I'm just more nimble
>> working in JSON.
>>
>> I definitely want to insert our data in an APPn field; hopefully we can
>> use another APPn than those currently used to store EXIF/XMP (which should
>> be preserved for sharing the media across other
>> viewers/platforms/services/software.)  So, if we can safely preserve
>> Informa elsewhere than APP2, that would be great.  (If not, we can use #2,
>> we'll extend our spec to include EXIF data generated by the device...)
>>
>> In regards to division of labor, I expect the app will handle all the
>> metadata generation, and encryption-- the JpegRedaction library should only
>> serve to input the generated payload into the correct APPn segment.  (I
>> hope I've understood the process properly-- please let me know if I'm
>> off-base here.)  I would want the app to send a complete data object (as a
>> JSON string) to the JpegRedaction class on save; that interaction being the
>> final step to generating the resulting image.
>>
>> Thanks!
>> Harlo
>>
>> On Mon, Nov 28, 2011 at 11:38 PM, Andrew Senior <andrew.senior at gmail.com> wrote:
>>
>>> I've been working on extending the JPEG redaction library to handle
>>> editing the metadata, and looking at storing proprietary metadata in the
>>> files.
>>> I think it might be best to store our proprietary data in an APPn data
>>> structure rather than makernote.
>>> Then we can have the option to preserve makernotes, and not confuse
>>> other parsers with misleading makernote (I can't actually work out how
>>> image software is supposed to know
>>> what format the makernote is in- if it's determined by the manufacturer
>>> tag, or if everyone just tries to decode makernotes and sees if they match
>>> their own format.
>>> If determined by the Manufacturer tag, then we'd have to change that
>>> too.)
>>>
>>> APPn data, on the other hand, seems open for reuse. There are at least two
>>> standards using the APP2 field (Flashpix and ICC_PROFILE) and two for APP1
>>> (EXIF and XMP).
>>>
>>> XMP stands for extensible metadata, so we could use this standard, but
>>> I'm not sure it can coexist with EXIF, and I'm wary of requiring XML and
>>> of fitting into this existing standard.
>>>
>>> On the other hand, the EXIF standard http://www.exif.org/Exif2-2.PDF
>>> talks about storing multiple APP2 segments, and skipping APPn segments if a
>>> reader can't parse them.
>>> Flashpix seems less important to preserve, and I suspect that we can
>>> coexist with other APP2 - we just define our own header string.
>>> Alternatively we can pick our own 'n'. I think 13 (Adobe Photoshop and
>>> IPTC) is the only one I've seen in use, and this page
>>> <http://www.ozhiker.com/electronics/pjmt/jpeg_info/app_segments.html>
>>> only lists a few. I think this would be my inclination.
>>>
>>> If the library is to construct this data segment, I would suggest that
>>> we store our metadata as a TIFF IFD, as that seems like a compact, flexible
>>> standard, for which some code already exists in the library. It's pretty
>>> basic with a flat structure.
>>> APPn segments are supposed to be under 64KB, but you can have multiple
>>> segments to exceed that limit.
>>>
>>> If the data segment were to be constructed inside the android app, then
>>> alternative (and richer) structures (XML or protobuffer?) might be more
>>> feasible or preferable. We have to think about the division of labour
>>> between the app and the JpegRedactionLibrary, and any other clients we may
>>> want in the future that might read/validate the data. The library has
>>> direct access to the raw data for signing/encryption, but the app has the
>>> metadata (and presumably the crypto libraries- I haven't looked at what we
>>> might use from the C++ code?)
>>>
>>> Candidate information to store:
>>>
>>> Consent information: face locations, identities and consent status.
>>> Signatures
>>> Key identities.
>>> Actual public keys?
>>> Obscura version information.
>>> Redaction information: regions' coordinates and encrypted contents.
>>> Non-standard (e.g. sensor) metadata
>>> Audit trail (details of obscura-cam operations)
>>> Anything else?
>>>
>>> We need to work out what needs to be/can be signed and what encrypted.
>>>
>>> Andrew
>>>
>>
>>
>> ///////////////////////
>> Harlo Holmes
>> guardianproject.info
>>
>>
>>
>