JavaScript* Parser for Depth Photo

Abstract

The JavaScript parser for depth photos parses eXtensible Device Metadata (XDM) image files [1] and extracts the metadata embedded in them to generate XML files. The app then analyzes the XML files to extract color image data and depth map data. It is a fundamental building block for depth photography use cases, such as an image viewer, refocus feature, parallax feature, and measurement feature. We have deployed the JavaScript parser in an open source project named Depthy [2], where it has been shown to work correctly and efficiently.

The input to this app is an XDM image file; the outputs are XML file(s), color image file(s), and depth map file(s).

XDM

First, we describe the input to this app: XDM image files. XDM is a standard for storing metadata in a container image while maintaining compatibility with existing image viewers. It is designed for Intel® RealSense™ Technology [3]. The metadata includes device-related information, such as the depth map, device and camera pose, lens perspective model, vendor information, and point cloud. The following figure shows an example where the XDM file stores the depth map (right) as metadata with the color image (left).

[Figure: an XDM file storing a color image (left) with its depth map (right) as metadata]

Adobe XMP Standard

Currently, the XDM specification supports four types of container image formats: JPEG, PNG, TIFF, and GIF. XDM metadata is serialized and embedded inside a container image file, and its storage format is based on the Adobe Extensible Metadata Platform (XMP) standard [4]. This app is developed specifically for the JPEG format. Next, we briefly describe how XMP metadata is embedded in JPEG image files and how the parser parses XMP packets.

In a JPEG file, 2-byte markers are interspersed among the data. The marker types 0xFFE0–0xFFEF are generally used for application data and are named APPn. By convention, an APPn segment begins with a string identifying its usage, called a namespace or signature string. One APP1 marker identifies Exif and TIFF metadata, an APP13 marker designates the Photoshop Image Resources that contain IPTC metadata, and one or more additional APP1 markers designate the location of the XMP packet(s).

The following table shows an entry format for the StandardXMP section in the JPEG, including:

  • 2-byte APP1 marker 0xFFE1
  • Length of this segment, 2 bytes long
  • StandardXMP namespace, http://ns.adobe.com/xap/1.0/, 29 bytes long (28 characters plus a terminating null byte)
  • XMP packet, less than 65,503 bytes
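
As a rough illustration of the layout listed above, the following sketch decodes the fields of a single APP1 segment held in a Node.js Buffer. The helper name readApp1Segment and the shape of the object it returns are hypothetical and are not part of the parser described in this article. The sketch assumes that the 2-byte length is big-endian and counts itself but not the marker, and that the namespace ends with a null byte.

```javascript
// Hypothetical helper (not from the original parser): decode one APP1
// segment that starts at `offset` in `buffer`.
function readApp1Segment(buffer, offset) {
    // A JPEG APP1 segment starts with the 2-byte marker 0xFFE1.
    if (buffer.readUInt16BE(offset) !== 0xFFE1) {
        return null;
    }
    // The 2-byte big-endian length counts itself but not the marker.
    const length = buffer.readUInt16BE(offset + 2);
    const end = offset + 2 + length;
    // The namespace is a null-terminated ASCII string.
    const nsStart = offset + 4;
    const nsEnd = buffer.indexOf(0x00, nsStart);
    if (nsEnd === -1 || nsEnd >= end) {
        return null;
    }
    return {
        namespace: buffer.toString('ascii', nsStart, nsEnd),
        payload: buffer.subarray(nsEnd + 1, end)
    };
}

// Build a tiny synthetic segment to exercise the helper.
const ns = 'http://ns.adobe.com/xap/1.0/';
const packet = Buffer.from('<x:xmpmeta/>', 'ascii');
const segmentHead = Buffer.alloc(4);
segmentHead.writeUInt16BE(0xFFE1, 0);
segmentHead.writeUInt16BE(2 + ns.length + 1 + packet.length, 2);
const segment = Buffer.concat([segmentHead, Buffer.from(ns + '\0', 'ascii'), packet]);

const parsed = readApp1Segment(segment, 0);
console.log(parsed.namespace, parsed.payload.toString('ascii'));
```

Reading the length with readUInt16BE and locating the null terminator avoids hard-coding the namespace length, so the same helper applies to the StandardXMP and ExtendedXMP namespaces.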

[Table: entry format of the StandardXMP section in a JPEG file]

If the serialized XMP packet becomes larger than 64 KB, it can be divided into a main portion (StandardXMP) and an extended portion (ExtendedXMP), stored in multiple JPEG marker segments. The entry format for the ExtendedXMP section is similar to that for StandardXMP except that the namespace is http://ns.adobe.com/xmp/extension/.

The following image shows how StandardXMP and ExtendedXMP are embedded in a JPEG image file.

[Figure: StandardXMP and ExtendedXMP segments embedded in a JPEG image file]

The following code snippet shows three functions:

  • findMarker. Parses the JPEG file (that is, buffer) from the specified location (that is, position) and searches for the 0xFFE1 marker. If the marker is found, returns its position; otherwise, returns -1.
  • findHeader. Looks for the StandardXMP namespace (http://ns.adobe.com/xap/1.0/) and the ExtendedXMP namespace (http://ns.adobe.com/xmp/extension/) in the JPEG file (that is, buffer) at the specified location (that is, position). If one is found, returns the corresponding namespace; otherwise, returns an empty string.
  • findGUID. Looks for the GUID, which is stored in xmpNote:HasExtendedXMP, in the JPEG file (that is, buffer) from the start location (that is, position) to the end location (that is, position+size-1) and returns its location.
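// Marker bytes and namespace strings used below (values as described above)
var marker1 = 0xFF, marker2 = 0xE1;
var header1 = "http://ns.adobe.com/xap/1.0/";
var header2 = "http://ns.adobe.com/xmp/extension/";
var noHeader = "";

// Return buffer index that contains marker 0xFFE1 from buffer[position]
// If not found, return -1
function findMarker(buffer, position) {
    var index;
    // Stop one byte early so that buffer[index + 1] stays inside the buffer
    for (index = position; index < buffer.length - 1; index++) {
        if ((buffer[index] === marker1) && (buffer[index + 1] === marker2))
            return index;
    }
    return -1;
}

// Return header/namespace if found; return "" if not found
// position points at the marker; the namespace starts 4 bytes later
// (2-byte marker + 2-byte segment size)
function findHeader(buffer, position) {
    var string1 = buffer.toString('ascii', position + 4, position + 4 + header1.length);
    var string2 = buffer.toString('ascii', position + 4, position + 4 + header2.length);
    if (string1 === header1)
        return header1;
    else if (string2 === header2)
        return header2;
    else
        return noHeader;
}

// Return the position of the first GUID character; return -1 if not found
function findGUID(buffer, position, size) {
    // The end index of toString() is exclusive
    var string = buffer.toString('ascii', position, position + size);
    var xmpNoteString = "xmpNote:HasExtendedXMP=";
    var GUIDPosition = string.indexOf(xmpNoteString);
    if (GUIDPosition === -1)
        return -1;
    // +1 skips the opening quote of the attribute value
    return GUIDPosition + position + xmpNoteString.length + 1;
}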

A 128-bit GUID, encoded as a 32-byte ASCII hex string, is stored in each ExtendedXMP segment immediately after the ExtendedXMP namespace. It is also stored in the StandardXMP segment as the value of the xmpNote:HasExtendedXMP property. Comparing the two makes it possible to detect a mismatched or modified ExtendedXMP.
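
The GUID check described above could be sketched as follows. This is an illustration under stated assumptions rather than the parser used in this article: the helper names readExtendedChunk, assembleExtendedXmp, and makePayload are hypothetical. The sketch also assumes that the 32-byte GUID is followed by a 4-byte big-endian total length and a 4-byte big-endian chunk offset, which together form the 40 bytes that the main function later skips. The synthetic GUID is computed as an upper-case MD5 digest of the extended data, which is our understanding of the XMP specification; the helpers themselves only compare strings.

```javascript
const crypto = require('crypto');

// Hypothetical helper: split the bytes that follow the ExtendedXMP namespace
// into the assumed 40-byte header and the chunk of XMP data.
function readExtendedChunk(payload) {
    return {
        guid: payload.toString('ascii', 0, 32), // 32-byte ASCII hex GUID
        fullLength: payload.readUInt32BE(32),   // size of the whole ExtendedXMP
        offset: payload.readUInt32BE(36),       // where this chunk belongs
        data: payload.subarray(40)
    };
}

// Hypothetical helper: rebuild the ExtendedXMP from chunks whose GUID matches
// the value of xmpNote:HasExtendedXMP found in the StandardXMP.
function assembleExtendedXmp(payloads, expectedGuid) {
    let result = null;
    for (const payload of payloads) {
        const chunk = readExtendedChunk(payload);
        if (chunk.guid !== expectedGuid) {
            continue; // mismatched or stale segment: ignore it
        }
        if (result === null) {
            result = Buffer.alloc(chunk.fullLength);
        }
        chunk.data.copy(result, chunk.offset);
    }
    return result;
}

// Synthetic data to exercise the helpers.
const fullXmp = Buffer.from('<x:xmpmeta>example depth data</x:xmpmeta>', 'ascii');
const guid = crypto.createHash('md5').update(fullXmp).digest('hex').toUpperCase();

// Hypothetical test helper: build one chunk covering fullXmp[start, end).
function makePayload(chunkGuid, start, end) {
    const head = Buffer.alloc(40);
    head.write(chunkGuid, 0, 'ascii');
    head.writeUInt32BE(fullXmp.length, 32);
    head.writeUInt32BE(start, 36);
    return Buffer.concat([head, fullXmp.subarray(start, end)]);
}

const payloads = [
    makePayload(guid, 20, fullXmp.length), // second half, delivered first
    makePayload('0'.repeat(32), 0, 5),     // chunk with a foreign GUID
    makePayload(guid, 0, 20)               // first half
];
const rebuilt = assembleExtendedXmp(payloads, guid);
console.log(rebuilt.equals(fullXmp));
```

Placing each chunk by its offset means the segments do not have to appear in order, and chunks with a foreign GUID are dropped instead of being appended to the output.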

XML

XMP metadata can be directly embedded within an XML document [5]. According to the XDM specification, the XML data structure is defined as follows:

[Table: XML data structure defined by the XDM specification]

The image file contains the following items as shown in the above table, formatted as RDF/XML. This describes the general structure:

  • Container image. The image external to the XDM, visible to normal non-XDM apps.
  • Device. The root object of the RDF/XML document, as in the Adobe XMP standard.
    • Revision - Revision of XDM specification
    • VendorInfo - Vendor-related information for the device
    • DevicePose - Device pose with respect to the world
    • Cameras - RDF sequence of one or more camera entities
      • Camera - All the information for a given camera. There must be a camera for any image. The container image is associated with the first camera, which is considered the primary camera for the image.
        • VendorInfo - Vendor-related information for the camera
        • CameraPose - Camera pose relative to the device
        • Image - Image provided by the camera
        • ImagingModel - Imaging (lens) model
        • Depthmap - Depth-related information including the depth map and noise model
          • NoiseModel - Noise properties for the sensor
        • PointCloud - Point-cloud data

The following code snippet is the main function of this app. It parses the input JPEG file by searching for the APP1 marker 0xFFE1. When a marker is found, it checks whether the segment begins with the StandardXMP or the ExtendedXMP namespace string. In either case, it calculates the metadata size and starting point and extracts the metadata, appending it to the StandardXMP XML file in the former case and to the ExtendedXMP XML file in the latter. The app’s outputs are two XML files.

// Main function to parse XDM file
// outputXAPFile and outputXMPFile (output paths) are assumed to be defined elsewhere
var fs = require('fs');

function xdmParser(xdmFilePath) {
    try {
        // Read the whole JPEG file into a buffer (binary)
        var fileBuffer = fs.readFileSync(xdmFilePath);

        var bufferIndex;
        for (bufferIndex = 0; bufferIndex < fileBuffer.length; bufferIndex++) {
            var markerIndex = findMarker(fileBuffer, bufferIndex);
            if (markerIndex == -1) {
                // No more marker can be found. Stop the loop
                break;
            }

            // 0xFFE1 marker is found
            var segHeader = findHeader(fileBuffer, markerIndex);
            if (!segHeader) {
                // No XMP header here: skip this marker and look for the next one
                bufferIndex = markerIndex;
                continue;
            }

            // Segment size is the 2-byte big-endian value after the marker
            var segSize = fileBuffer[markerIndex + 2] * 256 + fileBuffer[markerIndex + 3];

            // 2-->segSize is 2-byte long
            // 1-->account for the last 0 at the end of header, one byte
            segSize -= (segHeader.length + 2 + 1);
            // 2-->0xFFE1 is 2-byte long
            // 2-->segSize is 2-byte long
            // 1-->account for the last 0 at the end of header, one byte
            var segDataStart = markerIndex + segHeader.length + 2 + 2 + 1;

            if (segHeader == header1) {
                // StandardXMP
                // GUID of the ExtendedXMP, if any (not used further in this snippet)
                var GUIDPos = findGUID(fileBuffer, segDataStart, segSize);
                var GUID = fileBuffer.toString('ascii', GUIDPos, GUIDPos + 32);
                // 54-->skip the xpacket header, assumed to be 54 bytes long
                var segData_xap = Buffer.alloc(segSize - 54);
                fileBuffer.copy(segData_xap, 0, segDataStart + 54, segDataStart + segSize);
                fs.appendFileSync(outputXAPFile, segData_xap);
            }
            else if (segHeader == header2) {
                // ExtendedXMP
                // 40-->skip the bytes before the XMP data, starting with the 32-byte GUID
                var segData = Buffer.alloc(segSize - 40);
                fileBuffer.copy(segData, 0, segDataStart + 40, segDataStart + segSize);
                fs.appendFileSync(outputXMPFile, segData);
            }

            // Continue the search right after this segment
            bufferIndex = segDataStart + segSize - 1;
        }
    } catch (ex) {
        console.log("Something bad happened! " + ex);
    }
}

The following code snippet parses the XML file and extracts the color image and depth map for depth photography purposes. The function xmpMetadataParser() searches for the attribute named IMAGE:DATA (or GIMAGE:DATA) and extracts the corresponding data into a JPEG file, which is the color image. If multiple attributes are found, multiple JPEG files are created. The function also searches for the attribute named DEPTHMAP:DATA (or GDEPTH:DATA) and extracts the corresponding data into a PNG file, which is the depth map. If multiple attributes are found, multiple PNG files are created as well. The app’s outputs are JPEG file(s) and PNG file(s).

// Parse XMP metadata and search attribute names for color image and depth map
// inputJpgFile (path of the input JPEG) is assumed to be defined elsewhere
var fs = require('fs');
var sax = require('sax');
var parser;

function xmpMetadataParser() {
    var imageIndex = 0, depthImageIndex = 0, outputPath = "";
    // In non-strict mode, sax reports attribute names in upper case
    parser = sax.parser();

    // Extract data when specific data attributes are encountered
    parser.onattribute = function (attr) {
        if ((attr.name == "IMAGE:DATA") || (attr.name == "GIMAGE:DATA")) {
            outputPath = inputJpgFile.substring(0, inputJpgFile.length - 4) + "_" + imageIndex + ".jpg";
            // The attribute value is base64 text; decode it into raw bytes
            fs.writeFileSync(outputPath, Buffer.from(attr.value, 'base64'));
            imageIndex++;
        } else if ((attr.name == "DEPTHMAP:DATA") || (attr.name == "GDEPTH:DATA")) {
            outputPath = inputJpgFile.substring(0, inputJpgFile.length - 4) + "_depth_" + depthImageIndex + ".png";
            fs.writeFileSync(outputPath, Buffer.from(attr.value, 'base64'));
            depthImageIndex++;
        }
    };

    parser.onend = function () {
        console.log("All done!");
    };
}

// Process XMP metadata; call xmpMetadataParser() first to set up the parser
function processXmpData(filePath) {
    try {
        var file_buf = fs.readFileSync(filePath);
        parser.write(file_buf.toString('utf8')).close();
    } catch (ex) {
        console.log("Something bad happened! " + ex);
    }
}
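
As a small complement to the extraction step above, the following sketch decodes a base64 attribute value and inspects the first bytes of the result to decide whether it is a JPEG or a PNG, instead of relying on the attribute name alone. The helper name decodeEmbeddedImage is hypothetical and is not part of the original app. The check relies only on the well-known file signatures: a JPEG file starts with the bytes FF D8, and a PNG file starts with the 8 bytes 89 50 4E 47 0D 0A 1A 0A.

```javascript
// Signature that opens every PNG file.
const PNG_SIGNATURE = Buffer.from([0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A]);

// Hypothetical helper: decode a base64 attribute value and guess the file
// type from the leading bytes of the decoded data.
function decodeEmbeddedImage(base64Value) {
    const bytes = Buffer.from(base64Value, 'base64');
    let extension = 'bin'; // unknown content
    if (bytes.length >= 2 && bytes[0] === 0xFF && bytes[1] === 0xD8) {
        extension = 'jpg';
    } else if (bytes.length >= 8 && bytes.subarray(0, 8).equals(PNG_SIGNATURE)) {
        extension = 'png';
    }
    return { bytes: bytes, extension: extension };
}

// Synthetic inputs: only the leading signature bytes, not complete images.
const fakePng = Buffer.concat([PNG_SIGNATURE, Buffer.from([0x00, 0x01])]).toString('base64');
const fakeJpg = Buffer.from([0xFF, 0xD8, 0xFF, 0xE1]).toString('base64');

console.log(decodeEmbeddedImage(fakePng).extension);
console.log(decodeEmbeddedImage(fakeJpg).extension);
```

A check like this could guard against writing a file with the wrong extension when a depth map happens to be stored in a different format from the one the attribute name suggests.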

Conclusion

This white paper described the XDM file format, the Adobe XMP standard, and the XML data structure. The JavaScript parser app for depth photos parses the XDM image file and outputs a StandardXMP XML file and an ExtendedXMP XML file. It then parses the XML files to extract color image file(s) and depth map file(s). This app does not depend on any other programs. It is a basic building block for any depth photography use case.

References

[1] “The eXtensible Device Metadata (XDM) specification, version 1.0,” https://software.intel.com/en-us/articles/the-extensible-device-metadata-xdm-specification-version-10

[2] Open source project Depthy. http://depthy.me/#/

[3] Intel® RealSense™ Technology: http://www.intel.com/content/www/us/en/architecture-and-technology/realsense-overview.html

[4] Adobe XMP Developer Center. http://www.adobe.com/devnet/xmp.html

[5] “XML 1.0 Specification,” World Wide Web Consortium. Retrieved 2010-08-22.

About The Author

Yu Bai is an application engineer in Intel® Software and Services Group (SSG), working with external ISVs to ensure their applications run well on Intel® platforms. Before joining SSG, she worked for Rudolph Technologies as a senior software engineer, developing applications used in the operation of precision photolithography equipment for the semiconductor capital equipment industry. Prior to Rudolph, she worked for Marvell Semiconductor as a staff engineer working on power analysis and power modeling for the company's application processors. She joined Marvell through the company's acquisition of Intel® XScale technology in 2006.

Yu received her master's and doctoral degrees in Electrical Science and Computer Engineering from Brown University. Her graduate research focused on high-performance and low-power computer architecture design. Yu holds six U.S. patents and has published 10+ journal and international conference papers on power/performance management and optimization.
