
Perceptual Signal Coding for More Efficient Usage of Bit Codes

Scott Miller

Mahdi Nezamabadi

Scott Daly

Dolby Laboratories, Inc.

What defines a digital video signal?

•  SMPTE 292M, SMPTE 372M, HDMI?

–  No, these are interface specifications

–  They don’t say anything about what the RGB or YCbCr values mean

•  Rec601 or Rec709?

–  Not really, these are encoding specifications which define the OETF (Opto-Electrical Transfer Function) used for image capture

–  Image display ≠ inverse of image capture

© 2012 SMPTE · The 2012 Annual Technical Conference & Exhibition · www.smpte2012.org

What defines a digital video signal?

•  This does!

•  The EOTF (Electro-Optical Transfer Function) is what really matters

–  Content is created by artists while viewing a display

–  So the reference display defines the signal

Current Video Signals

•  “Gamma” nonlinearity

–  Came from CRT (Cathode Ray Tube) physics

–  Very much the same since the 1930s

–  Works reasonably well since it is similar to human visual sensitivity (with caveats)

•  No actual standard until last year! (2011)

–  Finally, with CRTs almost extinct the effort was made to officially document their response curve

–  Result was ITU-R Recommendation BT.1886

Recommendation ITU-R BT.1886 EOTF

If gamma works so well, why change?

•  Gamma is similar to perception within limits

–  Traditional cinema and television programs are viewed at moderately low light levels, and have relatively limited dynamic ranges

–  Within these constraints, gamma works

•  If we stay in these ranges for the future, no change may be required

–  This is the current thinking for UHDTV (Ultra High Definition Television), detailed in ITU-R Report BT.2246

12 bit Rec1886 Curve with 100 nit Peak

But 100 nits is no longer adequate!

•  Most modern display devices are already operating well above this level

–  Consumer displays typically 200 to 500 nits

–  Commercial displays available at 1000 to 2000 nits

–  Laboratory displays at 4000 to 20,000 nits

•  When we expand the range, gamma shows its limitations

12 bit Rec1886 Curves

How can we improve performance?

•  Add more bits

–  Not very practical - too many legacy pipelines

–  10 or 12 bits is about the best we can expect typically

•  Use a better curve

–  Power functions waste codes at high end

–  Log functions waste codes at low end

–  Greatest efficiency would be to follow human perception

Perception & Signal Coding

•  DICOM (Digital Imaging and Communications In Medicine)

–  Grayscale standard display function - 1998

–  Barten model used directly for signal coding

–  0.05 to ~4000 nits

–  Too aggressive with Barten parameters – visible steps at low end of scale

Building Some Optimized Curves

•  Choose peaks of 100, 1000, and 10,000 nits as before

–  Then pick f so that a near zero minimum level is reached

–  0 to 100 nits = 0.46 JNDs per code word at 12 bits

–  0 to 1000 nits = 0.68 JNDs per code word at 12 bits

–  0 to 10,000 nits = 0.9 JNDs per code word at 12 bits

12 bit Uniform JND Curves

Functional Approximation

•  Would be great to have a functional form of the Iterative LUT

–  Helpful for standardization

–  Simpler to document

–  Invertibility is very helpful

–  Good alignment with a modified Naka-Rushton model

12 bit PQ and Rec1886 Curves

10 bit PQ and Rec1886 Curves

Visual Test Framework

JND Cross Test Pattern

PQ Enables Good Performance With High Dynamic Ranges

•  PQ shows 1 to 2 bit advantage over gamma at low end with 1000 nit peak signals

•  Only slight performance impact going to 10,000 nit peak signals with PQ

•  Gamma gets much worse with higher peak levels

Scott Miller, Dolby Laboratories, Inc., 1040 Stony Hill Road, Yardley, PA, jsmill@dolby.com

Mahdi Nezamabadi, Dolby Laboratories, Inc., 1040 Stony Hill Road, Yardley, PA, mneza@dolby.com

Scott Daly, Dolby Laboratories, Inc., 432 Lakeside Drive, Sunnyvale, CA, sdaly@dolby.com

Abstract. As the performance of electronic display systems continues to increase, the limitations of current signal coding methods become more and more apparent. With bit depth limitations set by industry standard interfaces, a more efficient coding system is desired to allow image quality to increase without requiring expansion of legacy infrastructure bandwidth. A good approach to this problem is to let the human visual system determine the quantization curve used to encode video signals. In this way optimal efficiency is maintained across the luminance range of interest, and the visibility of quantization artifacts is kept to a uniformly small level.

Keywords. perception, human visual system, transfer function, perceptual curve, signal, encoding, coding, bit depth, gamma, logarithmic, Barten, efficiency, EOTF

The authors are solely responsible for the content of this technical presentation. The technical presentation does not necessarily reflect the official position of the Society of Motion Picture and Television Engineers (SMPTE), and its printing and distribution does not constitute an endorsement of views which may be expressed. This technical presentation is subject to a formal peer-review process by the SMPTE Board of Editors, upon completion of the conference. Citation of this work should state that it is a SMPTE meeting paper. EXAMPLE: Author's Last Name, Initials. 2011. Title of Presentation, Meeting name and location.: SMPTE. For information about securing permission to reprint or reproduce a technical presentation, please contact SMPTE at jwelch@smpte.org or 914-761-1100 (3 Barker Ave., White Plains, NY 10601).

Copyright © 2012 Society of Motion Picture and Television Engineers. All rights reserved.

Introduction

The fundamental basis for interpreting any visual signal is knowledge of that signal’s transfer function – the description of how to convert the signal’s carrier (analog voltage, film density, or digital code values) to optical energy. With electronic displays for television and film, the critical information is found in the EOTF (electro-optical transfer function) for reference standard displays. The vast majority of content is color graded (either live in the camera, or during post production) according to artistic preference while viewing on a reference standard display. Therefore it is the EOTF and not the OETF (opto-electronic transfer function – used in camera capture) that truly defines the intent of visual signal code values.

Reference EOTF curves have been defined for television [1] and digital cinema [2] applications – both based on power functions, with exponent values of 2.4 and 2.6 respectively. While these systems have some known issues with dark level reproduction, they have been used with great success for many years. This success comes primarily because these curves crudely approximate human perception when implemented on relatively dim reference displays with a peak brightness of ~50 to 100 cd/m², and with a dynamic range (or contrast) of less than 3 log units of luminance. As typical display brightness and dynamic range have steadily increased, this approximation has become more and more inaccurate. Typical displays today achieve peak levels of 500 cd/m² or more (with several commercial examples above 1000 cd/m²), and artifacts in dark details have increased proportionately. Further, through digital driving circuitry, display noise is vastly reduced, and through better cameras and use of synthetic imagery, the image capture noise is vastly lower or even zero. Thus the well-known effect of masking by noise no longer hinders low amplitude visibility. It is clear that the displays of today and the future could benefit from a better system.

Gamma Coding and Perception

The ITU-R Rec. BT.1886 [1] EOTF for television, commonly referred to as “gamma encoding”, is often said to be perceptually linear. A recent ITU report on Ultra-High Definition Television (UHDTV) (Report ITU-R BT.2246) [3] used a scaled Barten contrast sensitivity function, called “Barten (Ramp)”, along with an alternative threshold function by Schreiber to illustrate how the ITU-R Rec. BT.1886 EOTF for HDTV behaved similarly to human perception, and was near or below visual detection thresholds for 10 and 12 bit implementations. This is roughly the case for a gamma curve with a peak level of 100 cd/m² (or 100 nits), as shown in figure 1. When higher peak luminance levels are used, however, the 12 bit gamma curve quickly rises above both the Barten and Schreiber thresholds, suggesting that visible quantization artifacts become likely – especially at the dark end of the luminance range.

Figure 1. 12 bit Rec1886 gamma curves with peak luminances of 100, 1000, and 10,000 cd/m².
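As a rough numerical illustration of the trend in figure 1, the sketch below evaluates the contrast step between adjacent 12 bit code values of a BT.1886-style curve at a fixed luminance of 1 cd/m² for the three peak levels. The function names, the zero black level, and the Michelson-style contrast definition are assumptions made for this example; they are not taken from the paper.

```python
def bt1886_eotf(v, l_white=100.0, l_black=0.0, gamma=2.4):
    """Screen luminance in cd/m^2 for a normalized signal v in [0, 1]."""
    root_w = l_white ** (1.0 / gamma)
    root_b = l_black ** (1.0 / gamma)
    a = (root_w - root_b) ** gamma          # gain term
    b = root_b / (root_w - root_b)          # black level lift term
    return a * max(v + b, 0.0) ** gamma


def step_contrast_at(luminance, peak, bits=12, gamma=2.4):
    """Contrast (percent) between the two adjacent codes that bracket `luminance`.

    Assumes a zero black level, so the inverse EOTF is a plain power function.
    """
    top = (1 << bits) - 1
    code = min(top - 1, int((luminance / peak) ** (1.0 / gamma) * top))
    lo = bt1886_eotf(code / top, l_white=peak, gamma=gamma)
    hi = bt1886_eotf((code + 1) / top, l_white=peak, gamma=gamma)
    return 100.0 * 2.0 * (hi - lo) / (hi + lo)


if __name__ == "__main__":
    for peak in (100.0, 1000.0, 10000.0):
        print(f"peak {peak:>7.0f} cd/m^2: step at 1 cd/m^2 = "
              f"{step_contrast_at(1.0, peak):.2f} %")
```

Under these assumptions the step at 1 cd/m² grows from roughly 0.4 % at a 100 cd/m² peak to roughly 2.7 % at a 10,000 cd/m² peak, which matches the direction of the curves in figure 1: the same code budget spread over a larger range leaves coarser steps in the dark region.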

–  –  –

Barten Perceptual Model

Several models have been created over the years to represent the human visual system response. One well respected model for the contrast sensitivity function (CSF) was developed by Peter Barten [4] and has been referenced by many electronic imaging studies and standards. This complex model of contrast sensitivity is based on physics, optics, and some experimentally determined parameters. It has been shown to align well with many visual experiments spanning several decades of research, and is summarized in figure 2, where S is contrast sensitivity, L is luminance in cd/m², u is spatial frequency in cycles/deg., and X0 is angular size in deg.

Figure 2. Barten model for human visual contrast sensitivity.

For more details, consult Barten’s 2004 paper [5], where he describes the equation and documents most of the commonly accepted parameter values. This model was used by Digital Imaging and Communications in Medicine (DICOM) to create a specialized EOTF for the medical industry [6], and was also used as a visual threshold reference in studies for digital cinema [7] and UHDTV [3].

An EOTF Based on Perception

To create a more efficient EOTF, a curve is desired that is a closer fit to the actual human visual response curve. Since the Barten model has been used effectively as a benchmark for evaluating the performance of other EOTF curves, why not use it directly to compute an optimized perceptual EOTF?

For this system, Barten model parameters were chosen to be very similar to those used in prior studies, with two exceptions. First, the angular size X0 was set to 40 degrees; this angle is representative of many display scenarios, and the overall system is near its peak sensitivity at 40 degrees. Second, the spatial frequency u was allowed to vary with luminance so that it tracks the maximum sensitivity of the human visual system, that is, the peak of the CSF as it changes shape as a function of adapting luminance level, as shown in figure 3.

–  –  –

$0 \le V \le 1$; $L = 10{,}000$; $m = 78.8438$; $n = 0.1593$; $c_1 = 0.8359$; $c_2 = 18.8516$; $c_3 = 18.6875$

For compact reference, this functional form is referred to as the Perceptual Quantizer (or PQ) curve. This signal encoding is anchored to absolute luminance levels viewed on the display screen (note that this is not absolute luminance at capture; television and cinema are display referred and not scene referred systems). The PQ curve has nearly a square-root behaviour (slope = -1/2) at the darkest light levels, consistent with the Rose-DeVries law based on photon detection statistics, and then rolls off to a constant zero slope for the highest light levels, which is consistent with the log behaviour of the well-known Weber’s law. Between those extreme luminance regions, it exhibits varying slopes, and throughout the mid luminance levels it exhibits a slope similar to the gamma nonlinearities.
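The equation itself is not reproduced in this copy of the text. The listed constants agree to four decimals with those of the PQ EOTF later standardized as SMPTE ST 2084, so the sketch below assumes that functional form, $Y = L \left( \max(V^{1/m} - c_1, 0) / (c_2 - c_3 V^{1/m}) \right)^{1/n}$, together with its algebraic inverse. The function names and the clamp at zero are choices made for this example.

```python
# Constants as listed in the text (rounded to four decimals).
M = 78.8438
N = 0.1593
C1 = 0.8359
C2 = 18.8516
C3 = 18.6875
L_PEAK = 10000.0  # cd/m^2


def pq_eotf(v):
    """Display luminance in cd/m^2 for a normalized code value v in [0, 1].

    Assumed form: Y = L * (max(v^(1/m) - c1, 0) / (c2 - c3 * v^(1/m)))^(1/n).
    The clamp keeps the base non-negative for extremely small v.
    """
    p = v ** (1.0 / M)
    return L_PEAK * (max(p - C1, 0.0) / (C2 - C3 * p)) ** (1.0 / N)


def pq_inverse_eotf(y):
    """Normalized code value for a display luminance y in [0, 10000] cd/m^2."""
    q = (y / L_PEAK) ** N
    return ((C1 + C2 * q) / (1.0 + C3 * q)) ** M


if __name__ == "__main__":
    for v in (0.25, 0.5, 0.75, 1.0):
        y = pq_eotf(v)
        print(f"V = {v:.2f} -> {y:10.4f} cd/m^2 -> V = {pq_inverse_eotf(y):.6f}")
```

Having a closed-form inverse is the practical benefit of a functional approximation over an iteratively built lookup table: encoding and decoding can both be computed directly, and half of the signal range maps to a little over 90 cd/m² under the assumed form.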

Visual Tests

A real world comparison was developed to illustrate the advantages of the Perceptual Quantizer over traditional ITU-R Rec. BT.1886 gamma for some brighter display scenarios. Figure 5a shows plots of the 12 bit PQ curve generated for a range up to 10K cd/m² as well as a version generated for a range up to 1K cd/m², compared to a 1K cd/m² peak Rec1886 curve. The 1K PQ signal shows much higher performance than the ITU-R Rec. BT.1886 gamma function, and

–  –  –

Figure 5a. 12 bit PQ curves compared to 12 bit Rec1886 curve at 1000 cd/m² peak.

Figure 5b. 10 bit PQ curves compared to 10 bit Rec1886 curve at 1000 cd/m² peak.

Though the ITU-R Rec. BT.1886 systems show greater precision in their brightest region, the plots indicate that these areas are below perceptual thresholds, so these levels are likely to be “wasted” visually, and not contribute to a better viewing experience.
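The trade-off described here can be checked numerically. The sketch below compares the 12 bit step size of a 1000 cd/m² gamma 2.4 curve with that of the 10,000 cd/m² PQ curve at a dark level and at a bright level. It is an illustration under assumptions: the PQ functional form is taken to be the one later standardized as SMPTE ST 2084, the gamma curve uses a zero black level, contrast is computed Michelson-style, and all function names are invented for the example. It does not reproduce the separately optimized 1K PQ curve from figure 5.

```python
M, N = 78.8438, 0.1593
C1, C2, C3 = 0.8359, 18.8516, 18.6875


def pq_eotf(v):
    """Assumed PQ form with a 10,000 cd/m^2 peak."""
    p = v ** (1.0 / M)
    return 10000.0 * (max(p - C1, 0.0) / (C2 - C3 * p)) ** (1.0 / N)


def pq_inverse(y):
    q = (y / 10000.0) ** N
    return ((C1 + C2 * q) / (1.0 + C3 * q)) ** M


def gamma_eotf(v, peak=1000.0):
    """Gamma 2.4 power function with zero black level (an assumption)."""
    return peak * v ** 2.4


def gamma_inverse(y, peak=1000.0):
    return (y / peak) ** (1.0 / 2.4)


def step_percent(eotf, inverse, luminance, bits=12):
    """Contrast (percent) between the adjacent codes that bracket `luminance`."""
    top = (1 << bits) - 1
    code = min(int(inverse(luminance) * top), top - 1)
    lo, hi = eotf(code / top), eotf((code + 1) / top)
    return 100.0 * 2.0 * (hi - lo) / (hi + lo)


if __name__ == "__main__":
    for lum in (0.1, 1000.0):
        g = step_percent(gamma_eotf, gamma_inverse, lum)
        p = step_percent(pq_eotf, pq_inverse, lum)
        print(f"{lum:7.1f} cd/m^2: gamma step {g:.3f} %, PQ step {p:.3f} %")
```

Under these assumptions the gamma curve has the smaller step near 1000 cd/m² and the larger step near 0.1 cd/m², which is the pattern the paragraph above describes: gamma spends precision at the bright end while PQ spends it in the dark region.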


