Is Access Advance Really Checking Essentiality? -- An HEVC Case Study

In most standards-setting organizations, including the ones involved with High Efficiency Video Coding (HEVC) (H.265), participants designate their intellectual property as relevant or essential to practice a portion of the standard without scrutiny.  This has led to widespread inflation of unmerited licensing demands even from well-respected companies.  

This has only been exacerbated by the current patent pool ecosystem, whereby multiple pools each purport to license a single standard. These pools claim to offer thousands or tens of thousands of allegedly essential patents, without providing evidence that the patents are, in fact, essential, and generally without evidence of what percentage of the patent landscape the pool actually has the power to license. But even a cursory review of many of these patents can lead to a quick conclusion that they should not be designated essential, as the following analysis of U.S. Patent 10,250,913 (part of the Access Advance patent pool) demonstrates. As such, it is important to perform truly objective analyses of standard-essential patents.

SUMMARY

While designated as essential, U.S. Patent 10,250,913 (the “’913 Patent”) claims a very specific way of dividing an image into blocks for encoding and decoding that is not required by the HEVC standard.  During the prediction and transform encoding steps, the ’913 Patent claims dividing a picture using nested quadtrees, a hierarchical structure in which a region is recursively split into four.  The quadtrees are defined, in part, by the “maximum size” of the units in each quadtree.  Beyond that, the claims require that certain quadtree regions are specified by determining whether particular quadtree sub-regions exceed a maximum size.  However, the H.265 standard, while specifying that quadtrees are used, does not describe how the quadtrees are specified, much less state any relationship between quadtree regions and certain quadtree sub-regions, as required by claims 1 and 13 of the ’913 Patent.  Thus, the ’913 Patent claims a very narrow way of using quadtrees that is not even mentioned in the HEVC standard, much less required.  Accordingly, the patent is neither necessary nor essential to practice the standard.

BACKGROUND

Encoding video generally involves four distinct steps:

  • Partitioning each picture into multiple units called “coding units”

  • Predicting each unit using inter or intra prediction, and subtracting the prediction from each unit to find a residual

  • Transforming and quantizing the residual (the difference between the original picture unit and the prediction)

  • Entropy encoding the residual and other information (e.g., transform output, prediction information, mode information and headers) in preparation for transmission to the decoder. 

Once the encoded bitstream reaches the decoder, the decoder performs the same process in reverse. 
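
The round trip described above can be illustrated with a toy sketch for a single block. It is not HEVC: partitioning and the transform are omitted, zlib stands in for a real entropy coder, and the names encode_block, decode_block and step are hypothetical. The sketch also assumes every quantized level fits in one signed byte.

```python
import zlib

def encode_block(samples, prediction, step):
    """Toy encoder: residual -> quantize -> entropy-code (assumed names)."""
    # Residual: difference between the original samples and the prediction.
    residual = [s - p for s, p in zip(samples, prediction)]
    # Lossy quantization of the residual with a uniform step size.
    levels = [round(r / step) for r in residual]
    # zlib is only a stand-in for an entropy coder; levels must be in -128..127.
    return zlib.compress(bytes(level & 0xFF for level in levels))

def decode_block(payload, prediction, step):
    """Toy decoder: the same steps in reverse order."""
    raw = zlib.decompress(payload)
    # Undo the signed-byte packing.
    levels = [b - 256 if b > 127 else b for b in raw]
    # Dequantize and add the prediction back.
    return [p + level * step for p, level in zip(prediction, levels)]

samples = [100, 104, 98, 120]
prediction = [100, 100, 100, 100]
reconstructed = decode_block(encode_block(samples, prediction, 4), prediction, 4)
```

Because quantization is lossy, the reconstructed samples match the originals only to within half a quantization step.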

To specify the coding units (or “blocks” in the ’913 Patent), quadtrees are used, as shown below:

[Figure: quadtree partitioning of a picture into coding units]
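
As a rough illustration of the quadtree idea, the following sketch recursively splits a square region into four equal quadrants. The function name, the min_size parameter and the should_split callback are assumptions made for this example; in a real codec the split decisions are signaled in the bitstream rather than supplied by a callback.

```python
def quadtree_leaves(x, y, size, min_size, should_split):
    """Return the (x, y, size) leaf blocks of a quadtree rooted at (x, y)."""
    if size > min_size and should_split(x, y, size):
        half = size // 2
        leaves = []
        # Visit the four quadrants: top-left, top-right, bottom-left, bottom-right.
        for dy in (0, half):
            for dx in (0, half):
                leaves += quadtree_leaves(x + dx, y + dy, half, min_size, should_split)
        return leaves
    # Not split further: this region is a leaf block.
    return [(x, y, size)]

# Split the 64x64 root, then split only its top-left 32x32 quadrant again.
split_rule = lambda x, y, size: size == 64 or (x, y, size) == (0, 0, 32)
blocks = quadtree_leaves(0, 0, 64, 8, split_rule)
# blocks holds 7 leaves: four 16x16 blocks and three 32x32 blocks
```

The leaves always tile the root region exactly, which is why a quadtree can describe blocks of different sizes with only one split decision per node.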

U.S. PATENT 10,250,913

The ’913 Patent focuses on the prediction and transform steps, and the patent notes that improvements in coding efficiency can be achieved when the coding unit sizes for prediction differ from the coding unit sizes used for residual transform processing.

For example, the ’913 Patent discloses that prediction and residual blocks may have different sizes:

For transform coding, the blocks (or the corresponding blocks of sample arrays), for which a particular set of prediction parameters has been used, can be further split before applying the transform. The transform blocks can be equal to or smaller than the blocks that are used for prediction. It is also possible that a transform block includes more than one of the blocks that are used for prediction. Different transform blocks can have different sizes and the transform blocks can represent quadratic or rectangular blocks.

’913 Patent at 47:61-65.

To specify the prediction and transform coding unit sizes, the ’913 Patent describes two nested quadtrees that are defined, in part, by the “maximum size” of the coding units in each quadtree. The ’913 Patent describes the two nested quadtrees:

Thus, in accordance with the example presented above with respect to FIGS. 3a to 6a, sub-divider 28 defined a primary sub-division for prediction purposes and a subordinate sub-division of the blocks of different sizes of the primary sub-division for residual coding purposes. The data stream inserter 18 coded the primary sub-division by signaling for each treeblock in a zigzag scan order, a bit sequence built in accordance with FIG. 6a along with coding the maximum primary block size and the maximum hierarchy level of the primary sub-division. For each thus defined prediction block, associated prediction parameters have been included into the data stream. Additionally, a coding of similar information, i.e., maximum size, maximum hierarchy level and bit sequence in accordance with FIG. 6a , took place for each prediction block the size of which was equal to or smaller than the maximum size for the residual sub-division and for each residual tree root block into which prediction blocks have been pre-divided the size of which exceeded the maximum size defined for residual blocks. For each thus defined residual block, residual data is inserted into the data stream.

’913 Patent at 21:44-64.

The concept of nested quadtrees, and the logic that is used to divide them, is claimed in claims 1 and 13 of the ’913 Patent. Claim 1 is shown below, with the key contextual elements in bold and the key claim limitations discussed in this analysis bolded and underlined:

1. A decoder comprising:

an extractor configured to extract, from a data stream representing encoded video information, information related to first and second maximum region sizes, first and second subdivision information, and a maximum hierarchy level wherein the first maximum region size and the first subdivision information are associated with prediction coding and the second maximum region size and the second subdivision information are associated with transform coding;

a divider configured to:

divide an array of information samples representing a spatially sampled portion of the video information into a first set of root regions based on the first maximum region size,

sub-divide at least some of the first set of root regions into a first set of sub-regions using recursive multi-tree partitioning based on the first subdivision information,

determine whether a size of at least one of the first set of sub-regions exceeds the second maximum region size;

responsive to a determination that the size of at least one of the first set of sub-regions does exceed the second maximum region size, divide at least one of the first set of sub-regions into a second set of root regions of the second maximum region size, and

determine, for each of the second set of root regions of the second maximum region size, whether the respective root region of the second set of root regions is to be sub-divided;

responsive to a determination that the respective root region of the second set of root regions is to be sub-divided, sub-divide the respective root region of the second set of root regions into a second set of sub-regions using recursive multi-tree partitioning based on the second subdivision information and the maximum hierarchy level; and

a reconstructor configured to reconstruct the array of information samples using prediction coding in accordance with the first set of sub-regions and transform coding in accordance with the second set of sub-regions.

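Importantly, the bolded/underlined language (the “determine whether a size of at least one of the first set of sub-regions exceeds the second maximum region size” step, and the “responsive to a determination” condition on the division that follows it) was added by amendment to secure allowance over prior art, and it is this requirement that is not found in the HEVC standard, as explained below.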
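
The divider steps recited in claim 1 can be sketched roughly as follows. This is one reading of the claim language for illustration, not the patentee's or any codec's implementation. All names are hypothetical, the split callbacks stand in for the first and second subdivision information, region sizes are assumed to be powers of two, and the picture is assumed to be a whole number of first root regions. The else branch (a prediction sub-region that does not exceed the second maximum size becomes its own transform root) follows the specification passage quoted above, not the claim itself.

```python
def tile(x, y, width, height, tile_size):
    """Cover a region with a raster-ordered grid of square tiles."""
    return [(tx, ty, tile_size)
            for ty in range(y, y + height, tile_size)
            for tx in range(x, x + width, tile_size)]

def quadtree(x, y, size, split, max_level=None, level=0):
    """Recursive quadtree partitioning; max_level optionally caps the depth."""
    may_descend = max_level is None or level < max_level
    if size > 1 and may_descend and split(x, y, size):
        half = size // 2
        out = []
        for dy in (0, half):
            for dx in (0, half):
                out += quadtree(x + dx, y + dy, half, split, max_level, level + 1)
        return out
    return [(x, y, size)]

def divide(width, height, max_pred, max_tx, max_level, pred_split, tx_split):
    pred_blocks, tx_blocks = [], []
    # First set of root regions, based on the first maximum region size.
    for rx, ry, rsize in tile(0, 0, width, height, max_pred):
        # First set of sub-regions (prediction), from the first subdivision info.
        for px, py, psize in quadtree(rx, ry, rsize, pred_split):
            pred_blocks.append((px, py, psize))
            # The limitation added by amendment: an explicit size determination.
            if psize > max_tx:
                # Responsive to it, divide into root regions of the second max size.
                roots = tile(px, py, psize, psize, max_tx)
            else:
                roots = [(px, py, psize)]  # assumption drawn from the specification
            # Second set of sub-regions (transform), capped by the hierarchy level.
            for tx, ty, tsize in roots:
                tx_blocks += quadtree(tx, ty, tsize, tx_split, max_level)
    return pred_blocks, tx_blocks

pred, tx = divide(64, 64, 64, 16, 1,
                  pred_split=lambda x, y, s: s == 64,
                  tx_split=lambda x, y, s: (x, y) == (0, 0))
```

In this sketch the comparison psize > max_tx is an explicit, separate step that gates the division into second root regions, which is the structure the amended claim language describes.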

FILE HISTORY OF THE ’913 PATENT

During prosecution, the application leading to the ’913 Patent was rejected over prior art that disclosed using a maximum size of blocks for prediction coding and a second maximum size of blocks for transform coding. See Final Rejection Dated October 3, 2017.  This art was consistent with what is described in the HEVC standards documentation.

The applicant responded by amending the claims to add a new limitation for determining whether a size of at least one of the first set of sub-regions exceeds the second maximum region size, and added language to make the “dividing the at least one of the first set of sub-regions into a second set of root regions of the second maximum region size” step responsive to the newly added determination. Office Action Response Dated April 2, 2018.

The applicant further argued that the prior art “fails to teach or discuss a specific operation in which a size of a prediction sub-block in prediction coding is checked to determine whether that size exceeds a second maximum size (e.g., as indicated by the alleged transform size flag) associated with a transform block in transform coding, let alone further dividing the prediction sub-block to the second maximum size based on such size determination.” Id. (emphasis added).

In other words, the applicant asserted that an “operation . . . in prediction coding” must be used to determine whether the size of a prediction sub-block exceeds a transform quadtree maximum size, and then to divide the prediction sub-blocks during the prediction step to fit the maximum size of the transform quadtree. The applicant effectively moved a part of the partitioning of the transform quadtree from the transform step to the prediction step.

THE ’913 PATENT IS NOT ESSENTIAL TO THE HEVC STANDARD

The following sections of the November 2019 version of ITU-T H.265 have been identified by Access Advance as implicating claims 1 and 13 of the ’913 Patent:

  • 6.3.2

  • 7.3.2.2.1, 7.3.8.2, 7.3.8.4, 7.3.8.5, 7.3.8.8, 7.4.3.2.1, 7.4.9.4, 7.4.9.8

  • 8.5.3.1, 8.6.7

A summary of each of these sections of H.265 is provided below. None of them describes or refers to the requirements of claims 1 and 13 of the ’913 Patent that were added during prosecution, namely determining whether the size of a prediction sub-region exceeds the maximum size of a transform region. Nor do they describe any logic for dividing the prediction sub-regions to align with the maximum transform region size at all, much less in a manner responsive to that determination.

Section 6.3.2 is titled “Block and quadtree structures,” and gives an overview of the Coding Tree Blocks (CTBs) used during coding, and indicates that two quadtrees are used, one for the prediction tree and one for the transform tree. However, this section does not describe how the quadtrees are specified, nor does it suggest that transform quadtree regions are specified by determining that a prediction quadtree sub-region exceeds the maximum size of the transform quadtree, as required by claims 1 and 13 of the ’913 Patent.

Section 7 of the H.265 standard is titled “Syntax and Semantics,” and all of Sections 7.3.2.2.1, 7.3.8.2, 7.3.8.4, 7.3.8.5, 7.3.8.8, 7.4.3.2.1, 7.4.9.4, and 7.4.9.8 simply describe the syntax and variable names that are used in H.265. These sections do not describe any logic that would correspond to the requirements of claims 1 and 13 as described above.

Section 8.5.3.1 describes the inputs and outputs for decoding the prediction units in inter prediction mode. For example, it lists the input variables that specify the size, width, height, and index of the current luma prediction blocks, and describes the ordered steps taken to decode prediction units in inter prediction mode. It does not describe the determining steps discussed above.

Section 8.6.7 is titled “Picture construction process prior to in-loop filter process,” and describes the input variables for picture construction, including the variables that describe the predicted and residual samples of the current block. Once again, it does not describe the determining steps discussed above.

Thus, the H.265 sections identified by Access Advance, while generally relevant to describing quadtrees for prediction and transform coding, do not describe or require the logic in claims 1 and 13 of the ’913 Patent.

CONCLUSION

Often, it is easy for practitioners in this field to see whether a patent is essential to a technical standard; that should certainly be the case for companies that participate in the standard-setting process.  Yet most pools lack any analysis whatsoever.  The ’913 Patent is emblematic of a systemic issue in the licensing of allegedly standard-essential patents within video codec technology and beyond.