nv2a: Support signed textures #36
Conversation
What happens when R8 or something has signed channels? Will G and B be affected by R signedness? Will A still be 1.0?
hw/xbox/nv2a/nv2a_pgraph.c (review on an outdated diff)
@@ -4089,31 +4089,113 @@ static uint8_t* convert_texture_data(const TextureShape s,
                                     unsigned int slice_pitch)
{
    //FIXME: Handle all formats
    if (s.color_format == NV097_SET_TEXTURE_FORMAT_COLOR_LU_IMAGE_A8R8G8B8) {
    if ((s.color_format == NV097_SET_TEXTURE_FORMAT_COLOR_LU_IMAGE_A8R8G8B8) ||
FIXME: Explain what this code path does
Note to self: This flips the sign bit.
Normally this is two's complement. So (raw byte = signed value = unsigned value):

0x80 = -128 = 128
0xFF =   -1 = 255
0x00 =    0 =   0
0x7F =  127 = 127

Note the discontinuity between 0xFF and 0x00 when interpreting them as unsigned values. By flipping the sign bit and interpreting as unsigned, we achieve the following:

0x80 will be 0x00 =    0 =   0
0xFF will be 0x7F =  127 = 127
0x00 will be 0x80 = -128 = 128
0x7F will be 0xFF =   -1 = 255

So we have gotten rid of the discontinuity.
In the shader, we scale and add a bias to bring it back into the -128 to 127 range.
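To make this concrete, here is a minimal standalone C sketch (not code from this PR) that applies the same sign-bit flip and prints the mapping from the table above:

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper (not the PR's code): flip the sign bit so a
 * two's-complement byte becomes a monotonically increasing unsigned byte. */
static uint8_t flip_sign_bit(uint8_t raw)
{
    return raw ^ 0x80;
}

int main(void)
{
    const uint8_t samples[] = { 0x80, 0xFF, 0x00, 0x7F };
    for (unsigned i = 0; i < sizeof(samples); i++) {
        uint8_t raw = samples[i];
        uint8_t flipped = flip_sign_bit(raw);
        printf("0x%02X (signed %4d) -> 0x%02X (unsigned %3u)\n",
               raw, (int8_t)raw, flipped, (unsigned)flipped);
    }
    return 0;
}
```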
return NULL;
}

// Convert XOR mask to be re-usable for 8 or 16bpp
FIXME: Add comment that this is a hack / incomplete
How are 1-bit values signed? Like A1R5G5B5?
@@ -4086,6 +4086,38 @@ static void convert_yuy2_to_rgb(const uint8_t *line, unsigned int ix,
*b = cliptobyte((298 * c + 516 * d + 128) >> 8);
}

static const uint32_t* get_gl_sign_bits(GLenum gl_format, GLenum gl_type) {
Explain what this does
@@ -4086,6 +4086,38 @@ static void convert_yuy2_to_rgb(const uint8_t *line, unsigned int ix,
*b = cliptobyte((298 * c + 516 * d + 128) >> 8);
}

static const uint32_t* get_gl_sign_bits(GLenum gl_format, GLenum gl_type) {
if ((gl_format == GL_BGRA) && (gl_type == GL_UNSIGNED_INT_8_8_8_8_REV)) {
This is also used with GL_RGBA, GL_UNSIGNED_INT_8_8_8_8.
So I assume that a bias is wrong, but 8 bits hides that problem. This is a design issue that I'll try to solve. But as a temporary hack we can probably fall back to the current solution, too (as the normalized result will be "good enough" for most applications).
@@ -1055,6 +1055,8 @@
# define NV097_SET_TEXTURE_FORMAT_COLOR_LU_IMAGE_R5G6B5 0x11
# define NV097_SET_TEXTURE_FORMAT_COLOR_LU_IMAGE_A8R8G8B8 0x12
# define NV097_SET_TEXTURE_FORMAT_COLOR_LU_IMAGE_Y8 0x13
# define NV097_SET_TEXTURE_FORMAT_COLOR_LU_IMAGE_R8B8 0x16
# define NV097_SET_TEXTURE_FORMAT_COLOR_LU_IMAGE_G8B8 0x17
Split this commit into a separate PR
qstring_append(final, "\n");
qstring_append(final, "vec4 signed_texture(vec4 sample, bvec4 mask) {\n");
qstring_append(final, " vec4 signed_sample = sample * 2.0 - 1.0;\n");
qstring_append(final, " return mix(sample, signed_sample, mask);\n");
// Hack so interpolation will work properly
uint8_biased = uint8 xor 0x80;
// Normalization step
//FIXME: Will GL use / 255.0, or / 15.0 etc.?
uint8_biased_normalized = uint8_biased / 254;
// Unbias to recover intended value
sint8 = (uint8_biased_normalized * 2 - 1) - 2/254

Equivalent (unconfirmed):

// Normalization step
uint8_biased_normalized = uint8_biased / 255; // Assumed as part of GL
// Unbias to recover intended value
sint8 = (uint8_biased_normalized * 255 - 1) / 127 - 1.0

Equivalent (unconfirmed):

// Normalization step
uint8_biased_normalized = uint8_biased / 255; // Assumed as part of GL
// Unbias to recover intended value
const scale = 255 / 127
const bias_prime = -1.0 - 1/127
sint8 = uint8_biased_normalized * scale + bias_prime

This is closer to the values shown for SNORM8 at https://github.com/nlguillemot/SNormTest
(Clamping not shown here)
Also see equation 2.3 (page 11) of https://www.khronos.org/registry/OpenGL/specs/gl/glspec33.core.pdf
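As a sanity check on the pseudocode above, here is a small standalone C sketch (not part of the PR) that evaluates all three unbias variants for every byte value, each paired with its stated normalization step, and compares them against the reference sint8 / 127.0:

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    for (int raw = 0; raw < 256; raw++) {
        int8_t sint8 = (int8_t)raw;           /* two's-complement interpretation */
        uint8_t biased = (uint8_t)raw ^ 0x80; /* sign bit flipped before upload */
        double reference = sint8 / 127.0;     /* unclamped SNORM-style value */

        /* Variant 1: normalize by 254 */
        double n1 = biased / 254.0;
        double v1 = (n1 * 2.0 - 1.0) - 2.0 / 254.0;

        /* Variants 2 and 3: normalize by 255 (assumed GL behaviour) */
        double n2 = biased / 255.0;
        double v2 = (n2 * 255.0 - 1.0) / 127.0 - 1.0;

        double scale = 255.0 / 127.0;
        double bias_prime = -1.0 - 1.0 / 127.0;
        double v3 = n2 * scale + bias_prime;

        assert(fabs(v1 - reference) < 1e-9);
        assert(fabs(v2 - reference) < 1e-9);
        assert(fabs(v3 - reference) < 1e-9);
    }
    printf("All three variants match sint8 / 127.0 for every byte value.\n");
    return 0;
}
```

All three variants reduce to sint8 / 127.0 given their own normalization step; whether GL actually normalizes with / 255.0 (as the second and third variants assume) remains the open FIXME above.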
bool gl_component_signed[4] = { false, false, false, false };
for(int i = 0; i < 4; i++) {

// Use GL swizzle mask to figure out which GL component to use
I don't remember what I meant by this and the comment doesn't help.
I should improve this comment.
@@ -4080,6 +4127,116 @@ static uint8_t* convert_texture_data(const TextureShape s,
unsigned int row_pitch,
unsigned int slice_pitch)
{
//FIXME: Handle all formats
if (true) {
Should probably happen after format conversion
// Get texture format information
ColorFormatInfo f = kelvin_color_format_map[s.color_format];

//FIXME: Extend for other formats in the main texture table
This comment was probably added when get_gl_sign_bits wasn't a function yet. It can be removed.
GLenum gl_swizzle = f.gl_swizzle_mask[i];

//FIXME: What happens with these?
if (gl_swizzle == GL_ZERO) {
This should be a switch-case?
gl_component_used[j] = true;

// Apply XOR mask for signedness, and mark as signed
if (s.signed_rgba[i]) {
Can't this be checked at the start of the loop?
}

if (gl_swizzle == GL_RED) {
    j = 0;
j should be called something like gl_component_index
uint32_t gl_xor_mask = 0;
bool gl_component_used[4] = { false, false, false, false };
bool gl_component_signed[4] = { false, false, false, false };
for(int i = 0; i < 4; i++) {
i should be called something like nv2a_component_index
This PR adds a bit of a hack for signed textures. Consider this merely a stub.
Xbox allows per-texture, per-component signed flags. This flag decides whether a texture component is interpreted as an unsigned value in the range 0.0 to 1.0 (UNORM), or as a signed two's-complement value in the range -1.0 to 1.0 (SNORM).
In hardware, the output of this sample is probably a sign-extended fixed-point integer value.
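For illustration (not from this PR), here is a minimal C sketch of the two interpretations of the same raw byte. The mappings assume GL-style conventions (UNORM: c / 255, SNORM: c / 127 with the most negative value clamped to -1.0), which may differ slightly from the real hardware:

```c
#include <stdint.h>
#include <stdio.h>

/* UNORM8: raw byte mapped to 0.0 .. 1.0 */
static float unorm8_to_float(uint8_t raw)
{
    return raw / 255.0f;
}

/* SNORM8: raw byte reinterpreted as two's complement, mapped to -1.0 .. 1.0
 * (GL-style: divide by 127 and clamp the most negative value to -1.0) */
static float snorm8_to_float(uint8_t raw)
{
    float v = (int8_t)raw / 127.0f;
    return v < -1.0f ? -1.0f : v;
}

int main(void)
{
    const uint8_t samples[] = { 0x00, 0x7F, 0x80, 0xFF };
    for (unsigned i = 0; i < sizeof(samples); i++) {
        printf("0x%02X -> UNORM %.4f, SNORM %+.4f\n",
               samples[i],
               unorm8_to_float(samples[i]),
               snorm8_to_float(samples[i]));
    }
    return 0;
}
```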
There are different solutions we could use to solve this; this PR implements a limited variant of the third proposal (which has many issues):
1. Two's complement conversion in shader
The value is sampled as UNORM by OpenGL (mapped to the range 0.0 to 1.0), but then mapped to SNORM in the range -1.0 to 0.0 or 0.0 to 1.0, depending on the highest bit (by checking whether the UNORM value is above 0.5).
This approach has an interpolation issue: with unsigned textures (as OpenGL samples them), the raw 8-bit values 127 and 128 would both be sampled as roughly 0.5. Going from 127 (just below 0.5) to 128 (just above 0.5) is a slow value increase, so the interpolated value goes up as the sample point moves closer to raw value 128.
For signed textures (two's complement), the sampled values would be 1.0 (127) and -1.0 (-128) respectively. Going from 127 (1.0) to 128 (-1.0), we should expect a rapid decrease in the interpolated value.
However, as OpenGL doesn't interpolate in this direction, the interpolated values are wrong in such transition areas.
Also see nlguillemot/SNormTest#2 (comment)
2. SNORM textures
OpenGL supports signed (SNORM) textures itself. These aren't signed per-component (as on Xbox), so we'd have to sample different textures (or texture views in OpenGL 4.3 and above), then mix the results in the shader (see the sketch after this list).
3. Software decoding on CPU
We could just decode all affected formats on the CPU and upload the expected results to the GPU.
This PR implements this approach, but doesn't support DXT textures.
4. Software interpolation in shader
We can also use point-filtering and do all interpolation in the shader.
This approach could also be extended with integer lookups / software DXT decompression.
This means it should work for all formats and could optionally provide raw integer access; but we can also continue to use the host GPU's format decoder / normalization.
We could also integrate the software unswizzling into the shader, so the CPU rarely has to touch resources. We could move this into a compute(-like) shader if the performance impact is too high.
As we'd also control the texture address, we could even accurately represent the texture address fraction bits for the interpolation (so effects like banding are emulated).
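As a rough illustration of the second proposal (not code from this PR), the same 8-bit RGBA texel data could be uploaded twice, once as UNORM and once as SNORM, with the shader selecting per component via mix() as in the diff above. The helper name, the GL header, and the use of plain glTexImage2D below are assumptions; GL_RGBA8_SNORM needs OpenGL 3.1+, and the texture views mentioned above need 4.3:

```c
#include <epoxy/gl.h>  /* assumption: use whatever GL header the project already uses */

/* Hypothetical helper: upload the same 8-bit RGBA texel data twice, once as
 * UNORM and once as SNORM. The fragment shader could then pick per component,
 * e.g. mix(texture(tex_unorm, uv), texture(tex_snorm, uv), bvec4(signed_rgba)). */
static void upload_unorm_and_snorm(const void *texels, int width, int height,
                                   GLuint *tex_unorm, GLuint *tex_snorm)
{
    glGenTextures(1, tex_unorm);
    glBindTexture(GL_TEXTURE_2D, *tex_unorm);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA8, width, height, 0,
                 GL_RGBA, GL_UNSIGNED_BYTE, texels);

    glGenTextures(1, tex_snorm);
    glBindTexture(GL_TEXTURE_2D, *tex_snorm);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA8_SNORM, width, height, 0,
                 GL_RGBA, GL_BYTE, texels);
}
```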
This PR strongly interacts with texture / surface caching and probably dotmapping.
TODO: