Sylvain
I think what happening with the software renderer is:
* you're somehow in background (so texture creation is not possible)
* it resizes and wants to push a SDL_WINDOWEVENT_SIZE_CHANGED
It call:
https://hg.libsdl.org/SDL/file/a010811d40dd/src/render/SDL_render.c#l683
* GetOutputSize
* SW_GetOutputSize
* SW_ActivateRenderer
* SDL_GetWindowSurface
* SDL_CreateWindowFramebuffer which is mapped to SDL_CreateWindowTexture
and it ends up re-creating the surface/a texture, while being in background
Sylvain
Currently SDL_CreateTextureFromSurface picks first valid format, and do a conversion.
format = renderer->info.texture_formats[0];
for (i = 0; i < renderer->info.num_texture_formats; ++i) {
if (!SDL_ISPIXELFORMAT_FOURCC(renderer->info.texture_formats[i]) &&
SDL_ISPIXELFORMAT_ALPHA(renderer->info.texture_formats[i]) == needAlpha) {
format = renderer->info.texture_formats[i];
break;
It could try to find a better format, for instance :
if SDL_Surface has no Amask, but a colorkey :
if surface fmt is RGB888, try to pick ARGB8888 renderer fmt
if surface fmt is BGR888, try to pick ABGR8888 renderer fmt
else
try to pick the same renderer format as surface fmt
if no format has been picked, use the fallback.
I think it goes with bug 4290 fastpath BlitNtoN
when you expand a surface with pixel format of size 24 to 32, there is a fast path possible.
So with this issue:
- if you have a surface with colorkey (RGB or BGR, not palette), it takes a renderer format where the conversion is faster.
(it avoids, if possible, RGB -> ABGR which means switching RGB to BGR)
- if you have a surface ABGR format, it try to take the ABGR from the renderer.
(it avoids, if possible, ABGR -> ARGB, which means switch RGB to BGR)
Sylvain
OpenGLES2 SDL renderer has support for textures ARGB, ABGR, RGB and BGR, whereas OpenGL SDL renderer only had ARGB.
If you think it's worth adding it, here's a patch. I quickly tried and it worked, but there may be missing things or corner case.
This is the best option for macOS and iOS, the only platforms with Metal.
Pre-Metal versions of these platforms will fall back to OpenGL (ES), as
appropriate.
Huge thanks to Alexander Szpakowski, who worked incredibly hard to get the
Metal renderer to such a high-quality state!
./src/render/SDL_render.c(2168): Error! E1054: Expression must be constant
./src/render/SDL_render.c(2168): Error! E1054: Expression must be constant
./src/render/SDL_render.c(2175): Error! E1054: Expression must be constant
./src/render/SDL_render.c(2175): Error! E1054: Expression must be constant
./src/render/SDL_render.c(2322): Error! E1054: Expression must be constant
./src/render/SDL_render.c(2322): Error! E1054: Expression must be constant
./src/render/SDL_render.c(2322): Error! E1054: Expression must be constant
./src/render/SDL_render.c(2322): Error! E1054: Expression must be constant
./src/render/SDL_render.c(2329): Error! E1054: Expression must be constant
./src/render/SDL_render.c(2329): Error! E1054: Expression must be constant
./src/render/SDL_render.c(2329): Error! E1054: Expression must be constant
./src/render/SDL_render.c(2329): Error! E1054: Expression must be constant
./src/render/software/SDL_render_sw.c(602): Error! E1054: Expression must be constant
./src/render/software/SDL_render_sw.c(602): Error! E1054: Expression must be constant
./src/render/software/SDL_render_sw.c(602): Error! E1054: Expression must be constant
./src/render/software/SDL_render_sw.c(602): Error! E1054: Expression must be constant
https://bugzilla.libsdl.org/show_bug.cgi?id=4350
We can't safely call GL_DestroyRenderer() until GL_LoadFunctions()
succeeds because we will be missing functions that we try to use
when activating the renderer for destruction if we have an GL context.
Sylvain
Re-opening this issue.
It fixes the test-case, but it introduces a regression with another bug (bug #4313).
So here's a new patch that activate cropping of the source surface to solve the issue.
It also reverts the wrong changeset.
It prevents unneeded colorkey error message.
Include guards in most changed files were missing, I added them keeping
the same style as other SDL files. In some cases I moved the include
guards around to be the first thing the header has to take advantage of
any possible improvements compiler may have for inclusion guards.
I have no idea if this works (or if it ever worked, having now examined this
code), as I have no way to compile or test this.
If it's broken, send patches. :)
This means it doesn't have to block while the current frame finishes using the
vertex buffer; it just moves on to the next, probably-not-in-use buffer.
- high-level filters out duplicate render commands from the queue so
backends don't have to.
- Setting draw color is now a render command, so backends can put color
information into the vertex buffer to upload with everything else instead
of setting it with slower dynamic data later.
- backends can request that they always batch, even for legacy programs,
since the lowlevel API can deal with it (Metal, and eventually Vulkan
and such...)
- high-level makes sure the queue has at least one setdrawcolor and
setviewport command before any draw calls, so the backends don't ever have
to manage cases where this hasn't been explicitly set yet.
- backends allocating vertex buffer space can specify alignment, and the
high-level will keep track of gaps in the buffer between the last used
positions and the aligned data that can be used for later allocations
(Metal and such need to specify some constant data on 256 byte boundaries,
but we don't want to waste all that space we had to skip to meet alignment
requirements).
It now uses a growable linked list that keeps a pool of allocated items for
reuse, and reallocs the vertex array as necessary. Testsprite2 can scale to
20,000 (or more!) draws now without drama.
This moves all the rendering to a command list that is flushed to the GL as
necessary, making most common activities upload a single vertex buffer per
frame and dramatically reducing state changes. In pathological cases,
like Emscripten running on iOS's Safari, performance can go from a dozen
draw calls killing your performance to 1000 draw calls running smoothly.
This is work in progress, and not ready to ship. Among other things, it has
a hardcoded array that isn't checked for overflow. But the basic idea is
sound!
The only place angle is activated and causes effect is RenderCopyEx. All other
methods which use vertex shader, leave angle disabled and cause useless sin/cos
calculation in shader.
To get around shader's interface is changed to a vector that contains results
of sin and cos. To behave properly when disabled, cos value is set with offset
-1.0 making 0.0 default when deactivated.
As nice side effect it simplifies GLES2_UpdateVertexBuffer: All attributes are
vectors now.
Additional background:
* On RaspberryPi it gives a performace win for operations. Tested with
[1] numbers go down for 5-10% (not easy to estimate due to huge variation).
* SDL_RenderCopyEx was tested with [2]
* It works around left rotated display caused by low accuracy sin implemetation
in RaspberryPi/VC4 [3]
[1] https://github.com/schnitzeltony/sdl2box
[2] https://github.com/schnitzeltony/sdl2rendercopyex
[3] https://github.com/anholt/mesa/issues/110
Signed-off-by: Andreas M?ller <schnitzeltony@gmail.com>
duckgrease
SDL_RenderCopyEx blits wrong image (in some cases it's bunch of alternating horizontal lines, some cases it's image from the wrong coordinate, and in some cases it's just a bunch of garbled pixels), when the following conditions are met:
- Use software renderer.
- Enable either horizontal or vertical flip.
- source and destination rectangles must have same width and height, and must be smaller than the size of the texture.
- source rectangle's X and Y coordinates must be 0.
Sylvain
Patch a few warnings when using:
-Wmissing-prototypes -Wdocumentation -Wdocumentation-unknown-command
They are automatically enabled with -Wall
Anthony @ POW Games
SDL_CreateTextureFromSurface makes an internal call to SDL_GetColorKey which can return an error and spams the error log with "Surface doesn't have a colorkey" even though the original function didn't return an error.
Olli-Samuli Lehmus
If one creates a window with the SDL_WINDOW_FULLSCREEN_DESKTOP flag, and creates a render target with SDL_SetHint(SDL_HINT_RENDER_SCALE_QUALITY, "linear"), and afterwards sets SDL_SetHint(SDL_HINT_RENDER_SCALE_QUALITY, "nearest"), after minimizing the window, the scale quality hint is lost on the render target. Textures however do keep their interpolation modes.
SDL now builds with gcc 7.2 with the following command line options:
-Wall -pedantic-errors -Wno-deprecated-declarations -Wno-overlength-strings --std=c99
- Use a single buffer for various non-changing constants accessed by the GPU, instead of multiple buffers.
- Do the half-pixel offset for points and lines using a transform matrix so we don't need a malloc when rendering.
- Don't add a half-pixel offset for other primitives and textures. This matches D3D and GL render behaviour.
- Remove the half-texel texture coordinate offset since it's not needed now that there's no more half-pixel position offset when rendering a texture.
- Don't try to set texture usage on iOS 8 since it doesn't exist there.
Eric wing
There is a tiny bug in the new overscan code for the SDL_renderer.
In SDL_renderer.c, line 1265, the if check for SDL_strcasecmp with "direct3d" needs to be inverted.
Instead of:
if(SDL_strcasecmp("direct3d", SDL_GetCurrentVideoDriver())) {
It should be:
if(0 == SDL_strcasecmp("direct3d", SDL_GetCurrentVideoDriver())) {
This bug causes the "overscan" mode to pretty much be completely ignored in all cases and all things remain letterboxed (as before the feature).
Yuri K. Schlesner
When using texture filtering, there are filtering artifacts visible on the edges of scaled textures, where the texture filtering pulls in texels from the other side of the texture. Using clamping texture modes wouldn't completely fix this since source rectangles don't need to cover the whole texture. (See screenshot attached in next post.)
The opengl driver uses clamping on textures and so avoid this at least in the cases where the source rect is the whole texture. The direct3d driver does not and so has problems in every case. I'm not sure if it can actually completely be fixed, but at least enabling clamping for direct3d would be one step in the right direction.
This works better for games where there may be a bunch of simulation logic that needs to be run before the next rendering pass, and prevents blocking if the next drawable is busy.
This isn't complete, but is enough to run testsprite2. It's currently
Mac-only; with a little work to figure out how to properly glue in a Metal
layer to a UIView, this will likely work on iOS, too.
This is only wired up to the configure script right now, and disabled by
default. CMake and Xcode still need their bits filled in as appropriate.
New functions get and set the YUV colorspace conversion mode:
SDL_SetYUVConversionMode()
SDL_GetYUVConversionMode()
SDL_GetYUVConversionModeForResolution()
SDL_ConvertPixels() converts between all supported RGB and YUV formats, with SSE acceleration for converting from planar YUV formats (YV12, NV12, etc) to common RGB/RGBA formats.
Added a new test program, testyuv, to verify correctness and speed of YUV conversion functionality.
Carlos
Angle supports GLES3 but when using these functions (SDL_CreateWindow and SDL_CreateRenderer), defaults again to GLES2.0.
A current workaround (hack) to retrieve a GLES3.0 context with Angle is:
1) set
SDL_GL_SetAttribute(SDL_GL_CONTEXT_MAJOR_VERSION, 3);
SDL_GL_SetAttribute(SDL_GL_CONTEXT_MINOR_VERSION, 0);
after InitSDL AND after calling SDL_CreateWindow (before SDL_CreateRenderer)
2) Comment lines 2032-2044 in SDL_render_gles2.c, funtion GLES2_CreateRenderer
window_flags = SDL_GetWindowFlags(window);
if (!(window_flags & SDL_WINDOW_OPENGL) ||
profile_mask != SDL_GL_CONTEXT_PROFILE_ES || major != RENDERER_CONTEXT_MAJOR || minor != RENDERER_CONTEXT_MINOR) {
changed_window = SDL_TRUE;
SDL_GL_SetAttribute(SDL_GL_CONTEXT_PROFILE_MASK, SDL_GL_CONTEXT_PROFILE_ES);
SDL_GL_SetAttribute(SDL_GL_CONTEXT_MAJOR_VERSION, RENDERER_CONTEXT_MAJOR);
SDL_GL_SetAttribute(SDL_GL_CONTEXT_MINOR_VERSION, RENDERER_CONTEXT_MINOR);
if (SDL_RecreateWindow(window, window_flags | SDL_WINDOW_OPENGL) < 0) {
goto error;
}
}
This retrives a GLES3 context as confirmed using glGetString(GL_VERSION). This should be fixed by modifying a few if's.
Sylvain
Few issues with YUV on SDL2 when using odd dimensions, and missing conversions from/back to YUV formats.
1) The big part is that SDL_ConvertPixels() does not convert to/from YUV in most cases. This now works with any format and also with odd dimensions,
by adding two internal functions SDL_ConvertPixels_YUV_to_ARGB8888 and SDL_ConvertPixels_ARGB8888_to_YUV (could it be XRGB888 ?).
The target format is hard coded to ARGB888 (which is the default in the internal of the software renderer).
In case of different YUV conversion, it will do an intermediate conversion to a ARGB8888 buffer.
SDL_ConvertPixels_YUV_to_ARGB8888 is somehow redundant with all the "Color*Dither*Mod*".
But it allows some completeness of SDL_ConvertPixels to handle all YUV format.
It also works with odd dimensions.
Moreover, I did some benchmark(SDL_ConvertPixel vs Color32DitherYV12Mod1X and Color32DitherYUY2Mod1X).
gcc-6.3 and clang-4.0. gcc performs better than clang. And, with gcc, SDL_ConvertPixels() performs better (20%) than the two C function Color32Dither*().
For instance, to convert 10 times a 3888x2592 image, it takes ~195 ms with SDL_ConvertPixels and ~235 ms with Color32Dither*().
Especially because of gcc vectorize feature that optimises all conversion loops (-ftree-loop-vectorize).
Nb: I put no image pitch for the YUV buffers. because it complexify a little bit the code and the API :
There would be some ambiguity when setting the pitch exactly to image width:
would it a be pitch of image width (for luma and chroma). or just contiguous data ? (could set pitch=0 for the later).
2) Small issues with odd dimensions:
If width "w" is odd, luma plane width is still "w" whereas chroma planes will be "(w + 1)/2". Almost the same for odd h.
Solution is to strategically substitute "w" by "(w+1)/2" at the good places ...
- In the repository, SDL_ConvertPixels() handles YUV only if yuv source format is exactly the same as YUV destination format.
It basically does a memcpy of pixels, but it's done incorrectly when width or height is odd (wrong size of chroma planes). This is fixed.
- SDL Renderers don't support odd width/height for YUV textures.
This is fixed for software, opengl, opengles2. (opengles 1 does not support it and fallback to software rendering).
This is *not* fixed for D3D and D3D11 ... (and others, psp ?)
Only *two* Dither function are fixed ... not sure if others are really used.
- This is not possible to create a NV12/NV12 texture with the software renderer, whereas other renderers allow it.
This is fixed, by using SDL_ConvertPixels underneath.
- It was not possible to SDL_UpdateTexture() of format NV12/NV21 with the software renderer. this is fixed.
Here's also two testcases:
- that do all combination of conversion.
- to test partial UpdateTexture
Anthony
This is what's making the software renderer crash with rotated destination rectangles of w or h = 0:
SDL_SetHint(SDL_HINT_RENDER_SCALE_QUALITY, "2");
Martin Gerhardy
just for easier debugging issues in the own code...
SDL_CreateRenderer should maybe also use this macro
Ryan C. Gordon
I'll go one better: it should have an SDL_assert().
UX-admin
I am compiling with the Sun Studio 12 u2 compiler. There are multiple issues with the build, but this particular issue appears to be that it is illegal to declare a union of a struct of floats and a float. While GCC 4.8.1 does not flag this as an error, Sun Studio is much more standards compliant and strict, halting further compilation with an error.
Simon Hug
The bug is in the GL_ResetState and GLES_ResetState functions which get called after a new GL context is created. These functions set the cached current color to transparent black, but the GL specification says the initial color is opaque white.
The attached patch changes the values to 0xffffffff to reflect the initial state of the current color. Should the ResetState functions get called anywhere else in the future, this probably has to call the GL functions itself to ensure that the colors match.
Littlefighter19
When trying to mirror something on the PSP, I've stumbled upon the problem,
that using SDL_RenderCopyEx with SDL_FLIP_HORIZONTAL flips the image vertically, vise-versa SDL_FLIP_VERTICAL flips the image horizontally.
Proposed patch would be swapping the check in line 944 with the one in line 948 in SDL_render_psp.c
Daniel
SDL_RenderReadPixels with SDL_RENDERER_SOFTWARE reads pixels from wrong coordinates.
SW_RenderReadPixels adjusts the rect coordinates according to the viewport. But since this is already done by SDL_RenderReadPixels, the final rect has x2 bigger X and Y.
Eric wing
Hi, I think I found a bug when using SDL_WINDOW_ALLOW_HIGHDPI with SDL_RenderSetLogicalSize on iOS. I use SDL_RenderSetLogicalSize for all my stuff. I just tried turning on SDL_WINDOW_ALLOW_HIGHDPI on iOS and suddenly all my touch/mouse positions are really broken/far-off-the-mark.
I actually don't have a real retina device (still) so I'm seeing this using the iOS simulator with a 6plus template.
Attached is a simple test program that can reproduce the problem. It uses RenderSetLogicalSize and draws some moving happy faces (to show the boundaries/space of the LogicalSize and that it is working correctly for that part).
When you click/touch, it will draw one more happy face where your button point is.
If you comment out SDL_WINDOW_ALLOW_HIGHDPI, everything works as expected. But if you compile with it in, the mouse coordinates seem really far off the mark. (Face appears far up and to the left.)
Alex Szpakowski on the mailing list suggests the problem is
"I believe this is a bug in SDL_Render?s platform-agnostic mouse coordinate scaling code. It assumes the units of the mouse coordinates are always in pixels, which isn?t the case where high-DPI is involved (regardless of whether iOS is used) ? they?re actually in ?DPI independent? coordinates (which matches the window size, but not the renderer output size)."
Additionally, if this is correct, the Mac under Retina is also probably affected too and "as well as any other platform SDL adds high-dpi support for in the future".
felix
The functions in src/render/SDL_yuv_mmx.c contain the following inline assembly snippet:
/* tap dance to workaround the inability to use %%ebx at will... */
/* move one thing to the stack... */
"pushl $0\n" /* save a slot on the stack. */
"pushl %%ebx\n" /* save %%ebx. */
"movl %0, %%ebx\n" /* put the thing in ebx. */
"movl %%ebx,4(%%esp)\n" /* put the thing in the stack slot. */
"popl %%ebx\n" /* get back %%ebx (the PIC register). */
Here's how it ended up in a binary on my old laptop:
0xb5c17dbd <ColorRGBDitherYV12MMX1X+93>: push $0x0
0xb5c17dbf <ColorRGBDitherYV12MMX1X+95>: push %ebx
0xb5c17dc0 <ColorRGBDitherYV12MMX1X+96>: mov 0xc(%esp),%ebx
0xb5c17dc4 <ColorRGBDitherYV12MMX1X+100>: mov %ebx,0x4(%esp)
0xb5c17dc8 <ColorRGBDitherYV12MMX1X+104>: pop %ebx
Apparently the compiler, oblivious to the fact that the assembly snippet manipulates the %esp register, decided to refer to the operand via that same register instead of via %ebp (I believe -fomit-frame-pointer enables this). This causes %ebx to be loaded with the wrong value, which later leads to a null pointer dereference.
Recent GCC can use the %ebx register normally: <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=47602#c16>. There is even an explicit constraint "b" for allocating it.