_InterlockedExchange_rel() is required for correctness on ARM because
the _ReadWriteBarrier() macro is only a compiler memory barrier, not a
hardware memory barrier. Due to ARM's relaxed memory model, this means
the '*lock = 0' write may be observed before the operations inside the
lock, causing possible corruption of data protected by the lock.
_InterlockedExchange_acq() is more efficient on ARM because it avoids an
expensive full memory barrier that _InterlockedExchange() does.
David Carlier
This form of 'or' provides a hint that performance
will probably be improved if shared resources dedicated
to the executing processor are released for use by other
processors
The real problem is that SDL_atomic.c was built in thumb mode instead of ARM mode, which is required to use the mcr instruction on ARM platforms. Added a compiler error to catch this case so we don't generate code that does infinite recursion.
I also added a potentially better way to handle things on Linux ARM platforms, based on comments in the Chromium headers, which we can try out after 2.0.10 ships.
The internal function SDL_EGL_LoadLibrary() did not delete and remove a mostly
uninitialized data structure if loading the library first failed. A later try to
use EGL then skipped initialization and assumed it was previously successful
because the data structure now already existed. This led to at least one crash
in the internal function SDL_EGL_ChooseConfig() because a NULL pointer was
dereferenced to make a call to eglBindAPI().