Invalid UTF-8 was being best-effort consumed, which given the right sequence of brokenness, could end up with diagnostic locations referring to bytes beyond the end of a line.
Improve the UTF-8 decoding so that it can detect when multi-byte codepoints are missing the high-bit being set.
Actually detect this in a lexer, and parser and produce errors.
Bug: tint:1437
Bug: chromium:1305648
Change-Id: I459f0df840b4ce8c4f5f82363f93602bf8326984
Reviewed-on: https://dawn-review.googlesource.com/c/tint/+/83540
Kokoro: Kokoro <noreply+kokoro@google.com>
Reviewed-by: Antonio Maiorano <amaiorano@google.com>
Commit-Queue: Ben Clayton <bclayton@google.com>