== Performance improvement ideas == === 1. Low-level parsing === ==== 1.0.1 Check name validity only once? ==== Perhaps we can combine SymbolTable, and name validation access such that we only store valid names in SymbolTable. As such, validity only needs to be checked if there's no hit. ==== 1.1 Integrate char decoding and scanning ==== (BIG EFFORT) It would be possible to combine character decoding (byte -> UTF-8 char) and low-level XML scanning, so that effectively we would have somewhat XML-token-aware Reader. This could speed up parsing a bit, maybe by 10% or so. Downside is that Reader would be clunkier, and each decoder would be bigger than they are now. An additional benefit would be 100% accurate byte offsets for the reader. === 2. DTD handling === ==== 2.1 Efficient next-state data struct for DFA ==== Currently standard HashMap is used for figuring out next-state for DFAs. It would be possible to create type-safe alternative, that would be more compact (lower mem usage), and potentially even slightly faster.