Skip to content

Speed up large XML parses#12917

Closed
singpolyma wants to merge 1 commit into
HaxeFoundation:developmentfrom
singpolyma:faster-xml
Closed

Speed up large XML parses#12917
singpolyma wants to merge 1 commit into
HaxeFoundation:developmentfrom
singpolyma:faster-xml

Conversation

@singpolyma
Copy link
Copy Markdown
Contributor

Creating the children array and attribute map for nodes that will never use it (text nodes, etc) is both wasteful in terms of RAM and in terms of time. Parsing a 400MB XML file this small change saves me a full second of time in benchmarks.

Creating the children array and attribute map for nodes that will never
use it (text nodes, etc) is both wasteful in terms of RAM and in terms
of time. Parsing a 400MB XML file this small change saves me a full
second of time in benchmarks.
@back2dos
Copy link
Copy Markdown
Member

Probably one of the few places where inheritance would really make sense: Have one subclass for trees (documents and elements) and one for leafs (the rest). This would replace all the node type conditionals with polymorphism and also make smaller instances on static targets (even if fields are null, they still take 8 bytes each).

@singpolyma
Copy link
Copy Markdown
Contributor Author

Sure, inheritance or enum could make sense. Would have to benchmark to see the impact. This small change has a good performance impact without any API changes though.

@Simn
Copy link
Copy Markdown
Member

Simn commented Jun 1, 2026

Surely this should use a switch instead.

@singpolyma
Copy link
Copy Markdown
Contributor Author

@Simn for what specifically? It's two overlapping conditions...

@Simn Simn closed this in 44069c4 Jun 3, 2026
@tobil4sk
Copy link
Copy Markdown
Member

tobil4sk commented Jun 3, 2026

Does this have an effect on the null safety invariants of this class?

@Simn
Copy link
Copy Markdown
Member

Simn commented Jun 3, 2026

Does this have an effect on the null safety invariants of this class?

Sorry, what?

@tobil4sk
Copy link
Copy Markdown
Member

tobil4sk commented Jun 3, 2026

The children field is typed as Array<Xml>, not Null<Array<Xml>> which it maybe should be now since it can be null. (it's also curious that this didn't trigger an error in CI)

I was wondering if making it null instead of [] can break some methods that assume a non-null array, however, looking into it, it appears there are appropriate ensureElementType checks already which should make sure an exception is thrown instead of a null-access error.

@RblSb
Copy link
Copy Markdown
Member

RblSb commented Jun 3, 2026

Haxe std is only trying to be null-safe as external interface right now, not by implementation. Marking type as nullable should be correct change (since this is also public field)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants