In the previous part of this series, I mentioned a 'frozen string' approach in the context of keeping an internal model of the parsed XML text. I now prefer the term 'Abstract Syntax Tree', though I'm not certain whether it strictly fits the Wiki definition or the one used on the JetBrains MPS web page.
In CoherentWeb, the 'XML model' is maintained as a simple collection of 'Node Descriptor Groups' (NDGs), with one NDG for each node, except attribute and text nodes, as these don't contribute to the XML hierarchy. Each CDATA section has its own NDG and, perhaps surprisingly, so does every 'XML' node within a comment.
NDGs differ slightly depending on whether they represent, for example, element nodes or comment nodes. Taking an element node, you might have the following:
The above should be fairly self-explanatory: Start and End are the character positions of the node within the XML text. NodePointer is the count of preceding sibling nodes in the tree; combined with ParentIndex and Level, it identifies the exact position of the NDG within the AST. This data is managed for each node, and the whole collection makes up the Abstract Syntax Tree (AST).
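As a rough sketch only, an element NDG might be modelled along the following lines. The field names Start, End, NodePointer, ParentIndex and Level come from the description above; the field types, the NodeKind enum, the convention that the root has a ParentIndex of -1, and the AncestorPath helper are all assumptions made for illustration, not the actual CoherentWeb code.

```csharp
using System.Collections.Generic;

// Hypothetical node kinds; the real editor may distinguish more (or fewer).
public enum NodeKind { Element, Comment, CData }

// A sketch of a Node Descriptor Group for one node in the XML text.
public class NodeDescriptorGroup
{
    public NodeKind Kind;    // assumed: which sort of node this NDG describes
    public int Start;        // character position where the node begins
    public int End;          // character position where the node ends
    public int NodePointer;  // number of preceding sibling nodes
    public int ParentIndex;  // index of the parent NDG in the list (assumed -1 for the root)
    public int Level;        // depth in the tree (assumed 0 for the root)
}

public static class AstSketch
{
    // Walks the ParentIndex links from the given NDG up to the root and
    // returns the NodePointer of each node on the way, root first.
    public static List<int> AncestorPath(List<NodeDescriptorGroup> ast, int index)
    {
        var path = new List<int>();
        for (int i = index; i >= 0; i = ast[i].ParentIndex)
        {
            path.Add(ast[i].NodePointer);
        }
        path.Reverse();
        return path;
    }
}
```

For a document such as `<a><b/><c><d/></c></a>`, the path for the `d` element would be the sibling positions of `a`, `c` and `d` in turn, which is the kind of information needed to locate a node without reparsing the text.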
The AST is used for a whole range of tasks including:
Knowing the context of the current selection
Building a lazy-loading element tree from the currently selected node
Showing the selected XPath location, complete with qualifying predicates
Displaying the currently selected element name
Showing XPath, name and value of all nodes that have value within the current element
Providing 'soft-edges' to tags and attributes to prevent accidental deletions
Determining whether a key-press requires a partial reparse, a full reparse, or just an offset within the model
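Two of the tasks above, knowing the context of the current selection and applying 'just an offset' after a key-press, can be sketched as follows. This is a minimal illustration under stated assumptions rather than the editor's real logic: the cut-down NodeDescriptorGroup here, the inclusive End position, and the FindInnermost and ShiftForInsert helpers are hypothetical, and a linear scan is used for clarity where a real implementation would probably exploit the ordering of the list.

```csharp
using System.Collections.Generic;

// Cut-down NDG with only the fields this sketch needs.
public class NodeDescriptorGroup
{
    public int Start;  // character position where the node begins
    public int End;    // character position where the node ends (assumed inclusive)
    public int Level;  // depth in the tree, 0 for the root
}

public static class AstEditSketch
{
    // Returns the index of the deepest NDG whose span contains the given
    // character position, or -1 if no node contains it.
    public static int FindInnermost(List<NodeDescriptorGroup> ast, int position)
    {
        int best = -1;
        for (int i = 0; i < ast.Count; i++)
        {
            var node = ast[i];
            bool contains = node.Start <= position && position <= node.End;
            if (contains && (best < 0 || node.Level > ast[best].Level))
            {
                best = i;
            }
        }
        return best;
    }

    // Adjusts the model after 'count' characters are inserted at 'position':
    // every Start or End at or beyond the insertion point moves right.
    public static void ShiftForInsert(List<NodeDescriptorGroup> ast, int position, int count)
    {
        foreach (var node in ast)
        {
            if (node.Start >= position) node.Start += count;
            if (node.End >= position) node.End += count;
        }
    }
}
```

The point of the second helper is that an insertion which cannot change the node structure only needs arithmetic on the stored positions, with no parsing at all.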
The screenshot below shows the amount of information that needs to be displayed to the user when a node is selected; the AST is critical to doing this in a performant way:
As well as the AST, the syntax coloring of the XML text is also exploited, as it carries useful information such as whether an attribute name or an attribute value has been selected, so it isn't necessary to replicate this in the model. Even opening and closing quotation marks are given slightly different colors (is this cheating?) so they can be told apart.
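All node data is stored in a generic List<NodeDescriptorGroup>. NodeDescriptorGroup is a class. It could have been a struct, but that would not move the AST onto the stack: the elements of a List<T> are held in a heap-allocated array either way, so stack overflow is not the concern, even for very large XML documents. The practical difference is that struct elements are copied by value, so updating an NDG in place (for example, shifting its offsets after an edit) is more awkward than with a class.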
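To illustrate what the class-versus-struct choice means when the items sit in a List<T>, here is a small sketch. The NdgClass and NdgStruct types and both methods are hypothetical stand-ins with a single field, not the real NodeDescriptorGroup.

```csharp
using System.Collections.Generic;

public class NdgClass { public int Start; }
public struct NdgStruct { public int Start; }

public static class ValueVsReference
{
    // With a class, the list holds references, so an element can be
    // updated in place through the indexer.
    public static int ShiftClass()
    {
        var list = new List<NdgClass> { new NdgClass { Start = 10 } };
        list[0].Start += 5;
        return list[0].Start;
    }

    // With a struct, the indexer returns a copy. Writing
    // "list[0].Start += 5;" is rejected by the compiler (error CS1612),
    // so the copy has to be modified and then stored back.
    public static int[] ShiftStruct()
    {
        var list = new List<NdgStruct> { new NdgStruct { Start = 10 } };
        var copy = list[0];
        copy.Start += 5;
        int beforeWriteBack = list[0].Start;  // the list element is still unchanged here
        list[0] = copy;
        return new[] { beforeWriteBack, list[0].Start };
    }
}
```

Here ShiftClass returns 15, while ShiftStruct returns 10 followed by 15, showing that the struct element only changes once the modified copy is assigned back into the list.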
So, that's the Abstract Syntax Tree, an important part of an XML Editor design if you want to avoid excessive parsing of the XML text and provide user interface features where the exact selection context must be known.

