Retrieving data from tree-structured XML databases can be a tedious and computationally intensive process that produces extraneous intermediate partial matches that are not ultimately returned as a search result. This technology is an XML query process, TwigStack, which provides a compact search path for easy retrieval of search matches. TwigStack implements a novel holistic twig join algorithm for matching an XML query twig pattern. The technique uses a chain of linked stacks to compactly represent partial results, which are then composed to obtain matches for the twig pattern. TwigStack provides a streamlined query solution to return relevant search results quickly, allowing XML databases to be navigated more efficiently.
Prior XML query algorithms have typically decomposed the twig pattern into binary structural relationships and achieved twig matching by either (i) using structural join algorithms to match the binary relationships against the XML database, or (ii) stitching together these basic matches. A disadvantage of this approach is that the size of the intermediate results can be large even when the input and output sizes are smaller. TwigStack overcomes this limitation by first generating partial solutions to individual query paths, then merging them together to compute the answers to the twig pattern. This ensures that no intermediate solution is larger than the final answer to the query twig pattern.
Analysis of TwigStack using a range of real and synthetic experimental data showed that the algorithm was I/O and CPU optimal for a large class of query twig patterns.
Tech Ventures Reference: IR Proxy71