10.3.1 Word Searching
Use of the facilities in this clause in the style or transformation languages requires the word feature.
(word-parse nl string)
(word-parse nl)
This builds a new grove by performing an auxiliary parse using the Data Tokenizer Property Set.  string, if specified, is the ISO 639 language code of the language that should be assumed for the purposes of determining what constitutes a word.  The algorithm used to identify words is not specified in this International Standard.
<propset psn=datatok fullnm="Data Tokenizer Property Set">
<classdef rcsnm=tokroot appnm="tokenized root" conprop=strings>
<propdef rcsnm=strings datatype=nodelist ac=tokenstr>
<classdef rcsnm=tokenstr appnm="tokenized string" conprop=string>
<propdef rcsnm=string datatype=string>
For each member of nl, a tokenized string node is created for each word in the data of that member.  The root of the auxiliary grove has these tokenized string nodes as children.  A node-list of all the tokenized string nodes is returned.  If a member, x, of nl contains another member, y, of nl as a descendant, then the data of y is removed from the data of x before x is parsed for words.
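The behavior described above can be sketched informally as follows. This is an illustrative model only, not part of the procedure's definition: the classes Node, TokenStr, and TokRoot are hypothetical stand-ins for grove nodes, and the regular expression \w+ is an assumed word rule, since the actual word algorithm is not specified. The language argument is accepted but ignored in this sketch.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass(eq=False)
class Node:
    """Hypothetical source grove node; children are text or other nodes."""
    children: List[Union[str, "Node"]] = field(default_factory=list)


@dataclass(eq=False)
class TokenStr:
    """Models a tokenstr node; 'string' holds one word."""
    string: str


@dataclass(eq=False)
class TokRoot:
    """Models the tokroot node; 'strings' holds its tokenstr children."""
    strings: List[TokenStr] = field(default_factory=list)


def data_excluding(node: Node, members: List[Node]) -> str:
    """Return the data of node, skipping subtrees rooted at members of nl."""
    parts = []
    for child in node.children:
        if isinstance(child, str):
            parts.append(child)
        elif not any(child is m for m in members):
            parts.append(data_excluding(child, members))
    return "".join(parts)


def word_parse(nl: List[Node], language: Optional[str] = None) -> List[TokenStr]:
    """Build the auxiliary grove and return all of its tokenstr nodes."""
    root = TokRoot()
    for member in nl:
        # Assumed word rule: maximal runs of word characters.
        for word in re.findall(r"\w+", data_excluding(member, nl)):
            root.strings.append(TokenStr(word))
    return root.strings
```

In this model, when both an element and one of its descendants are members of nl, the words of the descendant are produced once, from the descendant itself, and not a second time from the ancestor.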
(select-tokens nl string)
Returns a node-list containing each member of nl that is a tokenized string node with a string property equal to string.
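The selection can be sketched informally as a simple filter. This is an illustrative model only: TokenStr is a hypothetical stand-in for a tokenized string node, a Python list stands in for a node-list, and string equality is assumed to be exact, case-sensitive comparison.

```python
from dataclasses import dataclass
from typing import Any, List


@dataclass(eq=False)
class TokenStr:
    """Hypothetical model of a tokenized string node."""
    string: str


def select_tokens(nl: List[Any], string: str) -> List[TokenStr]:
    """Keep members of nl that are tokenized string nodes equal to string."""
    return [n for n in nl if isinstance(n, TokenStr) and n.string == string]
```

Members of nl that are not tokenized string nodes are skipped and do not cause an error in this model, and the order of the remaining members is preserved.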