Learning the Virtools Scripting Language (VSL) by Example - Parsing Strings
<div dir="ltr">
<p>This exercise demonstrates how to parse a string and fill an array with the information it contains.</p>
<ol>
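<li>Start a new composition and create an array. Rename the array to "Players"
(without the quotes) and add three columns, named and typed as follows (column name - column type):<br><ul>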
<li>NickNames - String
</li>
<li>Age - Integer
</li>
<li>Score - Integer.<br>
</li>
</ul>
</li>
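<li> Create a new script on the Level and add a Run VSL BB. In the VSL Script Manager, add two pIns. Rename the first pIn to "data" and set its type to String. Rename the second pIn to "array" and set its type to Array.<br>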
</li>
<li>Switch to the Schematic workspace and enter the following characters (without the quotes) as the value of the "data" pIn:
<p>
"Eva,22,1024. <br>
Jane,34, 544. <br>
Pierre, 17, 5410. <br>
John, 85,10."</p>
<p>
You may want to expand the data entry field of the 'data' pIn.</p>
<p>
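The idea is to parse the input string, extract the information it contains, and copy that information into the array. In this exercise, the information needed is the name, age, and score. The commas and periods are of no interest as data, but they are very useful as characters that separate data fields or mark the end of a line. You will use the StringTokenizer class listed in <a href="https://sites.google.com/site/x3dofcn/vsl-virtools/vsl-sdk-binding-tables/vsl_classes">VSL &lt;-&gt; SDK Binding Tables - Classes and Methods</a>.
Given the string to parse and the delimiters to use, the method "NextToken(str iPrevToken)" extracts the tokens one by one.<br><br><span style="font-family: arial,sans-serif; color: #000000;">【Translator's note: online resource - </span>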
<span style="font-family: arial,sans-serif; color: #000000;">
<strong>bruce</strong>
</span>
<span style="font-family: arial,sans-serif; color: #000000;">
</span>
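<span style="font-family: arial,sans-serif; color: #000000;">- Qiu Zhongpan's Chinese translation of "MasteringJava2" contains this passage:<br><br>
The StreamTokenizer class extracts recognizable substrings and marker symbols from an input stream according to user-defined rules. This process is called <strong>tokenizing</strong>
(<strong>令牌化</strong>
), because the stream is reduced to token symbols. A <strong>token</strong>
(<strong>令牌</strong>
) usually represents a keyword, a variable name, a string, a <strong>literal</strong>,
or syntactic punctuation such as braces.<br><br>
Following Qiu Zhongpan's translation, we standardize on:<br>
token: 令牌<br>
tokenizing: 令牌化<br>
tokenizer: 令牌解析器<br><br>
The rendering "标记" (mark) suggested by cherami is also understandable, but a token more precisely means each of the substrings into which a string (or stream) is split by spaces, ',' and so on (rules specified by the user), so "标记" seems too narrow. Borrowing the term "令牌" from token ring networks strikes me as a vivid choice.<br></span>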
】<br></p>
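<p>The two-level parsing idea can also be tried outside Virtools. The following is a minimal Python sketch, not VSL: the function names <code>tokenize</code> and <code>parse_records</code> are hypothetical, the sample is a shortened single-line version of the exercise data, and skipping empty pieces is an assumption of this sketch rather than documented StringTokenizer behavior.</p>

```python
def tokenize(text, separator):
    """Return the non-empty pieces of text found between separator characters."""
    # Assumption of this sketch: empty pieces (for example after a trailing
    # separator) are skipped.
    return [piece for piece in text.split(separator) if piece]


def parse_records(data):
    """Split data into records on '.', then split each record into fields on ','."""
    return [tokenize(line, ",") for line in tokenize(data, ".")]


if __name__ == "__main__":
    # Shortened, single-line version of the exercise data.
    sample = "Eva,22,1024.Jane,34,544."
    for row in parse_records(sample):
        print(row)
```

<p>The outer call plays the role of the line tokenizer and the inner call the role of the word tokenizer: each record becomes one row and each field one column.</p>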
</li>
<li>Enter the following code in the code window:<br><pre>void main()<br>
{<br><span>// We clear all data in the array</span>
<br>
array.Clear();<br><br><span>// We create the first tokenizer in order to<br>
// get data line by line. The "." separates lines.</span>
<br>
str tokenLine = null;<br>
StringTokenizer tokenizerLine(data.CStr(), ".");<br><br>
int row = 0;<br><br><span>// Get new line</span>
<br>
while (tokenLine = tokenizerLine.NextToken(tokenLine))<br>
{<br><span>// For each line extracted, we add a row in the array</span>
<br>
array.AddRow();<br><br><span>// The second tokenizer works with the extracted line<br>
// to extract the data on a word by word basis.<br>
// The "," separates words.</span>
<br>
str tokenWord = null;<br>
StringTokenizer tokenizerWord(tokenLine, ",");<br><br>
int column = 0;<br><br><span>// Get new word</span>
<br>
while (tokenWord = tokenizerWord.NextToken(tokenWord))<br>
{<br><span>// Insert word in the array</span>
<br>
array.SetElementStringValue(row, column, tokenWord);<br>
++column;<br>
} <br>
++row;<br>
}<br>
}
</pre>
</li>
<li>Compile the VSL script and run it. Check that the array contains the following:<br><br><br><div style="display: block; text-align: left;">
<a href="https://sites.google.com/site/x3dofcn/vsl-virtools/Examples/4_parse-string/vsl_parse1.png?attredirects=0"><img src="https://sites.google.com/site/x3dofcn/_/rsrc/1245403946960/vsl-virtools/Examples/4_parse-string/vsl_parse1.png" border="0" alt=""></a>
</div>
<br><br>
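As you can see, the names "Jane", "Pierre" and "John" were not extracted cleanly: each begins with a newline character (the non-printing newline is displayed as a small box). To remove this extra character, add a function to the VSL script that strips the newline. The following code should do the job:<br><pre>void RemoveFirstReturnCharacter(String str2clear)<br>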
{<br><span>// If first character is equal to return...</span>
<br>
if (str2clear[0] == '\n')<br><span>// ... crop string from second character to the end</span>
<br>
str2clear = str2clear.Crop(1, str2clear.Length()-1);<br>
}<br></pre>
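<p>The same check can be illustrated in Python. This is only a sketch of the logic, not VSL: <code>remove_first_newline</code> is a hypothetical name, and unlike the VSL function, which modifies its argument, this version returns a new string.</p>

```python
def remove_first_newline(text):
    """Return text without its first character when that character is a newline."""
    if text.startswith("\n"):
        # Keep everything from the second character to the end.
        return text[1:]
    return text


if __name__ == "__main__":
    # Names as the line tokenizer would hand them over in the exercise.
    names = ["Eva", "\nJane", "\nPierre", "\nJohn"]
    print([remove_first_newline(name) for name in names])
```

<p>Only a newline in the first position is removed; any other leading character, such as a space, is left untouched.</p>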
</li>
<li>Modify your code to include the function above. Your code should now look like this:<br><pre>void main()<br>
{<br><span>// We clear all data in the array</span>
<br>
array.Clear();<br><br><span>// We create the first tokenizer in order to<br>
// get data line by line</span>
<br>
str tokenLine = null;<br>
StringTokenizer tokenizerLine(data.CStr(), ".");<br><br>
int row = 0;<br><br><span>// Get new line</span>
<br>
while (tokenLine = tokenizerLine.NextToken(tokenLine))<br>
{<br><span>// For each line extracted, we add a row in the array</span>
<br>
array.AddRow();<br><br><span>// The second tokenizer works with the extracted line<br>
// to extract the data on a word by word basis.<br>
// The "," separates words.</span>
<br>
str tokenWord = null;<br>
StringTokenizer tokenizerWord(tokenLine, ",");<br><br>
int column = 0;<br><br><span>// Get new word</span>
<br>
while (tokenWord = tokenizerWord.NextToken(tokenWord))<br>
{<br><span>// Remove first character if it's a '\n'</span>
<br>
String strToClear = tokenWord;<br>
RemoveFirstReturnCharacter(strToClear);<br><br><span>// Insert word in the array</span>
<br>
array.SetElementStringValue(row, column, strToClear.CStr());<br>
++column;<br>
} <br>
++row;<br>
}<br>
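}<br></pre>
Before each word is inserted into the array, the new function now checks the string and, if necessary, modifies it by removing the leading newline character.<br>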
</li>
<li> Compile your VSL script and run it. Doesn't your array look much better now?<br><br><br><div style="display: block; text-align: left;">
<a href="https://sites.google.com/site/x3dofcn/vsl-virtools/Examples/4_parse-string/vsl_parse2.png?attredirects=0"><img src="https://sites.google.com/site/x3dofcn/_/rsrc/1245403976002/vsl-virtools/Examples/4_parse-string/vsl_parse2.png" border="0" alt=""></a>
</div>
</li>
</ol>
</div>