Comments and Discussions
I saw there was a thread about nested tokens not working correctly. Having used this function instead of writing my own, I ran across a similar problem. The variable 'i' used in the function is not a local variable. So if you run into this problem, just Dim 'i' inside the function to make it a local variable and poof... no more infinite loops.
On a side note, I'd like to thank Chris for posting this; it saved me a bit of time and the code is beautifully written. Except for the whole 'i' variable thing. :P
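To illustrate the scoping issue described above, here is a minimal sketch. The names `Inner`, `InnerFixed` and `passes` are made up for the demo and are not from the article. A procedure that uses `i` without declaring it works on the script-level `i`, so it shares a counter with any script-level loop that calls it.

```vbscript
' Hypothetical demo, not from the article.
Dim i, passes

Sub Inner()
    ' No Dim i here, so this loop uses the script-level i.
    For i = 1 To 3
    Next
    ' i is 4 when the loop finishes.
End Sub

Sub InnerFixed()
    Dim i               ' local copy; the caller's i is untouched
    For i = 1 To 3
    Next
End Sub

passes = 0
For i = 1 To 5
    Inner               ' sets the shared i back to 4 on every pass
    passes = passes + 1
    If passes >= 20 Then Exit For   ' safety stop so the demo terminates
Next
' With Inner, the loop only ends at the safety stop.
' Call InnerFixed instead and it ends after 5 passes.
```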
I tried your Tokenize function in VBScript on an ASP page. Unfortunately, I find that it does not work correctly with nested Tokenize calls. The code tries to split a string into records (separated by semicolons), and then tries to split each record into fields separated by commas. Can anyone help with the sample code below?
It works as expected in VB.
<%@ Language=VBScript %>
<%
Sub AddRecord(byVal RecordString)
Dim b, Seps1(1)
Seps1(0) = ","
b = Tokenize( RecordString, Seps )
For i = 1 to Ubound( b )
Response.Write ">>>>Field " & i & ": " & b(i-1) & " "
Next
End Sub
%>
<HTML>
<HEAD>
<META NAME="GENERATOR" Content="Microsoft Visual Studio 6.0">
<TITLE></TITLE>
<style>
TD
{
BORDER-RIGHT: blue 1px solid;
BORDER-TOP: blue 1px solid;
BORDER-LEFT: blue 1px solid;
BORDER-BOTTOM: blue 1px solid;
}
</style>
</HEAD>
<BODY>
Clients |
<%
MyString = "A1,A2,A3;B1,B2,B3"
'Split results string into clients
Dim a, Seps(1)
Seps(0) = ";"
a = Tokenize( MyString, Seps )
For i = 1 to UBound(a)
Response.Write "Record " & i & ": " & a(i-1) & " "
AddRecord( a(i-1) )
Next
%>
|
</BODY>
</HTML>
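Two things in the listing above look suspicious: `AddRecord` fills `Seps1` but passes `Seps` to Tokenize, and `i` is never declared, so the page-level loop and the loop inside `AddRecord` share one counter. The sketch below is one possible rewrite, not the article's code. It assumes a single separator per level, uses the built-in `Split` instead of Tokenize, and gives each loop its own declared counter. It would go inside the page's `<% %>` block; the names `fields`, `records` and `j` are invented for the example.

```vbscript
' Sketch only: nested splitting with locally declared loop counters.
Sub AddRecord(ByVal RecordString)
    Dim fields, j                      ' j is local to AddRecord
    fields = Split(RecordString, ",")
    For j = 0 To UBound(fields)
        Response.Write ">>>>Field " & (j + 1) & ": " & fields(j) & " "
    Next
End Sub

Dim records, i
records = Split("A1,A2,A3;B1,B2,B3", ";")
For i = 0 To UBound(records)
    Response.Write "Record " & (i + 1) & ": " & records(i) & " "
    AddRecord records(i)
Next
```

Note that `Split` returns an array sized exactly to the number of pieces, so the loops run from 0 to `UBound`.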
Function Tokenize(byVal TokenString, byRef TokenSeparators())
    Dim Tokens(), TempTokens()
    ReDim Tokens(0)
    Tokens(0) = TokenString
    Dim i
    For i = 0 to UBound(TokenSeparators)
        Dim nTempTokens
        nTempTokens = 0
        Dim k
        For k = 0 to UBound(Tokens)
            Dim a
            a = Split(Tokens(k), TokenSeparators(i), -1, 1)
            ReDim Preserve TempTokens(nTempTokens + UBound(a))
            Dim l
            For l = 0 to UBound(a)
                TempTokens(nTempTokens + l) = Trim(a(l))
            Next
            nTempTokens = UBound(TempTokens) + 1
        Next
        ' Tokens = TempTokens
        ReDim Tokens(UBound(TempTokens))
        Dim j
        For j = 0 to UBound(TempTokens)
            Tokens(j) = TempTokens(j)
        Next
    Next
    Tokenize = Tokens
End Function
If the string is "Tom, Dick, Harry and Andy", with the Tokenize function you end up with:
Found 4 tokens
Keyword 1 = Tom
Keyword 2 = Dick
Keyword 3 = Harry
Keyword 4 = y ('And' was removed, assuming 'And' is a separator)
I fixed this problem by tokenizing with a space delimiter and then passing the tokens (whole words) to a cleanup function. Since I know the words are whole, we can match on exact spelling. I am filtering out bad words in this case, but you can filter out whatever words you need. CleanWord returns True or False. If False is returned by the function, I throw out that word.
' ***** CODE *****'
Dim Str, OutputString, Seps(1)
Str = "Tom, Dick, Harry and Andy"
Seps(0) = " " 'Space Delimited'
Str = LCase(Str) 'Change everything to lower case to simplify searching'
tokens = Tokenize(Str, Seps) 'Tokenize function as written by Chris Maunder'
Response.Write "<p>Found " & UBound(tokens) & " tokens</p>"
Response.Write "<ol>"
For i=1 to UBound(tokens)
    Response.Write "<li>Keyword " & i & " = " & tokens(i-1) & "</li>"
    if CleanWord(tokens(i-1)) then
        OutputString = OutputString & " " & tokens(i-1)
    end if
next
Response.Write "</ol><br/><br/> Output String = " & OutputString
function CleanWord(WordString)
    Dim i, output
    output = TRUE
    Dim badwords(4)
    badwords(0) = "and"
    badwords(1) = "drat"
    badwords(2) = "darn"
    badwords(3) = "bummer"
    For i=1 to UBound(badwords)
        if WordString = badwords(i-1) then
            output = FALSE
            exit for
        end if
    Next
    CleanWord = output
end function
' *** END CODE ***'
Now the output looks like this:
Found 5 tokens
Keyword 1 = tom
Keyword 2 = dick
Keyword 3 = harry
Keyword 4 = and
Keyword 5 = andy
Output String = tom dick harry andy
We have still found all the tokens, but 'and' was rejected by the CleanWord function because it is in the badwords list, while 'andy' was not. The Output String shows this.
These functions have come in handy on so many occasions since I started using them. Almost any time I need to filter text from users for any reason whatsoever, I put these two functions to use. The only thing that I don't really like is creating the badwords array. The list can be quite long, so it can dirty up your code quite a bit. It helps to create the array outside the function and pass it in byRef.
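The byRef suggestion could look something like the sketch below. The second parameter `BadWords` is an assumption added for illustration and is not part of the function as posted. The word list is built once with `Array`, and the input is split with the built-in `Split` so the example stands on its own.

```vbscript
' Sketch only: BadWords is an assumed extra parameter.
Function CleanWord(ByVal WordString, ByRef BadWords)
    Dim i
    CleanWord = True
    For i = 0 To UBound(BadWords)
        If WordString = BadWords(i) Then
            CleanWord = False
            Exit For
        End If
    Next
End Function

' Build the list once, outside the function.
Dim badwords, words, w, OutputString
badwords = Array("and", "drat", "darn", "bummer")
words = Split("tom dick harry and andy", " ")
OutputString = ""
For w = 0 To UBound(words)
    If CleanWord(words(w), badwords) Then
        OutputString = OutputString & " " & words(w)
    End If
Next
' OutputString is now " tom dick harry andy" (with a leading space)
```

Keeping the list outside the function also means it could be loaded from a file or include without touching CleanWord itself.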
So you extended the VBScript Split function?
Except for the ability to pass multiple delimiters, this is what the VBScript function does.
Or am I wrong?
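For comparison, here is a rough sketch of getting multiple delimiters out of the built-in `Split`. `SplitMulti` is a made-up helper, not something from the article: it rewrites every separator to the first one with `Replace` and then calls `Split` once. It assumes no separator contains another one.

```vbscript
' Hypothetical helper: normalise every separator to the first one, then Split.
Function SplitMulti(ByVal Text, ByRef Separators)
    Dim s
    For s = 1 To UBound(Separators)
        ' Skip unused (empty) slots in the separator array.
        If Len(Separators(s)) > 0 Then
            Text = Replace(Text, Separators(s), Separators(0))
        End If
    Next
    SplitMulti = Split(Text, Separators(0))
End Function

Dim parts
parts = SplitMulti("A1,A2;B1", Array(",", ";"))
' parts now holds "A1", "A2", "B1"
```

Unlike the Tokenize versions in this thread, this sketch does not Trim the pieces.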
Pretty much. I never claimed this was rocket science, but it has been handy, so I figured there may be someone out there who could use it.
Consider it used
k
I have been looking for a function in VB that gives the substring of another string; I figured you would know after looking at the Tokenize function. More importantly, I have been trying to get an ActiveX control that I created and tested to launch from Explorer 5.5 using the <OBJECT> tag, but had no joy! Any ideas or references?
Cheers
Ali
I modified it slightly so it could deal with CSV files. I have copied the extra code below. It ignores commas found between two quotes and treats all the text between the quotes as one token. A small change / contribution to a good script.
I used arrays in this mod, but it could equally be done with two variables. I wanted to experiment with the ReDim statement. So here are both versions of the code: ReDim and not.
Cheers,
Gareth
'WITH ARRAYS
Function Tokenize(byVal TokenString, byRef TokenSeparators())
    Dim NumWords, a()
    NumWords = 0
    Dim NumSeps
    NumSeps = UBound(TokenSeparators)
    'GW CHANGE
    dim alQuoteIndex()
    Do
        redim alquoteindex(1)
        dim numquotes
        dim lpos
        Dim SepIndex, SepPosition
        SepPosition = 0
        SepIndex = -1
        'GW CHANGE
        numquotes = 0
        lpos =1
        for i = 0 to NumSeps-1
            ' Find location of separator in the string
            Dim pos
            pos = InStr(TokenString, TokenSeparators(i))
            ' Is the separator present, and is it closest to the beginning of the string?
            If pos > 0 and ( (SepPosition = 0) or (pos < SepPosition) ) Then
                SepPosition = pos
                SepIndex = i
            End If
        Next
        ' Did we find any separators?
        If SepIndex < 0 Then
            ' None found - so the token is the remaining string
            redim preserve a(NumWords+1)
            a(NumWords) = TokenString
        Else
            'GW CHANGE
            do while lpos > 0
                lpos = instr(lpos,tokenstring,"""",1)
                if lpos>0 then
                    numquotes = numquotes + 1
                    redim preserve alquoteindex(numquotes)
                    alquoteindex(numquotes-1) = lpos
                    if numquotes = 2 then
                        lpos = 0
                    else
                        lpos = lpos + 1
                    end if
                end if
            loop
            if numquotes>0 then
                if sepposition>alquoteindex(0) and sepposition= 0)
    Tokenize = a
End Function
'WITHOUT ARRAYS
Function Tokenize(byVal TokenString, byRef TokenSeparators())
    Dim NumWords, a()
    NumWords = 0
    Dim NumSeps
    NumSeps = UBound(TokenSeparators)
    'GW CHANGE
    dim firstquote
    dim secondquote
    Do
        redim alquoteindex(1)
        dim numquotes
        dim lpos
        Dim SepIndex, SepPosition
        SepPosition = 0
        SepIndex = -1
        'GW CHANGE
        numquotes = 0
        lpos =1
        firstquote = 0
        secondquote = 0
        for i = 0 to NumSeps-1
            ' Find location of separator in the string
            Dim pos
            pos = InStr(TokenString, TokenSeparators(i))
            ' Is the separator present, and is it closest to the beginning of the string?
            If pos > 0 and ( (SepPosition = 0) or (pos < SepPosition) ) Then
                SepPosition = pos
                SepIndex = i
            End If
        Next
        ' Did we find any separators?
        If SepIndex < 0 Then
            ' None found - so the token is the remaining string
            redim preserve a(NumWords+1)
            a(NumWords) = TokenString
        Else
            'GW CHANGE
            do while lpos > 0
                lpos = instr(lpos,tokenstring,"""",1)
                if lpos>0 then
                    numquotes = numquotes + 1
                    if numquotes = 2 then
                        secondquote = lpos
                        lpos = 0
                    else
                        firstquote = lpos
                        lpos = lpos + 1
                    end if
                end if
            loop
            if numquotes>0 then
                if sepposition>firstquote and sepposition= 0)
    Tokenize = a
End Function
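The quote handling described in this post can also be sketched on its own. The function below is not Gareth's code: `SplitCsvLine` is a hypothetical helper that walks the line one character at a time, treats commas inside double quotes as ordinary text, and drops the quote characters. It assumes a comma is the only separator and does not handle doubled quotes ("") as an escaped quote.

```vbscript
' Hypothetical helper, not the poster's code: split one CSV line on commas,
' keeping text between double quotes together as a single token.
Function SplitCsvLine(ByVal Line)
    Dim Tokens(), Count, Current, InQuotes, i, ch
    Count = 0
    Current = ""
    InQuotes = False
    For i = 1 To Len(Line)
        ch = Mid(Line, i, 1)
        If ch = """" Then
            ' Toggle quoted mode; the quote itself is not kept.
            InQuotes = Not InQuotes
        ElseIf ch = "," And Not InQuotes Then
            ' A comma outside quotes ends the current token.
            ReDim Preserve Tokens(Count)
            Tokens(Count) = Trim(Current)
            Count = Count + 1
            Current = ""
        Else
            Current = Current & ch
        End If
    Next
    ' Whatever is left after the last comma is the final token.
    ReDim Preserve Tokens(Count)
    Tokens(Count) = Trim(Current)
    SplitCsvLine = Tokens
End Function

Dim cells
cells = SplitCsvLine("A1,""B1, B2"",C1")
' cells holds three items: A1 / B1, B2 / C1
```

Note that `i` is declared inside the function, so the sketch is safe to call from a loop that uses its own `i`.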
I wanted to use this, but I can't use a space as a delimiter?
You can use a space as a delimiter. Just set:
Str = "Tom Dick Harry"
Seps(0) = " "
and it works. I just tried it.
First Posted | 16 May 2000 |
Views | 140,412 |
Bookmarked | 17 times |