Take the 2-minute tour ×
Code Review Stack Exchange is a question and answer site for peer programmer code reviews. It's 100% free, no registration required.

In the process of trying to build a serializable data structure, I found myself building large strings, which gets very slow because VBA copies a string every time concatenation is performed.

To alleviate this, referring to Dynamic Array and Java's StringBuilder interface, I cobbled together a unicode clsStringBuilder class.

This isn't a very big chunk of code, but I'd be interested in any advice about edge cases that I maybe haven't considered, unexpected copying behavior that VBA might be performing "behind my back" which I could avoid, or corrections to coding style (or lack thereof).

Option Compare Database
Option Explicit

'******
'* v2 *
'******


Private Declare Sub CopyMemory Lib "kernel32.dll" Alias "RtlMoveMemory" (ByVal dst As Long, ByVal src As Long, ByVal Length As Long)

Private Const DEFAULT_CAPACITY As Long = &H10
Private m_currLen As Long
Private m_stringBuffer() As Byte

Private Sub Class_Initialize()
    ReDim m_stringBuffer(0 To (DEFAULT_CAPACITY * 2) - 1) 'Each unicode character is 2 bytes
End Sub

Public Function Append(strString As String) As clsStringBuilder
On Error GoTo derp


    If m_currLen + LenB(strString) < UBound(m_stringBuffer) Then
        CopyMemory VarPtr(m_stringBuffer(m_currLen)), StrPtr(strString), LenB(strString)
    Else
        If m_currLen + LenB(strString) < UBound(m_stringBuffer) * 2 Then
            Expand
        Else
            Expand m_currLen + LenB(strString)
        End If
        CopyMemory VarPtr(m_stringBuffer(m_currLen)), StrPtr(strString), LenB(strString)
    End If
    m_currLen = m_currLen + LenB(strString)
    Set Append = Me
    Exit Function

derp:
    Stop
    Resume
End Function

Public Property Get Length() As Long
    Length = m_currLen / 2
End Property

Public Property Get Capacity() As Long
    Capacity = UBound(m_stringBuffer)
End Property

Private Sub Expand(Optional newSize As Long = 0)
    If newSize <> 0 Then
        ReDim Preserve m_stringBuffer(0 To newSize - 1)
    Else
        ReDim Preserve m_stringBuffer(0 To (UBound(m_stringBuffer) * 2) + 1)
    End If
End Sub

Public Function toString() As String
    toString = Mid(m_stringBuffer, 1, m_currLen / 2)
End Function

Here is a test:

Public Sub Main()
    Dim sb As clsStringBuilder
    Set sb = New clsStringBuilder
    Dim strString As String
    Dim i As Long
    Dim StartTime As Double

    'VBA String
    StartTime = MicroTimer()
    For i = 0 To 100000
        strString = strString + "Hello World;"
    Next
    Debug.Print "The VBA String took: " & Round(MicroTimer - StartTime, 3) & " seconds"

    'StringBuilder
    StartTime = MicroTimer()
    For i = 0 To 100000
        sb.Append "Hello World;"
    Next
    Debug.Print "The Stringbuilder took: " & Round(MicroTimer - StartTime, 3) & " seconds"

    'Are the strings the same?
    Debug.Print StrComp(strString, sb.toString, vbBinaryCompare)
End Sub

Here is Microsoft's MicroTimer function, which can be found here:

Private Declare Function getFrequency Lib "kernel32" _
Alias "QueryPerformanceFrequency" (cyFrequency As Currency) As Long
Private Declare Function getTickCount Lib "kernel32" _
Alias "QueryPerformanceCounter" (cyTickCount As Currency) As Long
Function MicroTimer() As Double
'

' Returns seconds.
    Dim cyTicks1 As Currency
    Static cyFrequency As Currency
    '
    MicroTimer = 0

' Get frequency.
    If cyFrequency = 0 Then getFrequency cyFrequency

' Get ticks.
    getTickCount cyTicks1

' Seconds
    If cyFrequency Then MicroTimer = cyTicks1 / cyFrequency
End Function
share|improve this question

2 Answers 2

up vote 10 down vote accepted

I like this, a lot. It's.. brilliant. Like .NET String objects, VBA strings are immutable, which means like in .NET, when "the quick brown fox" gets appended with "jumps over" and then "the lazy dog", it's 4 strings that have been generated, and thus the first one got copied 3 times; a VBA StringBuilder class is therefore definitely welcome to any VBA toolkit!

This is some serious code you've got here. Let's take a look.

So you've called the class clsStringBuilder. I know where you're coming from, but there's no real reason for this "cls" Hungarian prefix - I'd remove it, and call the class StringBuilder.

'******
'* v2 *
'******

Don't bother with that. I know version control is natively near-impossible with VBA, but there is no need to "version" code in comments nonetheless; do you actually maintain the version number? Why bother? Just remove it, it's useless clutter.

Private Const DEFAULT_CAPACITY As Long = &H10

Why not 16? Hexadecimal 10 is 16 right? I think it would be clearer to use a decimal notation. Actually that capacity is potentially confusing, especially given the hex notation. Is &H10 the number of bytes? Characters?

ReDim m_stringBuffer(0 To (DEFAULT_CAPACITY * 2) - 1) 'Each unicode character is 2 bytes

Ah, characters then. How about calling the constant DEFAULT_CHARACTER_CAPACITY? Nah, too long.. and I personally don't like the YELLCASE, I'd just call it InitialCharacterCapacity, but I've seen other people use ALL CAPS for constants - as long as you're consistent, it works :)

BTW that's a good comment you have there, but I wouldn't bother specifying "unicode" characters; it gets confusing when the IDE itself only supports ANSI strings!

I don't like prefixes and abbreviated names, so m_currLen would become currentLength and m_stringBuffer would become stringBufffer, or just buffer.

Actually since currentLength is in bytes, I'd call it currentByteLength, so as to avoid question marks when it comes to this:

Public Property Get Length() As Long
    Length = m_currLen / 2
End Property

Public Function Append(strString As String) As clsStringBuilder

strString, really? I doesn't get any more Hungarian than that! Also you should know that parameters are passed ByRef by default - I'd change the signature to this:

Public Function Append(ByVal value As String) As StringBuilder

The error handling isn't optimal - if anything blows up, you're bringing up the IDE for the end user to scratch their head and debug your code! That's not production-ready:

derp:
    Stop
    Resume

"derp" means nothing to me - I like following a "template" like this:

Public Sub Foo()
    On Error GoTo CleanFail

    'implementation

CleanExit:
    Exit Sub

CleanFail:
    'handle error
    Resume CleanExit
End Sub

You might also want to make sure Expand doesn't actually shrink the buffer. I think. ;)

Lastly, I'm not sure I understand why toString isn't following the convention and named in PascalCase like every public method - ToString would be better-looking.

Good job!

share|improve this answer
1  
Thanks for the feedback - fixing the naming conventions now. For the member variables, I've often run into cases where I need to make them private and provide access through Get/Set/Let properties. In those cases, I use "m_" to keep VBA from yelling "Ambiguous name detected: membername". What is the best practice for those situations? –  Blackhawk Oct 22 '14 at 20:36
1  
I like stuffing all my private fields into a private type - here would be Private Type TStringBuilder, with all the members. Then the class only has 1 field, which I call this, like, Private this As TStringBuilder - this eliminates the name clashes, and I like seeing this.MemberName = value in a Public Property Let MemberName(ByVal value As Whatever) block ;) –  Mat's Mug Oct 22 '14 at 20:39
    
Interesting... I like this :D I'll have to think through the implications... –  Blackhawk Oct 22 '14 at 20:54
2  
Maybe it's just me, but I like YELLCASE constants... –  RubberDuck Oct 23 '14 at 1:13

Your StringBuilder is pretty damn impressive :) ++ to Mat's suggestions except the YELLCASE which I am on RubberDuck's side ;)

I think I have identified a potential overflow of memory (out of memory). It's probably very unlikely to happen to anyone but hey... If you wrap your loop with another loop VBA runtime doesn't seem to catch up with counting and releasing the references...Your StringBuilder is too damn quick to VBA runtime ;)

Ex:

For j = 0 To 1000
    Dim csb As New clsStringBuilder

    StartTime = MicroTimer()
    For i = 0 To 100000
        csb.Append "Hello World;"
    Next
Next

That will stop in derp at some point and cause an out of memory... AFAIC, nothing you can really do... except don't allow people like me to test your code ;P jk!

A few other minor things from me though:

Select Case is faster than If-Else

□ Division is more expensive than addition & multiplication

□ Multiple computation to get the same number is a bit inefficient. If you need to get the value of Ubound(arr) 5 times within one if-else/select case consider storing this number in a variable.

Mid$() (in ToString()) should be a tiny bit faster than Mid()

□ Probably a safer option to use & instead of + for String concatenation. (your Main())

Overall speed seems just a tiny bit faster with my improvements - too subtle? ;)

1000 tests each

enter image description here

Ok I just changed the name to StringBuilder and here's what I've done to it:

Private Declare Sub CopyMemory Lib "kernel32.dll" Alias "RtlMoveMemory" _
    (ByVal dst As Long, ByVal src As Long, ByVal Length As Long)

Private Const DEFAULT_CAPACITY As Long = 16
Private m_currLen As Long
Private m_stringBuffer() As Byte

Private Sub Class_Initialize()
    ReDim m_stringBuffer(0 To (DEFAULT_CAPACITY + DEFAULT_CAPACITY) - 1) 'Each unicode character is 2 bytes
End Sub

Public Function Append(strString As String) As StringBuilder
On Error GoTo derp

    Dim uBuffer As Long
    uBuffer = UBound(m_stringBuffer)

    Dim lengthB As Long
    lengthB = LenB(strString)

    Dim sPtr As Long
    sPtr = StrPtr(strString)

    Dim currLen As Long
    currLen = m_currLen + lengthB

    Select Case currLen
        Case Is < uBuffer
            CopyMemory VarPtr(m_stringBuffer(m_currLen)), sPtr, lengthB
        Case Is < (uBuffer + uBuffer)
            Expand
            CopyMemory VarPtr(m_stringBuffer(m_currLen)), sPtr, lengthB
        Case Else
            Expand currLen
            CopyMemory VarPtr(m_stringBuffer(m_currLen)), sPtr, lengthB
    End Select

    m_currLen = currLen
    Set Append = Me
    Exit Function

derp:
    Stop
    Resume
End Function

Public Property Get Length() As Long
    Length = m_currLen * 0.5
End Property

Public Property Get Capacity() As Long
    Capacity = UBound(m_stringBuffer)
End Property

Private Sub Expand(Optional newSize As Long = 0)
    Select Case newSize
        Case Is = 0
            ReDim Preserve m_stringBuffer(0 To (UBound(m_stringBuffer) + UBound(m_stringBuffer)) + 1)
        Case Else
            ReDim Preserve m_stringBuffer(0 To newSize - 1)
    End Select
End Sub

Public Function ToString() As String
    ToString = Mid$(m_stringBuffer, 1, m_currLen * 0.5)
End Function

You could possibly play a bit more with the Select Case but I've left it in a state that I am happy with...

m_stringBuffer(m_currLen) should be O(1) so no need to store in a variable IMO

share|improve this answer
    
Thanks! Working through this now... what did you use to generate the fancy chart? –  Blackhawk Oct 23 '14 at 17:43
1  
you can't be serious with your question? :P Excel 2010... –  Michal Krzych Oct 23 '14 at 17:59
2  
Ah, of course - it was the colors that threw me, since they aren't the canned "Office pastels" –  Blackhawk Oct 23 '14 at 18:01
    
@Blackhawk no worries ;) I probably wouldn't guess myself hehe I bet we all got our favorite RGBs ;) –  Michal Krzych Oct 23 '14 at 22:14

Your Answer

 
discard

By posting your answer, you agree to the privacy policy and terms of service.

Not the answer you're looking for? Browse other questions tagged or ask your own question.