Larger arrays in LotusScript

The array datatype in LotusScript supports arrays containing up to 2^16 (65,536) elements (2^15, or 32,768, if you don’t use negative index values). This is fine for most purposes, but what if you need a larger indexed collection?

The NotesJSONArray class will work if the data you want to store can be represented in JSON — primitive types and structures created from primitive types. It’s about half the speed of an array, and doesn’t support arrays of objects. I’m not sure how many elements it can hold — I tested it up to a million number values. The coding is also more complex than corresponding array-based code.

In this first of a series of posts about different specific data structures in LotusScript, I show how to create super large arrays via custom classes. I did two implementations, each able in theory to contain more elements than you actually have memory to store — a billion or more. I was able to store around 200 million Integers before my client crashed.

Note: an array variable of fixed size is further limited in size because local variables in a given “scope” can’t exceed 32KB — so it depends on the sizes of the elements. To get around this, use a dynamic array with the Redim statement.
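
As a minimal sketch of the dynamic-array workaround described in the note (the sub and variable names here are illustrative, not from the original code), the array is declared without bounds and sized at run time:

```lotusscript
Sub DemoDynamicArray
	' Declare without bounds, then size at run time with ReDim.
	Dim big() As Long
	ReDim big(0 To 29999)          ' 30,000 Longs, about 120 KB of data
	big(29999) = 42

	' Grow later while keeping the existing contents.
	ReDim Preserve big(0 To 32767) ' 32767 is the highest subscript a plain array allows
	Print UBound(big)              ' 32767
End Sub
```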

You can also, without defining a special class, use the List datatype to manage large collections. The index of a List is a string rather than a number, but you can just convert your numeric index to a string and use that as the list key. You also have to do your own tracking of which is the next unused index. For convenience, I made a custom class which does all this for you.
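
A rough sketch of that do-it-yourself List approach, before it is wrapped in a class, might look like this. The names `items` and `nextIndex` are hypothetical; the key conversion uses CStr and the existence check uses IsElement.

```lotusscript
Sub DemoListAsArray
	Dim items List As Variant
	Dim nextIndex As Long          ' next unused index; you track this yourself

	' "Append" three values, using the number converted to a string as the key.
	Dim i As Long
	For i = 1 To 3
		items(CStr(nextIndex)) = i * 10
		nextIndex = nextIndex + 1
	Next

	' Read back by numeric index; IsElement avoids an error on a missing key.
	If IsElement(items(CStr(1))) Then Print items(CStr(1))   ' 20
End Sub
```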

So, below I show both an array-based and a list-based implementation. The array-based is 3 to 5 times faster.

Specification

The implementation will consist of a class, LargeArray, whose array elements are Variants (may be any value, including Object values). Its methods and properties are as follows:

  • property Ubound as Long – the index of the highest-numbered array element. -1 if array is empty.
  • property value(ByVal index As Long) as Variant – get/set the value of an element. Unassigned elements have value EMPTY. When setting an element beyond the current end of the array, the array grows to include that position. This may leave unassigned elements.
  • sub GetValue(target as Variant, ByVal index As Long) – used by the caller to retrieve the element at position index when they don’t know whether the element is an object, and hence don’t know whether to use Set or Let. The target parameter is passed by reference, so the method assigns the requested value to it.
  • sub Append(valu as Variant) – Add to the end of the array. This is the same as assigning property value(ubound+1).
  • New — the constructor takes no arguments. An initial size isn’t specified — we will grow the array as needed.
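
To make the interface concrete, here is a short usage sketch. It assumes the LargeArray class from the Code section is available in the same script; the sub name and the sample values are only illustrative.

```lotusscript
Sub DemoLargeArray
	Dim arr As New LargeArray
	Dim i As Long
	For i = 0 To 99999             ' more elements than a plain array can hold
		arr.Append i * 2
	Next
	Print arr.Ubound               ' 99999

	' Assigning past the end grows the array and leaves a gap of EMPTY elements.
	arr.value(150000) = "sparse"
	Print arr.Ubound               ' 150000
	Print IsEmpty(arr.value(120000))  ' unassigned elements read back as EMPTY

	' GetValue works whether or not the element is an object.
	Dim v As Variant
	arr.GetValue v, 150000
	Print v
End Sub
```

To store an object, the assignment needs Set, for example `Set arr.value(n) = doc`; Append handles either case for you.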

In addition, the ListArray class, with a similar interface, uses a List-based implementation to do the same task. This is less code because all we’re doing is storing a list and keeping track of which is the highest-numbered element. It’s slower than LargeArray, but maybe uses less memory since it’s just allocating what it needs one element at a time as opposed to big chunks. Another nice thing about this approach is you can iterate through the internal list directly with a Forall loop (if you don’t mind a frown from the Object Oriented Programming Police for insufficient encapsulation).
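
A small sketch of that direct iteration, assuming the ListArray class from the Code section is in scope (the loop variable name `entry` is arbitrary):

```lotusscript
Sub DemoListArray
	Dim la As New ListArray
	la.Append "alpha"
	la.Append "beta"
	la.Append "gamma"
	Print la.UBound                ' 2

	' Iterate the public Data list directly; ListTag returns the key as a string.
	ForAll entry In la.Data
		Print ListTag(entry) & ": " & entry
	End ForAll
End Sub
```

Because Forall only visits keys that exist, unassigned positions in a sparse ListArray are skipped rather than returned as EMPTY.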

I haven’t bothered to create a class based on JSON, because of its inability to store Object values and because even though it’s a built-in class, there’s no performance advantage compared to the array-based implementation.

Implementation comments

I chose to use arrays for this internally (as opposed to a linked list or tree or…) because access to them via an index is fast. The LargeArray datatype internally is a class containing an array where each element is an object containing an array. So if we allow up to 32000 elements in each array, we can store 32000^2, or a billion or so elements.
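
The mapping from one Long index to an outer block and an inner offset is plain integer arithmetic. A standalone sketch follows; the constant name BLOCK_SIZE is made up here, since the class code writes 32000 inline.

```lotusscript
Sub DemoIndexSplit
	Const BLOCK_SIZE = 32000       ' elements per inner array, as in the text
	Dim ind As Long, block As Integer, offset As Integer

	ind = 100000
	block = ind \ BLOCK_SIZE       ' 3: which inner array holds the element
	offset = ind Mod BLOCK_SIZE    ' 4000: position within that inner array
	Print block, offset

	' The pair maps back to the original index.
	Print CLng(block) * BLOCK_SIZE + offset   ' 100000
End Sub
```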

It’s easier to use a class that lets you append elements to a collection as needed, instead of having to specify the size up front. So I wanted to make it possible to just assign an element with an arbitrary index, and the array will grow as required. Behind the scenes, this is done with ReDim statements. Resizing an array while preserving its contents is fairly efficient, but we don’t want to have to resize arrays of thousands of elements each time an element is added. So we’ll allow for wasting some space for the sake of performance, and increase the array sizes in increments of 1000 elements.
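
The chunked growth can be seen in isolation with an ordinary dynamic array. This is only a sketch of the rounding rule, with hypothetical names (`buf`, `resizes`), not the class code itself:

```lotusscript
Sub DemoChunkedGrowth
	Dim buf() As Variant
	ReDim buf(0 To 999)            ' start with one 1000-element chunk
	Dim resizes As Integer, i As Integer

	For i = 0 To 4999
		If i > UBound(buf) Then
			' Round the new upper bound up to the end of the chunk containing i.
			ReDim Preserve buf(0 To (i \ 1000) * 1000 + 999)
			resizes = resizes + 1
		End If
		buf(i) = i
	Next

	Print resizes                  ' 4 resizes for 5000 assignments
	Print UBound(buf)              ' 4999
End Sub
```

Growing one element at a time would have meant 4000 ReDim Preserve calls for the same loop, which is the cost the 1000-element increment avoids.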

For the list-based implementation, this is not a concern since the List code takes care of memory management for you.

Note: During testing, I found that accessing a List sequentially by key is much faster than accessing it in random order. Interesting.

Code

%REM
	Class LargeArrayNode
	Internal class used by LargeArray. There's not a lot of error checking here because we expect well-conditioned data from the caller.
	Constructor: New LargeArrayNode(initialIndex)
		The internal array is initialized to contain elements indexed 0 to at least initialIndex.
%END REM
Private Class LargeArrayNode
	Private z_data() As Variant

	Sub New(ByVal initialIndex%)
		Initialindex = 999 + (Initialindex \ 1000) * 1000
		ReDim z_data(0 To InitialIndex)
	End Sub
	
	%REM
		Property value (get or set)
		Description: read and write values from the internal array. A set operation will grow the array in 1000-element increments if
			it doesn't already contain an element with the specified index.
	%END REM
	Public Property Set value(ByVal ind%) As Variant
		If UBound(z_data) < ind Then
			ReDim Preserve z_data(0 To (ind \ 1000) * 1000 + 999)
		End If
		If IsObject(value) Then Set z_data(ind) = value Else z_data(ind) = value
	End Property
	Public Property Get value(ByVal ind%) As Variant
		If ind <= UBound(z_data) Then
			If IsObject(z_data(ind)) Then Set value = z_data(ind) Else value = z_data(ind)
		End If
	End Property
	
	%REM
		Sub GetValue
		Description: Assign target argument the value at position index. The caller can also reference me.value(index), but this
			function lets them do it without needing to know whether the value is an object (which needs a Set statement to assign).
	%END REM
	Sub GetValue(target, ByVal index%)
		Dim empty
		If index <= UBound(z_data) Then
			If IsObject(z_data(index)) Then Set target = z_data(index) Else target = z_data(index)
		Else
			target = empty
		End If
	End Sub
End Class

%REM
	Class LargeArray
	by Andre Guirard
	Description: Like a standard zero-based array, but with ability to hold up to a billion elements in theory.
		Note: memory limitations will probably limit you to about 200 million elements.
%END REM
Class LargeArray
	Private z_data() As LargeArrayNode
	' index of the highest-numbered element (-1 if the array is empty) - indexes are zero-based.
	Public Ubound As Long
	
	Sub New
		ReDim z_data(0 To 999)
		me.Ubound = -1
	End Sub
	
	%REM
		Property value   get, set
		Description: Assign or read an element of the array
		Arguments:
			ind: zero-based index of element being referenced. Need not be contiguous with existing elements.
	%END REM
	Public Property Set value(ByVal ind As Long) As Variant
		Dim block%, blockLim&, offset%, aNode As LargeArrayNode
		If ind < 0 Then Error 9
		block = ind \ 32000
		offset = ind Mod 32000
		If block > UBound(z_data) Then
			blockLim = block + 1000
			If blockLim >= 32000 Then blockLim = 31999
			ReDim Preserve z_data(0 To blockLim)
		End If
		Set aNode = z_data(block)
		If aNode Is Nothing Then
			Set aNode = New LargeArrayNode(offset)
			Set z_data(block) = aNode
		End If
		If IsObject(value) Then Set aNode.value(offset) = value Else aNode.value(offset) = value
		If me.Ubound < ind Then me.Ubound = ind
	End Property
	Public Property Get value(ByVal ind As Long) As Variant
		Dim block%, blockLim&, offset%, aNode As LargeArrayNode
		If ind < 0 Or ind > me.Ubound Then Error 9
		block = ind \ 32000
		offset = ind Mod 32000
		Set aNode = z_data(block)
		If Not aNode Is Nothing Then
			aNode.GetValue me.value, offset
		End If
	End Property
	
	%REM
		Sub GetValue
		Description: Use this to get a value from the array if you're not sure whether the value you're getting is an object.
			The target argument is passed by reference, and will be assigned to the value at the specified index.
			Same as target = obj.value(ind)
	%END REM
	Sub GetValue(target, ByVal ind As Long)
		Dim block%, blockLim&, offset%, aNode As LargeArrayNode
		If ind < 0 Or ind > me.Ubound Then Error 9
		block = ind \ 32000
		offset = ind Mod 32000
		Set aNode = z_data(block)
		If Not aNode Is Nothing Then
			aNode.GetValue target, offset
		End If
	End Sub
	
	%REM
		Sub Append
		Description: Add a value to the end of the array.
	%END REM
	Sub Append(valu)
		If IsObject(valu) Then Set value(me.Ubound + 1&) = valu Else value(me.Ubound + 1&) = valu
	End Sub
End Class

And the ListArray class:

%REM
	Class ListArray
	Description: Array with Long numeric index, with a List as back-end storage.
	Constructor: New ListArray
%END REM
Class ListArray
	' elements stored here, can be accessed using Forall
	Public Data List As Variant
	Private z_ubound As Long
	
	Sub New
		z_ubound = -1&
	End Sub
	
	%REM
		Property Get Ubound
		Description: Return the largest index in use.
	%END REM
	Public Property Get UBound As Long
		me.UBound = z_ubound
	End Property
	
	%REM
		Sub Append
		Description: Add to end of array, increasing Ubound by one.
	%END REM
	Sub Append(valu As Variant)
		z_ubound = z_ubound + 1&
		If IsObject(valu) Then Set Data(z_ubound) = valu Else Data(z_ubound) = valu
	End Sub
	
	Private Sub Assign(a, b) ' variable assignment that doesn't care whether value is an object.
		If IsObject(b) Then Set a = b Else a = b
	End Sub
	
	Public Property Get value(ByVal index As Long) As Variant
		On Error Resume Next ' return EMPTY if no such element.
		Assign me.value, me.Data(index)
	End Property
	Public Property Set value(ByVal index As Long)
		If index < 0 Then Error 9 ' Subscript out of range
		If index > z_ubound Then z_ubound = index
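		' Store at the requested index (not z_ubound), so assigning to an earlier element doesn't overwrite the last one.
		If IsObject(value) Then Set Data(index) = value Else Data(index) = value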
	End Property
	
	%REM
		Sub GetValue
			Use
				Call obj.GetValue(target, i)
			instead of
				target = obj.value(i)
			to assign target in cases where you're not sure whether the value is an object. 
	%END REM
	Sub GetValue(target, ByVal index&)
		Assign target, value(index)
	End Sub
End Class

3 thoughts on “Larger arrays in LotusScript”

  1. Fun fact: when calling NotesStream.Read, you have to be prepared that the returned array can have a lowerbound of -32768 if more than 32767 bytes are requested or remaining in the stream when no &length param is supplied.
    Makes for interesting code when chomping through large Streams in LotusScript and you want to maximize blocksize for performance.

    1. Interesting tidbit. Of course you realize I’m going to have to test how much difference using larger blocks actually makes now.

  2. Watch it with long lines of accented Unicode. NotesStream corrupts it (SPR is open). My current “fix” in the JSON turbo class is to not write too large pieces of text to a NotesStream object. I use it in there to convert from strings to arrays of Byte, which traverse faster than string operations.
