I previously published two linked list data structures for objects in the LotusScript Gold Collection. These are in script libraries ObjectList and ObjectListLite. They’re actually — despite the “lite” — both a more complex implementation than you need for most purposes. Here I present a new, simpler class that does only the basics of a doubly-linked list.
The Queue class I wrote about previously is a variant of the linked list, but a list with limited functionality. The code is simpler, because you don’t need to be able to search the list for a particular value, or insert into the middle, or sort the values. If you do need those things, you have to choose a different data structure than Queue.
A simple linked list doesn’t do everything you might want to do, but it’s less limited than a queue. You can navigate to the beginning or end of the list, iterate through it in either direction, insert or delete values at any point.
Specification
The LinkedList class conceptually has a start and end element, and a “cursor” or current position that you can move around in the list.
Key principles
- If there are any elements, there’s always a “current” element.
- As much as possible, the “current” pointer doesn’t move unless the caller explicitly moves it. Of course if the current node is deleted some other one must be made current.
Definition
The class has the following properties and methods:
- count (Long, read-only): the number of elements in the list.
- value (Variant, read/write): the value stored in the “current” position.
- IsEmpty (Boolean, read-only): True if the list contains no elements.
- IsAtEnd (Boolean, read-only): True if the list is empty or the current position is the last element.
- Sub New(devaultval As Variant): The constructor specifies the value to return for value if the list is empty (if it’s not empty there’s always a current element to supply a value). Recommend using Nothing if the list will contain objects.
- Function First As Boolean: Set the current position to the first element. This returns True if there is a first element (i.e. list is not empty).
- Function Last As Boolean: Set the current position to the last element. This returns True if there is a last element (i.e. list is not empty).
- Function Next As Boolean: Advance “current” to the next element. Returns True if there is a next element.
- Function Prev As Boolean: Back up “current” to the previous element. Returns True if there is a previous element.
- Sub Append(value): add an element whose value is given to the end of the list. This doesn’t change which is the current element unless the list was empty before, then the added element becomes current. Same rule applies to the next two methods.
- Sub Insert(value): Add an element before the current element, not changing the current element.
- Sub AddAfter(value): Add an element after the current element, not changing the current element.
- Function Remove: Delete the current element from the list. The new current element is either the next one if there is a next one, else the previous one.
If you find this website helpful and want to give back, may I suggest buying, reading, and reviewing one of my excellent books? More are coming soon!
If you want email whenever there’s a new post, you can subscribe to the email list, which is 100% private and used only to send you information about stuff on this site.
Implementation notes
A refresher: every object variable is just a pointer to the actual object, so when you assign an object variable, you’re not copying the object, just its address. By defining a class with some members whose datatypes are of that same class, you can chain them together and navigate the chain by following the pointers. For instance, here’s a class you can use as a “node” in a linked list.
Private Class ListNode Public value As Variant Public Next As ListNode Public Prev As ListNode Sub New(value) If IsObject(value) Then Set me.value = value Else me.value = value End Sub End Class
The Next and Prev properties here will point to other occurrences of ListNode objects. You can also build other structures using these types of pointers. This example shows the connections between objects in a three-element list:
This record is intended for a general-purpose list class, so the value member, a Variant, can contain any value, including a (pointer to an) object of some other type.
This is just one way to implement the API shown given in the specification above. Behind the scenes, you might use an array to store the values instead. Or you might only have “next” pointers and no “prev” pointers, so the Prev method has to go to the start of the list and scan forward to find the previous node — inefficient, unless you know the caller will rarely try to go backwards or insert into the middle, in which case it might be more efficient. In fact if you refer back to the linked list used in the Queue class, that’s the case there. We know exactly how the list will be used, and that use doesn’t require ever going backwards, so we left out the Prev link and the unused methods, and gained simplicity and efficiency. The generalized code shown here is three times as long.
I chose to write a doubly-linked list for my general-purpose list class because I want it to perform well for general use — allowing for unlimited list length (which wouldn’t work if I used an array) and decent overall performance during navigation rather than maximizing a particular type of use.
One option I considered was to make the chain into a ring — have the Next pointer of the last node point to the first node, and vice versa. I wrote and tested that code, if anyone wants to see it. It’s a little counterintuitive, but doing so simplifies the logic of adding and deleting nodes, so the code ended up 7 lines shorter.
Point is, the caller doesn’t care how it’s implemented. The above-defined API hides these details from them. Unless, of course, the hidden details of the design impose limits or affect performance for their use case — but if so, you can redo the inner design while keeping the interface the same, minimizing code changes to the already tested calling code. You do of course have to unit test the updated inner code.
Code
The helper class ListNode is shown above. Here’s the code for LinkedList class:
Public Const ERR_INVALIDPOS = 20034 Private Const MSG_INVALIDPOS = "Cannot operate on current list node; list is empty"%REM Class LinkedList By Andre Guirard Description: Basic doubly-linked list data structure Constructor: New LinkedList(defaultval) where defaultval is value to return if asked for a value while the list is empty. %END REM
Class LinkedList Private z_first As ListNode Private z_current As ListNode' last-accessed node in the list.
Private z_last As ListNode Private z_count As Long Private z_defval As Variant Sub New(defaultval) If IsObject(defaultval) Then Set z_defval = defaultval Else z_defval = defaultval End Sub Sub Delete Dim nnext As ListNode Do Until z_first Is Nothing Set nnext = z_first.next Delete z_first Set z_first = nnext Loop End Sub%REM Property Get Count (read only) Description: Return number of elements in the list %END REM
Public Property Get Count As Long count = z_count End Property%REM Property Get IsEmpty (read only) Description: TRUE if list contains no elements. %END REM
Public Property Get IsEmpty As Boolean me.IsEmpty = (z_count = 0&) End Property%REM Property Get IsAtEnd (read only) Description: Return TRUE if there's no "next" list element from current position. %END REM
Public Property Get IsAtEnd As Boolean IsAtEnd = z_current Is z_last' they might both be Nothing if list is empty, this still works
End Property' methods to position the cursor %REM Function First Description: Position to the first element. Returns: FALSE if list is empty %END REM
Function First As Boolean Set z_current = z_first First = Not (z_current Is Nothing) End Function%REM Function Last Description: Position to the last element. Returns: FALSE if list is empty. %END REM
Function Last As Boolean Set z_current = z_last Last = Not (z_current Is Nothing) End Function%REM Function Next Description: Advance cursor to next item Returns: FALSE if there is no next item (in which case cursor doesn't move) %END REM
Function Next As Boolean If Not (z_current Is z_last) Then Set z_current = z_current.next me.Next = True End If End Function%REM Function Prev Description: back up cursor to previous item Returns: FALSE if there is no previous item (in which case cursor doesn't move) %END REM
Function Prev As Boolean If Not (z_current Is z_first) Then Set z_current = z_current.prev Prev = True End If End Function%REM Property Value (read/write) Description: The value stored at the current list position. Returns the "default value" if there is no current position (which only happens if the list is empty). Setting Current when list is empty causes an error. %END REM
Public Property Get Value As Variant If z_current Is Nothing Then If IsObject(z_defval) Then Set Value = z_defval Else Value = z_defval Else If IsObject(z_current.value) Then Set Value = z_current.value Else Value = z_current.value End If End Property Public Property Set Value As Variant If z_current Is Nothing Then Error ERR_INVALIDPOS, MSG_INVALIDPOS If IsObject(Value) Then Set z_current.value = Value Else z_current.value = Value End Property%REM Function addNode Description: Insert a node before the specified node. Patch in all the pointers correctly. Don't adjust first or current position. Nothing as argument adds to end of list. %END REM
Private Function addNode(value, beforeNode As ListNode) As ListNode Set addNode = New ListNode(value) If z_first Is Nothing Then Set z_first = addNode Set z_last = addNode Set z_current = addNode Else' list is not empty
Set addNode.next = beforeNode If beforeNode Is Nothing Then' add to end
Set z_last.Next = addNode Set addnode.Prev = z_last Set z_last = addNode Else' not at end
Set addNode.Prev = beforeNode.Prev Set beforenode.prev = addNode If addnode.prev Is Nothing Then' adding before first node
Set z_first = addnode Else Set addNode.prev.next = addNode End If End If End If z_count = z_count + 1 End Function%REM Sub Append Description: Add a new item to the end of the list. Current position unaffected unless the list was empty, in which case new node is now current. %END REM
Sub Append(value) Call addNode(value, Nothing) End Sub%REM Sub Insert Description: Add a new item before the current position. If list is empty this is the same as Append. Else current position is unchanged. %END REM
Sub Insert(value) Call addNode(value, z_current) End Sub%REM Sub AddAfter Description: Add a new item after the current position. If list is empty this is the same as Append. Else current position is unchanged. %END REM
Sub AddAfter(value) If z_count = 0 Then Call addNode(value, Nothing) Else Call addNode(value, z_current.next) End Sub%REM Function Remove Description: Delete the current node, if any. The new current position will be the node following the current node, if any, else the last node, if any, else (no nodes left) current position will be invalid. Returns: TRUE if list is not empty after deletion. %END REM
Function Remove As boolean Dim nod As ListNode If Not (z_current Is Nothing) Then z_count = z_count - 1 If z_count = 0 Then Delete z_current' this was the last node, we are done.
Else me.Remove = True Set nod = z_current' if the last node is being deleted, the new current node is the one before it, else it's the one after.
If nod Is z_last Then Set z_current = nod.prev Set z_last = z_current Else Set z_current = nod.next Set z_current.prev = nod.Prev End If If nod Is z_first Then Set z_first = nod.next Else Set nod.prev.next = nod.next End If Delete nod End If End if End Function End Class
Example code
To see how we could use this class, I’ll revisit a previous example — the FileSearch function that uses the Queue class. You may recall, there was a problem in that implementation in terms of the order in which it processed subfolders, resulting in the array of matching filepaths it returns being in what some might regard as the wrong order — not grouped by their higher level containing folders.
The problem is that the Queue class only lets you retrieve the oldest value. You don’t know that a folder contains subfolders until you process that folder. By that time, the folder’s siblings are already queued up for processing. When you discover subfolders, you want to process them next after the current folder — you want them inserted at the beginning of your work list, not at the end. A Queue doesn’t allow this, but with the LinkedList class you can add the new entries wherever you like.
%REM Function FileSearch Description: Recursively search a given folder for files who name match a set of patterns. Works on any OS. Arguments: baseDir: the folder you want to search. May optionally have a directory delimiter at the end, e.g. "C:\foo" or "C:\foo\". Should work on any OS. patterns: a string delimited with "|" listing the second operands for a LIKE operator to match against the filename, case insensitive. E.g. "*.dat|ac[0-5]*.cab" to find all ".DAT" files and all ".CAB" files whose names begin "ac" followed by digit 0 to 5. Returns: array of strings containing full filepaths of all matching files, or if no matches found, array containing one blank element. %END REM
Function FileSearch(ByVal baseDir$, ByVal patterns$) As Variant Dim delim$, afile$, results$, adir$, patternarr Dim work As New LinkedList("") Dim files List As Boolean ' find what path delimiter the caller prefers. If basedir Like "/*" Then delim = "/"' linux style path
ElseIf basedir Like "\*" Or basedir Like "[A-Z]:\*" Then delim = "\"' windows style path
Else Error 20030, "FileSearch needs an absolute filepath." End If patternarr = Split(LCase(patterns), "|")' internally we want all filepaths to end with the folder separator
If baseDir Like "*[!\/]" Then baseDir = baseDir & delim work.Append basedir' while there are directories in the queue, process the first directory.
Do While work.First adir = work.value Erase files afile = Dir$(aDir & "*", Attr_hidden + Attr_system)' all files but not directories.
Do Until afile = "" files(afile) = True' remember the names of the files
If IsLikeAnyOf(LCase(aFile), patternarr) Then results = results & NEWLINE & adir & afile End If afile = Dir$ Loop' now find the names of each subfolder.
afile = Dir$(aDir, Attr_directory + attr_hidden)' this will include files also
Do Until afile = "" If afile <> "." And afile <> ".." Then If Not IsElement(files(afile)) Then' this is a folder. Add it to work queue for processing on later iteration.
work.Insert adir & afile & delim End If End If afile = Dir$ Loop work.Remove Loop FileSearch = Split(Mid$(results, 2), NEWLINE) End Function
I take advantage of how LinkedList repositions the current item after an item is deleted or added. The main loop takes the first list item to work on, so that’s still the current item while the code processes the files and subfolders in that folder.
When it discovers a subfolder name, which Dir$ provides in alphabetical order (at least on Windows), the code would like to add it to the beginning of the list, but still in alphabetical order — to be processed after any previous subfolders in that same folder.
Since the Insert method doesn’t change the current position, we can do this by just calling Insert. It adds the new entry to the list just before the current entry, for the folder we’re processing. The current element is no longer the first element, but it’s still the point where if we find another subfolder, we want to insert it there, just below any subfolders we added earlier.
At the end of the loop, the Remove method is used to get the folder we just processed out of the list. The current list position is we don’t care where, because in the next iteration of the big loop, it uses First to navigate to the top of the list, which in this example would contain the first subfolder name. And so it goes.
Final thoughts
As I mentioned above, while you can’t always see (and don’t always care) how a class is implemented, you can observe the consequences of those design choices in terms of performance. As an example, we can deduce from its performance that the built-in class NotesDocumentCollection must use an implementation similar to this linked list. It’s well known that the GetNextDocument methods and GetPrevDocument are efficient, while GetNthDocument is slow, even if you’re getting the next document after the one you just processed.
When you ask for document 348, the class doesn’t know the index of the last-fetched document, so it has to rewind to the first document, then follow the “next” link 347 times to find the requested entry. This is of course very slow, especially if you’re iterating through the whole list, giving you order of n2 performance.
HCL could make this more efficient by keeping track of the numerical index of their current position so GetNthDocument has the data to calculate that it can get to the requested element by using “next” or “prev” just once. The ObjectListLite class in the LotusScript Gold Collection is an example of this approach.