Lars Berntrop-Bos posted a comment concerning the LotusScript NotesStream class and his practice of reading as large a block as he possibly could to optimize performance. I decided to run a test to see how much difference that made. The results are as follows:
Block size (bytes) | Runtime to read a large file (seconds) |
---|---|
512 | 10.30 |
1024 | 5.09 |
2048 | 2.75 |
4096 | 1.49 |
8192 | 0.86 |
16384 | 0.52 |
32768 | 0.35 |
65536 | 0.26 |
Each of these cases does the same amount of work: reading every byte in the same large binary file. The effect of increasing the block size is dramatic: the largest possible block size was about 40 times faster than a 512-byte block (10.30 s versus 0.26 s). The upper limit of the NotesStream.Read method is 2^16 (65,536) bytes, because that’s the maximum number of elements in the array datatype it returns.
The implication is that there’s a large constant overhead involved in just making the call to the Read method — making twice as many calls to read the same data takes almost twice as long.
When scanning the data in the array, on the other hand, accessing a byte at a high array position doesn’t take any longer than accessing one near the beginning of the array, so making use of the data in your program, whatever you’re doing with it, shouldn’t take longer. In fact, if all you’re doing with it is appending it to another file, it’ll probably be a lot faster to use large blocks since the NotesStream.Write method probably has a similar performance profile.
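As a rough illustration of the append case, a file copy that reads and writes maximum-size blocks might look like the sketch below. The sub name `CopyFileInLargeBlocks`, its parameters, and the error number 1000 are made up for this example; the block size is the 65536 figure from the table. I haven't timed this variant, so treat the speedup as an expectation rather than a measurement.

```lotusscript
Sub CopyFileInLargeBlocks(srcPath As String, destPath As String)
    Const BLOCK_SIZE = 65536&   ' largest block size in the table above

    Dim ses As New NotesSession
    Dim inStream As NotesStream, outStream As NotesStream
    Dim buffer As Variant

    Set inStream = ses.CreateStream
    Set outStream = ses.CreateStream

    ' Open returns False if the file can't be associated with the stream
    If Not inStream.Open(srcPath, "binary") Then Error 1000, "Cannot open " & srcPath
    If Not outStream.Open(destPath, "binary") Then Error 1000, "Cannot open " & destPath

    Call outStream.Truncate     ' start from an empty destination file

    Do Until inStream.IsEOS
        buffer = inStream.Read(BLOCK_SIZE)   ' one Read call per block
        Call outStream.Write(buffer)         ' one Write call per block
    Loop

    Call inStream.Close
    Call outStream.Close
End Sub
```

The point of the design is simply to keep the number of Read and Write calls as low as possible for a given file size.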
Best practice recommendation
Lars is correct. Unless you’re only interested in data at a specific known file position that you can access with a single read (e.g. the header bytes of an image file), use the maximum length value, 65536, with the NotesStream.Read method.
Following this recommendation, I shaved 30% off the time required to compute a file checksum (doing arithmetic on each byte takes longer than reading the data). Thanks, Lars!
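For a sense of what such a loop can look like, here is a minimal sketch of a block-wise checksum. It is not the checksum algorithm I actually used; `AdditiveChecksum` is a hypothetical name, and the 16-bit additive sum is just the simplest per-byte arithmetic I could pick for illustration. The loop uses Lbound and Ubound rather than assuming the returned array starts at zero, and the counter is a Long so it can't overflow when it steps past the highest possible index.

```lotusscript
Function AdditiveChecksum(filePath As String) As Long
    Dim ses As New NotesSession
    Dim stream As NotesStream
    Dim buffer As Variant
    Dim i As Long       ' Long, so stepping past index 32767 can't overflow
    Dim total As Long

    Set stream = ses.CreateStream
    If Not stream.Open(filePath, "binary") Then Error 1000, "Cannot open " & filePath

    Do Until stream.IsEOS
        buffer = stream.Read(65536)     ' maximum block size, one call per block
        For i = Lbound(buffer) To Ubound(buffer)
            ' illustrative arithmetic only: running sum kept within 16 bits
            total = (total + buffer(i)) Mod 65536
        Next
    Loop

    Call stream.Close
    AdditiveChecksum = total
End Function
```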
Caveat
While a LotusScript array may contain up to 2^16 (65,536) elements, the array index is a signed 16-bit integer. So if you read more than 2^15-1 (32,767) bytes, the Read method is forced to return an array whose starting index is negative. The bytes will still be in consecutive array positions, but the first one will be at some negative index value.
This isn’t a big deal, but you have to code for it by using Lbound to determine the index of the first element — don’t assume it’ll be zero.
    Do Until stream.IsEOS
        buffer = stream.Read(65536)
        For i = Lbound(buffer) To Ubound(buffer)
            ' whatever it is you're doing to each byte, buffer(i)
        Next
    Loop
If you pass the buffer to the NotesStream.Write method, it’s fine — that method doesn’t assume the array starts at zero.
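If you'd like to see which index range Read hands back in your own environment, a quick check along these lines should do it. `ShowReadBounds` is a hypothetical helper name, and the sketch assumes the file is at least 65536 bytes long, so that the first read returns a full block.

```lotusscript
Sub ShowReadBounds(filePath As String)
    Dim ses As New NotesSession
    Dim stream As NotesStream
    Dim buffer As Variant

    Set stream = ses.CreateStream
    If Not stream.Open(filePath, "binary") Then Error 1000, "Cannot open " & filePath

    buffer = stream.Read(65536)     ' one maximum-size block
    ' Report the first and last index of the returned byte array
    Print "Lbound = " & Lbound(buffer) & ", Ubound = " & Ubound(buffer)

    Call stream.Close
End Sub
```

Trying it with a smaller length argument as well shows where the starting index stops being zero.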
Testing methodology
Here’s the test code, using a performance monitoring class I wrote about earlier:
    Sub testBlockReadSize
        Dim pt As New PerfTimer(500), byt
        Dim i%, ses As New NotesSession, stream As NotesStream, blen&
        blen = 512
        While blen < 70000
            Print blen
            Set stream = ses.Createstream
            stream.open "filepath goes here", "Binary"
            pt.start CStr(blen)
            Do
                byt = stream.read(blen)
            Loop Until stream.iseos
            pt.Finish
            blen = blen * 2&
        Wend
        MsgBox pt.results
    End Sub
The assumption is that your input file is large enough to provide a meaningful challenge — mine was 192 MB. I put in a large cutoff for the PerfTimer — 500 seconds — but it doesn’t matter what I enter there because we’re not using the isDone property — just timing one iteration of the code under test.
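If you don't have the PerfTimer class handy, a rougher version of the same measurement can be sketched with the built-in Timer function, which returns the seconds elapsed since midnight. This is not what I ran for the table above; `TimeBlockSize` and its parameters are hypothetical names, and the result will be wrong if the run happens to cross midnight.

```lotusscript
Sub TimeBlockSize(filePath As String, blen As Long)
    Dim ses As New NotesSession
    Dim stream As NotesStream
    Dim buffer As Variant
    Dim t0 As Single

    Set stream = ses.CreateStream
    If Not stream.Open(filePath, "binary") Then Error 1000, "Cannot open " & filePath

    t0 = Timer                      ' seconds since midnight, before the reads
    Do Until stream.IsEOS
        buffer = stream.Read(blen)  ' read the whole file, blen bytes per call
    Loop
    Print blen & " bytes per read: " & Format$(Timer - t0, "0.00") & " s"

    Call stream.Close
End Sub
```

Calling it once per block size, for example `Call TimeBlockSize("filepath goes here", 512)` and then again with 65536, gives one timing line per run.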