Read Attachment Files without Writing to Disk

Someone sent me a link to this “domino idea” asking for the ability to read file data from attachments in Notes documents without “extracting the file” to disk, since ECLs might not always let you do that.

I never like to say there’s no way to do that, so the question prompted me to create an answer. This code uses DXL — it exports the document, then searches the exported data for the file data objects, which are encoded in base64. Fortunately it’s just the plain file data without CD record headers or the like, so it’s easy to convert to a NotesStream containing the raw data — or to interpret that data as text in a specified character set.
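
To get a feel for the raw material this approach works from, here is a minimal sketch that exports a document to a DXL string and counts the file data elements in it. The function name CountFileData is made up for illustration, and a plain substring search is only a rough check; the library below walks the parsed DOM instead and also verifies each element's parents.

```lotusscript
' Hypothetical helper (not part of the library): count the <filedata>
' elements that appear in a document's DXL export.
Function CountFileData(doc As NotesDocument) As Integer
	Dim ses As New NotesSession
	Dim dxle As NotesDXLExporter
	Dim dxl As String, pos As Long, found As Integer
	Set dxle = ses.CreateDXLExporter
	dxle.OutputDOCTYPE = False
	dxl = dxle.Export(doc)	' the whole document as one DXL string
	' Search for the opening tag; the base64 file data follows it.
	pos = InStr(dxl, "<filedata")
	Do While pos > 0
		found = found + 1
		pos = InStr(pos + 1, dxl, "<filedata")
	Loop
	CountFileData = found
End Function
```

Nothing here touches the file system: the export happens in memory, which is the point of the technique.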

Specification

The AttachmentReader class has the following properties and methods:

  • Sub New (doc As NotesDocument): Scan the document and build a list of attachments.
  • Count As Integer: (read-only). How many attachments there are in the document, including those attached to the document (V2 style) and those attached in rich text items.
  • item(key) As ARAttachment: (read-only). Return a single attachment, selected either by a 1-based numeric index or by the “unique filename” of the attachment. Normally the unique filename is the same as the name of the original file that was attached. If there are multiple attachments with the same name, the duplicates are given numeric suffixes to differentiate them. If no attachment matches the key, the result is Nothing.
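
As a rough sketch of how these three members fit together, the routine below lists every attachment by index and then looks one up by name. The routine name ListAttachmentNames and the filename report.txt are placeholders. Judging from the source code further down, item leaves its result unset when nothing matches, so the lookup is checked with Is Nothing.

```lotusscript
' Assumes Use "AttachmentReader" in (Options).
' ListAttachmentNames and "report.txt" are placeholders for illustration.
Sub ListAttachmentNames(doc As NotesDocument)
	Dim ar As New AttachmentReader(doc)
	Dim att As ARAttachment
	Dim i As Integer
	
	' Index access is 1-based and runs up to Count.
	For i = 1 To ar.Count
		Set att = ar.item(i)
		Print CStr(i) & ": " & att.UniqueName
	Next
	
	' Name access uses the unique filename.
	Set att = ar.item("report.txt")
	If att Is Nothing Then
		Print "No attachment named report.txt"
	Else
		Print "Found " & att.UniqueName
	End If
End Sub
```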

A single attachment is represented by an ARAttachment object, with these properties and methods:

  • Text(char_set$) As NotesStream: Return the text from a text file, interpreting the contents using the specified character set.
  • Raw As NotesStream: Return the binary file data.
  • Sub Write(stream As NotesStream): Stream the binary file data to an existing stream. This is a way to append the data to a file (or just export it to a file, but if you want to do that you don’t need this code).
  • FileName As String: The “display filename,” which is generally the original name of the file, saved as part of the attachment reference in a rich text item.
  • UniqueName As String: The internal filename, usually the same as the FileName property unless there are duplicate file names.
  • ItemName As String: If the attachment is embedded in a rich text item, the name of the rich text item, else “” for V2 style attachments.
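
The sketch below shows one way these members might be used together: it reports each attachment's name, containing item, and size, and appends all the file data to a single in-memory stream with Write. The routine name DescribeAttachments is hypothetical. The fallback from FileName to UniqueName is an assumption based on the source code below, where FileName is only filled in when a rich text attachment reference is found.

```lotusscript
' Assumes Use "AttachmentReader" in (Options).
' DescribeAttachments is a hypothetical name used for illustration.
Sub DescribeAttachments(doc As NotesDocument)
	Dim ses As New NotesSession
	Dim ar As New AttachmentReader(doc)
	Dim att As ARAttachment
	Dim rawStream As NotesStream, combined As NotesStream
	Dim shownName As String, i As Integer
	
	Set combined = ses.CreateStream	' in-memory stream, no file opened
	For i = 1 To ar.Count
		Set att = ar.item(i)
		' FileName may be blank (for example, V2 style attachments),
		' so fall back to the unique name.
		shownName = att.FileName
		If shownName = "" Then shownName = att.UniqueName
		Set rawStream = att.Raw
		Print shownName & " [" & att.ItemName & "]: " & CStr(rawStream.Bytes) & " bytes"
		Call att.Write(combined)	' append this file's bytes
	Next
	Print "Total bytes: " & CStr(combined.Bytes)
End Sub
```

Each call to Raw or Write decodes the base64 data again, so if you need the same file more than once it is probably worth keeping the stream around.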

Source Code

%REM
	Library AttachmentReader
	
	A LotusScript class to read attachment file data without first writing it to disk.
	This is helpful in situations where the ECL might not allow the code to access disk.
	
	© 2023 Andre Guirard

	Licensed under the Apache License, Version 2.0 (the "License");
	you may not use this file except in compliance with the License.
	You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

	Unless required by applicable law or agreed to in writing, software
	distributed under the License is distributed on an "AS IS" BASIS,
	WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
	See the License for the specific language governing permissions and
	limitations under the License.
%END REM
Option Public
Option Declare

%REM
	Class ARAttachment
	Represents one file attachment in a Notes document.
	Constructor: New ARAttachment(base64TextNode As NotesDOMTextNode, filename$)
		Where base64TextNode contains the base64-encoded file data
		and filename is the unique name of the file (which rich text uses to reference it unambiguously even in cases where multiple files with the same name have been attached).
%END REM
Class ARAttachment
	' if attached to a rich text item, name of item -- blank if attached to document (V2 style attachment).
	Public ItemName As String
	
	' original filename of attachment as recorded in the rich text data.
	Public FileName As String
	
	' unique name of attachment within document
	Public UniqueName As String
	Private z_data As String
	
	Sub New(tex As NotesDOMTextNode, ByVal uniqNam$)
		UniqueName = uniqNam
		z_data = tex.Nodevalue
	End Sub
	
	%REM
		Property Text (read) - Interpret the file data as text in a specified character set.
			The result is returned in Unicode regardless of what character set the data are in, because that's a limitation of the NotesStream class.
	%END REM
	Public Property Get Text(ByVal char_set$) As NotesStream
		Dim doc As NotesDocument
		Dim mime As NotesMIMEEntity
		Dim streamIn As NotesStream, streamOut As NotesStream
		Dim db As NotesDatabase
		Dim ses As New NotesSession
		Dim contentType$
		Set db = ses.CurrentDatabase
		Set doc = db.CreateDocument
		Set mime = doc.CreateMIMEEntity("Body")
		Set streamIn = ses.CreateStream  
		Call streamIn.WriteText(z_data)
		streamIn.Position = 0
		If char_set = "" Then contentType = "text/plain" Else contentType = "text/plain;charset=" & char_set 
		Call mime.SetContentFromText(streamIn, contentType, ENC_BASE64)
		Set streamOut = ses.Createstream
		Call mime.Getcontentastext(streamOut, True)
		streamOut.Position = 0
		Set me.Text = streamOut
	End Property
	
	%REM
		Sub Write: Write the binary data into a NotesStream. Stream position is at end of data.
		This is the most efficient way to write the data into a file.
	%END REM
	Sub Write(stream As NotesStream)
		Dim doc As NotesDocument
		Dim mime As NotesMIMEEntity
		Dim streamIn As NotesStream
		Dim db As NotesDatabase
		Dim ses As New NotesSession
		Set db = ses.CurrentDatabase
		Set doc = db.CreateDocument
		Set mime = doc.CreateMIMEEntity("Body")
		Set streamIn = ses.CreateStream  
		Call streamIn.WriteText(z_data)
		streamIn.Position = 0
		Call mime.SetContentFromText(streamIn, "binary", ENC_BASE64)
		Call mime.Getcontentasbytes(Stream, True)
	End Sub
	
	%REM
		Property Raw (read) - Return the binary data. Stream position is at the beginning (0).
	%END REM
	Public Property Get Raw As NotesStream
		Dim streamOut As NotesStream
		Dim ses As New NotesSession
		Set streamOut = ses.CreateStream
		me.Write streamOut
		streamOut.Position = 0
		Set Raw = streamOut
	End Property
End Class

%REM
	Class AttachmentReader
	Read file attachments from a NotesDocument.
	Constructor: New AttachmentReader(doc As NotesDocument)
%END REM
Class AttachmentReader
	Private z_byName List As ARAttachment
	Private z_attCount As Integer
	Private z_byIndex() As ARAttachment
	
	Sub New (doc As NotesDocument)
		Dim ses As New NotesSession
		Dim domp As NotesDOMParser
		Dim dxle As NotesDXLExporter, domd As NotesDOMDocumentNode, elRoot As NotesDOMElementNode
		Set dxle = ses.Createdxlexporter(doc)
		Set domp = ses.Createdomparser(dxle)
		dxle.Omitrichtextpictures = True
		dxle.Outputdoctype = False
		dxle.Process
		Set domd = domp.Document
		Set elRoot = domd.Documentelement
		Dim nl As NotesDOMNodeList, elFile As NotesDOMElementNode, elFileData As NotesDOMElementNode
		Dim elObj As NotesDOMElementNode, elItem As NotesDOMElementNode, nod As NotesDOMNode, tex As NotesDOMTextNode, i%
		Set nl = elRoot.Getelementsbytagname("filedata")
		For i = 1 To nl.Numberofentries
			Set elFileData = nl.Getitem(i)
			Set tex = Nothing
			Set elFile = elFileData.Parentnode
			If elFile.Nodename = "file" Then
				Set elObj = elFile.Parentnode
				If elObj.Nodename = "object" Then
					Set elItem = elObj.Parentnode
					If elItem.Nodename = "item" Then
						Set nod = elFileData.Firstchild
						Do Until nod.Isnull
							If nod.Nodetype = DOMNODETYPE_TEXT_NODE Then
								Set tex = nod
								Exit Do
							End If
							Set nod = nod.Nextsibling
						Loop
					End If
				End If
			End If
			If Not (tex Is Nothing) Then
				Dim ara As New ARAttachment(tex, elFile.Getattribute("name"))
				Set z_byName(ara.UniqueName) = ara
				z_attCount = z_attCount + 1
			End If
		Next
		If z_attCount Then
			ReDim z_byIndex(1 To z_attCount)
			i = 0
			ForAll thing In z_byName
				i = i + 1
				Set z_byIndex(i) = thing
			End ForAll
		End If
		
		Set nl = elRoot.Getelementsbytagname("attachmentref")
		For i = 1 To nl.Numberofentries
			Dim key$, displayName$, elRef As NotesDOMElementNode
			Set elRef = nl.Getitem(i)
			key = elRef.Getattribute("name")
			displayName = elRef.Getattribute("displayname")
			If IsElement(z_byName(key)) Then
				Set ara = z_byName(key)
				ara.FileName = displayName
				Set elItem = elRef.Parentnode
				Do Until elItem.Isnull
					If elItem.Nodename = "item" Then
						ara.ItemName = elItem.Getattribute("name")
						Exit Do
					End If
					Set elItem = elItem.Parentnode
				Loop
			End If
		Next
	End Sub
	
	' Property item(key): If key is a number from 1 to the number of attachments, return the attachment at that number position.
	' If key is a string, return the attachment with that unique filename.
	Public Property Get item(ByVal key) As ARAttachment
		If DataType(key) = 8 Then
			If IsElement(z_byName(key)) Then
				Set item = z_byName(key)
			End If
		Else
			If key >= 1 And key <= z_attCount Then
				Set item = z_byIndex(key)
			End If
		End If
	End Property
	
	Public Property Get Count As Integer
		Count = z_attCount
	End Property
End Class

Code Sample

Option Public
Option Declare
Use "AttachmentReader"

Sub Initialize
	Dim ses As New NotesSession, db As NotesDatabase, doc1 As NotesDocument, vu As NotesView
	Dim itum As NotesItem
	Set db = ses.Currentdatabase
	Set vu = db.getview("Main")
	Set doc1 = vu.Getdocumentbykey("utf8", True)
	Dim ar As New AttachmentReader(doc1)
	Dim ara As ARAttachment, contents$, stream As NotesStream, bytes, i%
	
	Set ara = ar.item("utf8.txt")
	contents = ara.text("UTF-8").readtext & {
Hex:}
	Set stream = ara.Raw
	bytes = stream.Read
	For i = 0 To UBound(bytes)
		contents = contents & " " & Right$("0" & Hex(bytes(i)), 2)
	Next
	MsgBox "File contents: " & contents
End Sub

9 thoughts on “Read Attachment Files without Writing to Disk”

  1. Woah, this is cool. Never occurred to me to use DXL to access attachments, or anything else, inside of a document. Brilliant!

  2. If you are brave enough to use Java, you do not need to hack around with the cumbersome and unreliable DXL routines in LotusScript.

    Read the attachment (as embedded object) in a binary stream and return that content in a proper UTF-8 converted text.

    private String getAttachmentContent(Document docTarget, String rtiFieldName, String fileName)
            throws NotesException, IOException {
        final RichTextItem rti = (RichTextItem) docTarget.getFirstItem(rtiFieldName);
        final EmbeddedObject eo = rti.getEmbeddedObject(fileName);
        if (null != eo) {
            final InputStream isFile = eo.getInputStream();
            final ByteArrayOutputStream resultData = new ByteArrayOutputStream();
            final byte[] buffer = new byte[1024];
            // Copy the attachment into memory one buffer at a time.
            for (int length; (length = isFile.read(buffer)) != -1;) {
                resultData.write(buffer, 0, length);
            }
            isFile.close();

            final String fileContent = resultData.toString("UTF-8");
            return fileContent;
        }
        return "";
    }

    1. Yes, this is another way to do it if using Java is an option. It isn’t always, because sometimes there’s Notes client UI involvement. You can invoke Java functions via LS2J, but if the data are not text, passing back binary data to the LotusScript code might be a challenge, and performance is also an issue with LS2J.

      I’m also uncertain getInputStream will work in all cases, as it uses a temporary file. It would be interesting to try that, and also to do performance testing since reading and writing a disk file is likely to slow things down.

  3. Great article! Thank you for sharing.

    Is there a way to save a file onto a document without it going to disk first? We looked at doing that in Java but didn’t seem like that was possible.

    1. I believe so. I haven’t reviewed the attachment-creation library recently but I suppose you can just make up any name — it doesn’t have to be an actual file.

  4. @David Leedy: This is very do-able in Java. Have a look at this solution on Stackoverflow that’s a FileItem class but implemented in memory. You can adapt this to handle attachments in memory only. I’ve used it to handle attachments in multipart form data then save to notesdocuments. Works a treat. https://stackoverflow.com/a/62624820/12571484

  5. Thank you for this really useful class.

    I’d been battling for a few days, without success, working out how to process unicode text files containing emojis without resorting to file system access. I much prefer in-memory and this code does the trick perfectly.
