Someone sent me a link to this “domino idea” asking for the ability to read file data from attachments in Notes documents without “extracting the file” to disk, since ECLs might not always let you do that.
I never like to say there’s no way to do that, so the question prompted me to create an answer. This code uses DXL — it exports the document, then searches the exported data for the file data objects, which are encoded in base64. Fortunately it’s just the plain file data without CD record headers or the like, so it’s easy to convert to a NotesStream containing the raw data — or to interpret that data as text in a specified character set.
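To make the decoding step concrete outside of Notes, here is a minimal Java sketch of the same idea: take base64 text, turn it into raw bytes, and optionally interpret those bytes as text in a named character set. The class and method names (Base64FileData, toBytes, toText) are hypothetical and exist only for this illustration. The library itself does this work in LotusScript through a NotesMIMEEntity, not through this code.

```java
import java.nio.charset.Charset;
import java.util.Base64;

// Hypothetical helper for illustration only; not part of the AttachmentReader library.
class Base64FileData {
    // Decode base64 text into the raw file bytes.
    // The MIME decoder is used because it ignores line breaks inside the encoded text.
    static byte[] toBytes(String base64Text) {
        return Base64.getMimeDecoder().decode(base64Text);
    }

    // Decode the file bytes, then interpret them as text in the named character set.
    static String toText(String base64Text, String charsetName) {
        return new String(toBytes(base64Text), Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        // Six UTF-8 bytes, base64-encoded and split across two lines.
        String base64 = "aMOp\r\nbGxv";
        System.out.println(toBytes(base64).length + " bytes");
        System.out.println(toText(base64, "UTF-8"));
    }
}
```

The two methods correspond roughly to the Raw and Text properties described in the specification: one hands back bytes, the other hands back text after applying a character set.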
Specification
The AttachmentReader class has the following properties and methods:
- Sub New (doc As NotesDocument): Scan the document and build a list of attachments.
- Count As Integer: (read-only). How many attachments there are in the document, including those attached to the document (V2 style) and those attached in rich text items.
- item(key) As ARAttachment: (read-only). Return a single attachment, selected either by a 1-based numeric index or by the “unique filename” of the attachment. Normally the unique filename is the same as the name of the original file that was attached. If there are multiple attachments with the same name, the duplicates are given numeric suffixes to differentiate them.
A single attachment is represented by an ARAttachment object, with these properties and methods:
- Text(char_set$) As NotesStream: Return the text from a text file, interpreting the contents using the specified character set.
- Raw As NotesStream: Return the binary file data.
- Sub Write(stream As NotesStream): Stream the binary file data to an existing stream. This is a way to append the data to a file (or just export it to a file, but if you want to do that you don’t need this code).
- FileName As String: The “display filename,” which is generally the original name of the file, saved as part of the attachment reference in a rich text item.
- UniqueName As String: The internal filename, usually the same as the FileName property unless there are duplicate file names.
- ItemName As String: If the attachment is embedded in a rich text item, the name of the rich text item, else “” for V2 style attachments.
Source Code
%REM
	Library AttachmentReader
	A LotusScript class to read attachment file data without first writing it to disk.
	This is helpful in situations where the ECL might not allow the code to access disk.

	© 2023 Andre Guirard

	Licensed under the Apache License, Version 2.0 (the "License");
	you may not use this file except in compliance with the License.
	You may obtain a copy of the License at

		http://www.apache.org/licenses/LICENSE-2.0

	Unless required by applicable law or agreed to in writing, software
	distributed under the License is distributed on an "AS IS" BASIS,
	WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
	See the License for the specific language governing permissions and
	limitations under the License.
%END REM
Option Public
Option Declare

%REM
	Class ARAttachment
	Represents one file attachment in a Notes document.
	Constructor: New ARAttachment(base64TextNode As NotesDOMTextNode, filename$)
		Where base64TextNode contains the base64-encoded file data and filename is the
		unique name of the file (which rich text uses to reference it unambiguously even
		in cases where multiple files with the same name have been attached).
%END REM
Class ARAttachment
	' if attached to a rich text item, name of item -- blank if attached to document (V2 style attachment).
	Public ItemName As String
	' original filename of attachment as recorded in the rich text data.
	Public FileName As String
	' unique name of attachment within document
	Public UniqueName As String
	Private z_data As String

	Sub New(tex As NotesDOMTextNode, ByVal uniqNam$)
		UniqueName = uniqNam
		z_data = tex.Nodevalue
	End Sub

%REM
	Property Text (read) - Interpret the file data as text in a specified character set.
	The result is returned in Unicode regardless what character set the data are in,
	because that's a limitation of the NotesStream class.
%END REM
	Public Property Get Text(ByVal char_set$) As NotesStream
		Dim doc As NotesDocument
		Dim mime As NotesMIMEEntity
		Dim streamIn As NotesStream, streamOut As NotesStream
		Dim db As NotesDatabase
		Dim ses As New NotesSession
		Dim contentType$
		Set db = ses.CurrentDatabase
		Set doc = db.CreateDocument
		Set mime = doc.CreateMIMEEntity("Body")
		Set streamIn = ses.CreateStream
		Call streamIn.WriteText(z_data)
		streamIn.Position = 0
		If char_set = "" Then contentType = "text/plain" Else contentType = "text/plain;charset=" & char_set
		Call mime.SetContentFromText(streamIn, contentType, ENC_BASE64)
		Set streamOut = ses.Createstream
		Call mime.Getcontentastext(streamOut, True)
		streamOut.Position = 0
		Set me.Text = streamOut
	End Property

%REM
	Sub Write: Write the binary data into a NotesStream. Stream position is at end of data.
	This is the most efficient way to write the data into a file.
%END REM
	Sub Write(stream As NotesStream)
		Dim doc As NotesDocument
		Dim mime As NotesMIMEEntity
		Dim streamIn As NotesStream
		Dim db As NotesDatabase
		Dim ses As New NotesSession
		Set db = ses.CurrentDatabase
		Set doc = db.CreateDocument
		Set mime = doc.CreateMIMEEntity("Body")
		Set streamIn = ses.CreateStream
		Call streamIn.WriteText(z_data)
		streamIn.Position = 0
		Call mime.SetContentFromText(streamIn, "binary", ENC_BASE64)
		Call mime.Getcontentasbytes(Stream, True)
	End Sub

%REM
	Property Raw (read) - Return the binary data. Stream position is at the beginning (0).
%END REM
	Public Property Get Raw As NotesStream
		Dim streamOut As NotesStream
		Dim ses As New NotesSession
		Set streamOut = ses.CreateStream
		me.Write streamOut
		streamOut.Position = 0
		Set Raw = streamOut
	End Property
End Class

%REM
	Class AttachmentReader
	Read file attachments from a NotesDocument.
	Constructor: New AttachmentReader(doc As NotesDocument)
%END REM
Class AttachmentReader
	Private z_byName List As ARAttachment
	Private z_attCount As Integer
	Private z_byIndex() As ARAttachment

	Sub New (doc As NotesDocument)
		Dim ses As New NotesSession
		Dim domp As NotesDOMParser
		Dim dxle As NotesDXLExporter, domd As NotesDOMDocumentNode, elRoot As NotesDOMElementNode
		Set dxle = ses.Createdxlexporter(doc)
		Set domp = ses.Createdomparser(dxle)
		dxle.Omitrichtextpictures = True
		dxle.Outputdoctype = False
		dxle.Process
		Set domd = domp.Document
		Set elRoot = domd.Documentelement

		Dim nl As NotesDOMNodeList, elFile As NotesDOMElementNode, elFileData As NotesDOMElementNode
		Dim elObj As NotesDOMElementNode, elItem As NotesDOMElementNode, nod As NotesDOMNode, tex As NotesDOMTextNode, i%
		Set nl = elRoot.Getelementsbytagname("filedata")
		For i = 1 To nl.Numberofentries
			Set elFileData = nl.Getitem(i)
			Set tex = Nothing
			Set elFile = elFileData.Parentnode
			If elFile.Nodename = "file" Then
				Set elObj = elFile.Parentnode
				If elObj.Nodename = "object" Then
					Set elItem = elObj.Parentnode
					If elItem.Nodename = "item" Then
						Set nod = elFileData.Firstchild
						Do Until nod.Isnull
							If nod.Nodetype = DOMNODETYPE_TEXT_NODE Then
								Set tex = nod
								Exit Do
							End If
							Set nod = nod.Nextsibling
						Loop
					End If
				End If
			End If
			If Not (tex Is Nothing) Then
				Dim ara As New ARAttachment(tex, elFile.Getattribute("name"))
				Set z_byName(ara.UniqueName) = ara
				z_attCount = z_attCount + 1
			End If
		Next

		If z_attCount Then
			ReDim z_byIndex(1 To z_attCount)
			i = 0
			ForAll thing In z_byName
				i = i + 1
				Set z_byIndex(i) = thing
			End ForAll
		End If

		Set nl = elRoot.Getelementsbytagname("attachmentref")
		For i = 1 To nl.Numberofentries
			Dim key$, displayName$, elRef As NotesDOMElementNode
			Set elRef = nl.Getitem(i)
			key = elRef.Getattribute("name")
			displayName = elRef.Getattribute("displayname")
			If IsElement(z_byName(key)) Then
				Set ara = z_byName(key)
				ara.FileName = displayName
				Set elItem = elRef.Parentnode
				Do Until elItem.Isnull
					If elItem.Nodename = "item" Then
						ara.ItemName = elItem.Getattribute("name")
						Exit Do
					End If
					Set elItem = elItem.Parentnode
				Loop
			End If
		Next
	End Sub

	' Property item(key): If key is a number from 1 to the number of attachments, return the attachment at that number position.
	' If key is a string, return the attachment with that unique filename.
	Public Property Get item(ByVal key) As ARAttachment
		If DataType(key) = 8 Then
			If IsElement(z_byName(key)) Then
				Set item = z_byName(key)
			End If
		Else
			If key >= 1 And key <= z_attCount Then
				Set item = z_byIndex(key)
			End If
		End If
	End Property

	Public Property Get Count As Integer
		Count = z_attCount
	End Property
End Class
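For readers who want to see the element search in isolation, the following Java sketch mimics the first pass of the constructor: find every filedata element, accept it only when its ancestors are file, object, and item in that order, and key the base64 text by the file element's name attribute. The XML string is a hand-written, heavily simplified stand-in for DXL (an assumption for this example; real exporter output carries a namespace and many more attributes), and the class and method names are hypothetical. Unlike the LotusScript, which picks the first text child node, this sketch simply uses getTextContent.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical illustration of the filedata scan; not part of the library.
class DxlFileDataScan {
    // Simplified stand-in for exported DXL (assumption: only the nesting matters here).
    static final String SAMPLE =
        "<document>"
      + "<item name='$FILE'><object><file name='utf8.txt'>"
      + "<filedata>aMOpbGxv</filedata></file></object></item>"
      + "<item name='Body'><richtext><par>"
      + "<attachmentref name='utf8.txt' displayname='utf8.txt'/>"
      + "</par></richtext></item>"
      + "</document>";

    // Map each unique file name to its base64 text, in document order.
    static Map<String, String> scan(String xml) {
        Document dom;
        try {
            dom = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
        Map<String, String> found = new LinkedHashMap<>();
        NodeList list = dom.getElementsByTagName("filedata");
        for (int i = 0; i < list.getLength(); i++) {
            Node fileData = list.item(i);
            Node file = fileData.getParentNode();
            if (!isElement(file, "file")) continue;
            Node object = file.getParentNode();
            if (!isElement(object, "object")) continue;
            if (!isElement(object.getParentNode(), "item")) continue;
            found.put(((Element) file).getAttribute("name"), fileData.getTextContent().trim());
        }
        return found;
    }

    // True when the node is an element with the given tag name.
    static boolean isElement(Node n, String name) {
        return n != null && n.getNodeType() == Node.ELEMENT_NODE && name.equals(n.getNodeName());
    }

    public static void main(String[] args) {
        System.out.println(scan(SAMPLE));
    }
}
```

Checking the ancestor chain is what keeps unrelated filedata elements out of the result, which is the same reason the LotusScript constructor tests the parent node names before accepting a match.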
Code Sample
Option Public
Option Declare
Use "AttachmentReader"

Sub Initialize
	Dim ses As New NotesSession, db As NotesDatabase, doc1 As NotesDocument, vu As NotesView
	Dim itum As NotesItem
	Set db = ses.Currentdatabase
	Set vu = db.getview("Main")
	Set doc1 = vu.Getdocumentbykey("utf8", True)
	Dim ar As New AttachmentReader(doc1)
	Dim ara As ARAttachment, contents$, stream As NotesStream, bytes, i%
	Set ara = ar.item("utf8.txt")
	contents = ara.text("UTF-8").readtext & { Hex:}
	Set stream = ara.Raw
	bytes = stream.Read
	For i = 0 To UBound(bytes)
		contents = contents & " " & Right$("0" & Hex(bytes(i)), 2)
	Next
	MsgBox "File contents: " & contents
End Sub
Woah, this is cool. Never occurred to me to use DXL to access attachments, or anything else, inside of a document. Brilliant!
If you are brave enough to use Java, you do not need to hack around with the cumbersome and unreliable DXL routines in LotusScript.
Read the attachment (as an embedded object) into a binary stream and return its content as text decoded from UTF-8:
// Needs: java.io.ByteArrayOutputStream, java.io.IOException, java.io.InputStream, lotus.domino.*
private String getAttachmentContent(Document docTarget, String rtiFieldName, String fileName)
        throws NotesException, IOException {
    final RichTextItem rti = (RichTextItem) docTarget.getFirstItem(rtiFieldName);
    final EmbeddedObject eo = rti.getEmbeddedObject(fileName);
    if (null != eo) {
        final InputStream isFile = eo.getInputStream();
        final ByteArrayOutputStream resultData = new ByteArrayOutputStream();
        final byte[] buffer = new byte[1024];
        // Copy the attachment stream into memory in 1 KB chunks.
        for (int length; (length = isFile.read(buffer)) != -1;) {
            resultData.write(buffer, 0, length);
        }
        isFile.close();
        // Decode the collected bytes as UTF-8 text.
        final String fileContent = resultData.toString("UTF-8");
        return fileContent;
    }
    return "";
}
Yes, this is another way to do it if using Java is an option. It isn’t always, because sometimes there’s Notes client UI involvement. You can invoke Java functions via LS2J, but if the data are not text, passing back binary data to the LotusScript code might be a challenge, and performance is also an issue with LS2J.
I’m also uncertain getInputStream will work in all cases, as it uses a temporary file. It would be interesting to try that, and also to do performance testing since reading and writing a disk file is likely to slow things down.
Great article! Thank you for sharing.
Is there a way to save a file onto a document without it going to disk first? We looked at doing that in Java, but it didn’t seem like that was possible.
IIRC, the LotusScript Gold Collection project in OpenNTF has a library for that.
In the write scenario, don’t you need to specify the filename of the binary data?
I believe so. I haven’t reviewed the attachment-creation library recently but I suppose you can just make up any name — it doesn’t have to be an actual file.
@David Leedy: This is very doable in Java. Have a look at this solution on Stack Overflow, which is a FileItem class implemented in memory. You can adapt it to handle attachments in memory only. I’ve used it to handle attachments in multipart form data and then save them to Notes documents. Works a treat. https://stackoverflow.com/a/62624820/12571484
Thank you for this really useful class.
I’d been battling for a few days, without success, to work out how to process Unicode text files containing emojis without resorting to file system access. I much prefer in-memory processing, and this code does the trick perfectly.