Notes/Domino applications don’t just have code — they also store data, often a lot of it. But it can take years for them to accumulate enough documents for any performance issues to start seriously impacting users.
When designing an application, especially a brand new one, it’s important to performance test it with an unreasonable amount of sample data so any performance issues become evident immediately.
Or unexpected hard limits! Say your code depends on an array, and hasn’t taken into account array size limits. Or there’s an Integer variable somewhere that really should be a Long.
If you don’t learn about these problems before deployment, then by the time they surface the original developers have long since scattered and forgotten what they did on this project anyway, and meanwhile the app is unusable while you wait for a fix.
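To make the Integer-versus-Long point concrete, here’s a minimal sketch (the variable names and the 50,000 figure are made up for illustration). A LotusScript Integer is 16 bits, so a counter declared that way fails once it passes 32,767, which a test with a few hundred documents would never reveal.

```lotusscript
Option Declare

Sub Initialize
	Dim count As Integer   ' 16-bit signed: tops out at 32,767
	Dim pass As Long       ' 32-bit signed: plenty of room for this loop
	On Error GoTo overflowed
	For pass = 1 To 50000
		count = count + 1  ' expected to raise an overflow error once count is at 32,767
	Next
	Exit Sub
overflowed:
	Print "Counter gave up at " & count & ": " & Error$
End Sub
```

Declaring the counter As Long is the whole fix, but you only find out you need it if the test data is big enough.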
Lots and lots of names
The best test data looks like real data. It’s not just for the looks — having a limited number of different values in your fields can fail to exercise some aspects of the application — such as a keyword field with a calculated selection list reading from a categorized view column, which will start to fail when it reaches 65KB of return data.
So let’s say your test documents need to contain a person’s name, and you want to generate 50,000 documents each with a different name — or maybe nearly all different. How do you automate that?
The random number generator is your friend here. You can always just have a list of first names, a list of last names, and randomly combine them. The number of combinations is the product of the number of choices for each part, so 50 first and 50 last names would give you… 2,500 unique names. Hm. Not quite hitting the 50,000 name target. Adding a random middle initial gets you to 65,000 — barely enough.
Picking 50,000 random names from a pool of 65,000 is likely to result in a lot of duplicates: with uniformly random picks you’d expect only about 35,000 distinct names, the rest being repeats. If the names need to be unique, you have to keep track of which ones you’ve used. A List variable can do this. Using the names as key values, it’s easy to tell whether a name is already in it and keep trying until you get one that isn’t.
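As a rough sketch of that approach, assuming two tiny made-up part lists (the names and the target of 10 are illustrative only), a List keyed by name lets you reject repeats:

```lotusscript
Option Declare

Sub Initialize
	Dim firsts As Variant, lasts As Variant
	firsts = Split("Anne,Ben,Carol,Dan", ",")
	lasts = Split("Ertz,Took,Lund,Reyes", ",")

	Dim used List As Integer   ' keys are the names handed out so far
	Dim made As Integer, candidate As String
	Randomize                  ' seed from the clock so each run differs
	Do While made < 10
		' combine one random first name with one random last name
		candidate = firsts(Fix(Rnd * (UBound(firsts) + 1))) & " " & _
			lasts(Fix(Rnd * (UBound(lasts) + 1)))
		If Not IsElement(used(candidate)) Then
			used(candidate) = 1    ' remember it so it is never issued again
			made = made + 1
			Print candidate
		End If
	Loop
End Sub
```

Note that this loop only terminates because 10 is fewer than the 16 possible combinations. Ask for more names than the pool holds and it would spin forever, which is why a retry limit is a good idea.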
And you might have other strings you need to randomly generate also. It would be nice to have a systematic way to do this and not have to code it every time.
Well, rejoice! Here’s a class that incorporates all that logic in a reusable form.
As a bonus, it contains functions to anonymize real data by replacing names with randomly generated other names in a consistent way (i.e. the first time a name is encountered it’s randomly replaced, then subsequent occurrences of the same name are replaced with the same string).
This class can also be used to generate random IDs of other sorts, with a different list of “parts” to combine together, but generally it’s fine to just use sequential numbers for those, which makes uniqueness easy.
Specification
The RandomNameGenerator class has these members:
- Parts (write only): A string containing a newline-separated list of comma-separated lists of parts that will be used to construct names. For instance if Parts is set to “a,b<newline>1,2,3”, the class can generate up to six different strings by combining a or b with 1, 2, or 3. The default value can generate around 21 million names of people. Lines can contain a single value, e.g. “-”. “(space)” is interpreted as a single space if it’s on a line by itself. You may repeat values on the same line to adjust the likelihood of a particular value being selected. You can also mix in as many blank values (consecutive commas) as you like to control the likelihood of that part being excluded altogether.
- Spacer (write only): A string to be inserted between the parts, default empty string.
- Repeatable (boolean, read/write): applies to the MaskName function; see its description below.
- PossibleCombinations (double, read only): returns the number of unique names possible under the current configuration. This may be an overestimate if there’s some overlap in your Parts values such that different parts may combine to generate the same string. It does, however, take into account duplicate part values within the same line. It’s recommended that you provide enough parts to combine into at least 10 times the number of unique names you’re likely to want.
- Substitutions (list as string, read/write): If using MaskName, the list of substitutions that have been returned on previous calls. E.g. if Substitutions(“George”) = “Fred”, a previous call to MaskName(“George”) returned “Fred”.
- Count (long, read only): the number of unique names generated so far by GetUnique, GetUniqueName, or MaskName.
- Function GetName As String: returns a random name generated from the Parts supplied. The name is not stored, so if you call again you might get a duplicate value.
- Function GetUnique(limit) As String: tries limit times to generate a random name that hasn’t already been used in this session. If that fails, returns “”, else the name. NOTE: a call to GetName doesn’t mark the name as used for this purpose.
- Function GetUniqueName As String: Like GetUnique, but will never return “”. It’ll come up with something unique, though it might add a hexadecimal number to the end to make sure of it; if even that fails after 10,000 attempts, it raises the error ERR_PACKED. If you’ve provided enough Parts to combine into ten times the number of unique values you’ll need, the odds of the suffix being necessary are very low, on the order of a random meteorite obliterating your winning PowerBall ticket.
- Function MaskName(name) As String: Given a name, return a random made-up name to replace it with, for purposes of anonymizing data. If called again with the same input, it’ll return the same output (so records belonging to the same person can still be grouped even though you don’t know the person’s real name). If the Repeatable property is set True, MaskName will return the same name given the same input even between different sessions (the random name is generated using a hash of the input as the randomizer’s key value).
Note: If using MaskName with the Repeatable option, the name is less securely anonymized, since you can come back later and try a real name to see how it would appear in the output data. But someone who doesn’t have your Parts value will not be able to reproduce this transformation.
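Here’s a small sketch of configuring the class with your own parts, borrowing the Anne/Ben example from the comments in the source. It assumes the library is saved as a script library named "RandomNameGenerator", and that a line break typed inside braces matches the separator the library splits on (it’s written the same way there).

```lotusscript
Option Declare
Use "RandomNameGenerator"

' A literal line break, written the same way the library writes its own separator.
Const NL = {
}

Sub Initialize
	Dim rg As New RandomNameGenerator, i%
	rg.Parts = "Anne,Ben" & NL & "Ertz,Took"      ' line 1: first names, line 2: last names
	rg.Spacer = " "                               ' put a space between the parts
	Print "Possible: " & rg.PossibleCombinations  ' 2 choices x 2 choices
	For i = 1 To 5
		Print rg.GetName                          ' may repeat: GetName doesn't track usage
	Next
End Sub
```

With only four combinations available this is no good for generating unique names, but it’s a quick way to check that your Parts string is being parsed the way you intended before you scale it up.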
Source code
%REM
	Library RandomNameGenerator
	Create random names and other strings for automating the creation of test data records.
	Also includes a function to transform names to random other names, for anonymizing data.

	© 2022 Andre Guirard

	Licensed under the Apache License, Version 2.0 (the "License");
	you may not use this file except in compliance with the License.
	You may obtain a copy of the License at

	http://www.apache.org/licenses/LICENSE-2.0

	Unless required by applicable law or agreed to in writing, software
	distributed under the License is distributed on an "AS IS" BASIS,
	WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
	See the License for the specific language governing permissions and
	limitations under the License.
%END REM
Option Public
Option Declare

Public Const ERR_PACKED = 30440
Public Const MSG_PACKED = "Too difficult to find a unique name. Add more parts to allow for more combinations."

'Begin DNT

%REM
	Class RandomNameGenerator
	Description: Generate random multi-word or -syllable names from a set of values
		supplied by the caller, or from a default set.
	Constructor: New RandomNameGenerator
%END REM
Class RandomNameGenerator
	z_nameparts() As Variant
	z_used List As Integer
	z_inited As Boolean
	z_spacer As String
	z_times As Integer
	z_count As Long
	z_combinations As Double

	' if True, use a technique that will "mask" a name the same way every time.
	Public Repeatable As Boolean

	' for the MaskName function to remember the original names it masked.
	Public Substitutions List As String

	%REM
		Property Set Spacer
		Description: Set what character is inserted between parts of the name (default, nothing).
	%END REM
	Property Set Spacer As String
		z_spacer = spacer
	End Property

	%REM
		Property Count (read only)
		Description: How many unique names have been generated
	%END REM
	Public Property Get Count As Long
		Count = z_count
	End Property

	%REM
		Property Set Parts
		Description: The Parts property is a string containing multiple lines of text,
			each line being a comma-delimited list of parts you want to paste together
			into a name. For instance if the value is "Anne,Ben<NL>Ertz,Took" then the
			possible names generated will be Anne Ertz, Ben Ertz, Anne Took, and Ben Took
			(assuming spacer is " ").
	%END REM
	Public Property Set Parts As String
		Dim lines, i%
		lines = Split(parts, {
})
		ReDim z_nameparts(0 To UBound(lines))
		For i = 0 To UBound(lines)
			If lines(i) = "(space)" Then lines(i) = " "
			z_nameparts(i) = Split(lines(i), ",")
		Next
		z_inited = True
		z_combinations = 0
	End Property

	%REM
		Sub UseDefaults
		Description: The caller didn't supply any name parts, so use the default lists.
	%END REM
	Private Sub UseDefaults
		Const DEFAULTPARTS = _
		{Alexis,Andy,Anita,Arnold,August,Autumn,Bella,Ben,Bill,Carmen,Carol,Cheryl,Chloe,Chris,Dan,Dana,Dean,Delores,Denise,Dennis,Dexter,Elizabeth,Emile,Ethan,Evelyn,Ferris,Frank,Fred,Fritz,George,Greg,Gus,Hal,Hank,Helga,Hiro,Holly,Howard,Isaac,Ivana,James,Jennifer,Joan,John,Joseph,Juan,Judy,Julia,Justin,Karl,Keiko,Kelly,Kim,Kirk,Laura,Lex,Lily,Lisa,Lorraine,Manny,Maria,Mario,Mark,Martha,Mary,Michelle,Miriam,Mitch,Naomi,Ned,Nicole,Nita,Olga,Ozgur,Patti,Paul,Peter,Phil,Pippy,Rebecca,Rex,Richard,Samuel,Sanjay,Sarah,Sean,Sean,Sigmund,Sven,Tanita,Tate,Ted,Tip,Tony,Umberto,Vanessa,Vera,Vijay,Wei,Wendy,Xagra,Yasuko,Yentl,Yoshi,Zach,Zelda
(space)
Pre,Fro,Des,El,Re,Dwo,Min,Cis,Xan,Bub,Chu,Ki,Zen,Quet,Asa,Non,Bre,Nim,Fez,Zek,Lop,Ek,Op,Um
boosi,a,ni,we,foo,jumi,resa,gero,ki,lu,tumi,jipy,free,nu,too,velu,pone,kro,hipi,re,fana
,ter,gen,berg,ly,man,son,chek,ski,pul,bur,vitch,zen,mar,tex,lit,kony,ther,plop,ver,ster
,,oni,gon,obu,etsi,akoi,ader,ettu,nivu,flar,jip,ikle,ings,oden,oopsi,ynds,akol,li,len}
		Parts = DEFAULTPARTS
	End Sub

	%REM
		Property PossibleCombinations (read only)
		Description: Figure out the number of possible unique names we can get from this system.
	%END REM
	Public Property Get PossibleCombinations As Double
		If Not z_inited Then UseDefaults
		If z_combinations = 0. Then
			z_combinations = 1
			ForAll thing In z_nameparts
				Dim tmp
				tmp = ArrayUnique(thing, 1)
				z_combinations = z_combinations * (UBound(tmp)+1)
			End ForAll
		End If
		PossibleCombinations = z_combinations
	End Property

	%REM
		Function randomelement
		Description: return a randomly selected element of a string array.
	%END REM
	Private Function randomelement(x) As String
		If UBound(x) = 0 Then randomelement = x(0) Else randomelement = x(Fix(Rnd * (1+UBound(x))))
	End Function

	%REM
		Function GetName
		Description: make up a name that's random but not necessarily unique.
	%END REM
	Function GetName As String
		On Error GoTo oops
		If Not z_inited Then UseDefaults
		Dim i%
		GetName = randomelement(z_nameparts(0))
		For i = 1 To UBound(z_nameparts)
			GetName = GetName & z_spacer & Randomelement(z_nameparts(i))
		Next
		Exit Function
	oops:
		Error Err, Error & { //} & TypeName(Me) & {.} & GetThreadInfo(1) & {:} & Erl & (Erl-Getthreadinfo(0))
	End Function

	%REM
		Function GetUnique
		Description: Retrieve a unique name
		Arguments:
			tries: the number of randomly generated names to try before giving up
				finding a unique one.
		Returns: the name, or "" if it wasn't possible to find one that's unique
	%END REM
	Function GetUnique(ByVal tries As Integer) As String
		Dim tmp$
		Do
			tmp = GetName
			If Not IsElement(z_used(tmp)) Then
				GetUnique = tmp
				z_used(tmp) = 1
				z_count = z_count + 1
				Exit Function ' stop at the first unused name instead of using up more of them
			End If
			tries = tries - 1
		Loop While tries > 0
	End Function

	Function GetUniqueName As String
		On Error GoTo oops
		Dim i%, k%, tmp$
		GetUniqueName = GetUnique(20)
		If GetUniqueName <> "" Then Exit Function
		For i = 1 To 10000
			tmp = GetName & Hex(i) ' add a number to make it more likely to be unique
			If Not IsElement(z_used(tmp)) Then
				z_used(tmp) = 1
				z_count = z_count + 1
				GetUniqueName = tmp
				Exit Function
			End If
		Next
		Error ERR_PACKED, MSG_PACKED
		Exit Function
	oops:
		Error Err, Error & { //} & TypeName(Me) & {.} & GetThreadInfo(1) & {:} & Erl & (Erl-Getthreadinfo(0))
	End Function

	%REM
		Function MaskName
		Description: Make up a random name to replace a name we're given, and remember
			in case we're asked to mask the same name again.
	%END REM
	Function MaskName(ByVal orig$) As String
		On Error GoTo oops
		If IsElement(Substitutions(orig)) Then
			MaskName = Substitutions(orig)
		Else
			If Repeatable Then
				Randomize fletcher32(orig)
			End If
			MaskName = getUniqueName
			Substitutions(orig) = MaskName
		End If
		Exit Function
	oops:
		Error Err, Error & { //} & TypeName(Me) & {.} & GetThreadInfo(1) & {:} & Erl & (Erl-Getthreadinfo(0))
	End Function

	%REM
		Function Fletcher32
		Description: Compute a position-dependent checksum or hash code of a string of
			unicode text. Fletcher is a common checksum algorithm, adapted here into
			LotusScript and treating each character as a word.
	%END REM
	Private Function Fletcher32(ByVal strdat$) As Long
		Dim ind%, limit As Long, sum1 As Long, sum2 As Long, pos As Long, tlen%
		limit = Len(strdat)
		sum1 = &hffff
		sum2 = &hffff
		While limit > 0
			' process the text in blocks of at most 359 characters
			If limit > 359 Then tlen = 359 Else tlen = limit
			limit = limit - tlen
			For ind = 1 To tlen
				sum1 = sum1 + Uni(Mid$(strdat, pos+ind, 1))
				sum2 = sum2 + sum1
			Next
			pos = pos + tlen ' advance past the block just summed so the next block reads new characters
			sum1 = (sum1 And &hffff&) + (sum1 \ &h10000)
			sum2 = (sum2 And &hffff&) + (sum2 \ &h10000)
		Wend
		Fletcher32 = CLng("&h" & Right(Hex(sum2), 4) & String(4, {0})) Or sum1
	End Function
End Class
Sample usage
Here’s the code to generate a list of twenty names such as in the above image:
Use "RandomNameGenerator"

Const NEWLINE = {
}

Sub Initialize
	Dim ans$, rg As New RandomNameGenerator, i%
	For i = 1 To 20
		ans = ans & NEWLINE & rg.getUniqueName()
	Next
	MsgBox Mid$(ans, 2), 0, "Possible: " & rg.PossibleCombinations
End Sub
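Along the same lines, here’s a minimal sketch of the anonymizing side, using MaskName with the default parts. The input names are made up, and a real job would read them from document fields rather than a hard-coded string.

```lotusscript
Option Declare
Use "RandomNameGenerator"

Sub Initialize
	Dim masker As New RandomNameGenerator
	Dim realNames As Variant, report As String
	' Made-up input; "George Smith" appears twice on purpose.
	realNames = Split("George Smith;Ann Lee;George Smith", ";")
	ForAll realName In realNames
		' the same input always maps to the same replacement within this session
		report = report & realName & " -> " & masker.MaskName(CStr(realName)) & Chr(10)
	End ForAll
	MsgBox report, 0, "Masked names"
End Sub
```

Both George Smith lines should show the same replacement. If you need the mapping to stay the same across separate runs, set masker.Repeatable = True before the first call, keeping in mind the security tradeoff described in the note under the specification.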