Project Zebra: Every hundred years a bird comes and sharpens its beak ►
◄ My review: Samsung Galaxy Tab A8 (2019) SM-T290
I've been doing a lot of manual address parsing in Excel recently, which is a usage case for regex. Since Excel doesn't have a regular expression replace function this calls for a custom function, and it's preferable to store frequently used custom functions in a way that's available to all of your documents to avoid having to add them each time.
The answer is not to save a personal macro workbook, which is what Microsoft suggest but requires referencing in each new workbook to use your module functions,
Instead it's more practical to save a .xlam add-in workbook saved in the default location (typically C:\Users\your.name\AppData\Roaming\Microsoft\Addins) which makes the functions available to other open workbooks and allows Excel's function name auto-completion to work.
Just remember that either way the macros won't be available to recipients if you send them a document, so it's best for functions that are needed temporarily to parse data before being converted to values. My swiss army knife in a lot of situations is:
Function RegExpSub(text, find, replace)
Set RE = CreateObject("vbscript.regexp")
RE.pattern = find
RE.Global = True
RegExpSub = RE.replace(text, replace)
End Function
This is a very well-established way of getting regex support in Excel, which I picked up from tmetha.com (unsure if I'm misremembering the domain name or it's gone now) and wrote about a bit in https://virtualdebris.co.uk/blog/0358B3CC/fraud-and-audit-type-analysis-techniques-using-microsoft-excel but for a nicer article with lots of explanation and additional functions, https://www.myonlinetraininghub.com/regex-regular-expressions-excel looks decently written. If that disappears I'll re-post an adaptation here. Everyone should have regex available to them.
To give a practical example, the four regular expressions below will take a single string address in the formats;
Flat 1, 100, Name Road, Stourbridge, West Midlands, DY1 1HF
101, Name Road, Stourbridge, West Midlands, DY1 1HF
and split it down into Address, Town, County and Postcode fields. The reason for wanting/needing regex is because it's assumed that every address will have the town, county and postcode parts but the stem part of the address (everything before Stourbridge) can have a variable number of comma delimiters, which makes text-to-columns impractical.
=Substitute(RegExpSub(cell_reference,"(.*), (.*?), (.*), (.*)","$1"),", ",CHAR(10))
=RegExpSub(cell_reference,"(.*), (.*?), (.*), (.*)","$2")
=RegExpSub(cell_reference,"(.*), (.*?), (.*)","$2")
=RegExpSub(cell_reference,"(.*), (.*?)","$2")
The substitute bit turns commas in this part of the address into line feed characters for printing onto correspondence.
2021 edit: I'm just going to leave the information below here as well as it might be useful in future. The first uses the function above in Excel, the second similarly uses an equivalent custom function for SQL Server.
Both produce a matching token in the format FFF_SSS___NNN_POSTCODE.
=UPPER(LEFT(RegExpSub(B2,"(.*?) (.*?) (.*)","$2"),3)&"_"&LEFT(RegExpSub(B2,"(.*?) (.*) (.*?)","$3"),3))&"___"&IF(COUNT(FIND({0,1,2,3,4,5,6,7,8,9},E2))>0,RegExpSub(E2,"(.*?)(\d+\w*) (.*)","$2"),"")&"_"&UPPER(RegExpSub(H2,"\W",""))
SELECT DISTINCT UPPER(LEFT(ltrim(forenames),3)+'_'+LEFT(ltrim(surname),3)+'___'+dbo.RegExReplace(isnull(address_2,'')+' '+isnull(address_1,''),'(.*?)(\d+\w*) (.*)','$2')+'_'+dbo.RegExReplace(isnull(postcode,''),'\W',''))
💬 Comments are off, but you can use the mail form to contact or see the about page for social media links.