Using RegEx and Replace to Update Address Addresses Using USPS Abbreviations in MS-Access

I am trying to write a VBA function in Access that replaces the words in the address field with the standard US Postal Abbreviations . I understand that this will never be perfect, but I want to at least make simple reductions (without having to purchase address formatting software), for example

input output ------- ------------- North -> N Street -> ST Drive -> DR Lane -> LN 

I thought about using a simple table to store a row and a replacement row, and then scroll through that table / recordset to do a simple search and replace using the Replace() function, for example. using immediate window :

  ?Replace("123 North 3rd St", "North", "N", compare:=vbTextCompare) 123 N 3rd St 

However, this method can potentially cause errors, for example.

  ?Replace("123 Northampton St", "North", "N", compare:=vbTextCompare) 123 Nampton St 

My initial strategy was to create a replacement table with regex patterns and replacement strings, and then scroll through this table for a more precise search and replacement.

 pattern abbrev ------------------- ------------ {pattern for North} N {pattern for Street} ST 

I realized that RegEx could be crowded here, especially since I am going to cycle through address fields many times in the database, but could not think of an easier way to use the Replace() function ( Update : see the Answers from @ mwolfe02 and @ Cylian and hybrid solution).

In the above example, I want to search for the words "North" and "Street" if they either exist as a word in a line (thus separated by two spaces), or at the end of a line or the beginning of a line. This covers most situations that need to be shortened. eg

 address formatted ---------------------- -------------------------- 123 North 3rd St -> 123 N 3RD ST 123 ABC Street North -> 123 ABC ST N North 3rd Street -> N 3RD ST 123 North Northampton St -> 123 N NORTHAMPTON ST 

As in these examples, I want to replace all instances of the template in a string. I also convert everything to uppercase (I can use UCase() for the final result without any problems).

Does anyone know of an existing module that does such things? Can someone help with pattern matching like in the examples above? For additional credit, I am also interested in creating a rule in the table for formatting mailboxes, for example.

 address formatted ---------------------- -------------------------- PO Box 345 -> PO BOX 345 PO Box 345 -> PO BOX 345 Post Office Box 345 -> PO BOX 345 PO. Box 345 -> PO BOX 345 PO Box 345 -> PO BOX 345 

This post provides the following pattern for recognizing some PO boxes "^ \ s * P.?\s? O.?\sB [Oo] [Xx]". (though not the third example above). Again, I am not so comfortable with matching and replacement sets to figure out how to write this more accurate replacement function. Is there a RegEx / Access specialist that can help?

+6
source share
4 answers

I created a very simple ref_USPS_abbrev lookup table from the list of USPS abbreviations on the Internet. Here are the entries corresponding to the first given example:

 WORD ABBREV ------------ ------------- NORTH N STREET ST 

Then, by including the answers in my original post, I created two helper functions.

From @Cylian:

  ' ----------------------------------------------------------------------' ' Formats string containing PO Box to USPS Approved PO BOX format ' ' ----------------------------------------------------------------------' ' Requires Microsoft VBScript Regular Expressions 5.5 Public Function FormatPO(inputString As String) As String Static rePO As Object If rePO Is Nothing Then Set rePO = CreateObject("vbscript.regexp") With rePO .Pattern = "\bP(?:[. ]+|ost +)?O(?:ff\.?(?:ice))" & _ "?[. ]+B(?:ox|\.) +(\d+)\b" .Global = True .IgnoreCase = True End With End If With rePO If .Test(inputString) Then FormatPO = .Replace(inputString, "PO BOX $1") Else FormatPO = inputString End If End With End Function 

And using @ mwolfe02 is a great idea:

  ' ----------------------------------------------------------------------' ' Replaces whole word only with an abbreviation in address string ' ' ----------------------------------------------------------------------' Public Function AddressReplace(AddressLine As String, _ FullName As String, _ Abbrev As String) 'Enclose address line in an opening and closing space, so that you 'can require an opening and closing space on each word you are trying 'to replace. Finish up with a trim to get rid of those temporary spaces. AddressReplace = Trim(Replace(" " & AddressLine & " ", _ " " & FullName & " ", _ " " & Abbrev & " ")) End Function 

Then, by including these helper functions, I wrote this function:

 ' ----------------------------------------------------------------------' ' Format address using abbreviations stored in table ref_USPS_abbrev ' ' ----------------------------------------------------------------------' ' Requires Microsoft DAO 3.6 Object Library ' Table ref_USPS_abbrev has two fields: WORD (containing the word to match) ' and ABBREV containing the desired abbreviated substitution. ' United States Postal Services abbreviations are available at: ' https://www.usps.com/ship/official-abbreviations.htm Public Function SubstituteUSPS(address As String) As String Static dba As DAO.Database Static rst_abbrev As DAO.Recordset If IsNull(address) Then Exit Function 'Initialize the objects If dba Is Nothing Then Set dba = CurrentDb End If 'Create the rst_abbrev recordset once from ref_USPS_abbrev. If additional 'entries are added to the source ref_USPS_abbrev table after the recordset 'is created, since it is an dbOpenTable (by default), the recordset will 'be updated dynamically. If you use dbOpenSnapshot it will not update 'dynamically. If rst_abbrev Is Nothing Then Set rst_abbrev = dba.OpenRecordset("ref_USPS_abbrev", _ Type:=dbOpenTable) End If 'Since rst_abbrev is a static object, in the event the function is called 'in succession (eg while looping through a recordset to update values), 'move to the first entry in the recordset each time the function is 'called. rst_abbrev.MoveFirst 'Only call the FormatPO helper function if the address has the 'string "ox" in it. If InStr(address, "ox") > 0 Then address = FormatPO(address) End If 'Loop through the recordset containing the abbreviations 'and use the AddressReplace helper function to substitute 'abbreviations for whole words only. Do Until rst_abbrev.EOF address = AddressReplace(address, rst_abbrev![WORD], _ rst_abbrev![ABBREV]) rst_abbrev.MoveNext Loop 'Convert the address to upper case and trim white spaces and return result 'You can also add code here to trim out punctuation in the address, too. SubstituteUSPS = Trim(UCase(address)) End Function 

To create a table ref_USPS_abbrev for testing:

 Sub CreateUSPSTable() Dim dbs As Database Set dbs = CurrentDb With dbs .Execute "CREATE TABLE ref_USPS_abbrev " _ & "(WORD CHAR, ABBREV CHAR);" .Execute " INSERT INTO ref_USPS_abbrev " _ & "(WORD, ABBREV) VALUES " _ & "('NORTH', 'N');" .Execute " INSERT INTO ref_USPS_abbrev " _ & "(WORD, ABBREV) VALUES " _ & "('STREET', 'ST');" .Close End With End Sub 

Finally, testing this function from immediate window :

  CreateUSPSTable ?SubstituteUSPS("Post Office Box 345 123 North Northampton Street") PO BOX 345 123 N NORTHAMPTON ST 

I am not a professional programmer, so I would welcome suggestions for cleaning my code even further, but so far this works great. Thanks to all.

Stack Overflow Once Again FTW!

0
source

Try this feature

 Public Function FormatPO(inputString$) 'This example uses **Microsoft VBScript Regular Expressions 5.5** Dim re As New RegExp, result$ With re .Pattern = "\bP(?:[. ]+|ost +)?O(?:ff\.?(?:ice))?[. ]+B(?:ox|\.) +(\d+)\b" .Global = True .IgnoreCase = True If .test(inputString) Then FormatPO = .Replace(inputString, "PO BOX $1") Else MsgBox "Data doesn't matched!" End If End With 

and can be called as (from immediate window )

 ?FormatPO("PO Box 563") 

gives the result

 PO BOX 563 

An appropriate template for street names with addresses requires more time to create. But you can visit here and create your own regex online.

Hope this helps.

+5
source

@Cylian has a good answer for the second part of your question. I will try to turn to the first. If your only problem is that you are replacing whole words in an address, then the following function will do what you need:

 Function AddressReplace(AddressLine As String, _ FullName As String, _ Abbrev As String) AddressReplace = Trim(Replace(" " & AddressLine & " ", _ " " & FullName & " ", _ " " & Abbrev & " ")) End Function 

It wraps the address bar in an open and close space, so you may need an open and close space for each word you are trying to replace. It finishes to get rid of these temporary spaces.

The following procedure checks the code and produces the output you are looking for:

 Sub TestAddressReplace() Debug.Print AddressReplace("123 North 3rd St", "North", "N") Debug.Print AddressReplace("123 Northampton St", "North", "N") End Sub 
+2
source

USPS has a free search API to verify and standardize addresses. You will need to register for the service (quickly), and then use your id / password in the API to refuse your site. Everything works for you and has a sample code. The Canadian postal service has the same thing (not sure if it's free, though).

https://www.usps.com/business/web-tools-apis/welcome.htm

B. Sevier

+2
source

Source: https://habr.com/ru/post/916892/


All Articles