How to remove specific characters using excel VBA script

The following VBA script gets rid of unwanted characters, but unfortunately only NUMBERS.

could you help me? You must also discard the letters, as in the example table below (bold).

The range can be from 0 to 15000+ cells

.................................................. ...

a new a york a times a

b new b york b times b

c new c york c watertown c ny c

6 ave 6 new 6 york 6 city 6

>

.................................................. ....

VBA script:

Sub Remove() Application.ScreenUpdating = False Dim R As RegExp, C As Range For Each C In Range("A1:A" & Cells(Rows.Count, 1).End(xlUp).Row) If R Is Nothing Then Set R = New RegExp R.Global = True R.Pattern = "\D" C.Offset(0, 1) = R.Replace(C, "") R.Pattern = "\d" C = R.Replace(C, "") End If Set R = Nothing Next C Application.ScreenUpdating = True End Sub 

EDIT1

 Sub Remove() Call BackMeUp Dim cell As Range Dim RE As Object Dim Whitecell As Range Dim strFind As String, strReplace As String Dim lLoop As Long Dim Loop1 As Long Application.ScreenUpdating = False Application.Calculation = xlCalculationManual Application.EnableEvents = False Range("A3:L3").Select Selection.Delete Shift:=xlUp '--------------------------------------------------Remove JUNK Range("A3:A" & Cells(Rows.Count, 1).End(xlUp).Row).Select On Error Resume Next For lLoop = 1 To 100 strFind = Choose(lLoop, "~?»", "~®", "~.", "~!", "~ï", "~-", "~§", "~$", "~%", "~&", "~/", "~\", "~,", "~(", "~)", "~=", "~www", "~WWW", "~.com", "~.net", "~.org", "~{", "~}", "~[", "~]", "", "~¿", "", "~:", "~;", "~_", "", " ~@ ", "~#", "~'", "~|", "~€", "", "", "", "", "", "", "~+", "~<", "~>", "~nbsp", "", "", "", "", "~–", "", "~?") strReplace = Choose(lLoop, " ") Selection.Replace What:=strFind, Replacement:=strReplace, LookAt:=xlPart, _ SearchOrder:=xlByRows, MatchCase:=False, SearchFormat:=False, _ ReplaceFormat:=False Next lLoop '--------------------------------------------------Remove Numbers Range("A3:A" & Cells(Rows.Count, 1).End(xlUp).Row).Select On Error Resume Next For Loop1 = 1 To 40 strFind = Choose(lLoop, "~1", "~2", "~3", "~4", "~5", "~6", "~7", "~8", "~9", "~0") strReplace = Choose(Loop1, " ") Selection.Replace What:=strFind, Replacement:=strReplace, LookAt:=xlPart, _ SearchOrder:=xlByRows, MatchCase:=False, SearchFormat:=False, _ ReplaceFormat:=False Next Loop1 '--------------------------------------------------Remove Single Letters Set RE = CreateObject("vbscript.regexp") RE.Global = True RE.MultiLine = True RE.Pattern = "^[az]\b | \b[az]\b" For Each cell In Range("A3:A" & Cells(Rows.Count, 1).End(xlUp).Row) cell.Value = RE.Replace(cell.Value, "") Next '--------------------------------------------------Remove WHITE SPACES For Each Whitecell In Range("A3:A" & Cells(Rows.Count, 1).End(xlUp).Row) Whitecell = WorksheetFunction.Trim(Whitecell) Next Whitecell '--------------------------------------------------Remove DUPES ActiveSheet.Range("A3:A" & Cells(Rows.Count, 1).End(xlUp).Row).RemoveDuplicates Columns:=1, Header:=xlYes ActiveWorkbook.Worksheets(ActiveSheet.Name).Sort.SortFields.Clear '--------------------------------------------------Copy to B - REPLACE ALL WHITE IN B Range("A3:A" & Cells(Rows.Count, 1).End(xlUp).Row).Select Selection.Copy Range("B3:B" & Cells(Rows.Count, 1).End(xlUp).Row).Select ActiveSheet.Paste Selection.Replace What:=" ", Replacement:="", LookAt:=xlPart, _ SearchOrder:=xlByRows, MatchCase:=False, SearchFormat:=False, _ ReplaceFormat:=False Columns("A:L").EntireColumn.AutoFit '--------------------------------------------------END Application.ScreenUpdating = True Application.Calculation = xlCalculationAutomatic Application.EnableEvents = True Range("a1").Select End Sub 
+4
source share
3 answers

EDIT (deleted original answer as it is not applicable after receiving more information about what you wanted, but leaving advice)

  • You create / destroy a RE object every cell that is expensive / unnessessary
    • If other users will use this function, create an object inside the code instead of adding links
    • There is no need to set anything at the end for the regular expression object - variables are freed from memory at the end of the function automatically
    • Improving variable name names and using the right indentation can improve readability and simplify editing.
    • Add a multi-line parameter if your cells have line breaks inside them.
    • You might want to use a variant array if you work with a large number of cells.

UDPATE 2

Based on the comments below, here's how to get only the events of two or more lowercase characters and spaces between them. Instead of replacing what you DO NOT want , I personally think that a good way is to extract what you need DO . I shared this feature a bit on this site as it is really useful. Here is an example of how to call it in the contents of column A and put the results in column B.

 Sub test() ' Show how to run this on cells in A and transpose result in B Dim varray As Variant Dim i As Long Application.ScreenUpdating = False varray = Range("A1:A15000").Value For i = 1 To UBound(varray, 1) varray(i, 1) = RegexExtract(varray(i, 1), "([az]{2,})", " ") Next Range("B1").Resize(UBound(varray, 1)).Value = _ Application.WorksheetFunction.Transpose(varray) Application.ScreenUpdating = True End Sub 

And make sure this is in the module:

 Function RegexExtract(ByVal text As String, _ ByVal extract_what As String, _ Optional seperator As String = "") As String Dim i As Long Dim j As Long Dim result As String Dim allMatches As Object Dim RE As Object Set RE = CreateObject("vbscript.regexp") RE.Pattern = extract_what RE.Global = True Set allMatches = RE.Execute(text) For i = 0 To allMatches.Count - 1 For j = 0 To allMatches.Item(i).submatches.Count - 1 result = result & seperator & allMatches.Item(i).submatches.Item(j) Next Next If Len(result) <> 0 Then result = Right$(result, Len(result) - Len(seperator)) End If RegexExtract = result End Function 
+4
source

Your "R.Pattern =" \ d "is the only line you need to change." \ d "is a regular expression describing a" digit ".

I would suggest changing "\ d" to "^ [a-z0-9] | [a-z0-9] \ b" as a starting point.

+3
source

I rewrote your code below to

  • RegExp is created only once. Your current code creates a new object and then destroys it for each cell being tested, since it is inside your loop.
  • The code below uses a variant array to minimize process time when manipulating each cell value. The VbNullString slightly faster than "".
  • you use simpler \ w in regex to match any a-z0-9
  • late binding to a RegExp object avoids the need for a third-party to set the link, setting ignore case to true, makes your case-insensitive insenstive

      Sub Remove() Dim R As Object Dim C As Range Dim lngrow As Long Dim rng1 As Range Dim X Set R = CreateObject("vbscript.regexp") With R .Global = True .Pattern = "^\w\s|\b\w\b" .ignoreCase = True End With Application.ScreenUpdating = False Set rng1 = Range("A1:A" & Cells(Rows.Count, 1).End(xlUp).Row) X = rng1.Value2 For lngrow = 1 To UBound(X, 1) X(lngrow, 1) = R.Replace(X(lngrow, 1), vbNullString) Next lngrow rng1.Value2 = X Application.ScreenUpdating = True End Sub 
+3
source

Source: https://habr.com/ru/post/1382060/


All Articles