I have a file with bad data (a few random SUB control characters that stand on their own; they are not part of a grapheme), and I tried to delete them with this regular expression search and replace:
Text to Find: \x1a
Replace with: (empty)
This removes the SUB characters, but it also corrupts my accented characters (é and í).
Is there a regular expression that removes the SUB control character (code point) only when it stands on its own (i.e. when it is not part of a grapheme)?
SAMPLE DATA (replace each “␚” with the SUB control character):
A,André,Fernandez
A,Daniel,O␚Shea
A,Ibhlín,Flanders
A,Donny,O␚'Donnell
A,Spencer,O'Maley
SAMPLE DATA Output if I use the current regular expression:
A,Andr ,Fernandez
A,Daniel,OShea
A,Ibhl n,Flanders
A,Donny,O'Donnell
A,Spencer,O'Maley
DESIRED DATA OUTPUT
A,André,Fernandez
A,Daniel,OShea
A,Ibhlín,Flanders
A,Donny,O'Donnell
A,Spencer,O'Maley
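
To make the desired behavior concrete, here is a minimal Python sketch of the transformation I am after. It is only an illustration under assumptions, not a fix for the editor's regular expression: it assumes the file is UTF-8 or a single-byte encoding such as Latin-1, where the byte 0x1A can only ever mean SUB. In a UTF-16 file it could not be used as-is. The helper name strip_sub is hypothetical.

```python
# Sketch: remove SUB (0x1A) at the byte level so that no other character,
# accented or not, is ever decoded or re-encoded along the way.
# Assumption: UTF-8 or a single-byte encoding, where 0x1A is always SUB.

SUB = b"\x1a"


def strip_sub(data: bytes) -> bytes:
    """Return data with every SUB byte removed and all other bytes unchanged."""
    return data.replace(SUB, b"")


# The sample data from above, with the SUB characters written as \x1a.
sample = (
    "A,André,Fernandez\n"
    "A,Daniel,O\x1aShea\n"
    "A,Ibhlín,Flanders\n"
    "A,Donny,O\x1a'Donnell\n"
    "A,Spencer,O'Maley\n"
).encode("utf-8")

cleaned = strip_sub(sample).decode("utf-8")
print(cleaned)
```

For a real file, the same function would be applied to the bytes read with open(path, "rb") and the result written back in binary mode, so the accented characters are never touched.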