Splitting a full name into fields?

Is there a clever regex or method in C# that can do this for me?

Someone enters a string in the "Full Name" field, and I need to break it down into: Title, First name, Middle name, Last name, Suffix.

But the user might type just "John Smith", in which case John needs to go into First Name and Smith into Last Name. A person could also type Mr. John Smith (I have a list of known titles and suffixes), so if the first word is a title, it goes into the Title field.

A great example would be:

Mr. John Campbell Smith Jr.

But they could have:

Mr. and Mrs. John and Mary Smith

So, the title will be Mr. and Mrs., the first name will be John and Mary, and the last name will be Smith (they can use either "AND" or "&" as a joiner)

I suspect this is too complicated for a regular expression, but I was hoping someone might have an idea.

1 answer

Well, here is a program that I think will do the job for you. You will probably need to adjust it, because I made some assumptions based on your question, but it should get you started in the right direction.

Some of these assumptions are as follows:

  • There is no punctuation in the name passed to the function (for example, "Jr." with a period).
  • A first name and a last name are required; the title(s), middle name and suffix are optional.
  • The only joiners are "and" and "&", as stated in the question.
  • The name is in the format {titles} {first names} {middle name} {last name} {suffix}.
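The first assumption clashes with the question's own example ("Mr. John Campbell Smith Jr."), so the input probably needs cleaning before it is parsed. Here is a minimal sketch of such a step. The class `NameNormalizer` and the method `StripPunctuation` are hypothetical names, not part of the answer, and the sketch assumes that periods and commas are the only punctuation worth removing.

```csharp
using System;
using System.Linq;

static class NameNormalizer
{
    // Hypothetical helper (not from the original answer): trims periods and
    // commas from the ends of each word and collapses repeated spaces.
    public static string StripPunctuation(string fullName)
    {
        var words = fullName
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(w => w.Trim('.', ','))
            .Where(w => w.Length > 0);

        return string.Join(" ", words);
    }
}
```

With this in place, "Mr. John Campbell Smith Jr." becomes "Mr John Campbell Smith Jr", which matches the format the parser expects. Punctuation inside a word (for example an apostrophe or a hyphen in a surname) is left alone.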

I threw a lot of different names at it, but there are certainly more possibilities. I spent no more than 30 minutes on it, so it is not fully tested.

    using System;
    using System.Collections.Generic;

    class Program
    {
        // Known titles and suffixes, written without punctuation.
        static List<string> _titles = new List<string> { "Mr", "Mrs", "Miss" };
        static List<string> _suffixes = new List<string> { "Jr", "Sr" };

        static void Main(string[] args)
        {
            var nameCombinations = new List<string>
            {
                "Mr and Mrs John and Mary Sue Smith Jr",
                "Mr and Mrs John and Mary Smith Jr",
                "Mr and Mrs John and Mary Sue Smith",
                "Mr and Mrs John and Mary Smith",
                "Mr and Mrs John Smith Jr",
                "Mr and Mrs John Smith",
                "John Smith",
                "John and Mary Smith",
                "John and Mary Smith Jr",
                "Mr John Campbell Smith Jr",
                "Mr John Smith",
                "Mr John Smith Jr",
            };

            foreach (var name in nameCombinations)
            {
                Console.WriteLine(name);
                var breakdown = InterperetName(name);
                Console.WriteLine(" Title(s): {0}", string.Join(", ", breakdown.Item1));
                Console.WriteLine(" First Name(s): {0}", string.Join(", ", breakdown.Item2));
                Console.WriteLine(" Middle Name: {0}", breakdown.Item3);
                Console.WriteLine(" Last Name: {0}", breakdown.Item4);
                Console.WriteLine(" Suffix: {0}", breakdown.Item5);
                Console.WriteLine();
            }

            Console.ReadKey();
        }

        // Returns (titles, first names, middle name, last name, suffix).
        static Tuple<List<string>, List<string>, string, string, string> InterperetName(string name)
        {
            var segments = name.Split(" ".ToCharArray(), StringSplitOptions.RemoveEmptyEntries);

            List<string> titles = new List<string>(), firstNames = new List<string>();
            string middleName = null, lastName = null, suffix = null;

            // Which part of the name is expected next:
            // 0 = titles, 1 = first names, 2 = middle name, 3 = last name, 4 = suffix.
            int segment = 0;

            for (int i = 0; i < segments.Length; i++)
            {
                var s = segments[i];

                switch (segment)
                {
                    case 0:
                        if (_titles.Contains(s))
                        {
                            titles.Add(s);
                            // A joiner means another title follows; skip the joiner.
                            // The bounds check avoids reading past the last word.
                            if (i + 1 < segments.Length && segments[i + 1].IsJoiner())
                            {
                                i++;
                                continue;
                            }
                            segment++;
                        }
                        else
                        {
                            // No title present: treat this word as a first name.
                            segment++;
                            goto case 1;
                        }
                        break;

                    case 1:
                        firstNames.Add(s);
                        // A joiner means another first name follows; skip the joiner.
                        if (i + 1 < segments.Length && segments[i + 1].IsJoiner())
                        {
                            i++;
                            continue;
                        }
                        segment++;
                        break;

                    case 2:
                        if ((i + 1) == segments.Length)
                        {
                            // Last word and no suffix: this is the last name.
                            segment++;
                            goto case 3;
                        }
                        else if ((i + 2) == segments.Length && _suffixes.Contains(segments[i + 1]))
                        {
                            // Only a suffix follows: this is the last name.
                            segment++;
                            goto case 3;
                        }
                        middleName = s;
                        segment++;
                        break;

                    case 3:
                        lastName = s;
                        segment++;
                        break;

                    case 4:
                        if (_suffixes.Contains(s))
                        {
                            suffix = s;
                        }
                        segment++;
                        break;
                }
            }

            return new Tuple<List<string>, List<string>, string, string, string>(
                titles, firstNames, middleName, lastName, suffix);
        }
    }

    internal static class Extensions
    {
        // True for the two joiners from the question: "and" (any case) and "&".
        internal static bool IsJoiner(this string s)
        {
            var val = s.ToLower().Trim();
            return val == "and" || val == "&";
        }
    }
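One thing to watch in the program above: `IsJoiner` lowercases its input, but `_titles.Contains` and `_suffixes.Contains` are case-sensitive, so "mr" or "JR" would not be recognized. If that matters for your data, one possible adjustment (an assumption on my part, not something the question asks for) is to keep the known words in sets with a case-insensitive comparer. The class name `KnownWords` below is hypothetical.

```csharp
using System;
using System.Collections.Generic;

static class KnownWords
{
    // Same entries as _titles and _suffixes in the program above;
    // only the comparer differs, so lookups ignore case.
    public static readonly HashSet<string> Titles =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase) { "Mr", "Mrs", "Miss" };

    public static readonly HashSet<string> Suffixes =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase) { "Jr", "Sr" };
}
```

`KnownWords.Titles.Contains("mr")` and `KnownWords.Suffixes.Contains("JR")` then both return true, and the parsing logic itself would not need to change beyond swapping the lists for these sets.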

Source: https://habr.com/ru/post/1440172/

