Regular expression: parse name / number

C# / .NET 2.0

I need to parse a string containing a street name and a house number into two separate values.

in: "Streetname 1a"         out:  "streetname"  "1a"
    "Street name 1a"              "street name" "1a"
    "Street name 1 a"             "street name" "1 a"

My first approach was to split the string at the first space I found, but this does not work for the second case.

result[0] = trimmedInput.Substring(0, splitPosition).Trim();
result[1] = trimmedInput.Substring(splitPosition + 1).Trim();

What is the best way to do this? Can regular expressions be used?

Thanks

+3
5 answers

^(.+)\s(\S+)$ should do the trick

EDIT: this will work as long as the house number cannot contain spaces. Otherwise, the problem cannot be solved programmatically, since the program will never understand the semantics of the tokens.
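As a minimal sketch of how that pattern behaves on the sample inputs (the class and method names are illustrative, not part of the answer):

```csharp
using System.Text.RegularExpressions;

static class SimpleSplitter
{
    // Pattern from the answer: greedy street part, one whitespace character,
    // then a house number that contains no whitespace.
    static readonly Regex Pattern = new Regex(@"^(.+)\s(\S+)$");

    // Returns { street, number }, or null when the input does not match.
    public static string[] Split(string input)
    {
        Match m = Pattern.Match(input.Trim());
        if (!m.Success) return null;
        return new string[] { m.Groups[1].Value, m.Groups[2].Value };
    }
}
```

SimpleSplitter.Split("Street name 1a") yields "Street name" and "1a", but SimpleSplitter.Split("Street name 1 a") yields "Street name 1" and "a", which is the limitation the EDIT describes.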

. , , , .

^(.+)\s(\d+(\s*[^\d\s]+)*)$ , , - .
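A hedged sketch of the digit-anchored pattern in use; it assumes the house number always starts with a digit, and the class and method names are made up for this example:

```csharp
using System.Text.RegularExpressions;

static class DigitAnchoredSplitter
{
    // Group 1: street name. Group 2: digits, optionally followed by
    // non-digit suffixes that may be separated by whitespace ("1a", "1 a").
    static readonly Regex Pattern = new Regex(@"^(.+)\s(\d+(\s*[^\d\s]+)*)$");

    // Returns { street, number }, or null when the input does not match.
    public static string[] Split(string input)
    {
        Match m = Pattern.Match(input.Trim());
        if (!m.Success) return null;
        return new string[] { m.Groups[1].Value, m.Groups[2].Value };
    }
}
```

With this pattern, "Street name 1 a" splits into "Street name" and "1 a", because the number group is forced to begin at a digit.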

+7

Dyppl, . ( , / ), ( !) . SmartyStreets, . #, , API LiveAddress:

https://github.com/smartystreets/LiveAddressSamples/blob/master/c-sharp/street-address.cs

( , "" ):

[
    {
        "input_index": 0,
        "candidate_index": 0,
        "delivery_line_1": "3214 N University Ave",
        "last_line": "Provo UT 84604-4405",
        "delivery_point_barcode": "846044405140",
        "components": {
            "primary_number": "3214",
            "street_predirection": "N",
            "street_name": "University",
            "street_suffix": "Ave",
            "city_name": "Provo",
            "state_abbreviation": "UT",
            "zipcode": "84604",
            "plus4_code": "4405",
            "delivery_point": "14",
            "delivery_point_check_digit": "0"
        },
        "metadata": {
            "record_type": "S",
            "county_fips": "49049",
            "county_name": "Utah",
            "carrier_route": "C016",
            "congressional_district": "03",
            "latitude": 40.27586,
            "longitude": -111.6576,
            "precision": "Zip9"
        },
        "analysis": {
            "dpv_match_code": "Y",
            "dpv_footnotes": "AABBR1",
            "dpv_cmra": "Y",
            "dpv_vacant": "N",
            "ews_match": false
        }
    }
]

. , :

http://wiki.smartystreets.com/liveaddress_api_users_guide#json-responses

EDIT: / ( ).

+2

, , , . - , , :

  • .
  • .
  • , .
  • - , .

, .

, , , , - .

:

Regex reggie = new Regex(@"^(?<name>\w[\s\w]+?)\s*(?<num>\d+\s*[a-z]?)$", RegexOptions.IgnoreCase);
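A short sketch of reading the named groups from that expression; the wrapper class and method are hypothetical, only the pattern comes from the answer:

```csharp
using System.Text.RegularExpressions;

static class NamedGroupSplitter
{
    // Same pattern as above: a lazy "name" group, then a "num" group made of
    // digits plus at most one trailing letter, optionally separated by spaces.
    static readonly Regex Reggie = new Regex(
        @"^(?<name>\w[\s\w]+?)\s*(?<num>\d+\s*[a-z]?)$",
        RegexOptions.IgnoreCase);

    // Returns { name, num }, or null when the input does not match.
    public static string[] Split(string input)
    {
        Match m = Reggie.Match(input.Trim());
        if (!m.Success) return null;
        return new string[] { m.Groups["name"].Value, m.Groups["num"].Value };
    }
}
```

Because the name group only allows word characters and whitespace, inputs containing punctuation such as "St. James 4" do not match and return null here.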
+1

, String.LastIndexOf() .

, - , splittedValue.Any(c => Char.IsDigit(c));. , - , , , , , , .
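One way String.LastIndexOf() and a Char.IsDigit check could be combined, as a hedged sketch. Stepping back to an earlier space when the tail has no digit is an assumption of this example, as are the names LastSpaceSplitter, Split and ContainsDigit; the digit check is a plain loop rather than LINQ's Any so that it also compiles on .NET 2.0.

```csharp
using System;

static class LastSpaceSplitter
{
    // Returns { street, number }; number is empty when no tail contains a digit.
    public static string[] Split(string input)
    {
        string trimmed = input.Trim();
        int pos = trimmed.LastIndexOf(' ');
        while (pos > 0)
        {
            string tail = trimmed.Substring(pos + 1);
            if (ContainsDigit(tail))
                return new string[] { trimmed.Substring(0, pos).Trim(), tail.Trim() };
            // Tail has no digit yet (e.g. "a" in "1 a"): try the previous space.
            pos = trimmed.LastIndexOf(' ', pos - 1);
        }
        return new string[] { trimmed, "" };
    }

    static bool ContainsDigit(string s)
    {
        foreach (char c in s)
            if (Char.IsDigit(c)) return true;
        return false;
    }
}
```

For "Street name 1 a" the first candidate tail "a" has no digit, so the split moves back one space and returns "Street name" and "1 a".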

Update

If your data really is that noisy and needs to be normalized, I think you cannot do better than what @Dyppl suggested: use a more complex regular expression and keep evolving it as you encounter inputs it does not handle.

0

This assumes that all of your “addresses” will be formatted in at least one of the ways mentioned above.

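// Requires: using System.Text.RegularExpressions;
string address = "Streetname 1a";

// Strip the leading run of non-digits; what remains is the house number ("1a").
string number = Regex.Replace(address, "^[^0-9]+", "");

// Everything before the number is the street name ("Streetname ").
string street = address.Substring(0, address.Length - number.Length);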

Then trim both values.

0

Source: https://habr.com/ru/post/1792464/

