I have thousands of pieces of address data and I want to parse them so I can separate street from country from postal code and so on.
Is there any way to do that in Java?
I know that Google open-sourced their international address and phone number parsing library. I'd suggest you check their presentation here and the Javadoc.
If you simply have addresses from all over the world in the form in which they appear on letters, and you later want to send letters there, you had better leave them in this format (maybe after splitting off the country, which usually comes last).
The internal formats differ greatly between individual countries (even if you only compare Germany, Great Britain, and Russia), and a database that stores the individual components then requires country-specific logic to put them together again.
(I once had an application which took the individual fields as input and later created an address list from them (the "German way to do this"), and I always received complaints from British users that I formatted their addresses in the wrong order. So in a later version I simply created a multi-line "address" input field, which I then output without any change.)
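The "split off the country, keep the rest as-is" idea above could be sketched roughly as follows. This is only an illustration under a loud assumption: the country is taken to be the last non-blank line of the address, which will not hold for every data set. The class and method names (`CountrySplitter`, `splitOffCountry`) are hypothetical and not from any library.

```java
import java.util.ArrayList;
import java.util.List;

final class CountrySplitter {

    /**
     * Returns a two-element array: [0] the address body with its line order kept,
     * [1] the country. ASSUMPTION: the country is the last non-blank line.
     * If there are fewer than two non-blank lines, the country is left empty.
     */
    static String[] splitOffCountry(String address) {
        // Split on any line break, trim each line, and drop blank lines.
        List<String> lines = new ArrayList<>();
        for (String line : address.split("\\R")) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                lines.add(trimmed);
            }
        }
        if (lines.size() < 2) {
            // Nothing that can safely be treated as a separate country line.
            return new String[] { String.join("\n", lines), "" };
        }
        // Remove the last line and treat it as the country.
        String country = lines.remove(lines.size() - 1);
        return new String[] { String.join("\n", lines), country };
    }

    public static void main(String[] args) {
        String[] parts = splitOffCountry("Max Mustermann\nHauptstr. 1\n12345 Berlin\nGermany");
        System.out.println("Country: " + parts[1]);
        System.out.println("Body:\n" + parts[0]);
    }
}
```

The body is stored and printed in its original line order, so no country-specific formatting logic is needed when the address is written back out.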
You could probably use regular expressions if you don't want to add third-party dependencies.
See: http://download.oracle.com/javase/6/docs/api/java/util/regex/Pattern.html and http://download.oracle.com/javase/6/docs/api/java/util/regex/Matcher.html
Usage is basically:
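import java.util.regex.Matcher;
import java.util.regex.Pattern;
...
private static final Pattern PAT_NAME = Pattern.compile("my\\sregex");
...
Matcher matcher = PAT_NAME.matcher("my address");
boolean found = matcher.find(); // true if the pattern occurs anywhere in the input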
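To make the Pattern/Matcher usage above a little more concrete, here is a minimal sketch that pulls a postal code and city out of a single address line using named capturing groups. The format is an assumption chosen for illustration (a five-digit postal code followed by the city, as in "10115 Berlin"); other countries would need their own patterns, and the names `PostalLineParser` and `parse` are hypothetical.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PostalLineParser {

    // ASSUMED format: five digits, whitespace, then the city name (e.g. "10115 Berlin").
    private static final Pattern POSTAL_CITY =
            Pattern.compile("^(?<postal>\\d{5})\\s+(?<city>\\S.*)$");

    /**
     * Returns [postalCode, city] if the trimmed line matches the assumed format,
     * or an empty Optional if it does not.
     */
    static Optional<String[]> parse(String line) {
        Matcher m = POSTAL_CITY.matcher(line.trim());
        if (!m.matches()) {
            return Optional.empty();
        }
        return Optional.of(new String[] { m.group("postal"), m.group("city") });
    }

    public static void main(String[] args) {
        parse("10115 Berlin")
                .ifPresent(p -> System.out.println(p[0] + " | " + p[1]));
    }
}
```

Returning an empty `Optional` for non-matching lines keeps the caller in control of what to do with addresses that do not fit the pattern, for example leaving them unparsed.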
There is a library here: http://jgeocoder.sourceforge.net/parser.html. It is older, but it works for most cases. If you want to use an API, I've used SmartyStreets in the past and it works decently well (https://smartystreets.com/).