What is the proper way to get the domain from a URL without the subdomains?
In Java, you can construct a new URL(urlString) from a string and call getHost() on it, but the result still includes any subdomains.
The problem is that hosts can look like subhost.example.com or subhost.example.co.uk.
There are several other two-part suffixes like co.uk (see the list at https://wiki.mozilla.org/TLD_List).
It seems to me the only correct way to get just the domain is to search the TLD list, strip the matching TLD from the end of the host, and then drop everything before the last remaining period in the host. Is there an existing method that does this? I didn't see one in java.net.URL, and I checked Apache Commons briefly but couldn't find one there.
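For illustration, the manual approach described above might look like the following sketch. The class name RegistrableDomain, its methods, and the three-entry SUFFIXES set are hypothetical and only there to make the example runnable; real code would load the full public suffix list rather than hard-code a few entries. It uses java.net.URI rather than java.net.URL to pull out the host.

```java
import java.net.URI;
import java.util.Arrays;
import java.util.Locale;
import java.util.Set;

final class RegistrableDomain {
    // Hypothetical, deliberately tiny suffix set for illustration only.
    // A real implementation would load the complete public suffix list.
    private static final Set<String> SUFFIXES = Set.of("com", "uk", "co.uk");

    // Returns the suffix plus the one label in front of it,
    // or null if the host is itself a suffix or no suffix matches.
    static String fromHost(String host) {
        String h = host.toLowerCase(Locale.ROOT);
        if (SUFFIXES.contains(h)) {
            return null; // e.g. "co.uk" has no label in front of the suffix
        }
        String[] labels = h.split("\\.");
        // Dropping the fewest leading labels first tries the longest suffix first,
        // so "co.uk" is matched before "uk".
        for (int i = 1; i < labels.length; i++) {
            String suffix = String.join(".", Arrays.copyOfRange(labels, i, labels.length));
            if (SUFFIXES.contains(suffix)) {
                return labels[i - 1] + "." + suffix;
            }
        }
        return null;
    }

    // Convenience wrapper: extract the host from a URL string first.
    static String fromUrl(String url) {
        String host = URI.create(url).getHost();
        return host == null ? null : fromHost(host);
    }
}
```

With this sketch, RegistrableDomain.fromUrl("http://subhost.example.co.uk/page") yields example.co.uk and RegistrableDomain.fromHost("subhost.example.com") yields example.com. It ignores the wildcard and exception rules that the real list contains, which is one reason to prefer an existing library.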
I know this is a few years late, but if anyone stumbles across this question, try the following with Guava's InternetDomainName:
InternetDomainName.from("subhost.example.co.uk").topPrivateDomain().toString()
The above will return example.co.uk.
Not sure if the above answer is correct:
InternetDomainName.from("test.blogspot.com").topPrivateDomain() -> test.blogspot.com
This works better in my case:
InternetDomainName.from("test.blogspot.com").topDomainUnderRegistrySuffix() -> blogspot.com
Details: https://github.com/google/guava/wiki/InternetDomainNameExplained
The above solutions require you to add Guava. If you already use OkHttp (or Retrofit, which depends on it), you can also use
PublicSuffixDatabase.get().getEffectiveTldPlusOne("test.blogspot.com")
This gives you blogspot.com.