Get domain without subdomain from a URL

Developer https://www.devze.com 2023-01-06 14:11 Source: the web
What is the proper way to get the domain from a URL without the subdomains?

In Java, from a string you can make a new URL(urlString) and call getHost() on the URL, but you have subdomains with it.
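To make the problem concrete, here is a minimal sketch of what the host lookup returns. It uses java.net.URI rather than java.net.URL (my substitution, because URI.create does not throw a checked exception); the class and method names are made up for illustration.

```java
import java.net.URI;

class HostDemo {
    // Returns the full host of a URL string, subdomains included.
    static String hostOf(String urlString) {
        return URI.create(urlString).getHost();
    }

    public static void main(String[] args) {
        // Prints "subhost.example.co.uk": the subdomain is still there.
        System.out.println(hostOf("https://subhost.example.co.uk/some/path"));
    }
}
```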

The problem is that there can be hosts like subhost.example.com and subhost.example.co.uk.

There are several other two-part suffixes like co.uk (see the list at https://wiki.mozilla.org/TLD_List).

It seems to me the only correct way to get only the domain is to do a search through the TLD list, remove the TLD from the end of the host, and take away everything before the last period in the host. Is there an existing method that does this? I didn't see one in java.net.URL, and I checked apache commons a bit but couldn't find one there.
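The manual approach described above could be sketched roughly as follows. This is only an illustration: SUFFIXES is a tiny hypothetical stand-in for the real list, the class and method names are invented, and the wildcard and exception rules that a real suffix list contains are ignored.

```java
import java.util.Arrays;
import java.util.Locale;
import java.util.Set;

class RegistrableDomain {
    // Hypothetical sample data; a real implementation would load the full suffix list.
    static final Set<String> SUFFIXES = Set.of("com", "uk", "co.uk");

    // Returns the longest matching suffix plus the one label to its left,
    // or null if no suffix matches or the host is itself a suffix.
    static String domainOf(String host) {
        String[] labels = host.toLowerCase(Locale.ROOT).split("\\.");
        // i is the index of the candidate suffix's first label, so longer
        // candidates are tried before shorter ones.
        for (int i = 0; i < labels.length; i++) {
            String candidate = String.join(".", Arrays.copyOfRange(labels, i, labels.length));
            if (SUFFIXES.contains(candidate)) {
                return i == 0 ? null : labels[i - 1] + "." + candidate;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(domainOf("subhost.example.com"));   // example.com
        System.out.println(domainOf("subhost.example.co.uk")); // example.co.uk
    }
}
```

Trying the longest candidate first is what keeps co.uk from being cut down to uk.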


I know this is a few years late, but if anyone stumbles across this question, try the following with Guava's InternetDomainName:

InternetDomainName.from("subhost.example.co.uk").topPrivateDomain().toString()

The above will return example.co.uk.


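I'm not sure the above answer is correct in every case. blogspot.com is itself listed as a public suffix, so topPrivateDomain() keeps the subdomain: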

InternetDomainName.from("test.blogspot.com").topPrivateDomain() -> test.blogspot.com

This works better in my case:

InternetDomainName.from("test.blogspot.com").topDomainUnderRegistrySuffix() -> blogspot.com

Details: https://github.com/google/guava/wiki/InternetDomainNameExplained
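The difference between the two calls comes down to which suffix set the host is matched against. The sketch below is not Guava's code; it is a hand-rolled illustration with two hypothetical sample sets (REGISTRY_SUFFIXES and PUBLIC_SUFFIXES) and invented names, assuming that public suffixes are the registry suffixes plus privately operated ones such as blogspot.com.

```java
import java.util.Arrays;
import java.util.Locale;
import java.util.Set;

class SuffixKinds {
    // Hypothetical sample data, not the real lists.
    static final Set<String> REGISTRY_SUFFIXES = Set.of("com", "co.uk");
    static final Set<String> PUBLIC_SUFFIXES = Set.of("com", "co.uk", "blogspot.com");

    // Returns the name one label below the longest matching suffix in the given set,
    // or null if nothing matches or the host is itself a suffix.
    static String oneLevelUnder(String host, Set<String> suffixes) {
        String[] labels = host.toLowerCase(Locale.ROOT).split("\\.");
        for (int i = 0; i < labels.length; i++) {
            String suffix = String.join(".", Arrays.copyOfRange(labels, i, labels.length));
            if (suffixes.contains(suffix)) {
                return i == 0 ? null : labels[i - 1] + "." + suffix;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Matched against public suffixes: test.blogspot.com
        System.out.println(oneLevelUnder("test.blogspot.com", PUBLIC_SUFFIXES));
        // Matched against registry suffixes only: blogspot.com
        System.out.println(oneLevelUnder("test.blogspot.com", REGISTRY_SUFFIXES));
    }
}
```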


The above solutions require you to add Guava. If you already use OkHttp or Retrofit, you can also use:

PublicSuffixDatabase.get().getEffectiveTldPlusOne("test.blogspot.com")

This gives you blogspot.com
