Packaging large, frequently changing data files with a Maven project

Developer https://www.devze.com 2023-04-02 08:17 Source: web
I am in the process of converting a legacy Ant project into a Maven project. Part of the project is a very large (~1.6GB) set of data files in a compressed binary format which are accessed in a random-seek fashion via index tables. The data files are like logarithmic function tables, rainbow tables or similar data tables for massively abbreviating complex computations.

We publish new data tables on a weekly basis, and I want to be able to exploit Maven's dependency management system to help the developers get the latest tables.

The main problem I am having is that I cannot figure out how to bundle the tables up in a way that isn't just a JAR, ZIP or RAR of the whole set of them. Is there a way to write a pom that will result in a directory of data files? Or am I just thinking about the problem in a non-Maven way?

Thanks for any suggestions.


This depends on what the consumer can deal with. Maven dependencies don't deal with directories of files, so you'd need to publish the whole set as a single artifact. You probably want to deal with ZIPs, as JAR has an overloaded meaning (put on the classpath) and other compression formats need custom plugins.
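As a rough sketch of the consumer side, a project could depend on the ZIP and have the maven-dependency-plugin unpack it into a directory during the build. The coordinates `com.example:data-tables`, the version and the output path are made up for illustration, and the plugin version is left out (in practice you would pin one):

```xml
<!-- Fragment of a consumer pom.xml. The artifact coordinates and version
     are hypothetical placeholders, not real published artifacts. -->
<dependencies>
  <dependency>
    <groupId>com.example</groupId>
    <artifactId>data-tables</artifactId>
    <version>2023.14</version>
    <!-- type zip keeps the artifact from being treated as a classpath JAR -->
    <type>zip</type>
  </dependency>
</dependencies>

<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-dependency-plugin</artifactId>
      <executions>
        <execution>
          <id>unpack-data-tables</id>
          <phase>generate-resources</phase>
          <goals>
            <goal>unpack-dependencies</goal>
          </goals>
          <configuration>
            <!-- only unpack the data artifact, not every dependency -->
            <includeArtifactIds>data-tables</includeArtifactIds>
            <outputDirectory>${project.build.directory}/data-tables</outputDirectory>
          </configuration>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>
```

After the `generate-resources` phase the tables would sit as plain files under `target/data-tables`, which is about as close to "a directory of data files" as a dependency gets.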

However, if you can break it up into long-lived and short-lived data you may get better behaviour (e.g. a full release every quarter, plus a small set of changes to apply on top of it that is re-released weekly). This depends on whether the data can easily be split in this fashion, or overlaid, or patched in some way. This might be difficult in a compressed binary artifact.
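If the data can be split that way, one possible layout is to unpack the base and the weekly delta into sibling directories and let the application check the delta directory first. This is only a sketch: the artifact names `data-tables-base` and `data-tables-delta`, their versions and the directory layout are assumptions, and the lookup order would have to be implemented in your own code.

```xml
<!-- Fragment of a consumer pom.xml (goes under build/plugins).
     Both artifacts and their versions are hypothetical. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-dependency-plugin</artifactId>
  <executions>
    <execution>
      <id>unpack-base-and-delta</id>
      <phase>generate-resources</phase>
      <goals>
        <goal>unpack</goal>
      </goals>
      <configuration>
        <artifactItems>
          <!-- large, long-lived release: changes rarely, so it stays cached
               in the local repository -->
          <artifactItem>
            <groupId>com.example</groupId>
            <artifactId>data-tables-base</artifactId>
            <version>2023.Q2</version>
            <type>zip</type>
            <outputDirectory>${project.build.directory}/data-tables/base</outputDirectory>
          </artifactItem>
          <!-- small weekly update: only this version is bumped each week -->
          <artifactItem>
            <groupId>com.example</groupId>
            <artifactId>data-tables-delta</artifactId>
            <version>2023.14</version>
            <type>zip</type>
            <outputDirectory>${project.build.directory}/data-tables/delta</outputDirectory>
          </artifactItem>
        </artifactItems>
      </configuration>
    </execution>
  </executions>
</plugin>
```

Keeping the two in separate directories avoids relying on the order in which archives are extracted or on overwrite behaviour.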

The other alternative is to continuously build the large artifact, and discard old ones. This relies on good bandwidth between the build machines and the repository, and enough disk to hold as many builds as you need (repository managers like Archiva can help purge old builds on a regular schedule if that's appropriate).
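One way to set this up, assuming the continuously built artifact is published as a SNAPSHOT (the answer does not say so explicitly), is to have consumers depend on the snapshot version and check the repository for a newer build on every run. The repository id, URL and coordinates below are placeholders:

```xml
<!-- Fragment of a consumer pom.xml. Repository URL and artifact
     coordinates are hypothetical. -->
<repositories>
  <repository>
    <id>internal-snapshots</id>
    <url>https://repo.example.com/snapshots</url>
    <releases>
      <enabled>false</enabled>
    </releases>
    <snapshots>
      <enabled>true</enabled>
      <!-- check for a newer snapshot on every build (the default is daily) -->
      <updatePolicy>always</updatePolicy>
    </snapshots>
  </repository>
</repositories>

<dependencies>
  <dependency>
    <groupId>com.example</groupId>
    <artifactId>data-tables</artifactId>
    <version>1.0-SNAPSHOT</version>
    <type>zip</type>
  </dependency>
</dependencies>
```

With an artifact of this size, each new snapshot means a full download for every developer, which is where the bandwidth concern comes in.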

One final note: if you are dealing with ZIPs over 2G (which you are close to approaching), you will need to use a different ZIP implementation, such as the truezip-maven-plugin.
