Java jar files into a repository (CVS, SVN..)

Developer https://www.devze.com 2023-02-04 12:19 Source: Internet
Why is it a bad idea to commit Java jar files into a repository (CVS, SVN..)?


Because you can rebuild them from the source. On the other hand, if you are talking about third-party JAR files that are required by your project, then it is a good idea to commit them into the repository so that the project is self-contained.


So, you have a project that uses some external dependencies. These dependencies are well known. They all have

  • A group (typically, the organization/forge creating them)
  • An identifier (their name)
  • A version

In Maven terminology, this information is called the coordinates of the artifact (your JAR).
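As a rough illustration of how these three coordinates identify a file, here is a minimal sketch in Java (16 or later, for records). The `Coordinates` record and its methods are hypothetical helpers written for this answer, not part of Maven's API; the path format follows the standard Maven repository layout, and `log4j:log4j:1.2.17` is only used as an example.

```java
// Hypothetical helper, not part of Maven's API: models the three coordinates
// and the conventional repository path derived from them.
record Coordinates(String groupId, String artifactId, String version) {

    // Parses the common "group:artifact:version" shorthand.
    static Coordinates parse(String gav) {
        String[] parts = gav.split(":");
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected group:artifact:version, got " + gav);
        }
        return new Coordinates(parts[0], parts[1], parts[2]);
    }

    // Relative path of the jar in a repository using the standard Maven layout:
    // dots in the group become directory separators.
    String jarPath() {
        return groupId.replace('.', '/') + "/" + artifactId + "/" + version
                + "/" + artifactId + "-" + version + ".jar";
    }
}

class CoordinatesDemo {
    public static void main(String[] args) {
        Coordinates log4j = Coordinates.parse("log4j:log4j:1.2.17");
        System.out.println(log4j.jarPath()); // prints log4j/log4j/1.2.17/log4j-1.2.17.jar
    }
}
```

Because the path is fully determined by the coordinates, a build tool only needs the three values in a text file to find the binary.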

The dependencies I am talking about are either internal (for a web application, this can be your service/domain layer) or external (log4j, a JDBC driver, a Java EE framework, you name it). All those dependencies (also called artifacts) are, at their lowest level, binary files (JAR/WAR/EAR) that your CVS/SVN/Git won't be able to store efficiently. Indeed, SCMs assume that versioned content (the content for which diff operations are the most efficient) is text only. As a consequence, when binary data is stored, there is rarely any storage optimization (contrary to text, where only the differences between versions are stored).

As a consequence, what I would recommend is to use a build system with dependency management, like Maven, Ivy, or Gradle. Using such a tool, you declare all your dependencies (in fact, you declare your dependencies' artifact coordinates) in a text (or maybe XML) file, which will be in your SCM. BUT your dependencies won't be in SCM. Rather, each developer will download them onto their dev machine.

This transfers some network load from the SCM server to the internet (whose bandwidth is often more limited than that of the internal enterprise network), and raises the question of the long-term availability of artifacts. Both of these issues are solved (at least in the Maven world, but I believe both Ivy and Gradle are able to connect to such tools, and it seems some questions have been asked on this very subject) using enterprise repository proxies, like Nexus, Artifactory and others.

The beauty of these tools is that they make a view of all required artifacts available on the internal network, going as far as allowing you to deploy your own artifacts to these repositories, making sharing your code both easy and independent of the source (which may be an advantage).
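To make the proxy idea concrete, here is a small sketch in Java of why switching to an internal repository is cheap: the artifact path stays the same and only the base URL changes. The internal host name below is a made-up placeholder, and `RepositoryUrls` is a hypothetical class for illustration; in practice the build tool does this for you once the proxy is configured.

```java
import java.net.URI;

// Hypothetical illustration, not a real Maven/Nexus/Artifactory API.
class RepositoryUrls {

    // Resolves a repository-relative artifact path against a repository base URL.
    // The base must end with "/" so that the path is appended rather than replaced.
    static URI artifactUrl(String repositoryBase, String relativePath) {
        return URI.create(repositoryBase).resolve(relativePath);
    }

    public static void main(String[] args) {
        String path = "log4j/log4j/1.2.17/log4j-1.2.17.jar";

        // Public repository (Maven Central).
        System.out.println(artifactUrl("https://repo.maven.apache.org/maven2/", path));

        // Internal proxy: placeholder host name, assumed for this example only.
        System.out.println(artifactUrl("https://nexus.example.internal/repository/maven-public/", path));
    }
}
```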

To sum up this long reply: use Ivy/Maven/Gradle instead of a simple Ant build. These tools will allow you to define your dependencies, and will do all the work of downloading these dependencies and ensuring you use the declared version.

On a personal note, the day I discovered those tools, my vision of dependency handling in Java went from nightmare to heaven, as I now only have to say that I use this very version of this tool, and Maven (in my case) does all the background work of downloading it and storing it at the right location on my computer.


Source control systems are designed for holding text source code. They can hold binary files, but that isn't really what they are designed for. In some cases it makes sense to put a binary file in source control, but Java dependencies are generally better managed in a different way.

The ideal setup is one that lets you manage your dependencies outside of source control and simply "point" to the desired dependency from within the source. This has several advantages:

  • You can have a number of projects dependent on the same binaries without keeping a separate copy of each binary. It is common for a medium sized project to have hundreds of binaries it depends on. This can result in a great deal of duplication which wastes local and backup resources.
  • Versions of binaries can be managed centrally within your local environment or within the corporate entity.
  • In many situations, the source control server is not a local resource. Adding a bunch of binary files will slow things down because it increases the amount of data that needs to be sent across a slower connection.
  • If you are creating a war, there may be some jars you need for development, but not deployment and vice versa. A good dependency management tool lets you handle these types of issues easily and efficiently.
  • If you are depending on a binary file that comes from another one of your projects, it may change frequently. This means you could be constantly overwriting the binary with a new version. Since version control is going to keep every copy, it could quickly grow to an unmanageable size--particularly if you have any type of continuous integration or automated build scripts creating these binaries.
  • A dependency management system offers a certain level of flexibility in how you depend on binaries. For example, on your local machine, you may want to depend on the latest version of a dependency as it sits on your file system. However, when you deploy your application you want the dependency packaged as a jar and included in your file.

Maven's dependency management features solve these issues for you and can help you locate and retrieve binary dependencies as needed. Ivy is another tool that does this as well, but for Ant.
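The development-versus-deployment point in the list above can be sketched in Java (16 or later). This is a toy model, not Maven's API: the scope names mirror Maven's `compile`, `runtime`, `provided` and `test` scopes, the packaging rule is simplified, and the dependency names are placeholders.

```java
import java.util.List;

// Toy model of dependency scopes; names follow Maven's scopes,
// but the types and logic here are assumptions for illustration.
enum Scope {
    COMPILE(true),    // needed to build and to run: packaged
    RUNTIME(true),    // needed only at run time: packaged
    PROVIDED(false),  // supplied by the container: not packaged
    TEST(false);      // needed only for tests: not packaged

    final boolean packagedInWar;

    Scope(boolean packagedInWar) {
        this.packagedInWar = packagedInWar;
    }
}

record Dependency(String name, Scope scope) {}

class WarPackaging {

    // Returns the names of the dependencies that would end up inside the war.
    static List<String> jarsForWar(List<Dependency> deps) {
        return deps.stream()
                .filter(d -> d.scope().packagedInWar)
                .map(Dependency::name)
                .toList();
    }

    public static void main(String[] args) {
        List<Dependency> deps = List.of(
                new Dependency("log4j", Scope.COMPILE),
                new Dependency("jdbc-driver", Scope.RUNTIME),
                new Dependency("servlet-api", Scope.PROVIDED),
                new Dependency("junit", Scope.TEST));
        System.out.println(jarsForWar(deps)); // prints [log4j, jdbc-driver]
    }
}
```

With jars committed to source control, this distinction has to be maintained by hand in the build script; with a dependency manager it is one attribute per declared dependency.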


They are binary files:

  • It's better to reference the source, since that's what you're using source control for.
  • The system can't tell you what the differences between two versions of the file are.
  • They become a source of merge conflicts when they are compiled from source in the same repository.
  • Some systems (e.g. SVN) don't deal very well with large binary files.

In other words, it is better to reference the source and adjust your build scripts to make everything work.


The decision to commit jar files to SCM is usually influenced by the build tool being used. If using Maven in a conventional manner then you don't really have the choice. But if your build system allows you the choice, I think it is a good idea to commit your dependencies to SCM alongside the source code that depends on them.

This applies to third-party jars and in-house jars that are on a separate release cycle to your project. For example, if you have an in-house jar file containing common utility classes, I would commit that to SCM under each project that uses it.

If using CVS, be aware that it does not handle binary files efficiently. An SVN repository makes no distinction between binary and text files.

http://svnbook.red-bean.com/en/1.5/svn.forcvs.binary-and-trans.html

Update in response to the answer posted by Mark:

WRT bullet point 1: I would say it is not very common for even a large project to have hundreds of dependencies. In any case, disk usage (by keeping a separate copy of a dependency in each project that uses it) should not be your major concern. Disk space is cheap compared with the amount of time lost dealing with the complexities of a Maven repository. In any case, a local Maven repository will consume far more disk space than just the dependencies you actually use.

Bullet Point 3: Maven will not save you time waiting for network traffic. The opposite is true. With your dependencies in source control, you do a checkout, then you switch from one branch to another. You will very rarely need to check out the same jars again. If you do, it will take only minutes. The main reason Maven is a slow build tool is all the network access it does even when there is no need.

Bullet Point 4: Your point here is not an argument against storing jars in SCM. Maven is only easy once you have learned it, and it is only efficient up to the point when something goes wrong. Then it becomes difficult and your efficiency gains can disappear quickly. In terms of efficiency, Maven has a small upside when things work correctly and a big downside when they don't.

Bullet Point 5: Version control systems like SVN do not keep a separate copy of every version of every file. They store versions efficiently as deltas. It is very unlikely that your SVN repository will grow to an 'unmanageable' size.

Bullet Point 6: Your point here is not an argument against storing files in SCM. The use case you mention can be handled just as easily by a custom Ant build.
