I have an application with a few threads that manipulate data and save the output in different temporary files on a particular directory, in a Linux or a Windows machine. These files eventually need to be erased.
What I want to do is to be able to better separate the files, so I am thinking of doing this by Process ID and Thread ID. This will help the application save disk space because, upon termination of a threa开发者_开发技巧d, the whole directory with that thread's files can be erased and leave the rest of the application reuse the corresponding disk space.
Since the application runs on a single instance of the JVM, I assume it will have a single Process ID, which will be that of the JVM, right?
That being the case, the only way to discriminate among these files is to save them in a folder, the name of which will be related to the Thread ID.
Is this approach reasonable, or should I be doing something else?
java.io.File can create temporary files for you. As long as you keep a list of those files associated with each thread, you can delete them when the thread exits. You can also mark the files to delete on exit in case a thread does not complete.
It seems the simplest solution for this approach is really to extend Thread - never thought I'd see that day.
As P.T. already said Thread IDs are only unique as long as the thread is alive, they can and most certainly will be reused by the OS.
So instead of doing it this way, you use the Thread name that can be specified at construction and to make it simple, just write a small class:
public class MyThread extends Thread {
private static long ID = 0;
public MyThread(Runnable r) {
super(r, getNextName());
}
private static synchronized String getNextName() {
// We can get rid of synchronized with some AtomicLong and so on,
// doubt that's necessary though
return "MyThread " + ID++;
}
}
Then you can do something like this:
public static void main(String[] args) throws InterruptedException {
Thread t = new MyThread(new Runnable() {
@Override
public void run() {
System.out.println("Name: " + Thread.currentThread().getName());
}
});
t.start();
}
You have to overwrite all constructors you want to use and always use the MyThread
class, but this way you can guarantee a unique mapping - well at least 2^64-1 (negative values are fine too after all) which should be more than enough.
Though I still don't think that's the best approach, possibly better to create some "job" class that contains all necessary information and can clean up its files as soon as it's no longer needed - that way you also can easily use ThreadPools and co where one thread will do more than one job. At the moment you have business logic in a thread - that doesn't strike me as especially good design.
You're correct, the JVM has one process ID, and all threads in that JVM will share the process id. (It is possible for a JVM to use multiple processes, but AFAIK, no JVM does that.)
A JVM may very well re-use underlying OS threads for multiple Java threads, so there is no guaranteed correlation between a thread exiting in Java and anything similar happening at the OS level.
If you just need to cleanup stale files, sorting the files by their creation timestamp should do the job sufficiently? No need to encode anything special at all in the temporary file names.
Note that PIDs and TIDs are neither guaranteed to be increasing, no guaranteed to be unique across exits. The OS is free to recycle an ID. (In practice the IDs have to wrap around before re-use, but on some machines that can happen after only 32k or 64k processes have been created.
精彩评论