Whatever happened to symbolic links?

Posted on November 26, 2009 by Tommy McGuire
Labels: eclipse virgo, linux, osgi, java, programming language
To tell the truth, I am relatively new to Java programming. Prior to my current job, I had only briefly used Java in anger, while working at IBM, and in fact even there I had much more experience with JNI than with actual Java code.

One thing that always irritated me when I looked at the language was that significant chunks of functionality were missing from this supposed system programming language. For example, how do you get access to environment variables? Sure, there was System.getenv, but until Java 5 it was deprecated and threw an error when called. Heck, Runtime.exec allows you to provide an environment to a sub-process. Even the hoary "system dependent" excuse seems unreasonable, since beaucoup other portable languages seemed to be capable of doing something useful.

Another example is symbolic links. (If you do not know what they are, you can find the definition elsewhere. I'm bitter.) Symbolic links are one of the most useful tools in the eternal war between hideous hacks and non-functionality. (Ok, sometimes they're not hacks. Did I mention I am bitter?) But Java (as of 6) does not recognize them, cannot create them, and in at least one case does the most massive wrong thing with them.

Suppose, for example, you have a directory that you wish to recursively delete. Suppose further that this directory contains a symbolic link to another directory, which you do not wish to delete. Now, most deletey things like rm handle this situation correctly: they delete the symbolic link itself, but leave the target directory untouched. Naive Java code, on the other hand, only recognizes the existence of files and directories and will cheerfully follow the link and blow away the contents of the target directory.

How do I know this? I have a web application that needs to publish content files from a stable directory outside of the exploded war file, and I am too lazy to modify the application's code to correctly look outside its own directory. This situation is the poster child of symlink uses. But what happens when you undeploy the web app? The exploded war file's directory is recursively deleted by naive Java code, at least in both Tomcat and the SpringSource dm Server.

The original, recursive delete code looks something like:

    private static boolean doRecursiveDelete(File root) {
        if (root.exists()) {
            if (root.isDirectory()) {
                File[] children = root.listFiles();
                if (children != null) {
                    for (File file : children) {
                        doRecursiveDelete(file);
                    }
                }
            }
            return root.delete();
        }
        return false;
    }

That method cheerfully follows symbolic links, since isDirectory is true for a link to a directory.

Fortunately, I found a patch from July, 2008 by Michael Bailey that attempts to fix the problem for Tomcat, and that seems to have a positive review. (On the other hand, I found a similar Tomcat bug and patch from August, 2009, that seems to be labeled WONTFIX.)

I created a patch for the SpringSource dm Server that we are using and life seems to be better. I also reported a bug and included the patch.

These patches work...strangely. Java does not recognize anything but files and directories, but it does let you get the canonical path to a file system object. If the canonical path of an object differs from the canonical path of its parent directory plus the object's name, there might be a symlink involved:

String path1 = file.getAbsoluteFile().getParentFile().getCanonicalPath() + File.separatorChar + file.getName();
String path2 = file.getAbsoluteFile().getCanonicalPath();
return !(path1.equals(path2));
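To show how that check plugs into a recursive delete, here is a minimal sketch. It is not the actual Tomcat or dm Server patch; the class name SafeDelete and the method names isSymlink and deleteRecursively are made up for illustration. It compares File objects rather than concatenated strings, which is an assumption on my part about how best to sidestep separator problems when the parent is the file system root.

```java
import java.io.File;
import java.io.IOException;

// Hypothetical helper class; the names are illustrative, not from any patch.
class SafeDelete {

    /** Heuristic: true if the last component of the path looks like a symbolic link. */
    static boolean isSymlink(File file) throws IOException {
        File parent = file.getAbsoluteFile().getParentFile();
        if (parent == null) {
            return false; // a file system root has no parent to compare against
        }
        // Resolve any links in the parent only, then re-attach the name.
        File inCanonicalParent = new File(parent.getCanonicalFile(), file.getName());
        // If canonicalizing the whole path changes it, the last component was a link.
        return !inCanonicalParent.getCanonicalFile().equals(inCanonicalParent.getAbsoluteFile());
    }

    /** Deletes root and everything under it, without descending through links. */
    static boolean deleteRecursively(File root) throws IOException {
        if (!isSymlink(root) && root.isDirectory()) {
            File[] children = root.listFiles();
            if (children != null) {
                for (File child : children) {
                    deleteRecursively(child);
                }
            }
        }
        // For a link, delete() removes the link itself and leaves the target alone.
        return root.delete();
    }
}
```

Calling SafeDelete.deleteRecursively(new File("webapps/myapp")) on an exploded war directory (the path is only an example) should remove the links inside it and leave their targets alone. Being a heuristic, it can also report a false positive for names that canonicalize differently for other reasons, such as ".." components.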


Furthermore, you may be wondering how to create a symlink in Java. One cow-orker (Hi, Del!) suggested exec'ing ln -s; I did not think of that since I like JNI too much:

    #include <errno.h>
    #include <string.h>
    #include <unistd.h>
    #include <jni.h>

    /* Native side of the static method util.Symlink.symlink(oldPath, newPath). */
    JNIEXPORT void JNICALL
    Java_util_Symlink_symlink(JNIEnv *env, jclass cls, jstring oldPath, jstring newPath)
    {
        /* GetStringUTFChars returns const char *, not const jbyte *. */
        const char *old = (*env)->GetStringUTFChars(env, oldPath, NULL);
        if (old == NULL) {
            jclass exCls = (*env)->FindClass(env, "java/io/IOException");
            if (exCls != NULL) {
                (*env)->ThrowNew(env, exCls, "cannot access oldPath");
                (*env)->DeleteLocalRef(env, exCls);
            }
            return;
        }
        const char *news = (*env)->GetStringUTFChars(env, newPath, NULL);
        if (news == NULL) {
            (*env)->ReleaseStringUTFChars(env, oldPath, old);
            jclass exCls = (*env)->FindClass(env, "java/io/IOException");
            if (exCls != NULL) {
                (*env)->ThrowNew(env, exCls, "cannot access newPath");
                (*env)->DeleteLocalRef(env, exCls);
            }
            return;
        }
        int rc = symlink(old, news);
        int err = errno; /* save errno before later calls can overwrite it */
        (*env)->ReleaseStringUTFChars(env, oldPath, old);
        (*env)->ReleaseStringUTFChars(env, newPath, news);
        if (rc) {
            jclass exCls = (*env)->FindClass(env, "java/io/IOException");
            if (exCls != NULL) {
                (*env)->ThrowNew(env, exCls, strerror(err));
                (*env)->DeleteLocalRef(env, exCls);
            }
            return;
        }
    }
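For comparison, the exec'ing approach might look something like the following. This is only a sketch: the class name LnSymlink is hypothetical, the method mirrors the argument order of the JNI function, and it assumes an ln command on the PATH that accepts -s and the conventional "--" end-of-options marker.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper; assumes a POSIX-style "ln" is available on the PATH.
class LnSymlink {

    /** Creates a symbolic link at newPath that points to oldPath by running "ln -s". */
    static void symlink(String oldPath, String newPath)
            throws IOException, InterruptedException {
        // "--" keeps paths that start with '-' from being read as options.
        ProcessBuilder pb = new ProcessBuilder("ln", "-s", "--", oldPath, newPath);
        pb.redirectErrorStream(true); // fold stderr into stdout so one stream is read
        Process process = pb.start();
        process.getOutputStream().close(); // ln needs no input

        // Drain the output so the child cannot block on a full pipe.
        ByteArrayOutputStream output = new ByteArrayOutputStream();
        InputStream in = process.getInputStream();
        try {
            byte[] buffer = new byte[1024];
            int n;
            while ((n = in.read(buffer)) != -1) {
                output.write(buffer, 0, n);
            }
        } finally {
            in.close();
        }

        int status = process.waitFor();
        if (status != 0) {
            throw new IOException("ln -s exited with status " + status
                    + ": " + output.toString().trim());
        }
    }
}
```

The trade-off is a process spawn per link and an error message scraped from ln's output instead of errno, but there is no native library to build and load.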


As for Java and me, well, I am learning to adapt. Java is not the worst programming language I have used. And if I am parsing the Google results correctly, symlinks will be handled in Java 7.
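For the curious, the Java 7 support lives in the java.nio.file package: Files.createSymbolicLink creates a link, Files.isSymbolicLink tests for one, and Files.walkFileTree does not follow links unless asked to. A minimal sketch of a link-safe recursive delete on top of that API might look like this; the class name Nio2Symlinks and the method name deleteTree are made up for illustration.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;

// Hypothetical helper class built on the Java 7 java.nio.file API.
class Nio2Symlinks {

    /** Deletes root and everything beneath it, removing links but not their targets. */
    static void deleteTree(Path root) throws IOException {
        // Without FileVisitOption.FOLLOW_LINKS, a link to a directory is
        // reported to visitFile instead of being descended into.
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs)
                    throws IOException {
                Files.delete(file); // regular files and symbolic links alike
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult postVisitDirectory(Path dir, IOException exc)
                    throws IOException {
                if (exc != null) {
                    throw exc; // listing the directory failed; do not hide it
                }
                Files.delete(dir); // the directory is empty by now
                return FileVisitResult.CONTINUE;
            }
        });
    }
}
```

Unlike File.delete, Files.delete throws an IOException describing what went wrong instead of returning false, so failures do not pass silently.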

[Edit: Why do I keep wanting to spell "canonical" with three n's?]

Comments



"news"?

Tommy McGuire
2009-11-26

Note the real problem here is that developers (perhaps ignorantly) rely on getCanonicalPath() in the first place.

The javadoc for getCanonicalPath() explains that it will resolve symlinks-- if the developers of Tomcat (and other software) were paying attention, they'd realize that normalizing the absolute path is "safer" (in terms of respecting symlinks) than using the canonical path.

Erin
2009-11-30

Erin, I'm not sure what you mean. getCanonicalPath is a round-about way of determining whether the current root is a symbolic link or not. (I added the original recursive delete method to the post, for illustration.)

You are right that there is a real difference between getAbsolutePath and getCanonicalPath, and that the former is probably what is called for, most of the time.

Tommy McGuire