Sunday, August 22, 2010

Searching JAR Files with Groovy

My favorite use of Groovy continues to be scripting in a Java development environment.  In this blog post, I demonstrate a simple Groovy script that recursively searches the JAR files under a provided directory for entries whose names contain a specified string.

Here is the Groovy script (findClassInJar.groovy):

#!/usr/bin/env groovy

/**
 * findClassInJar.groovy
 *
 * findClassInJar <<root_directory>> <<string_to_search_for>>
 *
 * Script that looks for provided String in JAR files (assumed to have .jar
 * extensions) in the provided directory and all of its subdirectories.
 */

import java.util.zip.ZipFile
import java.util.zip.ZipException

rootDir = args ? args[0] : "."
fileToFind = args && args.length > 1 ? args[1] : "class"
numMatchingItems = 0
def dir = new File(rootDir)
dir.eachFileRecurse
{ file->
   if (file.isFile() && file.name.endsWith(".jar"))
   {
      try
      {
         def zip = new ZipFile(file)
         try
         {
            zip.entries().each
            { entry->
               if (entry.name.contains(fileToFind))
               {
                  println file
                  println "\t${entry.name}"
                  numMatchingItems++
               }
            }
         }
         finally
         {
            zip.close()   // release the file handle for this JAR
         }
      }
      catch (ZipException zipEx)
      {
         println "Unable to open file ${file.name}"
      }
   }
}
println "${numMatchingItems} matches found!"

The script above is simple, but accomplishes the task at hand.  The next two screen snapshots show the start and finish of the script when run against a large set of JARs (Spring Framework 2.5.6 distribution) for a common name in that set ("Exporter").



There are a few interesting observations to be made about this Groovy script.  It uses Groovy closures several times, in particular with GDK methods such as File.eachFileRecurse and Enumeration.each.  The script also demonstrates one of Groovy's greatest strengths: the ability to use Java APIs and libraries.  Here, java.util.zip.ZipFile and java.util.zip.ZipException are used to read the contents of each encountered JAR file.  Although Groovy does not require any exception (checked or unchecked) to be caught, I chose to have the script catch ZipException and handle it by printing a message that the candidate JAR file could not be opened.  The advantage of explicitly catching and "handling" it is that the exception does not propagate up and stop the script.  A side benefit is a potentially more pleasing message than a partial stack trace.
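To make the exception-handling point concrete, here is a minimal sketch that can be run on its own.  It builds one tiny real JAR and one fake "JAR" (plain text with a .jar extension) in a temporary directory and reads both.  The helper name entryNamesOf, the file names, and the entry name are all made up for illustration and are not part of the script above.

```groovy
import java.nio.file.Files
import java.util.zip.ZipEntry
import java.util.zip.ZipException
import java.util.zip.ZipFile
import java.util.zip.ZipOutputStream

// Hypothetical helper (not in the original script): returns the entry names
// of a JAR, or an empty list if the file cannot be opened as a ZIP archive.
List<String> entryNamesOf(File candidate) {
    def names = []
    try {
        def zip = new ZipFile(candidate)
        try {
            // GDK lets us iterate a java.util.Enumeration with each {}
            zip.entries().each { entry -> names << entry.name }
        } finally {
            zip.close()
        }
    } catch (ZipException zipEx) {
        // Same handling as the script: report the problem and keep going
        println "Unable to open file ${candidate.name}"
    }
    return names
}

// Illustrative test data in a temporary directory
def tmpDir = Files.createTempDirectory("jarsearch").toFile()

def goodJar = new File(tmpDir, "good.jar")
def out = new ZipOutputStream(new FileOutputStream(goodJar))
out.putNextEntry(new ZipEntry("com/example/RmiExporter.class"))
out.closeEntry()
out.close()

def badJar = new File(tmpDir, "bad.jar")
badJar.text = "this is not a ZIP archive"

def goodNames = entryNamesOf(goodJar)
def badNames = entryNamesOf(badJar)   // prints the "Unable to open" message
println goodNames
println badNames
```

Because the ZipException is caught inside the helper, the fake JAR produces only a one-line message and the remaining files would still be processed.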

There are numerous things that could be done to make this simplistic script easier to use and more powerful.  Groovy's built-in CliBuilder support could provide more sophisticated command-line parameter handling and a proper usage message.  The script could be enhanced to support case-insensitive searches or exact matches in addition to contains matches.  It could also report more metadata, such as the number of JAR files in which matches were found or the number of matching files of each type.  It might be changed to search ZIP-format files other than JARs, or files with extensions other than .jar.  The good news is that even this simplistic script is useful in most situations.
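Two of these enhancements (case-insensitive or exact matching, and a count of JARs containing matches) might look something like the sketch below.  The helper entryMatches, its parameters, and the sample JAR and entry names are assumptions of mine for illustration; "exact" is interpreted here as matching the last path segment of the entry name, which is only one possible definition.

```groovy
// Hypothetical helper: decides whether a JAR entry name matches the search term.
// ignoreCase: compare without regard to case
// exact:      require the last path segment (e.g. "Foo.class") to equal the term
boolean entryMatches(String entryName, String term,
                     boolean ignoreCase = false, boolean exact = false) {
    def name = ignoreCase ? entryName.toLowerCase() : entryName
    def wanted = ignoreCase ? term.toLowerCase() : term
    if (!exact) {
        return name.contains(wanted)
    }
    def segments = name.tokenize('/')
    return segments ? segments.last() == wanted : false
}

// Made-up data standing in for the entries read from real JAR files
def sampleJars = [
    'spring-a.jar': ['org/example/RmiExporter.class',
                     'org/example/HttpExporter.class',
                     'org/example/Util.class'],
    'spring-b.jar': ['org/example/exporter.properties'],
    'other.jar'   : ['org/example/Importer.class']
]

// Keep only the JARs that have at least one case-insensitive match
def matchesByJar = sampleJars.collectEntries { jar, entries ->
    [(jar): entries.findAll { entryMatches(it, 'exporter', true) }]
}.findAll { jar, hits -> hits }

def totalMatches = matchesByJar.values().sum { it.size() }
println "${totalMatches} matches found in ${matchesByJar.size()} JAR files"
```

Collecting the hits per JAR, rather than only incrementing a counter, is what makes the extra metadata cheap to report.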

There are other approaches that can be used to search JAR files with Groovy.  The blog post "Groovy Script to find Java classes in JAR files" lists an even shorter Groovy script that uses Groovy's regular expression support in conjunction with Groovy's String.execute() to run the jar tvf command.  The Groovy Cookbook Examples include a related example called "Search One or More JAR Files for a Specified File or Directory."
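As a rough idea of how the jar tvf approach can work (this is my own sketch, not the script from the referenced post), the listing text can be split into lines and filtered with a regular expression.  The helper matchingLines and the sample listing are made up for illustration; the commented-out line shows how the listing might be obtained from a real JAR when the JDK's jar tool is on the PATH.

```groovy
// Hypothetical helper: keep the lines of a `jar tvf` style listing
// that contain a match for the given regular expression.
List<String> matchingLines(String listing, String pattern) {
    // In a boolean context, a Matcher is true when the pattern is found
    listing.readLines().findAll { it =~ pattern }
}

// Made-up text shaped like `jar tvf` output, used so the sketch runs anywhere
def sampleListing = '''\
     0 Fri Oct 31 10:00:00 MDT 2008 META-INF/
  1234 Fri Oct 31 10:00:00 MDT 2008 org/example/RmiExporter.class
   567 Fri Oct 31 10:00:00 MDT 2008 org/example/Util.class
'''

def hits = matchingLines(sampleListing, 'Exporter.*\\.class$')
hits.each { println it }

// Against a real JAR (assumes `jar` is on the PATH and some.jar exists):
// def listing = "jar tvf some.jar".execute().text
```

This variant depends on an external tool being installed, whereas the ZipFile-based script needs only the Java runtime that Groovy is already using.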
