Developing a Reusable File Cleanup Schedulable Job

In my last post, I delivered a reusable file archival schedulable job and promised to look next at doing something similar for file cleanup.

File cleanup needs are driven by similar, if not identical, factors to those for file archival: limits on physical space, business or compliance requirements for retention, and information availability for support teams. Again, it makes sense to make the job flexibly configurable, so that differences in those requirements, or changes to them, don't send us back to the drawing board each time.

Configuration Parameters

Some of those configuration parameters are:

  • Directory
  • Recursive Y/N?
  • Filename pattern
  • Minimum/Maximum Size
  • Minimum time since modification
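To make these parameters concrete, here is a minimal sketch of how they could be modelled in plain Java. It is not the Obsidian job's own configuration API: the CleanupCriteria class, its field names, the use of a glob for the filename pattern, and the reading of minimum/maximum size as an inclusive range are all assumptions made for illustration.

```java
import java.io.IOException;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.PathMatcher;
import java.time.Duration;
import java.time.Instant;

/** Hypothetical holder for the cleanup settings listed above (not the Obsidian API). */
final class CleanupCriteria {
    final Path directory;              // Directory
    final boolean recursive;           // Recursive Y/N?
    final PathMatcher filenamePattern; // Filename pattern (assumed to be a glob here)
    final long minSizeBytes;           // Minimum size, inclusive (assumption)
    final long maxSizeBytes;           // Maximum size, inclusive (assumption)
    final Duration minAge;             // Minimum time since modification

    CleanupCriteria(Path directory, boolean recursive, String glob,
                    long minSizeBytes, long maxSizeBytes, Duration minAge) {
        this.directory = directory;
        this.recursive = recursive;
        this.filenamePattern = FileSystems.getDefault().getPathMatcher("glob:" + glob);
        this.minSizeBytes = minSizeBytes;
        this.maxSizeBytes = maxSizeBytes;
        this.minAge = minAge;
    }

    /** Returns true when the file satisfies the name, size and age criteria. */
    boolean matches(Path file, Instant now) throws IOException {
        if (!Files.isRegularFile(file)) {
            return false;
        }
        // The pattern is applied to the file name only, not the full path.
        if (!filenamePattern.matches(file.getFileName())) {
            return false;
        }
        long size = Files.size(file);
        if (size < minSizeBytes || size > maxSizeBytes) {
            return false;
        }
        Instant modified = Files.getLastModifiedTime(file).toInstant();
        return Duration.between(modified, now).compareTo(minAge) >= 0;
    }
}
```

With this sketch, a criteria object built with the glob "*.log", a size range of 1 to 100 bytes and a minimum age of one day would match a 10-byte old.log last modified two days ago, but not a freshly written file or one outside the size range.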

Again, this will be an Obsidian schedulable job, adaptable to any scheduling platform you desire. Here is the resulting Java source code, available under the MIT open source licence for you to do with as you wish: FileCleanupJob.java.

Job Code

The basic algorithm is as follows:

  • Iterate over files in the directory
  • Determine if the filename mask applies
  • Determine if the file matches any other criteria specified
  • Delete the file.
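The steps above can be sketched without any scheduler dependency using only java.nio.file. This is an illustration under stated assumptions, not the job's actual code: CleanupSketch, cleanup, nameMask and otherCriteria are hypothetical names, and the matching rules are passed in as plain predicates.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical, scheduler-independent sketch of the four steps listed above. */
final class CleanupSketch {

    /** Deletes matching regular files under dir and returns the paths that were deleted. */
    static List<Path> cleanup(Path dir, boolean recursive,
                              Predicate<Path> nameMask,
                              Predicate<Path> otherCriteria) throws IOException {
        List<Path> candidates;
        // Step 1: iterate over files; a depth of 1 limits the walk to the directory itself.
        try (Stream<Path> walk = Files.walk(dir, recursive ? Integer.MAX_VALUE : 1)) {
            candidates = walk
                    .filter(Files::isRegularFile)
                    .filter(nameMask)       // Step 2: does the filename mask apply?
                    .filter(otherCriteria)  // Step 3: do the other criteria match?
                    .collect(Collectors.toList());
        }
        // Step 4: delete only after the walk has finished, so the listing is not
        // modified while it is still being traversed.
        List<Path> deleted = new ArrayList<>();
        for (Path p : candidates) {
            if (Files.deleteIfExists(p)) {
                deleted.add(p);
            }
        }
        return deleted;
    }
}
```

A call such as cleanup(dir, false, p -> p.getFileName().toString().endsWith(".tmp"), p -> true) would remove only the .tmp files sitting directly in dir and leave subdirectories untouched.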

Here’s what our primary cleanup method ends up looking like:

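protected void processDirectory(final Context context, String dir) throws Exception {
  // Recurse into subdirectories only when the RECURSIVE option is explicitly true.
  boolean recursive = Boolean.TRUE.equals(context.getConfig().getBoolean(RECURSIVE));
  List<File> files = new ArrayList<File>();
  Directory d = new Directory(dir);
  List<com.carfey.jdk.io.File> fileList = recursive ? d.listFilesRecursively() : d.listFiles();

  // First pass: collect every file that meets the deletion criteria.
  for (com.carfey.jdk.io.File file : fileList) {
    File ff = new File(file.getAbsolutePath()); // java.io.File
    if (shouldDelete(context, ff)) {
      files.add(ff);
    }
  }
  // Second pass: delete the matches, checking for interruption after each one.
  for (File f : files) {
    deleteMatchingFile(context, f);
    checkInterrupted();
  }
}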

You’ll notice the job even supports multiple directories, so you don’t have to configure it multiple times for different directories when the other configuration criteria are all the same. This job is also designed with customization in mind. All of its available features and their usage are detailed on our wiki. Try this job out in your own free instance of Obsidian Scheduler.