Developing a Reusable File Archival Schedulable Job

DevOps and full-stack have been popular topics in our industry for a number of years now. Unfortunately, they don’t always mean the same thing to every organization, or even to individuals within the same organization. Rather than debate what they are or what they should be, this post works through a solution built by a development team for file archival: a task that the operations team performs frequently, or that developers handle themselves when working in a part-time operations role.

File archival needs are typically driven by a few factors: limits of physical space, business or compliance requirements for retention, and information availability for support teams. Rather than develop a solution that is fixed to a given set of requirements, we want a reusable, flexible solution that can be used and adapted without new development.

Configuration Parameters

This schedulable job will need to accept configuration parameters. They are:

  • Source Directory with optional filename pattern OR Source File(s)
  • Archival Directory
  • Rename pattern
  • Compress Y/N?
  • Delete original Y/N?
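To make the parameter list concrete, here is a minimal, scheduler-agnostic sketch of how these settings might be held and validated in plain Java. The class name `ArchiveSettings` and the map keys (`sourceDir`, `filenamePattern`, and so on) are hypothetical and are not Obsidian's actual configuration names; the sketch also covers only the source directory variant, not an explicit list of source files.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;

/** Hypothetical holder for the archival settings; not part of Obsidian's API. */
final class ArchiveSettings {
    final Path sourceDir;          // directory to scan for files
    final String filenamePattern;  // optional; null means "every file"
    final Path archiveDir;         // where archived copies are written
    final String renamePattern;    // optional; null means "keep the original name"
    final boolean compress;        // Compress Y/N
    final boolean deleteOriginal;  // Delete original Y/N

    private ArchiveSettings(Path sourceDir, String filenamePattern, Path archiveDir,
                            String renamePattern, boolean compress, boolean deleteOriginal) {
        this.sourceDir = sourceDir;
        this.filenamePattern = filenamePattern;
        this.archiveDir = archiveDir;
        this.renamePattern = renamePattern;
        this.compress = compress;
        this.deleteOriginal = deleteOriginal;
    }

    /** Builds settings from raw key/value pairs (the key names are assumptions). */
    static ArchiveSettings fromMap(Map<String, String> raw) {
        return new ArchiveSettings(
                Paths.get(required(raw, "sourceDir")),
                raw.get("filenamePattern"),
                Paths.get(required(raw, "archiveDir")),
                raw.get("renamePattern"),
                // Both flags default to false when absent, the least destructive choice.
                Boolean.parseBoolean(raw.getOrDefault("compress", "false")),
                Boolean.parseBoolean(raw.getOrDefault("deleteOriginal", "false")));
    }

    private static String required(Map<String, String> raw, String key) {
        String value = raw.get(key);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalArgumentException("Missing required setting: " + key);
        }
        return value.trim();
    }
}
```

Defaulting both Y/N flags to "no" means a misconfigured job copies files without compressing or deleting anything, which seems the safer failure mode for an archival task.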

We will develop our job as an Obsidian schedulable job, but you can easily adapt it to any scheduling platform you like. The resulting Java source code, FileArchiveJob.java, is available under the MIT open source licence for you to do with as you wish.

Job Code

The basic algorithm is as follows.

  • Iterate over files
  • Ensure archive directory exists
  • Archive file – applying compression if selected
  • Delete original if selected
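For readers not using Obsidian, the four steps above can be sketched with nothing but the JDK. This is an illustration under assumptions, not the job's actual code: the class `SimpleArchiver` and its method signature are hypothetical, it reads its file list from a single directory, and it appends `.gz` to compressed files as a naming choice of this sketch.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.GZIPOutputStream;

/** Hypothetical stand-alone version of the archival steps. */
final class SimpleArchiver {

    static List<Path> archiveAll(Path sourceDir, Path archiveDir,
                                 boolean compress, boolean deleteOriginal) throws IOException {
        // Ensure the archive directory exists (a no-op if it is already there).
        Files.createDirectories(archiveDir);

        // Collect the regular files first so nothing is deleted while the listing is open.
        List<Path> sources = new ArrayList<>();
        try (DirectoryStream<Path> listing = Files.newDirectoryStream(sourceDir)) {
            for (Path candidate : listing) {
                if (Files.isRegularFile(candidate)) {
                    sources.add(candidate);
                }
            }
        }

        List<Path> archived = new ArrayList<>();
        for (Path file : sources) {
            String name = file.getFileName().toString() + (compress ? ".gz" : "");
            Path target = archiveDir.resolve(name);
            // CREATE_NEW fails with FileAlreadyExistsException rather than overwriting.
            try (InputStream in = Files.newInputStream(file);
                 OutputStream raw = Files.newOutputStream(target, StandardOpenOption.CREATE_NEW);
                 OutputStream out = compress ? new GZIPOutputStream(raw) : raw) {
                byte[] buffer = new byte[4096];
                int read;
                while ((read = in.read(buffer)) != -1) {
                    out.write(buffer, 0, read);
                }
            }
            // Delete the original only after the archive copy has been written and closed.
            if (deleteOriginal) {
                Files.delete(file);
            }
            archived.add(target);
        }
        return archived;
    }
}
```

Deleting only after the output stream has closed matters for the compressed case, since `GZIPOutputStream` writes its trailer on close and an earlier delete could leave you with neither a valid archive nor the original.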

Because this is an Obsidian job, the list of files to archive comes from a source job’s results, which is the most flexible and powerful mechanism for supplying it.

Here’s what our main archival method ends up looking like:

protected void processFile(Context context, File f) throws ParameterException, IOException, DBException {
  boolean gzip = Boolean.TRUE.equals(context.getConfig().getBoolean(GZIP));
  String newName = determineArchiveFilename(f);
  // The file is copied into every configured archive directory.
  for (String dirPath : context.getConfig().getStringList(ARCHIVE_DIR)) {
    File dir = new File(dirPath);
    if (!dir.exists()) {
      dir.mkdirs();
    }
    // Covers both a failed mkdirs() and a path that exists but is not a directory.
    if (!dir.isDirectory()) {
      throw new RuntimeException("Archive directory does not exist and could not be created: " + dir.getAbsolutePath());
    }
    File archiveFile = new File(dir, newName);
    // Refuse to replace an existing archive unless overwrite is explicitly enabled.
    if (!Boolean.TRUE.equals(context.getConfig().getBoolean(OVERWRITE)) && archiveFile.exists()) {
      throw new RuntimeException("File already exists and overwrite is disabled: " + archiveFile.getAbsolutePath());
    }
    InputStream src = null;
    OutputStream dest = null;
    FileOutputStream fos = null;
    try {
      src = new BufferedInputStream(new FileInputStream(f));
      fos = new FileOutputStream(archiveFile);
      if (gzip) {
        dest = new BufferedOutputStream(new GZIPOutputStream(fos));
      } else {
        dest = new BufferedOutputStream(fos);
      }
      IOUtil.copyStream(src, dest, 4096, false);
      // Flush and close here so that a failure to finish writing is reported, not swallowed.
      dest.flush();
      dest.close();
      context.saveJobResult("archiveFile", archiveFile.getAbsolutePath());
    } finally {
      // fos is tracked separately so it is still closed if wrapping it in dest failed.
      IOUtil.closeQuietly(src);
      IOUtil.closeQuietly(dest);
      IOUtil.closeQuietly(fos);
    }
  }
}
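The `determineArchiveFilename` call above is where the rename pattern would be applied, but its implementation is not shown in this post. As a rough illustration only, here is one way such a helper could work. The class `ArchiveNames` and the `{name}`, `{ext}` and `{date}` tokens are hypothetical and are not the pattern syntax the actual job uses.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

/** Hypothetical rename helper; the token syntax is an assumption of this sketch. */
final class ArchiveNames {

    /**
     * Expands {name} (file name without extension), {ext} (extension including
     * the dot, or empty) and {date} (yyyyMMdd) in the given pattern.
     */
    static String apply(String pattern, String originalName, LocalDate date) {
        // A leading dot (e.g. ".profile") is treated as part of the name, not an extension.
        int dot = originalName.lastIndexOf('.');
        String base = dot > 0 ? originalName.substring(0, dot) : originalName;
        String ext = dot > 0 ? originalName.substring(dot) : "";
        return pattern
                .replace("{name}", base)
                .replace("{ext}", ext)
                .replace("{date}", date.format(DateTimeFormatter.BASIC_ISO_DATE));
    }
}
```

With these assumed tokens, `ArchiveNames.apply("{name}-{date}{ext}", "app.log", LocalDate.of(2024, 1, 31))` returns `app-20240131.log`. Passing the date in as a parameter, rather than reading the clock inside the helper, keeps the naming logic easy to test.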

There is a lot more to this job. It is designed with customization in mind and has additional optional features detailed on our wiki. Let us know what you think, or take this job for a spin in your own free instance of Obsidian Scheduler.

Next up, we’ll take a look at a reusable file cleanup job.