GZIP compresses a single file or stream, so compressing multiple files in GZIP format in Java is a two-step process:
- first the files are archived into a single file with tar,
- then that tar file is compressed with gzip to create a .tar.gz compressed archive.
In this post we'll see the whole process of compressing multiple files using gzip in Java: creating a tar archive in Java and then gzipping it to produce a .tar.gz archive.
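To make the single-stream nature of GZIP concrete, here is a small sketch that uses only java.util.zip from the JDK. The class and method names (GzipSingleStreamSketch, compress, decompress) are made up for illustration. It compresses one byte stream and restores it; there is no notion of file names or directories anywhere, which is why the files are bundled into a tar archive first.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class, not part of the original program
class GzipSingleStreamSketch {

  // Compress one block of bytes into a single GZIP stream
  static byte[] compress(byte[] data) throws IOException {
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
      gzip.write(data);
    } // closing the stream writes the GZIP trailer
    return buffer.toByteArray();
  }

  // Restore the original bytes from a GZIP stream
  static byte[] decompress(byte[] compressed) throws IOException {
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    try (GZIPInputStream gzip =
        new GZIPInputStream(new ByteArrayInputStream(compressed))) {
      byte[] chunk = new byte[4096];
      int read;
      while ((read = gzip.read(chunk)) != -1) {
        buffer.write(chunk, 0, read);
      }
    }
    return buffer.toByteArray();
  }

  public static void main(String[] args) throws IOException {
    byte[] original =
        "hello gzip hello gzip hello gzip".getBytes(StandardCharsets.UTF_8);
    byte[] restored = decompress(compress(original));
    System.out.println("round trip ok: " + Arrays.equals(original, restored));
  }
}
```

Because the compressed data is just one stream, anything that needs to carry several named files has to be packed into a container format such as tar before it reaches GZIPOutputStream.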
Gzip multiple files in Java
The Java program given here archives multiple files into a tar archive and then compresses it with GZIP. It uses the Apache Commons Compress library, which can be downloaded from https://commons.apache.org/proper/commons-compress/download_compress.cgi
The version used here is commons-compress-1.18, so commons-compress-1.18.jar is added to the classpath.
The following two classes from the Apache Commons Compress library are used for creating the tar archive.
- TarArchiveEntry: Represents an entry in a tar archive. Every directory and file that goes into the archive is added to it as a TarArchiveEntry.
- TarArchiveOutputStream: Provides methods to put archive entries and then write the content of each file to the stream. In the program TarArchiveOutputStream wraps a GZIPOutputStream, so the tar data is gzip-compressed as it is written.
Java program – Create tar archive and Gzip multiple files
The directory structure used in the Java program is given below. There is a parent directory test containing two sub-directories, docs and prints, and four files in total.

    $ ls -R test
    test:
    aa.txt  bb.txt  docs  prints

    test/docs:
    display.txt

    test/prints:
    output
The program has to traverse the directory structure to archive all the files and directories. For a directory, only the entry itself is added to the archive. For a file, the entry is added and the content of the file is then written to the stream.
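The traversal and the way entry names are derived can be tried out without any archive library. The following is a minimal sketch using java.nio.file from the JDK; EntryNameSketch and entryNames are hypothetical names, and the sketch assumes the directory being archived has a parent directory. It only computes the names relative to that parent, which is the naming scheme the tar entries follow.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper class, not part of the original program
class EntryNameSketch {

  // Walk the tree under root and return each path relative to
  // root's parent, so every name starts with the root directory name
  static List<String> entryNames(Path root) throws IOException {
    Path absoluteRoot = root.toAbsolutePath();
    Path base = absoluteRoot.getParent();
    if (base == null) {
      throw new IllegalArgumentException("root must have a parent directory");
    }
    try (Stream<Path> paths = Files.walk(absoluteRoot)) {
      return paths
          // tar entry names use '/' as separator on every platform
          .map(p -> base.relativize(p).toString().replace('\\', '/'))
          .sorted()
          .collect(Collectors.toList());
    }
  }

  public static void main(String[] args) throws IOException {
    // args[0] is the directory to inspect
    for (String name : entryNames(Paths.get(args[0]))) {
      System.out.println("entryName " + name);
    }
  }
}
```

The names are sorted here only to make the listing predictable; a recursive walk with File.listFiles() returns children in no guaranteed order.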
    import java.io.BufferedInputStream;
    import java.io.BufferedOutputStream;
    import java.io.File;
    import java.io.FileInputStream;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.util.zip.GZIPOutputStream;
    import org.apache.commons.compress.archivers.tar.TarArchiveEntry;
    import org.apache.commons.compress.archivers.tar.TarArchiveOutputStream;
    import org.apache.commons.compress.utils.IOUtils;

    public class GZipMultipleFiles {
      public static void main(String[] args) {
        String PARENT_DIRECTORY = "/home/knpcode/Documents/test";
        GZipMultipleFiles gzipMultipleFiles = new GZipMultipleFiles();
        gzipMultipleFiles.createTarArchive(PARENT_DIRECTORY);
      }

      public void createTarArchive(String parentDir){
        File root = new File(parentDir);
        // output name for the archive is <parentDir>.tar.gz;
        // try-with-resources closes the streams even if archiving fails
        try (FileOutputStream fos = new FileOutputStream(root.getAbsolutePath().concat(".tar.gz"));
             GZIPOutputStream gzipOS = new GZIPOutputStream(new BufferedOutputStream(fos));
             TarArchiveOutputStream tarArchive = new TarArchiveOutputStream(gzipOS)) {
          addToArchive(parentDir, "", tarArchive);
        } catch (IOException e) {
          e.printStackTrace();
        }
      }

      public void addToArchive(String filePath, String parent,
          TarArchiveOutputStream tarArchive) throws IOException {
        File file = new File(filePath);
        // Create entry name relative to parent file path
        // for the archived file
        String entryName = parent + file.getName();
        System.out.println("entryName " + entryName);
        // add tar ArchiveEntry
        tarArchive.putArchiveEntry(new TarArchiveEntry(file, entryName));
        if(file.isFile()){
          try (BufferedInputStream bis = new BufferedInputStream(new FileInputStream(file))) {
            // Write file content to archive
            IOUtils.copy(bis, tarArchive);
          }
          tarArchive.closeArchiveEntry();
        }else if(file.isDirectory()){
          // no content to copy so close archive entry
          tarArchive.closeArchiveEntry();
          // if this directory contains more directories and files
          // traverse and archive them (listFiles() returns null on I/O error)
          File[] children = file.listFiles();
          if(children != null){
            for(File f : children){
              // recursive call; tar entry names use "/" on every platform
              addToArchive(f.getAbsolutePath(), entryName + "/", tarArchive);
            }
          }
        }
      }
    }

Output listing the entries added to the tar archive:
    entryName test
    entryName test/docs
    entryName test/docs/display.txt
    entryName test/bb.txt
    entryName test/prints
    entryName test/prints/output
    entryName test/aa.txt
The same entries appear when the resulting test.tar.gz file is opened in Archive Manager.
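If no archive manager is at hand, the entry names can also be listed from Java. The sketch below is a deliberately simplified reader that uses only the JDK: it decompresses the stream with GZIPInputStream and reads the 512-byte tar headers directly. TarGzListingSketch and listEntryNames are hypothetical names, and the sketch assumes plain headers with names of at most 100 bytes and octal size fields; it does not handle long-name extensions or very large files. For anything beyond a quick check, the TarArchiveInputStream class in Commons Compress is the more robust choice.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.GZIPInputStream;

// Hypothetical helper class, not part of the original program
class TarGzListingSketch {
  private static final int BLOCK = 512; // tar works in 512-byte blocks

  static List<String> listEntryNames(InputStream tarGz) throws IOException {
    List<String> names = new ArrayList<>();
    try (DataInputStream in = new DataInputStream(new GZIPInputStream(tarGz))) {
      byte[] header = new byte[BLOCK];
      while (true) {
        try {
          in.readFully(header);
        } catch (EOFException e) {
          break; // stream ended without the trailing zero blocks
        }
        if (header[0] == 0) {
          break; // an all-zero block marks the end of the archive
        }
        names.add(field(header, 0, 100)); // name: offset 0, 100 bytes
        String sizeText = field(header, 124, 12).trim(); // size: octal text
        long size = sizeText.isEmpty() ? 0 : Long.parseLong(sizeText, 8);
        // content is padded up to a whole number of blocks
        skipFully(in, (size + BLOCK - 1) / BLOCK * BLOCK);
      }
    }
    return names;
  }

  // Read a NUL-terminated text field from a header block
  private static String field(byte[] block, int offset, int length) {
    int end = offset;
    while (end < offset + length && block[end] != 0) {
      end++;
    }
    return new String(block, offset, end - offset, StandardCharsets.UTF_8);
  }

  private static void skipFully(DataInputStream in, long count) throws IOException {
    byte[] scratch = new byte[BLOCK];
    while (count > 0) {
      int n = in.read(scratch, 0, (int) Math.min(scratch.length, count));
      if (n < 0) {
        throw new EOFException("truncated tar entry");
      }
      count -= n;
    }
  }

  public static void main(String[] args) throws IOException {
    // args[0] is the path of a .tar.gz file to inspect
    try (InputStream file = new FileInputStream(args[0])) {
      for (String name : listEntryNames(file)) {
        System.out.println(name);
      }
    }
  }
}
```

Running it with the path of the generated test.tar.gz as the argument prints one entry name per line.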
That's all for the topic GZIP Multiple Files in Java Creating Tar Archive. If something is missing or you have something to share about the topic please write a comment.