In this article we’ll see how to read Excel file in Java using Apache POI library. If you have to write to an Excel file in Java you can check this post- Write to Excel File in Java Using Apache POI
Apache POI
Apache POI is a Java API for Microsoft documents. Using Apache POI you can read and write MS Excel files using Java. In addition, you can read and write MS Word and MS PowerPoint files using Java.
Apache POI provides support for reading both OLE2 files and Office Open XML standards (OOXML) files.
- OLE2 files include most Microsoft Office files such as XLS, DOC, and PPT as well as MFC serialization API based file formats.
- Office OpenXML Format is the new standards based XML file format found in Microsoft Office 2007 and 2008. This includes XLSX, DOCX and PPTX.
Which means Apache POI supports reading excel files having .xls format as well as excel files with .xlsx format.
Maven dependencies for Apache POI
To read excel files using Apache POI you need to add the following dependencies.
<dependency> <groupId>org.apache.poi</groupId> <artifactId>poi</artifactId> <version>5.2.5</version> </dependency> <dependency> <groupId>org.apache.poi</groupId> <artifactId>poi-ooxml</artifactId> <version>5.2.5</version> </dependency>
Apache POI classes for reading excel files
Here is a primer about the classes that are used for reading excel file in Java using Apache POI.
Apache POI library provides two implementations for two excel spread sheet formats-
- HSSF- It is the pure Java implementation for earlier excel format (.xls). Classes in this implementation will usually have HSSF prefix like HSSFWorkBook, HSSFSheet.
- XSSF- It is the pure Java implementation for xslx file format (OOXML). Classes in this implementation will usually have XSSF prefix like XSSFWorkBook, XSSFSheet.
SS- It is a package built on top of HSSF and XSSF that provides common support for both formats with a common API. You should try to use classes from this package for better compatibility.
While working with excel file the usual progression is-
There are interfaces corresponding to this progression-
- org.apache.poi.ss.usermodel.Workbook- High level representation of a Excel workbook. This is the first object most users will construct whether they are reading or writing a workbook. It is also the top level object for creating new sheets.
- org.apache.poi.ss.usermodel.Sheet- High level representation of a Excel worksheet. Sheets are the central structures within a workbook.
- org.apache.poi.ss.usermodel.Row- High level representation of a row of a spreadsheet.
- org.apache.poi.ss.usermodel.Cell- High level representation of a cell in a row of a spreadsheet. Cells can be numeric, formula-based or string-based (text).
For creating a Workbook, WorkbookFactory class is used.
- org.apache.poi.ss.usermodel.WorkbookFactory- Factory for creating the appropriate kind of Workbook (be it HSSFWorkbook or XSSFWorkbook), by auto-detecting from the supplied input.
Read excel file in Java using Apache POI example
In this Java example to read excel file we’ll read the following sheet (Employee.xslx) having employee information and create Employee objects using that data.
Last column "Current Date" has a formula "Today()" which displays current date. That column is there to show you how to read a formula in excel using Apache POI.
Employee classimport java.time.LocalDate; public class Employee { private String firstName; private String lastName; private String department; private LocalDate dob; public String getFirstName() { return firstName; } public void setFirstName(String firstName) { this.firstName = firstName; } public String getLastName() { return lastName; } public void setLastName(String lastName) { this.lastName = lastName; } public String getDepartment() { return department; } public void setDepartment(String department) { this.department = department; } public LocalDate getDob() { return dob; } public void setDob(LocalDate dob) { this.dob = dob; } @Override public String toString() { return "Employee [firstName=" + firstName + ", lastName=" + lastName + ", department=" + department + ", dob=" + dob + "]"; } }
Following class reads data from the excel file and create Employee class objects.
import java.io.FileInputStream; import java.io.IOException; import java.text.ParseException; import java.time.LocalDate; import java.util.ArrayList; import java.util.Iterator; import java.util.List; import org.apache.poi.ss.usermodel.Cell; import org.apache.poi.ss.usermodel.CellType; import org.apache.poi.ss.usermodel.DateUtil; import org.apache.poi.ss.usermodel.Row; import org.apache.poi.ss.usermodel.Sheet; import org.apache.poi.ss.usermodel.Workbook; import org.apache.poi.ss.usermodel.WorkbookFactory; public class ReadExcel { // Excel file path (change as per your system) private static final String EXCEL_FILE_PATH="D:\\knpcode\\Employee.xlsx"; public static void main(String[] args) { ReadExcel readExcel = new ReadExcel(); readExcel.readExcelFile(EXCEL_FILE_PATH); } private void readExcelFile(String excelFilePath) { try { // get excel workbook Workbook workbook = WorkbookFactory.create(new FileInputStream(excelFilePath)); // get excel sheet Sheet sheet = workbook.getSheetAt(0); //If you have more than one sheet in a workbook then you // can iterate through sheets /*Iterator<Sheet> sheetIterator = workbook.sheetIterator(); while(sheetIterator.hasNext()) { Sheet sheet = sheetIterator.next();*/ readSheet(sheet); } catch (IOException | ParseException e) { e.printStackTrace(); } } private void readSheet(Sheet sheet) throws ParseException{ Iterator<Row> rowItr = sheet.iterator(); List<Employee> empList = new ArrayList<>(); // Iterate through rows while(rowItr.hasNext()) { Employee emp = new Employee(); Row row = rowItr.next(); // skip header (First row) if(row.getRowNum() == 0) { continue; } Iterator<Cell> cellItr = row.cellIterator(); // Iterate each cell in a row while(cellItr.hasNext()) { Cell cell = cellItr.next(); int index = cell.getColumnIndex(); switch(index) { case 0: emp.setFirstName((String)getValueFromCell(cell)); break; case 1: emp.setLastName((String)getValueFromCell(cell)); break; case 2: emp.setDepartment((String)getValueFromCell(cell)); break; case 3: // cast from Object to LocalDate emp.setDob((LocalDate)getValueFromCell(cell)); break; case 4: // Formula value is just printed on the console System.out.println(getValueFromCell(cell)); break; } } empList.add(emp); } for(Employee emp : empList) { System.out.println("Employee information- " + emp.toString()); } } // Utility method to get cell value based on cell type private Object getValueFromCell(Cell cell) { CellType cellType = cell.getCellType(); if(cellType.equals(CellType.FORMULA)) { cellType = cell.getCachedFormulaResultType(); } switch(cellType) { case STRING: return cell.getStringCellValue(); case BOOLEAN: return cell.getBooleanCellValue(); case NUMERIC: // Check for date in numeric cell (as date is also considered numeric) if(DateUtil.isCellDateFormatted(cell)) { return cell.getLocalDateTimeCellValue().toLocalDate(); } return cell.getNumericCellValue(); case BLANK: return ""; default: return ""; } } }Output
2024-03-21 2024-03-21 2024-03-21 Employee information- Employee [firstName=John, lastName=Emerson, department=Technology, dob=1990-11-12] Employee information- Employee [firstName=Shelly, lastName=Mcarthy, department=HR, dob=1960-05-07] Employee information- Employee [firstName=Tom, lastName=Cevor, department=Finance, dob=1992-03-03]
Excel file used in the code has only a single sheet which can be retrieved using- Sheet sheet = workbook.getSheetAt(0);
If there are more than one sheet in the workbook then you can iterate through the sheets too-
Iterator<Sheet> sheetIterator = workbook.sheetIterator(); while(sheetIterator.hasNext()) { Sheet sheet = sheetIterator.next(); // get data from sheet }
In an excel sheet, cells can have values of different data types. Cell
class in Apache POI has different methods to get different types of cell values like getStringCellValue(), getBooleanCellValue() etc.
In order to use the correct method to extract value you need to know the data type of the cell value that's what is done in the getValueFromCell(Cell cell)
method in the example code.
Some of the data types are as following-
- Boolean
- Blank
- Formula
- Numeric
- String
There is actually no 'DATE' cell type in Excel. Date value is stored as Numeric type. That's why logic for date is in Numeric.
In case value of cell is of type 'Formula', use getCachedFormulaResultType()
method to get the type of the value returned by the formula. Then use that type to get the actual value using the appropriate get method.
That's all for the topic Read Excel File in Java Using Apache POI. If something is missing or you have something to share about the topic please write a comment.
You may also like
- How to Read Input From Console in Java
- Read File Asynchronously in Java
- Stable and Unstable Sorting Algorithms
- Java Date Difference Program
- ReentrantLock in Java With Examples
- JDBC Transaction Management and Savepoint Example
- Spring Boot and Dependency Injection
- Parquet File Format in Hadoop
- How to Copy File in Python
- Setting Error Page (404 Page) in React Router
No comments:
Post a Comment