Test data plays an important role in test automation. Framework developers maintain test data in various formats, and Microsoft Excel is one of the easiest and most widely used ways of storing and maintaining it. Even manual QAs keep their test data in Excel files. To access this data from an automation framework, several Java libraries are available, and Apache POI is one of the most widely used among them. In this article, we will start our journey with Apache POI by understanding how to download it and use it in an automation framework, covering the following topics:
- What is Apache POI used for?
- How to download Apache POI?
- How to install POI libraries?
- How to configure POI libraries in Eclipse?
What is Apache POI used for?
Apache POI is an open-source library developed and distributed by the Apache Software Foundation. It is mainly used to create, read, and edit Microsoft Office files, chiefly Excel files, in Java programs. It is distributed as a set of JARs, which provide various classes and methods to manipulate Microsoft Office files. The image below shows details of the various formats and actions that Apache POI supports:
The older versions of Apache POI support only binary file formats such as doc, xls, and ppt, whereas from version 3.5 onwards Apache POI also supports OOXML file formats such as docx, xlsx, and pptx. The table below gives a brief summary of the various components provided by Apache POI:
Component | Full name | Purpose |
---|---|---|
POIFS | Poor Obfuscation Implementation File System | This component is a pure Java implementation of the OLE 2 Compound Document format, on which the binary-format components below are built. |
HSSF | Horrible Spreadsheet Format | This component reads/writes the older binary format of Excel (xls). |
XSSF | XML Spreadsheet Format | This component reads/writes the newer OOXML format of Excel (xlsx). |
HPSF | Horrible Property Set Format | This component extracts the "property sets" (such as title and author) of various types of MS-Office files. |
HWPF | Horrible Word Processor Format | This component reads/writes the older binary format of Word (doc). |
XWPF | XML Word Processor Format | This component reads/writes the newer OOXML format of Word (docx). |
HSLF | Horrible Slide Layout Format | This component reads/writes the older binary format of PowerPoint (ppt). |
HDGF | Horrible DiaGram Format | This component reads MS-Visio files (support is limited to low-level reading and text extraction). |
HPBF | Horrible PuBlisher Format | This component reads MS-Publisher files (support is limited to low-level reading and text extraction). |
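
To make the table concrete, here is a minimal sketch of how a framework might decide which component applies to a given test-data file. `PoiComponentChooser` and `componentFor` are hypothetical names used only for illustration and are not part of Apache POI. The `.vsd` and `.pub` extensions for Visio and Publisher are assumed from the usual Office conventions rather than taken from the table. The sketch uses only the standard Java library, so it runs even before the POI JARs are installed:

```java
import java.util.Locale;

// Hypothetical helper (not part of Apache POI): suggests which POI component
// from the table above handles a file, based only on its extension.
class PoiComponentChooser {

    static String componentFor(String fileName) {
        int dot = fileName.lastIndexOf('.');
        // No extension (or a trailing dot) means we cannot make a suggestion.
        if (dot < 0 || dot == fileName.length() - 1) {
            return "unknown";
        }
        String ext = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        switch (ext) {
            case "xls":  return "HSSF"; // older binary Excel format
            case "xlsx": return "XSSF"; // newer OOXML Excel format
            case "doc":  return "HWPF"; // older binary Word format
            case "docx": return "XWPF"; // newer OOXML Word format
            case "ppt":  return "HSLF"; // older binary PowerPoint format
            case "vsd":  return "HDGF"; // assumed extension for Visio files
            case "pub":  return "HPBF"; // assumed extension for Publisher files
            default:     return "unknown";
        }
    }

    public static void main(String[] args) {
        System.out.println(componentFor("TestData.xlsx"));  // prints XSSF
        System.out.println(componentFor("LegacyData.XLS")); // prints HSSF
    }
}
```

For Excel-based test data, the practical consequence is that an `xls` file calls for the HSSF classes and an `xlsx` file for the XSSF classes.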
Let's now quickly see how we can download the *Apache POI* libraries:
How to download Apache POI?
The first step in storing and accessing test data in Excel files is to download the Apache POI library. Follow the steps below to download it:
- First, navigate to the Apache POI webpage and click on the Download link in the left menu, as highlighted below:
- Secondly, clicking on the Download link navigates to the page showing the latest release of Apache POI, as highlighted below:
- Thirdly, either click on the "Latest Stable Release" link (marker 1), which scrolls the page down to the Apache POI binaries (marker 2), or scroll down directly to the binaries section shown by marker 2. Clicking on the "zip" file there navigates to a page showing various download links, as shown below:
- Fourthly, click on any of the highlighted links to download the zip file. You can save it in any folder of your choice, as shown below:
- Fifthly, once you unzip the file, it will show the contents as below:
These are the various JAR files that provide the classes and methods we use to manipulate the supported MS-Office file types. Next, let's see how we can install these JARs in our projects and use them.
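
Before adding the JARs to a project, it can help to list exactly which JAR files the unzipped folder contains, including those in its sub-folders. The following is a minimal sketch using only the standard Java library. `PoiJarLister` and `findJars` are hypothetical names, and the folder to scan is assumed to be passed as the first command-line argument:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper: lists every JAR file under an unzipped POI folder.
class PoiJarLister {

    // Walks the given folder recursively and returns all *.jar files, sorted by path.
    static List<Path> findJars(Path root) throws IOException {
        try (Stream<Path> paths = Files.walk(root)) {
            return paths
                    .filter(Files::isRegularFile)
                    .filter(p -> p.getFileName().toString()
                            .toLowerCase(Locale.ROOT).endsWith(".jar"))
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the unzipped POI folder is the first argument;
        // the current directory is used when no argument is given.
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        for (Path jar : findJars(root)) {
            System.out.println(root.relativize(jar));
        }
    }
}
```

The printed paths show at a glance which JARs sit in the parent folder and which sit in sub-folders, which is useful when adding them to a build path.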
How to install POI libraries?
As we saw in the sections above, all the Apache POI libraries are available as JARs. To access the functionality of POI, these JARs must be on the build path of your application or framework. Since we mainly use Eclipse as the IDE in our articles, let's quickly see how to add the Apache POI JARs to the build path of a project in Eclipse:
How to configure POI libraries in Eclipse?
Follow the steps below to add the POI JARs to a project in Eclipse:
- Firstly, suppose you have created a Java project in Eclipse, as per the steps mentioned in the article "Configure Selenium WebDriver with Eclipse".
- After that, right-click on the project in Eclipse and select Build Path >> Configure Build Path, as shown below:
- Thirdly, this opens the "Properties" of the project. Select the Libraries tab, and then click on Add External JARs, as highlighted below.
Note: Classpath (as highlighted by marker 1) should be selected when adding the external JARs.
- Fourthly, select the JARs in the parent folder of the unzipped POI files, and click on the Open button to include them in the Eclipse project:
- Next, select the JARs under the ooxml-lib folder in the unzipped POI folder, as highlighted below:
- Sixthly, select the JARs under the lib folder in the unzipped POI folder, as highlighted below:
- After that, once all the POI JARs are added, click on the Apply and Close button, as highlighted below:
- Once all the POI libraries are added to the Eclipse project, they will appear under the Referenced Libraries folder in the left pane of the Eclipse project structure, as shown below:
This completes the installation of Apache POI in an Eclipse project. We can now start using the capabilities of these libraries in our Java project.
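
As a quick sanity check of the setup, a small program can confirm that the POI classes are visible on the project's classpath. The following is a minimal sketch rather than an official verification procedure. `PoiClasspathCheck` and `isOnClasspath` are hypothetical names, while the three class names it looks up are real Apache POI classes for the common spreadsheet API, `xls` support, and `xlsx` support. It uses only the standard Java library, so it compiles even if the JARs were not added correctly:

```java
// Hypothetical helper: reports whether key Apache POI classes can be found
// on the classpath of the project it runs in.
class PoiClasspathCheck {

    // Returns true if the named class can be located by this class's class loader.
    static boolean isOnClasspath(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers.
            Class.forName(className, false, PoiClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String[] poiClasses = {
            "org.apache.poi.ss.usermodel.Workbook",       // common spreadsheet API
            "org.apache.poi.hssf.usermodel.HSSFWorkbook", // xls support
            "org.apache.poi.xssf.usermodel.XSSFWorkbook"  // xlsx support
        };
        for (String name : poiClasses) {
            System.out.println(name + " -> " + (isOnClasspath(name) ? "found" : "MISSING"));
        }
    }
}
```

Run it as a Java application inside the Eclipse project. If any line reports MISSING, the corresponding JARs are probably not on the build path yet.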
Key Takeaways
- Apache POI libraries provide the capabilities to handle various types of MS-Office files.
- Additionally, keeping test data in a file (e.g., an Excel file) is a common practice in automation frameworks, and Apache POI makes it easy to read and write test data in an Excel file.
- Lastly, Apache POI libraries are available as a set of JAR files, which we can download and install in an Eclipse project by adding them to the project's build path.