PDF.js by Mozilla
JavaScript library for rendering PDF documents in the browser
- License: Apache License, Version 2.0
- Project: github.com/mozilla/pdf.js
FileConvertLab relies on several open-source libraries for document conversion, PDF manipulation, and optical character recognition. This page provides attributions for third-party components whose licenses require notice in documentation or accompanying materials. We are grateful to the developers and organizations maintaining these projects.
The following libraries power FileConvertLab's conversion capabilities. Each library is listed with its license type, project homepage, and a brief description of its role in our system.
- PDF.js: JavaScript library for rendering PDF documents in the browser
- Apache PDFBox: Java library for creating and manipulating PDF documents
- Apache POI: Java API for Microsoft Office documents (Word, Excel, PowerPoint)
- JODConverter: Java library for document conversion using LibreOffice
- Apache Commons Imaging: Java library for reading and writing image formats
- TwelveMonkeys ImageIO: ImageIO plugins extending Java's image format support
- Tess4J: Java JNA wrapper for the Tesseract OCR engine
Open-source software enables innovation by allowing developers to build upon existing work. The libraries listed above are released under permissive licenses (Apache 2.0 and BSD 3-Clause) that allow commercial use with proper attribution. We comply with these license requirements by maintaining this attribution page and including license notices in our source code.
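The Apache License 2.0 and BSD 3-Clause License are permissive open-source licenses that allow modification, distribution, and commercial use. Both require that copyright notices and license terms be preserved in redistributions. Apache 2.0 additionally requires providing a copy of the license, marking modified files, and retaining any NOTICE file, while BSD 3-Clause prohibits using contributors' names to endorse derived products without permission. Neither license imposes copyleft requirements on derivative works.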
Document Conversion: JODConverter interfaces with LibreOffice for converting between office formats (DOCX, ODT, RTF, TXT). Apache POI provides direct manipulation of Microsoft Office formats when LibreOffice processing is not required. These libraries handle the complex parsing and generation of office document formats.
PDF Processing: Apache PDFBox powers our PDF generation, manipulation, and image extraction features. PDF.js renders PDF documents in the browser for preview functionality. Together, they provide comprehensive PDF capabilities from creation to display.
OCR and Image Handling: Tess4J wraps the Tesseract OCR engine, enabling text recognition from scanned documents and images. Apache Commons Imaging and TwelveMonkeys extend Java's native image format support, ensuring we can process the wide variety of image formats users upload.
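As a rough illustration of how ImageIO plugins extend format support: Java's ImageIO discovers readers through its service-provider mechanism, so a plugin JAR on the classpath (for example, a TwelveMonkeys module) should show up as extra format names without code changes. The sketch below uses only the standard `javax.imageio.ImageIO` API. The class name `ImageFormatSupport` and its helper methods are hypothetical and are not taken from FileConvertLab's code.

```java
import javax.imageio.ImageIO;
import java.util.Locale;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical helper: reports which image formats ImageIO can read right now.
final class ImageFormatSupport {

    private ImageFormatSupport() {}

    /** Returns the lower-cased, de-duplicated format names with a registered reader. */
    static SortedSet<String> readableFormats() {
        SortedSet<String> formats = new TreeSet<>();
        for (String name : ImageIO.getReaderFormatNames()) {
            // ImageIO reports both "png" and "PNG"; normalize to one spelling.
            formats.add(name.toLowerCase(Locale.ROOT));
        }
        return formats;
    }

    /** True if at least one reader is registered for the given format name. */
    static boolean canRead(String formatName) {
        return ImageIO.getImageReadersByFormatName(formatName).hasNext();
    }

    public static void main(String[] args) {
        // The exact list depends on the JDK and on which plugin JARs are on the classpath.
        System.out.println("Readable formats: " + readableFormats());
        System.out.println("webp readable: " + canRead("webp"));
    }
}
```

Running this once with and once without the plugin JARs on the classpath is a simple way to confirm which formats a given deployment can actually accept.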
Beyond the libraries listed above, FileConvertLab uses LibreOffice as an external conversion engine via JODConverter. LibreOffice is distributed under the Mozilla Public License 2.0 and provides the core document rendering and format conversion capabilities that power our office document tools.
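Tesseract OCR provides the underlying text recognition engine wrapped by Tess4J. It was originally developed at Hewlett-Packard, released as open source in 2005, and subsequently developed with sponsorship from Google. Tesseract is released under the Apache License 2.0 and represents decades of OCR research and development.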
Our frontend is built with Next.js (MIT License), React (MIT License), and Tailwind CSS (MIT License). These frameworks and tools enable the responsive, modern web interface that users interact with when converting documents on FileConvertLab.