cattaDoc PHP requirements

cattaDoc is mainly written in PHP and uses a number of PHP add-ons: gettext, ADOdb, mcrypt, PEAR Mail_Mime and dompdf, as well as plain text extraction utilities.

PHP is the scripting language that cattaDoc is based on. From version 4.0 onwards, cattaDoc requires PHP 5.0 or later, so it cannot run on PHP 3 or PHP 4. cattaDoc 5.0 is tested with PHP 5.5.
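
The version requirement can be checked at run time before anything else is loaded. The following is a minimal sketch; the helper name meetsPhpRequirement is hypothetical and not part of cattaDoc, and the minimum is simply the PHP 5.0 figure stated above.

```php
<?php
// Hypothetical helper, not part of cattaDoc: returns true when the given
// PHP version string is at least the required minimum.
function meetsPhpRequirement($version, $minimum = '5.0.0')
{
    return version_compare($version, $minimum, '>=');
}

if (!meetsPhpRequirement(PHP_VERSION)) {
    // Stop early with a readable message instead of failing later.
    exit('cattaDoc requires PHP 5.0 or later; found ' . PHP_VERSION . "\n");
}
```

Passing the version in as a parameter keeps the check easy to try out with other version strings.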

cattaDoc requires 2 specific add-ons to PHP: gettext and ADOdb.

Furthermore, 3 additional add-ons are optional: mcrypt, PEAR Mail_Mime and dompdf.

In addition, cattaDoc uses a number of plain text extraction utilities, 3 of which are required and 1 of which is optional.

 

gettext - required

Gettext - part of the GNU project - is an open standard for supporting different languages in an application, e.g. an English, Danish or French user interface.

The gettext functions implement an NLS (Native Language Support) API which can be used to internationalise applications.

Note: gettext is operating system-specific and is therefore not included in the full cattaDoc download package.
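
As an illustration of how the gettext functions are typically used from PHP, here is a minimal sketch. The text domain name 'cattadoc', the locale 'da_DK.UTF-8' and the ./locale directory are assumptions made for the example, not cattaDoc's actual settings.

```php
<?php
// Assumptions for illustration only: the text domain 'cattadoc', the locale
// 'da_DK.UTF-8' and the ./locale directory are hypothetical.
function translate($message)
{
    if (!function_exists('gettext')) {
        // Extension missing: fall back to the untranslated message.
        return $message;
    }
    return gettext($message);
}

if (function_exists('gettext')) {
    setlocale(LC_ALL, 'da_DK.UTF-8');                 // select the language
    bindtextdomain('cattadoc', __DIR__ . '/locale');  // where the .mo files live
    textdomain('cattadoc');                           // default domain for gettext()
}

echo translate('Document'), "\n";
```

When no translation catalog is found for a message, gettext returns the message unchanged, so the application still shows its original text.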

 

ADOdb - required

ADOdb is a database abstraction layer for PHP, which hides the differences between the database access functions in PHP. Unfortunately, PHP database functions are not standardised. Using ADOdb makes it easier to add cattaDoc support for other databases in the future.

Furthermore, ADOdb is fast and mature (developed since 2000).

ADOdb is an open source PHP class library and you can download it from adodb.sourceforge.net.

Installation of ADOdb is easy: Just unpack all the files into a directory accessible by your webserver.

ADOdb is included in the full cattaDoc download package, but not in the cattaDoc-only package.
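
To show what the abstraction layer looks like in use, here is a minimal sketch. It assumes ADOdb is unpacked in an ./adodb directory; the driver, credentials, table and column names are placeholders and not cattaDoc's real schema.

```php
<?php
// Assumptions: ADOdb is unpacked in ./adodb; driver, credentials, table and
// column names are placeholders, not cattaDoc's real schema.
function openDatabase($adodbDir, $driver, $host, $user, $password, $database)
{
    $include = $adodbDir . '/adodb.inc.php';
    if (!is_file($include)) {
        return null; // ADOdb is not installed at this location
    }
    require_once $include;

    $db = ADONewConnection($driver); // e.g. 'mysqli' or 'postgres'
    if (!$db || !$db->Connect($host, $user, $password, $database)) {
        return null;
    }
    $db->SetFetchMode(ADODB_FETCH_ASSOC); // rows as column-name => value
    return $db;
}

$db = openDatabase(__DIR__ . '/adodb', 'mysqli', 'localhost', 'user', 'secret', 'cattadoc');
if ($db !== null) {
    // The same calls are used regardless of the underlying database driver.
    $rs = $db->Execute('SELECT id, title FROM documents WHERE id > ?', array(0));
    while ($rs && !$rs->EOF) {
        echo $rs->fields['title'], "\n";
        $rs->MoveNext();
    }
}
```

Only the driver name passed to ADONewConnection changes when another database is used; the query code stays the same.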

 

mcrypt - optional

Mcrypt is an encryption package and is used for Permission control in cattaDoc.

If you only use Basic control in cattaDoc, mcrypt is not needed. It is only required for Enhanced control with access rights.

Note: mcrypt is part of PHP (operating system-specific) and is therefore not included in the full cattaDoc download package.

 

PEAR Mail_Mime - optional

PEAR Mail_Mime provides a set of classes to deal with the creation and manipulation of MIME messages. It allows people to create e-mail messages including text and HTML parts, attachments etc.

By adding PEAR Mail_Mime to cattaDoc, you can mail documents to recipients directly from the Send document event in cattaDoc.

Using PEAR Mail_Mime requires access to an SMTP mail server.

You can download PEAR Mail_Mime from pear.php.net.

You integrate PEAR Mail_Mime in cattaDoc through System administration --> Configuration / System constants.

PEAR Mail_Mime replaces HTML Mime Mail, which was used in versions of cattaDoc prior to version 5. HTML Mime Mail is no longer supported.

Note: PEAR Mail_Mime is part of PHP (operating system-specific) and is therefore not included in the full cattaDoc download package.
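
The following sketch shows how a message with text, HTML and an attachment is typically built with Mail_Mime and handed to an SMTP server through PEAR Mail. It assumes both PEAR packages are on the include path; the function names, addresses and host are hypothetical, and this is not cattaDoc's own Send document code.

```php
<?php
// Assumptions: PEAR Mail and Mail_Mime are on the include path; function
// names, addresses and the SMTP host are placeholders.
function buildMimeMessage($from, $subject, $text, $html, $attachment = null)
{
    if (!stream_resolve_include_path('Mail/mime.php')) {
        return null; // Mail_Mime is not installed
    }
    require_once 'Mail/mime.php';

    $mime = new Mail_mime();
    $mime->setTXTBody($text);
    $mime->setHTMLBody($html);
    if ($attachment !== null && is_file($attachment)) {
        $mime->addAttachment($attachment, 'application/octet-stream');
    }

    // get() must be called before headers().
    $body = $mime->get();
    $headers = $mime->headers(array('From' => $from, 'Subject' => $subject));
    return array($headers, $body);
}

function sendViaSmtp($to, array $message, $smtpHost)
{
    if (!stream_resolve_include_path('Mail.php')) {
        return false; // PEAR Mail is not installed
    }
    require_once 'Mail.php';
    $smtp = Mail::factory('smtp', array('host' => $smtpHost, 'port' => 25));
    if (!is_object($smtp) || !method_exists($smtp, 'send')) {
        return false; // the SMTP driver could not be created
    }
    return $smtp->send($to, $message[0], $message[1]) === true;
}
```

A call could look like sendViaSmtp('recipient@example.org', $message, 'smtp.example.org'), where $message is the non-null result of buildMimeMessage().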

 

dompdf - optional

dompdf is a package which can convert HTML files to PDF files.

Sounds too good to be true - and often it is. dompdf works fine for simpler documents, like an invoice. But for larger documents, e.g. with tables spanning more than one page, it often fails.

If you know of a better package or way to automatically convert HTML files to PDF files, please leave a comment!

You can download dompdf from github.com/dompdf/dompdf/releases.

You integrate dompdf in cattaDoc through System administration --> Configuration / System constants.
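
For reference, an HTML-to-PDF conversion with dompdf can be sketched as follows. This assumes dompdf 0.7 or later with its classes autoloadable (for example through Composer); older releases use a different class name and API. The helper name htmlToPdf is hypothetical.

```php
<?php
// Assumptions: dompdf 0.7 or later is installed and autoloadable; older
// releases expose a different class name and API.
function htmlToPdf($html)
{
    if (!class_exists('Dompdf\\Dompdf')) {
        return null; // dompdf is not available
    }
    $dompdf = new \Dompdf\Dompdf();
    $dompdf->loadHtml($html);
    $dompdf->setPaper('A4', 'portrait');
    $dompdf->render();
    return $dompdf->output(); // the PDF as a binary string
}

$pdf = htmlToPdf('<h1>Invoice</h1><p>Total: 100.00</p>');
if ($pdf !== null) {
    file_put_contents('invoice.pdf', $pdf);
}
```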

 

Plain text extraction utilities

In order to enable full-text search, the plain text elements of documents are extracted and stored in the database. This requires the use of different utilities.

Encoding - required

The package Encoding - ref. github.com/neitanod/forceutf8 - is used to encode plain text from HTML files and ordinary text files into UTF-8 irrespective of the original encoding.

Encoding is required for full-text indexing of HTML and text files and is included in both download versions of cattaDoc.
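
A conversion step of this kind can be sketched as below. It assumes the forceutf8 library is autoloadable under its ForceUTF8 namespace; the copy bundled with cattaDoc may be loaded differently. The mbstring fallback guesses ISO-8859-1 and is only a stand-in for the library.

```php
<?php
// Assumptions: forceutf8 is autoloadable as ForceUTF8\Encoding; the mbstring
// fallback guesses ISO-8859-1 and is only a stand-in for the library.
function toUtf8($text)
{
    if (class_exists('ForceUTF8\\Encoding')) {
        return \ForceUTF8\Encoding::toUTF8($text);
    }
    if (preg_match('//u', $text) === 1) {
        return $text; // already valid UTF-8
    }
    if (function_exists('mb_convert_encoding')) {
        return mb_convert_encoding($text, 'UTF-8', 'ISO-8859-1');
    }
    return $text; // no converter available, leave unchanged
}

echo toUtf8("caf\xE9"), "\n"; // ISO-8859-1 input
```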

 

Strip out (X)HTML tags and invisible content - required

The PHP package strip_html_tags - ref. nadeausoftware.com/articles/2007/09/php_tip_how_strip_html_tags_web_page - removes tags and invisible content from HTML files so that only the plain text elements are left behind.

strip_html_tags is required for full-text indexing of HTML files and is included in both download versions of cattaDoc.
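
The general technique can be sketched as follows. This is a simplified stand-in written for illustration, not the strip_html_tags package itself, and the function name htmlToPlainText is hypothetical.

```php
<?php
// A simplified stand-in for the approach, not the strip_html_tags package.
function htmlToPlainText($html)
{
    // Drop elements whose content is never shown to the reader.
    $html = preg_replace(
        '#<(head|script|style|noscript)\b[^>]*>.*?</\1\s*>#is',
        ' ',
        $html
    );
    // Put a space before block-level tags so adjacent blocks do not run together.
    $html = preg_replace(
        '#</?(p|div|br|li|tr|td|th|h[1-6]|table|ul|ol)\b[^>]*>#i',
        ' $0',
        $html
    );
    $text = strip_tags($html);
    $text = html_entity_decode($text, ENT_QUOTES, 'UTF-8');
    // Collapse runs of whitespace into single spaces.
    return trim(preg_replace('/\s+/', ' ', $text));
}

echo htmlToPlainText('<p>Hello <b>world</b></p><script>track();</script>'), "\n";
```

Removing script and style elements before calling strip_tags matters because strip_tags only removes the tags and would otherwise leave their contents in the indexed text.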

 

Filetotext - required

The PHP class Filetotext - ref. www.phpclasses.org/ - is used to extract plain text from newer Microsoft Word documents (.docx).

Filetotext is required for full-text indexing of .docx documents and is included in both download versions of cattaDoc.
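
A .docx file is a zip container whose main text is stored in word/document.xml, so the extraction can be sketched as below. This is not the Filetotext class itself; it assumes the zip extension (ZipArchive) is enabled, and the function name docxToPlainText is hypothetical.

```php
<?php
// Not the Filetotext class itself: a sketch of the underlying idea, assuming
// the zip extension (ZipArchive) is enabled.
function docxToPlainText($path)
{
    if (!class_exists('ZipArchive') || !is_file($path)) {
        return null;
    }
    $zip = new ZipArchive();
    if ($zip->open($path) !== true) {
        return null; // not a readable zip container
    }
    // The main body of a .docx file is stored as word/document.xml.
    $xml = $zip->getFromName('word/document.xml');
    $zip->close();
    if ($xml === false) {
        return null;
    }
    // End each paragraph with a newline, then drop all XML markup.
    $xml = str_replace('</w:p>', "</w:p>\n", $xml);
    return trim(html_entity_decode(strip_tags($xml), ENT_QUOTES, 'UTF-8'));
}
```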

 

pdftotext - optional

pdftotext is an operating system component used by PHP in cattaDoc to extract plain text from PDF documents to enable full-text search.

pdftotext is part of the Xpdf software suite which is also ported to Windows. Poppler, which is derived from Xpdf, also includes an implementation of pdftotext. On most Linux distributions, pdftotext is included as part of the poppler-utils package, installed by default in many distributions.

www.foolabs.com/xpdf/home.html is the official home site for Xpdf from where the Windows version can be downloaded.

Note: pdftotext is operating system-specific and therefore not included in the full cattaDoc download package.
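
Calling pdftotext from PHP can be sketched as follows. It assumes the pdftotext binary is on the PATH and that exec() is permitted; the function names are hypothetical and this is not cattaDoc's own code.

```php
<?php
// Assumptions: the pdftotext binary is on the PATH and exec() is allowed.
function buildPdftotextCommand($pdfPath)
{
    // "-enc UTF-8" selects the output encoding; "-" sends the text to stdout.
    return 'pdftotext -enc UTF-8 ' . escapeshellarg($pdfPath) . ' -';
}

function pdfToPlainText($pdfPath)
{
    if (!is_file($pdfPath) || !function_exists('exec')) {
        return null;
    }
    $lines = array();
    exec(buildPdftotextCommand($pdfPath), $lines, $status);
    // A non-zero exit status means pdftotext is missing or could not read the file.
    return $status === 0 ? implode("\n", $lines) : null;
}
```

The file name is passed through escapeshellarg so that spaces or shell metacharacters in document names cannot alter the command.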

 

Download PHP

PHP is open source and is included in most Linux distributions or can easily be installed using a Linux package manager, e.g. YaST in openSUSE or Synaptic Package Manager in Ubuntu.

Two Windows PHP packages are available in a number of different versions, including 32-bit and 64-bit.

In earlier versions of PHP, gettext and mcrypt were not included in the installer package, but this seems to be history now.

 

Enable gettext and mcrypt

Make sure that both gettext and mcrypt are available and enabled.

In Windows, both are separate DLL (dynamic link library) files:

  • php_gettext.dll
  • php_mcrypt.dll

Install and / or enable gettext and mcrypt in Windows:

  1. Extract the gettext and mcrypt DLL files from the PHP zip package. They are in the ext subfolder
  2. Copy php_gettext.dll and php_mcrypt.dll to the extensions subfolder of the PHP installation, e.g. C:\PHP\ext
  3. Edit the PHP configuration file, php.ini, e.g. C:\PHP\php.ini, to include gettext and mcrypt support by removing the semicolon in front of each of these lines:
    extension=php_gettext.dll
    extension=php_mcrypt.dll
  4. Make sure that the extensions folder is correctly defined in php.ini, e.g.:
    extension_dir = C:\PHP\ext
  5. In php.ini, adjust the maximum file size for cattaDoc documents, e.g. 10 MB:
    upload_max_filesize = 10M
  6. Restart your web server
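
After the restart, the result can be verified from a small script. The sketch below reports whether the extensions are loaded and which upload limit is in effect; the helper name checkPhpSetup is hypothetical.

```php
<?php
// Reports whether the given extensions are loaded and which upload limit is
// in effect. The helper name is hypothetical.
function checkPhpSetup(array $extensions)
{
    $report = array();
    foreach ($extensions as $name) {
        $report[$name] = extension_loaded($name);
    }
    $report['upload_max_filesize'] = ini_get('upload_max_filesize');
    return $report;
}

foreach (checkPhpSetup(array('gettext', 'mcrypt')) as $key => $value) {
    echo $key, ': ', is_bool($value) ? ($value ? 'enabled' : 'missing') : $value, "\n";
}
```

Run the script through the web server rather than the command line, since the two can use different php.ini files.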

cattaDoc runs with the php.ini parameter register_globals = Off, and also with register_globals = On. Note that register_globals was removed in PHP 5.4, so the setting only matters on older PHP versions.

 




 
Revised: 2016-01-13