==============================
How-To configure eZDFS cluster
==============================

About
=====

eZDFS stands for Distributed File System. This handler's principle can be
summarized as follows: cluster files are mainly stored on NFS, while their
metadata (size, mtime, expiry status) are maintained in a DB table similar to
the one used by eZDB.

NFS is used to read and write the reference copy of clusterized files. Cache
files will be copied locally when used by a frontend. Images & binary files
(when accessed directly via the browser) will be streamed directly from NFS.

eZDFS: Global architecture configuration
========================================

The most important aspects of the architecture are the NFS mount point and the
cluster database. Each instance of eZ Publish sharing the same relational
database has to use the same cluster database, and each should have a local
mount point to the same NFS export, configured in exactly the same way.

Example::

    NFS-server:/exports/myvardir
    eZpublish-server1:/var/www/var/nfsmount => mount of NFS-server:/exports/myvardir
    eZpublish-server2:/var/www/var/nfsmount => mount of NFS-server:/exports/myvardir
    eZpublish-server3:/var/www/var/nfsmount => mount of NFS-server:/exports/myvardir

The var directories should in NO CASE be shared among instances, since they
will automatically be synchronized. This is valid for both eZDB and eZDFS. The
cluster handlers take care of synchronizing data from/to the centralized
repository.

Installing
==========

Creating the database structure
-------------------------------

The database structure required to hold clusterized file information has to be
created. This table can be created either on the same MySQL server as the one
used for the relational database, or on a different one. For large-scale
websites, a dedicated MySQL server can significantly improve performance.

The definition of this table can be found in the eZDFS MySQL driver class file
(kernel/private/classes/clusterfilehandlers/dfsbackends/mysql.php). It is
identical to the one below::

    CREATE TABLE ezdfsfile (
      `name` text NOT NULL,
      name_trunk text NOT NULL,
      name_hash varchar(34) NOT NULL DEFAULT '',
      datatype varchar(60) NOT NULL DEFAULT 'application/octet-stream',
      scope varchar(25) NOT NULL DEFAULT '',
      size bigint(20) unsigned NOT NULL DEFAULT '0',
      mtime int(11) NOT NULL DEFAULT '0',
      expired tinyint(1) NOT NULL DEFAULT '0',
      `status` tinyint(1) NOT NULL DEFAULT '0',
      PRIMARY KEY (name_hash),
      KEY ezdfsfile_name (`name`(250)),
      KEY ezdfsfile_name_trunk (name_trunk(250)),
      KEY ezdfsfile_mtime (mtime),
      KEY ezdfsfile_expired_name (expired,`name`(250))
    ) ENGINE=InnoDB;

NFS
---

Since eZDFS is based on NFS, a local mount point to an NFS server has to be
available and writable by the web server's user on each server that runs
eZ Publish. How the mount points are configured depends on the software vendor
of the NFS solution.

Configuring the cluster handler
-------------------------------

Each eZ Publish instance that shares the same var directory has to be
configured correctly in order to get the cluster handler working. Everything
is configured in file.ini (you should of course create a global override for
this file).

Setting the cluster handler to eZDFS
''''''''''''''''''''''''''''''''''''

This is done by setting the ClusteringSettings.FileHandler directive in
file.ini to eZDFSFileHandler::

    # in settings/override/file.ini.append.php
    [ClusteringSettings]
    FileHandler=eZDFSFileHandler

Configuring the handler itself is done in the same file, in the INI block
eZDFSClusteringSettings. Most settings are self-explanatory; the comments
below give more details::

    # in settings/override/file.ini.append.php
    [eZDFSClusteringSettings]
    # Path to the NFS mount point
    # Can be relative to the eZ Publish root, or absolute.
    # The example below will work for a mount of NFS to the subdirectory
    # var/nfsmount in the eZ Publish root. Any location is possible
    MountPointPath=var/nfsmount
    # Database backend (only one available at the moment)
    DBBackend=eZDFSFileHandlerMySQLBackend
    # Hostname of the MySQL server where the ezdfsfile table lies
    DBHost=localhost
    # MySQL port
    DBPort=3306
    # Database name
    DBName=ezpublishcluster
    # Database user
    DBUser=ezpublish
    # Password for the user above
    DBPassword=ezpublish

Configuring the index_* files for clustering
--------------------------------------------

In order to make cluster files available via direct HTTP requests (images,
binary files, etc.), a configuration script and rewrite rules are required.
The configuration scripts are included in the release. The one used by Apache
to serve binary file requests is index_cluster.php (included in the backport
release), and it will be referenced in the rewrite configuration.

This file contains configuration constants, and will include the file
index_image.php (part of the eZ Publish distribution) when executed. That file
will then include `index_image_dfscluster.php`, the final script that reads
and streams the files to the browser.

You will find below an example for index_cluster.php, matching the INI
configuration above (settings will usually match between INI and this file):

Cluster rewrite rules
---------------------

mod_rewrite is mandatory to get the above index_cluster.php script to deliver
binary files & images. The rewrite rules are actually quite simple: they
rewrite every request for a content image / binary file to index_cluster.php,
which will then deliver the files directly through HTTP from the NFS server.
These rules are the same as the ones used for eZDB
(http://ez.no/doc/ez_publish/technical_manual/4_0/features/clustering/setting_it_up).
They are reproduced below for convenience::

    RewriteEngine On
    RewriteRule ^/var/([^/]+/)?storage/images-versioned/.* /index_cluster.php [L]
    RewriteRule ^/var/([^/]+/)?storage/images/.* /index_cluster.php [L]

    # Other eZ Publish rewrite rules

These rules *have* to be placed *before* the standard eZ Publish rewrite
rules.

Note: As for the standard cluster, it is strongly recommended to use some sort
of reverse proxy to cache the binary file requests, since these directly
query the DB (eZDB) or NFS + DB (eZDFS) and may lead to performance issues.

Clusterizing the files
----------------------

Once everything is correctly configured, binary data (images & files) has to
be clusterized. This can be achieved using the clusterize.php script::

    php bin/php/clusterize.php -v

The script will push all the images found in ezimage and the binary files
found in ezbinaryfile to the ezdfsfile table (metadata), and the files
themselves to the configured NFS mount point.

*Note:* this process can take a long time depending on the number of files the
eZ Publish instance contains. If the process is interrupted for any reason, it
can be safely restarted.

Clear the eZ Publish cache
--------------------------

Use the command line ezcache.php script::

    $ezpublishroot: php bin/php/ezcache.php --clear-all

If you are configuring multiple servers, execute this command on each server.

Troubleshooting
===============

Once everything has been configured and binary content has been clusterized,
testing can sometimes be complicated.

First, override the debug.ini file (in
settings/override/debug.ini.append.php), and enable the kernel-clustering
debug key::

    [GeneralCondition]
    kernel-clustering=enabled

This will give you much more debug information about the cluster.

Refresh any page of your website. The page itself should work correctly. If
images are not displayed, it is most likely that something is wrong with
either your rewrite rules or your index_cluster.php file.
The best way to locate the error is to load an image directly in the browser (usually using the contextual menu on an image, and choosing "Open image in a new tab" or similar). If a "Module not found" error is shown instead of the image, this means that your rewrite rule is not correctly configured. If a PHP error is shown, it means that your index_cluster.php configuration might be wrong.
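
The manual check described above can also be scripted. The following Python sketch is not part of eZ Publish: the function names ``fetch`` and ``diagnose_image_response`` are hypothetical, and the PHP error markers it searches for are an assumption based on PHP's default error output (they will not appear if error display is disabled). It only illustrates the decision logic: an image response means the setup works, a "Module not found" page points to the rewrite rules, and a PHP error points to index_cluster.php.

```python
import urllib.error
import urllib.request

# Assumption: typical prefixes of PHP's default on-screen error messages.
PHP_ERROR_MARKERS = ("Fatal error", "Parse error", "Warning:", "Notice:")


def fetch(url, timeout=10.0):
    """Return (status, content_type, start_of_body) for a URL.

    HTTP error responses (4xx/5xx) are returned rather than raised, so that
    their body can be inspected as well.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = resp.read(4096).decode("utf-8", "replace")
            return resp.status, resp.headers.get("Content-Type", ""), body
    except urllib.error.HTTPError as err:
        body = err.read(4096).decode("utf-8", "replace")
        return err.code, err.headers.get("Content-Type", ""), body


def diagnose_image_response(status, content_type, body):
    """Map the response for a clustered image URL to a likely cause."""
    if status == 200 and content_type.lower().startswith("image/"):
        return "ok"
    if "Module not found" in body:
        # The request reached eZ Publish's index.php instead of index_cluster.php.
        return "rewrite-rule"
    if any(marker in body for marker in PHP_ERROR_MARKERS):
        # index_cluster.php was executed but failed.
        return "index_cluster"
    return "unknown"
```

A possible use is ``diagnose_image_response(*fetch(image_url))``, where ``image_url`` is the address of any content image copied from a page of the site. A result of ``"unknown"`` means the response matched none of the cases described above and needs to be inspected by hand.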