***************************************************************
** README (Draft) for Parallel ARC on Struts and Lucene      **
** Dlib Research, Old Dominion University                    **
** Last Updated: 04/02/2006   By: Yang Zhao                  **
***************************************************************

1. Package Directory
=====================
src/                    Source Java Code and Resource/Property Files
  java/                 Java Source Code of ARC
  web/                  Java Server Pages, HTML, Images and Style Sheets
  struts/               Struts Deployment Descriptors and Configs
conf/                   Application Configuration/Resource Files (*.properties, *.xml)
  oai/                  Resource Files for Metadata Provider
  server/               Resource Files for Cluster Node Applications
  harvester/            Resource Files for Metadata Harvester
  struts/               Resource Files for Struts Web Application
  cluster.xml           Cluster Configuration (IP, Port, etc.)
  dataProvider.xml      Data-Provider List for Harvester
  searchFields.xml      Field Names of the Metadata Record to be queried (Do Not Change)
lib/                    Java Libraries
  classpath/            Libraries needed for Compiling
  ext/                  Libraries needed at Run-time (such as JDBC or Lucene)
build/                  Destination for Compiled Binaries and Built Packages
  class/                Java Classes and Resource Files
  tar/                  Temp directory for making TAR
  oai_arc.war           Web Application
  node.tar.gz           Service Application on cluster nodes
  harvester.tar.gz      Harvester Application
ant.properties          Properties for build.xml (for user to customize)
build.xml               Ant Build File
README.txt              This File

2. Required Environment
========================
  * J2SE 1.5.0
  * Tomcat 5.0
  * Struts 1.1 (already included under /lib)
  * Lucene 1.4.3 (already included under /lib)

3. Software Configuration
===========================

(1) Web Application Component
--------------------------------
  1. Under $PROJECT_HOME/conf, edit "cluster.xml" to set the information
     for every server in the Lucene cluster:
     * IPADDR   - IP address of the RMI registry on a cluster node
     * PORT     - Port number of the RMI registry
     * HOSTNAME - Hostname of the cluster node (optional)
  2. Under $PROJECT_HOME/conf/oai, update the OAI 'Identify' information
     for the Metadata Repository in "IdentifyBundle.properties".
  3. For Harvester Administration, set the deployment home directory of
     the harvester in
     $PROJECT_HOME/conf/struts/ApplicationResources.properties, for example:
       Harvester.Home.Dir=E:/dlib/Merge_ARC/build/class

(2) Harvester Application
-------------------------------------
  Go to $PROJECT_HOME/conf, then:
  1. Set the server information of the Lucene cluster in "cluster.xml"
     (same as step 1 of (1) above).
  2. Set the data-provider list in "dataProvider.xml". If you use web
     administration, also set the harvester home directory as described
     in step 3 of (1) above.
  3. Set the scheduling properties in "ScheduleBundle.properties":
     * SLEEP_BETWEEN_HARVEST - Interval between two consecutive harvests (in seconds)
     * SLEEP_BETWEEN_REQUEST - Interval between two consecutive OAI requests (in milliseconds)
     * DP_LIST_SOURCE        - Source of the data-provider list (currently only "xml" is supported)
     * DISTRIBUTED_LIST_SIZE - Number of records distributed to one node at a time

(3) Cluster-side Application Component
-----------------------------------------------
  Go to $PROJECT_HOME/conf/server and set the following properties in
  "LuceneNodeBundle.properties":
     * index.directory         - The directory that stores the index of the metadata
     * deleted.index.directory - The directory that stores the index of the deleted records

4. Compile and Build
=======================

(1) Edit the following properties in the file ant.properties
---------------------------------------------------------
  1. "dir.project"   - Path to the project home, $PROJECT_HOME
  2. "dir.appserver" - Path to the Tomcat server home (if it is accessible
                       through network hard drives), $TOMCAT_HOME
  3. "dir.cluster"   - Path to the cluster-service application home (if it is
                       accessible through network hard drives), $CLUSTER_HOME
  4. "dir.harvester" - Path to the harvester application home, $HARVESTER_HOME

(2) Clean the old classes and packages
---------------------------------------
  % ant -buildfile build.xml clean

(3) Build the war file for the web application on Tomcat
----------------------------------------------------
  % ant -buildfile build.xml build-web

(4) Build the server-side application for the Lucene cluster
--------------------------------------------------------
  % ant -buildfile build.xml build-server

(5) Build the harvester application (not yet finished)
----------------------------------------------
  % ant -buildfile build.xml build-harvester

5. Deploy
===============

(1) Web application
--------------------
  % ant -buildfile build.xml deploy-web
  OR copy build/war/oai_arc.war to $TOMCAT_HOME/webapps/

  NOTE: To change the username/password for the administration page, edit
  the file tomcat-users.xml under $PROJECT_HOME/src/struts, then copy it to
  $TOMCAT_HOME/conf/tomcat-users.xml. The default login is yang/yang or
  maly/maly.

(2) Cluster-Service application
---------------------------------
  % ant -buildfile build.xml deploy-server
  OR manually copy 'build/node.tar.gz' to $CLUSTER_HOME, then run
  "gunzip node.tar.gz" and "tar -xvf node.tar" to unpack the package.
  It comprises the following contents:
    bin/          Shell Scripts (setting classpath, etc.)
    conf/         LuceneNodeBundle.properties, server.policy, *.xml
    lib/          Third-party Libraries
    node.jar      Application Jar
    data/         Store of Lucene Index
      index/      index
      deleted/    deleted index

(3) Harvester application
-------------------------------
  % ant -buildfile build.xml deploy-harvester
  OR manually copy 'build/harvester.tar.gz' to $HARVESTER_HOME, then unpack
  it with "gunzip" and "tar -xvf".
  It comprises the following contents:
    conf/           harvester.policy, ScheduleBundle.properties
    lib/            Third-party Libraries
    harvester.jar   Application Jar

6. Start Service on Cluster Node
==================================
  1. Check that the properties "index.directory" and
     "deleted.index.directory" in 'conf/LuceneNodeBundle.properties' have
     been set correctly. Where stricter security is required, update the
     permission settings in server.policy. At this stage of development it
     grants full access privileges by default.
  2. Start the service server with a command in the following format:

     $ java -Djava.security.policy=conf/server.policy -jar node.jar [IP:Port] [harvest|search|all]
       * harvest: Start Harvester Service only
       * search : Start Lucene Search Service only
       * all    : Start both

  For example (HOST=128.82.4.7, PORT=4455):
  --------------------------------------------
  (1) Start Cluster-Service for Harvest Only
      $ java -Djava.security.policy=conf/server.policy -jar node.jar 128.82.4.7:4455 harvest
  (2) Start Cluster-Service for Search Only
      $ java -Djava.security.policy=conf/server.policy -jar node.jar 128.82.4.7:4455 search
  (3) Start All Services (after some data has been loaded)
      $ java -Djava.security.policy=conf/server.policy -jar node.jar 128.82.4.7:4455 all

7. Start Web Application and Harvester Application
===================================================

(1) Harvester
--------------------------
  1. Under $HARVESTER_HOME/conf, make sure 'dataProvider.xml',
     'cluster.xml', and 'ScheduleBundle.properties' have been configured
     correctly.
  2. Start the Harvester:
     $ java -Djava.security.policy=conf/harvester.policy -jar harvester.jar [-s|-l]
       * "-s" runs the harvest one time
       * "-l" runs the scheduled task repeatedly

(2) Web Application
------------------------
  Start the Tomcat server after the web component has been deployed
  correctly.
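
  The harvester start-up steps in 7 (1) can be wrapped in a small shell
  helper that checks the three configuration files before launching. The
  following is only a sketch: the function name harvester_cmd is
  hypothetical and not part of the ARC package, and the file name
  ScheduleBundle.properties is assumed from section 3. It prints the start
  command rather than running it, so the command can be reviewed first.

```shell
#!/bin/sh
# Sketch only: harvester_cmd is a hypothetical helper, not shipped with ARC.
# It validates the run flag, checks the configuration files named in
# section 7 (1), and prints the harvester start command.
harvester_cmd() {
    home="$1"   # harvester home directory ($HARVESTER_HOME)
    flag="$2"   # -s (run one time) or -l (run the scheduled task repeatedly)

    case "$flag" in
        -s|-l) ;;
        *) echo "usage: harvester_cmd HARVESTER_HOME -s|-l" >&2
           return 2 ;;
    esac

    # File names are assumed from this README; adjust if the package differs.
    for f in dataProvider.xml cluster.xml ScheduleBundle.properties; do
        if [ ! -f "$home/conf/$f" ]; then
            echo "missing configuration file: $home/conf/$f" >&2
            return 1
        fi
    done

    # The policy path is relative, so run the printed command from $home.
    echo "java -Djava.security.policy=conf/harvester.policy -jar harvester.jar $flag"
}
```

  For example, after sourcing the helper, harvester_cmd "$HARVESTER_HOME" -s
  prints the one-time harvest command. Run the printed command from
  $HARVESTER_HOME, because the policy file path is relative to that
  directory.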