DAOS Estimator 1.5 – Performance Improvement and Best Practices
IBM has just released DAOS Estimator version 1.5 for public download. I tested this new version internally during its “beta”, and all I can say is that it improves performance VASTLY. I ran it on a very large mail server, with around 3,500 mail files (~900GB of data).
For analyzing large mail servers, I would recommend the following:
- Make use of your cluster servers – running DAOS Estimator on your cluster servers will save your primary servers from a performance hit if you’re running in “hot standby” mode.
- Run one DAOS Estimator task per mail directory – this allows processing to happen in more than one maildir at a time
- Run daosest -c to output attachment data to a .csv file to be analyzed later – this allows you to run daosest faster on your servers
- Upgrade to DAOS Estimator 1.5 – processing will improve VASTLY!
So I ran with the following settings:
On the server:
- Run one daosest -c task per mail directory. I ran 10 simultaneous daosest tasks for mail1-10, and processing took roughly 2 hours for all 3,500 mail files. This was done on the cluster side, during off-prime shift hours.
- Generate an .ind file containing all the DAOSEST*.CSV files generated from your daosest -c command on the server.
- To analyze the output from daosest -c, run daosest -a INPUT_FILE.IND -o OUTPUT_LOG.txt. Here daosest analyzes (-a) “INPUT_FILE.IND” (the .ind file you generated, listing the different .CSV files) and then outputs (-o) the analysis to a log file for you to review later.
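If you have many DAOSEST*.CSV files, building the .ind file by hand gets tedious. Here is a minimal sketch of how it could be scripted. It assumes the .ind file is plain text with one CSV path per line, which I have not confirmed against the DAOS Estimator documentation, so check the expected format first. The function name write_ind_file and its parameters are made up for this example.

```python
import fnmatch
from pathlib import Path


def write_ind_file(csv_dir, ind_path, pattern="DAOSEST*.CSV"):
    """Write the matching CSV paths to ind_path and return them as strings.

    Assumption: the .ind file is plain text with one path per line.
    """
    wanted = pattern.lower()
    # Compare lower-cased names so matching behaves the same on any OS.
    csv_files = sorted(
        p for p in Path(csv_dir).iterdir()
        if p.is_file() and fnmatch.fnmatchcase(p.name.lower(), wanted)
    )
    listed = [str(p) for p in csv_files]
    with open(ind_path, "w", encoding="utf-8") as ind:
        for entry in listed:
            ind.write(entry + "\n")
    return listed
```

Calling write_ind_file with the directory holding the CSV output and a target such as INPUT_FILE.IND gives you a file you can try passing to daosest -a. The paths are written exactly as built from the directory you pass in, so use an absolute directory if daosest runs from somewhere else.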
On the initial run with version 1.4, this took in excess of 7 hours on some servers. With version 1.5, the processing time dropped to 6 MINUTES!
Save yourself the hassle and upgrade to DAOS Estimator 1.5 immediately!
April 7th, 2010 at 2:31 am
Hello,
Is there a way to specify a path, rather than the default binaries location, for storing the CSV output?
Thanks for your help