
Hello JunHua, Olivier,

I have been quite busy setting things up for the mass production of electric field computations. The code seems to be stable now; at least it runs on both my local SL6 computers and at CC-IN2P3, so I committed it to SVN. I'll try to summarize the changes here. You can also have a look at the commit comments at http://svn.in2p3.fr/trend.

The philosophy was to keep things simple by storing all data locally, either on a PC used for the production or on /sps/trend at CC-IN2P3, since the volume of data is not that big. Also, the good thing with using the grid is that you can control everything directly from your own PC: you don't need to connect to Lyon, which makes things easier from China.

The bookkeeping of the data and jobs is done with a local sqlite DB, so production should really be centralized where this DB is; otherwise there will be conflicts. Nevertheless, the DB can be rebuilt at any time from the data set, in case it gets corrupted, lost or whatever, so the DB is only there as a tool to centralize information and is not critical. This is achieved by using a fixed format for the data storage, as follows:

    MainFolder/
        showers/
        fields/
        *antennas*.txt

where MainFolder identifies the production, for example `trend-50`. There is an example in /sps/trend/trend-50/. The subfolders showers/ and fields/ contain the grid job results, as is the case in /sps/trend/simu/conex now. The file *antennas*.txt gives the antenna names and coordinates.

One thing I changed in all the interface functions of trasi.Interface is that they can read directly from .tar.gz archives without untarring them, in order to guarantee the integrity of the data, which are not modified at all that way. I came across some troubles with files that had been unzipped but not re-zipped, for example, maybe due to crashes?

Also, I provided a set of quick and easy functions for grid usage, similar to a standard batch system: gstat, gsub, gkill and gget. The last one is for retrieving the output of grid job(s), which is not automatic as on a standard batch system. In order to use them you will need to get the latest version of the software from SVN and to have a working version of DIRAC installed first. Then, from python/production, you can run `make install`, which will generate a setup.(c)sh script to source as initialization. BEWARE that the installation must be done after having sourced the dirac/bashrc or dirac/cshrc script, ideally starting from a fresh session. For further use it should be enough to source only the setup.(c)sh script generated at install time.

I also added an install procedure for the plot-* scripts in trasi. As previously, installation is done with `make install` from trasi. Start from a fresh session again. The reason is that DIRAC uses its own python (2.6 flavour) and there can be conflicts between the standard python install and the DIRAC one: some DIRAC modules do not run with the standard python 2.6 and, conversely, some modules can crash under DIRAC's python.

The software is already installed at CC-IN2P3, as well as a working version of DIRAC. If you are registered with the France-Asia VO you can test it from there. You'll need to copy the setup.(c)sh scripts from trasi/scripts and production/scripts to a work directory in your home and source them; they do not seem to conflict at CC. Then initialize your DIRAC proxy with dirac-proxy-init.py. Please do not run the scripts from the $THRONG_DIR, which hosts the SVN-managed version of the software meant for common use.
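As an illustration of the archive handling mentioned above, here is a minimal sketch of reading a file straight out of a .tar.gz without extracting it, using Python's tarfile module. This is not the actual trasi.Interface code, and the file names in the usage comment are made up for the example:

    import tarfile

    def read_member(archive_path, member_name):
        # Open the archive read-only; nothing on disk is modified,
        # which is what preserves the integrity of the stored data.
        tar = tarfile.open(archive_path, "r:gz")
        try:
            f = tar.extractfile(member_name)  # raises KeyError if absent
            return f.read()
        finally:
            tar.close()

    # Hypothetical usage on a shower archive:
    # data = read_member("showers/shower-000001.tar.gz", "shower-000001.ftn")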
There are also 4 example scripts in production/scripts, called *tr50*, that show how to register new jobs in a local DB for the trend-50 production, how to submit them and how to manage the results. You first need to import the DB with `import-db /sps/trend/trend-50`, which will build trend-50.db. You can inspect it with `dump-db trend-50.db`, which will show its SQL-like structure. Then you can register new showers or fields with `python register-tr50-*.py` and submit them with `python submit-tr50-jobs.py`. To monitor or retrieve the results use `python get-tr50-jobs.py`. You can also list the jobs or get the data with gstat or gget, but when using the latter the job results and the data will not be managed by the bookkeeping. Nevertheless, it does no harm to use these commands. For example, for debugging purposes you can do `gget --debug -j JOBID`, which will collect all the info on the job, allowing you to re-run it locally in order to check what went wrong.

Concerning the selection of valid (antenna, shower core) pairs, I organized the register-tr50-fields.py script such that it exposes the algorithmic part at the very beginning of the code, as two functions called generate_shower_core_position() and filter_event(). So hopefully you won't need to dig into the code after that, which mostly deals with I/O between the bookkeeping, DIRAC and the stdout.

About the installation of DIRAC: erm, I had many troubles with the latest release. If you follow the instructions on the web you will get the HEAD version of DIRAC for gridfr, which does not actually work for our France-Asia VO, so you need to revert to an older version. DIRAC v6r4p4 is working fine. As an example, I attached a bash script to source, which should automate this step. You'll be asked for your GRID password twice though.

So that should be it concerning the code changes. There will probably be bugs left.

Otherwise there is the question of integrating the refit in the code. It is not handled by the bookkeeping because this step is very fast and I wanted to keep things as simple as possible. I was thinking that we could do the following code modifications in order to handle it:

- Add a Refitter class in trasi which allows both to refit a set of CONEX data and to undo the refit. For example the old data could be saved in ftn/old.tar.gz, and the new data would overwrite the existing .ftn files.
- Add a refit flag to trasi.Interface.EvaJob which would trigger the refit at the end of each new CONEX simulation. This flag would be True by default.
- Add a refit flag to production.bookkeeping.Interface.ShowerInterface which would trigger the refit at each import, if no ftn/old.tar.gz file has been located. This flag would also be True by default.

With these modifications the refit would be transparent in usage. We will also need to check that the electric field computed for refitted showers is close to the one for non-refitted showers, provided the two fits agree, to validate the new *.ftn functions generated by the refit, etc.

I could implement these changes if you provide the new fitter to me. But if you prefer to do it yourself, that's OK for me too.

Cheers,
Valentin

P.S.: to start with, I guess that the only selection we would like to do on (antenna, shower core) pairs is to make sure that the shower core is within some range of the antenna. Later we could refine this, taking the shower energy into account.
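To make that selection more concrete, here is a rough sketch of what the two hooks exposed in register-tr50-fields.py could look like for the simple distance cut described above. The function names are the ones from the script, but the signatures, the flat core generator and the cut value are only assumptions for illustration, not the actual code:

    import math
    import random

    # Assumed maximum core-to-antenna distance (m) for a valid pair.
    MAX_DISTANCE = 1000.

    def generate_shower_core_position(half_size=2000.):
        # Draw a candidate core position, here uniformly in a square of
        # +/- half_size metres around the array centre (assumed layout).
        x = random.uniform(-half_size, half_size)
        y = random.uniform(-half_size, half_size)
        return x, y

    def filter_event(core, antenna):
        # Keep an (antenna, shower core) pair only if the core falls
        # within MAX_DISTANCE of the antenna; both arguments are assumed
        # to be (x, y) tuples in metres.
        dx = core[0] - antenna[0]
        dy = core[1] - antenna[1]
        return math.hypot(dx, dy) <= MAX_DISTANCE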
P.S.2: the simulated showers in /sps/trend/simu/conex are all at 5x10^17 eV (except 3 of them or so) and with a proton primary. In a more realistic simulation their energies would be distributed according to a 1/E^2.7 power law and the primary would be randomized as well. But OK, let's start playing with the showers we already have so far ...
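If at some point we want to throw energies according to that spectrum, a standard inverse-transform sampling does the job. The sketch below assumes a differential spectrum dN/dE proportional to E^-2.7 between two arbitrary bounds; the bounds and the function name are mine, not part of the production code:

    import random

    def sample_energy(emin=1e17, emax=1e19, gamma=2.7):
        # Draw an energy (eV) from dN/dE ~ E**-gamma between emin and emax,
        # by inverting the cumulative distribution.
        g = 1. - gamma
        u = random.random()
        return (emin**g + u * (emax**g - emin**g))**(1. / g)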

-- OlivierMartineauHuynh - 2013-12-17
