First of all, it has nothing to do with thermodynamics (that is, it is _not_ the array of Boltzmann weights of the states). Wstate is just the array of weights used for constructing the density matrix for orbital optimization. If you are interested in more than one state, you should obtain orbitals that are equally good (or equally bad) for all the states of interest. If you give preference to one of the states, your resulting orbitals will be biased and, therefore, sort of meaningless.
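To make the role of wstate concrete, here is a minimal Python sketch of a weighted average of per-state density matrices. It is only an illustration, not the program's actual code: the function names are hypothetical, the normalization of the weights to unit sum is an assumption, and the two toy diagonal matrices are invented occupation numbers rather than results of any calculation.

```python
def normalize_weights(wstate):
    """Scale raw wstate-style weights so they sum to 1 (assumed convention)."""
    total = sum(wstate)
    if total <= 0:
        raise ValueError("at least one state must have a positive weight")
    return [w / total for w in wstate]


def state_averaged_density(densities, wstate):
    """Weighted sum of per-state one-particle density matrices (nested lists)."""
    weights = normalize_weights(wstate)
    n = len(densities[0])
    avg = [[0.0] * n for _ in range(n)]
    for w, dens in zip(weights, densities):
        for i in range(n):
            for j in range(n):
                avg[i][j] += w * dens[i][j]
    return avg


# Toy (hypothetical) densities for two states in a two-orbital active space.
ground = [[2.0, 0.0], [0.0, 0.0]]
excited = [[1.0, 0.0], [0.0, 1.0]]

# Equal weights: the orbitals would be optimized for a compromise density.
equal = state_averaged_density([ground, excited], [1, 1])

# Weight only on the first state: the averaged density is just the ground-state one.
biased = state_averaged_density([ground, excited], [1, 0])
```

With equal weights the averaged density sits halfway between the two states; with all the weight on one state, the second state has no influence on the orbitals at all.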
If you perform a state-averaged CASSCF for different geometries (say, along the reaction path), you'll see that the dominant configurations in your states of interest (say, ground and 1st excited state) change. These changes in the weights of individual configurations within the target states reflect the changes in the corresponding wavefunctions along the reaction path. For example, suppose the dominant configuration in the ground state of the reactant is ...2200... (the digits are the occupation numbers) with 99% weight. As you move toward the transition state, the weight of this configuration will decrease, while the weight of ...2110... will increase. The state with the dominant ...2200... configuration will become excited (therefore, a state crossing should occur at some point). In the product state, you'll again have the ground state dominated by ...2200..., but the orbitals will be different, namely, product-like (rather than reactant-like). If you perform such a calculation, you'll see it clearly.
So, the number of states to be averaged is governed by the number of terms that cross along your reaction path. If you're lucky, you'll need only the closed-shell singlet, the 1st excited singlet, and a triplet. This is typical, for example, for rotation around conjugated bonds in linear conjugated systems. Since I know nothing about your system, I cannot say anything about wstate a priori.
Moreover, you may try state-specific CASSCF (that is, with default wstate(1)=1,-0). If you see that the dominant configuration of your ground state changes substantially along the reaction path, state-averaging is needed.
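The check described above can be sketched in Python roughly as follows. This is only a toy illustration: the function names are hypothetical, and the configuration weights are made-up numbers that you would replace with the CI weights printed for your ground state at each geometry.

```python
def dominant_configuration(weights):
    """Return (label, weight) of the configuration with the largest weight."""
    label = max(weights, key=weights.get)
    return label, weights[label]


def character_changes(path):
    """Indices of path points whose dominant configuration differs from the previous point."""
    changes = []
    previous = None
    for index, weights in enumerate(path):
        label, _ = dominant_configuration(weights)
        if previous is not None and label != previous:
            changes.append(index)
        previous = label
    return changes


# Hypothetical ground-state configuration weights at three points along a path.
path = [
    {"2200": 0.99, "2110": 0.01},
    {"2200": 0.55, "2110": 0.45},
    {"2200": 0.30, "2110": 0.70},
]

changes = character_changes(path)

# Any change of dominant configuration hints that state-averaging is needed.
needs_state_averaging = bool(changes)
```

A non-empty list of change points is only a hint; you still have to look at the weights themselves to judge whether the change is substantial.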
As for the active space size... it does not have to be _large_, but it _must_ be balanced. For example, if you see symmetry breaking in the electron density where the nuclear configuration is symmetric, something is definitely wrong. In this case, you should carefully examine and change your state-averaging scheme, then revise your active space. The natural orbitals with occupancies >1.99 and <0.01 can be considered as inactive for the given state-averaging scheme and, therefore, can be excluded from the active space or replaced by some others.
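The occupancy criterion can be applied mechanically, for instance with a small Python helper like the one below. It is just a sketch: the function name and the labels are hypothetical, the thresholds are the 1.99 and 0.01 values mentioned above, and the occupation numbers are invented for illustration.

```python
def classify_natural_orbitals(occupations, upper=1.99, lower=0.01):
    """Label each natural orbital by its occupation number."""
    labels = []
    for occ in occupations:
        if occ > upper:
            labels.append("nearly doubly occupied")  # candidate to leave the active space
        elif occ < lower:
            labels.append("nearly empty")  # candidate to leave the active space
        else:
            labels.append("active")
    return labels


# Hypothetical natural orbital occupations from a state-averaged run.
occupations = [1.998, 1.93, 1.05, 0.95, 0.07, 0.004]
labels = classify_natural_orbitals(occupations)
```

Keep in mind that the classification is valid only for the state-averaging scheme that produced the occupations; after changing wstate, it has to be repeated.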
On Sun Sep 28 '14 1:25am, alex wrote
>Thanks for the comment. To be honest the voluntarism is what makes me really upset - if you have two different mcscfs converged, which one is (more) trustworthy and why? I can explain only orbital selection part of the problem - the difference is due to limited active space and better understanding/bigger size of the active space would be more accurate. But what is about nstate? I have plenty of states very close at energy scale and I am not sure that in active site of my protein the amount of excited substrate is say 0.28 or 0.5 under thermal pressure of the surrounding protein body. It can be next to zero and just waiting for kinetic energy to be focused by protein body for example, why not? Intuitively I would join reaction path/coordinate to states weight selection because for reasonably selected active space only certain route of weights increasing for desired states is leading toward the reaction products. So can you please give some insights of why do you think that ignoring of the reaction coordinate and setting states on/off is the best strategy?
>Thanks for the suggestion - I'll check aldet vs guga performance.
>On Sat Sep 27 '14 8:35pm, sanya wrote
>>I agree with Ilya: the choice of active space and state-averaging scheme is mostly a matter of trial-and-error and experience. Using localized and properly reordered orbitals may help to reduce the active space.
>>By the way, I recommend using cistep=aldet option ($DET group is used in this case instead of $DRT), which is better parallelized. If you need to average the states of only one multiplicity, set pures=.t. In this case, you may set ispin to 0 or 1, depending on which multiplicity (odd or even) you need. Don't forget to adjust NSTGSS to 3*NSTATE. If different multiplicities should be averaged, set pures=.f.
>>Playing with fractional weights in wstate is not a good idea. Actually, fractional weights are used in very special cases. Just set 1 if you want to include the state in the averaging or 0 to exclude it.