Analytical Flory Random Coil
The Analytical Flory Random Coil (AFRC) is the central model of the package, exposed through
the AnalyticalFRC object. It reproduces the dimensions of a polypeptide
behaving as an ideal (Gaussian) chain - one in which the apparent Flory scaling exponent is
\(\nu = 0.5\), analogous to a chain in a \(\theta\)-solvent - and, unlike a finite
self-avoiding chain, it carries no finite-size (“dangling-end”) effects. It is a
parameter-free reference model: it is fully determined by the amino acid sequence and has
nothing to fit. The model and its parameterisation are described in Alston et al. (2023).
Origin: numerical Flory Random Coil simulations
The AFRC is a closed-form fit to numerical Flory Random Coil (FRC) ensembles. Those ensembles are generated with all-atom Monte Carlo using Flory’s rotational isomeric state (RIS) approximation: at each step a residue is chosen at random and its backbone dihedrals (\(\phi, \psi\)) are reassigned to one of a precomputed set of residue-specific allowed states, drawn from all-atom Ramachandran maps. The moves are rejection-free - only sterically allowed local dihedrals are proposed and the resulting global conformation is accepted unconditionally, with no through-space (chain-chain, chain-solvent, or chain-self) interactions of any kind.
Two consequences follow, and together they are what make an analytical description possible:
The chain is ideal. Because every monomer is “agnostic” to its surroundings, both global and internal dimensions scale with an apparent exponent of \(\nu = 0.5\), exactly as expected for a Gaussian chain in a \(\theta\)-solvent.
There are no end effects. Terminal residues sample the same conformational space as internal ones, so the “dangling-end” finite-size deviations seen in finite self-avoiding chains are absent. The internal scaling profiles for chains of every length superimpose, which means a single set of closed-form expressions can describe the chain over all length scales.
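The two consequences above can be illustrated with a much simpler stand-in for the FRC simulations. The sketch below is not the all-atom RIS procedure: it samples a freely jointed chain (independent, isotropic bond vectors, no through-space interactions) and fits the apparent scaling exponent of the internal scaling profile. The function names, the 3.8 Å bond length, and the chain sizes are illustrative assumptions.

```python
import numpy as np

def sample_ideal_chains(n_bonds, n_chains, bond_length=3.8, seed=0):
    """Sample freely jointed chains; every bond direction is independent."""
    rng = np.random.default_rng(seed)
    # Normalised Gaussian triples give isotropically distributed unit vectors.
    steps = rng.normal(size=(n_chains, n_bonds, 3))
    steps /= np.linalg.norm(steps, axis=2, keepdims=True)
    steps *= bond_length
    origin = np.zeros((n_chains, 1, 3))
    # Bead coordinates are the running sum of bond vectors, starting at the origin.
    return np.concatenate([origin, np.cumsum(steps, axis=1)], axis=1)

def internal_scaling(coords):
    """Return separations |i-j| and the rms distance at each separation."""
    n_beads = coords.shape[1]
    seps = np.arange(1, n_beads)
    rms = np.empty(seps.size)
    for k, s in enumerate(seps):
        # All pairs (i, i+s), pooled over chains and over position along the chain.
        diff = coords[:, s:, :] - coords[:, :-s, :]
        rms[k] = np.sqrt(np.mean(np.sum(diff ** 2, axis=2)))
    return seps, rms

coords = sample_ideal_chains(n_bonds=100, n_chains=2000)
seps, rms = internal_scaling(coords)

# Slope of the log-log profile is the apparent exponent; intercept gives the prefactor.
nu_fit, log_a0 = np.polyfit(np.log(seps), np.log(rms), 1)
a0_fit = float(np.exp(log_a0))
print(f"apparent nu = {nu_fit:.3f}, prefactor = {a0_fit:.2f} A")
```

Because no bead interacts with any other, the fitted exponent should sit close to 0.5 and the prefactor close to the bond length, independent of where along the chain the pairs are taken.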
From simulations to a closed-form model
Residue-specific prefactors are fit once, against homopolymer FRC simulations, and then used analytically:
The root-mean-square inter-residue distance follows \(\sqrt{\langle r_{ij}^2 \rangle} = A_0\,|i-j|^{\nu}\); fitting the FRC internal scaling profiles yields a per-residue prefactor \(A_0\) (the \(R_0\) / \(R_0^{\mathrm{rms}}\) constants in this package).
The radius-of-gyration prefactor \(X_0\) is fit so that the analytical Lhuillier distribution matches the numerically generated \(P(R_g)\).
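As a minimal sketch of the first fit, assuming the exponent is held fixed at \(\nu = 0.5\) and the prefactor is obtained by ordinary least squares in linear space (the fitting procedure actually used for the calibration is not specified here), \(A_0\) has a closed-form solution. The function name, the synthetic profile, and the value 6.0 Å are hypothetical.

```python
import numpy as np

def fit_prefactor(separations, rms_distances, nu=0.5):
    """Least-squares A0 for rms = A0 * |i-j|**nu with nu held fixed."""
    x = np.asarray(separations, dtype=float) ** nu
    y = np.asarray(rms_distances, dtype=float)
    # Minimising sum((y - A0*x)**2) over A0 gives A0 = (x.y) / (x.x).
    return float(np.dot(x, y) / np.dot(x, x))

# Synthetic internal scaling profile with 1% multiplicative noise (illustrative only).
seps = np.arange(1, 51)
true_a0 = 6.0  # hypothetical prefactor in angstroms, not a calibrated value
rng = np.random.default_rng(1)
profile = true_a0 * np.sqrt(seps) * (1.0 + 0.01 * rng.normal(size=seps.size))

a0 = fit_prefactor(seps, profile)
print(f"fitted A0 = {a0:.3f} A")
```

Fixing \(\nu\) leaves a single free number per residue type, which is what allows the prefactors to be tabulated once and reused.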
Because the RIS construction treats each residue independently, heteropolymers are handled by taking a composition-weighted average of these homopolymer prefactors. This generalisation was validated against FRC simulations of hundreds of heteropolymeric sequences (10-500 residues), reproducing both end-to-end and \(R_g\) distributions with sub-angstrom accuracy.
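The composition-weighted average described above can be sketched as follows. The prefactor table holds made-up numbers for four residue types purely for illustration; it is not the package's calibrated table, and the function name is hypothetical.

```python
from collections import Counter

# Hypothetical per-residue prefactors in angstroms; NOT the calibrated values.
HYPOTHETICAL_A0 = {"G": 5.0, "A": 6.0, "S": 6.0, "P": 7.0}

def composition_weighted_prefactor(sequence, table):
    """Average the homopolymer prefactors, weighted by residue counts."""
    if not sequence:
        raise ValueError("sequence must contain at least one residue")
    counts = Counter(sequence)
    missing = sorted(set(counts) - set(table))
    if missing:
        raise KeyError(f"no prefactor for residues: {missing}")
    # Each residue contributes its homopolymer value once per occurrence.
    return sum(table[aa] * n for aa, n in counts.items()) / len(sequence)

hetero = composition_weighted_prefactor("GGAP", HYPOTHETICAL_A0)
homo = composition_weighted_prefactor("AAAA", HYPOTHETICAL_A0)
print(hetero, homo)
```

Only composition enters the average, so two sequences with the same residue counts in a different order receive the same prefactor.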
Mathematical formalism
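End-to-end distance. For a chain of \(N\) residues, the end-to-end distribution is the standard Gaussian chain result

\[ P(R_e) = 4\pi R_e^2 \left( \frac{3}{2\pi \langle R_e^2 \rangle} \right)^{3/2} \exp\!\left( -\frac{3 R_e^2}{2 \langle R_e^2 \rangle} \right) \]

where the root-mean-square size follows the ideal-chain scaling law

\[ \sqrt{\langle R_e^2 \rangle} = R_0^{\mathrm{rms}}\, N^{1/2} \]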
The prefactor \(R_0^{\mathrm{rms}}\) is the composition-weighted average of the per-residue \(A_0\) constants described above. An analogous prefactor \(R_0\) gives the mean end-to-end distance, \(\langle R_e \rangle = R_0\, N^{1/2}\).
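A short numerical check of the end-to-end distribution, assuming the textbook Gaussian-chain form \(P(r) = 4\pi r^2 \left(3 / 2\pi\langle r^2\rangle\right)^{3/2} \exp\left(-3r^2 / 2\langle r^2\rangle\right)\) with \(\sqrt{\langle r^2 \rangle} = R_0^{\mathrm{rms}} N^{1/2}\). The function name and the prefactor of 6.0 Å are illustrative assumptions rather than package values.

```python
import numpy as np

def end_to_end_pdf(r, n_residues, r0_rms):
    """Gaussian-chain end-to-end distance distribution P(r)."""
    mean_sq = (r0_rms ** 2) * n_residues  # <r^2> from the ideal-chain scaling law
    norm = (3.0 / (2.0 * np.pi * mean_sq)) ** 1.5
    return 4.0 * np.pi * r ** 2 * norm * np.exp(-3.0 * r ** 2 / (2.0 * mean_sq))

n_residues = 100
r0_rms = 6.0  # illustrative prefactor in angstroms
r = np.linspace(0.0, 400.0, 40001)
dr = r[1] - r[0]
p = end_to_end_pdf(r, n_residues, r0_rms)

# P(r) vanishes at both ends of the grid, so a plain Riemann sum is adequate.
area = float(np.sum(p) * dr)
rms = float(np.sqrt(np.sum(p * r ** 2) * dr))
mean = float(np.sum(p * r) * dr)
print(f"area = {area:.6f}, rms = {rms:.3f} A, mean = {mean:.3f} A")
```

For this Gaussian form the mean and rms distances differ by the fixed factor \(\sqrt{8/(3\pi)} \approx 0.921\); whether the tabulated \(R_0\) and \(R_0^{\mathrm{rms}}\) constants follow exactly that ratio is not stated here.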
Radius of gyration. The \(R_g\) distribution uses the analytical fractal-polymer form of Lhuillier (1988):
with \(\rho = X_0\, R_g\), dimensionality \(d = 3\), \(\nu = 1/2\), \(\alpha = 1/(\nu d - 1) = 2\), and \(\delta = 1/(1-\nu) = 2\). The composition-weighted prefactor \(X_0\) again comes from the calibrated per-residue table. The mean radius of gyration can be taken either as the expectation of this distribution or from the ideal-chain relation \(R_g = \langle R_e \rangle / \sqrt{6}\).
Hydrodynamic radius. \(R_h\) is available either from the Kirkwood-Riseman relation applied to the full inter-residue distance map, or from the empirical \(R_g \to R_h\) conversion of Nygaard et al. (2017).
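The Kirkwood-Riseman route can be sketched under explicit assumptions: the distance map is built from the scaling law \(A_0 |i-j|^{\nu}\), the inverse of the rms distance stands in for the ensemble average \(\langle 1/r_{ij} \rangle\), and the sum is normalised over the \(N(N-1)\) off-diagonal pairs. Conventions differ between implementations, and none of these choices is confirmed as what the package does; the function names are hypothetical.

```python
import numpy as np

def rms_distance_map(n_residues, a0, nu=0.5):
    """Inter-residue rms distance map from the scaling law a0 * |i-j|**nu."""
    idx = np.arange(n_residues)
    sep = np.abs(idx[:, None] - idx[None, :]).astype(float)
    return a0 * sep ** nu

def kirkwood_riseman_rh(distance_map):
    """Approximate R_h as the inverse of the mean of 1/r_ij over all i != j."""
    n = distance_map.shape[0]
    off_diagonal = ~np.eye(n, dtype=bool)
    # Select off-diagonal entries first so the zero diagonal is never inverted.
    inv_mean = np.mean(1.0 / distance_map[off_diagonal])
    return float(1.0 / inv_mean)

dmap = rms_distance_map(100, a0=6.0)  # 6.0 A is an illustrative prefactor
rh = kirkwood_riseman_rh(dmap)
print(f"approximate R_h = {rh:.1f} A")
```

Using the inverse of a mean distance in place of the mean inverse distance is an approximation, so the value should be read as indicative only.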
Because the model also exposes every inter-residue distance, it additionally provides distance maps, contact-fraction maps, and per-residue PRE profiles for the same theta-state reference.
Behaviour and relationship to other models
By construction the AFRC behaves like the nu-dependent SAW evaluated at \(\nu = 0.5\); the two distributions are essentially superimposable. Relative to the other reference models in this package, the AFRC is slightly more expanded than the worm-like chain (at a persistence length of 3 Å) and substantially more compact than the good-solvent self-avoiding walk (\(\nu \approx 0.588\)). It therefore sits at the theta-point, between the collapsed and fully solvated extremes.
Intended use
Note
The AFRC is a reference (null) model, not a predictor of unfolded-protein dimensions. Real dimensions depend on sequence-encoded chain-chain and chain-solvent interactions that the AFRC deliberately omits. Its value is as a fixed, sequence-matched touchstone: deviations of a simulation or experiment from the AFRC are a direct readout of sequence-specific intramolecular interactions, and normalising to the AFRC lets chains of different lengths and compositions be compared on a common footing.
Parameters
The AFRC is deliberately parameter-free: the per-residue calibration constants (\(R_0\), \(R_0^{\mathrm{rms}}\), \(X_0\)) are fixed and the only sequence input is composition and length. There is consequently nothing to tune.
| Argument | Default | Meaning and typical values |
|---|---|---|
|  |  | Numerical only. If |
What to expect for a protein. The apparent scaling exponent is \(\nu^{\mathrm{app}} = 0.5\) by construction. With \(R_0 \approx 6\) Å, a disordered region of \(N\) residues has \(R_e \approx 6\sqrt{N}\) Å and \(R_g \approx R_e/\sqrt{6} \approx 2.4\sqrt{N}\) Å. Real intrinsically disordered regions scatter around these values: in the original study the ratio of simulated or measured dimensions to AFRC dimensions ranged from roughly 0.7 (more compact) to 1.4 (more expanded), so the AFRC is best read as a theta-point touchstone rather than a strict bound.
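The rule of thumb above, and the normalisation it enables, amount to a few lines of arithmetic. The sketch uses the approximate \(R_0 \approx 6\) Å quoted in the text rather than the sequence-specific calibrated prefactor; the function names and the 30 Å "observed" value are hypothetical.

```python
import math

def rule_of_thumb_dimensions(n_residues, r0=6.0):
    """Approximate theta-state R_e and R_g (angstroms) for N residues."""
    r_e = r0 * math.sqrt(n_residues)
    r_g = r_e / math.sqrt(6.0)  # ideal-chain relation between R_e and R_g
    return r_e, r_g

def normalised_dimension(observed, reference):
    """Ratio > 1 means more expanded than the reference, < 1 more compact."""
    return observed / reference

r_e, r_g = rule_of_thumb_dimensions(100)  # r_e is 60 A, r_g is about 24.5 A
ratio = normalised_dimension(30.0, r_g)   # hypothetical observed R_g of 30 A
print(f"R_e = {r_e:.1f} A, R_g = {r_g:.1f} A, ratio = {ratio:.2f}")
```

A ratio of about 1.2 for this hypothetical chain would place it on the expanded side of the reference, within the 0.7 to 1.4 range reported in the original study.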
Citations
Alston, J. J., Ginell, G. M., Soranno, A., & Holehouse, A. S. (2023). The Analytical Flory Random Coil is a simple-to-use reference model for unfolded and disordered proteins. The Journal of Physical Chemistry B, 127(21), 4746-4760. https://doi.org/10.1021/acs.jpcb.3c01619
Flory, P. J. (1969). Statistical Mechanics of Chain Molecules. Wiley-Interscience.
Mao, A. H., Lyle, N., & Pappu, R. V. (2013). Describing sequence-ensemble relationships for intrinsically disordered proteins. Biochemical Journal, 449(2), 307-318.
Lhuillier, D. (1988). A simple model for polymeric fractals in a good solvent and an improved version of the Flory approximation. Journal de Physique, 49(5), 705-710.
Rubinstein, M., & Colby, R. H. (2003). Polymer Physics. Oxford University Press.
Nygaard, M., Kragelund, B. B., Papaleo, E., & Lindorff-Larsen, K. (2017). An efficient method for estimating the hydrodynamic radius of disordered protein conformations. Biophysical Journal, 113(3), 550-557.
Kirkwood, J. G., & Riseman, J. (1948). The intrinsic viscosities and diffusion constants of flexible macromolecules in solution. The Journal of Chemical Physics, 16(6), 565-573.