Help Topics


Tutorial LSER-Database

Database LSERD

This database contains experimentally determined Linear Solvation Energy Relationship (LSER) descriptors reported in the literature: E (excess molar refraction), S (polarizability/dipolarity), A (solute hydrogen bond acidity), B and/or B0 (solute hydrogen bond basicity), and L (logarithmic gas-hexadecane partition coefficient). The database also stores McGowan’s molar volume V, calculated for each compound from its molecular structure. These solute descriptors can be used, in combination with LSER equations or other types of polyparameter linear free energy relationships (pp-LFERs), to calculate partition coefficients for various systems.

LSER descriptors stored in this database have either been previously published in peer-reviewed scientific papers or were taken from the ABSOLV database assembled by Michael Abraham. We neither take responsibility for the accuracy of predictions made using the LSER descriptors from this database, nor do we guarantee the correctness of stored values in comparison to those in the original papers or in the ABSOLV database.

Within the database we also provide the option to predict LSER descriptors for all neutral chemicals that are identified by a SMILES string. These QSPR predictions were developed by Trevor Brown and are calibrated on the basis of the descriptors stored in our database. Naturally, the accuracy of QSPR-predicted descriptors is inferior to that of descriptors derived from experimental data. We also provide information on whether the searched chemical is within the application domain of the QSPR. Predictions outside the application domain must not be regarded as reliable in any sense.

Searching using compound names, CAS numbers, and/or SMILES

Queries can be performed using the compound name, CAS number, or SMILES. It is recommended to use the CAS number or SMILES, because the database does not include all possible compound names for a given compound. CAS numbers should be entered in the form “123-45-6”.
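Before submitting a long list of CAS numbers it can help to check them locally. The following is a minimal sketch, not part of the database: it tests the “123-45-6” form and the standard CAS check digit (digits weighted 1, 2, 3, ... from the right, sum modulo 10). The function name `is_valid_cas` is made up for this illustration.

```python
import re

# Two to seven digits, two digits, one check digit, separated by hyphens.
CAS_PATTERN = re.compile(r"^(\d{2,7})-(\d{2})-(\d)$")

def is_valid_cas(cas: str) -> bool:
    """Return True if `cas` has the form 123-45-6 and a consistent check digit."""
    match = CAS_PATTERN.match(cas.strip())
    if match is None:
        return False
    digits = match.group(1) + match.group(2)
    # Weight the digits 1, 2, 3, ... starting from the rightmost one.
    total = sum(weight * int(d)
                for weight, d in enumerate(reversed(digits), start=1))
    return total % 10 == int(match.group(3))

print(is_valid_cas("110-54-3"))  # n-hexane
print(is_valid_cas("110-54-4"))  # same number with a wrong check digit
```

A number that fails this check is most likely mistyped. A number that passes may still be absent from the database.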

A search can be performed for a single compound name, CAS number, or SMILES. It is also possible to enter a list of compound names, CAS numbers, SMILES, or a mix of these.

By default, the search needs an “exact match” for any of the three keyword types. One can perform a partial search by inserting a “%” sign into the search keyword. For example, the name “hexane” gives only the data for “n-hexane”, but “hexan%” would give the values for hexane, hexanone, hexanol, etc. A “%” sign can be inserted anywhere in the keyword (e.g., “%anone”, “hexac%ene”, “he%me%e”, “%c1ccccc1”, “123-0%” are all allowed).
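The matching rule described above can be mimicked locally, for example to preview which entries of one's own compound list a keyword would catch. This is only a sketch under assumptions: how the database actually evaluates “%” is not documented here, the helper `matches` is hypothetical, and matching is treated as case-sensitive because case carries meaning in SMILES.

```python
import re

def matches(keyword: str, value: str) -> bool:
    """Exact match, except that each '%' in `keyword` stands for any run of characters."""
    # Escape everything literal, then join the pieces with a regex wildcard.
    pattern = ".*".join(re.escape(part) for part in keyword.split("%"))
    return re.fullmatch(pattern, value) is not None

# Hypothetical name list, for illustration only.
names = ["hexane", "hexanol", "hexanone", "cyclohexane"]
print([n for n in names if matches("hexan%", n)])
print([n for n in names if matches("%hexan%", n)])
```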

The search results are shown as a table in the web browser. The results can also be downloaded as an xls file by clicking on a button below the result table.

It is possible to show only descriptor values from selected references. To do this, select a publication (or publications) in the scroll menu. If “preselected values” is selected, only one set of descriptors per compound will be shown in the table. This may be useful when the user wants to retrieve descriptors for a long list of compounds. However, we do not imply that preselected values are more accurate than others. Preselection was made only for convenience.

Quantitative Structure Property Relationships for the calculation of descriptors

Quantitative Structure Property Relationships (QSPRs) have been developed for the LSER descriptors E, S, A, B and L. The QSPRs predict the LSER descriptors for chemicals based only on structures provided as SMILES strings. The development and testing of the QSPRs follow the OECD principles for model validation [1], and the methods have been previously described in detail [2], [3].

The QSPRs are based on functional group contributions. A set of fragments, or substructures, is searched for and counted in the structures provided, and these counts are then multiplied by the model regression coefficients to yield a prediction. The substructures and coefficients are unique for each of the five developed QSPRs. SMILES strings and the substructures in the models capture only two-dimensional information about molecules, so the influence of three-dimensional shape, such as steric restriction of solvent access to hydrogen bond acceptors, may not be captured in the QSPR.
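The arithmetic of a group contribution model can be sketched as follows. This is an illustration only: the fragment labels, the coefficients, and the optional intercept are hypothetical and are not the values or substructures used in the actual QSPRs, and fragment counting from a SMILES string is assumed to have been done elsewhere.

```python
def predict_descriptor(fragment_counts: dict[str, int],
                       coefficients: dict[str, float],
                       intercept: float = 0.0) -> float:
    """Sum of (fragment count x regression coefficient), plus an assumed intercept."""
    return intercept + sum(
        count * coefficients.get(fragment, 0.0)  # unknown fragments contribute nothing
        for fragment, count in fragment_counts.items()
    )

# Hypothetical fragment labels and coefficients, chosen only for illustration.
coefficients_a = {"OH": 0.35, "NH": 0.15, "CH3": 0.0}
counts = {"CH3": 1, "CH2": 1, "OH": 1}
print(predict_descriptor(counts, coefficients_a))
```

A separate coefficient table would be needed for each descriptor, because the substructures and coefficients differ between the five QSPRs.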

A domain of model applicability has been defined and this information is provided with the predictions to help in interpreting their reliability. The higher the Chemical Similarity Score (CSS, varies between 0 and 1) the more similar a structure is to chemicals in the training dataset that are well-fitted by the model. A low similarity score indicates a structure very different from chemicals in the training dataset or a similarity to chemicals for which the model has difficulty predicting values. Leverages are also calculated, with a high leverage indicating that the model has been extrapolated beyond the model training dataset.

The predictions made by these QSPR models have inherent uncertainty. They are intended to aid in experimental design, to be used for preliminary screening of chemical properties, and to check the reasonableness of experimental data. Users should be cautious in their use and pay careful attention to the domain warnings included with the predictions. The authors assume no liability for predictions which turn out to be erroneous. New and improved versions of the QSPRs may be released in the future. In this case the old versions of the QSPRs will still be available, so that users can compare new and old predictions for themselves.

Calculation of partition coefficients

Search results of LSER solute descriptors can be used to calculate partition coefficients. When the results are displayed, click on one of the system categories listed in the “Search for partition systems” box. Select the desired partitioning system(s) by clicking the check box(es) and hit the “Calculate” button, which will generate an Excel file that contains logarithmic partition coefficients calculated using the selected solute descriptors and system parameters.

Non-linear sorption isotherms in soil organic carbon

Typically, sorption isotherms for soil organic carbon/water partitioning are non-linear and best described by a Freundlich isotherm of the form:

$$C_{\mathrm{sorbed}} = K_F \cdot (C_{\mathrm{water}})^{n}$$

A non-linear isotherm means that partitioning is concentration dependent. Partition coefficients calculated from the provided ppLFERs do not reflect this non-linear behavior. They are valid only for concentrations in the vicinity of those used for the calibration data. For example, in the work of Bronner et al. the equilibrium water concentration was on the order of 0.03 mmol/L (see ref. [4]). If the actual water concentration differs by more than 2 orders of magnitude from this reference value, we recommend using a Freundlich isotherm to describe the sorption equilibrium.

Typical Freundlich exponents, n, published in the literature for sorption to soil organic carbon are around 0.8 [5], [6]. If one accepts this value as generally valid then one can calculate a Freundlich sorption coefficient KF from the ppLFER derived Koc based on the work of Bronner et al. as follows:

$$K_F = \frac{K_{\mathrm{ppLFER}} \cdot 3 \times 10^{-2}}{(3 \times 10^{-2})^{0.8}}$$

With this KF and n = 0.8 one can now construct the Freundlich isotherm. The Freundlich isotherm parameters are provided by the database when the respective tool is used.
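The conversion can be sketched in a few lines. The sketch assumes the reference concentration of 0.03 mmol/L and n = 0.8 given above, so that the Freundlich isotherm coincides with the linear ppLFER prediction at the reference concentration. The function names and the example Koc are hypothetical.

```python
def freundlich_kf(k_pplfer: float, c_ref: float = 3e-2, n: float = 0.8) -> float:
    """K_F chosen so that K_F * c_ref**n equals K_ppLFER * c_ref (c_ref in mmol/L)."""
    return k_pplfer * c_ref / c_ref ** n

def sorbed_concentration(c_water: float, k_f: float, n: float = 0.8) -> float:
    """Freundlich isotherm: C_sorbed = K_F * C_water**n."""
    return k_f * c_water ** n

k_oc = 10 ** 3.0  # hypothetical ppLFER-derived Koc
k_f = freundlich_kf(k_oc)
for c_w in (3e-4, 3e-2, 3.0):  # mmol/L, two orders of magnitude around the reference
    # Columns: water concentration, Freundlich estimate, linear estimate
    print(c_w, sorbed_concentration(c_w, k_f), k_oc * c_w)
```

With n below 1, the linear estimate lies below the Freundlich estimate at concentrations under the reference value and above it at higher concentrations.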

Extraction Tool

When extracting a target analyte from an aqueous sample, from a suspension, or from organic material, it is important to choose the type and volume of solvent such that the desired extraction efficiency is achieved. This extraction efficiency can be calculated from known partition constants for the extraction system used, provided that equilibrium partitioning is achieved. Here we provide a tool that calculates the expected extraction efficiency based on equilibrium partition constants from the database. Besides extraction from pure water, we offer extraction from aqueous suspensions containing dissolved organic matter, particulate organic matter, proteins, and lipids, as well as the effect of adding NaCl (salting out). It is also possible to consider a double extraction.

To calculate the fraction of the analyte that is extracted, one first has to search for and select the analyte(s) in the database. If no chemical of the query result is selected, all chemicals will be included in the calculations. Next, one has to select all solvents of interest from the list shown in the window. Finally, one enters the volume of the water sample to be extracted and the anticipated volume of the solvent phase(s). Alternatively, the extraction efficiency can be calculated for different solvent volumes, or the solvent volume required for a desired analyte fraction in the solvent can be calculated.
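For the simplest case, a single analyte partitioning between pure water and one solvent at equilibrium, the mass balance behind such a calculation can be sketched as below. This is not the tool's implementation: additional sorbing phases and salting out are left out, the double extraction is assumed to use fresh solvent of the same volume each time, and the function names are hypothetical.

```python
def fraction_extracted(k_sw: float, v_solvent: float, v_water: float,
                       steps: int = 1) -> float:
    """Equilibrium fraction of analyte recovered after `steps` extractions with fresh solvent.

    k_sw is the solvent/water partition coefficient (L_water/L_solvent);
    both volumes must be given in the same unit.
    """
    single = k_sw * v_solvent / (k_sw * v_solvent + v_water)
    return 1.0 - (1.0 - single) ** steps

def solvent_volume_for(target_fraction: float, k_sw: float, v_water: float) -> float:
    """Solvent volume needed to reach `target_fraction` in a single extraction."""
    return target_fraction * v_water / (k_sw * (1.0 - target_fraction))

k_sw = 10 ** 2.0  # hypothetical partition coefficient
print(fraction_extracted(k_sw, v_solvent=0.01, v_water=1.0))           # single extraction
print(fraction_extracted(k_sw, v_solvent=0.01, v_water=1.0, steps=2))  # double extraction
print(solvent_volume_for(0.95, k_sw, v_water=1.0))
```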

Thermodesorption

To trap an analyte quantitatively from air on a sorbent, the type and amount of sorbent must be matched to the sampling temperature, the sample volume, and the type of analyte in order to avoid breakthrough. In the subsequent desorption step, temperature and gas volume must be matched to the type and amount of sorbent and the type of analyte. All this can be done by “trial and error” (which is inefficient), or it can be planned ahead, provided that the sorption constants of the analyte on potential sorbents are known as a function of temperature.

In ref. [7] we present a systematic sorption study with many hundred experimental sorption values for various analytes, temperatures and sorbents (Tenax TA, Chromosorb 106, Porapak N and Carbon Black). Based on these data we derived a pp-LFER model to predict the specific 50% breakthrough volume of any compound for different sorbents and temperatures. Here we present a tool that predicts the specific 50% breakthrough volume, the sorbent mass needed for a certain sample volume, and the minimal desorption temperature. This information should help the user plan experiments and optimize analytical methods.

For the calculation of the mass of sorbent required we use two times the mass that would be needed to hold back 50% of the analyte:

Mass sorbent = 2 * BTV50% * Sample Volume

For the calculation of the minimal desorption temperature we use the temperature that corresponds to a desorption volume 3 times higher than the one needed to desorb 50% of the analyte.

Blow Down

Loss of analyte is inevitable when preconcentrating an analyte in a solvent extract by reducing the solvent volume in a nitrogen gas stream. However, the loss may be kept very small if the optimal solvent is chosen and if the volume reduction is not too extreme. The tool offered here calculates the loss that is to be expected. This allows for an optimization of the conditions.
Approach: For the provided solvents one can calculate the volume of nitrogen that is needed for the specified volume reduction of the solvent from the ideal gas law and the saturated vapor pressure of the solvents. The loss of analyte into this nitrogen volume can then be estimated from the solvent/gas partition coefficient.
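One possible way to turn this approach into numbers is sketched below. It rests on assumptions that are not taken from the tool: the gas leaves saturated with solvent vapor, the analyte is always in equilibrium between solvent and gas, and the partition coefficient stays constant. Under these assumptions the remaining analyte fraction is (V_end / V_start)^(g / K), where g is the gas volume needed per volume of evaporated solvent. The solvent properties are approximate, illustrative values for n-hexane at 25 °C, and all names are hypothetical.

```python
R_GAS = 8.314  # J/(mol K)

def gas_volume_per_solvent_volume(density_g_per_l: float,
                                  molar_mass_g_per_mol: float,
                                  p_sat_pa: float,
                                  temperature_k: float = 298.15) -> float:
    """Litres of solvent-saturated gas needed to evaporate one litre of liquid solvent."""
    mol_per_litre_liquid = density_g_per_l / molar_mass_g_per_mol
    # Ideal gas law: volume per mole of solvent vapor at its saturation pressure.
    litres_gas_per_mol = R_GAS * temperature_k / p_sat_pa * 1000.0  # m3 -> L
    return mol_per_litre_liquid * litres_gas_per_mol

def analyte_loss_fraction(k_solvent_gas: float, v_start: float, v_end: float,
                          gas_ratio: float) -> float:
    """Fraction of analyte lost while the solvent volume shrinks from v_start to v_end."""
    remaining = (v_end / v_start) ** (gas_ratio / k_solvent_gas)
    return 1.0 - remaining

# Approximate, illustrative properties of n-hexane at 25 degrees C (not database values).
g = gas_volume_per_solvent_volume(655.0, 86.18, 20200.0)
for log_k in (3.0, 4.0, 5.0):  # hypothetical solvent/gas partition coefficients
    print(log_k, analyte_loss_fraction(10 ** log_k, v_start=10.0, v_end=1.0, gas_ratio=g))
```

In this sketch the loss becomes small once the solvent/gas partition coefficient of the analyte is much larger than g.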

Permeability Tool

This tool calculates the passive apparent permeability through a monolayer of Caco-2 or MDCK cells grown on a permeable filter support. The calculation considers various transport resistances (unstirred water layers (UWL), cytosol, membranes, filter) as well as two parallel pathways: transcellular and paracellular. Details of the approach are described in ref. [8]. Deviating from this reference we have simplified the calculation of the paracellular transport by not distinguishing between neutral and ionic species for this pathway. This simplifies the required data input and has little effect on the overall performance of the tool, because setup specific differences in paracellular transport are expected to be much higher than speciation effects. Passive permeability through Caco-2/MDCK cell membranes is calculated from the solute’s hexadecane/water partition coefficient using the solubility-diffusion model and a correlation established between permeability in black lipid membranes and Caco-2/MDCK, as described in ref. [9].

When users provide the neutral fractions in aqueous layers for ionizable compounds in the exported Excel file, the tool considers pH-dependent concentration shift effects in the UWL, filter, and cytosol, as described in ref. [10]. If the iso-pH method (same pH at the apical and basolateral side) is used, the total UWL thickness is required. If the gradient-pH method (different pH at the apical and basolateral side) is used, the individual UWL thicknesses on both the apical and basolateral side are necessary for the calculation. If only total UWL is provided, a symmetrical distribution is assumed.

The calculated permeability refers to the total concentration of the chemical, assuming that the ionic species participates in the permeation through the unstirred water layer and the paracellular pathway but not in the transport through the membrane itself. Note that the modeled membrane permeability is based on water and may not accurately represent scenarios where the adjacent phases are not pure water. Possible active transport or retention effects are not considered.
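How transport resistances in series and parallel pathways combine can be illustrated with a strongly simplified sketch. The arrangement used here (two membranes and the cytosol in series along the transcellular route, the paracellular route in parallel, and the unstirred water layer and filter in series with both) is an assumption made for illustration. The actual model, including speciation and pH effects, is described in refs. [8] and [10]. All names and numbers are hypothetical.

```python
def series(*permeabilities: float) -> float:
    """Resistances in series add, so the combined permeability is 1 / sum(1 / P_i)."""
    return 1.0 / sum(1.0 / p for p in permeabilities)

def apparent_permeability(p_uwl: float, p_filter: float, p_membrane: float,
                          p_cytosol: float, p_para: float) -> float:
    """Simplified apparent permeability of a cell monolayer (all values in cm/s)."""
    # Transcellular route: apical membrane, cytosol, basolateral membrane.
    p_trans = series(p_membrane, p_cytosol, p_membrane)
    # Parallel pathways add their permeabilities.
    p_monolayer = p_trans + p_para
    return series(p_uwl, p_monolayer, p_filter)

# Hypothetical permeabilities in cm/s, for illustration only.
print(apparent_permeability(p_uwl=1e-4, p_filter=1e-3, p_membrane=1e-2,
                            p_cytosol=1e-2, p_para=1e-7))
```

In a series arrangement the smallest permeability dominates, which is why highly membrane-permeable compounds end up limited by the unstirred water layer in this sketch.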

Cfree

For toxicity assays or the effect of active ingredients of pharmaceuticals it is essential to know the freely dissolved concentration or fraction of a solute, because it is this freely dissolved concentration that can directly and quantitatively be linked to an effect. However, often only the total/nominal concentration is known. A measurement of the freely dissolved concentration or fraction is often not possible or affordable. As an alternative one can calculate the freely dissolved concentration or fraction if one assumes that partition equilibrium is reached (usually this is the case) and if one knows the content of sorbing components (lipids and proteins) and the respective partition coefficients. Here we offer a calculation of Cfree in blood-plasma based on the assumption that plasma contains 7 vol% albumin and 1 vol% (phospho)lipids. Another option, the calculation of Cfree in various cellular assays is described in detail in ref. [11]. It is important to note that all results for Cfree only refer to neutral species. Cfree for ionic species or ionizable chemicals cannot be calculated with these tools.
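The equilibrium mass balance behind such a calculation can be sketched as follows. The sketch uses the plasma composition stated above (7 vol% albumin, 1 vol% lipids) and assumes that the remaining 92 vol% is water and that the partition coefficients are volume based (L_water/L_phase). The function names and example coefficients are hypothetical, and the tool itself may differ in detail.

```python
def free_fraction_plasma(k_albumin_water: float, k_lipid_water: float,
                         f_albumin: float = 0.07, f_lipid: float = 0.01) -> float:
    """Fraction of the total solute amount that is freely dissolved in plasma water."""
    f_water = 1.0 - f_albumin - f_lipid  # assumed: the rest of the plasma is water
    bound_capacity = k_albumin_water * f_albumin + k_lipid_water * f_lipid
    return f_water / (f_water + bound_capacity)

def c_free(c_total: float, k_albumin_water: float, k_lipid_water: float,
           f_albumin: float = 0.07, f_lipid: float = 0.01) -> float:
    """Freely dissolved concentration in plasma water for a total plasma concentration."""
    f_water = 1.0 - f_albumin - f_lipid
    return c_total / (f_water + k_albumin_water * f_albumin + k_lipid_water * f_lipid)

# Hypothetical partition coefficients of a neutral solute.
print(free_fraction_plasma(k_albumin_water=10 ** 2.0, k_lipid_water=10 ** 3.0))
print(c_free(1.0, k_albumin_water=10 ** 2.0, k_lipid_water=10 ** 3.0))
```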

Some details

McGowan’s molar volumes V shown in the result table are calculated using atom incremental values from Abraham and McGowan [12]. There are two exceptions: V values for PCBs from van Noort et al. [13] and for fluorinated chemicals from Goss et al. [14]. See the cited references for further information.

In a few cases, isomer-specific data are stored with isomer-nonspecific names, CAS numbers, and/or SMILES. For such chemicals, it is strongly recommended to consult the original data source and check which isomer the given data are for.

For roughly 1000 chemicals the stored descriptors are not complete. If you search for “Preselected values”, you will typically find the data source with the largest number of descriptors for that compound. Nevertheless, this source may still miss one or more descriptors. By selecting “all”, you will find all available descriptors and may find the missing descriptor(s) in another source. Generally, however, accuracy of predictions using descriptors from multiple sources is expected to be lower, as these descriptors have not undergone consistent optimization.

LSERs and pp-LFERs

Since the 1980s it has been recognized that various physicochemical properties of organic chemicals can be described by multi-term linear equations that use molecular properties as descriptors. While various equations with combinations of different descriptors have been proposed, the most successful and widely used equations for describing biphasic partition coefficients are the Linear Solvation Energy Relationships (LSERs), established by Abraham and coworkers (see ref. [15] and references therein). The LSERs have two forms. For partitioning between a condensed phase and the gas phase:

Log K = c + eE + sS + aA + bB + lL

For partitioning between two condensed phases:

Log K = c + eE + sS + aA + bB + vV

K is the partition coefficient between the two phases of interest. The lowercase letters are fitting coefficients and referred to as system parameters.

More recently, it was shown that a single equation of the following form can describe both partition processes no matter whether they involve the gas phase or only condensed phases [16].

Log K = c + sS + aA + bB + vV + lL

This equation uses a slightly different parameter combination than the former two equations and typically gives the same statistics and slightly better predictions in particular cases. This equation type is identified as number 1 in all tables for the system equations.
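All three equation forms share the same structure: a constant plus a sum of products of system parameters and solute descriptors. A minimal sketch, assuming that the system parameters are held in a dictionary keyed by the lowercase letters, can therefore serve every form. Which descriptors enter is decided simply by which parameters the system defines. All numbers below are hypothetical and are not taken from the database.

```python
def log_k(system: dict[str, float], solute: dict[str, float]) -> float:
    """log K = c + sum of (system parameter x matching solute descriptor)."""
    return system["c"] + sum(
        coeff * solute[name.upper()]  # e pairs with E, s with S, and so on
        for name, coeff in system.items()
        if name != "c"
    )

# Hypothetical solute descriptors and system parameters, for illustration only.
solute = {"E": 0.60, "S": 0.50, "A": 0.00, "B": 0.15, "V": 0.70, "L": 2.80}
gas_condensed = {"c": -0.1, "e": 0.2, "s": 1.0, "a": 2.0, "b": 0.5, "l": 0.8}
condensed_condensed = {"c": 0.1, "e": 0.5, "s": -1.0, "a": 0.0, "b": -3.5, "v": 3.8}
single_form = {"c": 0.2, "s": 0.9, "a": 2.1, "b": 0.4, "v": 0.3, "l": 0.7}

for name, system in [("c,e,s,a,b,l", gas_condensed),
                     ("c,e,s,a,b,v", condensed_condensed),
                     ("c,s,a,b,v,l", single_form)]:
    print(name, log_k(system, solute))
```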

For adsorption from air to surfaces, other descriptor combinations are used because cavity formation is not relevant for this process (hence the vV term can be omitted) and the S descriptor was empirically found not to be relevant (see ref. [17]). Please also note that two sets of ppLFER equations are available for surfaces. The general equations (identified as number 5 in the tables for the system equations) work for all chemicals, including polyfluorinated ones and organo-silicones. A second set of equations for the same surfaces (identified as number 4 in the tables for the system equations) must not be used for polyfluorinated chemicals and organo-silicones but gives more accurate results for all other chemicals.

We use the term polyparameter linear free energy relationships (pp-LFERs) to collectively refer to all different types of multi-term linear equations that describe the partition coefficients, including the traditional LSERs plus additional equations mentioned here or elsewhere.

Temperature Dependence

Information on the temperature dependence of equilibrium partitioning is given in various ways depending on the information that was available in the literature: a) pp-LFER system equations for different temperatures are presented, b) pp-LFER equations for calculating partition enthalpies are presented, c) for adsorption from air to mineral and water surfaces the enthalpy can be calculated by a simple empirical equation from the respective logarithmic adsorption coefficient at 15°C (ref. [18]):

ΔH (kJ/mol) = -10.2 (+/- 0.4) * log Ksurf/air (m3/m2, at 15°C) - 89.6 ( +/- 1.9)
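The empirical equation is easy to evaluate directly. The sketch below also shows one way the resulting enthalpy might be used, namely a van 't Hoff extrapolation of log K to another temperature. The extrapolation is an added assumption (constant enthalpy over the temperature range) and is not prescribed by the equation itself. The uncertainty ranges of the coefficients are ignored and the function names are hypothetical.

```python
import math

R_GAS = 8.314e-3  # kJ/(mol K)

def adsorption_enthalpy(log_k_15c: float) -> float:
    """Enthalpy of adsorption in kJ/mol from log K_surf/air (m3/m2) at 15 degrees C."""
    return -10.2 * log_k_15c - 89.6

def log_k_at(log_k_15c: float, temperature_k: float, t_ref_k: float = 288.15) -> float:
    """van 't Hoff extrapolation of log K, assuming a temperature-independent enthalpy."""
    delta_h = adsorption_enthalpy(log_k_15c)
    ln_shift = -delta_h / R_GAS * (1.0 / temperature_k - 1.0 / t_ref_k)
    return log_k_15c + ln_shift / math.log(10.0)

log_k_ref = -5.0  # hypothetical adsorption coefficient at 15 degrees C
print(adsorption_enthalpy(log_k_ref))
print(log_k_at(log_k_ref, 298.15))
```

Because the enthalpy is negative, the extrapolated adsorption coefficient decreases with increasing temperature.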

Note that equations for the prediction of the enthalpy of partitioning for various solvent-air systems have been published (and are listed under the solvent-air partition systems). These can be used in a thermodynamic cycle to derive enthalpies for solvent-water partitioning.
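The thermodynamic cycle amounts to a subtraction: since K(solvent/water) = K(solvent/air) / K(water/air), both the logarithmic partition coefficients and the partition enthalpies for solvent/water follow as the solvent/air value minus the water/air value. A minimal sketch, assuming that a water/air value is available for the same solute and that any difference between the mutually saturated and the pure phases is neglected (numbers are hypothetical):

```python
def solvent_water_from_air_cycle(solvent_air: float, water_air: float) -> float:
    """Combine solvent/air and water/air values (log K or enthalpy) into the solvent/water value."""
    return solvent_air - water_air

# Hypothetical enthalpies of partitioning in kJ/mol (air -> condensed phase).
dh_solvent_air = -45.0
dh_water_air = -30.0
print(solvent_water_from_air_cycle(dh_solvent_air, dh_water_air))

# The same subtraction applies to logarithmic partition coefficients.
print(solvent_water_from_air_cycle(4.2, 1.7))
```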

References

  1. OECD, Guidance Document on the Validation of (Quantitative) Structure-Activity Relationship [(Q)SAR] Models. 2007.
  2. T. N. Brown, J. A. Arnot, and F. Wania, “Iterative fragment selection: A group contribution approach to predicting fish biotransformation half-lives,” Environ. Sci. Technol., vol. 46, no. 15, pp. 8253–8260, 2012, doi: 10.1021/es301182a.
  3. T. N. Brown, “Predicting hexadecane-air equilibrium partition coefficients (L) using a group contribution approach constructed from high quality data,” SAR QSAR Environ. Res., vol. 25, no. 1, pp. 51–71, 2014, doi: 10.1080/1062936X.2013.841286.
  4. G. Bronner and K.-U. Goss, “Predicting sorption of pesticides and other multifunctional organic chemicals to soil organic carbon,” Environ. Sci. Technol., vol. 45, no. 4, pp. 1313–1319, Feb. 2011, doi: 10.1021/es102553y.
  5. D. Zhu and J. J. Pignatello, “A concentration-dependent multi-term linear free energy relationship for sorption of organic compounds to soils based on the hexadecane dilute-solution reference state,” Environ. Sci. Technol., vol. 39, no. 22, pp. 8817–8828, Nov. 2005, doi: 10.1021/es051079g.
  6. S. Endo, P. Grathwohl, S. B. Haderlein, and T. C. Schmidt, “Compound-specific factors influencing sorption nonlinearity in natural organic matter,” Environ. Sci. Technol., vol. 42, no. 16, pp. 5897–5903, Aug. 2008, doi: 10.1021/es8001426.
  7. M. Schneider and K.-U. Goss, “Systematic Investigation of the Sorption Properties of Tenax TA, Chromosorb 106, Porapak N, and Carbopak F,” Anal. Chem., vol. 81, no. 8, pp. 3017–3021, Apr. 2009, doi: 10.1021/ac802686p.
  8. K. Bittermann and K. U. Goss, “Predicting apparent passive permeability of Caco-2 and MDCK cell-monolayers: A mechanistic model,” PLoS One, vol. 12, no. 12, pp. 1–20, 2017, doi: 10.1371/journal.pone.0190319.
  9. C. Dahley, T. Böckmann, A. Ebert, and K. U. Goss, “Predicting the intrinsic membrane permeability of Caco-2/MDCK cells by the solubility-diffusion model,” Eur. J. Pharm. Sci., vol. 195, no. January, 2024, doi: 10.1016/j.ejps.2024.106720.
  10. C. Dahley, K. U. Goss, and A. Ebert, “Revisiting the pKa-Flux method for determining intrinsic membrane permeability,” Eur. J. Pharm. Sci., vol. 191, no. July, p. 106592, 2023, doi: 10.1016/j.ejps.2023.106592.
  11. F. C. Fischer et al., “Modeling Exposure in the Tox21 in Vitro Bioassays,” Chem. Res. Toxicol., vol. 30, no. 5, pp. 1197–1208, May 2017, doi: 10.1021/acs.chemrestox.7b00023.
  12. M. H. Abraham and J. C. McGowan, “The use of characteristic volumes to measure cavity terms in reversed phase liquid chromatography,” Chromatographia, vol. 23, no. 4, pp. 243–246, 1987, doi: 10.1007/BF02311772.
  13. P. C. M. van Noort, J. J. H. Haftka, and J. R. Parsons, “Updated Abraham solvation parameters for polychlorinated biphenyls,” Environ. Sci. Technol., vol. 44, no. 18, pp. 7037–7042, Sep. 2010, doi: 10.1021/es102210g.
  14. K. U. Goss, G. Bronner, T. Harner, M. Hertel, and T. C. Schmidt, “The partition behavior of fluorotelomer alcohols and olefins,” Environ. Sci. Technol., vol. 40, no. 11, pp. 3572–3577, 2006, doi: 10.1021/es060004p.
  15. M. H. Abraham, A. Ibrahim, and A. M. Zissimos, “Determination of sets of solute descriptors from chromatographic measurements,” J. Chromatogr. A, vol. 1037, no. 1, pp. 29–47, 2004, doi: 10.1016/j.chroma.2003.12.004.
  16. K.-U. Goss, “Predicting the equilibrium partitioning of organic compounds using just one linear solvation energy relationship (LSER),” Fluid Phase Equilib., vol. 233, no. 1, pp. 19–22, 2005, doi: 10.1016/j.fluid.2005.04.006.
  17. H. P. H. Arp, K.-U. Goss, and R. P. Schwarzenbach, “Evaluation of a predictive model for air/surface adsorption equilibrium constants and enthalpies,” Environ. Toxicol. Chem., vol. 25, no. 1, pp. 45–51, Jan. 2006, doi: 10.1897/05-291r.1.
  18. K.-U. Goss, “The Air/Surface Adsorption Equilibrium of Organic Compounds Under Ambient Conditions,” Crit. Rev. Environ. Sci. Technol., vol. 34, no. 4, pp. 339–389, Jul. 2004, doi: 10.1080/10643380490443263.

Contributors to UFZ-LSER database

Andrea Ebert (Department of Computational Biology and Chemistry)
Guido Bronner (Department of Analytical Environmental Chemistry)
Satoshi Endo (Department of Analytical Environmental Chemistry)
Kai-Uwe Goss (Department of Analytical Environmental Chemistry)
Nadin Ulrich (Department of Analytical Environmental Chemistry)
Norihiro Watanabe (Department of Environmental Informatics)
Tobias Lindner (Scientifical and Commercial Data Processing (WKDV))
Rolf Ziegler (Scientifical and Commercial Data Processing (WKDV))
Norman Walter (Scientifical and Commercial Data Processing (WKDV))
Michael Willig (Scientifical and Commercial Data Processing (WKDV))

Contact:

Kai-Uwe Goss - kai-uwe.goss@ufz.de
Satoshi Endo - endo.satoshi@nies.go.jp