|
1
|
- CODATA 18th International Conference: Frontiers of Scientific and
Technical Data (29 September - 3 October, 2002)
- Prototype of TRC Integrated Information System for Physicochemical
Properties of Organic Compounds: Evaluated Data, Models and Knowledge
- Xinjian Yan, Qian Dong, Xiangrong Hong, Robert D. Chirico, Michael Frenkel
- Thermodynamics Research Center (TRC)
- National Institute of Standards and Technology
|
|
2
|
- Requirement: Industrial and scientific developments require high-quality
data and models
- Key Point: A high-quality data system needs strong support from a
comprehensive knowledge base
- Aim: Develop a system with high-quality data and models, fully supported
by domain knowledge
|
|
3
|
|
|
4
|
|
|
5
|
|
|
6
|
- Databases:
- Source Database, Table Database, Density Database, Vapor Pressure
Database, Ideal Gas Database, etc.
- A Comprehensive Physicochemical Data System:
- The Source Database covers more than 100 physical and chemical
properties, with over 2 million experimental records for 32,000 chemical
systems (pure compounds, mixtures, and reaction systems)
|
|
7
|
- Detailed information is crucial for a good understanding of data. The
following information has been prepared for recommended data (RD), and
also for experimental data:
- The uncertainty values of the RD
- The number of data points used to obtain the RD
- The discreteness of the data used to derive the RD
- A description of how the RD were selected
- The grade of the RD
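As a minimal sketch of how the items listed above could travel together with each recommended value, the record below bundles them into one structure. The class name, field names, units, and the example numbers are all hypothetical; the slides do not describe the actual TRC schema or how discreteness and grade are encoded.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecommendedValue:
    """Hypothetical record for one recommended datum (RD) and its metadata."""
    property_name: str      # e.g. "critical temperature"
    value: float            # the recommended value
    unit: str               # unit of `value` and `uncertainty`
    uncertainty: float      # uncertainty of the RD, same unit as `value`
    n_points: int           # number of data points used to obtain the RD
    discreteness: float     # spread of the source data (measure not specified in the slides)
    selection_note: str     # free-text description of how the RD was selected
    grade: str              # quality grade (grading scale is an assumption)

    def __post_init__(self):
        if self.n_points < 1:
            raise ValueError("an RD needs at least one source data point")
        if self.uncertainty < 0:
            raise ValueError("uncertainty must be non-negative")

    def relative_uncertainty(self) -> float:
        """Uncertainty as a fraction of the recommended value."""
        return abs(self.uncertainty / self.value)


# Illustrative numbers only; not taken from any database.
example = RecommendedValue(
    property_name="critical temperature",
    value=500.0,
    unit="K",
    uncertainty=2.0,
    n_points=3,
    discreteness=1.5,
    selection_note="weighted average of three sources",
    grade="A",
)
print(example.relative_uncertainty())
```

Keeping the metadata on the same record as the value means a consumer cannot read the number without also having its uncertainty and provenance at hand.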
|
|
8
|
- For compounds with multiple values, a weighted-average method is used to
obtain the recommended data
- For compounds with only one or two values, the data are checked against:
- A. Theory and thermodynamic relationships
- B. Values predicted by models
- C. Other well-characterized sources
- D. Similar compounds
- For doubtful data, the original articles are reviewed
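The weighted-average step mentioned above can be sketched as follows. The slides do not say how the weights are chosen, so this example assumes inverse-variance weighting (each weight is 1/u², where u is the stated uncertainty), which is a common convention and not necessarily the one TRC uses. The function name and the input numbers are illustrative.

```python
import math


def weighted_average(values, uncertainties):
    """Inverse-variance weighted mean and its standard uncertainty.

    Assumption: weights are 1/u**2; the source does not specify the weighting.
    """
    if not values or len(values) != len(uncertainties):
        raise ValueError("need equally long, non-empty value and uncertainty lists")
    if any(u <= 0 for u in uncertainties):
        raise ValueError("uncertainties must be positive")
    weights = [1.0 / (u * u) for u in uncertainties]
    total = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / total
    return mean, math.sqrt(1.0 / total)


# Two hypothetical measurements: the more precise one dominates the result.
mean, unc = weighted_average([10.0, 20.0], [1.0, 2.0])
print(mean, unc)
```

With these inputs the mean is 12.0, closer to the value with the smaller uncertainty, which is the behaviour a weighted scheme is meant to provide.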
|
|
9
|
- Prediction ability
- Complexity of compounds used in developing and testing models
- Diversity of compounds used in developing and testing models
- Reliability of each parameter (how many data, and of what quality, were
used to obtain it)
- Similarity analysis
|
|
10
|
|
|
11
|
|
|
12
|
- Group: complexity (count = 1) / complexity (count > 1)
- CH: 1 / 1 (example: CH3-CH(CH3)-CH3 = 2)
- C: 2 / 2
- C=C (double bond): 2 / 2
- =C=: 2 / 2
- C≡C (triple bond): 2 / 2
- F, Cl, Br, I: 3 / 5; 2 (when groups > 4)
- CN: 3 / 4
- N: 3 / 4
- NC: 3 / 4
- S: 3 / 4
- SH: 3 / 4
- CHO: 4 / 10
- CO: 4 / 10
- COO: 4 / 10
- COOH: 4 / 10
- N=: 4 / 10
- NH: 4 / 10
- NH=: 4 / 10
- NH2: 4 / 10
- NO2: 4 / 10
- O: 4 / 10
- OH: 4 / 10 (example: OH-CH2-CH2-OH = 18)
- SO: 4 / 10
- SO2: 4 / 10
- Ring: 3 / 5 (including fused rings)
- Terminals: complexity 6 (C = 1), 3 (C = 2), 1 (C = 3)
- C atoms: complexity by number of carbon atoms
- 1-10: 1; 11-20: 2; 21-30: 3
- 31-40: 4; 41-50: 5; > 50: 6
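The slide gives the scoring table and two worked totals but not the rule for combining the columns. The sketch below is one reading that reproduces both totals, and it rests on three assumptions not stated in the source: a group that occurs once scores the first column, a group that occurs more than once scores the sum of both columns, and the "Terminals" term is looked up by the number of carbon atoms in the molecule. Only a subset of groups is included; the halogen rule for more than four groups and all function and variable names are left out or hypothetical.

```python
# (score for one occurrence, additional score when the group occurs more than once)
# Subset of the slide's table; halogens are omitted because of their special rule.
GROUP_SCORES = {
    "CH": (1, 1), "C": (2, 2), "C=C": (2, 2), "=C=": (2, 2), "C≡C": (2, 2),
    "CN": (3, 4), "N": (3, 4), "NC": (3, 4), "S": (3, 4), "SH": (3, 4),
    "CHO": (4, 10), "CO": (4, 10), "COO": (4, 10), "COOH": (4, 10),
    "NH": (4, 10), "NH2": (4, 10), "NO2": (4, 10), "O": (4, 10),
    "OH": (4, 10), "SO": (4, 10), "SO2": (4, 10), "Ring": (3, 5),
}

# Assumed reading of "Terminals": keyed by the molecule's carbon count.
TERMINAL_SCORES = {1: 6, 2: 3, 3: 1}


def carbon_score(n_carbons):
    """Score for the number of C atoms: 1-10 -> 1, 11-20 -> 2, ..., > 50 -> 6."""
    if n_carbons < 1:
        raise ValueError("an organic compound needs at least one carbon atom")
    return min((n_carbons - 1) // 10 + 1, 6)


def complexity(group_counts, n_carbons):
    """Total complexity under the assumptions stated in the lead-in."""
    total = carbon_score(n_carbons) + TERMINAL_SCORES.get(n_carbons, 0)
    for group, count in group_counts.items():
        once, repeated = GROUP_SCORES[group]
        if count >= 1:
            total += once
        if count > 1:
            total += repeated
    return total


print(complexity({"CH": 1}, n_carbons=4))   # CH3-CH(CH3)-CH3
print(complexity({"OH": 2}, n_carbons=2))   # OH-CH2-CH2-OH
```

Under this reading the two calls give 2 and 18, matching the examples on the slide. Other combination rules may fit those two examples equally well, so the sketch should not be taken as the authors' definition.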
|
|
13
|
- Data set: CN / AC
- Tc before 1996*: 500 / 14
- Tc after 1995**: 100 / 21
- CN: compound number; AC: average complexity
- * 500 compounds with critical temperatures reported before 1996.
- ** 100 new compounds reported between 1996 and 2001.
|
|
14
|
|
|
15
|
- A scientific experiment is a complicated process
- Experimental data tend to carry uncertainty or error
- Evaluating scientific data is extremely difficult; there is no way to
guarantee their absolute correctness
- Establishing the true value of a physicochemical property requires
repeated experimental examination
- The same problems apply to models
|
|
16
|
- Thermophysical theory and concepts
- Experimental and theoretical research methods
- Evaluation of and comments on experimental data
- Physical and chemical characteristics of compounds
- Models (introduction, evaluation and comments)
- Molecular structure and interaction information
- Terminology
- Units
- ……
|
|
17
|
|
|
18
|
|
|
19
|
|
|
20
|
- Uncertainty is everywhere
- Our knowledge of uncertainty is very limited
- Our awareness of uncertainty is low
- Knowledge is crucial for reducing uncertainty
- Building a high-quality information system requires a strong ability to
analyze the uncertainty of data, models, and text information
|