Web mining and Quantitative Epistemology
Target Documents :
- Chavalarias D. & Cointet J-P. Bottom-up scientific field detection for dynamical and hierarchical science mapping - methodology and case study, Scientometrics, Vol. 75, No. 1, DOI: 10.1007/s11192-007-1825-6
- Cointet J-P., Chavalarias D. (2008) Multi-level science mapping with asymmetric co-occurrence analysis: Methodology and case study, Networks and Heterogeneous Media, Vol. 3, No. 2, June 2008, pp. 267-276
Slides :
Associated projects :
- MOMA (Module Mapping for Large Electronic Corpora and Social Media)
- TINA (Tools for INteractive Assessment of Projects Portfolio and Visualization of Scientific Landscapes)
Objectives :
To develop new methods for detecting paradigmatic fields using simple statistics over a database of scientific content. We have defined an asymmetric paradigmatic proximity between concepts, which provides a hierarchical structure, and we have tested our methods in a case study on a database of 20 000 000 articles. We also propose an overlapping categorisation that describes paradigmatic fields as sets of concepts that may have several different usages. Concepts can also be clustered dynamically, providing a high-level description of the evolution of the paradigmatic fields.
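To illustrate what an asymmetric proximity between concepts can look like, here is a minimal Python sketch. It uses a plain conditional frequency (the share of articles mentioning one term that also mention the other), which is a simplified stand-in chosen for illustration and not the exact measure defined in the target documents. The function name, the dictionary layout and the toy counts are all assumptions.

```python
from typing import Dict, Tuple


def asymmetric_proximity(
    occurrences: Dict[str, int],
    cooccurrences: Dict[Tuple[str, str], int],
    source: str,
    target: str,
) -> float:
    """Share of the articles mentioning `source` that also mention `target`.

    Illustrative stand-in only: a conditional frequency, not the measure
    from the target documents. Co-occurrence keys are assumed to be
    alphabetically sorted pairs of terms.
    """
    n_source = occurrences.get(source, 0)
    if n_source == 0:
        return 0.0
    pair = tuple(sorted((source, target)))
    return cooccurrences.get(pair, 0) / n_source


# Toy counts, invented for the example.
occurrences = {"network": 1000, "small world": 50}
cooccurrences = {("network", "small world"): 40}

# The specific term points strongly to the general one...
specific_to_general = asymmetric_proximity(
    occurrences, cooccurrences, "small world", "network"
)
# ...but the general term points only weakly to the specific one.
general_to_specific = asymmetric_proximity(
    occurrences, cooccurrences, "network", "small world"
)
```

With these toy counts the proximity from "small world" to "network" is much higher than in the opposite direction. An imbalance of this kind is what lets an asymmetric measure order concepts from specific to general.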
Domains of application :
Identification of paradigmatic fields defined as ordered keyword clusters.
Required :
- Indexation of a database : occurrences and co-occurrences of words over several time periods (e.g. years). No direct access to the documents is required.
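The required index can be pictured as two tables of counts per time period. The following Python sketch builds them from a list of (period, terms) records. The record format, the function name `build_index` and the sample data are hypothetical. In practice the same counts could be obtained from a database's own query statistics without touching the documents.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_index(records):
    """Count occurrences and co-occurrences of terms per time period.

    `records` is an iterable of (period, terms) pairs (a hypothetical
    input format). Each term is counted at most once per record, and
    co-occurrence keys are alphabetically sorted pairs of terms.
    """
    occ = defaultdict(Counter)
    cooc = defaultdict(Counter)
    for period, terms in records:
        unique = sorted(set(terms))
        occ[period].update(unique)
        cooc[period].update(combinations(unique, 2))
    return occ, cooc


# Invented sample records: one entry per indexed article.
records = [
    (2006, ["network", "small world"]),
    (2006, ["network", "scale free"]),
    (2007, ["network", "small world", "scale free"]),
]
occ, cooc = build_index(records)
```

Keeping one counter per period means the same proximity computation can be repeated year by year to follow how clusters of concepts change over time.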
Limits :
- Does not deal with authorship.
- Works better with very large corpora.
Advantages :
- No semantic study needed
- No particular parsing needed
- Works on full text
- Works on large corpora (several million documents)
- Overlapping categorisation



