docs/index.rst
+10 -15
@@ -1,20 +1,15 @@
-.. InfoVar documentation master file, created by
-sphinx-quickstart on Thu Sep 19 09:37:55 2024.
-You can adapt this file completely to your liking, but it should at least
-contain the root `toctree` directive.
-
 Welcome to InfoVar's documentation
 ==================================
 
-The `infovar` Python package provides tools to efficiently study the informativity of variables on data of interest.
+The ``infovar`` Python package provides tools to efficiently study the informativity of variables on data of interest.
 
 
 Context
 =======
 
 The informativity of a variable or set of variables is defined here as the ability of these variables, if known, to reduce the uncertainty we have about a quantity of interest. This uncertainty can be defined in several ways, for example in the sense of Shannon's information theory.
 
-This is a ubiquitous problem in science in general, with very concrete applications in climatology, economics, psychology, sociology, and astrophysics, to name a few. Consequently, `InfoVar` has been designed to be very general.
+This is a ubiquitous problem in science in general, with very concrete applications in climatology, economics, psychology, sociology, and astrophysics, to name a few. Consequently, *InfoVar* has been designed to be very general.
 
 This package provides tools for quantifying the statistical dependence (e.g., mutual information, but other metrics are available) between continuous numerical data and estimating the associated error as well as the influence of the latter on the order of variables in terms of importance.
 
@@ -84,7 +79,7 @@ Statistics
 
 In this project, we propose to measure the statistical dependence of variables based on the mutual information. Other metrics can also be used, such as the conditional differential entropy, which is closely related to mutual information, or canonical correlation coefficient.
 
-Mutual information and conditional differential entropy are estimated nonparametrically using [Greg Ver Steeg's implementation](http://www.isi.edu/~gregv/npeet.html). More details are given in the `assessment` directory, which evaluates the properties of each available statistics and provides further mathematical context and references.
+Mutual information and conditional differential entropy are estimated nonparametrically using [Greg Ver Steeg's implementation](http://www.isi.edu/~gregv/npeet.html). More details are given in the ``assessment`` directory, which evaluates the properties of each available statistics and provides further mathematical context and references.
 
 If you're interested in other metrics, it's possible to add and use them.
 
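Note on the Statistics hunk: to make the idea of mutual information as a dependence measure concrete, here is a minimal sketch of a plug-in estimate computed from a two-dimensional histogram. It is deliberately not the nonparametric estimator the documentation refers to (Greg Ver Steeg's implementation), only a simpler stand-in for illustration. The function name `binned_mutual_information`, the `bins` parameter, and the synthetic data are all assumptions made for this example and are not part of the `infovar` API.

```python
import math
import random
from collections import Counter


def binned_mutual_information(x, y, bins=8):
    """Plug-in estimate of I(X;Y) in nats from equal-width binning.

    Hypothetical helper for illustration only; not part of infovar.
    """
    def to_bins(values):
        lo, hi = min(values), max(values)
        width = (hi - lo) / bins or 1.0  # guard against constant input
        # Clamp so that the maximum value falls in the last bin.
        return [min(int((v - lo) / width), bins - 1) for v in values]

    bx, by = to_bins(x), to_bins(y)
    n = len(x)
    joint = Counter(zip(bx, by))      # counts of each (bin_x, bin_y) cell
    px, py = Counter(bx), Counter(by)  # marginal counts
    # Sum of p(i,j) * log(p(i,j) / (p(i) * p(j))) over occupied cells.
    return sum(
        (c / n) * math.log(c * n / (px[i] * py[j]))
        for (i, j), c in joint.items()
    )


# Synthetic data (assumed for the example): one variable tracks the
# quantity of interest closely, the other is independent noise.
rng = random.Random(0)
target = [rng.gauss(0, 1) for _ in range(2000)]
informative = [t + rng.gauss(0, 0.3) for t in target]
noise = [rng.gauss(0, 1) for _ in range(2000)]

mi_informative = binned_mutual_information(informative, target)
mi_noise = binned_mutual_information(noise, target)
print(f"informative variable: {mi_informative:.3f} nats")
print(f"independent noise:    {mi_noise:.3f} nats")
```

The noise variable still gets a small positive value because plug-in estimates are biased upward on finite samples, which is one reason the documentation stresses estimating the error associated with each statistic.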
@@ -100,9 +95,9 @@ To account for these uncertainties and to be able to compare different values pr
 Estimation for different range of values
 ----------------------------------------
 
-The heart of `InfoVar` lies in the fact that the informativity of a variable on a quantity of interest can vary according to the selected range of value of this quantity.
+The heart of *InfoVar* lies in the fact that the informativity of a variable on a quantity of interest can vary according to the selected range of value of this quantity.
 
-For example, if we're interested in house prices in California (see `examples/california-housing`), among a set of variables, geographical location (latitude, longitude) appears to be the most important pair of variables. However, if we restrict ourselves to the 10% most expensive homes, it appears that the number of rooms in the house becomes most useful. This type of observation is important, for example, from a data analysis point of view, but also in a variable selection context.
+For example, if we're interested in house prices in California (see ``examples/california-housing``), among a set of variables, geographical location (latitude, longitude) appears to be the most important pair of variables. However, if we restrict ourselves to the 10% most expensive homes, it appears that the number of rooms in the house becomes most useful. This type of observation is important, for example, from a data analysis point of view, but also in a variable selection context.
 
 More generally, taking into account these variations as a function of ranges of values of the variable of interest enables more refined analysis of phenomena. To help you understand, here are a few examples of possible applications.
 
@@ -130,20 +125,20 @@ It is also possible to perform the same analysis, but according to the value ran
 - *Data of interest:* number of medals won by each country in each of the last 10 editions of the games.
 - *Variables:* amount invested by the national Olympic committee, population, per capita income, unemployment rate.
 
-The `InfoVar` allows you to perform sensitivity analysis in two ways:
+*InfoVar* allows you to perform sensitivity analysis in two ways:
 1. Define rigid intervals for the data that varies (example: houses priced below $150k, between $150 and $350k and above $350k).
 2. Define a sliding window and calculate the evolution of the statistics almost continuously.
 
-In case 1 (discrete case), the `DiscreteHandler` class provides all the important functions for calculating, storing and accessing results. In case 2 (continuous case), the `ContinuousHandler` class is used. The notebooks in `examples` give an example of the use of each of these two classes.
+In case 1 (discrete case), the ``DiscreteHandler`` class provides all the important functions for calculating, storing and accessing results. In case 2 (continuous case), the ``ContinuousHandler`` class is used. The notebooks in ``examples`` give an example of the use of each of these two classes.
 
 
 References
 ==========
 
 [1] Einig, L & Palud, P. & Roueff, A. & Pety, J. & Bron, E. & Le Petit, F. & Gerin, M. & Chanussot, J. & Chainais, P. & Thouvenin, P.-A. & Languignon, D. & Bešlić, I. & Coudé, S. & Mazurek, H. & Orkisz, J. H. & G. Santa-Maria, M. & Ségal, L. & Zakardjian, A. & Bardeau, S. & Demyk, K. & de Souza Magalhẽs, V. & Javier R. Goicoechea & Gratier, P. & V. Guzmán, V. & Hughes, A. & Levrier, F. & Le Bourlot, J. & Darek C. Lis & Liszt, H. S. & Peretto, N. & Roueff, E & Sievers, A. (2024).
 **Quantifying the informativity of emission lines to infer physical conditions in giant molecular clouds. I. Application to model predictions.** *Astronomy & Astrophysics.*
 **Quantifying the informativity of emission lines to infer physical conditions in giant molecular clouds. II. Training robust models from selected observations.** *Astronomy & Astrophysics.*
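
Note on the last hunk: the two sensitivity-analysis modes it describes (rigid intervals versus a sliding window over the quantity of interest) can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the package's implementation. It does not use `DiscreteHandler` or `ContinuousHandler`, whose interfaces are not shown in this diff. Every function name below is hypothetical, the data are synthetic, and the absolute Pearson correlation is swapped in as a simple stand-in for mutual information.

```python
import math
import random


def abs_pearson(x, y):
    """Absolute Pearson correlation, used here as a stand-in statistic."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return abs(sxy) / math.sqrt(sxx * syy) if sxx and syy else 0.0


def stat_on_interval(variable, target, low, high, stat=abs_pearson):
    """Statistic restricted to samples whose target lies in [low, high)."""
    pairs = [(v, t) for v, t in zip(variable, target) if low <= t < high]
    if len(pairs) < 2:
        return float("nan")  # not enough samples in this range
    values, targets = zip(*pairs)
    return stat(values, targets)


def rigid_intervals(variable, target, edges, stat=abs_pearson):
    """Case 1: one value per fixed interval [edges[k], edges[k + 1])."""
    return [
        stat_on_interval(variable, target, lo, hi, stat)
        for lo, hi in zip(edges, edges[1:])
    ]


def sliding_window(variable, target, width, step, stat=abs_pearson):
    """Case 2: (window centre, statistic) pairs along the target range."""
    start, stop = min(target), max(target)
    curve = []
    while start + width <= stop + 1e-12:
        value = stat_on_interval(variable, target, start, start + width, stat)
        curve.append((start + width / 2, value))
        start += step
    return curve


# Synthetic data (assumed): var_a tracks the target only for low target
# values, var_b only for high target values.
rng = random.Random(0)
target = [rng.uniform(0.0, 1.0) for _ in range(3000)]
var_a = [t + rng.gauss(0, 0.05) if t < 0.5 else rng.gauss(0, 1) for t in target]
var_b = [rng.gauss(0, 1) if t < 0.5 else t + rng.gauss(0, 0.05) for t in target]

edges = [0.0, 0.5, 1.0]
a_by_interval = rigid_intervals(var_a, target, edges)
b_by_interval = rigid_intervals(var_b, target, edges)
a_curve = sliding_window(var_a, target, width=0.2, step=0.05)

print("var_a per interval:", [round(v, 2) for v in a_by_interval])
print("var_b per interval:", [round(v, 2) for v in b_by_interval])
```

With these data the ranking of the two variables flips between the lower and upper interval, which is the kind of range-dependent behaviour the documentation illustrates with the California housing example. The sliding-window curve shows the same transition at a finer resolution, at the cost of fewer samples per window.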