Merge pull request #12 from yuriok/dev0.3
update ui texts, translations and tutorials
Showing 27 changed files with 1,898 additions and 210 deletions.
@@ -108,4 +108,3 @@ venv.bak/

# logs
/logs
/i18n
@@ -0,0 +1,74 @@
# The algorithms of QGrain

## Background

Grain size distribution (GSD) data are widely used in Earth sciences, especially Quaternary geology, because they are convenient to measure and reliable. However, the use of GSD data is still oversimplified. GSD data contain abundant geological information, but only a few simplified proxies (e.g. mean grain size) are widely used. The main reason is that GSD data are hard to interpret and visualize directly.

To overcome this, researchers have developed methods that unmix a mixed, multi-modal GSD into several components, which makes interpretation and visualization easier. These methods follow two routes. One is end-member analysis (EMA) (Weltje, 1997), which calculates the end-members from a batch of samples. The other is single-specimen unmixing (SSU) (Sun et al., 2002), which treats each sample individually.

The key difference between the two routes is whether the end-members are assumed to be consistent across a batch of samples. EMA assumes that all samples share the same end-members, so that variations in GSD are caused only by changes in the fractions of the end-members. In contrast, SSU makes no assumption about the end-members, i.e. it allows them to vary between samples.

Some mature tools that take the EMA route are available (Paterson and Heslop, 2015; Dietze and Dietze, 2019), but there is no publicly available, easy-to-use tool for SSU. That is the reason for creating QGrain.

## Fundamentals

The mathematical principle of SSU has been described by Sun et al. (2002).

In short, the distribution of a mixed sample with n components can be expressed as:

y = f<sub>1</sub> * *d*<sub>1</sub>(x) + ... + f<sub>n</sub> * *d*<sub>n</sub>(x),

where y is the mixed distribution, f<sub>i</sub> is the fraction of component i, *d*<sub>i</sub> is the base distribution function (e.g. Normal or Weibull) of component i, and x is the grain size classes.

The task is to obtain the distribution parameters of each *d*<sub>i</sub>.

Therefore, the unmixing problem can be converted to an optimization problem:

minimize the error (e.g. the sum of squared errors) between y<sub>test</sub> and y<sub>guess</sub>,

where y<sub>test</sub> is the measured distribution and y<sub>guess</sub> is the mixed distribution computed from the guessed parameters.
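
The mixing formula and the error can be sketched in a few lines of Python. This is only an illustration, not QGrain's actual code: the function names (`normal_pdf`, `mixed_distribution`, `squared_error`), the choice of the Normal distribution as the base function, and the synthetic parameters are all assumptions made for the example.

```python
import numpy as np

def normal_pdf(x, mean, std):
    """Normal density, used here as the base distribution d_i (an assumption)."""
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))

def mixed_distribution(x, fractions, params):
    """y = f_1 * d_1(x) + ... + f_n * d_n(x)."""
    y = np.zeros_like(x, dtype=float)
    for f, (mean, std) in zip(fractions, params):
        y += f * normal_pdf(x, mean, std)
    return y

def squared_error(y_test, y_guess):
    """Sum of squared errors between the measured and the guessed distribution."""
    return float(np.sum((y_test - y_guess) ** 2))

# Bin numbers stand in for the grain size classes x (hypothetical data).
x = np.arange(1, 101, dtype=float)
y_test = mixed_distribution(x, [0.3, 0.7], [(30.0, 5.0), (65.0, 8.0)])
y_guess = mixed_distribution(x, [0.5, 0.5], [(35.0, 5.0), (60.0, 8.0)])

print(squared_error(y_test, y_test))   # the true parameters give zero error
print(squared_error(y_test, y_guess))  # a wrong guess gives a positive error
```

The optimizer's job is then to adjust the fractions and distribution parameters until this error is as small as possible.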

## Data preprocessing

The input data of each sample consist of two arrays: the grain size classes and the distribution. Usually, there are many 0 values at the head and tail of the distribution array. These 0 values are caused by the limited precision of the measurement. The true values should be close to 0 but not equal to 0. This difference introduces a constant error that is large enough to affect the fitting result. QGrain excludes these 0 values to obtain better performance.
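
A minimal sketch of this trimming step, assuming both arrays are NumPy arrays of equal length. The helper name `trim_zero_tails` and the sample values are hypothetical, and QGrain's own implementation may differ.

```python
import numpy as np

def trim_zero_tails(classes, distribution):
    """Drop the runs of zeros at the head and tail of the distribution.

    Zeros between non-zero values are kept, because only the head and
    tail are described as artifacts of the measurement precision.
    """
    nonzero = np.flatnonzero(distribution > 0)
    if nonzero.size == 0:
        raise ValueError("distribution contains no non-zero values")
    start, stop = nonzero[0], nonzero[-1] + 1
    return classes[start:stop], distribution[start:stop]

# Hypothetical sample: grain size classes and their measured fractions.
classes = np.array([0.5, 1.0, 2.0, 4.0, 8.0, 16.0, 32.0])
distribution = np.array([0.0, 0.0, 0.1, 0.6, 0.3, 0.0, 0.0])

x, y = trim_zero_tails(classes, distribution)
print(x)
print(y)
```

Both arrays are sliced with the same indices, so each remaining distribution value stays paired with its grain size class.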

## Local optimization

Due to the complexity of the base distribution functions, the error function is non-convex. At present, there is no highly efficient method for finding the global minimum of a non-convex function, so local optimization is used as an alternative. Local optimization converges to a minimum rapidly, but without any guarantee that the minimum is global. Optimization is also a core topic of machine learning, so many mature local optimization algorithms are available that meet our requirements. Here we use the Sequential Least Squares Programming (SLSQP) algorithm (Kraft, 1988) to perform local optimization.
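
As an illustration, the sketch below runs SLSQP through SciPy's `scipy.optimize.minimize` on a synthetic one-component sample. Whether QGrain calls SciPy in exactly this way is an assumption; the data, the bounds, the initial guess, and the tightened `ftol` are hypothetical choices made for the example.

```python
import numpy as np
from scipy.optimize import minimize

def normal_pdf(x, mean, std):
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))

# Synthetic one-component sample against bin numbers (hypothetical data).
x = np.arange(1, 101, dtype=float)
y_test = normal_pdf(x, 40.0, 6.0)

def error(params):
    """Sum of squared errors between the sample and the guessed distribution."""
    mean, std = params
    return np.sum((y_test - normal_pdf(x, mean, std)) ** 2)

# The error values are small, so ftol is tightened to avoid stopping too early.
result = minimize(
    error,
    x0=[35.0, 8.0],                        # initial guess for (mean, std)
    method="SLSQP",
    bounds=[(1.0, 100.0), (0.5, 50.0)],    # keep std strictly positive
    options={"ftol": 1e-12, "maxiter": 500},
)
print(result.x)
```

The result depends on the initial guess `x0`: a local optimizer only descends into the nearest minimum, which is why a global strategy is layered on top for multi-component samples.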

## Global optimization

As the number of components increases, the error function becomes much more complex. It is difficult to get a satisfactory result with local optimization alone.

QGrain uses a global optimization algorithm called basinhopping (Wales & Doye, 1997) to improve robustness.

This algorithm does not search the whole space. Instead, after one local optimization finishes, it shifts to another initial point and starts a new local optimization. This allows it to escape some local minima while remaining efficient.
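
The following sketch shows the idea with SciPy's `scipy.optimize.basinhopping`, using SLSQP as the inner local optimizer. It is an illustration under assumptions, not QGrain's actual configuration: the two-component synthetic sample, the parameter layout, the bounds, `niter=20`, and the fixed `seed` are all hypothetical.

```python
import numpy as np
from scipy.optimize import basinhopping

def normal_pdf(x, mean, std):
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))

# Synthetic two-component sample against bin numbers (hypothetical data).
x = np.arange(1, 101, dtype=float)
y_test = 0.4 * normal_pdf(x, 30.0, 5.0) + 0.6 * normal_pdf(x, 65.0, 8.0)

def error(params):
    """params = (fraction1, mean1, std1, mean2, std2); fraction2 = 1 - fraction1."""
    f1, m1, s1, m2, s2 = params
    y_guess = f1 * normal_pdf(x, m1, s1) + (1.0 - f1) * normal_pdf(x, m2, s2)
    return np.sum((y_test - y_guess) ** 2)

bounds = [(0.0, 1.0), (1.0, 100.0), (0.5, 50.0), (1.0, 100.0), (0.5, 50.0)]

# Each hop perturbs the current point and starts a new SLSQP run from there.
result = basinhopping(
    error,
    x0=[0.5, 20.0, 10.0, 80.0, 10.0],
    minimizer_kwargs={"method": "SLSQP", "bounds": bounds,
                      "options": {"ftol": 1e-12}},
    niter=20,
    seed=0,   # fixed only to make the example repeatable
)
print(result.x, result.fun)
```

The number of hops (`niter`) trades running time against the chance of escaping a poor local minimum, so it would normally be tuned to the data.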

## Base distribution function

At present, QGrain supports the following distribution types:

|Distribution Type|Number of Parameters|Fitting Space|Skew|
|:-:|:-:|:-:|:-:|
|Normal<sup>1</sup>|2|Bin Numbers|No|
|Weibull|2|Bin Numbers|Yes|
|Gen. Weibull<sup>2</sup>|3|Bin Numbers|Yes|

1. A Normal distribution against bin numbers is equivalent to a Lognormal distribution against grain size (μm).
2. **Gen. Weibull** is the General Weibull distribution, which has an additional location parameter.
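
To make the table concrete, the three distribution types can be evaluated against bin numbers with `scipy.stats`. Using `norm` and `weibull_min` here is an assumption for illustration (QGrain may define its own functions), and all parameter values are hypothetical.

```python
import numpy as np
from scipy.stats import norm, weibull_min

bins = np.arange(1, 101, dtype=float)  # bin numbers are the fitting space

# Normal: 2 parameters (mean, standard deviation), symmetric.
y_normal = norm.pdf(bins, loc=50.0, scale=8.0)

# Weibull: 2 parameters (shape, scale), skewed.
y_weibull = weibull_min.pdf(bins, c=2.0, scale=40.0)

# Gen. Weibull: 3 parameters; the extra location parameter shifts the curve.
y_gen_weibull = weibull_min.pdf(bins, c=2.0, loc=20.0, scale=40.0)

# The location parameter moves the peak without changing the shape.
print(bins[np.argmax(y_weibull)], bins[np.argmax(y_gen_weibull)])
```

The location parameter gives the Gen. Weibull curve one more degree of freedom than the plain Weibull, at the cost of one more parameter to fit per component.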

## Steps of fitting

1. Load data
2. Get information (e.g. distribution type and component number)
3. Generate the error function
4. Preprocess data
5. Global optimization (basinhopping)
6. Final optimization (another local optimization, SLSQP)
7. Generate the fitting result from the optimized parameters of the error function
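
The steps above can be chained into one short script. This is a sketch under assumptions rather than QGrain's implementation: it uses SciPy, a synthetic two-component Normal sample, and hypothetical names (`mixture`, `make_error`), bounds, and tolerances.

```python
import numpy as np
from scipy.optimize import basinhopping, minimize

def normal_pdf(x, mean, std):
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))

def mixture(x, params):
    """Two-component Normal mixture; params = (f1, mean1, std1, mean2, std2)."""
    f1, m1, s1, m2, s2 = params
    return f1 * normal_pdf(x, m1, s1) + (1.0 - f1) * normal_pdf(x, m2, s2)

# 1. Load data (synthetic here; tiny values are zeroed to mimic limited precision)
bins = np.arange(1, 101, dtype=float)
measured = mixture(bins, (0.4, 30.0, 5.0, 65.0, 8.0))
measured[measured < 1e-4] = 0.0

# 2. Get information: Normal distribution, two components
bounds = [(0.0, 1.0), (1.0, 100.0), (0.5, 50.0), (1.0, 100.0), (0.5, 50.0)]
x0 = [0.5, 20.0, 10.0, 80.0, 10.0]

# 3. Generate the error function (sum of squared errors)
def make_error(x, y_test):
    def error(params):
        return np.sum((y_test - mixture(x, params)) ** 2)
    return error

# 4. Preprocess data: exclude the zeros at the head and tail
nonzero = np.flatnonzero(measured > 0)
keep = slice(nonzero[0], nonzero[-1] + 1)
error = make_error(bins[keep], measured[keep])

# 5. Global optimization (basinhopping with SLSQP as the local optimizer)
local = {"method": "SLSQP", "bounds": bounds, "options": {"ftol": 1e-12}}
rough = basinhopping(error, x0, minimizer_kwargs=local, niter=20, seed=0)

# 6. Final optimization (another SLSQP run with a tighter tolerance)
final = minimize(error, rough.x, method="SLSQP", bounds=bounds,
                 options={"ftol": 1e-14})

# 7. Generate the fitting result on the full range of bins
fitted = mixture(bins, final.x)
print(final.x)
```

Generating the error function from a factory keeps step 3 independent of the data, so the same function can be reused after preprocessing each sample.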

## References

* [Weltje, G.J. End-member modeling of compositional data: Numerical-statistical algorithms for solving the explicit mixing problem. Math Geol 29, 503–549 (1997). doi:10.1007/BF02775085](https://doi.org/10.1007/BF02775085)

* Kraft, D. A software package for sequential quadratic programming. Tech. Rep. DFVLR-FB 88-28, DLR German Aerospace Center, Institute for Flight Mechanics, Köln, Germany (1988).

* Wales, D.J., Doye, J.P.K. Global optimization by basin-hopping and the lowest energy structures of Lennard-Jones clusters containing up to 110 atoms. Journal of Physical Chemistry A 101, 5111 (1997).
@@ -0,0 +1,54 @@
# How to fit the samples

Before starting, please make sure you have loaded the grain size distribution (GSD) data correctly.

If everything goes well, you will see an interface like this. By default, QGrain fits the first sample automatically.

![App Appearance With Data Loaded](../figures/app_appearance_with_data_loaded.png)

## The layout of the app

QGrain consists of several docks that are movable, resizable, floatable, and closable. You can arrange them as you please. To display a dock that has been closed, use the **Docks** menu.

### Docks

* Canvas: The dock that displays the raw data and fitting result of the selected sample.
* Control Panel: The dock that controls the fitting behaviour.
* Raw Data Table: The dock that shows the GSD data of the samples.
* Recorded Data Table: The dock that shows the recorded fitting results.

## Tips

If you are unsure about a widget, hover over it to see its tip.

* Click the radio buttons of **Distribution Type** to switch the distribution function.
* Click the **+**/**-** buttons to increase/decrease the guessed component number.
* **Observe Iteration**: Whether to display the iteration procedure.
* **Inherit Parameters**: Whether to inherit the parameters of the last fitting. This improves accuracy and efficiency when the samples are continuous.
* **Auto Fit**: Whether to fit automatically after the sample data change.
* **Auto Record**: Whether to record the fitting result automatically after fitting finishes.
* Click the **Previous** button to go back to the previous sample.
* Click the **Next** button to jump to the next sample.
* Click the **Auto Run Orderly** button to run the program automatically. The samples from the current one to the last will be processed one by one.
* Click the **Cancel** button to cancel the fitting in progress.
* Click the **Try Fit** button to fit the current sample.
* Click the **Record** button to record the current fitting result. Note: It records the LAST SUCCESSFUL fitting result, which is NOT necessarily that of the CURRENT SAMPLE.
* Click the **Multi Cores Fitting** button to fit all samples. It uses all CPU cores to accelerate the calculation.
* Move the lines in the **Canvas** dock to set the expected mean value of each component if the fitting does not return a proper result and you are sure the component number is correct.

## Workflow

The workflow of fitting samples is as follows:

1. Try fitting one typical sample until you are satisfied.

   You can adjust the component number and watch the chart of the fitting result to find a proper value.

   If it does not return a correct result, you can check the **Observe Iteration** option to find the reason. You can also move the lines to test whether the component number is proper.

   If it returns a proper result once the expected mean values are given, you can adjust the algorithm settings so that it reaches the proper result automatically.

2. Test other samples with this component number.
3. If the component number is suitable for all samples, use auto fit to process them all.
4. If some results are incorrect, cancel the fitting and return to step 1. If there are not too many incorrect results, you can fit and record them manually.
5. Save the fitting results to a file.