❗ This is a read-only mirror of the CRAN R package repository. FisherEM — The FisherEM Algorithm to Simultaneously Cluster and Visualize High-Dimensional Data

FloFloB/FisherEM-1

New version of the FisherEM package, which now implements the Bayesian Fisher-EM (BFEM) algorithm.

Installation

R Package installation

CRAN dependencies

FisherEM needs the following CRAN R packages, so check that they are installed on your computer.

required_CRAN <- c("MASS", "elasticnet", "parallel", "ggplot2")
not_installed_CRAN <- setdiff(required_CRAN, rownames(installed.packages()))
if (length(not_installed_CRAN) > 0) install.packages(not_installed_CRAN)

Installing FisherEM

  • A CRAN submission is planned for October.
  • For the development version, install from GitHub (requires the devtools package):
devtools::install_github("FloFloB/FisherEM-1")

New features

Simulation function

We added a function, simu_bfem(), to simulate the data and reproduce the experiments of the BFEM chapter. Three simulation settings are available.

# Chang 1983 setting
n = 300
simu = simu_bfem(n, which = "Chang1983")

# Section 4.2: 
p = 50
noise = 1
simu = simu_bfem(n = 900, which = "section4.2", p = p, noise = noise)

# Section 4.3: 
snr = 3
simu = simu_bfem(n=900, which = "section4.3", snr = snr)

The Bayesian Fisher-EM algorithm

The structure, arguments and output of bfem() are similar to those of fem() and sfem().

Y = iris[,-5]
cl_true = iris[,5]
res.bfem = bfem(Y, K = 3, model="DB", init = 'kmeans', method = 'gs', nstart = 10)

print(fem.ari(res.bfem, cl_true))
## [1] 0.9602777
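
The score printed above is an Adjusted Rand Index (ARI), which compares the estimated partition with the true labels. As a rough illustration of what this number measures, here is a minimal base-R sketch of the usual ARI formula. The function name adjusted_rand_index is hypothetical and this is not the code behind fem.ari(); it is only meant to show the computation on a contingency table.

```r
# Minimal sketch of the Adjusted Rand Index in base R.
# `adjusted_rand_index` is an illustrative name, not part of FisherEM.
adjusted_rand_index <- function(labels_a, labels_b) {
  stopifnot(length(labels_a) == length(labels_b))
  tab <- table(labels_a, labels_b)                 # contingency table of the two partitions
  n_pairs   <- choose(length(labels_a), 2)         # total number of pairs of observations
  sum_cells <- sum(choose(as.vector(tab), 2))      # pairs grouped together in both partitions
  sum_rows  <- sum(choose(rowSums(tab), 2))        # pairs grouped together in partition a
  sum_cols  <- sum(choose(colSums(tab), 2))        # pairs grouped together in partition b
  expected  <- sum_rows * sum_cols / n_pairs       # value expected under random labelling
  maximum   <- (sum_rows + sum_cols) / 2
  # Note: returns NaN in the degenerate case where maximum == expected
  (sum_cells - expected) / (maximum - expected)
}

truth      <- c(1, 1, 1, 2, 2, 2)
relabelled <- c("b", "b", "b", "a", "a", "a")
adjusted_rand_index(truth, relabelled)  # 1: the ARI ignores the names of the labels
```

A value of 1 means the two partitions agree up to a renaming of the clusters, and values near 0 correspond to agreement no better than chance.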

Visualisation

ggbound = plot(res.bfem, type = 'elbo')
ggbound

ggspace = plot(res.bfem, type = 'subspace')
ggspace

High-dimensional example

Simulate from the frequentist discriminative latent mixture (DLM) model with K=3 clusters. The latent space has dimension d=2, and the remaining (p-d) dimensions are centered Gaussian noise whose variance is set by the noise argument.

simu = simu_bfem(n = 900, which = "section4.2", p = 50, noise = 1)
Y = simu$Y
cl_true = simu$cls  # true labels; the rest of this example reads them from simu$cls

# plot true subspace in 2-d
library(ggplot2)
df.true = data.frame(simu$X, Cluster = factor(simu$cls))
ggtrue = ggplot(df.true, aes(x = X1, y=X2, col=Cluster, shape=Cluster)) +
  geom_point(size = 2) +
  scale_color_brewer(palette="Set2")  # color-blind friendly palette
print(ggtrue)
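
To make the simulation setting concrete, the following base-R sketch generates data of the same general shape: K=3 clusters in a 2-d latent space, padded with (p-d) centered Gaussian noise dimensions. It is not the implementation of simu_bfem(). The cluster means, the unit within-cluster variance and the random orthogonal rotation are all assumptions made for illustration.

```r
set.seed(42)
n <- 900; p <- 50; d <- 2; K <- 3; noise <- 1

# Hypothetical cluster means in the latent space (assumed, not the package's values).
mu  <- rbind(c(0, 0), c(3, 0), c(0, 3))
cls <- sample(seq_len(K), n, replace = TRUE)

# Latent coordinates: cluster mean plus unit-variance Gaussian scatter (assumed).
X <- mu[cls, ] + matrix(rnorm(n * d), n, d)

# Remaining p - d dimensions: centered Gaussian noise with variance `noise`.
E <- matrix(rnorm(n * (p - d), sd = sqrt(noise)), n, p - d)

# Assumption: a random orthogonal matrix (Q factor of a Gaussian matrix)
# hides the latent axes among the p observed variables.
W <- qr.Q(qr(matrix(rnorm(p * p), p, p)))
Y <- cbind(X, E) %*% t(W)

dim(Y)  # 900 rows, 50 columns
```

Because W is orthogonal, Y %*% W gives back cbind(X, E), so the first two columns of that product are the latent coordinates. A subspace clustering method has to estimate such a projection without knowing W.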

And then cluster the data with the BFEM algorithm.

res.bfem = bfem(simu$Y, K=3, model = 'DB', nstart = 10, method="gs")

# aricode is not among the dependencies listed above; install it first if needed
cat('Init ARI : ', aricode::ARI(simu$cls, max.col(res.bfem$Tinit)))
## Init ARI :  0.4052959
cat('Final ARI : ', aricode::ARI(simu$cls, res.bfem$cls))
## Final ARI :  0.9709887
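
The initial ARI above is computed from max.col(res.bfem$Tinit), which suggests that Tinit is a matrix with one row per observation and one column per cluster. As a small base-R illustration of that conversion from soft memberships to hard labels, consider the sketch below. The matrix post is made up for the example and is not output from bfem().

```r
# Hypothetical soft-assignment matrix: rows are observations, columns are clusters.
post <- matrix(c(0.8, 0.1, 0.1,
                 0.2, 0.7, 0.1,
                 0.1, 0.3, 0.6,
                 0.6, 0.3, 0.1),
               ncol = 3, byrow = TRUE)

# Index of the largest entry in each row; ties.method = "first" keeps the
# result deterministic (the default breaks ties at random).
hard_labels <- max.col(post, ties.method = "first")
hard_labels  # 1 2 3 1
```

The resulting integer vector can be passed directly to an ARI function together with the true labels.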

plot(res.bfem, type = "subspace")
