pneuvial / adjclust
Showing 3 of 35 files from the diff.
Newly tracked file
R/adjclust.R changed.
Newly tracked file
R/hicClust.R changed.
Newly tracked file
R/snpClust.R changed.

@@ -2,73 +2,79 @@
2 2
NULL
3 3
4 4
#' Adjacency-constrained Clustering
5 -
#' 
5 +
#'
6 6
#' Adjacency-constrained hierarchical agglomerative clustering
7 -
#' 
7 +
#'
8 8
#' Adjacency-constrained hierarchical agglomerative clustering (HAC) is HAC in
9 9
#' which each observation is associated to a position, and the clustering is
10 10
#' constrained so as only adjacent clusters are merged. These methods are useful
11 11
#' in various application fields, including ecology (Quaternary data) and
12 12
#' bioinformatics (e.g., in Genome-Wide Association Studies (GWAS)).
13 -
#' 
14 -
#' This function is a fast implementation of the method that takes advantage of 
15 -
#' sparse similarity matrices (i.e., that have 0 entries outside of a diagonal 
16 -
#' band of width \code{h}). The method is fully described in (Dehman, 2015) and 
17 -
#' based on a kernel version of the algorithm. The different options for the 
18 -
#' implementation are available in the package vignette entitled "Notes on CHAC 
13 +
#'
14 +
#' This function is a fast implementation of the method that takes advantage of
15 +
#' sparse similarity matrices (i.e., that have 0 entries outside of a diagonal
16 +
#' band of width \code{h}). The method is fully described in (Dehman, 2015) and
17 +
#' based on a kernel version of the algorithm. The different options for the
18 +
#' implementation are available in the package vignette entitled "Notes on CHAC
19 19
#' implementation in adjclust".
20 -
#' 
21 -
#' @param mat A similarity matrix or a dist object. Most sparse formats from 
22 -
#' \code{\link[Matrix]{sparseMatrix}} are allowed
23 -
#' @param type Type of matrix : similarity or dissimilarity. Defaults to 
20 +
#'
21 +
#' @param mat A similarity matrix or a dist object. Most sparse formats from
22 +
#'   \code{\link[Matrix]{sparseMatrix}} are allowed
23 +
#' @param type Type of matrix : similarity or dissimilarity. Defaults to
24 24
#'   \code{"similarity"}
25 25
#' @param h band width. It is assumed that the similarity between two items is 0
26 26
#'   when these items are at a distance of more than band width h. Default value
27 27
#'   is \code{ncol(mat)-1}
28 -
#'   
29 -
#' @return An object of class \code{\link{chac}} which describes the tree 
30 -
#' produced by the clustering process. The object a list with the same elements 
31 -
#' as an object of class \code{\link{chac}} (\code{merge}, \code{height}, 
32 -
#' \code{order}, \code{labels}, \code{call}, \code{method}, \code{dist.method}),
33 -
#' and an extra element \code{mat}: the data on which the clustering is 
34 -
#' performed, possibly after pre-transformations described in the vignette 
35 -
#' entitled "Notes on CHAC implementation in adjclust".
36 -
#'   
37 -
#' @seealso \code{\link{snpClust}} to cluster SNPs based on linkage disequilibrium
28 +
#'
29 +
#' @return An object of class \code{\link{chac}} which describes the tree
30 +
#'   produced by the clustering process. The object is a list with the same
31 +
#'   elements as an object of class \code{\link{chac}} (\code{merge},
32 +
#'   \code{height}, \code{order}, \code{labels}, \code{call}, \code{method},
33 +
#'   \code{dist.method}), and an extra element \code{mat}: the data on which the
34 +
#'   clustering is performed, possibly after pre-transformations described in
35 +
#'   the vignette entitled "Notes on CHAC implementation in adjclust".
36 +
#'
37 +
#' @seealso \code{\link{snpClust}} to cluster SNPs based on linkage
38 +
#'   disequilibrium
38 39
#' @seealso \code{\link{hicClust}} to cluster Hi-C data
39 -
#'   
40 -
#' @references Dehman A. (2015) \emph{Spatial Clustering of Linkage 
41 -
#'   Disequilibrium Blocks for Genome-Wide Association Studies}, PhD thesis, 
40 +
#'
41 +
#' @references Dehman A. (2015) \emph{Spatial Clustering of Linkage
42 +
#'   Disequilibrium Blocks for Genome-Wide Association Studies}, PhD thesis,
42 43
#'   Universite Paris Saclay.
43 -
#'   
44 +
#'
45 +
#' @references Ambroise C., Dehman A., Neuvial P., Rigaill G., and Vialaneix N
46 +
#'   (2019). \emph{Adjacency-constrained hierarchical clustering of a band
47 +
#'   similarity matrix with application to genomics}, Algorithms for Molecular
48 +
#'   Biology 14(22).
49 +
#'
44 50
#' @examples
45 51
#' sim <- matrix(
46 52
#' c(1.0, 0.1, 0.2, 0.3,
47 53
#'   0.1, 1.0 ,0.4 ,0.5,
48 -
#'   0.2, 0.4, 1.0, 0.6, 
54 +
#'   0.2, 0.4, 1.0, 0.6,
49 55
#'   0.3, 0.5, 0.6, 1.0), nrow = 4)
50 -
#' 
56 +
#'
51 57
#' ## similarity, full width
52 58
#' fit1 <- adjClust(sim, "similarity")
53 59
#' plot(fit1)
54 -
#' 
60 +
#'
55 61
#' ## similarity, h < p-1
56 62
#' fit2 <- adjClust(sim, "similarity", h = 2)
57 63
#' plot(fit2)
58 -
#' 
64 +
#'
59 65
#' ## dissimilarity
60 66
#' dist <- as.dist(sqrt(2-(2*sim)))
61 -
#' 
67 +
#'
62 68
#' ## dissimilarity, full width
63 69
#' fit3 <- adjClust(dist, "dissimilarity")
64 70
#' plot(fit3)
65 -
#' 
71 +
#'
66 72
#' ## dissimilarity, h < p-1
67 73
#' fit4 <- adjClust(dist, "dissimilarity", h = 2)
68 74
#' plot(fit4)
69 75
#'
70 76
#' @export
71 -
#' 
77 +
#'
72 78
#' @importFrom matrixStats rowCumsums
73 79
#' @importFrom matrixStats colCumsums
74 80
#' @importFrom Matrix diag
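
The `h` parameter documented above assumes that similarities are 0 outside a diagonal band of width `h`. The following is a minimal base-R sketch of that assumption, reusing the `sim` matrix from the `@examples` block. It is not taken from the package: the helper name `band_truncate` is hypothetical, and `adjClust` itself is not called.

```r
# Hypothetical helper (not part of adjclust): set to 0 every entry whose
# row and column indices differ by more than the band width h.
band_truncate <- function(sim, h) {
  stopifnot(is.matrix(sim), nrow(sim) == ncol(sim), h >= 0)
  outside <- abs(row(sim) - col(sim)) > h  # TRUE for entries outside the band
  sim[outside] <- 0
  sim
}

# Same 4 x 4 similarity matrix as in the roxygen examples above
sim <- matrix(
  c(1.0, 0.1, 0.2, 0.3,
    0.1, 1.0, 0.4, 0.5,
    0.2, 0.4, 1.0, 0.6,
    0.3, 0.5, 0.6, 1.0), nrow = 4)

# With h = 2, only the corner entries (items 1 and 4, at distance 3) are zeroed
banded <- band_truncate(sim, h = 2)
print(banded)
```

With the documented default `h = ncol(mat) - 1` (here 3), no pair of items is further apart than the band width, so the matrix is left unchanged.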

@@ -31,13 +31,15 @@
31 31
#'   
32 32
#' @seealso \code{\link{adjClust}} \code{\link[HiTC]{HTCexp}}
33 33
#'   
34 -
#' @references Dehman A. (2015) \emph{Spatial Clustering of Linkage 
35 -
#'   Disequilibrium Blocks for Genome-Wide Association Studies}, PhD thesis, 
36 -
#'   Universite Paris Saclay.
37 -
#'   
34 +
#' @references Ambroise C., Dehman A., Neuvial P., Rigaill G., and Vialaneix N
35 +
#'   (2019). \emph{Adjacency-constrained hierarchical clustering of a band
36 +
#'   similarity matrix with application to genomics}, Algorithms for Molecular
37 +
#'   Biology 14(22).
38 +
#'
38 39
#' @references Servant N. \emph{et al} (2012). \emph{HiTC : Exploration of 
39 40
#'   High-Throughput 'C' experiments. Bioinformatics}.
40 41
#'   
42 +
#'   
41 43
#' @examples
42 44
#' # input as HiTC::HTCexp object
43 45
#' \dontrun{

@@ -39,6 +39,12 @@
39 39
#' @references Dehman, A. Ambroise, C. and Neuvial, P. (2015). Performance of a
40 40
#'   blockwise approach in variable selection using linkage disequilibrium
41 41
#'   information. *BMC Bioinformatics* 16:148.
42 +
#'   
43 +
#' @references Ambroise C., Dehman A., Neuvial P., Rigaill G., and Vialaneix N
44 +
#'   (2019). \emph{Adjacency-constrained hierarchical clustering of a band
45 +
#'   similarity matrix with application to genomics}, Algorithms for Molecular
46 +
#'   Biology 14(22).
47 +
#'
42 48
#'
43 49
#' @details If \code{x} is of class
44 50
#'   \code{\link[snpStats:SnpMatrix-class]{SnpMatrix}} or \code{\link{matrix}},
Files Coverage
R 51.75%
src 100.00%
Project Totals (9 files) 64.12%