Farah notation and related work
Compare changes
- Charly Lamothe authored
# Conflicts: # .gitignore
+ 78
− 26
@@ -7,7 +7,8 @@
@@ -20,6 +21,8 @@
@@ -28,58 +31,96 @@
Given a matrix $D = [d_1, \dots , d_l] \in \mathbb{R}^{n\times l}$ \todo{l undefined in text} (also called a dictionary) and a signal $\textbf{y}\in \mathbb{R}^n$, finding a $k$-sparse vector $\textbf{w} \in \mathbb{R}^l$ (i.e. $|| \textbf{w} ||_0 \leq k$) that minimize $|| X\textbf{w} - \textbf{y}||$ is an NP-hard problem \todo{(ref np-hardness)}.
\item \cite{Yang2012}: once the forest $(F = t_1, \dots, t_n)$ is built, he gives each tree a score (which measures the importance of the tree in the forest). The tree with the lowest score is removed from the forest. To eliminate the next tree, all the scores are recomputed, and the tree with the lowest score is removed...\\
\item \cite{Yang2012}: once the forest $(F = t_1, \dots, t_n)$\todo{inconsistent notations} is built, he \todo{who?} gives each tree a score (which measures the importance of the tree in the forest). The tree with the lowest score is removed from the forest. To eliminate the next tree, all the scores are recomputed, and the tree with the lowest score is removed... \todo{and so forth and so on}\\
@@ -106,7 +147,18 @@ For the experiments, they use breast cancer prognosis. They reduce the size of a
\item \cite{Fawagreh2015}: The goal is to get a much smaller forest while staying accurate and diverse. To do so, they used a clustering algorithm. Let $C(t_i, T) = \{c_{i1}, \dots, c_{im}\}$ denotes a vector of class labels obtained after having $t_i$ classify the training set $T$ of size $m$, with $t_i \in F$, $F$ the forest of size $n$. Let $\mathcal{C} = \bigcup^n_{i=1} C(t_i, T)$ be the super vector of all class vectors classified by each tree $t_i$. They then applied a clustering algorithm on $\mathcal{C}$ to find $k = \sqrt{\frac{n}{2}}$ clusters. Finally, the final forest $F'$ is composed on the union of each tree that is the most representative per cluster, for each cluster. So if you have 100 trees and 7 clusters, the final number of trees will be 7. They obtained at least similar performances as with regular RF algorithm.