Farah notation and related work

Merged Charly Lamothe requested to merge farah_notation_and_related_work into master
1 file changed: +31 -28
@@ -34,52 +34,55 @@
introduce the problem and the motivations ...
introduce the problem and the motivations ...
\subsection{Notation}
\subsection{Notation}
Let $ X \in \mathbb{R}^{n \times d}$ \todo{is non-bold uppercase commonly used for matrices?} be the data matrix and $Y \in \mathbb{R}^{n}$ be the label vector associated with $X$, where for each $i$, $\textbf{x}_i \in \mathcal{X} \subseteq \mathbb{R}^{d}$ and $y_i \in {\mathcal Y} \subseteq \mathbb{R}$. \\
Let $ \textbf{X} \in \mathbb{R}^{n \times d}$ \todo{is non-bold uppercase commonly used for matrices?} be the data matrix and $\textbf{y} \in \mathbb{R}^{n}$ be the label vector associated with $\textbf{X}$, where for each $i$, $y_i \in {\mathcal Y}$. \\
A random forest $F_{t_1, \dots, t_l}$ \todo[inline]{possible notation confusion: does non-bold uppercase denote a function or a matrix? both, to be clarified} is a classifier made of a collection of $l$ trees $\{t_1, \dots, t_l\}$. This forest can be seen as a function, denoted as:
A random forest $F_{T_1, \dots, T_l}$ is a classifier made of a collection of $l$ trees $\{T_1, \dots, T_l\}$. A single tree and a forest can both be seen as functions. To define these tools, let us introduce ${\cal H}$ as the set of all possible trees:
 
\todo[inline]{possible notation confusion: does non-bold uppercase denote a function or a matrix? both, to be clarified}
 
%
 
$$ {\cal H} = \{T \ | \ T :\ {\cal X} \ \to \ {\cal Y}\}.$$
 
%
 
In this case, a forest can be denoted as:
%
%
$$\begin{array}{ccccc}
$$\begin{array}{ccccc}
F_{t_1, \dots, t_l} & : & \cal{X} & \to & \cal{Y} \\
F_{T_1, \dots, T_l} & : & \cal{X} & \to & \cal{Y} \\
& & \textbf{x} & \mapsto & F_{t_1, \dots, t_l}(\textbf{x}) = f(\{t_1, \dots, t_l\} , \textbf{x}) \\
& & \textbf{x} & \mapsto & F_{T_1, \dots, T_l}(\textbf{x}) = H(\{T_1, \dots, T_l\} , \textbf{x}) \\
\end{array}$$
\end{array}$$
%
%
where $f$ is a function which depends on the task\todo{f unclear: why introduce it?}. In a regression setup, where ${\cal Y} = \mathbb{R}$\todo{I don't think it is useful}, this function can be defined as:
where $H$ is a function which depends on the task\todo{f unclear: why introduce it?}. In a regression setup, where ${\cal Y} = \mathbb{R}$\todo{I don't think it is useful}, this function can be defined as:
%
%
$$f(\{t_1, \dots, t_l \} , \textbf{x}) = \sum_{i = 1}^{l} \alpha_i t_i(\textbf{x}) \ \text{ where } \alpha_i \in \mathbb{R},$$
$$H(\{T_1, \dots, T_l \} , \textbf{x}) = \sum_{i = 1}^{l} \alpha_i T_i(\textbf{x}) \ \text{ where } \alpha_i \in \mathbb{R},$$
%
%
while in a classification setup, in which ${\cal Y} = \{ c_1, \dots, c_m \}$, $f$ will be a majority vote function:
while in a classification setup, in which ${\cal Y} = \{ c_1, \dots, c_m \}$, $H$ will be a majority vote function:
%
%
$$f(\{t_1, \dots, t_l \} , \textbf{x}) = \argmax_{c \in {\cal Y}} \sum_{i = 1}^{l} \mathds{1}(t_i(\textbf{x}) = c).$$
$$H(\{T_1, \dots, T_l \} , \textbf{x}) = \argmax_{c \in {\cal Y}} \sum_{i = 1}^{l} \mathds{1}(T_i(\textbf{x}) = c),$$
 
where $\mathds{1}$ is the indicator function, which returns $1$ if its argument is true and $0$ otherwise.
%
%
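For illustration only (the values below are hypothetical and not taken from any experiment), take $l = 3$. In regression, if $T_1(\textbf{x}) = 2$, $T_2(\textbf{x}) = 4$, $T_3(\textbf{x}) = 6$ and $\alpha_1 = \alpha_2 = \alpha_3 = \frac{1}{3}$, the weighted sum gives
%
$$H(\{T_1, T_2, T_3\}, \textbf{x}) = \frac{1}{3} \cdot 2 + \frac{1}{3} \cdot 4 + \frac{1}{3} \cdot 6 = 4.$$
%
In classification with ${\cal Y} = \{c_1, c_2\}$, if $T_1(\textbf{x}) = c_1$, $T_2(\textbf{x}) = c_2$ and $T_3(\textbf{x}) = c_1$, then $c_1$ collects two votes and $c_2$ one, so
%
$$H(\{T_1, T_2, T_3\}, \textbf{x}) = \argmax_{c \in \{c_1, c_2\}} \sum_{i = 1}^{3} \mathds{1}(T_i(\textbf{x}) = c) = c_1.$$
%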
\todo{$\mathds{1}$ not defined}We \todo{no we}will need to define the prediction vector of a forest over the whole data matrix: $F_{t_1, \dots, t_l}(X) = \begin{pmatrix}
\todo{$\mathds{1}$ not defined}We \todo{no we}will need to define the prediction vector of a forest over the whole data matrix: $F_{T_1, \dots, T_l}(\textbf{X}) = \begin{pmatrix}
F_{t_1, \dots, t_l}(x_1) \\
F_{T_1, \dots, T_l}(\textbf{x}_1) \\
\vdots \\
\vdots \\
F_{t_1, \dots, t_l}(x_n)
F_{T_1, \dots, T_l}(\textbf{x}_n)
\end{pmatrix}.$\\
\end{pmatrix}.$\\
%
%
%
%
%
%
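As a hypothetical illustration, take $n = 2$ and ${\cal Y} = \{c_1, c_2\}$: if the forest predicts $c_1$ for $\textbf{x}_1$ and $c_2$ for $\textbf{x}_2$, then
%
$$F_{T_1, \dots, T_l}(\textbf{X}) = \begin{pmatrix} c_1 \\ c_2 \end{pmatrix} \in {\cal Y}^2.$$
%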
All these notations can be summarized in the following table:\\
All these notations can be summarized in Table \ref{table: notation}:\\
\begin{table}
\begin{table}
\begin{tabular}{l c}
\begin{tabular}{ l c }
lowercase & integer \\
%\hline
bold lowercase& vector \\
\textbf{x} & the vector x \\
bold capital & matrix \\
$k$ & the desired (pruned) forest size \\
calligraphic letters & vector space \\
$X$ & the matrix $X$ \\
$F_{T_1, \dots, T_l}$ & a forest of $l$ trees \\
${\cal X}$ & the data representation space \\
$F_{T_1, \dots, T_l}(\textbf{x}) \in {\cal Y}$ & the label predicted for \textbf{x} by the forest $F_{T_1, \dots, T_l}$ \\
${\cal Y}$ & the label representation space \\
$F_{T_1, \dots, T_l}(\textbf{X}) \in {\cal Y}^n$ & the labels predicted for all the rows of $\textbf{X}$ by the forest $F_{T_1, \dots, T_l}$\\
$n$ & the number of data points\\
$n$ & the number of data points \\
$d$ & the data dimension \\
$d$ & the data dimension \\
$l$ & the forest size \\
$l$ & the initial forest size \\
$F_{t_1, \dots, t_l}$ & a forest of $l$ trees \\
$k$ & the desired (pruned) forest size \\
$F_{t_1, \dots, t_l}(\textbf{x}) \in {\cal Y}$ & the label predicted for \textbf{x} by the forest $F_{t_1, \dots, t_l}$ \\
$F_{t_1, \dots, t_l}(X) \in {\cal Y}^n$ & the labels predicted for all the rows of $X$ by the forest $F_{t_1, \dots, t_l}$\\
%\hline
\end{tabular}
\end{tabular}
\caption{Notations}
\caption{Notations used in this paper}
 
\label{table: notation}
\end{table}\todo[inline]{add the notation conventions: bold lowercase: vector; non-bold uppercase: matrix, etc.}
\end{table}\todo[inline]{add the notation conventions: bold lowercase: vector; non-bold uppercase: matrix, etc.}