Non-negative Matrix Factorization (NMF) Flashcards

1
Q

Non-negative Matrix Factorization (NMF)

A

Non-negative Matrix Factorization (NMF) is a linear algebraic model that factors high-dimensional vectors into a lower-dimensional representation. Like Principal Component Analysis (PCA), NMF is a dimensionality-reduction technique. However, NMF has the advantage that it yields a parts-based, sparse, non-subtractive, and therefore more interpretable representation than PCA. Despite its higher computational cost, NMF is a powerful tool for applications where interpretability and the ability to capture local patterns in the data are important.

2
Q
Definition
A

Non-negative matrix factorization (NMF) is a group of algorithms in multivariate analysis in which a matrix X is factorized into two matrices W and H, with the property that all three matrices have no negative elements. The goal is to find two non-negative matrices W and H whose product approximates the non-negative matrix X.
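A minimal sketch of the factorization X ≈ W·H using scikit-learn (the matrix values are made-up example data, and the hyperparameters are illustrative):

```python
import numpy as np
from sklearn.decomposition import NMF

# A small 4 x 3 non-negative data matrix (illustrative values).
X = np.array([[1.0, 0.5, 0.0],
              [0.0, 1.0, 2.0],
              [2.0, 1.0, 0.5],
              [0.5, 0.0, 1.0]])

# Factorize into W (4 x 2) and H (2 x 3); both are non-negative.
model = NMF(n_components=2, init="random", random_state=0, max_iter=500)
W = model.fit_transform(X)
H = model.components_

print(W.shape, H.shape)                   # (4, 2) (2, 3)
print(np.all(W >= 0) and np.all(H >= 0))  # True
```

The product `W @ H` then gives a non-negative, rank-2 approximation of X.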

3
Q
Procedure
A

NMF solves the following problem: given an m × n matrix X with non-negative elements, find an m × r matrix W and an r × n matrix H such that the product WH approximates X. Here r is chosen smaller than both m and n, and the approximation is generally not exact because WH has at most rank r, which is lower than the rank of X.
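The procedure above is commonly written as the following constrained optimization problem (using the Frobenius norm as the standard choice of reconstruction error):

```latex
\min_{W \ge 0,\; H \ge 0} \; \lVert X - WH \rVert_F^2,
\qquad
X \in \mathbb{R}_{\ge 0}^{m \times n},\;
W \in \mathbb{R}_{\ge 0}^{m \times r},\;
H \in \mathbb{R}_{\ge 0}^{r \times n},\;
r < \min(m, n)
```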

4
Q
Applications
A

NMF is used in various applications such as text mining, image analysis, and bioinformatics. For example, in image analysis, the parts-based representation property of NMF means that it can decompose images into interpretable local parts, such as facial features in a collection of face images.

5
Q
Advantages
A

NMF provides a reduction in dimensionality, a decomposition into parts, and sparsity. These characteristics can improve the interpretability of the results, reduce noise, and lower the computational cost of subsequent processing steps.

6
Q
Limitations
A

While NMF provides a parts-based, lower-dimensional representation, it does so at the cost of increased computational requirements. NMF algorithms have a higher computational complexity than some other factorization algorithms, which can limit their applicability to large datasets.

7
Q
Algorithms
A

Several algorithms have been proposed to solve the NMF problem, including multiplicative update rules, alternating non-negativity constrained least squares, and hierarchical alternating least squares. Each has different trade-offs in terms of speed, scalability, and accuracy of the solution.
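As a concrete example of the first family, here is a sketch of the classic multiplicative update rules (Lee and Seung) for the Frobenius-norm NMF objective. The data, sizes, iteration count, and epsilon guard are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, r = 20, 15, 4
X = rng.random((m, n))   # non-negative data matrix (random, for illustration)
W = rng.random((m, r))   # random non-negative initialization
H = rng.random((r, n))
eps = 1e-10              # guards against division by zero

for _ in range(200):
    # Each update multiplies by a non-negative ratio, so W and H
    # stay non-negative; the objective is non-increasing under these rules.
    H *= (W.T @ X) / (W.T @ W @ H + eps)
    W *= (X @ H.T) / (W @ H @ H.T + eps)

err = np.linalg.norm(X - W @ H)
print(err)
```

Multiplicative updates are simple and preserve non-negativity automatically, but they can converge slowly; the alternating least-squares variants trade that simplicity for faster convergence.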

8
Q
Difference from PCA and SVD
A

Unlike PCA (Principal Component Analysis) and SVD (Singular Value Decomposition), NMF does not require its basis vectors to be orthogonal to each other, and its factors are constrained to be non-negative. This allows NMF to capture local patterns in the data more effectively.
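The sign contrast can be seen directly: PCA components typically mix positive and negative entries, while NMF factors are non-negative by construction. A small illustrative comparison on made-up random data:

```python
import numpy as np
from sklearn.decomposition import PCA, NMF

rng = np.random.default_rng(1)
X = rng.random((30, 8))  # non-negative data (illustrative)

pca = PCA(n_components=3).fit(X)
nmf = NMF(n_components=3, init="random", random_state=1, max_iter=500).fit(X)

print(np.any(pca.components_ < 0))   # typically True: mixed-sign components
print(np.all(nmf.components_ >= 0))  # True: the non-negativity constraint
```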

9
Q
Non-Negativity Constraint
A

One of the key characteristics of NMF is the non-negativity constraint. This allows the parts-based representation and enhances interpretability because the factors can only add to, not subtract from, the representation.

10
Q
Factorization Rank
A

The rank of the factorization (i.e., the number of columns in W, which equals the number of rows in H) is a critical parameter in NMF. A smaller rank gives a higher level of data compression but potentially loses important information. In contrast, a larger rank can represent the data more accurately but may also capture noise and is more computationally intensive.
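The trade-off can be seen in a small experiment (hypothetical random data; the ranks tried and iteration budget are illustrative): reconstruction error shrinks as the rank r grows, while the storage cost r·(m + n) grows with it.

```python
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(0)
X = rng.random((50, 20))  # non-negative data (illustrative)

errors = {}
for r in (2, 5, 10):
    model = NMF(n_components=r, init="random", random_state=0, max_iter=500)
    W = model.fit_transform(X)
    H = model.components_
    errors[r] = np.linalg.norm(X - W @ H)  # Frobenius reconstruction error

print(errors)  # error decreases as r increases
```

In practice the rank is often chosen by inspecting how the error curve flattens, or by cross-validation against a downstream task.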
