# ML QUANTUM SEPARABILITY
The library is a toolbox written in Python dedicated to the efficient generation of large-scale labeled datasets for the quantum separability problem in high-dimensional settings.
The repo contains the code and datasets accompanying the paper *Large-Scale Quantum Separability Through a Reproducible Machine Learning Lens*.
## Dependencies
- Python (>= 3.6)
- NumPy (>= 1.23.5)
- SciPy (>= 1.10.0)
## Organisation
- src: source base directory containing all the Python source code.
- data.zip: simulated labeled dataset with thousands of bipartite separable and entangled density matrices of sizes 9 × 9 and 49 × 49 (two-qudit mixed states with d=3 or d=7).
- main.py: example code that learns from data the decision function between separable and entangled states (the classifier in this example is an SVM from scikit-learn).
- models.zip: contains SVM models trained on the quantum-separability dataset.
## Usage
### Pipeline
The library is organised around the `Pipeline` class, which lets you define a sampling strategy as a series of transformative steps applied to density matrices sampled from an initial probability distribution.
#### Example: PPT entangled density matrices
We give a typical use case in the following code snippet. The goal is to generate density matrices that are likely to be PPT and entangled.
```python
from types import save_dmstack, load_dmstack

states, infos = Pipeline([
    ('sample', RandomInduced(k_params=[25]).states),            # induced measure of parameter 25
    ('ppt only', select(PPT.is_respected, True)),               # keep states respecting the PPT criterion
    ('fw', add(FrankWolfe(1000).approximation, key='approx')),  # compute the separable approximation
    ('sel ent', select(DistToSep(0.01, sep_key='fw_approx').predict, Label.ENT))
]).sample(1000, [3, 3])
save_dmstack('states_3x3', states, infos)
```
In this example, the following procedure is repeated until we obtain 1000 density matrices in dimensions [3,3]:
- We sample a density matrix. The random density matrices are generated according to an induced measure.
- We keep only the density matrices satisfying the PPT criterion.
- We compute the nearest separable approximation of each density matrix with the Frank-Wolfe (FW) algorithm and store it in the infos dictionary under the key 'fw_approx'.
- We keep only the density matrices whose distance to their FW separable approximation is greater than 0.01.
The Pipeline class works with 3 types of functions: sample, transform and model. These are detailed below.
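The repeat-until-enough sampling loop described above can be sketched as follows. This is a toy stand-in, not the library's actual `Pipeline` code: the class name, the trivial sampler, and the selection step are all illustrative.

```python
# Toy sketch of a Pipeline-like class: chain (name, function) steps and keep
# sampling until enough density matrices survive every step. Illustrative only.
import numpy as np

class MiniPipeline:
    def __init__(self, steps):
        self.steps = steps  # ordered (name, function) pairs; first one samples

    def sample(self, n_states, dims):
        d = int(np.prod(dims))
        sampler = self.steps[0][1]
        kept = []
        while len(kept) < n_states:
            batch = sampler(n_states, d)
            for _, step in self.steps[1:]:
                batch = step(batch)  # each step filters or transforms the batch
            kept.extend(batch)
        return kept[:n_states]

rng = np.random.default_rng(0)
# Trivial sampler: random diagonal density matrices (always valid states).
sampler = lambda n, d: [np.diag(rng.dirichlet(np.ones(d))) for _ in range(n)]
# Example selection step: keep only fairly mixed states.
select_mixed = lambda batch: [rho for rho in batch if rho.diagonal().max() < 0.9]

states = MiniPipeline([('sample', sampler), ('mixed only', select_mixed)]).sample(10, [3, 3])
```

The loop keeps drawing fresh batches because selection steps can discard most candidates, as happens when filtering for rare PPT entangled states.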
#### DMStack
A DMStack is a class that represents a stack of density matrices. It is a numpy.ndarray of shape (n_matrices, ...) with an additional attribute dims, a list indicating the dimensions of the quantum subsystems.
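A minimal sketch of such an ndarray subclass carrying a `dims` attribute (the library's actual DMStack implementation may differ in details):

```python
# Sketch of a DMStack-like class: a numpy.ndarray subclass whose instances
# carry a `dims` list describing the quantum subsystem dimensions.
import numpy as np

class MiniDMStack(np.ndarray):
    def __new__(cls, matrices, dims):
        obj = np.asarray(matrices, dtype=complex).view(cls)
        obj.dims = list(dims)  # dimensions of the quantum subsystems
        return obj

    def __array_finalize__(self, obj):
        # Propagate `dims` through slicing and views.
        self.dims = getattr(obj, 'dims', None)

stack = MiniDMStack(np.zeros((10, 9, 9)), dims=[3, 3])
```

Because `__array_finalize__` copies the attribute, slices such as `stack[:2]` keep their `dims`.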
In the example above, the states and all the associated information are then saved to the file 'states_3x3' in .mat format using the function save_dmstack; they can be retrieved later via load_dmstack.
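Since the stacks are stored in .mat format, a round trip can be approximated with SciPy directly. This is a stand-in for save_dmstack/load_dmstack, which may store additional metadata:

```python
# Stand-in for the .mat save/load round trip using SciPy's io module
# (the library's save_dmstack/load_dmstack may record more information).
import os
import tempfile
import numpy as np
from scipy.io import savemat, loadmat

# Four maximally mixed 9x9 states as a placeholder stack.
states = np.stack([np.eye(9, dtype=complex) / 9 for _ in range(4)])

path = os.path.join(tempfile.gettempdir(), 'states_3x3.mat')
savemat(path, {'states': states, 'dims': [3, 3]})
loaded = loadmat(path)
```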
#### Sample
A sample function produces a set of density matrices with relevant information and has the signature:
```python
def sample(n_states : int, dims : list[int]) -> tuple[DMStack, dict]
```
The following sample functions can be found in the library:
- samplers.utils.FromSet
- samplers.pure.RandomHaar
- samplers.separable.RandomSeparable
- samplers.entangled.AugmentedPPTEnt
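As an illustration of the signature, here is a hedged sketch of a sampler (not the library's RandomInduced): random density matrices from an induced measure, obtained from a complex Ginibre matrix G of shape (d, k) via rho = G G† / Tr(G G†).

```python
# Hypothetical sampler matching the signature above: density matrices drawn
# from the induced measure of parameter k, via the Ginibre construction.
import numpy as np

def sample_induced(n_states, dims, k=25, seed=0):
    d = int(np.prod(dims))
    rng = np.random.default_rng(seed)
    out = []
    for _ in range(n_states):
        g = rng.normal(size=(d, k)) + 1j * rng.normal(size=(d, k))
        rho = g @ g.conj().T                 # positive semidefinite by construction
        out.append(rho / np.trace(rho).real)  # normalise to unit trace
    return np.array(out), {'dims': list(dims)}

states, infos = sample_induced(3, [3, 3])
```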
#### Transform
A transform function applies a transformation to each density matrix in a set. It may use additional information about the states and produce new information:
```python
def transform(states : DMStack, infos : dict) -> tuple[DMStack, dict]
```
The following transform functions can be found in the library:
- transformers.sep_approximations.FrankWolfe
- transformers.representations.GellMann
- transformers.representations.Measures
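A hedged stand-in for a transform with this signature: map each complex density matrix to a real vector (real parts of the upper triangle plus imaginary parts of the strict upper triangle). This is not the library's GellMann transform, but it performs the same kind of complex-to-real re-representation with the same number (d²) of real degrees of freedom.

```python
# Hypothetical transform matching the signature above: vectorise each d x d
# Hermitian density matrix into d**2 real coefficients.
import numpy as np

def to_real_vectors(states, infos):
    d = states.shape[-1]
    iu, ju = np.triu_indices(d)        # upper triangle, diagonal included
    isu, jsu = np.triu_indices(d, 1)   # strict upper triangle
    vecs = np.array([
        np.concatenate([rho[iu, ju].real, rho[isu, jsu].imag])
        for rho in states
    ])
    infos = dict(infos, representation='real_vector')  # record new information
    return vecs, infos
```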
#### Model
A model function assigns a label to each density matrix in a set. It may use additional information about the states and produce new information:
```python
def model(states : DMStack, infos : dict) -> tuple[list[int], dict]
```
The following labeling models can be found in the library:
- models.criteria.PPT
- models.criteria.SepBall
- models.approx_based.DistToSep
- models.approx_based.WitQuality
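A hedged sketch of a model with this signature, based on the PPT criterion (not the library's models.criteria.PPT): take the partial transpose on the second subsystem; a negative eigenvalue certifies entanglement. The SEP/ENT label values here are illustrative.

```python
# Hypothetical model matching the signature above: label states via the PPT
# criterion. A negative partial-transpose eigenvalue proves entanglement
# (the converse holds only in dimensions 2x2 and 2x3).
import numpy as np

SEP, ENT = 0, 1  # illustrative label values

def ppt_model(states, infos, dims=(3, 3), tol=1e-10):
    dA, dB = dims
    labels = []
    for rho in states:
        r = rho.reshape(dA, dB, dA, dB)
        # Partial transpose on subsystem B: swap the two B indices.
        rho_pt = r.transpose(0, 3, 2, 1).reshape(dA * dB, dA * dB)
        min_eig = np.linalg.eigvalsh(rho_pt).min()
        labels.append(ENT if min_eig < -tol else SEP)
    return labels, infos
```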
## Data.zip
The simulated quantum separability dataset. The data are grouped by:
- dimensions (3x3 or 7x7),
- usage (TRAIN or TEST),
- category (SEP, PPT, NPPT, FW).
The content of each file can be accessed with the function types.load_dmstack, which returns a DMStack containing all the states and a dictionary containing information about each state.
For states of the PPT category, the dictionary contains an approximation of the optimal entanglement witness under the key 'fw_witness'.
In all datasets, the states are stored as complex density matrices.
Use the GellMann or Measures transformations to obtain a real-valued vector representation.
## Models.zip
The SVM models trained on the quantum-separability dataset. They are grouped by:
- dimensions of the input (3x3 or 7x7),
- creation method for the PPT-ENT examples (AUG for data augmentation, NOAUG for no data augmentation).
The type of the model and the proportion of PPT-ENT states used during training are indicated in the file name. For example, the files
`SVM_1000_[0.50]_(x)`
(with x an index in [0,4]) contain an SVM trained on a dataset of 1000 examples per class where 50% of the entangled examples were PPT-ENT.
All the models can be loaded with the function joblib.load and come as GridSearchCV models (from scikit-learn).
All the models in the library use the Gell-Mann representation of states as input.
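The loading step can be demonstrated end to end without models.zip by dumping and reloading a small stand-in GridSearchCV model; the tiny dataset, file path, and parameter grid below are all placeholders, not the repository's actual training setup.

```python
# Round-trip sketch: a GridSearchCV-wrapped SVM saved with joblib, then
# reloaded the same way the shipped models would be. Data and path are toys.
import os
import tempfile
import joblib
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Placeholder data: the class is determined by the first coordinate.
X = np.array([[0.0, 0.0], [1.0, 1.0], [0.0, 1.0], [1.0, 0.0]] * 3)
y = np.array([0, 1, 0, 1] * 3)
clf = GridSearchCV(SVC(), {'C': [1, 10]}, cv=2).fit(X, y)

path = os.path.join(tempfile.gettempdir(), 'svm_demo.joblib')
joblib.dump(clf, path)
model = joblib.load(path)  # same call used for the models in models.zip
```

After loading, the usual GridSearchCV attributes (`best_params_`, `predict`, ...) are available.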