Technical University of Denmark
QM9_README (2.57 kB)
QM9_matrix_dataset.tar.gz (81.33 GB)
2 files

QM9 data for graph2mat

dataset
Posted on 2024-08-06, 08:12, authored by Arghya Bhowmik


Creators

------------

Pol Febrer (pol.febrer@icn2.cat, ORCID 0000-0003-0904-2234)

Peter Bjorn Jorgensen (peterbjorgensen@gmail.com, ORCID 0000-0003-4404-7276)

Arghya Bhowmik (arbh@dtu.dk, ORCID 0000-0003-3198-5116)


Related publication

-------------------

The dataset is published as part of the paper:

"GRAPH2MAT: UNIVERSAL GRAPH TO MATRIX CONVERSION FOR ELECTRON DENSITY PREDICTION"

(https://doi.org/10.26434/chemrxiv-2024-j4g21)


Short description

------------------

This dataset contains the Hamiltonian, Overlap, Density and Energy Density matrices

from SIESTA calculations of the QM9 dataset (https://doi.org/10.6084/m9.figshare.c.978904.v5).


SIESTA 5.0.0 was used to compute the dataset.


Contents

-----------------


The dataset has four directories:


- basis: Contains the files specifying the basis used for each atom.

- pseudos: Contains the pseudopotentials used for the calculations (obtained from

http://www.pseudo-dojo.org/, type NC SR (ONCVPSP v0.5), PBE, standard accuracy).

- runs: The results of running the SIESTA simulations. Contents are discussed next.

- splits: The data splits used in the published paper. Each file "splits_X.json"

contains the splits for training size X.
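
As a rough illustration of the naming scheme above, the split files could be located as sketched below. The names `find_split_files`, `load_split` and `dataset_root` are hypothetical and not part of the dataset or of graph2mat. The internal layout of each JSON file is not described here, so the sketch only parses a file and returns it without assuming any keys.

```python
import json
import re
from pathlib import Path


def find_split_files(dataset_root):
    """Map each training-size label X to its "splits_X.json" path.

    `dataset_root` is assumed to be the directory that holds the
    "splits" directory described above.
    """
    pattern = re.compile(r"^splits_(.+)\.json$")
    found = {}
    for path in sorted(Path(dataset_root, "splits").glob("splits_*.json")):
        match = pattern.match(path.name)
        if match:
            # Keep X as a string: its exact format is not specified here.
            found[match.group(1)] = path
    return found


def load_split(path):
    """Parse one split file; the JSON structure is left uninterpreted."""
    with open(path, "r", encoding="utf-8") as handle:
        return json.load(handle)
```
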


The "runs" directory contains one directory for each run, named with the index

of the run. Each directory contains:

- RUN.fdf, geom.fdf: The input files used for the SIESTA calculation.

- RUN.out: The log of the SIESTA run.

- siesta.TSDE: Contains the Density and Energy Density matrices.

- siesta.TSHS: Contains the Hamiltonian and Overlap matrices.
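
The layout above lends itself to a quick completeness check before any matrix is read. The following is a minimal sketch, assuming the archive has been extracted so that `runs_dir` points at the "runs" directory. The function name `find_incomplete_runs` is hypothetical; only the file names listed above are taken from the dataset description.

```python
from pathlib import Path

# File names listed in the dataset description for each run directory.
EXPECTED_FILES = ("RUN.fdf", "geom.fdf", "RUN.out", "siesta.TSDE", "siesta.TSHS")


def find_incomplete_runs(runs_dir):
    """Return {run directory name: [missing file names]} for incomplete runs."""
    incomplete = {}
    for run in sorted(Path(runs_dir).iterdir()):
        if not run.is_dir():
            continue  # Skip stray files next to the run directories.
        missing = [name for name in EXPECTED_FILES if not (run / name).is_file()]
        if missing:
            incomplete[run.name] = missing
    return incomplete
```

An empty dictionary means every run directory holds all five files.
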


Each matrix can be read using the sisl Python package (https://github.com/zerothi/sisl)

as follows:


```python
import sisl

# Replace X with the name of the matrix to read (see the list below).
matrix = sisl.get_sile("RUN.fdf").read_X()
```


where X is hamiltonian, overlap, density_matrix or energy_density_matrix.
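
To make the X substitution concrete, here is a small sketch that builds the method name from the matrix name and loads all four matrices of one run. The helper names `read_matrix` and `load_run_matrices` are hypothetical. The only sisl calls relied on are `sisl.get_sile` and the four `read_X` methods named above.

```python
from pathlib import Path

# The four values of X named in the dataset description.
MATRIX_NAMES = ("hamiltonian", "overlap", "density_matrix", "energy_density_matrix")


def read_matrix(fdf_sile, name):
    """Call fdf_sile.read_<name>() for one of the four matrix names."""
    if name not in MATRIX_NAMES:
        raise ValueError(f"unknown matrix {name!r}; expected one of {MATRIX_NAMES}")
    return getattr(fdf_sile, f"read_{name}")()


def load_run_matrices(run_dir):
    """Read all four matrices of one run directory into a dict."""
    import sisl  # Imported lazily so read_matrix stays usable without sisl.

    fdf_sile = sisl.get_sile(str(Path(run_dir) / "RUN.fdf"))
    return {name: read_matrix(fdf_sile, name) for name in MATRIX_NAMES}
```

Given the path to one run directory, `load_run_matrices(run_dir)["hamiltonian"]` would then hold the Hamiltonian of that run.
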


To reproduce the results presented in the paper, follow the documentation of the graph2mat

package (https://github.com/BIG-MAP/graph2mat).



Cite this data

------------------

https://doi.org/10.11583/DTU.c.7310005

© 2024 Technical University of Denmark



License

-----------------

This dataset is published under the CC BY 4.0 license.

This license allows reusers to distribute, remix, adapt, and build upon the material in any medium or format, so long as attribution is given to the creator.


Funding

-----------------

Battery Interface Genome - Materials Acceleration Platform

European Commission
