Technical University of Denmark

Ethylene carbonate data for graph2mat

dataset
posted on 2024-08-06, 08:12 authored by Arghya Bhowmik
<p>Creators</p>
<p>------------</p>
<p>Pol Febrer (pol.febrer@icn2.cat, ORCID 0000-0003-0904-2234)</p>
<p>Peter Bjorn Jorgensen (peterbjorgensen@gmail.com, ORCID 0000-0003-4404-7276)</p>
<p>Arghya Bhowmik (arbh@dtu.dk, ORCID 0000-0003-3198-5116)</p>
<p><br></p>
<p>Related publication</p>
<p>-------------------</p>
<p>The dataset is published as part of the paper:</p>
<p>"GRAPH2MAT: UNIVERSAL GRAPH TO MATRIX CONVERSION FOR ELECTRON DENSITY PREDICTION"</p>
<p>(https://doi.org/10.26434/chemrxiv-2024-j4g21)</p>
<p>https://github.com/BIG-MAP/graph2mat</p>
<p><br></p>
<p>Short description</p>
<p>------------------</p>
<p>This dataset contains the Hamiltonian, Overlap, Density and Energy Density matrices</p>
<p>from SIESTA calculations of a subset of the MD17 aspirin dataset. The subset is taken</p>
<p>from the third split in (https://doi.org/10.6084/m9.figshare.12672038.v3).</p>
<p><br></p>
<p>SIESTA 5.0.0 was used to compute the dataset.</p>
<p><br></p>
<p>Contents</p>
<p>-----------------</p>
<p><br></p>
<p>The dataset has two auxiliary directories:</p>
<p><br></p>
<p>- pseudos: Contains the pseudopotentials used for the calculations (obtained from</p>
<p>http://www.pseudo-dojo.org/, type NC SR (ONCVPSP v0.5), PBE, standard accuracy).</p>
<p>- splits: The data splits used in the published paper. Each file "splits_X.json"</p>
<p>contains the splits for training size X.</p>
<p><br></p>
<p>In addition, three directories contain the calculations with different basis sets:</p>
<p>- matrix_dataset_defsplit: Uses the default split-valence DZP basis in SIESTA.</p>
<p>- matrix_dataset_optimsplit: Uses a split-valence DZP basis optimized for aspirin.</p>
<p>- matrix_dataset_defnodes: Uses the default nodes DZP basis in SIESTA.</p>
<p><br></p>
<p>Each of the basis directories has two subdirectories:</p>
<p>- basis: Contains the files specifying the basis used for each atom.</p>
<p>- runs: The results of running the SIESTA simulations. Contents are discussed next.</p>
<p><br></p>
<p>The "runs" directory contains one directory for each run, named with the index</p>
<p>of the run. Each directory contains:</p>
<p>- RUN.fdf, geom.fdf: The input files used for the SIESTA calculation.</p>
<p>- RUN.out: The log of the SIESTA run, which apar</p>
<p>- siesta.TSDE: Contains the Density and Energy Density matrices.</p>
<p>- siesta.TSHS: Contains the Hamiltonian and Overlap matrices.</p>
<p><br></p>
<p>Each matrix can be read using the sisl python package (https://github.com/zerothi/sisl)</p>
<p>as follows:</p>
<p><br></p>
<p>```python</p>
<p>import sisl</p>
<p><br></p>
<p># Read the Hamiltonian of this run, starting from its fdf input file</p>
<p>matrix = sisl.get_sile("RUN.fdf").read_hamiltonian()</p>
<p>```</p>
<p><br></p>
<p>The other matrices are read the same way, replacing read_hamiltonian with</p>
<p>read_overlap, read_density_matrix or read_energy_density_matrix.</p>
<p><br></p>
<p>To reproduce the results presented in the paper, follow the documentation of the graph2mat</p>
<p>package (https://github.com/BIG-MAP/graph2mat).</p>
<p><br></p>
<p><br></p>
<p>Cite this data</p>
<p>------------------</p>
<p><br></p>
<p>https://doi.org/10.11583/DTU.c.7310005</p>
<p>© 2024 Technical University of Denmark</p>
<p><br></p>
<p><br></p>
<p>License</p>
<p>-----------------</p>
<p>This dataset is published under the CC BY 4.0 license.</p>
<p>This license allows reusers to distribute, remix, adapt, and build upon the material in any medium or format, so long as attribution is given to the creator.</p>
<p><br></p>
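Example: iterating over the runs

The "runs" layout described in the Contents section lends itself to a small loop over run directories. The following is a minimal sketch, not part of the dataset or of graph2mat: the helper names list_run_inputs and read_matrix are hypothetical, the path in the usage comment is only illustrative, and the numeric sorting assumes the run directory names are plain integers (names that are not are placed last). Only sisl.get_sile and the four read_* methods named in the description are taken from the source.

```python
from pathlib import Path

# Matrix kinds named in the dataset description; each maps to a sisl read_<kind>() method.
MATRIX_KINDS = ("hamiltonian", "overlap", "density_matrix", "energy_density_matrix")


def list_run_inputs(runs_dir):
    """Return the RUN.fdf path of every run directory, ordered by run index.

    Hypothetical helper. Assumes each run lives in its own subdirectory of
    `runs_dir` and contains a RUN.fdf file, as the description states.
    """
    runs_dir = Path(runs_dir)
    run_dirs = [d for d in runs_dir.iterdir() if d.is_dir() and (d / "RUN.fdf").is_file()]
    # Sort numerically when the name is an integer (assumption); other names go last.
    run_dirs.sort(
        key=lambda d: (not d.name.isdigit(), int(d.name) if d.name.isdigit() else 0, d.name)
    )
    return [d / "RUN.fdf" for d in run_dirs]


def read_matrix(fdf_path, kind):
    """Read one matrix of a run through sisl (hypothetical wrapper)."""
    if kind not in MATRIX_KINDS:
        raise ValueError(f"kind must be one of {MATRIX_KINDS}, got {kind!r}")
    import sisl  # imported lazily so that listing runs does not require sisl

    sile = sisl.get_sile(str(fdf_path))
    return getattr(sile, f"read_{kind}")()


# Illustrative usage (requires sisl and the downloaded dataset):
# for fdf in list_run_inputs("matrix_dataset_defsplit/runs"):
#     H = read_matrix(fdf, "hamiltonian")
```

Keeping the directory listing separate from the sisl call means the run index can be inspected or filtered against a split file before any matrix is loaded.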

Funding

Battery Interface Genome - Materials Acceleration Platform

European Commission

    DTU Energy
