Quick Start - From Python

Here we present a broad overview of using the PyBigDFT library to drive BigDFT calculations using Python. If you have installed from source, you should make sure you have setup the proper environment variables using the following command:

source install/bin/bigdftvars.sh

System Manipulation

Here we define a system which is compsed of two fragments: H2 and Helium.

[1]:

import warnings
warnings.filterwarnings("ignore")

[2]:

from BigDFT.Systems import System
from BigDFT.Fragments import Fragment
from BigDFT.Atoms import Atom
from BigDFT.Visualization import InlineVisualizer

[3]:

# Create Three Atoms
at1 = Atom({"r": [0, 0, 0], "sym": "H", "units": "bohr"})
at2 = Atom({"r": [0, 0, 1.4], "sym": "H", "units": "bohr"})
at3 = Atom({"r": [10, 0, 0], "sym": "He", "units": "bohr"})

# Construct a System from Two Fragments (H2, He)
sys = System()
sys["H2:1"] = Fragment([at1, at2])
sys["He:2"] = Fragment([at3])

# Iterate Over The System
for fragid, frag in sys.items():
    for at in frag:
        print(fragid, at.sym, at.get_position())

H2:1 H [0.0, 0.0, 0.0]
H2:1 H [0.0, 0.0, 1.4]
He:2 He [10.0, 0.0, 0.0]

[4]:

# NBVAL_IGNORE_OUTPUT
viz = InlineVisualizer(400, 300)
viz.display_system(sys)

You appear to be running in JupyterLab (or JavaScript failed to load for some other reason). You need to install the 3dmol extension:
jupyter labextension install jupyterlab_3dmol

Calculation

Calculate the created system using a grid spacing of \(0.4\) and the PBE functional. A logfile is generated from which we can access the computed properties. This logfile has built in properties and can be accessed like a dictionary.

[5]:

from BigDFT.Inputfiles import Inputfile
inp = Inputfile()
inp.set_hgrid(0.4)
inp.set_xc("PBE")
inp["perf"] = {"calculate_forces": False,
               "multipole_preserving": True}

[6]:

from BigDFT.Calculators import SystemCalculator
calc = SystemCalculator(skip=True, verbose=False)

[7]:

log = calc.run(sys=sys, input=inp, name="quick", run_dir="scratch")

[8]:

print(log.energy)
print(log.log["Memory Consumption Report"]
             ["Memory occupation"])

-4.053822554528733
{'Peak Value (MB)': 123.207, 'for the array': 'f_i', 'in the routine': 'vxcpostprocessing', 'Memory Peak of process': 'unknown'}

Periodic Systems

We setup a BCC unit cell of iron and perform the calculation using a 2x2x2 k-point grid with a Monkhorst-Pack grid.

[9]:

from BigDFT.UnitCells import UnitCell

[10]:

pat = Atom({"Fe": [0, 0, 0], "units": "angstroem"})
psys = System({"CEL:0": Fragment([pat])})
psys.cell = UnitCell([2.867, 2.867, 2.867], units="angstroem")

[11]:

# NBVAL_IGNORE_OUTPUT
viz = InlineVisualizer(400, 300)
viz.display_system(psys)

You appear to be running in JupyterLab (or JavaScript failed to load for some other reason). You need to install the 3dmol extension:
jupyter labextension install jupyterlab_3dmol

[12]:

inp = Inputfile()
inp.set_hgrid(0.3)
inp.set_xc("LDA")
inp["kpt"] = {"method": "mpgrid", "ngkpt": [2, 2, 2]}

[13]:

log = calc.run(sys=psys, input=inp, name="psys", run_dir="scratch")

[14]:

# NBVAL_IGNORE_OUTPUT
_ = log.get_dos().plot()

File I/O

Read and write a PDB file.

[15]:

from BigDFT.IO import read_pdb, write_pdb

[16]:

with open("scratch/temp.pdb", "w") as ofile:
    write_pdb(sys, ofile)
with open("scratch/temp.pdb", "r") as ifile:
    sys = read_pdb(ifile)

Linear Scaling

Activate the linear scaling mode of BigDFT and compute some per fragment quantities.

[17]:

from BigDFT.PostProcessing import BigDFTool
from BigDFT.IO import read_mol2
from io import StringIO
from matplotlib import pyplot as plt
from BigDFT.Systems import plot_fragment_information

[18]:

inp = Inputfile()
inp.set_hgrid(0.55)
inp.set_psp_nlcc()  # Soft pseudopotentials
inp.set_xc("PBE")
inp["import"] = "linear"

[19]:

istr = """@<TRIPOS>MOLECULE
N3
0 0 0 0
SMALL
GASTEIGER

@<TRIPOS>ATOM
N        -15.9520   11.4820   75.1020 N.4     1  ALA         0.3855
H        -16.8590   11.1760   75.4220 H       1  ALA        -0.0890
H2       -15.9190   12.4910   75.1350 H       1  ALA        -0.0890
H3       -15.3270   11.2070   75.8460 H       1  ALA        -0.0890
CA       -15.6750   10.9380   73.6710 C.3     1  ALA        -0.0080
HA       -15.5000    9.8640   73.7340 H       1  ALA         0.0175
CB       -16.9370   11.1850   72.9780 C.3     1  ALA        -0.0702
HB1      -16.8760   10.8330   71.9480 H       1  ALA         0.0218
HB2      -17.7950   10.7240   73.4660 H       1  ALA         0.0218
HB3      -17.0660   12.2620   72.8690 H       1  ALA         0.0218
C        -14.3150   11.5090   73.1160 C.2     1  ALA         0.1989
O        -13.5240   11.9470   73.9260 O.2     1  ALA        -0.2776
N        -14.0770   11.2840   71.7960 N.am    2  VAL        -0.3051
H        -14.7740   10.8610   71.1990 H       2  VAL         0.1494
CA       -12.9310   11.8890   71.0690 C.3     2  VAL         0.1016
HA       -12.1370   12.1190   71.7800 H       2  VAL         0.0597
CB       -12.2680   10.8300   70.1710 C.3     2  VAL        -0.0197
HB       -13.0570   10.2210   69.7300 H       2  VAL         0.0318
CG1      -11.2730   11.3540   69.0870 C.3     2  VAL        -0.0605
HG11      -10.7560   10.5590   68.5490 H       2  VAL         0.0233
HG12      -11.7530   11.9770   68.3320 H       2  VAL         0.0233
HG13      -10.5100   11.8940   69.6470 H       2  VAL         0.0233
CG2      -11.4480    9.8090   70.9840 C.3     2  VAL        -0.0605
HG21      -11.9590    9.4330   71.8700 H       2  VAL         0.0233
HG22      -11.3140    8.9080   70.3860 H       2  VAL         0.0233
HG23      -10.4970   10.2560   71.2750 H       2  VAL         0.0233
C        -13.2980   13.1490   70.4060 C.2     2  VAL         0.2342
O        -14.1940   13.1280   69.5820 O.2     2  VAL        -0.2738
N        -12.5920   14.2640   70.6930 N.am    3  LEU        -0.3007
H        -11.8060   14.2210   71.3260 H       3  LEU         0.1496
CA       -12.8570   15.7250   70.2250 C.3     3  LEU         0.1249
HA       -13.7770   15.8070   69.6470 H       3  LEU         0.0616
CB       -12.9890   16.4610   71.5050 C.3     3  LEU        -0.0216
HB2      -12.3120   16.1130   72.2850 H       3  LEU         0.0291
HB3      -12.7420   17.5000   71.2880 H       3  LEU         0.0291
CG       -14.4560   16.4840   72.0620 C.3     3  LEU        -0.0445
HG       -15.0910   16.5870   71.1830 H       3  LEU         0.0296
CD1      -14.9030   15.1850   72.7940 C.3     3  LEU        -0.0626
HD11      -15.9610   15.3610   72.9910 H       3  LEU         0.0232
HD12      -14.8720   14.2770   72.1920 H       3  LEU         0.0232
HD13      -14.2220   15.0500   73.6340 H       3  LEU         0.0232
CD2      -14.5970   17.6630   72.9710 C.3     3  LEU        -0.0626
HD21      -15.6450   17.5910   73.2620 H       3  LEU         0.0232
HD22      -13.9320   17.5610   73.8280 H       3  LEU         0.0232
HD23      -14.3580   18.5620   72.4040 H       3  LEU         0.0232
C        -11.7070   16.2810   69.3360 C.2     3  LEU         0.3758
O        -10.5210   15.8410   69.5920 O.co2   3  LEU        -0.2443
OXT      -11.9460   17.0960   68.3940 O.co2   3  LEU        -0.2443
"""
Lsys = read_mol2(StringIO(istr))

[20]:

log = calc.run(sys=Lsys, input=inp, name="Lsys", run_dir="scratch")

[21]:

# NBVAL_IGNORE_OUTPUT
fig, axs = plt.subplots(2, 1)
btool = BigDFTool()
plot_fragment_information(axs[0], btool.fragment_population(Lsys, log))
axs[0].set_ylabel("Charge", fontsize=12)
axs[0].set_xticks([])
axs[0].set_xlabel("")
plot_fragment_information(axs[1], btool.run_compute_purity(Lsys, log))
axs[1].set_ylabel("Purity", fontsize=12)
fig.tight_layout()

Remote Calculations

Calculations of large systems require the use of powerful supercomputers. One way to run on a supercomputer is to write a python script (or convert a Jupyter notebook), copy the script and related data to the machine, and run the calculation through the job system. As an alternative, we have developed the remotemanager framework for automatically launching your jobs. It works by wrapping arbitrary python functions that are then sent and run on the remote machine.

[22]:

def calculate(hgrid):
    from BigDFT.Calculators import SystemCalculator
    from BigDFT.Inputfiles import Inputfile
    from BigDFT.Database.Molecules import get_molecule

    sys = get_molecule("N2")
    inp = Inputfile()
    inp.set_hgrid(hgrid)
    calc = SystemCalculator()
    log = calc.run(sys=sys, input=inp, name=str(hgrid))

    return log.energy

[23]:

from remotemanager import URL  # Replace this with the computer of your choice
from remotemanager import Dataset  # Store a set of remote calculations
url = URL()
ds = Dataset(function=calculate, url=url)
for h in [0.3, 0.35, 0.4]:
    ds.append_run({"hgrid": h})
ds.run()

appended run runner-0
appended run runner-1
appended run runner-2
assessing run for runner dataset-54fed8e7-runner-0... checks passed, running
assessing run for runner dataset-54fed8e7-runner-1... checks passed, running
assessing run for runner dataset-54fed8e7-runner-2... checks passed, running

[24]:

from time import sleep
while not all(ds.is_finished):
    sleep(10)
ds.fetch_results()
print(ds.results)

checking remotely for finished runs
checking remotely for finished runs
[-19.888866344196096, -19.888459196244888, -19.88727486068828]

For simpler runs, you can use Jupyter magics for the same effect.

[25]:

%load_ext remotemanager

[26]:

%%sanzu url=url
%%sargs hgrid = 0.35
from BigDFT.Logfiles import Logfile
log = Logfile("log-" + str(hgrid) + ".yaml")
log.log["Memory Consumption Report"]["Memory occupation"]

runner runner-1 already exists
assessing run for runner dataset-469c350d-runner-0... skipping already completed run
checking remotely for finished runs

[27]:

print(magic_dataset.results)

[{'Peak Value (MB)': 101.461, 'for the array': 'zt', 'in the routine': 'G_PoissonSolver', 'Memory Peak of process': 'unknown'}]