Structure and dynamics of electron-phonon coupled systems using neural quantum states (2024)

Ankit MahajanDepartment of Chemistry, Columbia University, New York, NY 10027, United States  Paul J. RobinsonDepartment of Chemistry, Columbia University, New York, NY 10027, United StatesDepartment of Chemistry and Chemical Biology, Harvard University, Cambridge, Massachusetts 02138, United States  Joonho LeeDepartment of Chemistry, Columbia University, New York, NY 10027, United StatesDepartment of Chemistry and Chemical Biology, Harvard University, Cambridge, Massachusetts 02138, United States  David R. Reichmandrr2103@columbia.eduDepartment of Chemistry, Columbia University, New York, NY 10027, United States

Abstract

In this work, we use neural quantum states (NQS) to describe the high-dimensional wave functions of electron-phonon coupled systems. We demonstrate that NQS can accurately and systematically learn the underlying physics of such problems through a variational Monte Carlo optimization of the energy with minimal incorporation of physical information even in highly challenging cases. We assess the ability of our approach across various lattice model examples featuring different types of couplings. The flexibility of our NQS formulation is demonstrated via application to ab initio models parametrized by density functional perturbation theory consisting of electron or hole bands coupled linearly to dispersive phonons. We compute accurate real-frequency spectral properties of electron-phonon systems via a novel formalism based on NQS. Our work establishes a general framework for exploring diverse ground state and dynamical phenomena arising in electron-phonon systems, including the non-perturbative interplay of correlated electronic and electron-phonon effects in systems ranging from simple lattice models to realistic models of materials parametrized by ab initio calculations.

1 Introduction

The interaction between bosonic fields with charged and neutral carriers can lead to the formation of emergent quasiparticles with greatly altered properties. For example, transport measurements carried out on lightly doped or photoexcited carriers in inorganic and organic semiconductors have been linked to the formation of polarons,Coropceanu etal. (2007); Zhu and Podzorov (2015); Schilcher etal. (2021); Tulyagankhodjaev etal. (2023) quasiparticles composed of the carrier enveloped by a cloud of phonons. Features such as kinks and satellites in photoemission spectra at well-defined phonon frequencies offer additional direct evidence of polaronic physics.Li etal. (2020); Verdi etal. (2017); Moser etal. (2013); Kang etal. (2018); Setvin etal. (2014) Moreover, electron-phonon (eph) interactions can dramatically influence the interactions of electrons with other electrons. Perhaps the most important example of this is superconductivity in simple metals, embodied in the Bardeen-Cooper-Schrieffer (BCS) theory, where Cooper pairs formed by phonon-mediated attraction between electrons lead to an instability of the Fermi sea and the formation of an emergent superconducting state.Bardeen etal. (1957) The role of strong eph interactions in unconventional superconductivity remains a subject of debate.Lanzara etal. (2001); Alexandrov and Kornilovitch (2002); Giustino etal. (2008); Yin etal. (2013); Karakuzu etal. (2017); Costa etal. (2018); Luo etal. (2022); Ly etal. (2023); Zhang etal. (2023)

Since Landau proposed the concept of the self-trapping of an electron via lattice distortions, there has been extensive theoretical investigation into eph interactions.Alexandrov and Devreese (2010); Franchini etal. (2021); Giustino (2017) This exploration has entailed the development of simple lattice and continuum models, such as those pioneered by HolsteinHolstein (1959, 1959) and Fröhlich,Fröhlich (1954) alongside methods to solve the resulting eph problem. Various strategies for tackling these models include path integralFeynman (1955); Kornilovitch (1998); Beyl etal. (2018); Lee etal. (2021); Cohen-Stead etal. (2022) and diagrammatic methods,Prokof’ev and Svistunov (1998); Berciu (2006) variationalLee etal. (1953); Toyozawa (1961); Brown etal. (1997); Alder etal. (1997); Ohgoe and Imada (2014); Hohenadler etal. (2004); Karakuzu etal. (2017); Ferrari etal. (2020) and perturbative techniques,Marsiglio (1995); Hohenadler etal. (2003) density matrix renormalization group (DMRG) approachesJeckelmann and White (1998), as well as various flavors of exact diagonalizationBonča etal. (1999); Wang etal. (2020). Although some of these approaches are numerically exact in some domains, their applicability across different types of Hamiltonians and parameter ranges is constrained in practice due to potentially high computational costs. Recently, significant progress has been made in constructing and studying ab initio models based on density functional perturbation theory (DFPT).Baroni etal. (2001); Sjakste etal. (2015); Giustino (2017); Poncé etal. (2016); Zhou etal. (2021); Verdi and Giustino (2015) Methods initially developed for the investigation of lattice models are now being adapted to this ab initio setting.Lee etal. (2021); Lafuente-Bartolome etal. (2022) Systematically improvable calculations towards the exact limit have not yet been reported. In addition to allowing quantitative comparison with experiments and predictive power, such calculations also help assess the accuracy of approximate but more affordable computational methods.

Neural quantum states (NQS) present a promising avenue for addressing these challenges. Due to their universal approximation properties, neural networks (NN) have emerged as versatile wave function ansatzes for many-body systems.Carleo and Troyer (2017) Both empirical and theoreticalGao and Duan (2017); Sharir etal. (2020) evidence suggests that NN can efficiently represent physically relevant quantum states in non-trivial interacting systems. Leveraging technological developments in machine learning and via the use of Monte Carlo sampling, NQS can often be evaluated and optimized efficiently, allowing one to systematically converge answers close to the exact limit by increasing the number of parameters. Examples of recent successes of the NQS approach include the investigation of low-lying states and dynamics of spin and fermion models on both latticesCarleo and Troyer (2017); Nomura etal. (2017); Choo etal. (2018, 2019); Luo and Clark (2019); Vicentini etal. (2019); Szabó and Castelnovo (2020); RobledoMoreno etal. (2022) and in continuum settings.Hermann etal. (2020); Pfau etal. (2020) While a few studies have considered the use of NQS for eph coupled systems,Nomura (2020); Rzadkowski etal. (2022) their applicability and potential for detailed investigations of such systems remains largely unexplored. Extracting dynamical quantities from NQS is also an area of active research. The spectral properties of electronic and spin systems have been obtained using the correction vector method,Hendry and Feiguin (2019) Chebyshev expansion,Hendry etal. (2021), and real-time evolution.Mendes-Santos etal. (2023) The development of accurate real-frequency spectral properties is especially important in eph coupled systems, where sharp spectral features that are difficult to capture by analytic continuation at particular phonon frequencies often arise.

The flexibility and accuracy of NQS come at the expense of having to perform stochastic nonlinear optimization, presenting challenges more severe than those encountered in deterministic methods.Cai and Liu (2018); Westerhout etal. (2020); Szabó and Castelnovo (2020) Good results require careful development of heuristics for the choice of representations and learning strategies, which have to be tailored to the problem domain. In this work, we present our efforts to address these challenges for eph systems. We find that the NQS approaches we put forward here are competitive with the best exact approaches for standard polaron problems, while providing facile access to highly accurate spectral properties and the ability to flexibly treat complex problems such as those with correlated electrons and ab initio parametrized Hamiltonians.

In the following sections, we present the details of Hamiltonians, wave functions, and the methods used to calculate their properties (Sec. 2). We then present the results of our calculations on a variety of eph systems to demonstrate the utility of our approach (Sec. 3). Finally, we conclude with a discussion of our results and future directions (Sec. 4).

2 Theory

2.1 Hamiltonians

We consider the general linear coupling eph Hamiltonian which, when written in the momentum basis, is given by

H=n,kϵnkcn,kcn,k+ν,qωνqbν,qbν,q+k1,k2,k3ijmnVk1k2k3ijmnci,k1cj,k2cm,k3cn,k1+k2k3+mnνkqgmnνkqcn,k+qcn,k(bν,q+bν,q).𝐻subscript𝑛𝑘subscriptitalic-ϵ𝑛𝑘subscriptsuperscript𝑐𝑛𝑘subscript𝑐𝑛𝑘subscript𝜈𝑞subscript𝜔𝜈𝑞subscriptsuperscript𝑏𝜈𝑞subscript𝑏𝜈𝑞subscriptsubscript𝑘1subscript𝑘2subscript𝑘3𝑖𝑗𝑚𝑛superscriptsubscript𝑉subscript𝑘1subscript𝑘2subscript𝑘3𝑖𝑗𝑚𝑛superscriptsubscript𝑐𝑖subscript𝑘1superscriptsubscript𝑐𝑗subscript𝑘2subscript𝑐𝑚subscript𝑘3subscript𝑐𝑛subscript𝑘1subscript𝑘2subscript𝑘3subscript𝑚𝑛𝜈𝑘𝑞subscript𝑔𝑚𝑛𝜈𝑘𝑞subscriptsuperscript𝑐𝑛𝑘𝑞subscript𝑐𝑛𝑘subscriptsuperscript𝑏𝜈𝑞subscript𝑏𝜈𝑞\begin{split}H&=\sum_{n,k}\epsilon_{nk}c^{\dagger}_{n,k}c_{n,k}+\sum_{\nu,q}%\omega_{\nu q}b^{\dagger}_{\nu,q}b_{\nu,q}\\&+\sum_{\begin{subarray}{c}k_{1},k_{2},k_{3}\\ijmn\end{subarray}}V_{k_{1}k_{2}k_{3}}^{ijmn}c_{i,k_{1}}^{\dagger}c_{j,k_{2}}^%{\dagger}c_{m,k_{3}}c_{n,k_{1}+k_{2}-k_{3}}\\&+\sum_{mn\nu kq}g_{mn\nu kq}c^{\dagger}_{n,k+q}c_{n,k}(b^{\dagger}_{\nu,-q}+b%_{\nu,q}).\\\end{split}start_ROW start_CELL italic_H end_CELL start_CELL = ∑ start_POSTSUBSCRIPT italic_n , italic_k end_POSTSUBSCRIPT italic_ϵ start_POSTSUBSCRIPT italic_n italic_k end_POSTSUBSCRIPT italic_c start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_n , italic_k end_POSTSUBSCRIPT italic_c start_POSTSUBSCRIPT italic_n , italic_k end_POSTSUBSCRIPT + ∑ start_POSTSUBSCRIPT italic_ν , italic_q end_POSTSUBSCRIPT italic_ω start_POSTSUBSCRIPT italic_ν italic_q end_POSTSUBSCRIPT italic_b start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_ν , italic_q end_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT italic_ν , italic_q end_POSTSUBSCRIPT end_CELL end_ROW start_ROW start_CELL end_CELL start_CELL + ∑ start_POSTSUBSCRIPT start_ARG start_ROW start_CELL italic_k start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , italic_k start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT , italic_k start_POSTSUBSCRIPT 3 end_POSTSUBSCRIPT end_CELL end_ROW start_ROW start_CELL italic_i italic_j italic_m italic_n end_CELL end_ROW end_ARG end_POSTSUBSCRIPT italic_V start_POSTSUBSCRIPT italic_k start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT italic_k start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT italic_k start_POSTSUBSCRIPT 3 end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_i italic_j italic_m italic_n end_POSTSUPERSCRIPT italic_c start_POSTSUBSCRIPT italic_i , italic_k start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT italic_c start_POSTSUBSCRIPT italic_j , italic_k start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT italic_c start_POSTSUBSCRIPT italic_m , italic_k start_POSTSUBSCRIPT 3 end_POSTSUBSCRIPT end_POSTSUBSCRIPT italic_c start_POSTSUBSCRIPT italic_n , italic_k start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT + italic_k start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT - italic_k start_POSTSUBSCRIPT 3 end_POSTSUBSCRIPT end_POSTSUBSCRIPT end_CELL end_ROW start_ROW start_CELL end_CELL start_CELL + ∑ start_POSTSUBSCRIPT italic_m italic_n italic_ν italic_k italic_q end_POSTSUBSCRIPT italic_g start_POSTSUBSCRIPT italic_m italic_n italic_ν italic_k italic_q end_POSTSUBSCRIPT italic_c start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_n , italic_k + italic_q end_POSTSUBSCRIPT italic_c start_POSTSUBSCRIPT italic_n , italic_k end_POSTSUBSCRIPT ( italic_b start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_ν , - italic_q end_POSTSUBSCRIPT + italic_b start_POSTSUBSCRIPT italic_ν , italic_q end_POSTSUBSCRIPT ) . end_CELL end_ROW(1)

Here cn,ksubscript𝑐𝑛𝑘c_{n,k}italic_c start_POSTSUBSCRIPT italic_n , italic_k end_POSTSUBSCRIPT are electronic annihilation operators for an electron with crystal momentum k𝑘kitalic_k in band n𝑛nitalic_n, and bνqsubscript𝑏𝜈𝑞b_{\nu q}italic_b start_POSTSUBSCRIPT italic_ν italic_q end_POSTSUBSCRIPT are phonon annihilation operators with crystal momentum q𝑞qitalic_q in band ν𝜈\nuitalic_ν. ϵksubscriptitalic-ϵ𝑘\epsilon_{k}italic_ϵ start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT are electronic energies, Vk1k2k3ijmnsuperscriptsubscript𝑉subscript𝑘1subscript𝑘2subscript𝑘3𝑖𝑗𝑚𝑛V_{k_{1}k_{2}k_{3}}^{ijmn}italic_V start_POSTSUBSCRIPT italic_k start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT italic_k start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT italic_k start_POSTSUBSCRIPT 3 end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_i italic_j italic_m italic_n end_POSTSUPERSCRIPT are electron-electron interactions, ων,qsubscript𝜔𝜈𝑞\omega_{\nu,q}italic_ω start_POSTSUBSCRIPT italic_ν , italic_q end_POSTSUBSCRIPT are phonon energies, and gmnνkqsubscript𝑔𝑚𝑛𝜈𝑘𝑞g_{mn\nu kq}italic_g start_POSTSUBSCRIPT italic_m italic_n italic_ν italic_k italic_q end_POSTSUBSCRIPT are eph couplings. We have omitted spin indices for brevity. In an ab initio setting, the band energies and interaction terms can be obtained from DFPT calculations.

ModelPhononElectron phonon coupling term Hephsubscript𝐻ephH_{\text{eph}}italic_H start_POSTSUBSCRIPT eph end_POSTSUBSCRIPTDimensionless
degrees of freedomSite basisMomentum basiscoupling constant
Holstein1-d phonons on sitesgcici(bi+bi)𝑔subscriptsuperscript𝑐𝑖subscript𝑐𝑖subscriptsuperscript𝑏𝑖subscript𝑏𝑖gc^{\dagger}_{i}c_{i}(b^{\dagger}_{i}+b_{i})italic_g italic_c start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_c start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ( italic_b start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT + italic_b start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT )gNck+qck(bq+bq)𝑔𝑁subscriptsuperscript𝑐𝑘𝑞subscript𝑐𝑘subscriptsuperscript𝑏𝑞subscript𝑏𝑞\dfrac{g}{\sqrt{N}}c^{\dagger}_{k+q}c_{k}(b^{\dagger}_{-q}+b_{q})divide start_ARG italic_g end_ARG start_ARG square-root start_ARG italic_N end_ARG end_ARG italic_c start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_k + italic_q end_POSTSUBSCRIPT italic_c start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT ( italic_b start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT start_POSTSUBSCRIPT - italic_q end_POSTSUBSCRIPT + italic_b start_POSTSUBSCRIPT italic_q end_POSTSUBSCRIPT )λ=g22ωt𝜆superscript𝑔22𝜔𝑡\lambda=\dfrac{g^{2}}{2\omega t}italic_λ = divide start_ARG italic_g start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG start_ARG 2 italic_ω italic_t end_ARG
SSHn𝑛nitalic_n-d phonons on sitesg(cicj+cjci)(bi,{ij}+bi,{ij}bj,{ij}bj,{ij})𝑔subscriptsuperscript𝑐𝑖subscript𝑐𝑗subscriptsuperscript𝑐𝑗subscript𝑐𝑖subscriptsuperscript𝑏𝑖𝑖𝑗subscript𝑏𝑖𝑖𝑗subscriptsuperscript𝑏𝑗𝑖𝑗subscript𝑏𝑗𝑖𝑗{\begin{split}&g\left(c^{\dagger}_{i}c_{j}+c^{\dagger}_{j}c_{i}\right)\\&\qquad(b^{\dagger}_{i,\left\{ij\right\}}+b_{i,\left\{ij\right\}}-b^{\dagger}_%{j,\left\{ij\right\}}-b_{j,\left\{ij\right\}})\end{split}}start_ROW start_CELL end_CELL start_CELL italic_g ( italic_c start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_c start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT + italic_c start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT italic_c start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) end_CELL end_ROW start_ROW start_CELL end_CELL start_CELL ( italic_b start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i , { italic_i italic_j } end_POSTSUBSCRIPT + italic_b start_POSTSUBSCRIPT italic_i , { italic_i italic_j } end_POSTSUBSCRIPT - italic_b start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_j , { italic_i italic_j } end_POSTSUBSCRIPT - italic_b start_POSTSUBSCRIPT italic_j , { italic_i italic_j } end_POSTSUBSCRIPT ) end_CELL end_ROW2igN(sin(kμ+qμ)sin(kμ))ck+qck(bq,μ+bq,μ)2𝑖𝑔𝑁subscript𝑘𝜇subscript𝑞𝜇subscript𝑘𝜇subscriptsuperscript𝑐𝑘𝑞subscript𝑐𝑘subscriptsuperscript𝑏𝑞𝜇subscript𝑏𝑞𝜇{\begin{split}\dfrac{2ig}{\sqrt{N}}&\left(\sin(k_{\mu}+q_{\mu})-\sin(k_{\mu})%\right)\\&\qquad\qquad c^{\dagger}_{k+q}c_{k}(b^{\dagger}_{-q,\mu}+b_{q,\mu})\end{split}}start_ROW start_CELL divide start_ARG 2 italic_i italic_g end_ARG start_ARG square-root start_ARG italic_N end_ARG end_ARG end_CELL start_CELL ( roman_sin ( italic_k start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT + italic_q start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT ) - roman_sin ( italic_k start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT ) ) end_CELL end_ROW start_ROW start_CELL end_CELL start_CELL italic_c start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_k + italic_q end_POSTSUBSCRIPT italic_c start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT ( italic_b start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT start_POSTSUBSCRIPT - italic_q , italic_μ end_POSTSUBSCRIPT + italic_b start_POSTSUBSCRIPT italic_q , italic_μ end_POSTSUBSCRIPT ) end_CELL end_ROWλ=g2ωt𝜆superscript𝑔2𝜔𝑡\lambda=\dfrac{g^{2}}{\omega t}italic_λ = divide start_ARG italic_g start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG start_ARG italic_ω italic_t end_ARG
Bond1-d phonons on bondsg(cicj+cjci)(bij+bij)𝑔subscriptsuperscript𝑐𝑖subscript𝑐𝑗subscriptsuperscript𝑐𝑗subscript𝑐𝑖subscriptsuperscript𝑏delimited-⟨⟩𝑖𝑗subscript𝑏delimited-⟨⟩𝑖𝑗g\left(c^{\dagger}_{i}c_{j}+c^{\dagger}_{j}c_{i}\right)(b^{\dagger}_{\langle ij%\rangle}+b_{\langle ij\rangle})italic_g ( italic_c start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_c start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT + italic_c start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT italic_c start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) ( italic_b start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT start_POSTSUBSCRIPT ⟨ italic_i italic_j ⟩ end_POSTSUBSCRIPT + italic_b start_POSTSUBSCRIPT ⟨ italic_i italic_j ⟩ end_POSTSUBSCRIPT )gN(ei(kμ+qμ)+eikμ)ck+qck(bq,μ+bq,μ)𝑔𝑁superscript𝑒𝑖subscript𝑘𝜇subscript𝑞𝜇superscript𝑒𝑖subscript𝑘𝜇subscriptsuperscript𝑐𝑘𝑞subscript𝑐𝑘subscriptsuperscript𝑏𝑞𝜇subscript𝑏𝑞𝜇{\begin{split}\dfrac{g}{\sqrt{N}}&\left(e^{i(k_{\mu}+q_{\mu})}+e^{-ik_{\mu}}%\right)\\&\qquad\qquad c^{\dagger}_{k+q}c_{k}(b^{\dagger}_{-q,\mu}+b_{q,\mu})\end{split}}start_ROW start_CELL divide start_ARG italic_g end_ARG start_ARG square-root start_ARG italic_N end_ARG end_ARG end_CELL start_CELL ( italic_e start_POSTSUPERSCRIPT italic_i ( italic_k start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT + italic_q start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT ) end_POSTSUPERSCRIPT + italic_e start_POSTSUPERSCRIPT - italic_i italic_k start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT end_POSTSUPERSCRIPT ) end_CELL end_ROW start_ROW start_CELL end_CELL start_CELL italic_c start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_k + italic_q end_POSTSUBSCRIPT italic_c start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT ( italic_b start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT start_POSTSUBSCRIPT - italic_q , italic_μ end_POSTSUBSCRIPT + italic_b start_POSTSUBSCRIPT italic_q , italic_μ end_POSTSUBSCRIPT ) end_CELL end_ROWλ=g22ωt𝜆superscript𝑔22𝜔𝑡\lambda=\dfrac{g^{2}}{2\omega t}italic_λ = divide start_ARG italic_g start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG start_ARG 2 italic_ω italic_t end_ARG

Phenomenologically, various special cases of this Hamiltonian have been studied in the literature. For polarons, models like the HolsteinHolstein (1959, 1959) and eph coupled SSHSu etal. (1979) models include only local interactions often with dispersionless phonons. They have the following general form in the site and momentum bases,

H=He+Hν+Heph,He=ijt(cicj+h.c.),=k2t(μcos(kμ))ckck,Hν=ω0ν,ibν,ibν,i=ω0ν,qbν,qbν,q,\begin{split}H&=H_{e}+H_{\nu}+H_{\text{eph}},\\H_{e}&=-\sum_{\langle ij\rangle}t(c^{\dagger}_{i}c_{j}+\text{h.c.}),\\&=\sum_{k}-2t\left(\sum_{\mu}\cos(k_{\mu})\right)c^{\dagger}_{k}c_{k},\\H_{\nu}&=\omega_{0}\sum_{\nu,i}b^{\dagger}_{\nu,i}b_{\nu,i}=\omega_{0}\sum_{%\nu,q}b^{\dagger}_{\nu,q}b_{\nu,q},\\\end{split}start_ROW start_CELL italic_H end_CELL start_CELL = italic_H start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT + italic_H start_POSTSUBSCRIPT italic_ν end_POSTSUBSCRIPT + italic_H start_POSTSUBSCRIPT eph end_POSTSUBSCRIPT , end_CELL end_ROW start_ROW start_CELL italic_H start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT end_CELL start_CELL = - ∑ start_POSTSUBSCRIPT ⟨ italic_i italic_j ⟩ end_POSTSUBSCRIPT italic_t ( italic_c start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_c start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT + h.c. ) , end_CELL end_ROW start_ROW start_CELL end_CELL start_CELL = ∑ start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT - 2 italic_t ( ∑ start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT roman_cos ( italic_k start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT ) ) italic_c start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT italic_c start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT , end_CELL end_ROW start_ROW start_CELL italic_H start_POSTSUBSCRIPT italic_ν end_POSTSUBSCRIPT end_CELL start_CELL = italic_ω start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ∑ start_POSTSUBSCRIPT italic_ν , italic_i end_POSTSUBSCRIPT italic_b start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_ν , italic_i end_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT italic_ν , italic_i end_POSTSUBSCRIPT = italic_ω start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ∑ start_POSTSUBSCRIPT italic_ν , italic_q end_POSTSUBSCRIPT italic_b start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_ν , italic_q end_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT italic_ν , italic_q end_POSTSUBSCRIPT , end_CELL end_ROW(2)

where t𝑡titalic_t is the electronic hopping amplitude between neighboring sites i𝑖iitalic_i and j𝑗jitalic_j, and ω0subscript𝜔0\omega_{0}italic_ω start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT is a fixed frequency for all phonon branches. We set t=1𝑡1t=1italic_t = 1, so that all energies are measured in units of t𝑡titalic_t. The use of periodic boundary conditions allows the Hamiltonian to be expressed in a simple form in the momentum basis. In this work, we consider integer lattices viz. one-dimensional chain, two-dimensional square lattice, and three-dimensional cubic lattice.

Electron-phonon interactions are given in Table 1 for the Holstein, Bond, and SSH models. The Holstein Hamiltonian couples phonons on lattice sites to the local density of electrons, referred to as diagonal coupling. The Holstein model is appropriate for describing the coupling of doped or excited charge carriers to high-frequency optical phonons. The SSH model, also known as the Peierls model, on the other hand, couples carrier hopping to vibrations through an off-diagonal coupling. It results from the modulation of hopping parameters by changes in nuclear positions. A variation of this model, termed the Bond model,Sengupta etal. (2003) consists of phonon modes situated on bonds rather than lattice sites, which couple directly to the carrier hopping. These polaron models can be extended to study systems with multiple interacting electrons. On-site Hubbard interactions are commonly used for this purpose, where the interaction is given by

HHubbard=Uinini.subscript𝐻Hubbard𝑈subscript𝑖subscript𝑛𝑖absentsubscript𝑛𝑖absentH_{\text{Hubbard}}=U\sum_{i}n_{i\uparrow}n_{i\downarrow}.italic_H start_POSTSUBSCRIPT Hubbard end_POSTSUBSCRIPT = italic_U ∑ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_n start_POSTSUBSCRIPT italic_i ↑ end_POSTSUBSCRIPT italic_n start_POSTSUBSCRIPT italic_i ↓ end_POSTSUBSCRIPT .(3)

While VMC does not have a sign-problem per se, states with non-trivial amplitude sign or phase structures can be harder to optimize.Cai and Liu (2018); Westerhout etal. (2020); Szabó and Castelnovo (2020) The Holstein and Bond models are stoquasticBravyi etal. (2006) in the site basis, whereas the SSH model is not. To see this note that in a basis specified by the position of the electron xesubscript𝑥𝑒x_{e}italic_x start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT and number of phonons at each lattice site {νi}subscript𝜈𝑖\left\{\nu_{i}\right\}{ italic_ν start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT }, these Hamiltonians have negative off-diagonal elements when t𝑡titalic_t and g𝑔gitalic_g are positive. Thus, according to the Perron-Frobenius theorem, these Hamiltonians have a unique ground state with positive amplitudes in this basis. The SSH Hamiltonian does not have this property and, therefore, does not have a sign-definite ground state in the same basis. Furthermore, while the Holstein model maintains stoquasticity in the momentum basis, the Bond and SSH models do not share this attribute. We note that similar observations have been noted in studies using diagrammatic quantum Monte Carlo (DQMC).Prokof’ev and Svistunov (2022)

2.2 Wave functions

We will use the notation n=({ei},{νi})nsubscript𝑒𝑖subscript𝜈𝑖\textbf{n}=(\left\{e_{i}\right\},\left\{\nu_{i}\right\})n = ( { italic_e start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT } , { italic_ν start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT } ), where {ei}subscript𝑒𝑖\left\{e_{i}\right\}{ italic_e start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT } and {νi}subscript𝜈𝑖\left\{\nu_{i}\right\}{ italic_ν start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT } are electron and phonon occupation numbers, respectively, to denote the basis vectors. Our NN wave function ansatz is given by

|ψ=nexp[f(n)]iνi!|𝐧,ket𝜓subscriptn𝑓nsubscriptproduct𝑖subscript𝜈𝑖ket𝐧|\psi\rangle=\sum_{\textbf{n}}\frac{\exp\left[f(\textbf{n})\right]}{\sqrt{%\prod_{i}\nu_{i}!}}|\mathbf{n}\rangle,| italic_ψ ⟩ = ∑ start_POSTSUBSCRIPT n end_POSTSUBSCRIPT divide start_ARG roman_exp [ italic_f ( n ) ] end_ARG start_ARG square-root start_ARG ∏ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_ν start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ! end_ARG end_ARG | bold_n ⟩ ,(4)

where the function f𝑓fitalic_f is a sum of two neural nets:

f(n)=r(n)+iϕ(n).𝑓n𝑟n𝑖italic-ϕnf(\textbf{n})=r(\textbf{n})+i\phi(\textbf{n}).italic_f ( n ) = italic_r ( n ) + italic_i italic_ϕ ( n ) .(5)

These NNs operate with real parameters and generate real outputs. The function ϕitalic-ϕ\phiitalic_ϕ serves to impart a phase to the wave function. The use of an exponential form, as identified in prior research, aids in the effective representation of wave function amplitudes that may vary significantly across many orders of magnitude. The factor in the denominator involving phonon numbers is included to improve the stability of wave function optimization. It echoes the functional form of coherent states of the harmonic oscillator. In the case of a single site coupled to a phonon mode, the neural net simply has to represent a linear function of the phonon number. Other ways of encoding phonon degrees of freedom in wave functions include continuum representationsOhgoe and Imada (2014) and binary string representations of phonon occupation numbers.Nomura (2020); Jeckelmann and White (1998) The latter has the advantage that all inputs are binary but a restriction is placed on the total number of phonons on a site.

In this work, we employ multilayer perceptrons (MLPs), which are fully connected feedforward networks.Cybenko (1989); Goodfellow etal. (2016) An MLP with n𝑛nitalic_n layers is defined recursively as

𝐱i+1=σ(𝐖i.𝐱i+𝐛i),\mathbf{x}_{i+1}=\sigma(\mathbf{W}_{i}.\mathbf{x}_{i}+\mathbf{b}_{i}),bold_x start_POSTSUBSCRIPT italic_i + 1 end_POSTSUBSCRIPT = italic_σ ( bold_W start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT . bold_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT + bold_b start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) ,(6)

where 𝐱isubscript𝐱𝑖\mathbf{x}_{i}bold_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT are outputs of hidden neuron layers with 𝐱0subscript𝐱0\mathbf{x}_{0}bold_x start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT the input, 𝐖isubscript𝐖𝑖\mathbf{W}_{i}bold_W start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT a weight matrix, 𝐛isubscript𝐛𝑖\mathbf{b}_{i}bold_b start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT a bias, and σ𝜎\sigmaitalic_σ is the activation function which acts element-wise on the vector input. In this work, we use rectifier (ReLU) activation functions.Goodfellow etal. (2016) Other network architectures, like convolutional neural networks (CNN),Choo etal. (2018); Roth etal. (2023) restricted Boltzmann machines,Carleo and Troyer (2017); Nomura (2020); Glasser etal. (2018)and autoregressive neural netsHumeniuk etal. (2023) have been used in several studies. CNNs, in particular, have the advantage of being inherently translationally invariant. Translational invariance can be imposed on MLP states too, as we discuss next.

The use of symmetry is essential for increasing the efficiency of the representation. Encoding symmetries biases the NN in a way that obviates the work required to learn them from scratch, allowing one to achieve similar accuracy with a smaller NN with fewer parameters. It also enables the targeting of excited states belonging to different irreducible representations (irrep) of the symmetry group. One way to impose symmetries is to generate images of the input under the action of all elements of the symmetry group and average the NN outputs over them

ψ(n)=1Ngcgψ(gn),𝜓n1𝑁subscript𝑔subscript𝑐𝑔𝜓𝑔n\psi(\textbf{n})=\frac{1}{N}\sum_{g}c_{g}\psi(g\textbf{n}),italic_ψ ( n ) = divide start_ARG 1 end_ARG start_ARG italic_N end_ARG ∑ start_POSTSUBSCRIPT italic_g end_POSTSUBSCRIPT italic_c start_POSTSUBSCRIPT italic_g end_POSTSUBSCRIPT italic_ψ ( italic_g n ) ,(7)

where g𝑔gitalic_g denotes a group operation and cgsubscript𝑐𝑔c_{g}italic_c start_POSTSUBSCRIPT italic_g end_POSTSUBSCRIPT is its character in the irrep being targeted. This approach has been used in VMC with traditional wave functionsTahara and Imada (2008); Becca and Sorella (2017) as well as in studies of NQS states.Roth etal. (2023); Reh etal. (2023) Another possibility is to arbitrarily choose one of the equivalent sets of permutations, e.g. the lexicographically smallest one. For polarons, a natural choice is shifting phonon occupations along with the electron in real space.

ψ(n)=ψ(T0n),𝜓n𝜓subscript𝑇0n\psi(\textbf{n})=\psi(T_{0}\textbf{n}),italic_ψ ( n ) = italic_ψ ( italic_T start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT n ) ,(8)

where T0subscript𝑇0T_{0}italic_T start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT denotes the translation operator that shifts the electron to an arbitrarily chosen origin of the lattice. Although this approach is computationally cheaper, we find it is susceptible to converging to local minima during optimization in our numerical experiments. Therefore we use the averaging method in this work. We note that when working in the momentum basis, imposing translational symmetry becomes trivial, because one can simply restrict the Monte Carlo random walk to configurations with a fixed momentum. For polaron and bipolaron problems, the cost of evaluating the dense Hamiltonian in momentum space is comparable to the cost of translational symmetry projection in the site basis. This is not the case for the treatment of many-electron systems, where the inter-electronic interaction incurs a steeper cost in momentum space. Other Abelian symmetries like some point groups (not used here) can be similarly restored. In bipolaron problems, we make use of spin symmetry in addition to translational symmetry to target singlet states.

In addition to NN states, we also consider physically motivated approximate polaron wave functions for comparison. The ansatz due to Davydov is given by

|ψDavydov=(iϕici)exp(νξνbνξνbν)|0,ketsubscript𝜓Davydovsubscript𝑖subscriptitalic-ϕ𝑖superscriptsubscript𝑐𝑖subscript𝜈subscript𝜉𝜈superscriptsubscript𝑏𝜈superscriptsubscript𝜉𝜈subscript𝑏𝜈ket0|\psi_{\text{Davydov}}\rangle=\left(\sum_{i}\phi_{i}c_{i}^{\dagger}\right)\exp%\left(-\sum_{\nu}\xi_{\nu}b_{\nu}^{\dagger}-\xi_{\nu}^{*}b_{\nu}\right)|0\rangle,| italic_ψ start_POSTSUBSCRIPT Davydov end_POSTSUBSCRIPT ⟩ = ( ∑ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_ϕ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_c start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT ) roman_exp ( - ∑ start_POSTSUBSCRIPT italic_ν end_POSTSUBSCRIPT italic_ξ start_POSTSUBSCRIPT italic_ν end_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT italic_ν end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT - italic_ξ start_POSTSUBSCRIPT italic_ν end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT italic_b start_POSTSUBSCRIPT italic_ν end_POSTSUBSCRIPT ) | 0 ⟩ ,(9)

where i𝑖iitalic_i and ν𝜈\nuitalic_ν denote electron and phonon indices, respectively, and ϕisubscriptitalic-ϕ𝑖\phi_{i}italic_ϕ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT and ξνsubscript𝜉𝜈\xi_{\nu}italic_ξ start_POSTSUBSCRIPT italic_ν end_POSTSUBSCRIPT are complex variational parameters. This wave function is a product of a coherent state for the phonons and a linear combination of electron creation operators, designed to represent a small polaron localized in real space. We note that an MLP wave function can mimic this form efficiently with a single hidden neuron. This is accomplished by setting the weights connecting the electronic occupations to the hidden neuron to lnϕisubscriptitalic-ϕ𝑖\ln\phi_{i}roman_ln italic_ϕ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT and those for the phonon occupations to ξνsubscript𝜉𝜈\xi_{\nu}italic_ξ start_POSTSUBSCRIPT italic_ν end_POSTSUBSCRIPT, and using a linear activation function. Imposing translational symmetry on the Davydov wave function entangles the electron with the phonons giving rise to the Toyozawa ansatz,

|ψToyozawak=Pk|ψDavydov,ketsuperscriptsubscript𝜓Toyozawa𝑘subscript𝑃𝑘ketsubscript𝜓Davydov|\psi_{\text{Toyozawa}}^{k}\rangle=P_{k}|\psi_{\text{Davydov}}\rangle,| italic_ψ start_POSTSUBSCRIPT Toyozawa end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_k end_POSTSUPERSCRIPT ⟩ = italic_P start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT | italic_ψ start_POSTSUBSCRIPT Davydov end_POSTSUBSCRIPT ⟩ ,(10)

where Pksubscript𝑃𝑘P_{k}italic_P start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT denotes a projector onto the subspace of fixed momentum k𝑘kitalic_k, which acts as in Eq. 7. Translationally symmetrized NN states can thus be thought of as a generalization of the Toyozawa ansatz. Note that this wave function can be evaluated and optimized deterministically at a polynomial cost, but we use it within VMC here. The low computational cost of its evaluation allows us to assess finite-size effects in cases where calculations with NN states are expensive on large systems.

The ability of our ansatz to capture electron correlation can be demonstrated by constructing Jastrow states as MLPs. Jastrow factors are known to describe Hubbard-like physics very accuratelyTahara and Imada (2008); Becca and Sorella (2017) and are given by

J(𝐧)=exp(ijvijninj),𝐽𝐧subscript𝑖𝑗subscript𝑣𝑖𝑗subscript𝑛𝑖subscript𝑛𝑗J(\mathbf{n})=\exp\left(\sum_{ij}v_{ij}n_{i}n_{j}\right),italic_J ( bold_n ) = roman_exp ( ∑ start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT italic_v start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT italic_n start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_n start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ) ,(11)

where nisubscript𝑛𝑖n_{i}italic_n start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT is the occupation number of electrons at site i𝑖iitalic_i, and vijsubscript𝑣𝑖𝑗v_{ij}italic_v start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT are variational parameters. One can obtain the product of a pair of occupation numbers using unit weights and a bias of -1 with a ReLU activation function,

n1n2=ReLU(n1+n21).subscript𝑛1subscript𝑛2ReLUsubscript𝑛1subscript𝑛21n_{1}n_{2}=\text{ReLU}(n_{1}+n_{2}-1).italic_n start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT italic_n start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT = ReLU ( italic_n start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT + italic_n start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT - 1 ) .(12)

It might appear that to obtain all pairwise products in the Jastrow factors requires O(N2)𝑂superscript𝑁2O(N^{2})italic_O ( italic_N start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ) hidden neurons with O(N3)𝑂superscript𝑁3O(N^{3})italic_O ( italic_N start_POSTSUPERSCRIPT 3 end_POSTSUPERSCRIPT ) parameters, but it is possible to construct an MLP with O(N2)𝑂superscript𝑁2O(N^{2})italic_O ( italic_N start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ) hidden neurons as one would expect given the quadratic scaling number of parameters in vijsubscript𝑣𝑖𝑗v_{ij}italic_v start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT. This is easiest to see with a two-hidden layer network. One half of the the first hidden layer consists of N𝑁Nitalic_N neurons, one for each input. For a given input site one constructs the following quantity as the output of the corresponding hidden neuron:

θi=jvijnj.subscript𝜃𝑖subscript𝑗subscript𝑣𝑖𝑗subscript𝑛𝑗\theta_{i}=\sum_{j}v_{ij}n_{j}.italic_θ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT = ∑ start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT italic_v start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT italic_n start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT .(13)

For convenience, we also copy over the inputs to another set of N𝑁Nitalic_N hidden neurons in the first hidden layer. In the second layer, we construct N𝑁Nitalic_N products of the form niθisubscript𝑛𝑖subscript𝜃𝑖n_{i}\theta_{i}italic_n start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_θ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT using the construction above. Finally, these outputs are summed and exponentiated to obtain the Jastrow factor in the output neuron. Note that one can encode electron-phonon Jastrow factorsKarakuzu etal. (2017); Ohgoe and Imada (2014) in a similar manner. An efficient encoding of Jastrow factors as Restricted Boltzmann Machines has been reported previously.Clark (2018)

2.3 Variational Monte Carlo (VMC)

We optimize and calculate the properties of the above wave functions using Monte Carlo sampling. Although VMC has not been used extensively for calculations of eph systems, some studies have employed this techniqueKarakuzu etal. (2017); Ohgoe and Imada (2014); Ferrari etal. (2020). VMC is suitable for computing properties of NN wave functions since it only requires the overlap of the state with a walker configuration. Here, we perform random walks in the space of electron and phonon number configurations. An observable O𝑂Oitalic_O can be sampled using

ψ|O|ψψ|ψ=w|ψ|w|2ψ|ψw|O|ψw|ψ,quantum-operator-product𝜓𝑂𝜓inner-product𝜓𝜓subscript𝑤superscriptinner-product𝜓𝑤2inner-product𝜓𝜓quantum-operator-product𝑤𝑂𝜓inner-product𝑤𝜓\frac{\langle\psi|O|\psi\rangle}{\langle\psi|\psi\rangle}=\sum_{w}\frac{|%\langle\psi|w\rangle|^{2}}{\langle\psi|\psi\rangle}\frac{\langle w|O|\psi%\rangle}{\langle w|\psi\rangle},divide start_ARG ⟨ italic_ψ | italic_O | italic_ψ ⟩ end_ARG start_ARG ⟨ italic_ψ | italic_ψ ⟩ end_ARG = ∑ start_POSTSUBSCRIPT italic_w end_POSTSUBSCRIPT divide start_ARG | ⟨ italic_ψ | italic_w ⟩ | start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG start_ARG ⟨ italic_ψ | italic_ψ ⟩ end_ARG divide start_ARG ⟨ italic_w | italic_O | italic_ψ ⟩ end_ARG start_ARG ⟨ italic_w | italic_ψ ⟩ end_ARG ,(14)

where we have combined carrier and phonon coordinates into w𝑤witalic_w. This requires the evaluation of the following local quantity for each walker configuration.

OL(w)=w|O|ψw|ψ=ww|O|ww|ψw|ψ,subscript𝑂𝐿𝑤quantum-operator-product𝑤𝑂𝜓inner-product𝑤𝜓subscriptsuperscript𝑤quantum-operator-product𝑤𝑂superscript𝑤inner-productsuperscript𝑤𝜓inner-product𝑤𝜓O_{L}(w)=\frac{\langle w|O|\psi\rangle}{\langle w|\psi\rangle}=\sum_{w^{\prime%}}\langle w|O|w^{\prime}\rangle\frac{\langle w^{\prime}|\psi\rangle}{\langle w%|\psi\rangle},italic_O start_POSTSUBSCRIPT italic_L end_POSTSUBSCRIPT ( italic_w ) = divide start_ARG ⟨ italic_w | italic_O | italic_ψ ⟩ end_ARG start_ARG ⟨ italic_w | italic_ψ ⟩ end_ARG = ∑ start_POSTSUBSCRIPT italic_w start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_POSTSUBSCRIPT ⟨ italic_w | italic_O | italic_w start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ⟩ divide start_ARG ⟨ italic_w start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT | italic_ψ ⟩ end_ARG start_ARG ⟨ italic_w | italic_ψ ⟩ end_ARG ,(15)

where wsuperscript𝑤w^{\prime}italic_w start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT are configurations generated from w𝑤witalic_w by application of the observable operator. The cost of local energy evaluation in the site basis for a space-local Hamiltonian, like the Holstein model, is the same as the cost of overlap calculation since the number of excitations generated is a small system-size independent constant. For general long-range Hamiltonians, this is no longer the case and local energy evaluation becomes the bottleneck of the VMC calculation.

We sample walkers w𝑤witalic_w from the distribution |w|ψ|2ψ|ψsuperscriptinner-product𝑤𝜓2inner-product𝜓𝜓\frac{|\langle w|\psi\rangle|^{2}}{\langle\psi|\psi\rangle}divide start_ARG | ⟨ italic_w | italic_ψ ⟩ | start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG start_ARG ⟨ italic_ψ | italic_ψ ⟩ end_ARG using continuous time sampling, which is a rejection-free sampling techniqueSabzevari and Sharma (2018). Starting from the current walker w𝑤witalic_w, we choose the next walker configuration out of the wsuperscript𝑤w^{\prime}italic_w start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT configurations generated during the local energy evaluation with probability proportional to |w|ψw|ψ|inner-productsuperscript𝑤𝜓inner-product𝑤𝜓|\frac{\langle w^{\prime}|\psi\rangle}{\langle w|\psi\rangle}|| divide start_ARG ⟨ italic_w start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT | italic_ψ ⟩ end_ARG start_ARG ⟨ italic_w | italic_ψ ⟩ end_ARG |. One of the advantages of the VMC approach is that one is not required to truncate the phonon space, as there are no restrictions on the number of phonons sampled. This is in contrast to deterministic methods like DMRGJeckelmann and White (1998). The gradient of the energy with respect to the variational parameters is similarly sampled. The use of backpropagation allows efficient calculation of energy gradients at the same cost as energy.Goodfellow etal. (2016) Having sampled energies and gradients, the variational parameters can be optimized using gradient-based optimization methods. We use the AMSGrad method, which is a variation of gradient descent with momentum.Reddi etal. (2019)

2.4 Real-frequency Green’s functions and excited states

To calculate the dynamical properties of the system, we work in the tangent space spanned by the derivatives of the wave function with respect to the variational parameters.McWeeny (1989) The state corresponding to the μ𝜇\muitalic_μth parameter is given by

|ψμ=|ψ0pν,ketsubscript𝜓𝜇ketsubscript𝜓0subscript𝑝𝜈|\psi_{\mu}\rangle=\frac{\partial|\psi_{0}\rangle}{\partial p_{\nu}},| italic_ψ start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT ⟩ = divide start_ARG ∂ | italic_ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ⟩ end_ARG start_ARG ∂ italic_p start_POSTSUBSCRIPT italic_ν end_POSTSUBSCRIPT end_ARG ,(16)

with |ψ0ketsubscript𝜓0|\psi_{0}\rangle| italic_ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ⟩ being the optimized ground state. The resulting basis is nonorthogonal and has linear dependencies. Thus to obtain the excited states and spectral functions, we solve the generalized eigenvalue problem

𝐇𝐂=E𝐒𝐂,𝐇𝐂𝐸𝐒𝐂\mathbf{H}\mathbf{C}=E\mathbf{S}\mathbf{C},bold_HC = italic_E bold_SC ,(17)

where the Hamiltonian and overlap matrices are given by

𝐇μν=ψμ|H|ψν,𝐒μν=ψμ|ψν.formulae-sequencesubscript𝐇𝜇𝜈quantum-operator-productsubscript𝜓𝜇𝐻subscript𝜓𝜈subscript𝐒𝜇𝜈inner-productsubscript𝜓𝜇subscript𝜓𝜈\begin{split}\mathbf{H}_{\mu\nu}&=\langle\psi_{\mu}|H|\psi_{\nu}\rangle,\\\mathbf{S}_{\mu\nu}&=\langle\psi_{\mu}|\psi_{\nu}\rangle.\end{split}start_ROW start_CELL bold_H start_POSTSUBSCRIPT italic_μ italic_ν end_POSTSUBSCRIPT end_CELL start_CELL = ⟨ italic_ψ start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT | italic_H | italic_ψ start_POSTSUBSCRIPT italic_ν end_POSTSUBSCRIPT ⟩ , end_CELL end_ROW start_ROW start_CELL bold_S start_POSTSUBSCRIPT italic_μ italic_ν end_POSTSUBSCRIPT end_CELL start_CELL = ⟨ italic_ψ start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT | italic_ψ start_POSTSUBSCRIPT italic_ν end_POSTSUBSCRIPT ⟩ . end_CELL end_ROW(18)

The eigenspectrum of this effective Hamiltonian {Ei,|i}subscript𝐸𝑖ket𝑖\left\{E_{i},|i\rangle\right\}{ italic_E start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT , | italic_i ⟩ } can be used to calculate spectral functions as

A(ω)=1πImψ|1ωH+iη|ψ=1πi|ψ|i|2η(ωEi)2+η2,𝐴𝜔1𝜋Imquantum-operator-product𝜓1𝜔𝐻𝑖𝜂𝜓1𝜋subscript𝑖superscriptinner-product𝜓𝑖2𝜂superscript𝜔subscript𝐸𝑖2superscript𝜂2\begin{split}A(\omega)&=-\frac{1}{\pi}\text{Im}\langle\psi|\frac{1}{\omega-H+i%\eta}|\psi\rangle\\&=-\frac{1}{\pi}\sum_{i}|\langle\psi|i\rangle|^{2}\frac{\eta}{(\omega-E_{i})^{%2}+\eta^{2}},\\\end{split}start_ROW start_CELL italic_A ( italic_ω ) end_CELL start_CELL = - divide start_ARG 1 end_ARG start_ARG italic_π end_ARG Im ⟨ italic_ψ | divide start_ARG 1 end_ARG start_ARG italic_ω - italic_H + italic_i italic_η end_ARG | italic_ψ ⟩ end_CELL end_ROW start_ROW start_CELL end_CELL start_CELL = - divide start_ARG 1 end_ARG start_ARG italic_π end_ARG ∑ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT | ⟨ italic_ψ | italic_i ⟩ | start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT divide start_ARG italic_η end_ARG start_ARG ( italic_ω - italic_E start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT + italic_η start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG , end_CELL end_ROW(19)

where η𝜂\etaitalic_η is a Lorentzian broadening parameter. The state |ψket𝜓|\psi\rangle| italic_ψ ⟩ is usually a physically intuitive excitation on top of the ground state and depends on the particular spectral function of interest. We refer to this method as linear response VMC (LR-VMC). Despite being couched in a linear response framework, we note that LR-VMC can be formally converged to the exact spectral function by increasing the number of parameters in the ground state wave function, thus expanding the tangent space systematically. We show numerical examples of this convergence in Sec. 3.2. The rate of this convergence is usually faster for lower-lying states compared to higher energy excitations. A faster convergence could be achieved by using higher derivative states instead of more parameters, but we do not pursue this possibility here.

While it is possible to sample the amplitude square of the ground state to obtain 𝐇𝐇\mathbf{H}bold_H and 𝐒𝐒\mathbf{S}bold_S, this can lead to noisy estimates, as |μket𝜇|\mu\rangle| italic_μ ⟩ states may have support on configurations with vanishing contributions to |ψ0ketsubscript𝜓0|\psi_{0}\rangle| italic_ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ⟩. Moreover, noise in the unbiased estimates of these matrix elements leads to a bias in the estimates of the energy spectrum. We instead employ an alternative approach, termed reweighting in Ref. 87, which uses ν|n|ψμ|2subscript𝜈superscriptinner-product𝑛subscript𝜓𝜇2\sum_{\nu}|\langle n|\psi_{\mu}\rangle|^{2}∑ start_POSTSUBSCRIPT italic_ν end_POSTSUBSCRIPT | ⟨ italic_n | italic_ψ start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT ⟩ | start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT as the sampling function instead. The two approaches lead to identical results in the limit of infinite sampling, but we found the reweighting method to perform significantly better for a fixed sampling effort. Details and comparisons of these sampling approaches are provided in Appendix B. Due to stochastic sampling, the estimated metric S𝑆Sitalic_S is not necessarily positive definite and needs to be regularized. We diagonalize the sampled metric and throw away states with eigenvalues below a small threshold.

Symmetry is again imposed with the use of appropriate walkers used in the VMC sampling. The basis states can therefore be thought of as P^|ψμ^𝑃ketsubscript𝜓𝜇\hat{P}|\psi_{\mu}\rangleover^ start_ARG italic_P end_ARG | italic_ψ start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT ⟩, where P^^𝑃\hat{P}over^ start_ARG italic_P end_ARG is a projector of the symmetry imposed during the sampling. Excited state spectra of different symmetries can be calculated from the optimized ground states in the corresponding sectors.

Structure and dynamics of electron-phonon coupled systems using neural quantum states (1)

This approach of using derivative states has been used with many classic wave function theories, often termed as linear response or equation of motion (EOM) methods.McWeeny (1989) The Tamm-Dancoff approximation, which is formulated in the tangent space of the Hartree-Fock state, is a well-known example. A linear response DMRG theory has been developed in close analogy to Hartree-Fock.Dorando etal. (2009); Wouters etal. (2013); Haegeman etal. (2013) Similarly to NQS, linear response DMRG can also be systematically improved by increasing the bond dimension of the matrix product state. In VMC, an EOM theory based on geminal wave functions was presented in Ref. 91. A few studies have obtained dynamical information through the use of a basis constructed from physically relevant excitations on top of the variational ground state.Ido etal. (2020); Ferrari etal. (2020) Derivative basis states have the advantage of being computationally cheaper for the calculation of the required matrix elements. For example, consider the following element used in the sampling of the Hamiltonian matrix

n|H|ψμ=μn|H|ψ0.quantum-operator-product𝑛𝐻subscript𝜓𝜇subscript𝜇quantum-operator-product𝑛𝐻subscript𝜓0\langle n|H|\psi_{\mu}\rangle=\partial_{\mu}\langle n|H|\psi_{0}\rangle.⟨ italic_n | italic_H | italic_ψ start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT ⟩ = ∂ start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT ⟨ italic_n | italic_H | italic_ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ⟩ .(20)

One can obtain this matrix element for all μ𝜇\muitalic_μ at the same cost scaling as the local energy calculation through reverse mode automatic differentiation. This is in contrast to the bases consisting of excitations on top of the ground state, where in general the cost of computing matrix elements for all excitations scales linearly with the number of excitations. A computational bottleneck of this approach lies in the explicit construction of the H𝐻Hitalic_H and S𝑆Sitalic_S matrices, which becomes infeasible for a large number of parameters, restricting us to states with fewer than roughly 104superscript10410^{4}10 start_POSTSUPERSCRIPT 4 end_POSTSUPERSCRIPT parameters. This cost can be avoided by using a direct method (not used in this work) outlined in Appendix B, which only samples the action of these matrices onto vectors.

3 Results

We present a numerical analysis of the performance of NQS in polaron, bipolaron, and many-electron systems. First, we consider the ability of these states to represent ground state structure for different kinds of eph coupling. We also calculate the binding energy of the hole polaron in lithium fluoride (LiF) from first principles. In the second part, we assess the accuracy of our LR-VMC approach based on NQS to capture spectral properties of eph systems. We use three-layer MLPs (input, hidden, output) for the radial and phase part of the NQS in all cases, unless stated otherwise. The code used to perform the VMC calculations is available in a public repository.vmc

3.1 Ground state properties

We start with the question of how many hidden neurons are required to accurately represent the ground state of Holstein and Bond model polarons at different coupling strengths. For the Holstein model, the phase function was set to one, whereas for the Bond model, we use MLPs with identical structures for both the radial and phase parts of the wave function in momentum space. Fig. 1 shows the errors in the ground state energies for a 30 site chain with ω=0.5𝜔0.5\omega=0.5italic_ω = 0.5. We use energies converged with respect to the number of hidden neurons as a reference. We have verified that these values agree very well with DMRG in the case of the Holstein model and numerically exact energies reported in literatureCarbone etal. (2021) for the Bond model wherever available. We ensured that the ground state energy in all cases is converged with respect to the number of lattice sites. For the Holstein model, the error is largest for a given number of neurons at intermediate coupling around the self-trapping crossover. This is intuitively sensible since simple weak and strong coupling ansatzes describe the regions away from the crossover point very well. The ansatz with a single hidden neuron, which is equivalent to the Toyozawa wave function, performs very well for the Holstein model. Convergence with respect to the number of hidden neurons is achieved very quickly, with the energies close to exact with just 10 hidden neurons. For the Bond model, on the other hand, the error increases with coupling for a fixed number of hidden neurons. The errors for the same number of hidden neurons are much larger compared to the Holstein model. This model does not exhibit a self-trapping crossover, and the number of hidden neurons required to obtain the same error in ground state energy increases with the size of the coupling. This also coincides with there not being a simple strong coupling ansatz that describes the strong coupling limit of the Bond model. We note that these observations also hold for calculations in the site basis and in higher dimensions. The number of hidden neurons for a given energy accuracy does not scale with the lattice size.

Structure and dynamics of electron-phonon coupled systems using neural quantum states (2)
Structure and dynamics of electron-phonon coupled systems using neural quantum states (3)
Structure and dynamics of electron-phonon coupled systems using neural quantum states (4)
Structure and dynamics of electron-phonon coupled systems using neural quantum states (5)

To demonstrate the robustness of our optimization protocol, we present a calculation on the ground state band of a modified Holstein model polaron with dispersive phonons. This system was studied using variational exact diagonalization (VED) in Ref. 96. We use the phonon dispersion given by

ω(q)=ω0+2tphcos(q),𝜔𝑞subscript𝜔02subscript𝑡ph𝑞\omega(q)=\omega_{0}+2t_{\text{ph}}\cos(q),italic_ω ( italic_q ) = italic_ω start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT + 2 italic_t start_POSTSUBSCRIPT ph end_POSTSUBSCRIPT roman_cos ( italic_q ) ,(21)

where tphsubscript𝑡pht_{\text{ph}}italic_t start_POSTSUBSCRIPT ph end_POSTSUBSCRIPT is the phonon hopping amplitude. The remaining parts of the Hamiltonian are identical to the usual Holstein model. We consider the parameters tph=0.4subscript𝑡ph0.4t_{\text{ph}}=0.4italic_t start_POSTSUBSCRIPT ph end_POSTSUBSCRIPT = 0.4 and λ=g22telω024tph2=0.5𝜆superscript𝑔22subscript𝑡elsuperscriptsubscript𝜔024superscriptsubscript𝑡ph20.5\lambda=\frac{g^{2}}{2t_{\text{el}}\sqrt{\omega_{0}^{2}-4t_{\text{ph}}^{2}}}=0.5italic_λ = divide start_ARG italic_g start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG start_ARG 2 italic_t start_POSTSUBSCRIPT el end_POSTSUBSCRIPT square-root start_ARG italic_ω start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT - 4 italic_t start_POSTSUBSCRIPT ph end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG end_ARG = 0.5. The phonon dispersion in this case bends downward due to the negative hopping amplitude, and results in a peculiar polaron band structure due to multiple avoided crossings with multiple phonon excitations. Fig. 2 shows the ground and first excited state bands obtained using a momentum space NQS on a 42 site chain. We find the ground state NQS energy to be in excellent agreement with VED results. The VMC optimization remarkably found the correct ground state at all k𝑘kitalic_k points starting with a completely random initial guess and did not encounter problems with trapping in local minima. The Toyozawa wave function is very accurate close to k=0𝑘0k=0italic_k = 0, but its error is substantial at larger k𝑘kitalic_k values. The NQS first excited state energy, obtained using LR-VMC (discussed in detail in Sec. 3.2), is also in good agreement with VED results. We found some discrepancies at intermediate k𝑘kitalic_k values between the first two avoided crossings, which we attribute to finite-size effects. The inset shows the convergence of the first excited state energy at k=π3𝑘𝜋3k=\frac{\pi}{3}italic_k = divide start_ARG italic_π end_ARG start_ARG 3 end_ARG with respect to the size of the lattice, showing a slow convergence with the lattice size.

Structure and dynamics of electron-phonon coupled systems using neural quantum states (6)
Structure and dynamics of electron-phonon coupled systems using neural quantum states (7)

Fig. 3 illustrates the behavior of the polaron ground state energy on two-dimensional lattices. We used a 10×10101010\times 1010 × 10 lattice for all calculations, which was confirmed to be sufficiently large to reach convergence within stochastic error bars for all cases presented here. For the Holstein model, similar to the one-dimensional chain considered previously, we find that the ground state energy is readily converged with a small number of hidden neurons. In fact, the Toyozawa wave function (equivalent to one hidden neuron) is very accurate in this case even on the two-dimensional lattice. We see the self-trapping crossover in the Holstein model clearly in this plot. The Bond and SSH models are considerably more difficult to solve for the NQS approach indicating the complexity of the off-diagonal coupling. We compare our energies to DQMC results reported in Ref. 95. For the Bond model, we are able to converge NQS energies to the DQMC results with a larger number of hidden neurons, around 100 for the largest couplings. The Toyozawa ansatz exhibits substantial errors that increase with the size of the coupling. The SSH model at large couplings has its ground state at nonzero k𝑘kitalic_k due to the negative next-nearest neighbor hopping amplitude induced by eph coupling. This behavior has been attributed to the unphysical linear nature of the interaction at strong couplings and was shown to disappear with more realistic nonlinear couplings.Prokof’ev and Svistunov (2022) Nevertheless, we find that the NQS is able to capture this shift of the ground state off the band center accurately. In this regime of stronger coupling, optimization becomes very slow and the ansatz requires a large number of hidden neurons to reach convergence. In these cases, we extrapolate our energies with respect to the number of hidden neurons (see Appendix A). We do not show the Toyozawa ansatz energies in this case because we were unable to converge them reliably.

We are also able to calculate ground state properties other than energy using NQS. Fig. 4 shows the electron-phonon correlation functions for the two-dimensional Holstein and Bond models. The correlation function is defined as

ξ(|ij|)=nixjni,𝜉𝑖𝑗delimited-⟨⟩subscript𝑛𝑖subscript𝑥𝑗delimited-⟨⟩subscript𝑛𝑖\xi(|i-j|)=\frac{\langle n_{i}x_{j}\rangle}{\langle n_{i}\rangle},italic_ξ ( | italic_i - italic_j | ) = divide start_ARG ⟨ italic_n start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_x start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ⟩ end_ARG start_ARG ⟨ italic_n start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ⟩ end_ARG ,(22)

where nisubscript𝑛𝑖n_{i}italic_n start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT and xjsubscript𝑥𝑗x_{j}italic_x start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT denote electron occupation and phonon displacement, respectively. The correlation function is a measure of the spatial extent of the polaron. The difference between the diagonal and off-diagonal couplings is immediately apparent from the correlation functions. In the Holstein model, the phonon cloud builds around the electron at a site, whereas in the Bond model, the electron hops between two sites exciting the phonons in surrounding bonds. We note that only the correlation function is localized in polaron models, but the exact ground state remains delocalized reflecting the lack of a true self-trapping transition in these models.

Structure and dynamics of electron-phonon coupled systems using neural quantum states (8)
Structure and dynamics of electron-phonon coupled systems using neural quantum states (9)
Structure and dynamics of electron-phonon coupled systems using neural quantum states (10)

For the final polaron example, we calculate the hole polaron binding energy in lithium fluoride (LiF) from first principles.Sio etal. (2019); Lee etal. (2021) This is a polar ionic crystal with a large band gap (14.2 eV experimental optical gap, 8.9 eV calculated PBE KS gap).Sio etal. (2019) We obtain the Hamiltonian parameters from DFT calculations using Quantum EspressoGiannozzi etal. (2017) with the EPW package.Poncé etal. (2016) We include three valence bands arising mainly from the p𝑝pitalic_p orbitals of fluorine in our calculation. These are coupled to three optical and three acoustic mode vibrations. Due to the polar nature of this system, the hole coupling to the longitudinal optical (LO) mode is the strongest and largely of the Fröhlich type. Due to the very strong coupling of the hole with phonons, we expect the ground state to be readily describable with a strong coupling ansatz. This is borne out in our calculations shown in Fig. 5. We calculate ground state energies on progressively larger k𝑘kitalic_k point grids ranging from 3×3×33333\times 3\times 33 × 3 × 3 to 13×13×1313131313\times 13\times 1313 × 13 × 13. Due to the large number of input sites in the NQS state, we are restricted in the number of hidden neurons we can employ in this calculation. Therefore as a validation of our ansatz, we also calculate energies on a smaller model including just the LO phonon mode in the Hamiltonian, which contributes the bulk of the binding energy. We find that increasing the number of hidden neurons provides modest improvements (of the order of only 10 meV) over the Toyozawa ansatz as expected in a strong density coupling case. Using a crude two-point extrapolation we obtain a binding energy of 1.97 eV in the thermodynamic limit. The largest uncertainty in this number, which is not reported, arises simply due to the crude nature of the extrapolation from small grid sizes. A future study will be devoted to a more careful extrapolation of this value. This binding energy is in good agreement with strong coupling perturbation theory calculations and results obtained from a novel all-coupling wave function ansatz,Robinson etal. as well as values reported in the literature using Landau-Pekar theory.Sio etal. (2019) With improvements in our numerical methodology, including the use of locality of interactions and low-rank compression,Luo etal. (2024) we expect to be able to perform calculations on even larger grids in the future.

We now turn to the calculation of two interacting electrons coupled to phonons. In cases of strong eph interactions, one can obtain a bound state of electrons, termed a bipolaron, due to the phonon-mediated attraction overcoming the Coulomb repulsion. Recent work on bipolarons in the Bond and SSH models shows that light yet mobile bipolarons exist in these models even with strong Coulomb interactions. In Fig. 6, we compare ground state energies obtained using NQS states with DQMC values reported in Ref. 97 for the Bond model. These calculations were performed on a 12×12121212\times 1212 × 12 lattice. Note that there is some cancellation of finite size errors due to attractive eph and repulsive e-e interactions. The largest NQS state here used 150 hidden neurons. For the SSH model, we performed extrapolations with respect to the number of hidden neurons to obtain the reported energies. No calculations of the binding energy have been reported for this model in the literature. However, they display a trend similar to the Bond model with the bipolarons staying bound even at large Coulomb interactions.

Structure and dynamics of electron-phonon coupled systems using neural quantum states (11)

Finally, we present results for the Hubbard-Holstein model on a one-dimensional chain of 20 sites at half-filling with U=4𝑈4U=4italic_U = 4. These calculations were performed in the site basis. We used a generalized Hartree Fock (GHF) state as the reference antisymmetric electronic state multiplied with an NQS eph Jastrow factor,

𝐧|ψ=𝐧|ψNQS𝐧e|ψGHF,inner-product𝐧𝜓inner-product𝐧subscript𝜓NQSinner-productsubscript𝐧𝑒subscript𝜓GHF\langle\mathbf{n}|\psi\rangle=\langle\mathbf{n}|\psi_{\text{NQS}}\rangle%\langle\mathbf{n}_{e}|\psi_{\text{GHF}}\rangle,⟨ bold_n | italic_ψ ⟩ = ⟨ bold_n | italic_ψ start_POSTSUBSCRIPT NQS end_POSTSUBSCRIPT ⟩ ⟨ bold_n start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT | italic_ψ start_POSTSUBSCRIPT GHF end_POSTSUBSCRIPT ⟩ ,(23)

where 𝐧esubscript𝐧𝑒\mathbf{n}_{e}bold_n start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT denotes just the electronic part of the input configuration. Note that the GHF state breaks the spin projection Szsubscript𝑆𝑧S_{z}italic_S start_POSTSUBSCRIPT italic_z end_POSTSUBSCRIPT symmetry, which is restored by the VMC sampling procedure. We use a real MLP with 40 hidden neurons as the Jastrow factor. The sign structure is therefore inherited from the reference GHF state, which is a bias in this calculation. Despite these considerations, calculations on the half-filled pure Hubbard model indicate that this state is an excellent approximation for describing electronic correlation. In this model, as the eph coupling is increased, the system undergoes a transition from a quasiordered Mott insulator phase to a charge density wave (CDW) phase. This is reflected in the energies shown in Fig. 7. Agreement with AFQMC results, obtained using the constrained path approximation,Lee etal. (2021) is seen to be very good. We note that to converge the wave function to the correct CDW minimum at larger couplings, we perform the initial GHF calculation with an effective attractive coupling given by Ueff=U4λsubscript𝑈eff𝑈4𝜆U_{\text{eff}}=U-4\lambdaitalic_U start_POSTSUBSCRIPT eff end_POSTSUBSCRIPT = italic_U - 4 italic_λ. We see that the electronic double occupancy of the lattice sites given by

d=2Ninini,𝑑2𝑁subscript𝑖delimited-⟨⟩subscript𝑛𝑖absentsubscript𝑛𝑖absentd=\frac{2}{N}\sum_{i}\langle n_{i\uparrow}n_{i\downarrow}\rangle,italic_d = divide start_ARG 2 end_ARG start_ARG italic_N end_ARG ∑ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ⟨ italic_n start_POSTSUBSCRIPT italic_i ↑ end_POSTSUBSCRIPT italic_n start_POSTSUBSCRIPT italic_i ↓ end_POSTSUBSCRIPT ⟩ ,(24)

changes rapidly near the transition point. Double occupancy close to one is an indicator of electron pairing in the CDW phase.

3.2 Dynamical properties

One particle spectral functions of polaron models have been extensively studied in the past and serve as a good benchmark for the current method. Here we will work in the momentum space basis to calculate the one-particle spectral function at zero temperature given by

A(k,ω)=1πIm0|ck1ωH+iηck|0,𝐴𝑘𝜔1𝜋Imquantum-operator-product0subscript𝑐𝑘1𝜔𝐻𝑖𝜂subscriptsuperscript𝑐𝑘0A(k,\omega)=-\frac{1}{\pi}\text{Im}\langle 0|c_{k}\frac{1}{\omega-H+i\eta}c^{%\dagger}_{k}|0\rangle,italic_A ( italic_k , italic_ω ) = - divide start_ARG 1 end_ARG start_ARG italic_π end_ARG Im ⟨ 0 | italic_c start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT divide start_ARG 1 end_ARG start_ARG italic_ω - italic_H + italic_i italic_η end_ARG italic_c start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT | 0 ⟩ ,(25)

where |0ket0|0\rangle| 0 ⟩ denotes the vacuum state. This quantity is directly related to angle-resolved photoemission spectroscopy (ARPES) measurements.Sobota etal. (2021) Accurate numerical calculations have been performed using variations of exact diagonalization methods,Marsiglio (1993); Zhang etal. (1999); Weiße etal. (2006) DMRG,Jansen etal. (2020) hierarchical equation of motion,Mitrić etal. (2022) and generalized Green’s function cluster expansionCarbone etal. (2021) methods on modestly sized systems and mostly for the Holstein model. While it may be difficult or impractical to do so in certain Hamiltonian parameter regimes, these methods nonetheless have the virtue of allowing the user to assess the convergence of the results obtained to the exact limit.

Structure and dynamics of electron-phonon coupled systems using neural quantum states (12)

As an illustrative example, we compare our results with exact diagonalization of the Bond model (ω0=1subscript𝜔01\omega_{0}=1italic_ω start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT = 1, λ=1𝜆1\lambda=1italic_λ = 1) polaron in an 8 site chain in a truncated space with a maximum of 5 phonons. We restricted the number of phonons in VMC sampling to the same number for consistency. To focus solely on the quality of the LR approximation and its convergence with the quality of the ansatz, we obtained the LR-VMC results deterministically in this small example by simply summing over all the eph configurations instead of VMC sampling. Discussion of the biases due to sampling can be found in Appendix B. Fig. 8 shows the comparison of the exact spectral function with an LR-VMC calculation for different numbers of hidden neurons in a single hidden layer MLP state. nHsubscript𝑛𝐻n_{H}italic_n start_POSTSUBSCRIPT italic_H end_POSTSUBSCRIPT in the figure denotes the number of hidden neurons in the magnitude and phase NNs each.

Structure and dynamics of electron-phonon coupled systems using neural quantum states (13)

At k=0𝑘0k=0italic_k = 0, the dominant quasiparticle peak corresponds to the polaron ground state. Subsequent peaks result from the addition of phonons to the polaron, the first one being at roughly ω0subscript𝜔0\omega_{0}italic_ω start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT energy above the ground state, marking the onset of the phonon excitation continuum corresponding to states of the polaron with additional phonon excitations. LR-VMC on top of the nH=20subscript𝑛𝐻20n_{H}=20italic_n start_POSTSUBSCRIPT italic_H end_POSTSUBSCRIPT = 20 state captures the first 4 peaks very well but starts to deviate from the reference spectral function at higher energies. Increasing the number of hidden neurons in the ansatz to 40, we see that the agreement with the reference improves at higher energies. The norm of the projection of exact energy eigenstates on the LR-VMC tangent space, PLR|Einormsubscript𝑃LRketsubscript𝐸𝑖\|P_{\text{LR}}|E_{i}\rangle\|∥ italic_P start_POSTSUBSCRIPT LR end_POSTSUBSCRIPT | italic_E start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ⟩ ∥, shown in the inset, is a measure of the quality of the LR approximation. We see that the overlap decays for higher energy states, but lower energy states are represented remarkably well even with nH=20subscript𝑛𝐻20n_{H}=20italic_n start_POSTSUBSCRIPT italic_H end_POSTSUBSCRIPT = 20. This evidence suggests that the tangent space states likely represent simple excitations on top of the ground state, like those present in the low-lying eigenstates.

At the band edge, k=π𝑘𝜋k=\piitalic_k = italic_π, the spectrum is concentrated in the higher energy regions corresponding to the incoherent phonon continuum. The low energy states in this case have the electron close to the band minimum with high momentum phonons leading to a low spectral weight. The nH=40subscript𝑛𝐻40n_{H}=40italic_n start_POSTSUBSCRIPT italic_H end_POSTSUBSCRIPT = 40 wave function captures this low part of the spectrum well as seen from the projection norm shown in the inset. Because of the nature of the LR ansatz, it takes many more hidden neurons, nH=80subscript𝑛𝐻80n_{H}=80italic_n start_POSTSUBSCRIPT italic_H end_POSTSUBSCRIPT = 80, to nearly converge to the exact spectral function at higher energies. The tangent space for this state essentially represents the whole truncated space as evidenced by the norms of the energy eigenstate projections. We note that the more compact wave function still produces a qualitatively correct structure in the incoherent region.

In Fig. 9, we show the spectral functions of the Holstein, Bond, and SSH model polaron on a 30 site chain. The spectral functions were converged with respect to the number of hidden neurons in the ansatz up to the stochastic error. For ω0=1subscript𝜔01\omega_{0}=1italic_ω start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT = 1 and λ=2𝜆2\lambda=2italic_λ = 2 used here, we verified that finite size effects are negligible. In the Holstein model spectral function, the first excited state at k=0𝑘0k=0italic_k = 0 is exactly ω0subscript𝜔0\omega_{0}italic_ω start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT in energy above the ground state, indicating an unbound state of a phonon well separated from the ground state polaron. We also note the appearance of a discrete state immediately above the one phonon continuum band more prominent near the band edge. In Ref. 109, this was characterized as an antibound state between the polaron and a phonon which has vanishing spectral weight at k=0𝑘0k=0italic_k = 0. We do not see the nondispersive repulsive state seen in that work, which they attributed to the finite size of the lattice used in that work.

The Bond model has a higher binding energy compared to the Holstein model while having a lighter polaron mass at the same time. This behavior which can be attributed to the coupling of phonons to carrier hopping terms has been noted in previous work.Carbone etal. (2021) We also note the presence of a bound first excited state just below the one phonon continuum band. While it is almost nondispersive at this coupling, it has a concave dispersion at stronger couplings (not shown here). This is in contrast to the Holstein model, where the first excited bound state at intermediate couplings has the same dispersion shape as the ground state. The SSH model has a very different spectrum compared to the other two models with the ground state at a non-zero momentum. We again see the appearance of a bound excited state below the first phonon continuum carrying a large spectral weight.

Structure and dynamics of electron-phonon coupled systems using neural quantum states (14)

We also calculated the ground and first excited states of the two-dimensional Holstein bipolaron as a function of the electronic interaction U𝑈Uitalic_U. Results on a 10×10101010\times 1010 × 10 lattice are shown in Fig. 10. For moderate eph coupling, the ground state in this model evolves from a strongly bound S0subscript𝑆0S_{0}italic_S start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT bipolaron, with both electrons mostly on the same site, to a weakly bound S1subscript𝑆1S_{1}italic_S start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT bipolaron with the two electrons on neighboring sites. Our ground state energies are in good agreement with those reported in Ref. 110. In the on-site regime, the energy increases nearly linearly with U𝑈Uitalic_U. Strong coupling argumentsMacridin etal. (2004) suggest the presence of two singlet excited S1subscript𝑆1S_{1}italic_S start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT states below the phonon and free-polaron continuua for weak to intermediate U𝑈Uitalic_U. These states are bound due to the interplay of effective kinetic exchange interactions with eph coupling. One of these states has a d𝑑ditalic_d-wave symmetry, while the other has s𝑠sitalic_s-wave symmetry. Since spatial symmetry is projected in our calculations, LR-VMC on top of the s𝑠sitalic_s-wave symmetric ground state only captures excited states in this symmetry sector. We find this excited state just below the phonon continuum for small U𝑈Uitalic_U. With increasing U𝑈Uitalic_U it starts mixing with the S0subscript𝑆0S_{0}italic_S start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ground state and after the crossover, it becomes the ground state.

Structure and dynamics of electron-phonon coupled systems using neural quantum states (15)
Structure and dynamics of electron-phonon coupled systems using neural quantum states (16)

In Fig. 11 we present dynamical spin and charge structure factors for the half-filled Hubbard-Holstein model on a 30 site chain. The dynamical structure factors are defined as

Sσ(k,ω)=1πImψ0|Skz1ωH+E0+iηSkz|ψ0Sρ(k,ω)=1πImψ0|Nk1ωH+E0+iηNk|ψ0,subscript𝑆𝜎𝑘𝜔1𝜋Imquantum-operator-productsubscript𝜓0subscriptsuperscript𝑆𝑧𝑘1𝜔𝐻subscript𝐸0𝑖𝜂subscriptsuperscript𝑆𝑧𝑘subscript𝜓0subscript𝑆𝜌𝑘𝜔1𝜋Imquantum-operator-productsubscript𝜓0subscript𝑁𝑘1𝜔𝐻subscript𝐸0𝑖𝜂subscript𝑁𝑘subscript𝜓0\begin{split}S_{\sigma}(k,\omega)&=-\frac{1}{\pi}\text{Im}\langle\psi_{0}|S^{z%}_{-k}\frac{1}{\omega-H+E_{0}+i\eta}S^{z}_{k}|\psi_{0}\rangle\\S_{\rho}(k,\omega)&=-\frac{1}{\pi}\text{Im}\langle\psi_{0}|N_{-k}\frac{1}{%\omega-H+E_{0}+i\eta}N_{k}|\psi_{0}\rangle,\end{split}start_ROW start_CELL italic_S start_POSTSUBSCRIPT italic_σ end_POSTSUBSCRIPT ( italic_k , italic_ω ) end_CELL start_CELL = - divide start_ARG 1 end_ARG start_ARG italic_π end_ARG Im ⟨ italic_ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT | italic_S start_POSTSUPERSCRIPT italic_z end_POSTSUPERSCRIPT start_POSTSUBSCRIPT - italic_k end_POSTSUBSCRIPT divide start_ARG 1 end_ARG start_ARG italic_ω - italic_H + italic_E start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT + italic_i italic_η end_ARG italic_S start_POSTSUPERSCRIPT italic_z end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT | italic_ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ⟩ end_CELL end_ROW start_ROW start_CELL italic_S start_POSTSUBSCRIPT italic_ρ end_POSTSUBSCRIPT ( italic_k , italic_ω ) end_CELL start_CELL = - divide start_ARG 1 end_ARG start_ARG italic_π end_ARG Im ⟨ italic_ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT | italic_N start_POSTSUBSCRIPT - italic_k end_POSTSUBSCRIPT divide start_ARG 1 end_ARG start_ARG italic_ω - italic_H + italic_E start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT + italic_i italic_η end_ARG italic_N start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT | italic_ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ⟩ , end_CELL end_ROW(26)

where |ψ0ketsubscript𝜓0|\psi_{0}\rangle| italic_ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ⟩ and E0subscript𝐸0E_{0}italic_E start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT are the ground state wave function and energy, respectively. Unlike the polaron spectral function calculations, here, we perform a single VMC optimization for the ground state at k=0𝑘0k=0italic_k = 0. The LR basis functions at all k𝑘kitalic_k points are obtained by momentum projection of the k=0𝑘0k=0italic_k = 0 LR space. For the parameters used here, the system is in the Mott insulating phase with quasi-antiferromagnetic ordering. We used a Szsubscript𝑆𝑧S_{z}italic_S start_POSTSUBSCRIPT italic_z end_POSTSUBSCRIPT and momentum projected NQS-Jastrow GHF wave function. In this phase, the spin spectrum is gapless and can be described by a two-spinon continuum, with a bulk of the spectral weight around k=π𝑘𝜋k=\piitalic_k = italic_π. The charge spectrum, on the other hand, is gapped and consists of two-doublon excitations. There is also a weak contribution to the structure factor due to two-doublon two-spinon excitations.Essler etal. (2005) The spectra for the Hubbard-Holstein model shown in the figure follow this expected behavior and are in good agreement with those reported in Ref. 112 obtained using the continuous time Monte Carlo method at a small but finite temperature. In the charge spectrum, we also see a feature at higher energies due to phonon excitations.

4 Conclusion

In this paper, we have developed and examined the performance of neural quantum states for describing the effects of eph coupling in a wide class of models. Within these models, we considered different types of lattice models including diagonal and off-diagonal couplings. We considered different types of eph couplings, dimensionality, and the interplay of electron-electron interactions with eph coupling. In nearly all cases we found NQS to be able to describe ground state correlations accurately and efficiently. In extreme cases like the strong coupling regime of the SSH model, the NQS approach has some difficulty in describing the ground state correlations, but for polaron problems, it is possible to systematically obtain more accurate answers at the expense of a larger computational cost. We have also applied our methodology to calculate the hole polaron binding energy in LiF, demonstrating the possibility of using NQS to perform ab initio calculations. Lastly, we studied a linear response strategy to calculate spectral properties based on NQS. This approach is attractive since it only requires a nonlinear stochastic optimization of the ground state, with the tangent space of the parameter manifold naturally serving as the response space. We showed that low-lying excitations can be well described in this framework without the need for manually constructing excited states. The ability to describe spectral properties accurately offers a sizeable advantage over imaginary time approaches which require analytic continuation for this task.

Our work opens up many avenues of future research. Applications to more complex ab initio systems can be enabled by exploiting the locality of interactions and low-rank properties in the Hamiltonian.Luo etal. (2024) Integrating semiclassical methods to account for acoustic phonons would allow the incorporation of these slow degrees of freedom more efficiently. Employing more sophisticated neural network architectures should facilitate the use of more efficient representations of the eph correlations. Enhancements in the implementation of dynamical calculations will enable the study of finite temperature transport and spectral properties in ab initio systems. Lastly, leveraging strategies developed for describing electronic correlation in NQS, we also anticipate exploring the interplay between eph and electronic correlations in more realistic and complex models of strongly correlated electronic systems. Some of these directions will be explored in the immediate future.

Acknowledgments

We thank Marco Bernardi and Yao Luo for their helpful assistance in parametrizing the LiF ab initio model. A.M. thanks Arkajit Mandal and Zhihao Cui for the useful discussions. A.M. and D.R.R. were partially supported by NSF CHE-2245592. P.J.R. acknowledges support from the National ScienceFoundation Graduate Research Fellowship under GrantNo. DGE-2036197. This work used the Delta system at the National Center for Supercomputing Applications through allocation CHE230028 from the Advanced Cyberinfrastructure Coordination Ecosystem: Services and Support (ACCESS) program, which is supported by National Science Foundation grants #2138259, #2138286, #2138307, #2137603, and #2138296.

Appendix A Variance extrapolation of NQS energies

In some cases, achieving convergence of energies with respect to the number of parameters in the NQS is challenging due to difficulties in the optimization of states with a large number of parameters. The most challenging cases of this class are models where the eph coupling strongly modulates the electron hopping. In these cases, we employ the technique of variance extrapolation to estimate the exact energy using a series of approximate calculations. This extrapolation is based on the rationale that since the exact ground state has zero energy variance (H2H2delimited-⟨⟩superscript𝐻2superscriptdelimited-⟨⟩𝐻2\langle H^{2}\rangle-\langle H\rangle^{2}⟨ italic_H start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ⟩ - ⟨ italic_H ⟩ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT), more accurate wave functions usually have lower energy variance in addition to lower variational energies. We use a linear fit to energies against variance to estimate the zero variance value.

Structure and dynamics of electron-phonon coupled systems using neural quantum states (17)

Fig. 12 shows the variance extrapolation of two-dimensional Bond and SSH model polaron ground state energies. DQMC energies taken from the work of Zhang etal. (2021) are shown for comparison. We chose values of coupling leading to similar energies in the two models to highlight the differences in convergence with respect to the number of hidden neurons. We see a slower convergence for the SSH model in the intermediate and strong coupling regime compared to the Bond model. The variance extrapolated NQS energy is within 1%percent11\%1 % of the binding energy obtained by DQMC. We note that DQMC has a sign problem for the SSH model that also renders convergence challenging for larger couplings.

Appendix B Sampling the Hamiltonian and overlap matrices

Here we provide a more detailed description of the sampling of Hamiltonian and overlap matrices used in LR-VMC (see Eq. 18). One way to sample these matrix elements is by sampling the amplitude square of the ground state as

𝐇μν=ψμ|H|ψνψ0|ψ0=w|ψ0|w|2ψ0|ψ0ψμ|wψ0|ww|H|ψνw|ψ0,𝐒μν=ψμ|ψνψ0|ψ0=w|ψ0|w|2ψ0|ψ0ψμ|wψ0|ww|ψνw|ψ0.formulae-sequencesubscript𝐇𝜇𝜈quantum-operator-productsubscript𝜓𝜇𝐻subscript𝜓𝜈inner-productsubscript𝜓0subscript𝜓0subscript𝑤superscriptinner-productsubscript𝜓0𝑤2inner-productsubscript𝜓0subscript𝜓0inner-productsubscript𝜓𝜇𝑤inner-productsubscript𝜓0𝑤quantum-operator-product𝑤𝐻subscript𝜓𝜈inner-product𝑤subscript𝜓0subscript𝐒𝜇𝜈inner-productsubscript𝜓𝜇subscript𝜓𝜈inner-productsubscript𝜓0subscript𝜓0subscript𝑤superscriptinner-productsubscript𝜓0𝑤2inner-productsubscript𝜓0subscript𝜓0inner-productsubscript𝜓𝜇𝑤inner-productsubscript𝜓0𝑤inner-product𝑤subscript𝜓𝜈inner-product𝑤subscript𝜓0\begin{split}\mathbf{H}_{\mu\nu}&=\frac{\langle\psi_{\mu}|H|\psi_{\nu}\rangle}%{\langle\psi_{0}|\psi_{0}\rangle}=\sum_{w}\frac{|\langle\psi_{0}|w\rangle|^{2}%}{\langle\psi_{0}|\psi_{0}\rangle}\frac{\langle\psi_{\mu}|w\rangle}{\langle%\psi_{0}|w\rangle}\frac{\langle w|H|\psi_{\nu}\rangle}{\langle w|\psi_{0}%\rangle},\\\mathbf{S}_{\mu\nu}&=\frac{\langle\psi_{\mu}|\psi_{\nu}\rangle}{\langle\psi_{0%}|\psi_{0}\rangle}=\sum_{w}\frac{|\langle\psi_{0}|w\rangle|^{2}}{\langle\psi_{%0}|\psi_{0}\rangle}\frac{\langle\psi_{\mu}|w\rangle}{\langle\psi_{0}|w\rangle}%\frac{\langle w|\psi_{\nu}\rangle}{\langle w|\psi_{0}\rangle}.\end{split}start_ROW start_CELL bold_H start_POSTSUBSCRIPT italic_μ italic_ν end_POSTSUBSCRIPT end_CELL start_CELL = divide start_ARG ⟨ italic_ψ start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT | italic_H | italic_ψ start_POSTSUBSCRIPT italic_ν end_POSTSUBSCRIPT ⟩ end_ARG start_ARG ⟨ italic_ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT | italic_ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ⟩ end_ARG = ∑ start_POSTSUBSCRIPT italic_w end_POSTSUBSCRIPT divide start_ARG | ⟨ italic_ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT | italic_w ⟩ | start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG start_ARG ⟨ italic_ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT | italic_ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ⟩ end_ARG divide start_ARG ⟨ italic_ψ start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT | italic_w ⟩ end_ARG start_ARG ⟨ italic_ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT | italic_w ⟩ end_ARG divide start_ARG ⟨ italic_w | italic_H | italic_ψ start_POSTSUBSCRIPT italic_ν end_POSTSUBSCRIPT ⟩ end_ARG start_ARG ⟨ italic_w | italic_ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ⟩ end_ARG , end_CELL end_ROW start_ROW start_CELL bold_S start_POSTSUBSCRIPT italic_μ italic_ν end_POSTSUBSCRIPT end_CELL start_CELL = divide start_ARG ⟨ italic_ψ start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT | italic_ψ start_POSTSUBSCRIPT italic_ν end_POSTSUBSCRIPT ⟩ end_ARG start_ARG ⟨ italic_ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT | italic_ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ⟩ end_ARG = ∑ start_POSTSUBSCRIPT italic_w end_POSTSUBSCRIPT divide start_ARG | ⟨ italic_ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT | italic_w ⟩ | start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG start_ARG ⟨ italic_ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT | italic_ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ⟩ end_ARG divide start_ARG ⟨ italic_ψ start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT | italic_w ⟩ end_ARG start_ARG ⟨ italic_ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT | italic_w ⟩ end_ARG divide start_ARG ⟨ italic_w | italic_ψ start_POSTSUBSCRIPT italic_ν end_POSTSUBSCRIPT ⟩ end_ARG start_ARG ⟨ italic_w | italic_ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ⟩ end_ARG . end_CELL end_ROW(27)

This sampling method naturally follows from the ground state energy sampling approach, but it has the following issue. Because the configurations |wket𝑤|w\rangle| italic_w ⟩ are drawn from the ground state distribution, they do not necessarily have substantial support on the tangent space states |ψμketsubscript𝜓𝜇|\psi_{\mu}\rangle| italic_ψ start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT ⟩. While this does not lead to the infinite variance problem seen in continuum simulations, for the discrete case one obtains high-variance estimates due to the ratios ψμ|wψ0|winner-productsubscript𝜓𝜇𝑤inner-productsubscript𝜓0𝑤\frac{\langle\psi_{\mu}|w\rangle}{\langle\psi_{0}|w\rangle}divide start_ARG ⟨ italic_ψ start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT | italic_w ⟩ end_ARG start_ARG ⟨ italic_ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT | italic_w ⟩ end_ARG becoming large for certain configurations, making the method statistically inefficient.

One way to mitigate this problem is to use a different sampling function. This has been recognized in various QMC excited state studies Ceperley and Bernu (1988); Filippi etal. (2009); Li and Yang (2010). Termed reweighting in Ref. 87, this method uses the following sampling approach:

𝐇μν=ψμ|H|ψνZ=wZwZ|ψ0|w|2Zwψμ|wψ0|ww|H|ψνw|ψ0,𝐒μν=ψμ|ψνZ=wZwZ|ψ0|w|2Zwψμ|wψ0|ww|ψνw|ψ0,Z=wZw=wμ|ψμ|w|2.formulae-sequencesubscript𝐇𝜇𝜈quantum-operator-productsubscript𝜓𝜇𝐻subscript𝜓𝜈𝑍subscript𝑤subscript𝑍𝑤𝑍superscriptinner-productsubscript𝜓0𝑤2subscript𝑍𝑤inner-productsubscript𝜓𝜇𝑤inner-productsubscript𝜓0𝑤quantum-operator-product𝑤𝐻subscript𝜓𝜈inner-product𝑤subscript𝜓0subscript𝐒𝜇𝜈inner-productsubscript𝜓𝜇subscript𝜓𝜈𝑍subscript𝑤subscript𝑍𝑤𝑍superscriptinner-productsubscript𝜓0𝑤2subscript𝑍𝑤inner-productsubscript𝜓𝜇𝑤inner-productsubscript𝜓0𝑤inner-product𝑤subscript𝜓𝜈inner-product𝑤subscript𝜓0𝑍subscript𝑤subscript𝑍𝑤subscript𝑤subscript𝜇superscriptinner-productsubscript𝜓𝜇𝑤2\begin{split}\mathbf{H}_{\mu\nu}&=\frac{\langle\psi_{\mu}|H|\psi_{\nu}\rangle}%{Z}=\sum_{w}\frac{Z_{w}}{Z}\frac{|\langle\psi_{0}|w\rangle|^{2}}{Z_{w}}\frac{%\langle\psi_{\mu}|w\rangle}{\langle\psi_{0}|w\rangle}\frac{\langle w|H|\psi_{%\nu}\rangle}{\langle w|\psi_{0}\rangle},\\\mathbf{S}_{\mu\nu}&=\frac{\langle\psi_{\mu}|\psi_{\nu}\rangle}{Z}=\sum_{w}%\frac{Z_{w}}{Z}\frac{|\langle\psi_{0}|w\rangle|^{2}}{Z_{w}}\frac{\langle\psi_{%\mu}|w\rangle}{\langle\psi_{0}|w\rangle}\frac{\langle w|\psi_{\nu}\rangle}{%\langle w|\psi_{0}\rangle},\\Z&=\sum_{w}Z_{w}=\sum_{w}\sum_{\mu}|\langle\psi_{\mu}|w\rangle|^{2}.\\\end{split}start_ROW start_CELL bold_H start_POSTSUBSCRIPT italic_μ italic_ν end_POSTSUBSCRIPT end_CELL start_CELL = divide start_ARG ⟨ italic_ψ start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT | italic_H | italic_ψ start_POSTSUBSCRIPT italic_ν end_POSTSUBSCRIPT ⟩ end_ARG start_ARG italic_Z end_ARG = ∑ start_POSTSUBSCRIPT italic_w end_POSTSUBSCRIPT divide start_ARG italic_Z start_POSTSUBSCRIPT italic_w end_POSTSUBSCRIPT end_ARG start_ARG italic_Z end_ARG divide start_ARG | ⟨ italic_ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT | italic_w ⟩ | start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG start_ARG italic_Z start_POSTSUBSCRIPT italic_w end_POSTSUBSCRIPT end_ARG divide start_ARG ⟨ italic_ψ start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT | italic_w ⟩ end_ARG start_ARG ⟨ italic_ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT | italic_w ⟩ end_ARG divide start_ARG ⟨ italic_w | italic_H | italic_ψ start_POSTSUBSCRIPT italic_ν end_POSTSUBSCRIPT ⟩ end_ARG start_ARG ⟨ italic_w | italic_ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ⟩ end_ARG , end_CELL end_ROW start_ROW start_CELL bold_S start_POSTSUBSCRIPT italic_μ italic_ν end_POSTSUBSCRIPT end_CELL start_CELL = divide start_ARG ⟨ italic_ψ start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT | italic_ψ start_POSTSUBSCRIPT italic_ν end_POSTSUBSCRIPT ⟩ end_ARG start_ARG italic_Z end_ARG = ∑ start_POSTSUBSCRIPT italic_w end_POSTSUBSCRIPT divide start_ARG italic_Z start_POSTSUBSCRIPT italic_w end_POSTSUBSCRIPT end_ARG start_ARG italic_Z end_ARG divide start_ARG | ⟨ italic_ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT | italic_w ⟩ | start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG start_ARG italic_Z start_POSTSUBSCRIPT italic_w end_POSTSUBSCRIPT end_ARG divide start_ARG ⟨ italic_ψ start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT | italic_w ⟩ end_ARG start_ARG ⟨ italic_ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT | italic_w ⟩ end_ARG divide start_ARG ⟨ italic_w | italic_ψ start_POSTSUBSCRIPT italic_ν end_POSTSUBSCRIPT ⟩ end_ARG start_ARG ⟨ italic_w | italic_ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ⟩ end_ARG , end_CELL end_ROW start_ROW start_CELL italic_Z end_CELL start_CELL = ∑ start_POSTSUBSCRIPT italic_w end_POSTSUBSCRIPT italic_Z start_POSTSUBSCRIPT italic_w end_POSTSUBSCRIPT = ∑ start_POSTSUBSCRIPT italic_w end_POSTSUBSCRIPT ∑ start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT | ⟨ italic_ψ start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT | italic_w ⟩ | start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT . end_CELL end_ROW(28)

Thus the configurations |wket𝑤|w\rangle| italic_w ⟩ are sampled according to the distribution p(w)μ|ψμ|w|2proportional-to𝑝𝑤subscript𝜇superscriptinner-productsubscript𝜓𝜇𝑤2p(w)\propto\sum_{\mu}|\langle\psi_{\mu}|w\rangle|^{2}italic_p ( italic_w ) ∝ ∑ start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT | ⟨ italic_ψ start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT | italic_w ⟩ | start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT, which ensures sampling of configurations important for describing the excited states. The cost scaling of reweighted sampling is the same as the ground state sampling.

Structure and dynamics of electron-phonon coupled systems using neural quantum states (18)

The cost of constructing the 𝐇𝐇\mathbf{H}bold_H and 𝐒𝐒\mathbf{S}bold_S matrices scales as O(Np2Ns)𝑂superscriptsubscript𝑁𝑝2subscript𝑁𝑠O(N_{p}^{2}N_{s})italic_O ( italic_N start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT italic_N start_POSTSUBSCRIPT italic_s end_POSTSUBSCRIPT ), where Npsubscript𝑁𝑝N_{p}italic_N start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT is the number of parameters and Nssubscript𝑁𝑠N_{s}italic_N start_POSTSUBSCRIPT italic_s end_POSTSUBSCRIPT is the number of samples. This cost can be reduced by using a direct method, which only samples the action of these matrices onto vectors. This has been noted in several previous works mainly in the context of optimization methodsNeuscamman etal. (2012); Sabzevari etal. (2020). Consider the following expression for the action of the overlap matrix onto a vector 𝐱𝐱\mathbf{x}bold_x:

ν𝐒μνxν=wZwZ|ψ0|w|2Zwψμ|wψ0|w(νw|ψνw|ψ0xν).subscript𝜈subscript𝐒𝜇𝜈subscript𝑥𝜈subscript𝑤subscript𝑍𝑤𝑍superscriptinner-productsubscript𝜓0𝑤2subscript𝑍𝑤inner-productsubscript𝜓𝜇𝑤inner-productsubscript𝜓0𝑤subscript𝜈inner-product𝑤subscript𝜓𝜈inner-product𝑤subscript𝜓0subscript𝑥𝜈\sum_{\nu}\mathbf{S}_{\mu\nu}x_{\nu}=\sum_{w}\frac{Z_{w}}{Z}\frac{|\langle\psi%_{0}|w\rangle|^{2}}{Z_{w}}\frac{\langle\psi_{\mu}|w\rangle}{\langle\psi_{0}|w%\rangle}\left(\sum_{\nu}\frac{\langle w|\psi_{\nu}\rangle}{\langle w|\psi_{0}%\rangle}x_{\nu}\right).∑ start_POSTSUBSCRIPT italic_ν end_POSTSUBSCRIPT bold_S start_POSTSUBSCRIPT italic_μ italic_ν end_POSTSUBSCRIPT italic_x start_POSTSUBSCRIPT italic_ν end_POSTSUBSCRIPT = ∑ start_POSTSUBSCRIPT italic_w end_POSTSUBSCRIPT divide start_ARG italic_Z start_POSTSUBSCRIPT italic_w end_POSTSUBSCRIPT end_ARG start_ARG italic_Z end_ARG divide start_ARG | ⟨ italic_ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT | italic_w ⟩ | start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG start_ARG italic_Z start_POSTSUBSCRIPT italic_w end_POSTSUBSCRIPT end_ARG divide start_ARG ⟨ italic_ψ start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT | italic_w ⟩ end_ARG start_ARG ⟨ italic_ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT | italic_w ⟩ end_ARG ( ∑ start_POSTSUBSCRIPT italic_ν end_POSTSUBSCRIPT divide start_ARG ⟨ italic_w | italic_ψ start_POSTSUBSCRIPT italic_ν end_POSTSUBSCRIPT ⟩ end_ARG start_ARG ⟨ italic_w | italic_ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ⟩ end_ARG italic_x start_POSTSUBSCRIPT italic_ν end_POSTSUBSCRIPT ) .(29)

The cost of this calculation scales as O(NpNs)𝑂subscript𝑁𝑝subscript𝑁𝑠O(N_{p}N_{s})italic_O ( italic_N start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT italic_N start_POSTSUBSCRIPT italic_s end_POSTSUBSCRIPT ). The action of the Hamiltonian matrix can be similarly sampled. Iterative solvers can then be used to obtain spectral information of the system using only matrix vector products. In particular, the Chebyshev expansion-based kernel polynomial methodWeiße etal. (2006) allows the calculation of various dynamical correlation functions including at finite temperatures.

The statistical performance of the two approaches is shown in Fig. 13. We performed these calculations on a small Holstein chain with a truncated phonon space to allow deterministic evaluation of the spectrum in LR-VMC, which serve as reference values. We restricted the LR-VMC sampling to the same truncated Hilbert space for the sake of this comparison. While Eqs. 27 and 28 yield unbiased estimates of the 𝐇𝐇\mathbf{H}bold_H and 𝐒𝐒\mathbf{S}bold_S matrix elements, their eigenvalues are biased, as they are nonlinear functions of the matrices.Blunt etal. (2018) Using each sampling approach, we calculated averages of 100 independent calculations using various numbers of samples to compute energies of the low-lying states. As expected, there is a bias in the obtained energies for a small number of samples, which decreases systematically as we increase the number of samples in the case of reweighted sampling. On the other hand, for ground state sampling we see a persistent large bias that is nearly unchanged from around 2000 to 6000 samples and suddenly decreases for 8000 samples. This is indicative of ergodicity issues in the ground state sampling approach to LR-VMC. Here, configurations with large contributions to the excited states do not get sampled often enough if the number of samples is small, leading to a large bias, especially in the excited state energies. We also find that numerical instabilities due to linear dependencies in the basis set Blunt etal. (2018); Lee etal. (2021) are greatly reduced due to reweighted sampling.

References

  • Coropceanu etal. (2007)Coropceanu,V.; Cornil,J.; daSilvaFilho,D.A.; Olivier,Y.; Silbey,R.;Brédas,J.-L. Chemical reviews 2007, 107,926–952.
  • Zhu and Podzorov (2015)Zhu,X.-Y.; Podzorov,V. Charge carriers in hybrid organic–inorganic leadhalide perovskites might be protected as large polarons. 2015.
  • Schilcher etal. (2021)Schilcher,M.J.; Robinson,P.J.; Abramovitch,D.J.; Tan,L.Z.;Rappe,A.M.; Reichman,D.R.; Egger,D.A. ACS Energy Letters2021, 6, 2162–2173.
  • Tulyagankhodjaev etal. (2023)Tulyagankhodjaev,J.A.; Shih,P.; Yu,J.; Russell,J.C.; Chica,D.G.;Reynoso,M.E.; Su,H.; Stenor,A.C.; Roy,X.; Berkelbach,T.C.; Delor,M.Science 2023, 382, 438–442.
  • Li etal. (2020)Li,Q.; Liu,F.; Russell,J.C.; Roy,X.; Zhu,X. The Journal of ChemicalPhysics 2020, 152.
  • Verdi etal. (2017)Verdi,C.; Caruso,F.; Giustino,F. Nature Communications 2017,8, 15769.
  • Moser etal. (2013)Moser,S.; Moreschini,L.; Jaćimović,J.; Barišić,O.;Berger,H.; Magrez,A.; Chang,Y.; Kim,K.; Bostwick,A.; Rotenberg,E.;Forró,L.; Grioni,M. Physical review letters 2013,110, 196403.
  • Kang etal. (2018)Kang,M.; Jung,S.W.; Shin,W.J.; Sohn,Y.; Ryu,S.H.; Kim,T.K.;Hoesch,M.; Kim,K.S. Nature materials 2018, 17,676–680.
  • Setvin etal. (2014)Setvin,M.; Franchini,C.; Hao,X.; Schmid,M.; Janotti,A.; Kaltak,M.; VandeWalle,C.G.; Kresse,G.; Diebold,U. Physical review letters2014, 113, 086402.
  • Bardeen etal. (1957)Bardeen,J.; Cooper,L.N.; Schrieffer,J.R. Physical review1957, 108, 1175.
  • Lanzara etal. (2001)Lanzara,A.; Bogdanov,P.; Zhou,X.; Kellar,S.; Feng,D.; Lu,E.; Yoshida,T.;Eisaki,H.; Fujimori,A.; Kishio,K.; Shimoyama,J.-i.; Noda,T.;Uchida,S.-i.; Hussain,Z.; Shen,Z.-X. Nature 2001,412, 510–514.
  • Alexandrov and Kornilovitch (2002)Alexandrov,A.; Kornilovitch,P. Journal of Physics: Condensed Matter2002, 14, 5337.
  • Giustino etal. (2008)Giustino,F.; Cohen,M.L.; Louie,S.G. Nature 2008,452, 975–978.
  • Yin etal. (2013)Yin,Z.; Kutepov,A.; Kotliar,G. Physical Review X 2013,3, 021011.
  • Karakuzu etal. (2017)Karakuzu,S.; Tocchio,L.F.; Sorella,S.; Becca,F. Physical Review B2017, 96, 205145.
  • Costa etal. (2018)Costa,N.C.; Blommel,T.; Chiu,W.-T.; Batrouni,G.; Scalettar,R.Physical Review Letters 2018, 120, 187003.
  • Luo etal. (2022)Luo,H. etal. Nature communications 2022, 13,273.
  • Ly etal. (2023)Ly,A.T.; Cohen-Stead,B.; Costa,S.M.; Johnston,S. Physical Review B2023, 108, 184501.
  • Zhang etal. (2023)Zhang,C.; Sous,J.; Reichman,D.; Berciu,M.; Millis,A.; Prokof’ev,N.;Svistunov,B. Physical Review X 2023, 13, 011010.
  • Alexandrov and Devreese (2010)Alexandrov,A.S.; Devreese,J.T. Advances in polaron physics;Springer, 2010; Vol. 159.
  • Franchini etal. (2021)Franchini,C.; Reticcioli,M.; Setvin,M.; Diebold,U. Nature ReviewsMaterials 2021, 6, 560–586.
  • Giustino (2017)Giustino,F. Reviews of Modern Physics 2017, 89,015003.
  • Holstein (1959)Holstein,T. Annals of physics 1959, 8, 325–342.
  • Holstein (1959)Holstein,T. Annals of physics 1959, 8, 343–389.
  • Fröhlich (1954)Fröhlich,H. Advances in Physics 1954, 3,325–361.
  • Feynman (1955)Feynman,R.P. Physical Review 1955, 97, 660.
  • Kornilovitch (1998)Kornilovitch,P. Physical review letters 1998, 81,5382.
  • Beyl etal. (2018)Beyl,S.; Goth,F.; Assaad,F.F. Physical Review B 2018,97, 085144.
  • Lee etal. (2021)Lee,J.; Zhang,S.; Reichman,D.R. Physical Review B 2021,103, 115123.
  • Cohen-Stead etal. (2022)Cohen-Stead,B.; Bradley,O.; Miles,C.; Batrouni,G.; Scalettar,R.;Barros,K. Physical Review E 2022, 105, 065302.
  • Prokof’ev and Svistunov (1998)Prokof’ev,N.V.; Svistunov,B.V. Physical review letters1998, 81, 2514.
  • Berciu (2006)Berciu,M. Physical review letters 2006, 97,036402.
  • Lee etal. (1953)Lee,T.; Low,F.; Pines,D. Physical Review 1953, 90,297.
  • Toyozawa (1961)Toyozawa,Y. Progress of Theoretical Physics 1961, 26,29–44.
  • Brown etal. (1997)Brown,D.W.; Lindenberg,K.; Zhao,Y. The Journal of chemical physics1997, 107, 3179–3195.
  • Alder etal. (1997)Alder,B.; Runge,K.; Scalettar,R. Physical review letters1997, 79, 3022.
  • Ohgoe and Imada (2014)Ohgoe,T.; Imada,M. Physical Review B 2014, 89,195139.
  • Hohenadler etal. (2004)Hohenadler,M.; Evertz,H.G.; Vonder Linden,W. Physical Review B2004, 69, 024301.
  • Ferrari etal. (2020)Ferrari,F.; Valenti,R.; Becca,F. Physical Review B 2020,102, 125149.
  • Marsiglio (1995)Marsiglio,F. Physica C: Superconductivity 1995, 244,21–34.
  • Hohenadler etal. (2003)Hohenadler,M.; Aichhorn,M.; vonder Linden,W. Physical Review B2003, 68, 184304.
  • Jeckelmann and White (1998)Jeckelmann,E.; White,S.R. Physical Review B 1998, 57,6376.
  • Bonča etal. (1999)Bonča,J.; Trugman,S.; Batistić,I. Physical Review B1999, 60, 1633.
  • Wang etal. (2020)Wang,Y.; Esterlis,I.; Shi,T.; Cirac,J.I.; Demler,E. Physical ReviewResearch 2020, 2, 043258.
  • Baroni etal. (2001)Baroni,S.; DeGironcoli,S.; DalCorso,A.; Giannozzi,P. Reviews ofmodern Physics 2001, 73, 515.
  • Sjakste etal. (2015)Sjakste,J.; Vast,N.; Calandra,M.; Mauri,F. Physical Review B2015, 92, 054307.
  • Poncé etal. (2016)Poncé,S.; Margine,E.R.; Verdi,C.; Giustino,F. Computer PhysicsCommunications 2016, 209, 116–133.
  • Zhou etal. (2021)Zhou,J.-J.; Park,J.; Lu,I.-T.; Maliyov,I.; Tong,X.; Bernardi,M.Computer Physics Communications 2021, 264,107970.
  • Verdi and Giustino (2015)Verdi,C.; Giustino,F. Physical review letters 2015,115, 176401.
  • Lee etal. (2021)Lee,N.-E.; Chen,H.-Y.; Zhou,J.-J.; Bernardi,M. Physical ReviewMaterials 2021, 5, 063805.
  • Lafuente-Bartolome etal. (2022)Lafuente-Bartolome,J.; Lian,C.; Sio,W.H.; Gurtubay,I.G.; Eiguren,A.;Giustino,F. Physical Review Letters 2022, 129,076402.
  • Carleo and Troyer (2017)Carleo,G.; Troyer,M. Science 2017, 355, 602–606.
  • Gao and Duan (2017)Gao,X.; Duan,L.-M. Nature communications 2017, 8,662.
  • Sharir etal. (2020)Sharir,O.; Levine,Y.; Wies,N.; Carleo,G.; Shashua,A. Physical reviewletters 2020, 124, 020503.
  • Nomura etal. (2017)Nomura,Y.; Darmawan,A.S.; Yamaji,Y.; Imada,M. Physical Review B2017, 96, 205152.
  • Choo etal. (2018)Choo,K.; Carleo,G.; Regnault,N.; Neupert,T. Physical review letters2018, 121, 167204.
  • Choo etal. (2019)Choo,K.; Neupert,T.; Carleo,G. Physical Review B 2019,100, 125124.
  • Luo and Clark (2019)Luo,D.; Clark,B.K. Physical review letters 2019, 122,226401.
  • Vicentini etal. (2019)Vicentini,F.; Biella,A.; Regnault,N.; Ciuti,C. Physical reviewletters 2019, 122, 250503.
  • Szabó and Castelnovo (2020)Szabó,A.; Castelnovo,C. Physical Review Research 2020,2, 033075.
  • RobledoMoreno etal. (2022)RobledoMoreno,J.; Carleo,G.; Georges,A.; Stokes,J. Proceedings ofthe National Academy of Sciences 2022, 119,e2122059119.
  • Hermann etal. (2020)Hermann,J.; Schätzle,Z.; Noé,F. Nature Chemistry2020, 12, 891–897.
  • Pfau etal. (2020)Pfau,D.; Spencer,J.S.; Matthews,A.G.; Foulkes,W. M.C. PhysicalReview Research 2020, 2, 033429.
  • Nomura (2020)Nomura,Y. Journal of the Physical Society of Japan 2020,89, 054706.
  • Rzadkowski etal. (2022)Rzadkowski,W.; Lemeshko,M.; Mentink,J.H. Physical Review B2022, 106, 155127.
  • Hendry and Feiguin (2019)Hendry,D.; Feiguin,A.E. Physical Review B 2019, 100,245123.
  • Hendry etal. (2021)Hendry,D.; Chen,H.; Weinberg,P.; Feiguin,A.E. Physical Review B2021, 104, 205130.
  • Mendes-Santos etal. (2023)Mendes-Santos,T.; Schmitt,M.; Heyl,M. Physical Review Letters2023, 131, 046501.
  • Cai and Liu (2018)Cai,Z.; Liu,J. Physical Review B 2018, 97,035116.
  • Westerhout etal. (2020)Westerhout,T.; Astrakhantsev,N.; Tikhonov,K.S.; Katsnelson,M.I.;Bagrov,A.A. Nature communications 2020, 11,1593.
  • Su etal. (1979)Su,W.-P.; Schrieffer,J.R.; Heeger,A.J. Physical review letters1979, 42, 1698.
  • Sengupta etal. (2003)Sengupta,P.; Sandvik,A.W.; Campbell,D.K. Physical Review B2003, 67, 245103.
  • Bravyi etal. (2006)Bravyi,S.; Divincenzo,D.P.; Oliveira,R.I.; Terhal,B.M. arXivpreprint quant-ph/0606140 2006,
  • Prokof’ev and Svistunov (2022)Prokof’ev,N.V.; Svistunov,B.V. Physical Review B 2022,106, L041117.
  • Cybenko (1989)Cybenko,G. Mathematics of control, signals and systems 1989,2, 303–314.
  • Goodfellow etal. (2016)Goodfellow,I.; Bengio,Y.; Courville,A. Deep learning; MIT press,2016.
  • Roth etal. (2023)Roth,C.; Szabó,A.; MacDonald,A.H. Physical Review B2023, 108, 054410.
  • Glasser etal. (2018)Glasser,I.; Pancotti,N.; August,M.; Rodriguez,I.D.; Cirac,J.I.Physical Review X 2018, 8, 011006.
  • Humeniuk etal. (2023)Humeniuk,S.; Wan,Y.; Wang,L. SciPost Physics 2023,14, 171.
  • Tahara and Imada (2008)Tahara,D.; Imada,M. Journal of the Physical Society of Japan2008, 77, 114701.
  • Becca and Sorella (2017)Becca,F.; Sorella,S. Quantum Monte Carlo approaches for correlatedsystems; Cambridge University Press, 2017.
  • Reh etal. (2023)Reh,M.; Schmitt,M.; Gärttner,M. Physical Review B 2023,107, 195115.
  • Clark (2018)Clark,S.R. Journal of Physics A: Mathematical and Theoretical2018, 51, 135301.
  • Sabzevari and Sharma (2018)Sabzevari,I.; Sharma,S. Journal of chemical theory and computation2018, 14, 6276–6286.
  • Reddi etal. (2019)Reddi,S.J.; Kale,S.; Kumar,S. arXiv preprint arXiv:1904.092372019,
  • McWeeny (1989)McWeeny,R. 2nd edition 1989,
  • Li and Yang (2010)Li,T.; Yang,F. Physical Review B 2010, 81,214509.
  • Dorando etal. (2009)Dorando,J.J.; Hachmann,J.; Chan,G.K. The Journal of chemicalphysics 2009, 130.
  • Wouters etal. (2013)Wouters,S.; Nakatani,N.; VanNeck,D.; Chan,G. K.-L. Physical ReviewB 2013, 88, 075122.
  • Haegeman etal. (2013)Haegeman,J.; Osborne,T.J.; Verstraete,F. Physical Review B2013, 88, 075133.
  • Zhao and Neuscamman (2016)Zhao,L.; Neuscamman,E. Journal of Chemical Theory and Computation2016, 12, 3719–3726.
  • Ido etal. (2020)Ido,K.; Imada,M.; Misawa,T. Physical Review B 2020,101, 075124.
  • (93)https://github.com/ankit76/nn_eph/.
  • Carbone etal. (2021)Carbone,M.R.; Millis,A.J.; Reichman,D.R.; Sous,J. Physical ReviewB 2021, 104, L140307.
  • Zhang etal. (2021)Zhang,C.; Prokof’ev,N.V.; Svistunov,B.V. Physical Review B2021, 104, 035143.
  • Bonča and Trugman (2021)Bonča,J.; Trugman,S. Physical Review B 2021,103, 054304.
  • Zhang etal. (2022)Zhang,C.; Prokof’ev,N.V.; Svistunov,B.V. Physical Review B2022, 105, L020501.
  • Sio etal. (2019)Sio,W.H.; Verdi,C.; Poncé,S.; Giustino,F. Physical Review B2019, 99, 235139.
  • Giannozzi etal. (2017)Giannozzi,P. etal. Journal of Physics: Condensed Matter2017, 29, 465901.
  • (100)Robinson,P.J.; Lee,J.; Mahajan,A.; Reichman,D.R. (to be submitted;see parallel arXiv submission)
  • Luo etal. (2024)Luo,Y.; Desai,D.; Park,J.; Bernardi,M. arXiv preprintarXiv:2401.11393 2024,
  • Sobota etal. (2021)Sobota,J.A.; He,Y.; Shen,Z.-X. Reviews of Modern Physics2021, 93, 025006.
  • Marsiglio (1993)Marsiglio,F. Physics Letters A 1993, 180,280–284.
  • Zhang etal. (1999)Zhang,C.; Jeckelmann,E.; White,S.R. Physical Review B 1999,60, 14092.
  • Weiße etal. (2006)Weiße,A.; Wellein,G.; Alvermann,A.; Fehske,H. Reviews of modernphysics 2006, 78, 275.
  • Jansen etal. (2020)Jansen,D.; Bonča,J.; Heidrich-Meisner,F. Physical Review B2020, 102, 165155.
  • Mitrić etal. (2022)Mitrić,P.; Janković,V.; Vukmirović,N.; Tanasković,D.Physical Review Letters 2022, 129, 096401.
  • Carbone etal. (2021)Carbone,M.R.; Reichman,D.R.; Sous,J. Physical Review B2021, 104, 035106.
  • Vidmar etal. (2010)Vidmar,L.; Bonča,J.; Trugman,S.A. Physical Review B2010, 82, 104304.
  • Macridin etal. (2004)Macridin,A.; Sawatzky,G.; Jarrell,M. Physical Review B 2004,69, 245111.
  • Essler etal. (2005)Essler,F.H.; Frahm,H.; Göhmann,F.; Klümper,A.; Korepin,V.E.The one-dimensional Hubbard model; Cambridge University Press,2005.
  • Hohenadler and Assaad (2013)Hohenadler,M.; Assaad,F.F. Physical Review B 2013,87, 075149.
  • Ceperley and Bernu (1988)Ceperley,D.; Bernu,B. The Journal of chemical physics 1988,89, 6316–6328.
  • Filippi etal. (2009)Filippi,C.; Zaccheddu,M.; Buda,F. Journal of Chemical Theory andComputation 2009, 5, 2074–2087.
  • Neuscamman etal. (2012)Neuscamman,E.; Umrigar,C.; Chan,G. K.-L. Physical Review B2012, 85, 045103.
  • Sabzevari etal. (2020)Sabzevari,I.; Mahajan,A.; Sharma,S. The Journal of chemical physics2020, 152.
  • Blunt etal. (2018)Blunt,N.S.; Alavi,A.; Booth,G.H. Physical Review B 2018,98, 085118.
  • Lee etal. (2021)Lee,J.; Malone,F.D.; Morales,M.A.; Reichman,D.R. Journal ofChemical Theory and Computation 2021, 17, 3372–3387.
Structure and dynamics of electron-phonon coupled systems using neural quantum states (2024)
Top Articles
Latest Posts
Article information

Author: Nicola Considine CPA

Last Updated:

Views: 6360

Rating: 4.9 / 5 (69 voted)

Reviews: 92% of readers found this page helpful

Author information

Name: Nicola Considine CPA

Birthday: 1993-02-26

Address: 3809 Clinton Inlet, East Aleisha, UT 46318-2392

Phone: +2681424145499

Job: Government Technician

Hobby: Calligraphy, Lego building, Worldbuilding, Shooting, Bird watching, Shopping, Cooking

Introduction: My name is Nicola Considine CPA, I am a determined, witty, powerful, brainy, open, smiling, proud person who loves writing and wants to share my knowledge and understanding with you.