<TeXmacs|1.99.7>

<style|<tuple|aps|std-latex>>

<\body>
  <doc-data|<doc-title|First-principles GW calculations for DNA and RNA
  nucleobases>|<doc-author|<author-data|<author-name|Carina
  Faber<rsup|<math|1,2>>, Claudio Attaccalite<rsup|<math|1>>, Valerio
  Olevano<rsup|<math|1>>, Erich Runge<rsup|<math|2>>, Xavier
  Blase<rsup|<math|1>>>>>|<doc-date|<date|>>>

  <abstract-data|<\abstract>
    On the basis of first-principles GW calculations, we study the
    quasiparticle properties of the guanine, adenine, cytosine, thymine, and
    uracil DNA and RNA nucleobases. Beyond standard
    G<rsub|<math|0>>W<rsub|<math|0>> calculations, starting from Kohn-Sham
    eigenstates obtained with (semi)local functionals, a simple
    self-consistency on the eigenvalues yields vertical ionization energies
    and electron affinities within an average error of 0.11 eV and 0.18 eV,
    respectively, as compared to state-of-the-art coupled-cluster and
    multi-configurational perturbative quantum chemistry approaches. Further,
    GW calculations predict the correct <math|\<pi\>>-character of the
    highest occupied state, thanks to several level crossings between density
    functional and GW calculations. Our study is based on a recent
    gaussian-basis implementation of GW with explicit treatment of dynamical
    screening through contour deformation techniques.
  </abstract>>

  <\with|par-columns|1>
    <\big-figure>
      <image|figure1|1par|||><label|structures>
    </big-figure|(Color online) Schematic representation of the molecular
    structure of (a) guanine (G9K), (b) adenine, (c) cytosine (C1), (d)
    thymine, and (e) uracil. Black, brown, red, white atoms are carbon,
    nitrogen, oxygen, and hydrogen, respectively. The G9K and C1 notations
    for the guanine and cytosine tautomers are consistent with
    Ref.<nbsp><cite-arg|Bravaya10>. >
  </with>

  The determination of the ionization energies (IE), electronic affinities
  (EA) and character of the frontier orbitals of DNA and RNA nucleobases is
  an important step towards a better understanding of the electronic
  properties and reactivity of nucleotides and nucleosides along the DNA/RNA
  chains. Important phenomena such as nucleobase/protein interactions,
  which define the DNA functions <cite|polymerase>, or damage to the genetic
  material through oxidation or ionizing radiation <cite|Lodish>, are
  strongly related to these fundamental spectroscopic quantities. Even though
  nucleobases in DNA/RNA strands are connected within the nucleotides to
  phosphate groups through a five-carbon sugar, several studies show that the
  highest-occupied orbital (the HOMO level) in nucleotides, which is
  responsible e.g. for the sensitivity of the molecule to oxidation
  processes, remains localized on the nucleobases <cite|Close08>.
  Figure<nbsp><reference|structures> shows the structures of the DNA and RNA
  nucleobases, i.e. the purines, adenine (A) and guanine (G), and the
  pyrimidines, cytosine (C) as well as thymine (T) in DNA and uracil (U) in
  RNA.

  Besides the overarching fundamental interest in understanding complex
  biological processes at the microscopic level, <with|font-shape|italic|ab
  initio> calculations of isolated nucleobases are interesting since recent
  high-level quantum chemistry calculations
  <cite|Sanjuan06|Sanjuan08|Bravaya10> make it possible to rationalize the
  rather large spread of experimental results concerning the electronic
  properties of the nucleobases in the gas phase
  <cite|Hush75|Dougherty78|Choi05|Trofimov06|Schwell08|Zaytseva09|Kostko10>,
  in particular as a consequence of the existence of several isomers for
  guanine and cytosine <cite|Bravaya10>. Thus, these molecules offer a
  valuable means to explore the merits of the so-called GW formalism
  <cite|Hedin65|Strinati80|Hybertsen86|Godby88|Onida02> for isolated organic
  molecules, along the lines of recent systematic studies of small molecules
  <cite|Rostgaard10> or molecules such as fullerenes or porphyrins of
  interest for electronic or photovoltaic applications
  <cite|Dori06|Tiago08|Palumno09|Umari09|Stenuit10|Blase10>.

  In the present work, we study by means of first-principles GW calculations
  the quasiparticle properties of the DNA and RNA nucleobases, namely
  guanine, adenine, cytosine, thymine and uracil. We show in particular that
  the GW correction to the Kohn-Sham eigenvalues brings the ionization
  energies into much better agreement with experiment and high-level quantum
  chemistry calculations. These results demonstrate the importance of
  self-consistency on the eigenvalues when performing GW calculations in
  molecular systems starting from (semi)local DFT functionals, and the merits
  of a simple scheme based on a G<rsub|<math|0>>W<rsub|<math|0>> calculation
  starting from Hartree-Fock-like eigenvalues.

  The GW approach is a Green's function formalism usually derived within a
  functional derivative treatment <cite|Hedin65|Schwinger59>, which shows
  that the two-body Green's function (<math|G<rsub|2>>), involved in
  the equation of motion of the one-body time-ordered Green's function
  <math|G>, can be recast into a non-local and energy-dependent self-energy
  operator <math|\<Sigma\><around|(|<math-bf|r>,<with|math-font-family|bf|r<rprime|'>>\|E|)>>.
  This self-energy <math|\<Sigma\>> accounts for exchange and correlation in
  the present formalism. Since it is energy-dependent, it must be evaluated
  at the quasiparticle energies
  <math|E=\<varepsilon\><rsub|i><rsup|<with|math-font-family|rm|QP>>>, where
  <math|i> indexes the molecular energy levels. This self-energy involves
  <math|G<around|(|<math-bf|r>,<with|math-font-family|bf|r<rprime|'>>\|\<omega\>|)>>,
  the dynamically-screened Coulomb potential
  <math|W<around|(|<math-bf|r>,<with|math-font-family|bf|r<rprime|'>>\|\<omega\>|)>>,
  and the so-called vertex correction <math|\<Gamma\>>. A set of exact
  self-consistent (closed) equations connects <math|G>, <math|W>,
  <math|\<Gamma\>>, and the independent-electron/full polarisabilities
  <math|\<chi\><rsub|0><around|(|<math-bf|r>,<with|math-font-family|bf|r<rprime|'>>\|\<omega\>|)>>
  and <math|\<chi\><around|(|<math-bf|r>,<with|math-font-family|bf|r<rprime|'>>\|\<omega\>|)>>,
  respectively. In the GW approximation (GWA), the three-body vertex operator
  <math|\<Gamma\>> is set to unity, yielding the following expression for the
  self-energy:

  <eqnarray*|<tformat|<table|<row|<cell|\<Sigma\><around|(|<math-bf|r>,<with|math-font-family|bf|r<rprime|'>>\|E|)>>|<cell|=>|<cell|<frac|i|2*\<pi\>>*<big|int>d*\<omega\>*<space|0.17em>e<rsup|i*\<omega\>*0<rsup|+>>*G*<around|(|<math-bf|r>,<with|math-font-family|bf|r<rprime|'>>\|E+\<omega\>|)>*W<around|(|<math-bf|r>,<with|math-font-family|bf|r<rprime|'>>\|\<omega\>|)>>>|<row|<cell|<wide|W|~><around|(|<math-bf|r>,<with|math-font-family|bf|r<rprime|'>>\|\<omega\>|)>>|<cell|=>|<cell|<big|int>d<with|math-font-family|bf|r<rsub|1>>d<with|math-font-family|bf|r<rsub|2>><space|0.17em>v<around|(|<math-bf|r>,<with|math-font-family|bf|r<rsub|1>>|)>*\<chi\><rsub|0><around|(|<with|math-font-family|bf|r<rsub|1>>,<with|math-font-family|bf|r<rsub|2>>\|\<omega\>|)>*W<around|(|<with|math-font-family|bf|r<rsub|2>>,<with|math-font-family|bf|r<rprime|'>>\|\<omega\>|)>,>>|<row|<cell|\<chi\><rsub|0><around|(|<math-bf|r>,<with|math-font-family|bf|r<rprime|'>>\|\<omega\>|)>>|<cell|=>|<cell|<big|sum><rsub|i,j><around|(|f<rsub|i>-f<rsub|j>|)>*<frac|\<phi\><rsub|i><rsup|\<ast\>><around|(|<math-bf|r>|)>*\<phi\><rsub|j><around|(|<math-bf|r>|)>*\<phi\><rsub|j><rsup|\<ast\>><around|(|<with|math-font-family|bf|r<rprime|'>>|)>*\<phi\><rsub|i><around|(|<with|math-font-family|bf|r<rprime|'>>|)>|\<varepsilon\><rsub|i>-\<varepsilon\><rsub|j>+\<omega\>\<pm\>i*\<delta\>>>>>>>

  where <math|v<around|(|<math-bf|r>,<with|math-font-family|bf|r<rprime|'>>|)>>
  is the bare (unscreened) Coulomb potential and <math|<wide|W|~>=W-v>. The
  <math|<around|(|\<varepsilon\><rsub|i>,\<phi\><rsub|i>|)>> are
  ``zeroth-order'' one-body eigenstates. Following the large body of work
  <cite|Onida02> devoted to GW calculations in solids, surfaces, graphene,
  nanotubes, or nanowires, we use here Kohn-Sham DFT-LDA eigenstates. It is
  shown below, and in Refs.<nbsp><cite-arg|Rostgaard10,Kaasbjerg10,Blase10,Hahn05>,
  that Hartree-Fock (or hybrid) solutions may constitute better starting
  points for molecular systems. <math|<around|(|f<rsub|i>,f<rsub|j>|)>> are
  Fermi-Dirac occupation numbers, and <math|\<delta\>> an infinitesimal such
  that the poles of <math|W> fall in the second and fourth quadrants of the
  complex plane. In the GW approximation, the self-energy operator can be
  loosely interpreted as a generalization of the Hartree-Fock method by
  replacing the bare Coulomb potential with a dynamically screened Coulomb
  interaction accounting both for exchange and (dynamical) correlations. An
  important feature of the GW approach is that not only ionization energies
  and electronic affinities can be calculated, but also the full
  quasiparticle spectrum. Further, both localized and infinite systems can be
  treated on the same footing with long and short range screening
  automatically accounted for in the construction of the screened Coulomb
  potential <math|W>. More details about the present implementation can be
  found in Ref.<nbsp><cite-arg|Blase10>.
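
  As an illustration of the second equation above, the relation W = v + v chi0 W becomes a linear system once the two-point operators are expanded in a finite basis. The following Python sketch is not taken from the present implementation: it assumes an orthonormal auxiliary basis (so that no overlap matrices appear), a single frequency, and small random matrices standing in for v and chi0. The function name screened_coulomb is hypothetical.

```python
import numpy as np


def screened_coulomb(v, chi0):
    """Solve W = v + v chi0 W for W in a finite, orthonormal basis (assumption)."""
    n = v.shape[0]
    # Rearranged as (1 - v chi0) W = v and solved as a linear system,
    # which avoids forming an explicit matrix inverse.
    return np.linalg.solve(np.eye(n) - v @ chi0, v)


# Toy matrices at one frequency; these are NOT physical data.
rng = np.random.default_rng(0)
a = rng.normal(size=(4, 4))
v = a @ a.T + 4.0 * np.eye(4)      # symmetric positive-definite stand-in for v
b = rng.normal(size=(4, 4))
chi0 = -0.01 * (b @ b.T)           # negative semi-definite stand-in for chi0

w = screened_coulomb(v, chi0)
w_tilde = w - v                    # the quantity called W-tilde in the text
residual = np.max(np.abs(w_tilde - v @ chi0 @ w))
print(residual)
```

  The residual checks that W-tilde = v chi0 W holds for the computed W, which is the matrix form of the integral relation quoted above.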

  Our calculations are based on a recently developed implementation of the GW
  formalism (the <with|font-shape|small-caps|Fiesta> code) using a gaussian
  auxiliary basis to expand the two-point operators such as the Coulomb
  potential, the susceptibilities or the self-energy <cite|Blase10>.
  Dynamical correlations are included explicitly through contour deformation
  techniques. We start with a ground-state DFT calculation using the
  <with|font-shape|small-caps|Siesta> package <cite|siesta> and a large
  triple-zeta with double polarization (TZDP) basis <cite|KSbasis>. We fit
  the radial part of the numerical basis generated by the
  <with|font-shape|small-caps|Siesta> code by up to five contracted gaussians
  in order to facilitate the calculation of the Coulomb matrix elements and
  of the matrix elements <math|<around|\<langle\>|\<phi\><rsub|i><around|\||\<beta\>|\|>*\<phi\><rsub|j>|\<rangle\>>>
  of the auxiliary basis (<math|\<beta\>>) between Kohn-Sham states. Such a
  scheme makes it possible to exploit the analytic relations for the products of
  gaussian orbitals centered on different atoms or for their Fourier
  transform <cite|Blase10>. Our auxiliary basis for first row elements is the
  tempered basis <cite|Cherkes09> developed by Kaczmarski and coworkers
  <cite|Kaczmarski10>. Such a basis was tested recently in a systematic study
  of several molecules of interest for photovoltaic applications
  <cite|Blase10>. Four gaussians for each l-channel with localization
  coefficients <math|\<alpha\>>=(0.2,0.5,1.25,3.2) a.u. are used for the
  (<with|font-shape|italic|s,p,d>) channels of C, O, and N atoms, while three
  gaussians with <math|\<alpha\>>=(0.1,0.4,1.5) a.u. describe hydrogen
  <cite|largerbasis>.
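
  One of the analytic relations alluded to above is the gaussian product theorem: the product of two s-type gaussians centered on different atoms is a single gaussian centered on the line joining them. The Python sketch below checks this numerically at one point. The exponents 0.5 and 1.25 a.u. are taken from the auxiliary basis listed above, while the centers, the evaluation point and the function names are arbitrary choices made for illustration; higher angular momenta are not covered.

```python
import numpy as np


def gaussian_product(alpha, center_a, beta, center_b):
    """Return (prefactor, exponent, center) of the product of two s-type gaussians."""
    center_a = np.asarray(center_a, dtype=float)
    center_b = np.asarray(center_b, dtype=float)
    gamma = alpha + beta
    # The new center is the exponent-weighted average of the two centers.
    center_p = (alpha * center_a + beta * center_b) / gamma
    # Constant prefactor depending only on the distance between the centers.
    prefactor = np.exp(-alpha * beta / gamma * np.sum((center_a - center_b) ** 2))
    return prefactor, gamma, center_p


def s_gaussian(exponent, center, r):
    """Unnormalized s-type gaussian evaluated at point r."""
    diff = np.asarray(r, dtype=float) - np.asarray(center, dtype=float)
    return np.exp(-exponent * np.sum(diff ** 2))


alpha, beta = 0.5, 1.25                                  # exponents in a.u.
center_a, center_b = (0.0, 0.0, 0.0), (1.2, 0.0, 0.3)    # arbitrary centers
k, gamma, center_p = gaussian_product(alpha, center_a, beta, center_b)

r = (0.4, -0.2, 0.7)                                     # arbitrary test point
lhs = s_gaussian(alpha, center_a, r) * s_gaussian(beta, center_b, r)
rhs = k * s_gaussian(gamma, center_p, r)
print(abs(lhs - rhs))
```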

  <with|font-series|bold|Ionization energies.> We now comment on the values
  of the calculated first ionization energy (IE) as compiled in
  Table<nbsp><reference|tableIE> and Fig.<nbsp><reference|iefig>. The
  comparison to the experimental data is complicated by the 0.2-0.3 eV range
  spanned by the various experimental reports (vertical bars in
  Fig.<nbsp><reference|iefig>). An additional
  complication in the case of cytosine and guanine, beyond the intrinsic
  difficulties in accurately measuring ionization energies in the gas phase,
  is that several gas phase tautomers exist <cite|Bravaya10> which differ
  from the so-called C1-cytosine and G9K-guanine isomers commonly found in
  DNA (see Fig.<nbsp><reference|structures>). State-of-the-art
  <with|font-shape|italic|ab initio> quantum chemistry calculations, namely
  coupled-cluster CCSD(T) and multiconfigurational perturbation (CASPT2)
  methods <cite|Sanjuan06|Sanjuan08>, studied the nucleobase tautomers that
  can be found along the DNA/RNA strands. More recently, equation-of-motion
  coupled-cluster techniques (EOM-IP-CCSD) were applied to several isomers
  <cite|Bravaya10>. All methods agree to within 0.04 eV for the average IE of
  the A, G, C, T tautomers we consider here, with a maximum discrepancy of
  0.09 eV in the case of thymine. The CASPT2 and CCSD(T) calculations agree
  to within 0.03 eV for all molecules. These theoretical IEs are commonly
  considered as the most reliable references and fall within the experimental
  error bars, except for the cytosine (C1) case where the calculated IEs are
  slightly smaller than the experimental lower bound <cite|IPisomers> (see
  Table<nbsp><reference|tableIE> and Fig.<nbsp><reference|iefig>).

  Clearly, the ionization energy within DFT-LDA, as given by the negative
  HOMO Kohn-Sham level energy, significantly underestimates the IE by an
  average of <math|\<sim\>>2.5 eV (29<math|%>) <cite|PBE>. The self-energy
  correction at the G<rsub|<math|0>>W<rsub|<math|0>>(LDA) level improves the
  situation very significantly by bringing the average error down to 0.5 eV
  (5.7<math|%>) as compared to state-of-the-art quantum chemistry results.
  However, as emphasized in recent papers
  <cite|Rostgaard10|Kaasbjerg10|Blase10|Hahn05>, the overscreening induced by
  starting with LDA eigenvalues, which dramatically underestimate the band
  gap, tends to produce too small ionization energies. This problem can be
  solved at least partly by performing a simple self-consistency on the
  eigenvalues. We shall refer to this approach as GW henceforth. Such a
  self-consistency on the eigenvalues leads to a much reduced average error
  of 0.11 eV (<math|\<sim\>>1.3<math|%>) as compared to the quantum chemistry
  reference. This good agreement certainly indicates the reliability of the
  present GW scheme for such systems. As shown in
  Fig.<nbsp><reference|iefig>, the largest discrepancies are observed for
  guanine and adenine (the purines), while the agreement is excellent for the
  three remaining bases.
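
  The average errors quoted above can be checked against the HOMO rows of the table. The Python sketch below transcribes those rows and computes mean absolute errors. Taking the CCSD(T) column alone as the reference is an assumption made here for illustration; with that choice the printed values agree with the MAE HOMO row to the quoted precision.

```python
# First ionization energies in eV, transcribed from the HOMO rows of the table
# (order: G, A, C, T, U).
ie = {
    "LDA-KS":       [5.69, 6.02, 6.167, 6.54, 6.72],
    "G0W0(LDA)":    [7.49, 7.90, 8.21, 8.64, 9.03],
    "GW":           [7.81, 8.22, 8.73, 9.05, 9.47],
    "G0W0(HFdiag)": [7.76, 8.23, 9.05, 9.05, 9.73],
}
# CCSD(T) values from the CAS/CC column, used as reference (assumption).
ccsdt = [8.09, 8.40, 8.76, 9.04, 9.43]


def mean_absolute_error(values, reference):
    """Average of the absolute differences between two equally long lists."""
    return sum(abs(x - y) for x, y in zip(values, reference)) / len(reference)


mae = {name: mean_absolute_error(vals, ccsdt) for name, vals in ie.items()}
for name, err in mae.items():
    print(f"{name}: {err:.2f} eV")
```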

  <\big-figure>
    <image|figure2|0.45tex-text-width|||><label|iefig>
  </big-figure|(Color online) Ionization energies in eV. The vertical
  (maroon) error bars indicate the experimental range. Triangles up (light
  blue): LDA values; (green) squares: G<rsub|<math|0>>W<rsub|<math|0>>(LDA)
  values; full black diamond: GW values; (red) empty circles (QuantChem
  abbreviation): quantum chemistry, namely CCSD(T), CASPT2 and EOM-IP-CCSD,
  values (see text). >

  In recent work, it was shown that for small molecules a non-self-consistent
  G<rsub|<math|0>>W<rsub|<math|0>> calculation starting from Hartree-Fock
  eigenstates yields better ionization energies than a fully
  self-consistent GW calculation where the wavefunctions are updated as well
  <cite|Rostgaard10|Kaasbjerg10>. Consistent with this observation, a simple
  scheme relying on a Hartree-Fock-like approach was successfully tested on
  silane, disilane, and water <cite|Hahn05>, and on larger molecules such as
  fullerenes or porphyrins <cite|Blase10>. In this
  ``G<rsub|<math|0>>W<rsub|<math|0>> on Hartree-Fock (HF)''
  <with|font-shape|italic|ansatz>, the input eigenvalues
  (<math|<wide|\<epsilon\>|~><rsub|n>>) are computed within a diagonal
  first-order perturbation theory where the DFT exchange-correlation
  contribution to the eigenvalues is replaced by the Fock exchange integral,
  namely:

  <\equation*>
    <wide|\<epsilon\>|~><rsub|n>=\<epsilon\><rsub|n><rsup|<with|math-font-family|rm|LDA>>+<around|\<langle\>|\<psi\><rsub|n><rsup|<with|math-font-family|rm|LDA>><around|\||\<Sigma\><rsub|x>-V<rsub|x*c><rsup|<with|math-font-family|rm|LDA>>|\|>*\<psi\><rsub|n><rsup|<with|math-font-family|rm|LDA>>|\<rangle\>>,
  </equation*>

  where <math|\<Sigma\><rsub|x>> is the Fock operator. This approach, labeled
  G<rsub|<math|0>>W<rsub|<math|0>>(HF<rsub|diag>) in
  Table<nbsp><reference|tableIE>, produces an average error of 0.22 eV
  (<math|\<sim\>>2.6<math|%>). This good agreement with both the GW and
  quantum chemistry calculations clearly speaks in favor of this simple
  scheme for molecular systems, or of the full
  G<rsub|<math|0>>W<rsub|<math|0>>(HF) calculations tested in
  Ref.<nbsp><cite-arg|Rostgaard10>, both of which avoid seeking
  self-consistency. A difficult issue lying ahead concerns e.g. hybrid
  systems, such as semiconducting surfaces grafted with organic molecules, for
  which it is not yet clear what the best starting point should be.
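
  The first-order expression above for the Hartree-Fock-like input eigenvalues amounts to adding the diagonal matrix elements of the difference between the Fock exchange and the LDA exchange-correlation potential to the LDA eigenvalues. The Python sketch below illustrates this under stated assumptions: the operators are given as matrices in an orthonormal basis, the eigenvector coefficients are stored column-wise, and all numbers are toy values rather than results of this work. The function name hf_diag_eigenvalues is hypothetical.

```python
import numpy as np


def hf_diag_eigenvalues(eps_lda, coeffs, sigma_x, v_xc):
    """Add the expectation value of (Sigma_x - V_xc) in each state to eps_lda.

    coeffs[:, n] holds the expansion coefficients of state n in an
    orthonormal basis (assumption). Off-diagonal couplings between
    different states are ignored, as in a diagonal first-order treatment.
    """
    delta = sigma_x - v_xc
    # correction[n] = sum over m, k of conj(C[m, n]) * delta[m, k] * C[k, n]
    correction = np.einsum("mn,mk,kn->n", coeffs.conj(), delta, coeffs).real
    return eps_lda + correction


# Toy two-level example in eV; NOT data from the table.
eps_lda = np.array([-6.0, -2.0])
coeffs = np.eye(2)                                  # states coincide with basis functions
sigma_x = np.array([[-12.0, 0.3], [0.3, -4.0]])
v_xc = np.array([[-9.0, 0.1], [0.1, -5.0]])

eps_hf = hf_diag_eigenvalues(eps_lda, coeffs, sigma_x, v_xc)
print(eps_hf)
```

  In this toy case the occupied level moves down and the empty level moves up, which is the qualitative gap opening expected when LDA exchange-correlation is replaced by Fock exchange.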

  <\big-figure>
    <image|figure3|0.45tex-text-width|||><label|waves>
  </big-figure|(Color online) Isodensity surface plot of the HOMO
  (<math|\<sigma\><rsub|O>>), HOMO-1 (<math|\<pi\>>), HOMO-2
  (<math|\<sigma\>>), and HOMO-3 (<math|\<pi\><rprime|'>>) LDA Kohn-Sham
  eigenstates of cytosine. Within GW, the ordering of states becomes
  <math|\<pi\>>,<math|\<pi\><rprime|'>>,<math|\<sigma\><rsub|O>>,
  <math|\<sigma\>> for HOMO to HOMO-3 (see text). >

  <with|par-columns|1|<big-table|<tabular*|<tformat|<cwith|1|-1|1|1|cell-lborder|2ln>|<cwith|1|-1|1|1|cell-hyphen|t>|<cwith|1|-1|1|1|cell-hmode|exact>|<cwith|1|-1|1|1|cell-width|0.11tex-text-width>|<cwith|1|-1|2|2|cell-hyphen|t>|<cwith|1|-1|2|2|cell-hmode|exact>|<cwith|1|-1|2|2|cell-width|0.10tex-text-width>|<cwith|1|-1|3|3|cell-hyphen|t>|<cwith|1|-1|3|3|cell-hmode|exact>|<cwith|1|-1|3|3|cell-width|0.10tex-text-width>|<cwith|1|-1|4|4|cell-hyphen|t>|<cwith|1|-1|4|4|cell-hmode|exact>|<cwith|1|-1|4|4|cell-width|0.09tex-text-width>|<cwith|1|-1|5|5|cell-hyphen|t>|<cwith|1|-1|5|5|cell-hmode|exact>|<cwith|1|-1|5|5|cell-width|0.12tex-text-width>|<cwith|1|-1|6|6|cell-hyphen|t>|<cwith|1|-1|6|6|cell-hmode|exact>|<cwith|1|-1|6|6|cell-width|0.15tex-text-width>|<cwith|1|-1|7|7|cell-hyphen|t>|<cwith|1|-1|7|7|cell-hmode|exact>|<cwith|1|-1|7|7|cell-width|0.09tex-text-width>|<cwith|1|-1|8|8|cell-hyphen|t>|<cwith|1|-1|8|8|cell-hmode|exact>|<cwith|1|-1|8|8|cell-width|0.16tex-text-width>|<cwith|1|-1|8|8|cell-rborder|1ln>|<cwith|1|-1|1|-1|cell-valign|c>|<cwith|1|1|1|1|cell-col-span|8>|<cwith|1|1|1|1|cell-lborder|2ln>|<cwith|1|1|1|1|cell-halign|c>|<cwith|1|1|1|1|cell-rborder|2ln>|<cwith|5|5|1|-1|cell-bborder|1ln>|<cwith|8|8|1|-1|cell-bborder|1ln>|<cwith|13|13|1|-1|cell-bborder|1ln>|<cwith|16|16|1|-1|cell-bborder|1ln>|<table|<row|<cell|<with|font-series|bold|Vertical ionization energies and vertical electronic
  affinities>>|<cell|>|<cell|>|<cell|>|<cell|>|<cell|>|<cell|>|<cell|>>|<row|<cell|>|<cell|LDA-KS>|<cell|G<rsub|<math|0>>W<rsub|<math|0>>(LDA)>|<cell|GW>|<cell|G<rsub|<math|0>>W<rsub|<math|0>>(HF<rsub|diag>)>|<cell|CAS<rsup|<math|a,b>>/CC<rsup|<math|a,b>>>|<cell|EOM<rsup|<math|c>>>|<cell|Experiment<rsup|<math|d,e,f,g>>>>|<row|<cell|G-LUMO>|<cell|1.80>|<cell|-1.04>|<cell|-1.58>|<cell|-1.77>|<cell|-1.14<rsup|<math|a>>/>|<cell|>|<cell|>>|<row|<cell|G-HOMO>|<cell|5.69>|<cell|7.49>|<cell|7.81>|<cell|7.76>|<cell|8.09<rsup|<math|b>>/8.09<rsup|<math|b>>>|<cell|8.15>|<cell|8.0-8.3<rsup|<math|d>>>>|<row|<cell|G-HOMO-1>|<cell|6.34>|<cell|8.78>|<cell|9.82>|<cell|9.78>|<cell|9.56<rsup|<math|b>>/>|<cell|9.86>|<cell|9.90<rsup|<math|g>>>>|<row|<cell|A-LUMO>|<cell|2.22>|<cell|-0.64>|<cell|-1.14>|<cell|-1.30>|<cell|-0.91<rsup|<math|a>>/>|<cell|>|<cell|-0.56
  to -0.45<rsup|<math|e>>>>|<row|<cell|A-HOMO>|<cell|6.02>|<cell|7.90>|<cell|8.22>|<cell|8.23>|<cell|8.37<rsup|<math|b>>/8.40<rsup|<math|b>>>|<cell|8.37>|<cell|8.3-8.5<rsup|<math|d>>,
  8.47<rsup|<math|f>>>>|<row|<cell|A-HOMO-1>|<cell|6.28>|<cell|8.75>|<cell|9.47>|<cell|9.51>|<cell|9.05<rsup|<math|b>>/>|<cell|9.37>|<cell|9.45<rsup|<math|f>>>>|<row|<cell|C-LUMO>|<cell|2.57>|<cell|-0.45>|<cell|-0.91>|<cell|-1.05>|<cell|-0.69<rsup|<math|a>>/-0.79<rsup|<math|a>>>|<cell|>|<cell|-0.55
  to -0.32<rsup|<math|e>>>>|<row|<cell|C-HOMO>|<cell|6.167
  (<math|\<sigma\><rsub|O>>)>|<cell|8.21 (<math|\<pi\>>)>|<cell|8.73
  (<math|\<pi\>>)>|<cell|9.05 (<math|\<pi\>>)>|<cell|8.73<rsup|<math|b>>
  (<math|\<pi\>>)/8.76<rsup|<math|b>>>|<cell|8.78
  (<math|\<pi\>>)>|<cell|8.8-9.0<rsup|<math|d>>,
  8.89<rsup|<math|f>>>>|<row|<cell|C-HOMO-1>|<cell|6.172
  (<math|\<pi\>>)>|<cell|8.80 (<math|\<sigma\><rsub|O>>)>|<cell|9.52
  (<math|\<pi\>>')>|<cell|9.87 (<math|\<pi\>>')>|<cell|9.42<rsup|<math|b>>
  (<math|\<sigma\><rsub|O>>)/>|<cell|9.54
  (<math|\<pi\>>')>|<cell|9.45<rsup|<math|g>>,
  9.55<rsup|<math|f>>>>|<row|<cell|C-HOMO-2>|<cell|6.806
  (<math|\<sigma\>>)>|<cell|8.92 (<math|\<pi\>>')>|<cell|9.89
  (<math|\<sigma\><rsub|O>>)>|<cell|10.36
  (<math|\<sigma\><rsub|O>>)>|<cell|9.49<rsup|<math|b>>
  (<math|\<pi\>>')/>|<cell|9.65 (<math|\<sigma\><rsub|O>>)>|<cell|9.89<rsup|<math|f>>>>|<row|<cell|C-HOMO-3>|<cell|6.809
  (<math|\<pi\><rprime|'>>)>|<cell|9.38 (<math|\<sigma\>>)>|<cell|10.22
  (<math|\<sigma\>>)>|<cell|10.64 (<math|\<sigma\>>)>|<cell|9.88<rsup|<math|b>>(<math|\<sigma\>>)/>|<cell|10.06
  (<math|\<sigma\>>)>|<cell|11.20<rsup|<math|f>>>>|<row|<cell|T-LUMO>|<cell|2.83>|<cell|-0.14>|<cell|-0.67>|<cell|-0.77>|<cell|-0.60<rsup|<math|a>>/-0.65<rsup|<math|a>>>|<cell|>|<cell|-0.53
  to -0.29<rsup|<math|e>>>>|<row|<cell|T-HOMO>|<cell|6.54>|<cell|8.64>|<cell|9.05>|<cell|9.05>|<cell|9.07<rsup|<math|b>>/9.04<rsup|<math|b>>>|<cell|9.13>|<cell|9.0-9.2<rsup|<math|d>>,
  9.19<rsup|<math|f>>>>|<row|<cell|T-HOMO-1>|<cell|6.68>|<cell|9.34>|<cell|10.41>|<cell|10.40>|<cell|9.81<rsup|<math|b>>/>|<cell|10.13>|<cell|9.95-10.05<rsup|<math|d>>,10.14<rsup|<math|f>>>>|<row|<cell|U-LUMO>|<cell|3.01>|<cell|-0.11>|<cell|-0.64>|<cell|-0.71>|<cell|-0.61<rsup|<math|a>>/-0.64<rsup|<math|a>>>|<cell|>|<cell|-0.30
  to -0.22<rsup|<math|e>>>>|<row|<cell|U-HOMO>|<cell|6.72
  (<math|\<sigma\><rsub|O>>)>|<cell|9.03 (<math|\<pi\>>)>|<cell|9.47
  (<math|\<pi\>>)>|<cell|9.73 (<math|\<pi\>>)>|<cell|9.42<rsup|<math|b>>
  (<math|\<pi\>>)/9.43<rsup|<math|b>>>|<cell|>|<cell|9.4-9.6<rsup|<math|d>>>>|<row|<cell|U-HOMO-1>|<cell|6.88
  (<math|\<pi\>>)>|<cell|9.45 (<math|\<sigma\><rsub|O>>)>|<cell|10.54
  (<math|\<sigma\><rsub|O>>)>|<cell|10.96
  (<math|\<sigma\><rsub|O>>)>|<cell|9.83<rsup|<math|b>>
  (<math|\<sigma\><rsub|O>>)/>|<cell|>|<cell|10.02-10.13<rsup|<math|d>>>>|<row|<cell|U-HOMO-2>|<cell|7.55
  (<math|\<sigma\>>)>|<cell|9.88 (<math|\<pi\>>')>|<cell|10.66
  (<math|\<pi\>>')>|<cell|11.06 (<math|\<pi\>>')>|<cell|10.41<rsup|<math|b>>
  (<math|\<pi\>>')/>|<cell|>|<cell|10.51-10.56<rsup|<math|d>>>>|<row|<cell|U-HOMO-3>|<cell|7.66
  (<math|\<pi\>>')>|<cell|10.33 (<math|\<sigma\>>)>|<cell|11.48
  (<math|\<sigma\>>)>|<cell|11.90 (<math|\<sigma\>>)>|<cell|10.86<rsup|<math|b>>
  (<math|\<sigma\>>)/>|<cell|>|<cell|10.90-11.16<rsup|<math|d>>>>|<row|<cell|MAE LUMO>|<cell|3.29>|<cell|0.33>|<cell|0.18>|<cell|0.31>|<cell|>|<cell|>|<cell|>>|<row|<cell|MAE
  HOMO>|<cell|2.5>|<cell|0.5>|<cell|0.11>|<cell|0.22>|<cell|>|<cell|>|<cell|>>>>><label|tableIE>|Vertical
  ionization energies and electronic affinities in eV as obtained from the
  negative Kohn-Sham eigenvalues (LDA-KS), from non-self-consistent
  G<rsub|<math|0>>W<rsub|<math|0>>(LDA) calculations, from a GW calculation
  with self-consistency on the eigenvalues (GW), and from a
  non-self-consistent G<rsub|<math|0>>W<rsub|<math|0>>(HF<rsub|diag>)
  calculation starting from Hartree-Fock-like eigenvalues. The
  <math|\<sigma\>> or <math|\<pi\>> character of the wavefunctions is
  indicated when the GW correction changes the level ordering as compared to
  DFT-LDA (see text). The acronyms CAS, CC and EOM stand for CASPT2, CCSD(T)
  and equation of motion coupled-cluster high-level many body quantum
  chemistry calculations, respectively. Theoretical values are reported for
  the C1-cytosine and G9K-guanine, while the experimental values average over
  several tautomers. The MAE is the mean absolute error in eV as compared to
  the quantum chemistry reference calculations in columns 6 and 7.
  <rsup|<math|a>>Ref.<nbsp><cite-arg|Sanjuan08>.
  <rsup|<math|b>>Ref.<nbsp><cite-arg|Sanjuan06>.
  <rsup|<math|c>>Ref.<nbsp><cite-arg|Bravaya10>. <rsup|<math|d>>Compiled in
  Ref.<nbsp><cite-arg|Sanjuan06>. <rsup|<math|e>>Compiled in
  Ref.<nbsp><cite-arg|Sanjuan08>. <rsup|<math|f>>Ref.<nbsp><cite-arg|Trofimov06>.
  <rsup|<math|g>>Ref.<nbsp><cite-arg|Dougherty78>. >>

  Next, we address the character of the HOMO level of cytosine and uracil. It
  changes from DFT-LDA to GW calculations. We plot in
  Fig.<nbsp><reference|waves>(a-d) the C1-cytosine DFT-LDA Kohn-Sham HOMO to
  (HOMO-3) eigenstates. The LDA HOMO level is an in-plane <math|\<sigma\>>
  state with a strong component on the (<with|font-shape|italic|p><rsub|<math|x>>,<with|font-shape|italic|p><rsub|<math|y>>)
  oxygen orbitals. Such a state is labeled <math|\<sigma\><rsub|O>> in
  Table<nbsp><reference|tableIE> and in the following. The (HOMO-1) level is a more standard
  <math|\<pi\>>-state with weight on the oxygen
  (<with|font-shape|italic|p><rsub|<math|z>>) orbital and a delocalized
  benzene ring <math|\<pi\>> molecular orbital. Within the
  G<rsub|<math|0>>W<rsub|<math|0>>(LDA), GW and
  G<rsub|<math|0>>W<rsub|<math|0>>(HF<rsub|diag>)
  approaches, the LDA HOMO <math|\<sigma\><rsub|O>> state is pushed to a
  significantly lower energy and the <math|\<pi\>> state becomes the HOMO
  level. This level crossing brings the GW calculations in agreement with
  many-body quantum chemistry calculations, which all predict the
  <math|\<pi\>> state to be the HOMO level. The same level crossing is
  observed in the case of uracil with the LDA HOMO and (HOMO-1) levels being
  <math|\<sigma\><rsub|O>> and <math|\<pi\>>-states respectively, while all
  GW results and quantum chemistry calculations predict a reverse ordering.
  Our interpretation is that the very localized <math|\<sigma\><rsub|O>>
  state suffers much more from the spurious LDA self interaction than the
  rather delocalized <math|\<pi\>> state. Even though it would be wrong to
  reduce the dynamical GW self-energy operator to a self-interaction free
  functional, the GW correction certainly cures in part this well-known
  problem. The other bases, namely guanine, adenine, and thymine, all show
  the correct <math|\<pi\>>-character for the HOMO level.

  The HOMO to (HOMO-1) energy difference averages to 0.80 eV and 1.12 eV
  within CASPT2 and EOM-IP-CCSD, respectively. Clearly, the average LDA
  energy spacing of 0.22 eV is significantly too small. We find that the 0.77
  eV G<rsub|<math|0>>W<rsub|<math|0>>(LDA) average value is close to the
  CASPT2 results, while the larger 1.29 eV GW result falls closer to the
  EOM-IP-CCSD energy difference. Averaging over all isomers, the experimental
  HOMO to (HOMO-1) energy spacing comes to 0.97 eV, in between the
  G<rsub|<math|0>>W<rsub|<math|0>>(LDA) or CASPT2 results and the GW or
  EOM-IP-CCSD values. Even though it is too early for final conclusions about
  the merits of the various approaches, it seems fair to state that the LDA
  value is significantly too small, and that the situation is improved
  significantly by the GW correction.
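
  The average HOMO to (HOMO-1) spacings discussed above follow directly from the table. The Python sketch below transcribes the relevant rows for four of the columns and averages the differences; it is a consistency check on the tabulated numbers, not part of the original calculations. It gives 0.77 eV for G0W0(LDA), 0.80 eV for CASPT2 and 1.12 eV for EOM-IP-CCSD after rounding, and about 1.30 eV for GW, within 0.01 eV of the value quoted in the text.

```python
# (HOMO, HOMO-1) binding energies in eV from the table, order G, A, C, T, U.
levels = {
    "G0W0(LDA)": [(7.49, 8.78), (7.90, 8.75), (8.21, 8.80), (8.64, 9.34), (9.03, 9.45)],
    "GW": [(7.81, 9.82), (8.22, 9.47), (8.73, 9.52), (9.05, 10.41), (9.47, 10.54)],
    "CASPT2": [(8.09, 9.56), (8.37, 9.05), (8.73, 9.42), (9.07, 9.81), (9.42, 9.83)],
    # The EOM column has no uracil entries, so only G, A, C, T are averaged.
    "EOM-IP-CCSD": [(8.15, 9.86), (8.37, 9.37), (8.78, 9.54), (9.13, 10.13)],
}


def mean_spacing(pairs):
    """Average of (HOMO-1 energy) minus (HOMO energy) over the listed molecules."""
    return sum(second - first for first, second in pairs) / len(pairs)


spacing = {name: mean_spacing(pairs) for name, pairs in levels.items()}
for name, value in spacing.items():
    print(f"{name}: {value:.3f} eV")
```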

  <with|font-series|bold|Electronic affinities.> We conclude this study by
  exploring the electronic affinity (EA) of the nucleobases. The EAs are
  reported in Table<nbsp><reference|tableIE> as the negative of the LUMO
  Kohn-Sham energies.
  Experimental data for guanine are missing. Further, the CASPT2 and CCSD(T)
  results <cite|Sanjuan08> are clearly larger (in absolute value) than the
  highest experimental estimates. While again part of the discrepancy may
  come from the presence of several tautomers in the gas phase, it certainly
  results as well from the fact that the electronic affinity is negative. A
  detailed discussion on the experimental difficulties in probing unbound
  states is presented in Ref.<nbsp><cite-arg|Bravaya10>. Taking again the
  CCSD(T) and CASPT2 calculations <cite|Sanjuan08> as a reference, the GW
  electronic affinities are quite satisfactory, with an MAE of 0.18 eV. Such an
  agreement is rather impressive since the LDA electronic affinities show the
  wrong sign, with a discrepancy as compared to CASPT2 ranging from 2.9 eV to
  3.6 eV. We observe that while the G<rsub|<math|0>>W<rsub|<math|0>> EAs are
  smaller (in absolute value) than the quantum chemistry ones, the GW EAs are
  larger. This contrasts with the IE case where both
  G<rsub|<math|0>>W<rsub|<math|0>> and GW values were smaller (see
  Fig.<nbsp><reference|iefig>).
  Similar to the quantum chemistry case, the GW values are found to
  systematically overestimate the experimental results. Further study is
  needed to understand such a discrepancy between theoretical and available
  experimental results.
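
  The reference used for the electronic-affinity MAE can be probed in the same spirit. The Python sketch below assumes that, for each base, the reference is the mean of the CASPT2 and CCSD(T) values when both appear in the table and the CASPT2 value alone otherwise. This convention is a guess made here for illustration; with it, the computed values round to the entries of the MAE LUMO row.

```python
# LUMO-derived electronic affinities in eV from the table (order: G, A, C, T, U).
ea = {
    "LDA-KS":       [1.80, 2.22, 2.57, 2.83, 3.01],
    "G0W0(LDA)":    [-1.04, -0.64, -0.45, -0.14, -0.11],
    "GW":           [-1.58, -1.14, -0.91, -0.67, -0.64],
    "G0W0(HFdiag)": [-1.77, -1.30, -1.05, -0.77, -0.71],
}
# Reference columns; None marks an entry that the table leaves empty.
caspt2 = [-1.14, -0.91, -0.69, -0.60, -0.61]
ccsdt = [None, None, -0.79, -0.65, -0.64]


def reference(cas, cc):
    """Mean of the two reference values when both exist (assumed convention)."""
    return cas if cc is None else 0.5 * (cas + cc)


ref = [reference(cas, cc) for cas, cc in zip(caspt2, ccsdt)]
mae = {
    name: sum(abs(x - r) for x, r in zip(vals, ref)) / len(ref)
    for name, vals in ea.items()
}
for name, err in mae.items():
    print(f"{name}: {err:.2f} eV")
```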

  In conclusion, we have studied on the basis of <with|font-shape|italic|ab
  initio> GW calculations the ionization energies and electronic affinities
  of the DNA and RNA nucleobases, guanine, adenine, cytosine, thymine and
  uracil. While a standard G<rsub|<math|0>>W<rsub|<math|0>>(LDA) calculation
  yields ionization energies that are 0.5 eV away from CCSD(T)/CASPT2
  reference quantum chemistry calculations, self-consistency on the
  eigenvalues brings the agreement to an excellent 0.11 eV average absolute
  error. A simple G<rsub|<math|0>>W<rsub|<math|0>> calculation starting from
  Hartree-Fock-like eigenvalues, avoiding the need for self-consistency,
  yields an average error of 0.22 eV. The possibility of bringing the
  calculated values to within 0.1-0.2 eV of state-of-the-art reference
  calculations with the GW formalism, a scheme that treats both finite-size
  and extended systems with an N<rsup|<math|4>> scaling and gives access to
  the full quasiparticle spectrum, paves the way to further studies of larger
  DNA strands and biological systems in general.

  <with|font-series|bold|Acknowledgements.> C.F. is indebted to the European
  Union Erasmus program for funding. Calculations have been performed on the
  CIMENT platform in Grenoble thanks to the Nanostar RTRA project.

  <\thebibliography|150bliography>
    <bibitem|polymerase>A well-known example is that of the ``replication''
    proteins that copy the nucleobases in DNA transcription and replication.
    See e.g. A. Travers, in <with|font-shape|italic|DNA-Protein
    Interactions>, Springer (1993); P.B. Dervan,
    <with|font-shape|italic|Science> <with|font-series|bold|232>, 464 (1983).

    <bibitem|Lodish>H. Lodish, A. Berk, P. Matsudaira, C. A. Kaiser
    <with|font-shape|italic|et al.>, in <with|font-shape|italic|Molecular
    Biology of the Cell>, WH Freeman: New York, NY. 5th ed. (2004).

    <bibitem|Close08>D. M. Close and K. T. Øhman, J. Phys. Chem. A
    <with|font-series|bold|112>, 11207 (2008); and references therein.

    <bibitem|Sanjuan06>D. Roca-Sanjuan, M. Rubio, M. Merchan
    <with|font-shape|italic|et al.>, J. Chem. Phys.
    <with|font-series|bold|125>, 084302 (2006).

    <bibitem|Sanjuan08>D. Roca-Sanjuan, M. Merchan, L. Serrano-Andres, M.
    Rubio, J. Chem. Phys. <with|font-series|bold|129>, 095104 (2008).

    <bibitem|Bravaya10>K. B. Bravaya, O. Kostko, S. Dolgikh, A. Landau
    <with|font-shape|italic|et al.>, J. Phys. Chem. A
    <with|font-series|bold|114>, 12305-12317 (2010); and references therein.

    <bibitem|Hush75>N. S. Hush, A. S. Cheung, Chem. Phys. Lett.
    <with|font-series|bold|34>, 11 (1975).

    <bibitem|Dougherty78>D. Dougherty, E. S. Younathan, R. Voll
    <with|font-shape|italic|et al.>, J. Elec. Spectro. Relat. Phenomena
    <with|font-series|bold|13>, 379 (1978).

    <bibitem|Choi05>K. W. Choi, J. H. Lee, S. K. Kim, J. Am. Chem. Soc.
    <with|font-series|bold|127>, 15674 (2005).

    <bibitem|Trofimov06>A. B. Trofimov, J. Schirmer, V. B. Kobychev, A. W.
    Potts <with|font-shape|italic|et al.>, J. Phys. B: At. Mol. Opt. Phys.
    <with|font-series|bold|39>, 305 (2006).

    <bibitem|Schwell08>M. Schwell, H. W. Jochims, H. Baumgartel
    <with|font-shape|italic|et al.>, Chem. Phys. <with|font-series|bold|353>,
    145 (2008).

    <bibitem|Zaytseva09>I. L. Zaytseva, A. B. Trofimov, J. Schirmer
    <with|font-shape|italic|et al.>, J. Phys. Chem. A
    <with|font-series|bold|113>, 15142 (2009).

    <bibitem|Kostko10>O. Kostko, K. Bravaya, A. Krylov
    <with|font-shape|italic|et al.>, Phys. Chem. Chem. Phys.
    <with|font-series|bold|12>, 2860 (2010); and references therein.

    <bibitem|Hedin65>L. Hedin, Phys. Rev. <with|font-series|bold|139>, A796
    (1965).

    <bibitem|Strinati80>G. Strinati, H. J. Mattausch, W. Hanke, Phys. Rev.
    Lett. <with|font-series|bold|45>, 290 (1980);
    <with|font-shape|italic|idem>, Phys. Rev. B <with|font-series|bold|25>,
    2867 (1982).

    <bibitem|Hybertsen86>M. S. Hybertsen and S. G. Louie, Phys. Rev. B
    <with|font-series|bold|34>, 5390 (1986).

    <bibitem|Godby88>R. W. Godby, M. Schlüter, and L. J. Sham, Phys. Rev. B
    <with|font-series|bold|37>, 10159 (1988).

    <bibitem|Onida02>G. Onida, L. Reining, A. Rubio, Rev. Mod. Phys.
    <with|font-series|bold|74>, 601 (2002).

    <bibitem|Rostgaard10>C. Rostgaard, K. W. Jacobsen, K. S. Thygesen, Phys.
    Rev. B <with|font-series|bold|81>, 085103 (2010).

    <bibitem|Dori06>N. Dori <with|font-shape|italic|et al.>, Phys. Rev. B
    <with|font-series|bold|73>, 195208 (2006).

    <bibitem|Tiago08>M. L. Tiago, P. R. C. Kent, R. Q. Hood, F. A. Reboredo,
    J. Chem. Phys. <with|font-series|bold|129>, 084311 (2008).

    <bibitem|Umari09>P. Umari, G. Stenuit, S. Baroni, Phys. Rev. B
    <with|font-series|bold|79>, 201104 (R) (2009).

    <bibitem|Palumno09>M. Palummo <with|font-shape|italic|et al.>, J. Chem.
    Phys. <with|font-series|bold|131>, 084102 (2009).

    <bibitem|Stenuit10>G. Stenuit, C. Castellarin-Cudia, O. Plekan, V. Feyer
    <with|font-shape|italic|et al.>, Phys. Chem. Chem. Phys.
    <with|font-series|bold|12>, 10817 (2010).

    <bibitem|Blase10>X. Blase, C. Attaccalite and V. Olevano,
    arXiv:1011.3933.

    <bibitem|Schwinger59>P. C. Martin and J. Schwinger, Phys. Rev.
    <with|font-series|bold|115>, 1342 (1959).

    <bibitem|Kaasbjerg10>K. Kaasbjerg, K. S. Thygesen, Phys. Rev. B
    <with|font-series|bold|81>, 085102 (2010).

    <bibitem|Hahn05>P. H. Hahn, W. G. Schmidt and F. Bechstedt, Phys. Rev. B
    <with|font-series|bold|72>, 245425 (2005).

    <bibitem|siesta>J. M. Soler <with|font-shape|italic|et al.>, J. Phys.:
    Condens. Matter <with|font-series|bold|14>, 2745-2779 (2002).

    <bibitem|KSbasis>It was shown in Refs.<nbsp><cite-arg|Rostgaard10,Blase10>
    for a large set of molecules that expanding the Kohn-Sham eigenstates on
    a standard double-zeta plus polarization (DZP) basis already yields IEs
    within <math|\<sim\>> 0.1 eV of those obtained with the much larger TZDP
    basis. This indicates that the present TZDP basis is well converged for
    such calculations.

    <bibitem|Cherkes09>For a recent analysis, see: I. Cherkes, S. Klaiman, N.
    Moiseyev, Int. J. Quant. Chem. <with|font-series|bold|109>, 2996 (2009);
    and references therein.

    <bibitem|Kaczmarski10>M. S. Kaczmarski, Y. C. Ma and M. Rohlfing, Phys.
    Rev. B <with|font-series|bold|81>, 115433 (2010).

    <bibitem|largerbasis>We have tested the use of a larger and more diffuse
    even-tempered basis with decay coefficients <math|\<alpha\>>=(0.15, 0.32,
    0.69, 1.48, 3.2) a.u. for C, O, and N atoms. We find that the GW
    ionization energies and electronic affinities change by 0.08 eV and 0.015
    eV (MAE) respectively with no systematic trend.

    <bibitem|IPisomers>The EOM-IP-CCSD calculations predict that the IEs of
    the most stable guanine (G7K) and cytosine (C2b) isomers (the most
    abundant in the gas phase) are 0.14 eV and 0.08 eV larger than the IEs of
    the G9K and C1 isomers we study. Our GW calculations yield differences of
    0.08 eV and 0.04 eV, respectively. For cytosine, our GW calculations
    predict that the largest IE is that of the (C3a) tautomer, which lies
    0.19 eV above that of (C1). This is consistent with the 0.2 eV
    experimental range, although it is larger than the 0.12 eV difference
    reported in Ref.<nbsp><cite-arg|Bravaya10> (EOM-IP-CCSD calculations).
    Averaging over all isomers would bring our results into better agreement
    with experiment. We note, however, that such an average would require
    knowing the abundance of each tautomer at the experimental temperature,
    which is beyond the scope of the present paper.

    <bibitem|PBE>Very similar results are obtained with the PBE functional.
    J. P. Perdew, K. Burke, M. Ernzerhof, Phys. Rev. Lett.
    <with|font-series|bold|77>, 3865 (1996).
  </thebibliography>
</body>