\RequirePackage{silence}
\WarningFilter{revtex4-2}{Repair the float}
\documentclass[aps,pra,nofootinbib,twocolumn]{revtex4-2}
\usepackage[english]{babel}
\usepackage[utf8]{inputenc}
\usepackage{amsmath}
\usepackage{amsfonts}
\usepackage{amssymb}
\usepackage{amsthm}
\usepackage{mathtools}
\usepackage{hyperref}
\usepackage{cleveref}
\usepackage{calc}
\usepackage{graphicx}
\usepackage{physics}
\usepackage{float}
\usepackage{qcircuit}
\newtheorem{theorem}{Theorem}
\newtheorem{lemma}{Lemma}
\newtheorem{definition}{Definition}
\ExplSyntaxOn
\clist_new:N \l_my_footlist
\cs_new:Nn \my_footnotelater:n {
\clist_gput_right:Nn \l_my_footlist {{#1}}
}
\cs_new:Nn \my_footnotenow: {
\clist_gpop:NN \l_my_footlist \l_tmpa_tl
\exp_args:Nv\footnote{l_tmpa_tl}
}
\NewDocumentCommand{\FootnoteLater}{m}{\my_footnotelater:n{#1}}
\NewDocumentCommand{\FootnoteNow}{}{\my_footnotenow:}
\cs_new:Nn \postulate:nn {
\emph{#2}
%\hyperlink{#1}{#2}
}
\NewDocumentCommand \Postulate { m }
{
\str_case_e:nnF { \str_foldcase:n { #1 } }
{
{ state\space space }{\postulate:nn{postulate:state-space}{#1}}
{ evolution }{\postulate:nn{postulate:evolution}{#1}}
{ measurement }{\postulate:nn{postulate:measurement}{#1}}
{ composite\space systems }{\postulate:nn{postulate:composite-systems}{#1}}
}{
\textbf{!!}#1\textbf{!!}
}
}
\cs_new:Nn \my_simulator_slug:Nn {
\tl_set:Nx #1 {\text_lowercase:n {#2}}
\tl_remove_all:Nn #1 { ( }
\tl_remove_all:Nn #1 { ) }
\tl_replace_all:Nnn #1 { ~ } { - }
\tl_replace_all:Nnn #1 { / } { - }
\tl_put_left:Nn #1 {sec:simulator-}
}
\NewDocumentCommand{\Simulator}{m}{
\my_simulator_slug:Nn \l_tmpa_tl {#1}
\subsubsection{#1}
\exp_args:No \label \l_tmpa_tl
}
\ExplSyntaxOff
\def\Cpp{{C\nolinebreak[4]\hspace{-.05em}\raisebox{.4ex}{\tiny\bf ++}}}
\def\viz.{\textit{viz}.}
\NewDocumentCommand{\Naturals}{}{\mathbb{N}}
\let\NN\Naturals
\NewDocumentCommand{\Integers}{}{\mathbb{Z}}
\let\ZZ\Integers
\NewDocumentCommand{\Reals}{}{\mathbb{R}}
\let\RR\Reals
\NewDocumentCommand{\Complex}{}{\mathbb{C}}
\let\CC\Complex
\NewDocumentCommand{\Hilbert}{}{\ensuremath{\mathbb{H}}}
\NewDocumentCommand{\Controlled}{m}{\ensuremath{{{}_C#1}}}
\NewDocumentCommand{\Clifford}{}{\text{Clifford}}
\NewDocumentCommand{\Schrodinger}{}{{Schr\"{o}dinger}}
\NewDocumentCommand{\BigO}{}{\mathcal{O}}
\begin{document}
\title{Survey on quantum circuit simulators}
\author{Miguel Mur\c{c}a}
\affiliation{Instituto Superior T\'{e}cnico, University of Lisbon, Portugal}
\date{\today}
\begin{abstract}
In this survey, we review some of the main techniques for simulating a
quantum computation on a classical computer, as well as some of the
publicly available simulation software.
\end{abstract}
\maketitle
\section{Introduction}
\section{Quantum computation fundamentals}
\label{sec:quantum-computation-fundamentals}
\subsection{State vector}
\label{sec:state-vector}
Quantum computation is the field of study concerned with computation performed
within the postulates imposed by the theory of quantum mechanics. The notion of
``computation'', as here used, is the same as in the field of (classical)
computability and complexity theory, and so we refer to, for example,
Ref.~\cite{Homer2001} for a formal overview of the topic. However, for
simplicity, computation can be taken to mean the evaluation of some function
over some discrete input range. In any case, this function (or, more generally,
the computational task at hand) will be made concrete in context.
The postulates of quantum mechanics, on the other hand, are well defined and
may succinctly be stated as follows \cite{Nielsen2012}:
\FootnoteLater{The dagger symbol (${}^\dagger$) is used to denote conjugate
transposition.}
\makeatother
\begin{description}
\makeatletter
\let\old@item\item
\def\my@item#1#2{[\label{#2}{#1}]}
\def\item[#1][#2]{\expandafter\old@item\my@item{#1}{#2}}
\makeatother
\item[State space][postulate:state-space]
An isolated quantum system is represented by a ray of unit two-norm
vectors in a complex vector space equipped with an inner product (i.e.,
a Hilbert space). This space is referred to as the \emph{state space}.
\item[Evolution][postulate:evolution]
The evolution of an isolated quantum system, as represented by its
quantum state vector $\vec \psi$, is given by a unitary operator acting
on that quantum state vector. I.e., let \Hilbert\ be the Hilbert space
such that the quantum state vector is, at some moment, $\vec\psi \in
\Hilbert$. Then, there exists a linear operator $U$ acting on
\Hilbert, satisfying $UU^\dagger = U^\dagger U = I$
\FootnoteNow{} such that, after some time, the state of the system is
given by the quantum state vector $U \vec\psi$.
\item[Measurement][postulate:measurement] \hskip 0pt plus 1em
A set of operators $\{M_m\}_m$, acting on $\Hilbert$ and satisfying
$\sum_m M_m^\dagger M_m = I$, defines a ``measurement''. Each operator
$M_m$ has an associated outcome $m$, such that the measurement operation
yields outcome $m$ with probability $\lVert{ M_m
{\vec\psi}}\rVert_2^2$. After the measurement, the quantum system is
described by the quantum state vector $M_m {\vec\psi} / \norm*{M_m
\vec\psi}_2$.
\item[Composite Systems][postulate:composite-systems]
If a quantum system is composed of multiple quantum subsystems, then
the corresponding state vector space is given by the tensor product of
the quantum state vector spaces of the subsystems.
\end{description}
A quantum computation is, thus, a computation carried out within these
postulates. Taking the computational task to be a decision problem (i.e.,
a question to which the computer should output a ``yes'' or ``no'' answer), we
may, without loss of generality, consider that the final answer is produced by
a final measurement (as defined in the measurement postulate), with outcomes
``yes'' ($m=1$) or ``no'' ($m=0$).
In the classical case, it is well known that a binary alphabet -- the ``bit''
-- is sufficient to perform computation (in the sense that it requires only a
logarithmic overhead in comparison to a larger alphabet; see \cite[Claim
1.5]{arora2009computational}). To perform quantum computation, we will likewise
work with a quantum analogue of the bit, the ``qubit''. In particular, a qubit
is a quantum state of a Hilbert space of dimension $2$, $\Hilbert_2$. For
concreteness, we choose two quantum state vectors of $\Hilbert_2$ that are
orthogonal,
%
\begin{equation}
\ket{0}, \ket{1}
\end{equation}
%
and that thus form a basis of $\Hilbert_2$, the ``computational basis''. We have
also introduced here the so-called ``Dirac notation'', or ``bra-ket notation'',
common in quantum mechanics, and by extension in the quantum computing literature. We
present a summary of Dirac notation in Table \ref{table:dirac-notation}, and we
will henceforth use this notation.
\begin{table}
\newcommand{\rowskip}{8pt}
\renewcommand{\arraystretch}{0}
\begin{tabular}{r @{\hskip 1em}|@{\hskip 1em} l} \hline\hline\vrule height 14pt width 0pt
$\ket{\psi}$ & Vector $\psi$, or ``ket'' $\psi$. Corresponds to $\vec\psi$. \\[\rowskip]
$\bra{\psi}$ & Dual of $\ket{\psi}$, or ``bra'' $\psi$. Corresponds to ${\vec\psi}^\dagger$. \\[\rowskip]
$\braket{a}{b}$ & Inner product between vectors $\ket{a}$ and $\ket{b}$. \\[\rowskip]
$\ket{a} \otimes \ket{b}$ & Tensor product between vectors $\ket{a}$ and $\ket{b}$. \\[0pt]
& If both $\ket{a},\ket{b} \in \Hilbert_n$, $\ket{a}\otimes\ket{b} \in \Hilbert_{n^2}$. \\[\rowskip]
$\ket{a}\mskip-5mu\ket{b}$ & Shortened notation for $\ket{a} \otimes \ket{b}$. \\[\rowskip]
$\bra{a} A \ket{b}$ & Inner product between $\ket{a}$ and $A\ket{b}$, or, \\[0pt]
& equivalently, $A^\dagger \ket{a}$ and $\ket{b}$. \\[0pt]
& If $\ket{a}=\ket{b}$, may be referred to as the \\[0pt]
& \emph{expectation value} of $A$ under $\ket{a}$. \\[\rowskip]
$\dyad{a}$ & Projector onto the span of $\ket{a}$.
\\[14pt]
\hline\hline
\end{tabular}
\caption{Summary of ``Dirac notation'', or ``bra-ket notation''.}
\label{table:dirac-notation}
\end{table}
In analogy to how, in the classical case, a binary alphabet can represent larger
alphabets, multiple qubits can be used to span a larger Hilbert space, by the
\Postulate{composite systems} postulate. Indeed, $n$ qubits span $2^n$
orthogonal states in their collective state space, which we may label by the
binary string given by each of the qubits, or the corresponding number:
%
{
\makeatletter
\def\my@ksp{\mskip-5mu}
\def\my@neg@ksp{\mskip5mu}
\def\my@ssp{\mskip-3mu}
\let\old@ket\ket
\let\old@ldots\ldots
\let\old@equiv\equiv
\def\ket#1{\old@ket{#1}\my@ksp}
\def\ldots{\my@neg@ksp\my@ssp\old@ldots\my@ssp}
\def\equiv{\my@neg@ksp\old@equiv}
\makeatother
%
\begin{equation}
\begin{array}{l}
\hskip-2em \ket{0}\ket{0}\ldots\ket{0}\ket{0} \equiv \ket{0_b}, \\
\hskip-1em \ket{0}\ket{0}\ldots\ket{0}\ket{1} \equiv \ket{1_b}, \\
\ket{0}\ket{0}\ldots\ket{1}\ket{0} \equiv \ket{2_b}, \\
\hskip1em \ldots \\
\hskip1.5em \ket{1}\ket{1}\ldots\ket{1}\ket{1} \equiv \ket{(2^n-1)_b}.
\end{array}
\end{equation}
}
Where it is clear in context (for example, if a ket is labeled with a value
other than $0$ or $1$), we may drop the subscript $b$.
Per the \Postulate{state space} postulate, a state of $n$ qubits may be given by
a linear combination of these basis states:
%
\begin{equation} \label{eq:superposition}
\begin{array}{c}
\displaystyle \ket{\psi} = \sum_{j=0}^{2^n-1} \alpha_j \ket{j_b} \\[2.5em]
\alpha_j \in \Complex, \quad \sum_{j=0}^{2^n-1} \abs{\alpha_j}^2 = 1
\end{array}
\end{equation}
If more than one $\alpha_j$ is non-zero, we say the state is in
``superposition''. Measuring a state in superposition produces different
outcomes, depending on the state's overlap with the outcome state. To see this,
consider the measurement
%
\begin{equation} \label{eq:computational-basis-measurement}
\{ M_m = \dyad{m_b} \;:\; m=0,\ldots,2^n-1 \}.
\end{equation}
Since $\{\ket{j_b}\}_{j=0,\ldots,2^n-1}$ is an orthonormal basis of the Hilbert
space being considered, $\sum_m M_m^\dagger M_m = \sum_m \dyad{m_b} = I$, and so
this set satisfies the conditions
outlined in the \Postulate{measurement} postulate. We conclude also from the
postulate that the probability of observing outcome $j$ is given by
%
\begin{equation}
\Pr_{\ket{\psi}}[j] = \abs{\alpha_j}^2.
\end{equation}
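
As a concrete illustration of this measurement rule, consider the following
minimal NumPy sketch (an illustration of our own, not the interface of any
particular simulator), which stores a two-qubit state as a dense vector of
$2^n$ amplitudes and samples computational-basis outcomes with probabilities
$\abs{\alpha_j}^2$:
%
{\footnotesize
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(seed=0)

n = 2                          # number of qubits
dim = 2 ** n

# |psi> = (|00> + |11>) / sqrt(2), as a dense
# vector of 2^n complex amplitudes
psi = np.zeros(dim, dtype=complex)
psi[0b00] = 1 / np.sqrt(2)
psi[0b11] = 1 / np.sqrt(2)

# Born rule: Pr[j] = |alpha_j|^2
probabilities = np.abs(psi) ** 2
assert np.isclose(probabilities.sum(), 1.0)

# Sample computational-basis measurement outcomes
samples = rng.choice(dim, size=10, p=probabilities)
print([format(int(s), f"0{n}b") for s in samples])
\end{verbatim}
}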
\subsection{Density matrix}
\label{sec:density-matrix}
Consider the measurement \eqref{eq:computational-basis-measurement} on state
\eqref{eq:superposition}. After performing such a measurement, one holds state
$\ket{j_b}$ with probability $\Pr_{\ket{\psi}}[j]$. This situation is not
correctly described by a superposition. To see this, consider a single qubit
and a Hadamard gate:
%
\begin{equation}
\label{eq:hadamard}
\def\arraycolsep{8pt}
\begin{array}{cc}
\def\arraycolsep{4pt}
H = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} & %
\begin{array}{c} H\ket{0} = \frac{1}{\sqrt{2}}(\ket{0} + \ket{1}) \\
H\ket{1} = \frac{1}{\sqrt{2}} (\ket{0}-\ket{1})\end{array}
\end{array}
\end{equation}
From this definition and the previous discussion,
%
\begin{equation} \label{eq:hadamard-action}
\def\strut{\vrule height 0pt depth 1em width 0pt}
\everymath={\displaystyle}
\begin{array}{c}
\Pr_{H\ket{0}}[0] = \Pr_{H\ket{1}}[0] = 1/2 \\[1.2em]
\strut \Pr_{H\ket{0}}[1] = \Pr_{H\ket{1}}[1] = 1/2
\end{array}
\end{equation}
%
and we note that this is also true of the states
%
\begin{equation} \label{eq:plus-minus-states}
\everymath={\displaystyle}
\def\arraycolsep{.5cm}
\begin{array}{cc}
\ket{+} = \frac{\ket{0} + \ket{1}}{\sqrt{2}} & %
\ket{-} = \frac{\ket{0} - \ket{1}}{\sqrt{2}}.
\end{array}
\end{equation}
We conclude from Eq.~\eqref{eq:hadamard-action} that by taking a $\ket{0}$
state, applying a Hadamard gate, measuring, and again applying a Hadamard gate
and measuring, we should observe $0$ with probability $1/2$, and likewise $1$
with probability $1/2$. But, following the postulates and attempting to
describe the intermediate situation by either $\ket{+}$ or $\ket{-}$, we find
that
%
\begin{equation}
\everymath={\displaystyle}
\begin{array}{c}
H \ket{+} = \ket{0} \\
H \ket{-} = \ket{1}
\end{array}
\end{equation}
%
which would indicate either $\Pr[0] = 1$ or $\Pr[1] = 1$.
\FootnoteLater{Alternatively, the density matrix formalism allows one to
express both ``quantum randomness'', as resulting from the measurement of
a state in superposition, and ``classical randomness'', in the sense of
operations conditioned on a random/unknown bit string.}
Indeed, the system may be described by one of several states not in
superposition, while we are uncertain about which state describes it. The
density matrix formalism (or \emph{density operator} formalism) gives a formal
tool for describing this situation\FootnoteNow{}. If a quantum system is in
state $\ket{\psi_j}$ with probability $p_j$, the associated density operator is
%
\begin{equation}
\rho = \sum_j p_j \dyad{\psi_j}.
\end{equation}
In general, to be a valid density operator, $\rho$ must have unit trace and be
a positive operator. In particular, the density operator associated to a state
vector $\ket{\psi}$ is
%
\begin{equation}
\rho = \dyad{\psi},
\end{equation}
%
and a statistical mixture between multiple $\rho_j$ with corresponding
probability $p_j$ is given by the density operator
%
\begin{equation}
\rho = \sum_j p_j \rho_j.
\end{equation}
Density matrices are a ``complete'' formalism, in the sense that we may rephrase
the postulates of quantum mechanics solely in terms of the density
operator; in fact, one may check that the two sets of postulates are equivalent:
\begin{description}
\makeatletter
\let\old@item\item
\let\old@makelabel\makelabel
\renewcommand{\item}[1][0]{%
\old@item[#1]%
\hfill\break}
\def\ResetLabel{%
\let\item\old@item%
\let\makelabel\old@makelabel}
\makeatother
\item[State space {\it(Density matrix)}]
An isolated physical system is completely described by a \emph{density
operator}, which is a positive operator of unit trace in a Hilbert
space. If a system is in state $\rho_j$ with probability $p_j$, its
density operator is $\rho = \sum_j p_j \rho_j$.
\item[Evolution {\it(Density matrix)}]
Let a quantum system be described, at some moment, by the density
operator $\rho$. Then, there exists a unitary operator $U$ such that,
after some time, the quantum system is now described by the density
operator $\rho' = U\rho U^\dagger$.
\item[Measurement {\it(Density matrix)}]
A set of operators $\{M_m\}_m$, acting on the space of $\rho$, and
satisfying $\sum_m M_m^\dagger M_m = I$, defines a ``measurement''. Each
operator $M_m$ has an outcome $m$ associated to it. If a quantum state
is described by a density operator $\rho$, the outcome of a measurement
is $m$ with probability $\Tr(M_m^\dagger M_m \rho)$, and, after the
measurement, the system is now described by the density matrix $\rho' =
M_m \rho M_m^\dagger / \Tr(M_m^\dagger M_m \rho)$.
\item[Composite Systems {\it(Density matrix)}]
The density operator describing a quantum system composed of multiple
quantum subsystems is given by the tensor product of the density
operators of each of the subsystems.
\end{description}
A quantum state with density matrix $\rho$ for which there exists
%
\begin{equation}
\everymath={\displaystyle}
\begin{array}{r@{\quad}c@{\quad}l}
\ket{\psi} & \text{such that} & \rho = \dyad{\psi}
\end{array}
\end{equation}
%
is said to be a \emph{pure} state, while a state for which no such $\ket{\psi}$
exists is said to be a \emph{mixed} state.
A density operator may be used to describe a quantum subsystem: if a system is
composed of subsystems $A$ and $B$, jointly described by the density operator
$\rho$, then subsystem $A$ is described by density operator
%
\begin{equation}
\rho_A = \Tr_B(\rho) = \sum_j (I_A \otimes \bra{j}_B) \rho (I_A \otimes \ket{j}_B)
\end{equation}
%
where $\Tr_B$ is the newly defined \emph{partial trace} operation, $I_A$ is the
identity in the state space of $A$, and we take $\{\ket{j}\}_j$ to be a basis
over the state space of $B$.
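
To make the preceding discussion concrete, the following NumPy sketch (again a
toy illustration of our own) contrasts the pure state $H\ket{0}$ with the mixed
state obtained after measuring it in the computational basis -- a second
Hadamard gate distinguishes the two -- and computes a partial trace by
reshaping a two-qubit density matrix:
%
{\footnotesize
\begin{verbatim}
import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
ket0 = np.array([1, 0], dtype=complex)

# Pure state |+> = H|0>, as a density matrix
plus = H @ ket0
rho_pure = np.outer(plus, plus.conj())

# Mixed state after a computational-basis
# measurement of H|0>: 1/2 |0><0| + 1/2 |1><1|
rho_mixed = np.diag(np.abs(plus) ** 2).astype(complex)

# Second Hadamard: rho -> H rho H^dagger
out_pure = H @ rho_pure @ H.conj().T
out_mixed = H @ rho_mixed @ H.conj().T
print(out_pure[0, 0].real)    # 1.0 (deterministic)
print(out_mixed[0, 0].real)   # 0.5 (still random)

# Partial trace: the reduced state of qubit A for
# the Bell state (|00> + |11>)/sqrt(2) is I/2
bell = np.zeros(4, dtype=complex)
bell[0] = bell[3] = 1 / np.sqrt(2)
rho_AB = np.outer(bell, bell.conj())
r4 = rho_AB.reshape(2, 2, 2, 2)
rho_A = np.einsum("ajbj->ab", r4)
print(rho_A.real)             # 0.5 * identity
\end{verbatim}
}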
\subsection{Quantum circuits}
To conclude this section, we introduce a common notation to denote unitary
transformations, and by extension quantum algorithms: quantum circuits.
\FootnoteLater{Rigorously, this depends on how the unitary is ``given''. Here,
consider that a unitary is given by specifying its action over each element
$\ket{j}$ of a set spanning the state space. By linearity (\viz.\ the
\Postulate{evolution} postulate), this determines the action of the unitary
over any vector in the state space.}
Recall that a quantum computation may be described by a sequence of unitary
evolutions and measurements, terminating in a final measurement. While the measurements
correspond (by the \Postulate{measurement} postulate) to a physical procedure,
it is not necessarily clear how to implement a given unitary
transformation\FootnoteNow. Instead, we assume the ability to physically realize
a number of elementary operations, and compose these operations to build more
sophisticated unitary evolutions. Table \ref{table:basic-operations} lists some
common basic operations. Critically, a limited set of these elementary
operations can be sufficient to express (or approximate to arbitrary precision)
any unitary evolution; i.e., the notion
of a \emph{universal quantum gate set} is well defined. The set of gates
specified in Table \ref{table:basic-operations} forms a universal quantum gate
set. One could even reduce the size of this set while maintaining universality;
for example, the set of $\{R_X, R_Y, R_Z, \Controlled{X}\}$ gates is also
universal \cite{Nielsen2012}. A common choice of universal gate set is the
``Clifford+$T$'' set, where the Clifford gate set, $\{S, H, \Controlled{X}\}$,
is augmented with the $T$ gate. Note that the Clifford gate set generates all
the Pauli gates. The term ``Clifford group'' is used to denote the set of all
operations generated by the Clifford gate set.
\begin{table}
\def\Action#1{$\begin{array}{l} #1 \end{array}$}
\def\TabularSpacing{\hskip 0em plus 1em minus 1em}
\def\CX{\Controlled{X}}
\begin{tabular}{c @{\TabularSpacing}c@{\TabularSpacing} c} \hline\hline\vrule height 14pt width 0pt
Symbol & Definition & Description \\
\hline\vrule height 18pt width 0pt
$X$ & \Action{X\ket{0} = \ket{1} \\ X\ket{1} = \ket{0}} & $X$-Pauli or ``not'' gate \\
$Y$ & \Action{Y\ket{0} = i\ket{1} \\ Y\ket{1} = -i\ket{0}} & $Y$-Pauli gate \\
$Z$ & \Action{Z\ket{0} = \ket{0} \\ Z\ket{1} = -\ket{1}} & $Z$-Pauli gate \\
$H$ & \Action{H\ket{0} = \sqrt{1/2}(\ket{0}+\ket{1}) \\ H\ket{1} = \sqrt{1/2}(\ket{0}-\ket{1})} & Hadamard gate \\
$S$ & \Action{S\ket{0} = \ket{0} \\ S\ket{1} = i\ket{1}} & Phase gate \\
$T$ & \Action{T\ket{0} = \ket{0} \\ T\ket{1} = e^{i\pi/4}\ket{1}} & $\pi/8$ gate \\[9pt]
$R_X(\theta)$ & $\exp{-i\theta X/2}$ & $X$-rotation gate \\[9pt]
$R_Y(\theta)$ & $\exp{-i\theta Y/2}$ & $Y$-rotation gate \\[9pt]
$R_Z(\theta)$ & $\exp{-i\theta Z/2}$ & $Z$-rotation gate \\[6pt]
$\CX$ & $\dyad{0}\otimes I + \dyad{1}\otimes X$ & Controlled-not gate \\[6pt]
\hline\hline
\end{tabular}
\caption{Common ``native'' operations, often assumed to be physically
realizable, to be composed into other operations.}
\label{table:basic-operations}
\end{table}
Quantum circuits provide a visual notation to denote composition of elementary
gates into larger unitary evolutions and measurements. The main elements of a
quantum circuit are given in Table \ref{table:quantum-circuit}. Assuming every
operation in the elementary gate set can be performed in one time step, it
follows that the number of vertical slices in a quantum circuit corresponds to
the running time of the circuit. This is referred to as the circuit's \emph{depth}.
The circuit's \emph{width} is the number of qubits acted upon non-trivially by
the circuit, and relates to the space requirements of the circuit.
Because a quantum algorithm corresponds to known unitary evolutions and
measurements, it is expressible in quantum circuit form. We say, then, that a
quantum algorithm is efficient if the corresponding quantum circuit's width and
depth scale at most polynomially with input size.
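
As a minimal example of this composition of elementary gates into a circuit
unitary (a sketch of the bookkeeping only, not of an efficient simulation
strategy), the following NumPy snippet builds the depth-$2$, width-$2$ circuit
consisting of a Hadamard gate followed by a controlled-not, and applies it to
$\ket{0}\ket{0}$:
%
{\footnotesize
\begin{verbatim}
import numpy as np

I2 = np.eye(2)
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
X = np.array([[0, 1], [1, 0]])
# CX = |0><0| (x) I + |1><1| (x) X
CX = np.kron(np.diag([1.0, 0.0]), I2) \
   + np.kron(np.diag([0.0, 1.0]), X)

# Depth-2, width-2 circuit: H on qubit 0, then
# CX(0 -> 1).  Circuits are read left to right,
# so the operator is CX . (H (x) I).
U = CX @ np.kron(H, I2)

ket00 = np.zeros(4)
ket00[0] = 1.0
print(U @ ket00)     # (|00> + |11>)/sqrt(2)
\end{verbatim}
}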
\begin{table*}
\def\TabularSpacing{\hskip 2em plus 1em minus 1em}
\def\Description#1{\begin{minipage}[c]{.6\linewidth}{#1}\end{minipage}}
\def\GateStrut#1{\vrule height 9pt depth 3pt width 0pt\hskip .25em{#1}\hskip.25em}
\begin{tabular}{c @{\TabularSpacing}c@{\TabularSpacing} c} \hline\hline\vrule height 14pt width 0pt
Symbol & Definition & Description \\
\hline\vrule height 1cm width 0pt
$\Qcircuit { & \qw }$ & Wire & \Description{Represents the state space
associated to a qubit, i.e., $\Hilbert_2$. Multiple wires,
vertically aligned, represent the collective Hilbert space, given
by the tensor product of the individual $\Hilbert_2$ spaces
(\viz.\ the \Postulate{composite systems} postulate).}
\\[.8cm]
$\Qcircuit @C=1em { \lstick{{}_A} & \qw{/} & \qw }$ & Register &
\Description{Represents multiple wires, i.e., a state space of dimension
$2^n$, where $n$ is the number of wires in the register. As shown, a
register may be labelled.} \\[.7cm]
$\Qcircuit @C=1em{ & \gate{U} & \qw }$ & Gate & \Description{Unitary $U$ acting
on the state space associated to the incoming wire (left-hand side).
The outgoing wire (right-hand side) corresponds to the state space
after the action of the gate.} \\[.7cm]
$\Qcircuit @C=1em { & \meter & \cw }$ & Measurement & \Description{
Measurement of the state subspace associated to the incoming wire
(left-hand side), according to the measurement $\{(\dyad{0}, 0),
\allowbreak (\dyad{1}, 1)\}$ (``computational basis measurement'').
The double wire represents the resulting classical bit (i.e., a
$\ZZ_2$ space).} \\[.8cm]
\begin{minipage}[c]{1.4cm}
$\Qcircuit @C=1em @R=1em {\lstick{{}_A}& \ctrl{1} & \qw \\ \lstick{{}_B}{\kern6pt/}& \gate{U} & \qw}$
\end{minipage} & Controlled operation & \Description{Denotes the
operation $\dyad{0}_A\otimes I_B + \dyad{1}_A\otimes U_B$ acting on the
collective state space of $A$ and $B$.} \\[.6cm]
$\Qcircuit @C=1em { \lstick{\ket{\psi}} & \gate{U} & \qw}$ & Application & \Description{
The quantum state resulting from application of
the unitary represented by the quantum circuit on the quantum state
specified on the left-hand side. The resulting state may be denoted
on the right-hand side. \\ Here, it represents the state
$U\ket{\psi}$.} \\[1cm]
\hskip-1.2em $\Qcircuit @C=.5em { & {/}\qw & \gate{\GateStrut{V}} & \gate{\GateStrut{W}} & \qw \\}$ & Composition & \Description{Circuits are
``read'' left-to-right. Thus, the concatenation of two circuits
denoting the actions of operators, respectively, $V$ and $W$, results
in a circuit denoting the operator $WV$.} \\[.6cm]
\hline\hline
\end{tabular}
\caption{Quantum circuit notation.}
\label{table:quantum-circuit}
\end{table*}
\section{Simulation techniques}
The quantum computational model, as introduced in the previous section, is
believed to be more powerful than the classical computing model
\cite{Bernstein1993}. One canonical example of evidence for this separation is
the existence of Shor's Algorithm for efficient factoring
\cite{Shor1994,Kitaev1995}. Thus, the task of classically simulating a quantum
computation becomes doubly significant: on the one hand, an efficient classical
algorithm to simulate an arbitrary quantum computation would have a fundamental
impact on the current understanding of computational complexity (but, for this
reason, is not expected to exist). On the other hand, it is necessary to ensure
that a particular instance of a quantum circuit cannot be classically simulated
in practice in order to claim that a quantum computation has been carried out
``with advantage'' (i.e., beyond a classical regime, or what is often referred
to as ``quantum supremacy'') \cite{Arute2019,Terhal2018,Haner2017,Markov2018}.
Finally, even in the quantum advantage regime, small-scale classical
simulations remain relevant as a source of reference data for validation
\cite{Bravyi2016}.
As discussed in section \ref{sec:state-vector}, a quantum state of $n$ qubits is
determined by a vector of $2^n$ complex values, up to a global phase factor, and
subject to a normalization constraint. Therefore, at first glance, one may
believe that a quantum circuit of more than $40$--$45$ qubits cannot be
classically simulated, simply due to the memory requirements of maintaining a
quantum state vector. However, this consideration ignores the details of any
particular problem instance, such as the existence of structure or constraints
in the quantum circuit to be run, or specifications on the desired output. In
this section, we review some key theoretical results regarding classical
simulation of quantum computations, as well as some general techniques.
\subsection{Strong simulation \texorpdfstring{\textit{vs.}}{vs.}~weak simulation}
As defined in section \ref{sec:quantum-computation-fundamentals}, the final
output of a quantum computation results from a measurement. Due to the nature of
measurements, the outcome is a random variable, and its distribution depends on
the underlying state vector before the measurement. Thus, should a classical
simulation of a quantum algorithm:
%
{
\renewcommand{\labelenumi}{\textit{\roman{enumi}.}}
\begin{enumerate}
\item generate a random outcome observing the same output distribution as
the quantum counterpart, or;
\item explicitly specify the distribution of the generated output?
\end{enumerate}
}
These two problem specifications correspond, respectively, to the notions of
\emph{weak simulation} and \emph{strong simulation}. Requiring either strong or
weak simulation may significantly affect the computational hardness of the task;
indeed there exist circuits for which classical weak simulation is easy, but
classical strong simulation is hard \cite{VandenNest2010}. Stating these two
notions more formally:
\begin{definition}[Strong simulation \cite{VandenNest2010}]
\label{def:strong-simulation}
Given a description of a quantum circuit of $n$ qubits, which corresponds
to unitary operator $U$, terminating with a measurement of the first qubit
in the computational basis, output $\Tr((\dyad{0} \otimes I) \mkern5mu U \mkern-3mu
\dyad{0_b} U^\dagger)$.
\end{definition}
\begin{definition}[Weak simulation \cite{VandenNest2010}]
\label{def:weak-simulation}
Given a description of a quantum circuit of $n$ qubits, which corresponds
to unitary operator $U$, terminating with a measurement of the first qubit
in the computational basis, output $0$ with probability $\Tr((\dyad{0} \otimes I)
\mkern5mu U \mkern-3mu \dyad{0_b} U^\dagger)$, or $1$ otherwise.
\end{definition}
\FootnoteLater{By the \Postulate{state space} postulate, a global phase factor
is physically irrelevant. However, relative phase differences should not be
disregarded. As a simple example, consider the action of the Hadamard gate
(eqs.~\eqref{eq:hadamard-action},\eqref{eq:plus-minus-states}): the only
difference between the $\ket{+}$ and $\ket{-}$ states is the relative phase
difference between the $\ket{0}$ and $\ket{1}$ components -- however, the
result from acting with the Hadamard gate is completely different.}
In practice, one may wish to simulate the circuit only up to some point, or to
inspect the intermediate state of a small-scale computation
\cite{Bravyi2016,Haner2016}. This motivates a different definition of strong
simulation by some authors. Namely, recalling Eq.~\eqref{eq:superposition},
note that knowledge of $\alpha_j$ is enough to determine the probability of
observing any measurement in the computational basis. However, the converse is
not true: because $\Pr[j] = \abs{\alpha_j}^2$, knowledge of $\Pr[j]$ fails to
inform about the complex phase of $\alpha_j$\FootnoteNow. So, strong simulation
may be taken to mean:
\begin{definition}[Strong simulation, wave-function version \cite{Chen2017}]
\label{def:strong-simulation-wavefn}
Given a quantum circuit of $n$ qubits, corresponding to a unitary operator
$U$, and an $n$-bit string $j$, output $\bra{j_b} U \ket{0_b}$.
\end{definition}
Note that, in all of the definitions above, we took the quantum circuits to be
described by a unitary operator, which may not be trivially true if measurements
are performed half-way in the circuit (\viz.\ section~\ref{sec:density-matrix}),
or if classical post-processing is employed. However, a well-known result,
which we review in appendix~\ref{appendix:delayed-measurement}, allows us to
defer all measurements to the end of the circuit, such that the whole of the
computation is carried out unitarily.
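
For a toy circuit whose unitary $U$ is known explicitly, the differences
between these notions can be made concrete with the following NumPy sketch (a
toy illustration of our own; the circuit and conventions are arbitrary):
%
{\footnotesize
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(1)

# Toy 2-qubit circuit: H on qubit 0, then CX(0->1)
I2 = np.eye(2)
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
X = np.array([[0, 1], [1, 0]])
CX = np.kron(np.diag([1.0, 0.0]), I2) \
   + np.kron(np.diag([0.0, 1.0]), X)
U = CX @ np.kron(H, I2)

psi = U[:, 0]               # U |0...0>

# Strong simulation, wave-function version:
# output the amplitude <j|U|0...0>
print(psi[0b11])            # 0.707...

# Strong simulation: output the probability of
# reading 0 on the first qubit,
# Tr((|0><0| (x) I) U |0..0><0..0| U^dag)
p0 = np.sum(np.abs(psi[:2]) ** 2)
print(p0)                   # 0.5

# Weak simulation: output a sample of that bit
print(int(rng.random() >= p0))   # 0 or 1
\end{verbatim}
}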
\subsection{Clifford circuits and the Gottesman-Knill theorem}
Recall that Clifford gates are the gates in the Clifford set, so that a
Clifford circuit is any quantum circuit that can be written in terms of phase,
Hadamard, and controlled-not gates (\viz.\ Table~\ref{table:basic-operations}).
The Gottesman-Knill theorem then states the following:
\begin{theorem}[Gottesman-Knill \cite{Gottesman1999}]
\label{theorem:gottesman-knill}
\def\ketneg{\mkern-5mu}
Every (uniform family of) Clifford circuit(s), when applied to the input
state $\ket{0_b} \equiv \ket{0}\ketneg\ket{0}\ldots\ket{0}$, and when
followed by a computational basis measurement of the first qubit, can be
efficiently simulated classically in the strong sense.
\end{theorem}
The theorem is constructive, and in Ref.~\cite{Gottesman2004} Aaronson and
Gottesman provide a high-performance (weak) simulator of Clifford circuits that
can scale up to tens of thousands of qubits. Van den Nest \cite{VandenNest2010}
gives an alternative derivation of the theorem that allows for direct strong
simulation as well (both the regular and wave-function version).
Despite this result, Clifford circuits very easily extend to the universality
regime: not only by augmentation of the gate set -- the Clifford+$T$ gate set
is already universal -- but also by choice of the input state. Indeed, there
exist ``magic states'', such that a supply of these (pre-prepared) quantum
states and Clifford operations are enough to perform universal quantum
computation \cite{Bravyi2005}. Nonetheless, if Clifford gates dominate a
non-Clifford circuit, this structure may be exploited to speed up the computation,
by treating the simulation as a tensor network where the calculation of certain
tensors can be sped up \cite{Bravyi2016} (\viz.\ section
\ref{sec:tensor-network-simulation}).
This move from Clifford-based computation to quantum universality entails a
``jump'', since Clifford circuits are not as powerful as classical circuits.
In Ref.~\cite{VandenNest2010}, this computational gap is discussed and
eliminated, by giving a superclass of Clifford circuits, ``$HT$ circuits'', that
is equivalent to classical computation and can be weakly simulated.
Finally, note also that certain classes of efficiently simulatable circuits not
discussed here, such as \emph{matchgate circuits}, may relate non-trivially
with Clifford circuits \cite{Valiant2002,Jozsa2008,Brod2016}.
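
To give a flavour of how Clifford simulation works in practice, the following
sketch (a toy implementation of our own) tracks the stabilizer generators of
the all-zeros state through a Clifford circuit, using the tableau update rules
of Ref.~\cite{Gottesman2004}; measurements, which require additional
bookkeeping, are omitted:
%
{\footnotesize
\begin{verbatim}
import numpy as np

# A stabilizer generator on n qubits is stored as
# (x, z, r): the Pauli (-1)^r prod_i X_i^x_i Z_i^z_i,
# with Y encoded as x_i = z_i = 1.  The update
# rules follow Aaronson and Gottesman (2004);
# measurements are omitted in this sketch.

def new_tableau(n):
    # Stabilizers of |0...0>: Z_0, ..., Z_{n-1}
    x = np.zeros((n, n), dtype=np.uint8)
    z = np.eye(n, dtype=np.uint8)
    r = np.zeros(n, dtype=np.uint8)
    return x, z, r

def apply_h(t, a):
    x, z, r = t
    r ^= x[:, a] & z[:, a]
    x[:, a], z[:, a] = z[:, a].copy(), x[:, a].copy()

def apply_s(t, a):
    x, z, r = t
    r ^= x[:, a] & z[:, a]
    z[:, a] ^= x[:, a]

def apply_cx(t, a, b):         # control a, target b
    x, z, r = t
    r ^= x[:, a] & z[:, b] & (x[:, b] ^ z[:, a] ^ 1)
    x[:, b] ^= x[:, a]
    z[:, a] ^= z[:, b]

def show(t):
    x, z, r = t
    pauli = {(0, 0): "I", (1, 0): "X",
             (1, 1): "Y", (0, 1): "Z"}
    for k in range(len(r)):
        lbl = "".join(pauli[int(x[k, i]), int(z[k, i])]
                      for i in range(x.shape[1]))
        print(("-" if r[k] else "+") + lbl)

# H on qubit 0, then CX(0 -> 1): the stabilizers of
# the Bell state (|00> + |11>)/sqrt(2) are +XX, +ZZ
t = new_tableau(2)
apply_h(t, 0)
apply_cx(t, 0, 1)
show(t)
\end{verbatim}
}
%
In full implementations such as that of Ref.~\cite{Gottesman2004}, measurements
require tracking additional (``destabilizer'') generators, and the bit-wise
operations above are packed into machine words for performance.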
\subsection{Schr\"{o}dinger simulation}
\label{sec:schrodinger-simulation}
\Schrodinger\ simulation refers to the straightforward approach of maintaining
the global state-vector, updating it as new unitary operations are encountered
\cite{DeRaedt2019}. As such, it is inherently a form of the wave-function
version of strong simulation (definition \ref{def:strong-simulation-wavefn}).
This method also requires, by definition, that $2^n$ complex values are
maintained for a state of $n$ qubits, such that it cannot physically scale
beyond a certain number of qubits (about $45$-$50$ qubits, corresponding to a
petabyte or more of memory, if each amplitude is represented within $8$ bytes;
barring adaptive models admitting error, such as in Ref.~\cite{DeRaedt2019}).
Despite this constraint, a significant amount of research has been devoted to
\Schrodinger\ simulation in the memory-tractable regime ($n \lesssim 50$),
specifically in pushing this limit and speeding up the running time
\cite{Smelyanskiy2016,Fatima2021,Markov2018,DeRaedt2019,Jones2019,Haner2017,Haner2016,Niwa2002,DeRaedt2006}.
A key observation is that, if each gate is considered one at a time, the
matrix-vector products being calculated involve very sparse and strongly
structured matrices. Namely, the gates are \emph{local}, in the sense that they
act non-trivially on at most a small number $k$ of qubits. For example, in
Ref.~\cite{Haner2017}, this is used to ensure that compute resources are
maximally utilized via parallelization. Following a different strategy, in
Ref.~\cite{Haner2016}, advance knowledge of the action of common blocks of
operations in quantum computing is used to speed up over the simulation of each
gate individually. Gate fusion, different encoding techniques and cache-related
considerations, as well as employment of compiler intrinsics, also allow for
speed improvements
\cite{Fatima2021,Smelyanskiy2016,Haner2016,Haner2017,Markov2018}.
Otherwise, the problem may be regarded as a classical
large sparse matrix--vector product, a well-researched problem (see, e.g.,
Ref.~\cite{Xiao2023}).
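
The key primitive of a \Schrodinger\ simulator is the in-place update of the
$2^n$-amplitude vector under a local gate, without ever forming a $2^n \times
2^n$ matrix. A minimal NumPy sketch of this update (the function and its
conventions are ours, not those of any particular simulator) is:
%
{\footnotesize
\begin{verbatim}
import numpy as np

def apply_gate(psi, gate, targets, n):
    # Apply a k-qubit gate (2^k x 2^k matrix) to
    # the qubits `targets` of the n-qubit state
    # `psi`, never forming a 2^n x 2^n matrix.
    k = len(targets)
    t = psi.reshape((2,) * n)
    g = gate.reshape((2,) * (2 * k))
    # Contract the gate's input indices with the
    # target axes of the state ...
    t = np.tensordot(g, t,
                     axes=(list(range(k, 2 * k)),
                           list(targets)))
    # ... which leaves the new target axes first;
    # move them back into place.
    t = np.moveaxis(t, list(range(k)), list(targets))
    return t.reshape(-1)

# Example: H on qubit 0, then CX(0 -> 1), on |00>
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
CX = np.array([[1, 0, 0, 0], [0, 1, 0, 0],
               [0, 0, 0, 1], [0, 0, 1, 0]])
psi = np.zeros(4, dtype=complex)
psi[0] = 1.0
psi = apply_gate(psi, H, [0], 2)
psi = apply_gate(psi, CX, [0, 1], 2)
print(psi)     # (|00> + |11>)/sqrt(2)
\end{verbatim}
}
%
Production \Schrodinger\ simulators implement this update with explicit index
arithmetic, vectorization, and multi-threading, rather than generic tensor
routines \cite{Smelyanskiy2016,Haner2017}.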
\subsection{Feynman simulation}
\label{sec:feynman-simulation}
The Feynman simulation method
\cite{Boixo2017,Chen2017,Bernstein1997} trades the $2^n$ memory requirement for
an exponential-time computation carried out in linear space. Being also a form of the
wave-function version of strong simulation (definition
\ref{def:strong-simulation-wavefn}), the Feynman simulation method follows from
noticing the following:
%
\begin{equation}
\def\ProdStrut{\rule[-0.3\baselineskip]{0pt}{7pt}}
\everymath={\displaystyle}
\begin{array}{l}
\bra{x} U_L U_{L-1} U_{L-2} \cdots U_2 U_1 \ket{0_b} = {} \\[1em]
{} = \bra{x} U_L (\Sigma_{j=0}^{2^n-1} \dyad{j}) U_{L-1} (\Sigma_{j'=0}^{2^n-1} \dyad{j'}) \\[.5em]
\hfill U_{L-2} \cdots U_2 (\Sigma_{j''=0}^{2^n-1} \dyad{j''}) U_1 \ket{0_b} = {} \\[1em]
\hfill {} = \sum_{\mathclap{\{y_{(t)}\} \in \{0,1,\ldots,2^n-1\}^{L-1}}} \mkern 65mu \prod_{\ProdStrut t=0}^{\ProdStrut L-1} \bra{y_{(t+1)}} U_{t+1} \ket{y_{(t)}}
\end{array}
\end{equation}
%
since $\{\ket{j_b}\}_{j=0,\ldots,2^n-1}$ form an orthonormal basis of the state
vector space, and letting $\ket*{y_{(0)}} \equiv \ket{0_b}$ and $\ket*{y_{(L)}}
\equiv \ket{x}$. Now, take each of
the $U_t$ to be the (local, sparse) unitary corresponding to a quantum gate in
a quantum circuit, such that $L$ is the depth of the quantum circuit. One may
conclude this scheme requires $\BigO(n\cdot(2d)^{n+1})$ time and $\BigO(n \log
d)$ space \cite{Chen2017}.
This approach may be interpreted as a discrete version of the Feynman path
integral formulation, where, simply put, every possible ``computation path''
(corresponding to a choice of $\{y_{(t)}\}$) is considered separately, in order
to determine the resulting constructive or destructive interference between the
paths; each path requires a linear amount of memory to compute, but there are
exponentially many paths to consider, which interfere among themselves. This is
illustrated in figure~\ref{fig:interference}.
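
A minimal NumPy sketch of this path sum follows (a toy illustration of our own;
for simplicity each layer unitary is stored as a dense matrix, which of course
forgoes the memory savings that motivate the method at scale, where the matrix
elements $\bra{y_{(t+1)}} U_{t+1} \ket{y_{(t)}}$ are instead computed on the
fly from the gate descriptions):
%
{\footnotesize
\begin{verbatim}
import numpy as np

# Layer unitaries U_1, ..., U_L of a toy 2-qubit
# circuit: H on qubit 0, then CX(0 -> 1).
I2 = np.eye(2)
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
CX = np.array([[1, 0, 0, 0], [0, 1, 0, 0],
               [0, 0, 0, 1], [0, 0, 1, 0]])
layers = [np.kron(H, I2), CX]

def feynman_amp(layers, x, y=0, t=0):
    # <x| U_L ... U_1 |0...0>, summing over every
    # computational-basis path; only the current
    # path is kept, never a full state vector.
    if t == len(layers):
        return 1.0 if y == x else 0.0
    U = layers[t]
    total = 0.0
    for y2 in range(U.shape[0]):
        a = U[y2, y]            # <y2| U_{t+1} |y>
        total += a * feynman_amp(layers, x, y2, t + 1)
    return total

print(feynman_amp(layers, x=0b11))   # 0.707...
\end{verbatim}
}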
A further advantage of this method is that some of the computational paths may
be simply ignored, at the expense of simulation fidelity. The computational
savings from not having to compute these paths are significant enough to
allow for the simulation, at matching fidelity, of real-world implementations
previously thought impossible to simulate \cite{Markov2018}. These considerations
also apply to the \Schrodinger-Feynman method (section
\ref{sec:schrodinger-feynman-simulation}).
\begin{figure}
\hskip-.5cm\includegraphics{assets/interference.pdf}
\caption{Two Hadamard gates (Eq.~\eqref{eq:hadamard}) acting on
a single qubit initialized to $\ket{0}$, as a trivial example of
interference. Each node's tone reflects the absolute value of the
associated amplitude: darker corresponds to greater amplitude. The node
with a negative amplitude is marked with a $(-)$. In a Feynman path
integral interpretation (see section \ref{sec:feynman-simulation}),
each of the drawn paths is considered separately, and then summed,
resulting in the destructive interference of the $\ket{1}$ state.}
\label{fig:interference}
\end{figure}
\subsection{Schr\"{o}dinger-Feynman simulation}
\label{sec:schrodinger-feynman-simulation}
It is possible to establish an intermediate scheme between \Schrodinger\
simulation (section \ref{sec:schrodinger-simulation}) and Feynman simulation
(section \ref{sec:feynman-simulation}) \cite{Chen2017,Boixo2017,Chen2018}.
Thus, this scheme, which allows for a controllable trade-off between space and
time complexity, is referred to by some authors as \Schrodinger-Feynman
simulation \cite{Boixo2017,Burgholzer2021,Fatima2021}.
The main idea of the technique is to divide the quantum circuit into disjoint
registers, performing \Schrodinger\ simulation for operations that involve only
qubits in the same register, but summing over ``paths'' resulting from
operations across registers. This allows the computation to be distributed
(across different choices of computation paths for cross-register operations),
while maximally exploiting the memory available to each process.
The choice of (maximum) size of the registers determines the memory consumption,
with larger registers requiring more memory but fewer computational paths
to consider. Thus, for registers of at most $k$ qubits in an $n$-qubit
circuit of depth $d$, one requires $\BigO(n 2^{n-k} \cdot (2d)^{k+1})$ time, and
$\BigO(2^{n-k} \log d)$ space \cite{Chen2017}.
This method is well suited for simulating circuits with a grid-like
connectivity graph between qubits, as is common in practical implementations
\cite{Arute2019,Kim2023,Saffman2010}, and so is used in multiple works pushing
the boundary of quantum advantage, allowing for simulation of circuits with
significantly more than $50$ qubits \cite{Chen2018,Li2018,Haner2017}.
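
The following NumPy sketch illustrates the idea on the smallest possible
example: two registers of one qubit each, and a single cross-register
controlled-$Z$ gate expanded into two computational paths. It is a toy
illustration of our own and not the scheme of any particular simulator:
%
{\footnotesize
\begin{verbatim}
import numpy as np

# Two one-qubit registers: A = qubit 0, B = qubit 1.
# Circuit: H on A, H on B, then a CZ across the cut.
# The cross gate is expanded into two "paths":
# CZ = sum_p |p><p|_A (x) (Z_B)^p.
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
Z = np.diag([1.0, -1.0])
P = [np.diag([1.0, 0.0]), np.diag([0.0, 1.0])]
ket0 = np.array([1.0, 0.0])

x_A, x_B = 1, 1         # target amplitude <11|...|00>
amp = 0.0
for p in (0, 1):
    # Schrodinger simulation inside each register,
    # one run per path; only single amplitudes are
    # recombined, never the global state vector.
    psi_A = P[p] @ (H @ ket0)
    psi_B = np.linalg.matrix_power(Z, p) @ (H @ ket0)
    amp += psi_A[x_A] * psi_B[x_B]

# Check against a direct global simulation
CZ = np.diag([1.0, 1.0, 1.0, -1.0])
full = CZ @ np.kron(H @ ket0, H @ ket0)
print(amp, np.isclose(amp, full[0b11]))   # -0.5 True
\end{verbatim}
}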
\subsection{Tensor network simulation}
\label{sec:tensor-network-simulation}
Closely related to the Feynman simulation technique, but generalizing the idea,
tensor-network based simulation regards quantum circuit simulation as a form of
tensor contraction \cite{Vidal2003,Markov2008,McCaskey2018,Pednault2020,Villalonga2019,Guo2019}.
\FootnoteLater{Terminology regarding tensors is not always consistent across
works. For example, the terms \emph{way}, \emph{order}, \emph{degree}, or
\emph{dimension} may be used instead of \emph{rank}
\cite{Joshi1995,DeLathauwer2000,Vasilescu2002,Kolda2009}.}
Recall that a tensor is an object generalizing matrices: a tensor has upper and
lower (contra- and co-variant) indices; its number of indices determines its
\emph{rank}, while the values that each index may take determine the
\emph{dimension} of the index\FootnoteNow. A given choice of indices yields a
\emph{component} of the tensor, e.g., $T^{ijk}_{lm}$ denotes the $(i,j,k;l,m)$th
component of the tensor $T$. A tensor $T$ is of type $(a,b)$ if it has $a$
contravariant indices and $b$ covariant indices. Two tensors may be
\emph{contracted} by summing the product of the two tensors over a
contravariant index of one and a covariant index of the other, both of the
same dimension.
I.e., let $M$ and $N$ be two tensors of types $(a,b)$ and $(c,d)$, with
dimension $q$ on each index. The tensor whose components are given by
%
$$
\sum_{i_k=1}^q M^{i_1 \ldots i_{k-1} i_k i_{k+1} \ldots i_a}_{j_1 \ldots j_b}
N^{m_1 \ldots m_c}_{l_1 \ldots l_{p-1} i_k l_{p+1} \cdots l_d}
$$
%
is an $(a+c-1,b+d-1)$-type tensor.
Now, it is possible to represent a quantum state of $n$ qubits by a rank-$n$
tensor, where each index has dimension $2$:
%
\begin{multline}
\psi^{i_1 i_2 \ldots i_n} \quad \leftrightarrow \quad \\
\ket{\psi} = \sum_{i_1, \ldots, i_n = 0,1} \psi^{i_1 \ldots i_n} \ket{i_1}\cdots\ket{i_n}
\end{multline}
It follows that likewise any quantum operation involving $k$ qubits is given by
a $(k,k)$-type tensor, and the quantum state vector after the operation is given
by a tensor contraction. E.g.,
%
\begin{multline}
\ket{\phi} = G\ket{\psi} \quad\leftrightarrow\quad \\
\phi^{i'_1 \ldots i'_k i_{k+1} \ldots i_n} = \sum_{\mathclap{i_1 \ldots i_k = 0,1}} G^{i'_1 \ldots i'_k}_{i_1 \ldots i_k} \psi^{i_1 \ldots i_k i_{k+1} \ldots i_n}
\end{multline}
%
where we have, for simplicity of notation, taken the action of the gate $G$ to
be on the first $k$ qubits.
Therefore, it is possible to represent a quantum circuit acting on an input
state as a sequence of tensor products and contractions. The network of
contractions of the multiple tensors is called a \emph{tensor network}.
The result of the contraction will be a vector (rank-$1$ tensor), the
components of which are the entries in the state vector resulting from the
action of the quantum circuit in the input state. Therefore, contracting the
tensor network in this manner yields a wave-function strong simulation method
(definition \ref{def:strong-simulation-wavefn}). However, it is also possible
to fix a final projector, i.e., to calculate the contraction corresponding to the
tensor
%
\begin{equation}
\bra{x} U \ket{0_b} = \sum_{\mathclap{i_1, \ldots, i_n = 0,1}} x_{i_1 \ldots i_n} \; U^{i_1 \ldots i_n}_{0, 0, \ldots, 0}
\end{equation}
%
for a computational basis state $\ket{x}$, and where $U$ is the tensor
corresponding to the action of the quantum circuit. In this case, the result of
the contraction is a single complex amplitude (a rank-$0$ tensor), making it
suitable for strong or weak simulation
(defs.~\ref{def:strong-simulation},\ref{def:weak-simulation}). Fixing a final
projector allows for a different contraction order, and may greatly improve the
efficiency of the simulation \cite{Burgholzer2023}.
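
As an illustration, the following NumPy sketch contracts the tensor network of
a small two-qubit circuit (a Hadamard followed by a controlled-not), with the
final projector fixed, using a single \texttt{einsum} call; the index labelling
is ours and purely illustrative:
%
{\footnotesize
\begin{verbatim}
import numpy as np

# Tensors of a 2-qubit network: input kets |0>|0>,
# H on qubit 0 (a (1,1)-type tensor), CX on both
# qubits (a (2,2)-type tensor), and a final
# projector <x| with x = 11.
ket0 = np.array([1.0, 0.0])
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
CX = np.array([[1, 0, 0, 0], [0, 1, 0, 0],
               [0, 0, 0, 1], [0, 0, 1, 0]]
              ).reshape(2, 2, 2, 2)
bra1 = np.array([0.0, 1.0])    # <1| on each qubit

# <11| CX (H (x) I) |00> as a single contraction;
# einsum is free to choose the contraction order.
amp = np.einsum("a,b,ca,decb,d,e",
                ket0, ket0, H, CX, bra1, bra1)
print(amp)                     # 0.707...
\end{verbatim}
}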
Depending on how the tensor network is contracted, the memory and time necessary
to respectively keep track of the tensor and contract it may vary dramatically
\cite{Lykov2022}. Thus, finding a good (ideally optimal)
contraction ordering, known to be a computationally hard task in its generality,
is a central research topic
\cite{Pfeifer2014,Gray2021,Pednault2020,Fried2018,Schindler2020,Ibrahim2022,Liang2021}.
Nonetheless, when obtaining the state vector, one may upper bound the time
complexity of the contraction as $T^{\BigO(1)}\exp[\BigO(qD)]$, where $T$ is
the total number of gates in the circuit, $q$ is the maximum number of adjacent
qubits involved in a single operation, and $D$ is the depth of the circuit
\cite{Markov2008}. Note how the time complexity grows exponentially with the
depth of the circuit; this motivated the increase of depth in ``quantum
supremacy'' experiments, though other characteristics, like qubit connectivity
or better contraction orderings, may still allow for tensor-based simulation
\cite{Gray2021,Pan2022,Tindall2023}.
Finally, we note the method of \emph{decision diagrams}
\cite{Niemann2016,Lu2011,Zulehner2019efficiently,Viamontes2004,Zulehner2019matrix,Burgholzer2021equivalence}.
Developed in parallel to tensor methods, decision diagrams are similar to
tensor networks, but with less redundancy. This entails both advantages and
drawbacks, and we refer to Ref.~\cite{Burgholzer2023} for a comparison of the
two methods.
\subsection{Noise simulation}
In the previous subsections, we have considered the simulation of ``perfect''
quantum circuits, i.e., simulations that do not model the presence of noise
(even though they may exploit the presence of noise in the device to speed up
the simulation, such as in \cite{Markov2018}). However, the presence of noise is practically
unavoidable in current experimental settings \cite{Preskill2018}. Therefore, it
may be useful to simulate a noisy quantum circuit.
\FootnoteLater{For the reader unfamiliar with quantum channels, they may regard
them as a generalization of measurements in the mixed state formulation.}
Typically, the density matrix formalism (\viz.\ section
\ref{sec:density-matrix}) is used to describe the quantum state resulting from
a quantum circuit affected by noise, by modelling the effects of noise as
\emph{quantum channels}, i.e., completely positive, convex-linear,
non-trace-increasing maps on density matrices\FootnoteNow \cite{Nielsen2012}.
These channels, in turn, reflect physical models of noise, typically derived
from the underlying dynamical equations \cite{Breuer2002}.
While, for example, tensor-based simulation (section
\ref{sec:tensor-network-simulation}) naturally extends to density matrix
simulation \cite{Markov2008}, it may seem that the overhead of maintaining a
density matrix while employing \Schrodinger\ or Feynman simulation would be
prohibitive. However, it turns out that the effect of the usual noise channels
can be rephrased as the statistical average of random (possibly non-unitary)
gates in a
quantum circuit \cite{Bassi2008}. Therefore, at the cost of repeating
the simulation multiple times to collect statistics, it is possible to extend
the pure-state based methods to the simulation of noise as well.
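
As an illustration, the following NumPy sketch applies a single-qubit
depolarizing channel in both ways: as Kraus operators acting on a density
matrix, and as a statistical (Monte Carlo) average of randomly applied Pauli
gates on the pure state. For this particular channel the sampled gates happen
to be unitary; general channels require sampling non-unitary Kraus operators
and renormalizing the state. The sketch is our own and not tied to any
simulator's interface:
%
{\footnotesize
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(2)

# Single-qubit depolarizing channel with error
# probability p, applied as Kraus operators ...
p = 0.2
I2 = np.eye(2)
X = np.array([[0, 1], [1, 0]])
Y = np.array([[0, -1j], [1j, 0]])
Z = np.diag([1.0, -1.0])
kraus = [np.sqrt(1 - p) * I2] + \
        [np.sqrt(p / 3) * P for P in (X, Y, Z)]

plus = np.array([1, 1]) / np.sqrt(2)   # input |+>
rho_in = np.outer(plus, plus.conj())
rho_out = sum(K @ rho_in @ K.conj().T for K in kraus)

# ... and as a statistical average of randomly
# applied Pauli gates on the pure state.
shots = 20000
probs = [1 - p, p / 3, p / 3, p / 3]
paulis = [I2, X, Y, Z]
acc = np.zeros((2, 2), dtype=complex)
for _ in range(shots):
    psi = paulis[rng.choice(4, p=probs)] @ plus
    acc += np.outer(psi, psi.conj())

# Agreement up to sampling error
print(np.allclose(acc / shots, rho_out, atol=0.02))
\end{verbatim}
}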
\section{Simulation software}
In this section we examine some of the publicly available software for
simulating quantum circuits. It is important to note that we do not claim to be
exhaustive in the number of simulators considered, nor in the benchmarking of the
considered solutions. The large and growing number of quantum circuit
simulators available, and the different possible goals for such software (e.g.,
small-scale \textit{vs.}~supremacy-scale simulation), would make a complete and
even-handed comparison impossible. Thus, we focused on a subset of offline,
small-scale, industry-recognized simulators, such as those that a researcher
might use to validate their algorithms in toy settings on a laptop.
\subsection{Methodology}
\FootnoteLater{As of July $16$, $2023$, the Github repositories for the Cirq,
the Microsoft Quantum, and the Qiskit simulators have, respectively,
$3,\!823$, $3,\!737$, and $3,\!678$ ``stars''.}
We began with the $3$ most popular (at the time of writing) quantum circuit
simulators under the Github ``quantum-computing'' tag\FootnoteNow. Then, we
considered simulators for which Qiskit or Cirq provided a backend interface. For
providers of quantum hardware interfacing with Qiskit, we considered the
provider's simulation solution, if it existed. We then proceeded to traverse
the reference graph of the simulators' reference publications, gathering a
total of $20$ simulators. A graph outlining the citation chains followed is
given in figure \ref{fig:citation-graph}.
\begin{figure}[thb]
\includegraphics[page=2]{assets/reference_spider.pdf}
\caption{Graph illustrating the citation chains followed to enumerate the
simulators considered.}
\label{fig:citation-graph}
\end{figure}
\subsection{Simulators}
\Simulator{Cirq (internal/qsim/qsimh simulators)}
Cirq \cite{cirq2022} is Google's solution for quantum circuit related tasks,
from design to real-world execution. Besides third-party backends (see sections
\ref{sec:simulator-qflex},\ref{sec:simulator-quimb}), Cirq contains three
separate simulators: the built-in Python simulator \cite{cirq2022simulator},
qsim, and qsimh \cite{qsim2020}. The built-in Python simulator performs
\Schrodinger-based simulation (thus maintaining the state vector) using a NumPy
\cite{numpy2020} sparse-matrix representation, and is meant for testing small
circuits. qsim and qsimh, in contrast, are Google's optimized and
high-performance simulation solutions. Both are written in \Cpp, and are,
respectively, a \Schrodinger\ and a \Schrodinger-Feynman simulator (\viz.\
sections
\ref{sec:schrodinger-simulation},\ref{sec:schrodinger-feynman-simulation}).
qsim and qsimh employ gate fusion, vectorized instructions and multi-threading
for computational speed-up. qsimh supports trading computational time by
lower-fidelity simulation, as outlined in section \ref{sec:feynman-simulation}.
Both qsim and qsimh integrate with Cirq's Python interface.
\Simulator{Microsoft Quantum}
Microsoft's ``Quantum Development Kit'' is a tool-kit for quantum computation,
including quantum circuit simulation \cite{microsoft2023}. It is integrated with
\Simulator{Qiskit}
\Simulator{qFlex}
\Simulator{quimb}
\Simulator{Rigetti Forest}
\Simulator{NVIDIA cuStateVec}
\Simulator{NVIDIA cuTensorNet}
\Simulator{Amazon SV1}
\Simulator{Amazon TN1}
\Simulator{Amazon DM1}
\Simulator{MQT/DDSIM}
\Simulator{Intel Quantum Simulator}
\Simulator{Xanadu's Strawberry Fields}
\Simulator{Pennylane Lightning}
\Simulator{Qibo}
\Simulator{QCGPU}
\Simulator{Quantum++}
\appendix
\section{Delayed measurement}
\label{appendix:delayed-measurement}
The delayed measurement lemma states:
%
\begin{lemma}[Delayed measurement \cite{Nielsen2012}]
\begin{equation}
\def\RegisterSlash{{\hskip 1em /}}
{\raisebox{10pt}{
$\displaystyle
\Qcircuit @C=1em @R=1em {
& \qw & \ctrl{1} & \meter \\
\RegisterSlash & \qw & \gate{U} & \qw
}
$
}}
\mskip 10mu = \mskip 5mu
{\raisebox{11pt}{
$\displaystyle
\Qcircuit @C=1em @R=1em {
& \meter & \cctrl{1} & \\
\RegisterSlash & \qw & \gate{U} & \qw
}
$
}}
\end{equation}
%
i.e., measurements can always be moved from an intermediate stage of a
quantum circuit to the end of the circuit, replacing conditional classical
operations by controlled quantum operations.
\end{lemma}
This statement may be proven by explicitly calculating the density matrix
resulting from the action of each circuit, for an arbitrary input, and checking
that the result is the same. It follows that, when speaking of a quantum
algorithm, one may always take the procedure to be described by a unitary
operation followed by measurements.
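
The lemma is also easily checked numerically. The following NumPy sketch (a toy
verification of our own) computes, for a random pure input state of a control
qubit and a one-qubit register and a random single-qubit $U$, the output
density matrix of both circuits of the lemma, and confirms that they coincide:
%
{\footnotesize
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(3)

# Random single-qubit unitary U (via QR) and a
# random pure state of control (x) target.
A = rng.normal(size=(2, 2)) \
    + 1j * rng.normal(size=(2, 2))
U, _ = np.linalg.qr(A)
psi = rng.normal(size=4) + 1j * rng.normal(size=4)
psi /= np.linalg.norm(psi)
rho = np.outer(psi, psi.conj())

I2 = np.eye(2)
P = [np.diag([1.0, 0.0]), np.diag([0.0, 1.0])]
CU = np.kron(P[0], I2) + np.kron(P[1], U)

def measure_control(rho):
    # Non-selective computational-basis measurement
    # of the control (first) qubit
    return sum(np.kron(Pm, I2) @ rho @ np.kron(Pm, I2)
               for Pm in P)

# (a) controlled-U, then measure the control
rho_a = measure_control(CU @ rho @ CU.conj().T)

# (b) measure the control, then apply U to the
# target, classically conditioned on the outcome
rho_b = np.zeros((4, 4), dtype=complex)
for m in (0, 1):
    Km = np.kron(P[m], I2)
    Vm = np.kron(I2, U) if m else np.eye(4)
    rho_b += Vm @ Km @ rho @ Km.conj().T @ Vm.conj().T

print(np.allclose(rho_a, rho_b))   # True
\end{verbatim}
}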
\bibliography{citations}
\end{document}