Quantum Machine Learning Tutorial

A Hands-on Tutorial for Machine Learning Practitioners and Researchers

Chapter 4.5 Recent Advancements

Quantum neural networks (QNNs) have emerged as a prominent paradigm in quantum machine learning, demonstrating potential in both discriminative and generative learning tasks. For discriminative tasks, QNNs utilize high-dimensional Hilbert spaces to efficiently capture and represent complex intrinsic relationships. In generative tasks, QNNs leverage variational quantum circuits to generate complex probability distributions that may exceed the capabilities of classical models. While these approaches share common learning strategies, they each introduce unique challenges and opportunities in terms of model design, theoretical exploration, and practical implementation. In the remainder of this section, we briefly review recent advances in QNNs. Interested readers can refer to @ablayev2019quantum [@li2022recent; @massoli2022leap] for a more comprehensive review.

Discriminative learning with QNNs

QNNs for discriminative tasks have emerged as one of the most active research areas in quantum machine learning, demonstrating potential advantages in feature representation and processing efficiency. The quantum learning approach leverages the high-dimensional Hilbert space and quantum parallelism to potentially handle complex classification boundaries more effectively than classical neural networks. Research has shown particular promise in handling datasets with inherent quantum properties and problems where quantum entanglement can be meaningfully exploited [@huang2022quantum].

Model designs

In the realm of quantum discriminative models, researchers have developed various quantum neural architectures. Variational quantum classifiers [@havlivcek2019supervised; @mitarai2018quantum] employ parameterized quantum circuits for classification tasks. Quantum convolutional neural networks [@cong2019quantum] were subsequently designed for processing structured data, and hybrid quantum-classical architectures [@arthur2022hybrid] combine quantum layers with classical neural networks. Other notable works include quantum versions of popular classical architectures such as recurrent neural networks [@bausch2020recurrent] and attention mechanisms [@shi2024qsan]. Finally, @perez2020data [@fan2022compact] explored data re-uploading strategies for encoding classical data, yielding QML models with more expressive feature maps.
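
To make the variational-classifier pattern concrete, below is a minimal sketch written with PennyLane; the library choice, the circuit template (angle encoding followed by entangling layers), and all sizes and hyperparameters are illustrative assumptions on our part, not the architecture of any cited work.

```python
import pennylane as qml
from pennylane import numpy as np

n_qubits, n_layers = 4, 2
dev = qml.device("default.qubit", wires=n_qubits)

@qml.qnode(dev)
def classifier(x, weights):
    # Encode the classical feature vector into single-qubit rotation angles.
    qml.AngleEmbedding(x, wires=range(n_qubits))
    # Trainable entangling layers play the role of hidden layers.
    qml.StronglyEntanglingLayers(weights, wires=range(n_qubits))
    # A single expectation value in [-1, 1] serves as the class score.
    return qml.expval(qml.PauliZ(0))

weights = np.random.uniform(0, 2 * np.pi, (n_layers, n_qubits, 3),
                            requires_grad=True)
x = np.array([0.1, 0.5, -0.3, 0.9], requires_grad=False)
print(classifier(x, weights))  # sign of the score gives the predicted label
```

In a full pipeline, this score would be plugged into a classical loss (e.g., a square or cross-entropy loss) and the weights optimized with gradient descent via the parameter-shift rule.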

In addition to manually designed architectures, various model compression strategies have been explored to enhance the efficiency of quantum neural networks. For example, quantum architecture search methods [@du2022quantum; @zhang2022differentiable] automatically discover optimal quantum circuit designs with reduced gate complexity. @sim2021adaptive [@wang2022symmetric] introduced quantum pruning techniques that systematically identify and remove redundant quantum gates while preserving performance. In the realm of knowledge distillation, researchers have demonstrated how to transfer knowledge from a teacher model, given as a quantum [@alam2023knowledge] or classical [@li2024hybrid] neural network, to more compact quantum circuit architectures that are more robust against quantum noise. These optimization approaches have collectively improved the practical performance of QNNs on real quantum devices, particularly in the NISQ era.
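
As a toy illustration of the pruning idea, the heuristic below drops rotation gates whose trained angles are close to zero (modulo $2\pi$), since such gates act nearly as the identity. This magnitude-based rule is a simplified stand-in of our own, not the actual criteria used by @sim2021adaptive or @wang2022symmetric.

```python
import numpy as np

def prune_rotations(angles, threshold=0.05):
    """Return a keep-mask and the pruned angle array."""
    # Wrap angles to (-pi, pi] so values near 2*pi also count as near-identity.
    wrapped = np.angle(np.exp(1j * np.asarray(angles)))
    keep = np.abs(wrapped) > threshold
    return keep, np.where(keep, angles, 0.0)

trained = np.array([0.01, 1.2, -0.003, 2 * np.pi - 0.02, 0.8])
mask, pruned = prune_rotations(trained)
print(mask)    # [False  True False False  True]
print(pruned)  # [0.  1.2 0.  0.  0.8]
```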

Theoretical foundations

To gain a deeper understanding of the potential advantages and limitations of QNNs, a crucial research topic is analyzing their learnability. More precisely, learnability is determined by the interplay of three key aspects: expressivity, trainability, and generalization, as preliminarily introduced in Chapter 1.4 with essential theoretical results. Beyond these foundational insights, an extensive body of research has conducted more comprehensive and detailed investigations into these three aspects, which we review individually below.

Expressivity. The expressivity of QNNs refers to their ability to represent complex functions or quantum states efficiently. Universal approximation theorems (UAT) incorporating data re-uploading strategies were first established by @perez2020data, with subsequent works [@schuld2021effect; @yu2022power] extending them to various problem settings. Beyond the UAT, @sim2019expressibility and @nakaji2021expressibility analyze the expressivity of QNNs by investigating how well the parameterized quantum circuits used in QNNs approximate the Haar distribution, a critical measure of expressive capacity in quantum systems. Moreover, @yu2024non analyze non-asymptotic error bounds of variational quantum circuits for approximating multivariate polynomials and smooth functions.
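
The Haar-comparison diagnostic admits a simple numerical sketch: sample pairs of random parameters, histogram the fidelities between the resulting states, and compute the KL divergence to the analytic Haar fidelity distribution $P_{\mathrm{Haar}}(F) = (N-1)(1-F)^{N-2}$ with $N = 2^{n}$. The ansatz, qubit count, and sample sizes below are illustrative, and PennyLane is assumed for simulation.

```python
import pennylane as qml
import numpy as np

n_qubits, n_layers, n_pairs, n_bins = 3, 2, 2000, 50
dev = qml.device("default.qubit", wires=n_qubits)

@qml.qnode(dev)
def state(weights):
    qml.StronglyEntanglingLayers(weights, wires=range(n_qubits))
    return qml.state()

shape = (n_layers, n_qubits, 3)
fids = [
    np.abs(np.vdot(state(np.random.uniform(0, 2 * np.pi, shape)),
                   state(np.random.uniform(0, 2 * np.pi, shape)))) ** 2
    for _ in range(n_pairs)
]

# Histogram of circuit fidelities vs. the analytic Haar distribution.
hist, edges = np.histogram(fids, bins=n_bins, range=(0, 1), density=True)
centers = (edges[:-1] + edges[1:]) / 2
N = 2 ** n_qubits
haar = (N - 1) * (1 - centers) ** (N - 2)

# KL divergence (bin width 1/n_bins); smaller means closer to Haar.
mask = hist > 0
expr = np.sum(hist[mask] * np.log(hist[mask] / haar[mask])) / n_bins
print(expr)
```

Smaller divergence values indicate that the ansatz covers state space more uniformly, i.e., higher expressibility in the sense of @sim2019expressibility.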

Trainability. The trainability of QNNs concerns two aspects of their optimization: the gradient magnitude and the convergence rate.

For the first research line, @mcclean2018barren first identified the phenomenon of vanishing gradients, dubbed the barren plateau, where the gradient magnitude vanishes exponentially with the quantum system size. Since then, a series of studies has explored the causes of barren plateaus, including global measurements [@cerezo2021cost], high entanglement [@ortiz2021entanglement], and quantum system noise [@wang2021noise].
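
The effect is easy to observe numerically: estimate the variance of a single partial derivative over random parameter draws as the qubit count grows. With a hardware-efficient ansatz and a global observable, the variance should decay roughly exponentially. The following sketch assumes PennyLane; the depth and sample counts are arbitrary illustrative choices.

```python
import pennylane as qml
from pennylane import numpy as np

def grad_variance(n_qubits, n_layers=5, n_samples=50):
    dev = qml.device("default.qubit", wires=n_qubits)

    # Global observable Z x Z x ... x Z: a setting prone to barren plateaus.
    obs = qml.PauliZ(0)
    for w in range(1, n_qubits):
        obs = obs @ qml.PauliZ(w)

    @qml.qnode(dev)
    def cost(weights):
        qml.StronglyEntanglingLayers(weights, wires=range(n_qubits))
        return qml.expval(obs)

    shape = (n_layers, n_qubits, 3)
    grads = []
    for _ in range(n_samples):
        w = np.random.uniform(0, 2 * np.pi, shape, requires_grad=True)
        grads.append(qml.grad(cost)(w)[0, 0, 0])  # one fixed parameter
    return np.var(grads)

for n in (2, 4, 6):
    print(n, grad_variance(n))  # variance shrinks rapidly with qubit count
```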

Efforts to address the barren plateau problem, a major challenge in training deep quantum circuits, have yielded strategies such as proper circuit and parameter initialization techniques [@grant2019initialization; @zhang2022escaping], cost function design [@cerezo2021cost], and circuit architectures that provably avoid barren plateaus [@pesah2021absence]. Quantum-specific regularization and diagnostic techniques have also been developed to mitigate these effects [@larocca2022diagnosing].

Another research line on the trainability of QNNs focuses on their convergence, a topic not covered in this chapter's earlier introduction. In particular, @kiani2020learning and @wiersema2020exploring experimentally observed that overparameterized QNNs enjoy a benign loss landscape and converge quickly towards near-optimal local minima. Initial attempts have since been made to theoretically explain the superiority of overparameterized QNNs. Specifically, @larocca2023theory and @anschuetz2021critical utilized tools from dynamical Lie algebra and random matrix theory, respectively, to quantify the critical points in the optimization landscape of overparameterized QNNs. Moreover, @you2022convergence extended the classical convergence results of @xu2018convergence to the quantum domain, proving that overparameterized QNNs achieve an exponential convergence rate. Additionally, @liu2023analytic and @wang2023symmetric introduced the quantum neural tangent kernel (QNTK) to further demonstrate the exponential convergence rate of overparameterized QNNs. Beyond overparameterization theory, [@du2021learnability; @qi2023theoretical; @qian2024shuffle] investigated the conditions required to ensure the convergence of QNNs towards local minima.
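
For intuition, the empirical QNTK appearing in these analyses is the Gram matrix of parameter gradients, $K_{ij} = \langle \partial_{\theta} f(x_i), \partial_{\theta} f(x_j) \rangle$, evaluated at the current parameters. Below is a small sketch with an illustrative circuit, again assuming PennyLane; it is meant only to show the construction, not to reproduce any cited analysis.

```python
import pennylane as qml
from pennylane import numpy as np

n_qubits, n_layers = 3, 2
dev = qml.device("default.qubit", wires=n_qubits)

@qml.qnode(dev)
def f(x, weights):
    qml.AngleEmbedding(x, wires=range(n_qubits))
    qml.StronglyEntanglingLayers(weights, wires=range(n_qubits))
    return qml.expval(qml.PauliZ(0))

weights = np.random.uniform(0, 2 * np.pi, (n_layers, n_qubits, 3),
                            requires_grad=True)
X = np.random.uniform(-1, 1, (4, n_qubits), requires_grad=False)

# Jacobian of the model outputs with respect to all trainable parameters.
jac = np.stack([qml.grad(f, argnum=1)(x, weights).flatten() for x in X])

qntk = jac @ jac.T  # 4 x 4 kernel matrix over the four inputs
print(np.round(qntk, 4))
```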

Generalization. Research has also focused on understanding the sample complexity and generalization error bounds of quantum machine learning algorithms using tools from statistical learning theory. In particular, @abbas2021power compared the generalization power of QNNs and classical learning models based on an information-geometric metric. @caro2022generalization and @du2022efficient established generalization error bounds using covering numbers, revealing the impact of structural circuit factors, such as the number of gates and the types of observables, on generalization ability. Similarly, @bu2022statistical analyzed the generalization ability of QNNs from the perspective of quantum resource theory, emphasizing the role of quantum resources such as entanglement and magic in shaping generalization.
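
To give these results a concrete flavor, the covering-number bound of @caro2022generalization can be stated, in simplified form and up to constants, for a QNN hypothesis $h_{\theta}$ with $T$ parameterized gates trained on $N$ samples:

$$
\mathbb{E}_{x \sim \mathcal{D}}\big[\ell(h_{\theta}(x))\big] \;-\; \frac{1}{N} \sum_{i=1}^{N} \ell\big(h_{\theta}(x_i)\big) \;\in\; \mathcal{O}\!\left(\sqrt{\frac{T \log T}{N}}\right),
$$

so the generalization gap grows only mildly with circuit size and vanishes as the training set grows.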

Furthermore, frameworks for demonstrating quantum advantage in specific learning scenarios have been proposed [@huang2021information; @huang2022quantum], providing insights into the conditions under which quantum models outperform their classical counterparts. @du2021learnability and @zhang2024curse investigate the training-dependent generalization abilities of QNNs, while @du2023problem study problem-dependent generalization, highlighting key factors that enable QNNs to achieve strong generalization performance.

Beyond analyses focused on specific datasets and problems, the generalization ability of QNNs has also been examined through the lens of the No-Free-Lunch theorem. @poland2020no explore the average performance of QNNs across all possible datasets and problems, providing a broader perspective on their generalization potential. Extending this work, @sharma2022reformulation and @wang2024transition adapt the No-Free-Lunch theorem to scenarios involving entangled data, demonstrating the potential benefits of entanglement in certain settings. Additionally, @wang2024separable establish a No-Free-Lunch framework for various learning protocols, accounting for the different quantum resources each protocol employs.

Applications

Practical applications of quantum neural networks for discriminative learning span multiple domains. In computer vision, researchers have demonstrated quantum approaches to image classification [@henderson2020quanvolutional] and pattern recognition [@alrikabi2022face]. In quantum chemistry, QNNs have been applied to proton affinity prediction [@jin2024integrating] and catalyst development [@roh2024hybrid]. Financial applications include market trend classification [@li2023quantum] and fraud detection [@innan2024financial]. Medical applications encompass drug discovery [@batra2021quantum] and disease diagnosis [@enad2023review].

Generative learning with QNNs

QNNs for generative tasks represent a promising avenue in quantum machine learning, offering new methodologies for generating complex data distributions. By leveraging variational quantum circuits, these models aim to generate probability distributions that are potentially more complex than those attainable by classical counterparts, particularly in domains where high-dimensional data or quantum properties are prominent. We refer readers to @tian2023recent for a survey of QNNs in generative learning.

Model designs

Researchers have developed various QNN architectures tailored for generative tasks. Quantum circuit Born machines (QCBMs) [@benedetti2019generative] are one of the pioneering models, utilizing parameterized quantum circuits to generate discrete probability distributions. Quantum generative adversarial networks (QGANs) [@lloyd2018quantum] extend the adversarial framework to the quantum domain, where quantum generators and discriminators compete to learn complex distributions. Quantum Boltzmann machines [@amin2018quantum] are another notable model, employing quantum devices to prepare Boltzmann distributions for estimating target discrete distributions. Additionally, quantum autoencoders [@romero2017quantum] have been proposed for tasks like quantum state compression and reconstruction, offering potential advantages in quantum information processing. Recently, quantum diffusion models [@zhang2024generative; @kolle2024quantum] have been proposed for generating quantum state ensembles or classical images. These models showcase the versatility of QNNs in addressing diverse generative tasks.
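
To ground these designs, the following is a minimal QCBM training loop: a parameterized circuit defines a Born distribution over bitstrings, which is fitted to a toy target distribution with a KL loss. The circuit template, target, optimizer settings, and the use of PennyLane are all illustrative assumptions, not the setup of @benedetti2019generative.

```python
import pennylane as qml
from pennylane import numpy as np

n_qubits, n_layers = 3, 3
dev = qml.device("default.qubit", wires=n_qubits)

@qml.qnode(dev)
def born_probs(weights):
    qml.StronglyEntanglingLayers(weights, wires=range(n_qubits))
    return qml.probs(wires=range(n_qubits))

# Toy target over the 2^3 = 8 bitstrings.
target = np.array([0.5, 0.0, 0.0, 0.25, 0.0, 0.0, 0.0, 0.25],
                  requires_grad=False)

def kl_loss(weights):
    q = born_probs(weights)
    eps = 1e-10  # guards against log(0) on zero-probability outcomes
    return np.sum(target * np.log((target + eps) / (q + eps)))

weights = np.random.uniform(0, 2 * np.pi, (n_layers, n_qubits, 3),
                            requires_grad=True)
opt = qml.GradientDescentOptimizer(stepsize=0.2)
for _ in range(100):
    weights = opt.step(kl_loss, weights)

print(kl_loss(weights))  # approaches 0 as the Born distribution fits the target
```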

Theoretical foundations

The theoretical understanding of generative quantum neural networks has advanced in several directions. Like QNNs for discriminative tasks, quantum generative models such as QCBMs face the barren plateau issue, with additional mechanisms arising from the Kullback-Leibler (KL) divergence loss function [@rudolph2024trainability]. In parallel, QCBMs have been shown to be more efficient in the data-limited regime than classical generative models [@hibat2024framework]. Besides, @gao2018quantum proved the existence of a quantum generative model that can represent probability distributions beyond the reach of classical generative models, achieving exponential speedups in learning and inference. Similarly, @gao2022enhancing proved a separation in expressive power between a widely used class of generative models, known as Bayesian networks, and its minimal quantum-inspired extension.
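
For reference, the Born distribution and the KL training loss discussed above can be written (in our notation) as

$$
q_{\theta}(x) = \big|\langle x \mid \psi(\theta) \rangle\big|^{2}, \qquad
\mathcal{L}_{\mathrm{KL}}(\theta) = \sum_{x} p(x)\, \log \frac{p(x)}{q_{\theta}(x)},
$$

where $p$ is the target distribution and $|\psi(\theta)\rangle$ is the state prepared by the parameterized circuit.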

Applications

Generative QNNs have shown potential in various practical applications. In finance, they have been used to model complex financial data distributions and generate synthetic financial datasets, demonstrating better performance than classical models in certain scenarios [@alcazar2020classical; @zhu2022generative]. In the domain of quantum physics, quantum generative models have been applied for quantum state tomography and quantum simulation, aiding in the understanding of quantum systems [@benedetti2019adversarial]. In image generation, QGANs have been employed to produce high-quality images, showcasing their capability in handling complex visual data [@huang2021experimental]. Furthermore, quantum generative models have been explored for drug discovery, where they can potentially accelerate the process by efficiently exploring large chemical spaces [@li2021quantum]. These applications highlight the broad potential of QNNs in generative tasks across different domains.