This blog post begins with a playful analogy, inspired by the paper of Kingma and Welling [1], likening a Variational Autoencoder (VAE) to a friend who studies your fashion sense to create their own distinctive style. Most of the material here is drawn from that paper. A VAE is a generative model that learns a probability distribution over its training data and can then sample new data from it.
Your Quirky Friend: The Encoder
This friend, called "Q," encapsulates the essence of your style and records it in a notebook, symbolizing the latent space in a VAE. Q uses the encoder for this task. They observe your outfit to capture its essence. Instead of merely replicating your outfit, Q identifies the key features that make your style unique.
Q has several methods to capture the key features of your style using the encoder, which is a neural network and can take various forms, such as fully connected networks, Convolutional Neural Networks (CNNs) for image data, or Long Short-Term Memory (LSTM) networks for sequential data.
Once Q has captured the key features of your style, they record these insights in their notebook. This notebook, or latent space, is where Q stores the essence of your fashion sense. Let's take a closer look at what this latent space represents.
More concretely, the encoder is responsible for mapping the input data into a latent space representation using one of the neural network forms just mentioned. It compresses the input into a smaller latent variable space, typically characterized by a mean and a variance that define a probability distribution.
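A minimal sketch of such an encoder in PyTorch, assuming a fully connected network with hypothetical dimensions (64-dimensional inputs, an 8-dimensional latent space):

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Maps an input vector to the mean and log-variance of a Gaussian latent distribution."""
    def __init__(self, input_dim=64, hidden_dim=32, latent_dim=8):
        super().__init__()
        self.hidden = nn.Linear(input_dim, hidden_dim)
        self.mu = nn.Linear(hidden_dim, latent_dim)      # mean of q(z|x)
        self.logvar = nn.Linear(hidden_dim, latent_dim)  # log-variance of q(z|x)

    def forward(self, x):
        h = torch.relu(self.hidden(x))
        return self.mu(h), self.logvar(h)

encoder = Encoder()
x = torch.randn(4, 64)         # a batch of 4 inputs
mu, logvar = encoder(x)
print(mu.shape, logvar.shape)  # torch.Size([4, 8]) torch.Size([4, 8])
```

Predicting the log-variance rather than the variance itself is a common choice: it keeps the variance positive after exponentiation without needing a constraint.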
Q's Notebook: The Latent Space
Q's notebook represents the latent space in the VAE. It's where Q stores their notes, sketches, and ideas inspired by your fashion sense. This notebook contains the essence of your style, distilled into a set of key features. In the VAE, the latent space is a probabilistic representation of the input data. It's a distribution over the possible values of the latent variables, which capture the underlying patterns and structures in the data.
ELBO – Evidence Lower Bound
To make the VAE learn a meaningful latent space representation, the model optimizes the Evidence Lower Bound (ELBO). The ELBO has two main components: a reconstruction term, which measures how well the decoder can recreate the input data from the latent space, and a regularization term, which ensures the latent space distribution matches a prior distribution (usually a standard Gaussian distribution).
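In symbols, following Kingma and Welling [1], the ELBO for an input x can be written as:

```latex
\mathrm{ELBO}(\theta, \phi; x) =
\underbrace{\mathbb{E}_{q_\phi(z \mid x)}\!\left[\log p_\theta(x \mid z)\right]}_{\text{reconstruction term}}
\; - \;
\underbrace{D_{\mathrm{KL}}\!\left(q_\phi(z \mid x) \,\big\|\, p(z)\right)}_{\text{regularization term}}
```

Here q_φ(z|x) is the encoder's distribution, p_θ(x|z) is the decoder's, and p(z) is the prior. Maximizing the ELBO is equivalent to minimizing the VAE loss described later in this post.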
Reparameterization: Q's Creative Twist
Now, imagine Q wants to create a new outfit inspired by your style, but with a twist. Instead of directly sampling from their notebook, Q uses a clever trick: they sample features from a standard normal distribution (like a random fashion magazine) and then transform the sample using their notebook's parameters (like applying their fashion sense). This is the reparameterization trick used in VAEs. By sampling from a standard normal distribution and transforming the sample using the latent space's parameters, we can backpropagate through the sampling process and optimize the VAE's parameters efficiently.
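Q's trick is only a few lines of code. A sketch, assuming the encoder outputs a mean and a log-variance as above:

```python
import torch

def reparameterize(mu, logvar):
    # z = mu + sigma * eps, with eps ~ N(0, I).
    # The randomness lives in eps, outside the network, so gradients
    # can flow through mu and logvar during backpropagation.
    std = torch.exp(0.5 * logvar)
    eps = torch.randn_like(std)
    return mu + std * eps

mu = torch.zeros(4, 8)
logvar = torch.zeros(4, 8)  # log-variance 0, i.e. unit variance
z = reparameterize(mu, logvar)
print(z.shape)  # torch.Size([4, 8])
```

Sampling z directly from N(mu, sigma²) would make the sampling step non-differentiable; rewriting it as a deterministic function of (mu, logvar) plus external noise is what makes training possible.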
Q's Fashion Creations: The Decoder
Q will use their parameterized samples to create new outfits inspired by your fashion sense. Q's creations may not be exact replicas, but they capture the essence of your style.
This is the decoder in a VAE. The decoder takes the reparameterized samples from Q's notebook and uses them to generate new data samples similar to the original input data.
In the VAE, the decoder is trained to reconstruct the input data from the latent space. The decoder is another neural network that takes samples from the latent space and reconstructs the input data, mapping the latent variables back to the data space.
Key Functions
Encoder: Compresses input data into a latent representation.
Decoder: Reconstructs data from the latent representation.
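The decoder mirrors the encoder in reverse. A minimal sketch, with the same hypothetical dimensions used earlier (8-dimensional latent space, 64-dimensional outputs):

```python
import torch
import torch.nn as nn

class Decoder(nn.Module):
    """Maps latent vectors back to the data space."""
    def __init__(self, latent_dim=8, hidden_dim=32, output_dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, output_dim),
        )

    def forward(self, z):
        return self.net(z)

decoder = Decoder()
z = torch.randn(4, 8)   # a batch of latent samples
x_hat = decoder(z)
print(x_hat.shape)      # torch.Size([4, 64])
```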
Q's Fashion Evolution: Training the VAE
As Q continues to study your fashion sense, create new outfits, and refine their reparameterization trick, they develop a unique understanding of your style. This process is like training the VAE. During training, the VAE learns to optimize the encoder, latent space, and decoder. The goal is to find a balance between reconstructing the input data accurately and capturing the underlying patterns and structures in the data.
Training Objective
The VAE is trained to minimize the difference between the original input and the reconstructed output while also ensuring that the latent space follows a desired distribution (usually a Gaussian distribution).
This structure allows the VAE to learn meaningful representations of the data while also enabling effective generation of new data samples.
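Putting the pieces together, a compact training loop might look like the sketch below. The architecture and hyperparameters (dimensions, learning rate, random stand-in data) are illustrative assumptions, not the post's exact model:

```python
import torch
import torch.nn as nn

class VAE(nn.Module):
    def __init__(self, input_dim=64, hidden_dim=32, latent_dim=8):
        super().__init__()
        self.enc = nn.Linear(input_dim, hidden_dim)
        self.mu = nn.Linear(hidden_dim, latent_dim)
        self.logvar = nn.Linear(hidden_dim, latent_dim)
        self.dec = nn.Sequential(
            nn.Linear(latent_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, input_dim))

    def forward(self, x):
        h = torch.relu(self.enc(x))
        mu, logvar = self.mu(h), self.logvar(h)
        # Reparameterization trick: deterministic transform of external noise.
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
        return self.dec(z), mu, logvar

vae = VAE()
opt = torch.optim.Adam(vae.parameters(), lr=1e-3)
x = torch.randn(16, 64)  # stand-in batch of training data

for step in range(5):
    x_hat, mu, logvar = vae(x)
    recon = nn.functional.mse_loss(x_hat, x, reduction="sum")
    # Closed-form KL divergence between N(mu, sigma^2) and N(0, I).
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    loss = recon + kl
    opt.zero_grad()
    loss.backward()
    opt.step()
```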
The Result: A Quirky yet Stylish VAE
Through this process, Q develops a unique understanding of your fashion sense, which is reflected in their quirky yet stylish outfits. Similarly, the trained VAE can generate new data samples that capture the essence of the input data, while also introducing new and interesting variations.

Implementation Details Using PyTorch
The encoder/decoder-based Variational Autoencoder (VAE) shown in Figure 2.0 processes inputs consisting of sine waves. These sine waves have an amplitude of 1 unit and a single frequency, but the phase varies randomly within the range of [-π, +π]. The encoder maps these inputs to a latent representation, characterized by a mean and variance, which the decoder uses to reconstruct the sine waves.
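Such a dataset is easy to generate. A sketch with NumPy, where the wave length and frequency are assumptions (only the amplitude of 1 and the uniform phase in [-π, +π] come from the description above):

```python
import numpy as np

def make_sine_batch(n_waves=100, n_points=64, freq=1.0):
    # Each row is one sine wave: amplitude 1, single frequency,
    # phase drawn uniformly from [-pi, +pi].
    t = np.linspace(0, 1, n_points)
    phases = np.random.uniform(-np.pi, np.pi, size=(n_waves, 1))
    return np.sin(2 * np.pi * freq * t + phases)

batch = make_sine_batch()
print(batch.shape)  # (100, 64)
```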

VAE Loss Function
The VAE loss function [1] comprises two main components:
Reconstruction Term: Measures the discrepancy between the input data and its reconstructed version.
Regularization Term (Kullback-Leibler Divergence): Ensures that the latent space distribution aligns closely with a prior distribution.
By minimizing the VAE loss function, the model learns to encode input data into a meaningful latent space representation, decode this representation into a reconstructed version of the input data, and generate new data samples akin to the input data.
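The two terms combine into a single scalar loss. A sketch, assuming mean-squared error for the reconstruction term and the closed-form KL divergence against a standard Gaussian prior:

```python
import torch
import torch.nn.functional as F

def vae_loss(x_hat, x, mu, logvar):
    # Reconstruction term: discrepancy between input and reconstruction.
    recon = F.mse_loss(x_hat, x, reduction="sum")
    # Regularization term: KL divergence between the encoder's diagonal
    # Gaussian N(mu, sigma^2) and the standard normal prior N(0, I).
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + kl

x = torch.randn(4, 64)
x_hat = torch.randn(4, 64)
mu = torch.zeros(4, 8)
logvar = torch.zeros(4, 8)     # encoder output exactly matches the prior
loss = vae_loss(x_hat, x, mu, logvar)
print(loss.item() >= 0)  # True (the KL term is zero here, MSE is non-negative)
```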

Generating Synthetic Data Using VAEs
The steps for generating synthetic sine wave data using a trained VAE are as follows:
Set the VAE to evaluation mode after sufficient convergence.
Disable gradient tracking to save memory and computation.
Sample latent vectors from a standard normal distribution, following Q's creative twist described above.
Pass the sampled latent vectors through the decoder to generate synthetic sine waves.
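The steps above can be sketched in a few lines. The decoder here is a hypothetical stand-in; in practice you would use the decoder of your trained VAE:

```python
import torch
import torch.nn as nn

# Stand-in for a trained decoder (8-dim latent -> 64-point sine wave).
decoder = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 64))

decoder.eval()                # 1. evaluation mode after convergence
with torch.no_grad():         # 2. no gradient tracking during generation
    z = torch.randn(10, 8)    # 3. sample latent vectors from N(0, I)
    synthetic = decoder(z)    # 4. decode into synthetic sine waves
print(synthetic.shape)  # torch.Size([10, 64])
```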
Q's Synthetic Data Factory Powered by Circular Buffers
A circular buffer, often used for efficient data management, operates like a fixed-size, looping queue. In the context of a VAE system, it functions as follows:
The circular buffer comprises two components—the read buffer and the write buffer:
Read Buffer: This is where Q continuously inputs random fashion patterns drawn from various magazines. These patterns are stored sequentially in the buffer for processing by the VAE encoder. The read buffer ensures a steady flow of data to the encoder, enabling continuous operation.
Write Buffer: Once the read buffer reaches its capacity (i.e., it becomes full), the write buffer takes over and begins storing new incoming patterns. This mechanism allows the system to handle data without overflow or interruptions. The decoder subsequently reads patterns from the write buffer, transforming them into synthetic sine waves as part of the output.
The circular buffer operates such that when the end of the buffer is reached, it loops back to the beginning. This ensures optimal use of space, as no data is lost or wasted—old data is simply overwritten when the buffer cycles around.
In this implementation, the circular buffer enables efficient coordination between the encoder and decoder, ensuring a seamless flow of data and pattern decoding into synthetic sine waves. It is akin to a never-ending loop of pattern generation and transformation.
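The wrap-around behavior described above can be sketched with Python's `collections.deque`, whose `maxlen` option gives exactly this fixed-size, overwrite-the-oldest semantics (the class name and usage here are illustrative, not the post's actual code):

```python
from collections import deque

class CircularBuffer:
    """Fixed-size looping buffer: when full, new items overwrite the oldest."""
    def __init__(self, capacity):
        self.buf = deque(maxlen=capacity)

    def write(self, item):
        self.buf.append(item)  # silently drops the oldest item when at capacity

    def read_all(self):
        return list(self.buf)

buf = CircularBuffer(capacity=3)
for pattern in ["p1", "p2", "p3", "p4"]:
    buf.write(pattern)
print(buf.read_all())  # ['p2', 'p3', 'p4'] -- 'p1' was overwritten
```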
References
[1] Diederik P. Kingma and Max Welling (2019). An Introduction to Variational Autoencoders. Foundations and Trends in Machine Learning. arXiv:1906.02691 [cs.LG]. https://doi.org/10.48550/arXiv.1906.02691