Poster

TenGAN: Pure Transformer Encoders Make an Efficient Discrete GAN for De Novo Molecular Generation

Chen Li · Yoshihiro Yamanishi

2024 Poster

[ Poster]

Abstract

Deep generative models for de novo molecular generation using discrete data, such as the simplified molecular-input line-entry system (SMILES) strings, have attracted widespread attention in drug design. However, training instability often plagues generative adversarial networks (GANs), leading to problems such as mode collapse and low diversity. This study proposes a pure transformer encoder-based GAN (TenGAN) to solve these issues. The generator and discriminator of TenGAN are variants of the transformer encoders and are combined with reinforcement learning (RL) to generate molecules with the desired chemical properties. Besides, data augmentation of the variant SMILES is leveraged for the TenGAN training to learn the semantics and syntax of SMILES strings. Additionally, we introduce an enhanced variant of TenGAN, named Ten(W)GAN, which incorporates mini-batch discrimination and Wasserstein GAN to improve the ability to generate molecules. The experimental results and ablation studies on the QM9 and ZINC datasets showed that the proposed models generated highly valid and novel molecules with the desired chemical properties in a computationally efficient manner.

Chat is not available.