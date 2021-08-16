One of the most common problems natural language processing (NLP) projects face is a lack of labeled data. Labeling data is expensive, time-consuming, and tedious. Data augmentation techniques help us build better models by reducing overfitting and making the models more robust. In this post I will cover how we can use the transformers library and pre-trained models like BERT, GPT-2, and T5 to easily augment our text data.

I also want to mention an interesting paper on Unsupervised Data Augmentation (UDA) from researchers at Google. They showed that with only 20 labeled examples, data augmentation combined with other techniques let their model outperform the then state-of-the-art models on the IMDB dataset. The same technique also shows good results on image classification tasks. Here are the links for the blog post, paper, and GitHub code for UDA. Some of this work is based on the AutoAugment paper.