A deep dive into the Transformer, a neural network architecture introduced in the famous 2017 paper "Attention Is All You Need": its applications, impacts, challenges and future directions
Artificial Neural Networks have reached 'Grandmaster' and even 'super-human' performance across a variety of games, from those involving perfect information, such as Go (Silver et al., 2016), to those involving imperfect information, such as StarCraft (Vinyals et al., 2019). Such technological developments from AI labs have ushered in concomitant applications across the world of business, where an 'AI' brand-tag is fast becoming ubiquitous. A corollary of such widespread commercial deployment is that when AI gets things wrong - an autonomous vehicle crashes; a chatbot exhibits 'racist' behaviour; automated credit-scoring processes 'discriminate' on gender etc. - there are often significant financial, legal and brand consequences, and the incident becomes major news. As Judea Pearl sees it, the underlying reason for such mistakes is that "…all the impressive achievements of deep learning amount to just curve fitting". The key, Pearl suggests (Pearl and