Demystifying the CLIP Model

Sonam Tripathi
14 min read · Sep 23, 2023

The medical field is advancing rapidly, thanks to remarkable progress in NLP and computer vision. Staying up to date with these trends can be challenging, but it is equally rewarding to watch each breakthrough help unravel the intricate biological and chemical mechanisms that lie beneath, all with the help of AI.

Photo by That’s Her Business on Unsplash

Recently I was reading a paper from Google’s team on Embeddings for Language/Image-aligned X-rays, which I plan to summarize in a future post. While reading it, I came across the CLIP model, which the team uses in that work, and decided to explore this model in detail first. I must say the CLIP paper is one of the most impressive I’ve come across, even though it is quite lengthy.

This article focuses on the internal workings of the CLIP model, its implementation and testing, and finally its real-life applications.

So let’s dive in :)

Quick Navigation:

  1. Background
  2. Detailed discussion on the Architecture of the Model
  3. Training Results
  4. Drawbacks of the Model

1. Background


Written by Sonam Tripathi

Sr. Associate Manager @Lilly | Researcher | Full-time Learner
