AI/ML
Image Description Generator Model
This project builds an image caption generator that produces concise descriptions of images, with the aim of assisting visually impaired users and supporting multilingual communication. The model pairs a ResNeXt encoder with an LSTM decoder and is trained on the Flickr8k dataset; it performs well on certain image types. Planned improvements include training on more diverse datasets and adding attention mechanisms.
Technologies Used
Python, PyTorch, NumPy, Pandas, Matplotlib, Jupyter Notebooks, Google Colab