Cut Latency for Image & Video AI Models: A Guide to Multimodal Caching