Semantic Caching for AI Application Performance: Speed Up Responses, Cut Costs