Challenge
CCTV cameras are a rich source of information that is not easily captured in tabular data. Some events, such as shoplifting, happen too rarely to justify dedicated tracking, yet finding them manually takes an operator a lot of time.
Solution
We built a multimodal (text + vision) agent that responds to the user’s prompt and automatically finds events of interest. When needed, the agent can also build a dataframe of relevant information for further analysis. Everything is controlled through a natural-language prompt.
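To make the flow from prompt to table concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the actual implementation: the `Event` fields, the `find_events` helper, the frame dictionaries, and the `keyword_matcher` stand-in (a naive keyword check used in place of the real multimodal model) are all hypothetical names introduced for this example.

```python
from dataclasses import dataclass, asdict
from typing import Callable, Iterable


@dataclass
class Event:
    """One event of interest found in the footage (hypothetical schema)."""
    camera_id: str
    timestamp_s: float
    description: str


# Hypothetical signature for the model call: (prompt, frame) -> is it a match?
Matcher = Callable[[str, dict], bool]


def find_events(prompt: str, frames: Iterable[dict], matches: Matcher) -> list[Event]:
    """Return one Event for every frame the matcher flags for the prompt."""
    return [
        Event(f["camera_id"], f["timestamp_s"], f["caption"])
        for f in frames
        if matches(prompt, f)
    ]


def keyword_matcher(prompt: str, frame: dict) -> bool:
    """Stand-in for the multimodal model: naive keyword lookup in the caption."""
    words = {w.lower() for w in prompt.split() if len(w) > 3}
    return any(w in frame["caption"].lower() for w in words)


# Made-up sample frames; in practice these would come from the video pipeline.
frames = [
    {"camera_id": "cam-1", "timestamp_s": 12.0, "caption": "customer browsing shelves"},
    {"camera_id": "cam-2", "timestamp_s": 47.5, "caption": "person concealing item in bag"},
    {"camera_id": "cam-1", "timestamp_s": 90.0, "caption": "staff restocking shelves"},
]

events = find_events("find people concealing items", frames, keyword_matcher)

# Plain rows for further analysis; pandas.DataFrame(rows) would give a dataframe.
rows = [asdict(e) for e in events]
```

Passing the matcher in as a function keeps the search loop independent of whichever model actually scores the frames, so the keyword stub can be swapped for a real vision-language call without changing the rest of the sketch.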
Benefits
- Faster analysis results for atypical queries
- No-code insights
- Automated, round-the-clock (24/7) analysis for typical queries