nonprofit applied ml lab, making a real impact
ABOUT US
Who we are: Our story
Cyrion Labs was founded to bridge the gap between cutting-edge AI research and real-world applications. Our team of researchers, engineers, and innovators works at the intersection of artificial intelligence, machine learning, and computational sciences to drive ethical and impactful technological advancements.
We focus on computational research for social good, partnering with institutions to create AI-driven solutions that address societal challenges, such as safer internet access for students, bias reduction in AI, and accessibility tools for underserved communities. Our research also extends into natural language processing and generative AI, exploring the frontiers of AI-generated content, language models, and human-computer interaction.
Collaboration is at the core of our mission. We work with universities, independent researchers, and industry partners to produce high-impact studies, with plans to publish in leading conferences such as NeurIPS, CVPR, and ICLR. Our work has already contributed to large-scale projects, including a partnership with a public school district of 66,000 students to develop safer internet access solutions.
Our Specialties
What we do
PROJECTS
Explore our repository of projects by our talented team here at Cyrion
GenECA: A General-Purpose Framework for Real-Time Adaptive Multimodal Embodied Conversational Agents
Santosh Patapati, Trisanth Srinivasan
We present GenECA, a general-purpose framework for real-time multimodal interaction with embodied conversational agents. GenECA is the first ECA system to deliver context-aware speech and well-timed animations in real time without relying on human operators. Its modular design supports a wide variety of applications, such as education, customer service, and therapy.
Research Demonstration accepted at the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2025
Towards Leveraging Semantic Web Technologies for Automated UI Element Annotation
Trisanth Srinivasan
This paper details a Chrome extension for automated web UI element annotation using Semantic Web technologies. The system is designed to support Visual Language Model (VLM) based web agents, providing them with distilled, semantically structured information about the current page. It uses NLP models, vector embeddings, and FAISS for fast similarity searches to produce structured JSON annotations from unstructured visual web content. The implementation covers DOM observation, API interactions for semantic annotation and embedding generation, and BLEU-based evaluation. Evaluations show near-real-time performance, demonstrating its potential for assisting VLM-first web development.
Research Paper accepted at the IEEE International Conference on Inventive Computation Technologies 2025
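For a flavor of the embedding-and-retrieval step the abstract describes, here is a minimal Python sketch. The sentence-transformers model and the label vocabulary are illustrative assumptions, not the paper's actual configuration:

```python
# Minimal sketch: embed a UI element's text, look up its closest
# semantic role with FAISS, and emit a structured JSON annotation.
import json
import faiss
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # hypothetical embedding model

# A small vocabulary of semantic roles to match elements against (illustrative).
labels = ["search box", "login button", "navigation link", "product image"]
label_vecs = model.encode(labels, normalize_embeddings=True)

index = faiss.IndexFlatIP(label_vecs.shape[1])  # inner product on unit vectors = cosine
index.add(np.asarray(label_vecs, dtype="float32"))

def annotate(element_text: str) -> str:
    """Return a JSON annotation for one DOM element's visible text."""
    query = model.encode([element_text], normalize_embeddings=True)
    scores, ids = index.search(np.asarray(query, dtype="float32"), 1)
    return json.dumps({
        "text": element_text,
        "role": labels[int(ids[0][0])],
        "confidence": float(scores[0][0]),
    })

print(annotate("Sign in to your account"))
```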
VIZ: Virtual & Physical Navigation System for the Visually Impaired
Santosh Patapati, Trisanth Srinivasan
While existing navigation tools, both physical and digital, are common, they remain fragmented and often lack proper accessibility support. VIZ addresses these gaps using generative AI to mimic human reasoning, enabling digital tasks through voice and human-like memory via vector databases and similarity search. VIZ offers scene descriptions, real-time collision avoidance, voice and vibration-based navigation, and memory-based task execution. Built with a custom 3D-printed chassis, it features LiDAR for obstacle detection, a microphone and speaker for voice interaction, motors for tactile feedback, and a Raspberry Pi for processing. Its digital assistant follows a three-step model: determining user intent, leveraging a modified ReAct architecture, and dynamically refining responses—effectively mimicking human decision-making to enhance digital accessibility.
Research Demonstration accepted at the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2025
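The three-step assistant model can be sketched as a small control loop. The `llm` call and the tool registry below are hypothetical placeholders for whatever model and device actions back the system, not VIZ's actual interfaces:

```python
# Sketch of the three-step loop: classify intent, run a ReAct-style
# reason/act cycle against a tool registry, then refine for speech.
from typing import Callable

def llm(prompt: str) -> str:  # placeholder for the real model call
    raise NotImplementedError

TOOLS: dict[str, Callable[[str], str]] = {
    # e.g. "describe_scene": ..., "recall_memory": ..., "click": ...
}

def handle(utterance: str) -> str:
    # Step 1: determine the user's intent from the voice transcript.
    intent = llm(f"Classify the intent of: {utterance!r}")

    # Step 2: ReAct-style loop, alternating reasoning and tool calls
    # until the model emits a final answer.
    scratchpad = f"Intent: {intent}\nTask: {utterance}"
    for _ in range(8):  # cap the number of reasoning steps
        step = llm(scratchpad + "\nNext action (tool: arg) or FINAL: ...")
        if step.startswith("FINAL:"):
            draft = step.removeprefix("FINAL:").strip()
            break
        tool, _, arg = step.partition(":")
        observation = TOOLS[tool.strip()](arg.strip())
        scratchpad += f"\n{step}\nObservation: {observation}"
    else:
        draft = "I couldn't complete that task."

    # Step 3: dynamically refine the draft into a concise spoken response.
    return llm(f"Rewrite for speech, briefly: {draft}")
```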
PhysNav-DG: A Novel Adaptive Framework for Robust VLM-Sensor Fusion in Navigation Applications
Trisanth Srinivasan, Santosh Patapati
PhysNav-DG is a novel navigation framework combining classical sensor fusion with vision-language models for accurate and transparent decision-making. Its dual-branch architecture predicts navigation actions and generates chain-of-thought explanations. A context-aware Adaptive Kalman Filter refines state estimation using sensor data and semantic cues from models like LLaMA 3.2 11B and BLIP-2. We also introduce MD-NEX, a multi-domain benchmark spanning indoor, driving, and social navigation with ground-truth actions and human-rated explanations. PhysNav-DG boosts navigation success by over 20%, with grounded (score: 0.87) and clear (4.5/5) explanations—bridging semantic reasoning and geometric planning for safer autonomous systems.
Research Paper accepted at the DG-ERBF workshop, IEEE/CVF Conference on Computer Vision and Pattern Recognition 2025
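As a toy illustration of the context-aware filtering idea, the scalar Kalman step below scales measurement noise by a semantic confidence score, so observations the vision-language model trusts correct the state more strongly. The scaling rule is an assumption for illustration, not PhysNav-DG's actual formula:

```python
# One predict/update cycle of a scalar Kalman filter whose measurement
# noise adapts to a semantic confidence score in [0, 1].
def adaptive_kalman_step(x, P, z, q=0.01, r=1.0, vlm_conf=0.5):
    """x, P: prior estimate and variance; z: new measurement;
    q, r: process and base measurement noise; vlm_conf: semantic confidence."""
    # Predict (constant-state model).
    P = P + q
    # Adapt: high semantic confidence shrinks the effective measurement noise.
    r_eff = r * (1.0 - 0.9 * vlm_conf)
    # Update.
    K = P / (P + r_eff)    # Kalman gain
    x = x + K * (z - x)    # corrected estimate
    P = (1.0 - K) * P      # corrected variance
    return x, P

x, P = 0.0, 1.0
for z, conf in [(0.8, 0.2), (1.1, 0.9), (0.9, 0.9)]:
    x, P = adaptive_kalman_step(x, P, z, vlm_conf=conf)
    print(f"estimate={x:.3f} variance={P:.4f}")
```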
CPS-Guard: Multi-Role Orchestration System for Dependability Assurance of AI-Enhanced Cyber-Physical Systems
Trisanth Srinivasan, Santosh Patapati, Himani Musku, Idhant Gode, Aditya Arora, Abubakr Nazriev, Sanika Hirave, Zaryab Kanjiani, Srinjoy Ghose
Cyber-Physical Systems (CPS) increasingly depend on advanced AI techniques to operate in critical applications. However, traditional verification and validation methods often struggle to handle the unpredictable and dynamic nature of AI components. In this paper, we introduce CPS-Guard, a novel framework that employs multi-role orchestration to automate the iterative assurance process for AI-powered CPS. By assigning specialized roles (e.g., safety monitoring, security assessment, fault injection, and recovery planning) to dedicated agents within a simulated environment, CPS-Guard continuously evaluates and refines AI behavior against a range of dependability requirements. We demonstrate the framework through a case study involving an autonomous vehicle navigating an intersection with an AI-based planner. Our results show that CPS-Guard effectively detects vulnerabilities, manages performance impacts, and supports adaptive recovery strategies, thereby offering a structured and extensible solution for rigorous V&V in safety- and security-critical systems.
Research Paper accepted at the VERDI workshop, 55th Annual IEEE/IFIP International Conference on Dependable Systems and Networks
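A minimal sketch of the multi-role orchestration loop follows. The role set matches the abstract; the agent interfaces and the toy intersection state are illustrative assumptions:

```python
# Toy orchestration loop: dedicated role agents iteratively evaluate
# and repair a simulated system under test (SUT).
from dataclasses import dataclass, field

@dataclass
class Finding:
    role: str
    issue: str

@dataclass
class SystemUnderTest:
    speed: float = 12.0          # toy state for an intersection scenario
    sensor_ok: bool = True
    findings: list[Finding] = field(default_factory=list)

def safety_monitor(sut):
    if sut.speed > 10.0:
        sut.findings.append(Finding("safety", "speed above intersection limit"))

def fault_injector(sut):
    sut.sensor_ok = False        # inject a sensor dropout

def security_assessor(sut):
    if not sut.sensor_ok:
        sut.findings.append(Finding("security", "unvalidated sensor channel"))

def recovery_planner(sut):
    if sut.findings:
        sut.speed = 5.0          # fall back to a conservative policy
        sut.sensor_ok = True

ROLES = [safety_monitor, fault_injector, security_assessor, recovery_planner]

sut = SystemUnderTest()
for iteration in range(2):       # iterative assurance: evaluate, then re-check
    for role in ROLES:
        role(sut)
    print(iteration, sut.findings)
    sut.findings.clear()
```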
we're always looking for talented people, no matter your background, education, or age