
Research Blog


In rescue scenarios, locating and understanding human motion quickly can make the difference between a successful intervention and a delayed response. Yet cameras, LiDAR, and wearable sensors often struggle in smoke-filled rooms, collapsed buildings, dark environments, or areas where direct line of sight is blocked. This is where WiFi sensing offers a promising alternative. By analyzing how human movement alters wireless signal propagation, it becomes possible to estimate body posture without requiring a camera-based view of the person.



Human Identification Procedure


Channel State Information (CSI) is a measurement that captures how the wireless channel affects each subcarrier differently, in both amplitude and phase. The signal at the receiver can be expressed as the transmitted signal scaled by the CSI, plus noise.
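In the standard narrowband channel model (a textbook formulation rather than anything specific to this framework), the received signal on subcarrier $i$ is:

```latex
% y_i: received symbol, x_i: transmitted symbol, n_i: additive noise
% H_i: complex CSI value, carrying per-subcarrier amplitude and phase
y_i = H_i \, x_i + n_i, \qquad H_i = |H_i| \, e^{j \angle H_i}
```

Human motion perturbs $H_i$, and those perturbations are what the pipeline below extracts and learns from.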


The proposed framework uses Channel State Information (CSI) as the main sensing source. Human presence and motion affect the wireless channel, creating measurable variations in signal amplitude and phase. These raw CSI signals are first collected and then passed through a structured preprocessing pipeline that improves signal quality and extracts the most relevant motion information. This includes antenna selection, outlier removal, discrete wavelet transform (DWT)-based processing, and feature extraction. The goal is to decompose the spatial and temporal characteristics of the WiFi channel into informative representations that can support robust pose estimation even in complex indoor environments.
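The post does not specify the wavelet family or thresholds used, so as a minimal sketch of the DWT-based step, a one-level Haar transform with hard thresholding of the detail coefficients (all values illustrative) could look like this:

```python
import numpy as np

def haar_dwt(x):
    """One-level Haar DWT: returns (approximation, detail) coefficients."""
    x = np.asarray(x, dtype=float)
    if len(x) % 2:                      # pad to even length
        x = np.append(x, x[-1])
    approx = (x[0::2] + x[1::2]) / np.sqrt(2)
    detail = (x[0::2] - x[1::2]) / np.sqrt(2)
    return approx, detail

def denoise_csi_amplitude(amplitude, threshold=0.5):
    """Zero out small detail coefficients (noise), keep motion trends."""
    approx, detail = haar_dwt(amplitude)
    detail = np.where(np.abs(detail) > threshold, detail, 0.0)
    # inverse one-level Haar transform
    rec = np.empty(2 * len(approx))
    rec[0::2] = (approx + detail) / np.sqrt(2)
    rec[1::2] = (approx - detail) / np.sqrt(2)
    return rec
```

In practice a deeper multi-level decomposition (e.g. via a wavelet library) would be used; the principle of separating slow motion trends from high-frequency noise is the same.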


Human Identification Procedure

It begins with CSI raw data acquisition, where both amplitude and phase are extracted from the WiFi signals. These measurements are then refined through data preprocessing and transformed into features suitable for learning. Finally, the neural network predicts the human skeletal configuration and reconstructs a 3D human body representation. This pipeline shows how wireless sensing can move beyond simple activity recognition toward detailed human pose recovery.

At the core of the method is a neural network architecture designed to reconstruct the human skeleton from the processed CSI data. The model combines an encoder-decoder structure with a Spatial Orientation Attention Module (SOA), allowing it to better capture body orientation and motion relationships across the wireless channel. This enables the system to infer a structured human pose representation and map it to a full-body model such as SMPL-X, providing a richer reconstruction of the person’s posture.
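The exact structure of the SOA module is not given here; as a hedged sketch, its core idea of attention-based re-weighting across antenna-link features (function name and shapes hypothetical) can be expressed as:

```python
import numpy as np

def spatial_attention(features, d_k=None):
    """Self-attention over per-link CSI feature vectors.

    features: (n_links, d) array, one embedding per antenna link.
    Returns re-weighted features that emphasize the links whose channel
    response is most informative about body orientation.
    """
    q = k = v = np.asarray(features, dtype=float)
    d = d_k or q.shape[-1]
    scores = q @ k.T / np.sqrt(d)                   # pairwise link affinity
    scores -= scores.max(axis=-1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ v
```

A trained module would use learned query/key/value projections rather than the identity mapping shown here.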


Workflow for WiFi-based human pose estimation using CSI data and neural networks.
On the left, CSI data collection combined with camera-based pose information is shown, where multiple wireless links capture signal variations caused by a human subject. In the center, the collected data is used for neural network design, which learns the relationship between CSI measurements and human body posture. The output of the network is a human skeleton reconstruction and SMPL-X body model, shown as both a skeletal representation and a 3D human figure. On the right, the figure highlights the dataset structure, including a sample 3×3 CSI link matrix, a pose skeleton, and the corresponding CSI amplitude signal, illustrating how wireless signal patterns are linked to body movement and posture.

Experimental Setup


The figure shows the experimental setup and example outputs: sample skeleton predictions alongside CSI amplitude signals captured across multiple wireless links. These examples demonstrate how different human poses, such as walking or crouching, produce distinguishable channel responses. The comparison between ground-truth and predicted skeletons provides visual evidence that CSI-based sensing can track meaningful posture changes with good spatial consistency. This approach is especially relevant for search-and-rescue operations, where first responders need situational awareness in environments that are unsafe or visually inaccessible. A WiFi-based human pose estimation system could help detect trapped survivors, infer whether a person is standing, lying down, or crouched, and support faster decision-making without intrusive sensing hardware. By leveraging existing wireless infrastructure, the method also opens the door to low-cost, privacy-preserving monitoring in emergency response, healthcare, and smart building applications.


Experimental examples show the relationship between CSI amplitude variations and predicted skeletal motion under different activities.

Takeaway

CSI-channel spatial decomposition shows that WiFi signals can be transformed into meaningful human pose information, enabling camera-free and privacy-preserving body reconstruction in difficult rescue environments. By combining CSI preprocessing, spatial feature learning, and lightweight neural modeling, this approach offers a practical path toward detecting and understanding human posture in smoke, darkness, debris, or non-line-of-sight conditions where conventional vision systems often fail.


  • Carlos Osorio
  • Mar 30
  • 3 min read


Autonomous flight in GPS-denied environments is one of the most demanding challenges in modern robotics. Drones operating indoors, in forests, near collapsed infrastructure, inside industrial plants, or in dense urban canyons cannot depend on satellite navigation for stable localization and mission execution. In these scenarios, the aircraft must perceive the scene, interpret the mission, reason about obstacles, and generate safe actions using onboard intelligence.

A promising solution is the LLM-enabled agent stack for denied flight, where large language model reasoning is combined with perception, planning, control, and multi-agent coordination. Rather than using a single monolithic autonomy block, this architecture organizes the navigation system as a hierarchy of task-specific agents that cooperate through shared scene understanding and mission intent.


Why denied flight needs intelligent agent stacks

In denied environments, the drone faces several simultaneous difficulties:

  • no continuous GPS position reference

  • limited visibility caused by smoke, dust, shadows, or clutter

  • narrow corridors and tight obstacle spacing

  • uncertain or changing terrain structure

  • dynamic mission goals and time pressure

Traditional navigation pipelines can handle parts of this problem, but they often struggle when the mission requires higher-level interpretation. A drone may detect obstacles and estimate motion, yet still fail to decide the best route, adapt to new objectives, or coordinate with other drones. This is where an LLM-enabled architecture becomes valuable. The language model acts as a reasoning layer that connects mission context, scene interpretation, route selection, and action priorities.


Architecture overview


The proposed architecture organizes denied-flight autonomy into a set of interacting task-specific modules. Each module contributes a dedicated function while exchanging information through structured scene and mission representations. The process begins with a perception layer, which extracts environmental cues from RGB images, depth information, and semantic masks. This layer identifies obstacles, free space, structures, and local scene features required for navigation.


The output of the perception layer is passed into a scene model, which constructs a compact representation of walls, hazards, traversable regions, and 3D geometry. This model provides a grounded description of the environment that can be used by both planning and reasoning modules. A prompt-driven task planning block introduces mission-level flexibility. Instead of relying only on hard-coded rules, the system can interpret high-level instructions such as corridor preference, inspection priority, formation behavior, or fallback conditions. This enables faster adaptation to new tasks and changing environments.
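A prompt-driven planner has to turn the scene model and mission intent into text the LLM can act on. The fields and schema below are hypothetical, a minimal illustration of the pattern rather than the system's actual interface:

```python
from dataclasses import dataclass

@dataclass
class SceneSummary:
    """Compact scene model handed to the planner (fields hypothetical)."""
    hazards: list
    corridors: list
    battery_pct: float

def build_planning_prompt(mission: str, scene: SceneSummary) -> str:
    """Render mission intent plus scene state into an LLM planning prompt."""
    return (
        f"Mission: {mission}\n"
        f"Known hazards: {', '.join(scene.hazards) or 'none'}\n"
        f"Open corridors: {', '.join(scene.corridors) or 'none'}\n"
        f"Battery: {scene.battery_pct:.0f}%\n"
        'Respond with JSON: {"route": <corridor>, "speed": <m/s>, '
        '"fallback": <action>}'
    )
```

Keeping the prompt structured and the response format constrained makes the LLM's output easy to validate before it reaches the flight stack.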


LLM agent navigation enables a drone to interpret mission context, analyze visual and state information, and make adaptive flight decisions in real time. By combining high-level reasoning with trajectory tracking and safety control, the agent can follow goals, avoid obstacles, and maintain stable navigation in dynamic or GPS-denied environments.

At the center of the architecture is the LLM reasoning layer. The LLM processes mission context, scene abstractions, and vehicle status to generate structured decisions that guide downstream modules. Its role is strategic rather than low-level. It may recommend path preferences, task priorities, coordination changes, or safety-aware responses when uncertainty increases. For team operations, a mission coordinator allocates roles and synchronizes multiple drones. This module supports cooperative exploration, target search, area partitioning, and mission reallocation. The LLM enhances this process by adding semantic mission awareness to geometric planning. The route selector and waypoint builder translate mission intent into executable subgoals. These modules select candidate paths and generate feasible intermediate waypoints that satisfy environmental and mission constraints.
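Because the LLM's role is strategic rather than low-level, its recommendations should be validated before they influence flight. A minimal guard (the schema and limits are assumptions, not from the post) might look like:

```python
import json

def parse_llm_decision(text, corridors, v_max=2.0):
    """Validate an LLM route recommendation before it reaches the planner.

    Rejects hallucinated corridor names and clamps the requested speed.
    Falls back to a conservative default when the output is unusable.
    """
    decision = json.loads(text)
    if decision.get("route") not in corridors:
        return {"route": corridors[0], "speed": 0.5, "reason": "fallback"}
    decision["speed"] = max(0.0, min(float(decision.get("speed", 0.5)), v_max))
    return decision
```

This keeps the reasoning layer advisory: downstream modules only ever see decisions that are already feasible.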


A local guidance and safety layer then ensures physically safe execution. This block combines path-following control, obstacle avoidance, speed regulation, and action filtering. It acts as a safety shield between high-level reasoning and flight execution. Finally, a command aggregator fuses outputs from planning, coordination, and local control into a consistent kinematic command for the drone or drone group.
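The aggregation step can be sketched as a danger-weighted blend of the planner and avoidance velocities followed by a speed clamp; the gains and interface are illustrative, not the post's implementation:

```python
import numpy as np

def aggregate_command(planner_vel, avoid_vel, danger, v_max=2.0):
    """Blend planner and avoidance velocities; clamp to a safe speed.

    danger in [0, 1]: 0 = free space (follow the planner),
    1 = imminent obstacle (follow avoidance). Values are illustrative.
    """
    planner_vel = np.asarray(planner_vel, float)
    avoid_vel = np.asarray(avoid_vel, float)
    cmd = (1.0 - danger) * planner_vel + danger * avoid_vel
    speed = np.linalg.norm(cmd)
    if speed > v_max:                  # action filtering / speed regulation
        cmd *= v_max / speed
    return cmd
```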


Functional significance


The main advantage of this stack lies in its separation of responsibilities. Perception grounds the system in real observations. The scene model organizes the environment into a usable structure. The LLM introduces contextual decision-making. Planning modules convert intent into motion objectives. Safety and control ensure that all generated commands remain feasible and secure. This layered design improves robustness in denied-flight conditions because the system is not dependent on a single navigation source. If localization becomes uncertain, mission logic can trigger fallback behaviors. If the environment changes, the planner can re-route. If multiple drones are available, coordination can be adjusted dynamically.


LLM-based multi-UAV mission planning system that converts user commands and external intelligence into coordinated flight tasks and formation control.

The LLM-enabled agent stack for denied flight provides a modular and intelligent framework for autonomous aerial navigation in challenging environments. By combining perception, scene understanding, prompt-driven planning, mission coordination, and safety-constrained control, the architecture supports adaptive and cooperative drone behavior when GPS is not available. This approach represents a strong direction for next-generation aerial autonomy, especially in missions that require resilience, flexibility, and higher-level reasoning.


  • Carlos Osorio
  • Mar 29
  • 3 min read

Modern autonomous drones are often designed assuming GPS will always be available. In practice, however, many real-world missions take place in environments where satellite signals are weak, blocked, reflected, or intentionally jammed. Indoor facilities, dense urban areas, forests, tunnels, underground spaces, and disaster zones all create serious challenges for reliable positioning and navigation.




Why GPS-denied navigation matters

When GPS is unavailable, a drone can no longer rely on a stable global position estimate to follow routes, avoid obstacles, or complete its mission safely. In these situations, the system must estimate its motion, interpret the environment, and make flight decisions using onboard sensing and intelligent control.

A robust GPS-denied solution is especially important for applications such as:

  • search and rescue in collapsed or smoke-filled areas

  • warehouse and factory inspection

  • forest exploration and environmental monitoring

  • indoor logistics and inventory missions

  • autonomous surveillance in cluttered environments

In all of these cases, navigation must remain stable, adaptive, and safe even when external positioning is lost.


A multi-agent architecture for autonomous navigation

One promising solution is to organize the drone intelligence as a federation of task-specific agents. Instead of relying on a single monolithic algorithm, the navigation system can be divided into specialized modules that cooperate through a shared internal state. This type of architecture improves modularity, interpretability, and robustness. Each agent focuses on one essential function, while the overall system integrates perception, localization, mapping, planning, and control.




1. Perception agent

The perception agent is responsible for understanding what the drone sees. Using onboard cameras and other sensors, it detects:

  • obstacles

  • structural features

  • open corridors

  • free space for safe flight

This agent provides the environmental awareness needed for navigation in cluttered scenes. In GPS-denied spaces, reliable perception becomes the first step toward safe autonomy.


2. Localization agent

Without GPS, the drone must estimate its own position using internal and visual measurements. The localization agent can combine information from:

  • IMU data

  • visual odometry

  • LiDAR cues

  • fiducial markers

  • beacons or local anchors

A visual-inertial neural network is a strong option here, since it can fuse camera motion and inertial measurements to provide more accurate pose estimation under dynamic conditions.
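As a simplified stand-in for the learned visual-inertial fusion described above (gains and interface are assumptions), a single predict-and-correct step looks like:

```python
import numpy as np

def vio_step(pose, vel, accel, dt, vo_pose=None, gain=0.05):
    """One visual-inertial update (illustrative, not the learned network).

    Integrate IMU acceleration for a high-rate prediction, then nudge the
    estimate toward the visual-odometry pose when a frame is available.
    """
    vel = vel + np.asarray(accel, float) * dt       # IMU integration
    pose = np.asarray(pose, float) + vel * dt       # dead reckoning
    if vo_pose is not None:                         # drift correction
        pose = pose + gain * (np.asarray(vo_pose, float) - pose)
    return pose, vel
```

A learned fusion network effectively replaces the fixed `gain` with context-dependent weighting, trusting vision more when features are rich and the IMU more during fast motion.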


3. Mapping agent

The mapping agent continuously updates a local representation of the environment. It identifies traversable regions, obstacle locations, and route constraints based on what the drone has observed so far. This local map does not need to be perfect or global. It only needs to be accurate enough to support short-horizon planning and safe navigation. In unfamiliar environments, this ability to build an online map is essential.


4. Planning agent

The planning agent decides how the drone should move. Based on the current mission goal and the shared environmental knowledge, it selects:

  • path direction

  • motion speed

  • waypoint sequence

  • fallback or recovery behaviors

This agent allows the drone to adapt its route when conditions change, obstacles appear, or uncertainty increases.


5. Safety and control agent

Even with good perception and planning, a drone still needs stable control. The safety and control agent ensures that the generated commands remain feasible and safe for the vehicle. It constrains aggressive actions, stabilizes the flight response, and handles disturbances. Advanced control strategies such as ADRC can improve robustness by compensating for model uncertainty and external perturbations, which is particularly valuable in real-world indoor or cluttered missions.
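The heart of ADRC is the extended state observer, which estimates a lumped "total disturbance" so the controller can cancel it. A minimal linear ESO for a second-order plant (gains illustrative) can be sketched as:

```python
import numpy as np

def eso_step(z, y, u, dt, b0=1.0, betas=(3.0, 3.0, 1.0)):
    """Linear extended state observer used in ADRC (gains illustrative).

    z = [position est., velocity est., total-disturbance est.]; y is the
    measured position, u the control input. The third state lumps model
    error and external perturbations into one term the controller cancels.
    """
    z1, z2, z3 = z
    e = z1 - y                                   # observer error
    z1 += dt * (z2 - betas[0] * e)
    z2 += dt * (z3 + b0 * u - betas[1] * e)
    z3 += dt * (-betas[2] * e)
    return np.array([z1, z2, z3])
```

Because wind gusts and unmodeled aerodynamics all fold into the estimated disturbance, the same controller tuning survives conditions it was never explicitly modeled for.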


The role of a shared belief state

At the center of this architecture is a shared belief state. This internal representation acts as the common language between all agents. It can contain:

  • estimated pose

  • uncertainty level

  • local map information

  • mission intent

  • confidence in observations

By updating this belief state continuously, the system enables all agents to work together in a coordinated way. Perception informs mapping, localization updates motion estimates, planning selects the next action, and control enforces safe execution.
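The coordination pattern above can be sketched as a shared dataclass that each agent reads and updates in turn; the field names and toy agents are hypothetical:

```python
from dataclasses import dataclass, field

@dataclass
class BeliefState:
    """Shared belief state read and written by every agent (fields illustrative)."""
    pose: tuple = (0.0, 0.0, 0.0)
    uncertainty: float = 1.0
    obstacles: list = field(default_factory=list)
    mission_goal: str = "explore"
    next_action: str = "hover"

def tick(belief, agents):
    """One control cycle: each agent reads and updates the shared belief."""
    for agent in agents:   # perception -> localization -> mapping -> planning -> control
        agent(belief)
    return belief

# Toy agents showing the update pattern
def perception(b):
    b.obstacles = ["wall_ahead"]

def planner(b):
    b.next_action = "turn_left" if b.obstacles else "forward"
```

Running the agents against one shared object, rather than passing messages point-to-point, is what lets any module react to any other's updates within a single cycle.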


Benefits of the proposed approach

A task-specific agent architecture offers several advantages for GPS-denied navigation:

Robustness: if one component becomes less reliable, the others can still support the mission.

Scalability: new sensing or reasoning modules can be added without redesigning the full system.

Adaptability: the drone can react to dynamic obstacles, uncertainty, or environmental changes in real time.

Safety: control constraints and stabilization strategies reduce the risk of collision or mission failure.

Interpretability: each module has a clear role, making the system easier to debug and improve.


Looking ahead

GPS-denied navigation is not just a backup mode. It is becoming a core capability for the next generation of autonomous drones. As missions move into more complex, dynamic, and constrained spaces, drones will need to rely increasingly on onboard intelligence rather than external infrastructure. The proposed solution, based on perception, localization, mapping, planning, and safety/control agents, provides a strong foundation for this future. By combining these components through a shared belief state, autonomous systems can achieve more reliable navigation even in environments where GPS is completely unavailable.

