Physics-Informed Neural Networks (PINNs) have recently gained attention as an effective approach for tackling complex inverse problems in image processing, especially in image denoising. This paper introduces an innovative framework that utilizes a range of neural network architectures—including ResUNet, UNet, U2Net, and Res2UNet—to implement denoising strategies grounded in nonlinear partial differential equations (PDEs). The proposed methods employ PDEs such as the heat equation, diffusion processes, multiphase mixture and phase change (MPMC) models, and Zhichang Guo's technique, embedding physical laws within the learning process to enhance denoising robustness and accuracy. We demonstrate that these models can be trained to effectively reduce noise while preserving critical image features, using a blend of data-driven methods and physical constraints. Our experiments show that integrating PDEs leads to superior denoising performance compared to traditional techniques. The models were evaluated on multiple datasets, with metrics like Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity Index Measure (SSIM) indicating significant improvements in image quality. These results highlight the potential of using PINNs with nonlinear PDEs for advanced image-denoising tasks, paving the way for future research at the intersection of deep learning and physics-based modeling in image processing.
Ph.D. Electronic Engineer
Research Blog
Updated: Dec 6, 2024
We explore the integration of Dynamic Mode Decomposition (DMD) with Physics-Informed Neural Networks (PINNs) to enhance control systems for UAV quadcopters. This innovative approach applies DMD techniques and PINNs to solve the Riccati equation, which is critical for accurate UAV position estimation. By embedding the UAV control problem within physics-based constraints, the models remain faithful to the physical principles governing UAV dynamics. DMD is used to extract key dynamic modes from a rich dataset of UAV flight parameters—such as position, velocity, and control inputs—yielding a reduced-order representation that encapsulates the essential UAV dynamics. This streamlined representation is then embedded within the PINN framework to solve the Riccati equation accurately. The resulting control strategy significantly enhances position estimation accuracy and optimizes overall control performance. Real-time validation was performed in a Unity-based physics simulation, factoring in real-world conditions like gravity and perturbation noise. The outcomes show notable improvements in estimation accuracy and control stability over conventional methods.
Fig. 1. Step-by-step workflow for integrating DMD with a PINN for UAV position and orientation estimation. The sequence highlights the training procedure, detailing the neural network architecture and the loss function used to optimize the model.
DYNAMIC MODE DECOMPOSITION (DMD) WITH PHYSICS-INFORMED NEURAL NETWORKS (PINNs)
The integration of DMD and PINNs provides a robust control strategy for precise estimation of UAV position and orientation, even in the presence of noise and environmental perturbations. By optimizing a composite loss function, the PINN framework ensures that the estimated states adhere to the physical laws governing UAV dynamics, leading to improved accuracy and stability. During training, the neural network parameters are fine-tuned to minimize this loss, which in turn enhances the overall performance of the control system.
The process of integrating DMD with a PINN consists of the following steps:
Data Acquisition and Preprocessing: A comprehensive dataset, denoted as X_in, is collected, comprising UAV flight information such as Euler angles (ϕ, θ, ψ), position coordinates (x, y, z), and noisy observations. This dataset serves as the input for the DMD process.
Dynamic Mode Decomposition (DMD): The input data is processed using DMD to extract system matrices [A, B, Q, R], which represent the UAV dynamics and control parameters. These matrices form a reduced-order model that captures the dominant modes of the UAV's behavior.
Solving the Discrete Algebraic Riccati Equation (DARE): The matrices derived from DMD are then used to solve the DARE, yielding an initial state estimate X̂_1.
Neural Network (NN) Refinement: The initial state estimate X̂_1 is then refined using a Physics-Informed Neural Network (PINN). The architecture consists of fully connected layers with ReLU activation functions between layers:
• Input Layer: The network takes the system matrices as input.
• First Layer: The flattened input passes through a fully connected layer with 128 neurons, followed by a ReLU activation.
• Second Layer: The output of the first hidden layer passes through another fully connected layer with 128 neurons, again followed by a ReLU activation.
• Output Layer: The output of the second hidden layer passes through a final fully connected layer whose output vector is reshaped to the size of A (state dimension squared), producing an estimated matrix X̂_k for the given system. This matrix represents a control output in the form of a transformation of the input matrices.
This neural network framework incorporates physical laws and constraints directly into the learning process, improving the accuracy of the state estimates. The refinement process minimizes a composite loss function, where the coefficients α = 0.1 and β = 0.05 are used to adjust the loss function, which comprises the following terms:
PDE Loss: Ensures that the neural network solution adheres to the partial differential equations governing the UAV dynamics.
Initial Condition Loss: Penalizes deviations from the initial conditions.
Boundary Condition Loss: Ensures continuity and smoothness in the state estimates over time.
Control Law and Position Estimate: The refined state estimate X̂_k is used to update the control law.
Output: The final output is the estimated position (X, Y, Z) and Euler angles (ϕ, θ, ψ).
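The DMD and DARE steps above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes a least-squares DMD-with-control fit of a linear model x[k+1] ≈ A x[k] + B u[k], identity weights for Q and R (the post does not say how these are obtained), and a plain fixed-point iteration for the Riccati equation. The function names `dmd_with_control` and `solve_dare` and the toy double-integrator system are hypothetical.

```python
import numpy as np

def dmd_with_control(X, X_next, U):
    """Least-squares fit of x[k+1] ~ A x[k] + B u[k] from snapshot matrices.

    X, X_next: (n, m) state snapshots; U: (p, m) control inputs.
    """
    n = X.shape[0]
    Omega = np.vstack([X, U])             # stacked states and inputs, (n + p, m)
    G = X_next @ np.linalg.pinv(Omega)    # G = [A B]
    return G[:, :n], G[:, n:]

def solve_dare(A, B, Q, R, iters=2000, tol=1e-10):
    """Fixed-point iteration on P = Q + A'PA - A'PB (R + B'PB)^-1 B'PA."""
    P = Q.copy()
    for _ in range(iters):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P_next = Q + A.T @ P @ (A - B @ K)
        converged = np.max(np.abs(P_next - P)) < tol
        P = P_next
        if converged:
            break
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)  # feedback gain
    return P, K

# Toy data (assumption): a noise-free double integrator with time step 0.1.
rng = np.random.default_rng(0)
A_true = np.array([[1.0, 0.1], [0.0, 1.0]])
B_true = np.array([[0.005], [0.1]])
X = rng.standard_normal((2, 50))
U = rng.standard_normal((1, 50))
X_next = A_true @ X + B_true @ U

A, B = dmd_with_control(X, X_next, U)
Q, R = np.eye(2), np.eye(1)               # assumed weights, not from the post
P, K = solve_dare(A, B, Q, R)
print("closed-loop spectral radius:", np.max(np.abs(np.linalg.eigvals(A - B @ K))))
```

On real flight logs the snapshots would be noisy, so a truncated SVD in place of the full pseudo-inverse is a common choice for the reduced-order fit.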
The training process for the PINN focuses on minimizing the composite loss function by adjusting the neural network parameters. By incorporating physical constraints and utilizing the dynamic modes extracted through DMD, the PINN framework ensures robust and accurate UAV state estimation, even in the presence of noise and environmental disturbances. For training, we employed the IMCIS and Package Delivery UAV datasets. The network was trained for up to 5000 iterations, with early stopping triggered by an error tolerance threshold of 10^{-4}. The Adam optimizer was used, and the training duration ranged from 30 to 50 minutes. The network architecture consisted of three fully connected (Linear) layers and two ReLU activation functions. All computations were performed on an NVIDIA GeForce RTX 4060 GPU. For model testing, we developed a Unity environment that included a physics-based simulation, accounting for factors like gravity and perturbation noise. Integration between the control model and the simulation environment was achieved via the UDP communication protocol.
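As a rough illustration of the architecture described above (three fully connected layers, two ReLU activations, 128 hidden units, output reshaped to the size of A), the following NumPy sketch shows only the forward pass and a weighted loss. Several details are assumptions rather than facts from the post: the He-style weight initialisation, the helper names `init_mlp`, `forward`, and `composite_loss`, and in particular the form L = L_PDE + α·L_IC + β·L_BC, since the post states the values of α and β but not which terms they weight.

```python
import numpy as np

def init_mlp(in_dim, out_dim, hidden=128, seed=0):
    """Three Linear layers: in_dim -> 128 -> 128 -> out_dim (assumed He init)."""
    rng = np.random.default_rng(seed)
    sizes = [in_dim, hidden, hidden, out_dim]
    return [(rng.standard_normal((m, n)) * np.sqrt(2.0 / m), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def forward(params, matrices, n):
    """Flatten the system matrices, apply the MLP, reshape the output to (n, n)."""
    h = np.concatenate([M.ravel() for M in matrices])
    for i, (W, b) in enumerate(params):
        h = h @ W + b
        if i < len(params) - 1:
            h = np.maximum(h, 0.0)        # ReLU after the two hidden layers only
    return h.reshape(n, n)

def composite_loss(l_pde, l_ic, l_bc, alpha=0.1, beta=0.05):
    """Assumed weighting: L = L_PDE + alpha * L_IC + beta * L_BC."""
    return l_pde + alpha * l_ic + beta * l_bc

# Toy system matrices (assumption): state dimension n = 2, one control input.
n = 2
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.005], [0.1]])
Q, R = np.eye(2), np.eye(1)
inputs = [A, B, Q, R]

in_dim = sum(M.size for M in inputs)      # 4 + 2 + 4 + 1 = 11 flattened entries
params = init_mlp(in_dim, n * n)
X_hat = forward(params, inputs, n)
print(X_hat.shape)
```

In a full training loop these parameters would be updated with Adam for up to 5000 iterations, stopping early once the loss falls below the 10^{-4} tolerance mentioned above; an autodiff framework such as PyTorch or JAX would normally supply the gradients.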
Discussion of Simulation Results:
Fig. 2. Reference drone trajectory used for evaluating the performance of DMD and PINN control methods.
The integration of DMD with PINNs has led to significant improvements in UAV trajectory control, as evidenced by both quantitative and qualitative metrics. The DMD-PINN model achieved considerably lower RMSE and MAE compared to other models, including CNN, MLR, and the DMD-only model (refer to Table I). These results highlight the superior accuracy and reliability of the DMD-PINN approach for controlling UAV trajectories. Additionally, the DMD-PINN model closely matched the ground truth trajectories and demonstrated strong performance across various test scenarios, showcasing its robustness in handling noise and maintaining precision in dynamic environments. The combination of DMD and PINNs offers a powerful method for enhancing the fidelity of UAV control systems under diverse operational conditions.
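For reference, the two error metrics mentioned above can be computed as in the short sketch below. The trajectory values are made up for illustration and are not results from the paper; the function names `rmse` and `mae` are likewise hypothetical.

```python
import numpy as np

def rmse(pred, truth):
    """Root-mean-square error over all trajectory samples and coordinates."""
    diff = np.asarray(pred, dtype=float) - np.asarray(truth, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))

def mae(pred, truth):
    """Mean absolute error over all trajectory samples and coordinates."""
    diff = np.asarray(pred, dtype=float) - np.asarray(truth, dtype=float)
    return float(np.mean(np.abs(diff)))

# Illustrative (x, y, z) positions only; not data from the experiments.
reference = np.array([[0.0, 0.0, 1.0], [1.0, 0.0, 1.0], [2.0, 1.0, 1.0]])
estimate = np.array([[0.1, 0.0, 1.0], [1.0, 0.2, 0.9], [2.0, 1.0, 1.3]])
print("RMSE:", rmse(estimate, reference))
print("MAE:", mae(estimate, reference))
```

RMSE penalises large deviations more heavily than MAE, so reporting both gives a sense of whether errors are uniformly small or dominated by a few outliers.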
C. A. Osorio Quero and J. Martinez-Carranza, "Physics-Informed Machine Learning for UAV Control," 2024 21st International Conference on Electrical Engineering, Computing Science and Automatic Control (CCE), Mexico City, Mexico, 2024, pp. 1-6, doi: 10.1109/CCE62852.2024.10770871.
BibTeX
@INPROCEEDINGS{10770871,author={Osorio Quero, Carlos Alexander and Martinez-Carranza, Jose},booktitle={2024 21st International Conference on Electrical Engineering, Computing Science and Automatic Control (CCE)}, title={Physics-Informed Machine Learning for UAV Control}, year={2024},volume={},number={},pages={1-6},doi={10.1109/CCE62852.2024.10770871}}
Updated: Nov 6, 2024
Abstract
Traditional deep-learning techniques for image reconstruction often demand extensive training datasets, which might not always be readily available. In response to this challenge, methods not requiring pre-trained models have been developed, leveraging the training of networks to reverse-engineer the physical principles behind image creation. In this context, we introduce an innovative approach with our untrained Res-U2Net model for phase retrieval. This model allows us to extract phase information, crucial for detecting alterations on an object's surface. We can use this information to create a mesh model to represent the object's three-dimensional structure visually. Our study evaluates the effectiveness of the Res-U2Net model in phase retrieval tasks, comparing its performance with that of the UNet and U2Net models, specifically using images from the GDXRAY dataset.
Fig. 1. 3D phase retrieval: (a) 2D X-ray test image, (b) 2D phase retrieval estimate, and (c) resulting 3D mesh.
Method
Overview of the proposed architecture for Phase Retrieval:
Fig. 2. Res-U2Net architecture: (a) U2Net model configuration, based on a multi-scale sequence of Res-UNet models; (b) Res-UNet model, in which the encoder extracts features using convolutional layers (Conv2D) with batch normalization and ReLU activation (ResBlock), and reduces spatial resolution via max pooling (MaxPooling2D). This is followed by a decoder that assigns phases to the features by upsampling with transpose convolutions (Conv2DTranspose) and skip connections. Residual connections link the encoder and decoder layers to improve the training performance. Finally, a 1×440×440 convolutional layer generates the segmentation mask, resulting in the network output.
BibTeX
@article{OsorioQuero:24,author = {Carlos Osorio Quero and Daniel Leykam and Irving Rondon Ojeda},journal = {J. Opt. Soc. Am. A},keywords = {Biomedical imaging; Computational imaging; Fluorescence lifetime imaging; Imaging techniques; Inverse design; Phase retrieval},number = {5},pages = {766--773},publisher = {Optica Publishing Group},title = {Res-U2Net: untrained deep learning for phase retrieval and image reconstruction},volume = {41},month = {May},year = {2024},url = {https://opg.optica.org/josaa/abstract.cfm?URI=josaa-41-5-766}, doi = {10.1364/JOSAA.511074}}