I am passionate about applying reinforcement learning (RL) to address complex, real-world challenges. My research focuses on developing advanced RL methodologies to improve policy consistency and mitigate distributional shifts in offline settings. I work on optimizing RL-based security protocols for smart grids to enhance communication and authentication resilience. Additionally, I design novel RL algorithms for spacecraft trajectory optimization, aiming to improve decision-making and efficiency in space missions. My work is driven by the goal of creating robust, practical RL solutions that can make a significant impact across diverse applications.
|
Conservative Q-Learning through Diffusion Generative Models
Ongoing Project
This project aims to mitigate the distributional shift between behavioral and learned policies, using diffusion probabilistic models to enhance policy consistency and reduce out-of-distribution actions.
|
|
Optimized PKC and DRL-Based Security Protocols for Smart Grids
Ongoing Project
Developing optimized public key cryptography (PKC) security protocols and reinforcement learning (RL)-based security mechanisms to enhance secure communication, authentication, and resiliency in smart grid systems. |
|
Single-Agent Attention Actor-Critic (SA3C): A Novel RL based Solution for Low-Thrust Spacecraft Trajectory Optimization
S.M.T. Zaidi,
Adrian Arustei,
Arslan Munir,
Atri Dutta
IEEE Transactions on Aerospace and Electronic Systems, (2025)
Paper
Code
Developed the Single-Agent Attention Actor-Critic (SA3C) algorithm, enhancing decision-making and sample efficiency in low-thrust spacecraft trajectory optimization using deep reinforcement learning (DRL). Applied to geocentric and cislunar missions, SA3C outperforms traditional methods in complex multi-body orbital dynamics.
|
|
Automated Trajectory Planning: A Cascaded Deep Reinforcement Learning Approach for Low-Thrust Spacecraft Orbit-Raising
S.M.T. Zaidi,
Adrian Arustei,
Arslan Munir,
Atri Dutta
IEEE Aerospace and Electronic Systems Magazine, (2025)
Paper
Code
Developed a novel Cascaded Deep Reinforcement Learning (CDRL) approach to optimize low-thrust spacecraft trajectory planning, significantly improving time-efficient orbit transfers. Achieved superior performance over traditional methods in complex multi-body environments for transfers to GEO and NRHO.
|
|
Cascaded Deep Reinforcement Learning-Based Multi-Revolution Low-Thrust Spacecraft Orbit-Transfer
S.M.T. Zaidi,
P.S. Chadalavada,
Hayat Ullah,
Arslan Munir,
Atri Dutta
IEEE Access 11 (2023): 82894-82911.
Paper
Code
Developed a Cascaded Deep Reinforcement Learning (DRL) model for optimizing long-duration, low-thrust spacecraft transfers from GTO to GEO. This approach, guided by a gradient-aided reward function, significantly reduces transfer time compared to state-of-the-art methods, enhancing spacecraft autonomy in complex multi-revolution transfers.
|
|
Mode-Guided Feature Augmentation for Domain Generalization
Muhammad Haris Khan,
S.M.T. Zaidi,
Salman Khan,
Fahad Shehbaz Khan,
In British Machine Vision Conference (BMVC), (p. 176)
Paper
Proposed a simple and efficient domain generalization (DG) approach that augments source domains by exploring dominant modes of variation in the feature space, enhancing generalization to unseen domains. Demonstrated competitive performance against state-of-the-art methods on popular DG benchmarks, including challenging single-source settings.
|
|
Machine Learning Assisted Low-Thrust Orbit-Raising: A Comparative Assessment of a Sequential Algorithm and Deep Reinforcement Learning Approach
Atri Dutta,
Adrian Arustei,
Matthew Chace,
P.S. Chadalavada,
James Steck,
S.M.T. Zaidi,
Atri Dutta
AIAA SCITECH 2024 Forum, 1669
Paper
Developed a machine-learning-assisted method for optimizing low-thrust orbit-raising trajectories, integrating a sequential algorithm with a neural network-based high-level planner, and benchmarked it against deep reinforcement learning approaches for geostationary and halo orbit missions. |
|
Learned vs. Hand-Crafted Features for Deep Learning Based Aperiodic Laboratory Earthquake Time-Prediction
S.M.T. Zaidi,
Asmaa Samy,
Mehmet Kocatürk,
Hasan F. Ateş
2020 28th Signal Processing and Communications Applications Conference (SIU), (pp. 1-4), 2020/10/5
Paper
Developed and evaluated machine learning models for earthquake prediction using LANL data, demonstrating that a CNN-LSTM network significantly outperformed traditional methods and provided faster, more accurate predictions than existing approaches in the literature.
|
|
A behavioral paradigm for cortical control of a robotic actuator by freely moving rats in a one-dimensional two-target reaching task
S.M.T. Zaidi,
Samet Kocatürk,
Tunçer Baykaş,
Mehmet Kocatürk
Journal of Neuroscience Methods, 373, 109555
Paper
Developed a novel behavioral paradigm for controlling neuroprosthetic trajectories in rats, adapting a one-dimensional setup for reaching two distant targets. This method, utilizing primary motor cortex activity to direct robotic actions, achieved over 78% accuracy in target reaching and demonstrated the potential for reversal learning. This work represents the first successful implementation of trajectory-based neuroprosthetic control in rodents, offering a cost-effective platform for exploring neural circuit principles and novel brain-machine interface technologies. |
Optimized PKC and DRL-Based Security Protocols for Smart Grids:
Developing optimized public key cryptography (PKC) security protocols and reinforcement learning (RL)-based security mechanisms to enhance secure communication, authentication, and resiliency in smart grid systems. This project is funded by the U.S. Department of Energy. (link)
Deep Reinforcement Learning Assisted Spacecraft Trajectory Optimization and Planning:
Developed the novel Single-Agent Attention Actor-Critic (SA3C) algorithm for optimizing low-thrust spacecraft trajectories, which outperforms traditional RL algorithms in geocentric and cislunar missions and improves sample efficiency and decision-making in complex orbital dynamics. This work has resulted in three publications (2 submitted, 1 published) and is funded by NASA. (link)
|
Last Updated: September 2024
Template credit: Jon Barron.
|
|