Talha Zaidi

I am a Ph.D. candidate in Computer Science at Kansas State University, advised by Arslan Munir. I am a member of the ISCAAS Lab, and my research areas are generative AI and reinforcement learning. Previously, I obtained an M.S. from Istanbul Medipol University and a B.S. in Mechatronics and Controls Engineering from UET Lahore.

If you would like to collaborate on a research idea, please send me an email. I am very enthusiastic about collaborations and open-source projects. I also mentor students who are applying to graduate school in the U.S. or who are simply excited about doing research.

Top skills:
Reinforcement Learning, Machine Learning, Generative AI, Diffusion Models.

Email  /  LinkedIn  /  GitHub  /  Google Scholar



Research
I am passionate about applying reinforcement learning (RL) to address complex, real-world challenges. My research focuses on developing advanced RL methodologies to improve policy consistency and mitigate distributional shifts in offline settings. I work on optimizing RL-based security protocols for smart grids to enhance communication and authentication resilience. Additionally, I design novel RL algorithms for spacecraft trajectory optimization, aiming to improve decision-making and efficiency in space missions. My work is driven by the goal of creating robust, practical RL solutions that can make a significant impact across diverse applications.

Conservative Q-Learning through Diffusion Generative Models
Ongoing Project

The aim of this project is to mitigate the distributional shift between the behavior policy and the learned policy, using diffusion probabilistic models to enhance policy consistency and reduce out-of-distribution actions.

Optimized PKC and DRL-Based Security Protocols for Smart Grids
Ongoing Project

Developing optimized public key cryptography (PKC) security protocols and reinforcement learning (RL)-based security mechanisms to enhance secure communication, authentication, and resiliency in smart grid systems.

SA3C Algorithm
Single-Agent Attention Actor-Critic (SA3C): A Novel RL based Solution for Low-Thrust Spacecraft Trajectory Optimization
S.M.T. Zaidi, Adrian Arustei, Arslan Munir, Atri Dutta
IEEE Transactions on Aerospace and Electronic Systems (2025)
Paper / Code

Developed the Single-Agent Attention Actor-Critic (SA3C) algorithm, enhancing decision-making and sample efficiency in low-thrust spacecraft trajectory optimization using deep reinforcement learning (DRL). Applied to geocentric and cislunar missions, SA3C outperforms traditional methods in complex multi-body orbital dynamics.

Automated Trajectory Planning: A Cascaded Deep Reinforcement Learning Approach for Low-Thrust Spacecraft Orbit-Raising
S.M.T. Zaidi, Adrian Arustei, Arslan Munir, Atri Dutta
IEEE Aerospace and Electronic Systems Magazine (2025)
Paper / Code

Developed a novel Cascaded Deep Reinforcement Learning (CDRL) approach to low-thrust spacecraft trajectory planning that yields more time-efficient orbit transfers. Achieved superior performance over traditional methods in complex multi-body environments for transfers to GEO and NRHO.

CDRL
Cascaded Deep Reinforcement Learning-Based Multi-Revolution Low-Thrust Spacecraft Orbit-Transfer
S.M.T. Zaidi, P.S. Chadalavada, Hayat Ullah, Arslan Munir, Atri Dutta
IEEE Access 11 (2023): 82894-82911
Paper / Code

Developed a Cascaded Deep Reinforcement Learning (DRL) model for optimizing long-duration, low-thrust spacecraft transfers from GTO to GEO. This approach, guided by a gradient-aided reward function, significantly reduces transfer time compared to state-of-the-art methods, enhancing spacecraft autonomy in complex multi-revolution transfers.

Mode-Guided Feature Augmentation for Domain Generalization
Muhammad Haris Khan, S.M.T. Zaidi, Salman Khan, Fahad Shehbaz Khan
British Machine Vision Conference (BMVC), p. 176
Paper

Proposed a simple and efficient domain generalization (DG) approach that augments source domains by exploring dominant modes of variation in the feature space, enhancing generalization to unseen domains. Demonstrated competitive performance against state-of-the-art methods on popular DG benchmarks, including challenging single-source settings.

Machine Learning Assisted Low-Thrust Orbit-Raising: A Comparative Assessment of a Sequential Algorithm and Deep Reinforcement Learning Approach
Atri Dutta, Adrian Arustei, Matthew Chace, P.S. Chadalavada, James Steck, S.M.T. Zaidi, Atri Dutta
AIAA SCITECH 2024 Forum, 1669
Paper

Developed a machine-learning-assisted method for optimizing low-thrust orbit-raising trajectories, integrating a sequential algorithm with a neural network-based high-level planner, and benchmarked it against deep reinforcement learning approaches for geostationary and halo orbit missions.

Learned vs. Hand-Crafted Features for Deep Learning Based Aperiodic Laboratory Earthquake Time-Prediction
S.M.T. Zaidi, Asmaa Samy, Mehmet Kocatürk, Hasan F. Ateş
2020 28th Signal Processing and Communications Applications Conference (SIU), pp. 1-4, 2020/10/5
Paper

Developed and evaluated machine learning models for earthquake prediction using LANL data, demonstrating that a CNN-LSTM network significantly outperformed traditional methods, providing faster and more accurate predictions compared to existing approaches in the literature.

A behavioral paradigm for cortical control of a robotic actuator by freely moving rats in a one-dimensional two-target reaching task
S.M.T. Zaidi, Samet Kocatürk, Tunçer Baykaş, Mehmet Kocatürk
Journal of Neuroscience Methods, 373, 109555
Paper

Developed a novel behavioral paradigm for controlling neuroprosthetic trajectories in rats, adapting a one-dimensional setup for reaching two distant targets. This method, utilizing primary motor cortex activity to direct robotic actions, achieved over 78% accuracy in target reaching and demonstrated the potential for reversal learning. This work represents the first successful implementation of trajectory-based neuroprosthetic control in rodents, offering a cost-effective platform for exploring neural circuit principles and novel brain-machine interface technologies.



Work Experience

Kansas State University
Graduate Teaching Assistant
Aug. 2021 - Dec. 2022
CIS 530/730 (Introduction to AI) and
CIS 209A/B (Computer Programming)

ISCAAS Lab
Graduate Research Assistant
Jan. 2023 - Present
NASA AI Project
DOE Security Project

Istanbul Medipol University
Neuroprosthetics Lab
Graduate Research Assistant
Oct. 2018 - Dec. 2020



Funded Projects
Optimized PKC and DRL-Based Security Protocols for Smart Grids:

Developing optimized public key cryptography (PKC) security protocols and reinforcement learning (RL)-based security mechanisms to enhance secure communication, authentication, and resiliency in smart grid systems. This project is funded by the U.S. Department of Energy. (link)

Deep Reinforcement Learning Assisted Spacecraft Trajectory Optimization and Planning:

Developed the novel Single-Agent Attention Actor-Critic (SA3C) algorithm for optimizing low-thrust spacecraft trajectories, outperforming traditional RL algorithms in geocentric and cislunar missions. The algorithm improves sample efficiency and decision-making in complex orbital dynamics, and the work has resulted in three publications (two submitted, one published). It is funded by NASA. (link)











Last Updated: September 2024
Credits for the template to Jon Barron.