Talha Zaidi

I am a Ph.D. Candidate in Computer Science at Kansas State University, advised by Arslan Munir. I am a member of the ISCAAS Lab, and my research areas are generative AI and reinforcement learning. Previously, I obtained an M.S. from Istanbul Medipol University and a B.S. in Mechatronics and Controls Engineering from UET Lahore.

If you would like to collaborate on a research idea, please send me an email. I am very enthusiastic about collaborations and open-source projects. I also mentor students who are seeking admission to graduate school in the U.S. or are simply excited about doing research.

Top skills:
Reinforcement Learning, Machine Learning, Generative AI, Diffusion Models.

Email / LinkedIn / GitHub / Google Scholar

Research

I am passionate about applying reinforcement learning (RL) to address complex, real-world challenges. My research focuses on developing advanced RL methodologies to improve policy consistency and mitigate distributional shifts in offline settings. I work on optimizing RL-based security protocols for smart grids to enhance communication and authentication resilience. Additionally, I design novel RL algorithms for spacecraft trajectory optimization, aiming to improve decision-making and efficiency in space missions. My work is driven by the goal of creating robust, practical RL solutions that can make a significant impact across diverse applications.

	Conservative Q-Learning through Diffusion Generative Models Ongoing Project The aim of this project is to mitigate the distributional shift between behavioral and learned policies, using diffusion probabilistic models to enhance policy consistency and reduce out-of-distribution actions.
	Optimized PKC and DRL-Based Security Protocols for Smart Grids Ongoing Project Developing optimized public key cryptography (PKC) security protocols and reinforcement learning (RL)-based security mechanisms to enhance secure communication, authentication, and resiliency in smart grid systems.
	Single-Agent Attention Actor-Critic (SA3C): A Novel RL based Solution for Low-Thrust Spacecraft Trajectory Optimization S.M.T. Zaidi, Adrian Arustei, Arslan Munir, Atri Dutta IEEE Transactions on Aerospace and Electronic Systems (2025) Paper Code Developed the Single-Agent Attention Actor-Critic (SA3C) algorithm, enhancing decision-making and sample efficiency in low-thrust spacecraft trajectory optimization using deep reinforcement learning (DRL). Applied to geocentric and cislunar missions, SA3C outperforms traditional methods in complex multi-body orbital dynamics.
	Automated Trajectory Planning: A Cascaded Deep Reinforcement Learning Approach for Low-Thrust Spacecraft Orbit-Raising S.M.T. Zaidi, Adrian Arustei, Arslan Munir, Atri Dutta IEEE Aerospace and Electronic Systems Magazine (2025) Paper Code Developed a novel Cascaded Deep Reinforcement Learning (CDRL) approach to optimize low-thrust spacecraft trajectory planning, significantly improving time-efficient orbit transfers. Achieved superior performance over traditional methods in complex multi-body environments for transfers to GEO and NRHO.
	Cascaded Deep Reinforcement Learning-Based Multi-Revolution Low-Thrust Spacecraft Orbit-Transfer S.M.T. Zaidi, P.S. Chadalavada, Hayat Ullah, Arslan Munir, Atri Dutta IEEE Access 11 (2023): 82894-82911. Paper Code Developed a Cascaded Deep Reinforcement Learning (DRL) model for optimizing long-duration, low-thrust spacecraft transfers from GTO to GEO. This approach, guided by a gradient-aided reward function, significantly reduces transfer time compared to state-of-the-art methods, enhancing spacecraft autonomy in complex multi-revolution transfers.
	Mode-Guided Feature Augmentation for Domain Generalization Muhammad Haris Khan, S.M.T. Zaidi, Salman Khan, Fahad Shehbaz Khan In British Machine Vision Conference (BMVC), p. 176 Paper Proposed a simple and efficient domain generalization (DG) approach that augments source domains by exploring dominant modes of variation in the feature space, enhancing generalization to unseen domains. Demonstrated competitive performance against state-of-the-art methods on popular DG benchmarks, including challenging single-source settings.
	Machine Learning Assisted Low-Thrust Orbit-Raising: A Comparative Assessment of a Sequential Algorithm and Deep Reinforcement Learning Approach Atri Dutta, Adrian Arustei, Matthew Chace, P.S. Chadalavada, James Steck, S.M.T. Zaidi, Atri Dutta AIAA SCITECH 2024 Forum, 1669 Paper Developed a machine-learning-assisted method for optimizing low-thrust orbit-raising trajectories, integrating a sequential algorithm with a neural network-based high-level planner, and benchmarked it against deep reinforcement learning approaches for geostationary and halo orbit missions.
	Learned vs. Hand-Crafted Features for Deep Learning Based Aperiodic Laboratory Earthquake Time-Prediction S.M.T. Zaidi, Asmaa Samy, Mehmet Kocatürk, Hasan F. Ateş 2020 28th Signal Processing and Communications Applications Conference (SIU), pp. 1-4, 2020/10/5 Paper Developed and evaluated machine learning models for earthquake prediction using LANL data, demonstrating that a CNN-LSTM network significantly outperformed traditional methods, providing faster and more accurate predictions than existing approaches in the literature.
	A behavioral paradigm for cortical control of a robotic actuator by freely moving rats in a one-dimensional two-target reaching task S.M.T. Zaidi, Samet Kocatürk, Tunçer Baykaş, Mehmet Kocatürk Journal of Neuroscience Methods 373, 109555 Paper Developed a novel behavioral paradigm for controlling neuroprosthetic trajectories in rats, adapting a one-dimensional setup for reaching two distant targets. This method, utilizing primary motor cortex activity to direct robotic actions, achieved over 78% accuracy in target reaching and demonstrated the potential for reversal learning. This work represents the first successful implementation of trajectory-based neuroprosthetic control in rodents, offering a cost-effective platform for exploring neural circuit principles and novel brain-machine interface technologies.

Work Experience

Kansas State University
Graduate Teaching Assistant
Aug. 2021 - Dec. 2022
CIS 530/730 (Introduction to AI) and
CIS 209A/B (Computer Programming)

ISCAAS Lab
Graduate Research Assistant
Jan. 2023 - Current
NASA AI Project
DOE Security Project

Istanbul Medipol University
Neuroprosthetics Lab
Graduate Research Assistant
Oct. 2018 - Dec. 2020

Funded Projects

Optimized PKC and DRL-Based Security Protocols for Smart Grids:

Developing optimized public key cryptography (PKC) security protocols and reinforcement learning (RL)-based security mechanisms to enhance secure communication, authentication, and resiliency in smart grid systems. This project is funded by the U.S. Department of Energy. (link)

Deep Reinforcement Learning Assisted Spacecraft Trajectory Optimization and Planning:

Developed the novel Single-Agent Attention Actor-Critic (SA3C) algorithm for optimizing low-thrust spacecraft trajectories, outperforming traditional RL algorithms in geocentric and cislunar missions. This work resulted in three publications (2 submitted, 1 published), enhancing sample efficiency and decision-making in complex orbital dynamics. This project is funded by NASA. (link)

Last Updated: September 2024
Credits for the template to Jon Barron.
