Resume

Perusha Moodley

Contact Information:


Professional Summary

Highly experienced developer and technical team lead with over 18 years in software development and consulting. Recently completed a PhD in Deep Reinforcement Learning and now pivoting to AI safety and alignment. Currently enrolled in Blue Dot’s AI Safety Fundamentals Alignment course and participating in an AI Safety Camp (AISC) project (Jan-April 2025). Lifelong learner, adept at analysing and implementing research papers and adopting new technologies. Proven track record in leading projects and developing innovative solutions. Seeking an impactful role working on AI safety and alignment problems, including technical governance for the model development-deployment evaluation process.


Core Competencies


Professional Experience

Lodgical Ltd - AI Research Development

Role: Founder & Lead Consultant
Duration: 2006 - Present

Baxter Healthcare Ltd - Senior Integration Specialist

Duration: 2006 - 2017

Forza Consulting BV - Senior Technical Consultant

Duration: 2010 - 2015

UTC FS - Development Lead

Duration: 2005 - 2006

Baxter Healthcare Ltd - Senior Analyst Programmer

Duration: 2001 - 2005

Deloitte and Touche - Senior Consultant

Duration: 2000 - 2001

PwC - Consultant

Duration: 1998 - 2000


Education

Ph.D. in Computer Science
University of Reading, UK
Focus: Deep Reinforcement Learning (RL/DRL)

M.Sc. in Mechanical Engineering
University of Natal, South Africa
Focus: Manufacturing automation

B.Sc. in Mechanical Engineering
University of Natal, South Africa
Focus: Robotics and automation


Research & Publications


Current Projects


Additional Activities


Technical Skills - Detailed

I have designed, modified and debugged algorithms in RL/DRL and in ML more generally, using PyTorch on GPUs. I have extensive experience with both decision transformers and core RL algorithms. For decision transformers, I made novel contributions to the tokenisation and position encoding of the core transformer algorithm for multimodal RL tasks with visual inputs, with a view to improving interpretability. This work included tuning models and optimising the dataloader for the large multimodal datasets we trained on. I also implemented mechanistic interpretability methods, as used in AI alignment, to retrieve model activations and analyse model behaviour. For core RL, I worked with both on-policy (PPO) and off-policy (DQN) algorithms, in multi-task settings and with auxiliary signals.


While my primary experience is in RL, I have also worked with clustering, contrastive methods, ResNets and LLMs (fine-tuning and LlamaIndex). I understand the transformer architecture very well and have trained models from scratch on GPUs with PyTorch. I have used multiple frameworks, including SB3, CleanRL, Ray, RLlib and Gym/Gymnasium (for which I created custom environments). I worked on projects extending the interpretation of decision transformers and on skill transfer for transformer RL agents.

Before my PhD I worked as an enterprise developer and analyst, so I am experienced in working with users and on production systems. I held development team lead positions with responsibility for setting standards, best practices and architecture design.

Finally, and most importantly, I am a lifelong learner. As I have demonstrated throughout my career, I remain strongly interested in learning and am confident in my ability to pick up new skills.