Open to interdisciplinary research collaborations and selected AI systems roles

Applied AI Researcher and AI Systems Engineer

Shaurav Khadka

Researching and building dependable AI systems for complex real-world data.

Applied AI researcher with a systems-engineering background. My current priority is AI-assisted quantum device characterisation, supported by scientific machine learning, temporal reasoning, and computational modelling. I build and evaluate dependable systems that turn research questions into testable evidence.

Sydney, NSW, Australia · From research question to defensible evidence and dependable implementation.

Explore Research Directions · View Applied Systems · Résumé

Open evidence index ↗ · Inspect GitHub ↗ · LinkedIn ↗

Research

Research-Led Focus

Scientific questions lead. Applied systems provide the evidence base for testing ideas honestly.

Research Priorities

Current research direction

Artificial Intelligence
AI-Assisted Quantum Device Characterisation
Scientific Machine Learning
Quantum Computing
Open Quantum Systems
Non-Markovian Dynamics
Computational Physics
Computational Mathematics

Applied Evidence Base

Built and evaluated systems

Trustworthy AI
Document Intelligence
Semantic Retrieval
Temporal Graph Learning
Computer Vision
Robotics and Sim2Real
Reinforcement Learning
AI Reliability

Research Profile

Research Directions

My primary direction is AI-assisted quantum device characterisation: applying artificial intelligence, scientific machine learning, and mathematical modelling rigorously to the study of noisy quantum systems. I am especially interested in research where temporal dependencies, interpretability, and practical mitigation decisions meet.

I want to contribute to research that is mathematically grounded, experimentally honest, physically informed, interpretable where possible, and useful beyond a controlled demonstration.

Discuss a research opportunity

Primary research focus

AI-Assisted Quantum Device Characterisation

My current priority is the characterisation and mitigation of non-Markovian noise in quantum hardware: combining physical reasoning, artificial intelligence, temporal modelling, and interpretable representations to connect measurement data with practical mitigation decisions.

Questions I want to pursue

How can process-tensor and tensor-network representations make memory effects measurable and interpretable?
Where can temporal AI methods improve characterisation without obscuring the underlying physics?
How can optimisation support parameter tuning or gate-sequence decisions for mitigation?

Active focus

Trustworthy AI Systems and Production Reliability

Evaluation methods for AI pipelines where traceability, robustness, auditability, confidence handling, latency, and cost matter alongside model accuracy.

Questions I want to pursue

How should reliability be measured across the full decision pipeline?
How can failure analysis distinguish model, data, and system faults?

Active focus

Temporal Learning for Dynamic and Relational Data

Learning systems for data that evolves over time: temporal graphs, sequential evidence, changing relationships, and non-static risk signals.

Questions I want to pursue

When does temporal modelling materially outperform static baselines?
How should time-dependent behaviour be evaluated and explained?

Active focus

Sim2Real Perception and Autonomous Systems

Robust perception under deployment shift, confidence-aware decisions, and vision-to-action systems that must behave safely outside curated datasets.

Questions I want to pursue

How can deployment adaptation be designed in from the beginning?
How should confidence thresholds shape downstream actions?

Research map

Interdisciplinary research landscape

A transparent map of the scientific domains and methods I am actively developing. These are research priorities and learning directions, not claims of completed expertise.

Quantum systems, characterisation and control

The physical and computational language needed to study noisy quantum devices and practical mitigation strategies.

Quantum computing
Open quantum systems
Non-Markovian dynamics
Quantum control
Quantum error mitigation

Mathematical modelling and scientific AI

Methods for turning partially observed physical systems into testable models, interpretable evidence, and reproducible analysis.

Scientific machine learning
Process tensors
Tensor networks
Inverse problems
System identification
Uncertainty quantification

Broader scientific interests

Adjacent fields that sharpen how I think about dynamic, complex, and partially observed systems.

Data science
Scientific computing
Dynamical systems
Astrophysics
Cosmology
Complex systems

Research Method

From research question to defensible evidence.

A research-led loop for turning complex questions into models, experiments, evidence, and dependable systems.

01 · Question
Frame the scientific or operational question, its assumptions and constraints, and the evidence that would falsify it.
02 · Model
Choose representations and computational methods that preserve the structure of the problem.
03 · Test
Design reproducible experiments, compare baselines, inspect edge cases, and challenge assumptions.
04 · Translate
Connect experimental findings to interpretable decisions, robust systems, and practical constraints.
05 · Communicate
Document trade-offs, limitations, and decisions clearly enough to act on.

Measured Highlights

Three results worth remembering.

These are benchmark anchors, not a separate project list. Open any card for context, then inspect the complete six-system portfolio below.

Measured highlight 01 / 03

2.38% → 95.24%

Robot-image accuracy after deployment-specific Sim2Real adaptation

The model looked strong on curated data and degraded sharply on robot-camera images. The recovery came from treating domain shift as a deployment problem, not a footnote.

Inspect case study · Evidence index

Baseline: 2.38% before deployment-specific adaptation.
Measured: 95.24% robot-image accuracy after targeted collection, augmentation, and fine-tuning.
Conditions: Robot-camera inputs with lighting, viewpoint, scale, and background differences.
Public scope: Collaborative team-level result with exported notebook figures and explicit attribution.

Measured highlight 02 / 03

300 → 1,925

AirRaid PPO mean reward after temporal observation changes

Observation design materially changed what the policy could learn. Frame skipping and frame stacking improved the benchmark result without pretending algorithm choice was the only lever.
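For readers unfamiliar with the two observation changes, here is a minimal sketch of frame skipping and frame stacking. It is not the benchmark code: `CounterEnv` is a hypothetical toy environment whose frames are plain integers, the `reset`/`step` signatures are simplified assumptions rather than the Gymnasium API, and the `skip` and `stack` values are illustrative.

```python
from collections import deque


class CounterEnv:
    """Hypothetical toy environment: each frame is just the current step index."""

    def __init__(self, length=20):
        self.length = length
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t

    def step(self, action):
        self.t += 1
        reward = 1.0
        done = self.t >= self.length
        return self.t, reward, done


class SkipAndStack:
    """Repeat each action for `skip` frames and expose the last `stack` observations."""

    def __init__(self, env, skip=4, stack=4):
        self.env = env
        self.skip = skip
        self.frames = deque(maxlen=stack)

    def reset(self):
        first = self.env.reset()
        # Fill the buffer with the first frame so the observation length is fixed.
        for _ in range(self.frames.maxlen):
            self.frames.append(first)
        return tuple(self.frames)

    def step(self, action):
        total_reward = 0.0
        done = False
        frame = None
        # Frame skipping: hold the same action and sum the rewards.
        for _ in range(self.skip):
            frame, reward, done = self.env.step(action)
            total_reward += reward
            if done:
                break
        # Frame stacking: keep only the most recent observed frames.
        self.frames.append(frame)
        return tuple(self.frames), total_reward, done


env = SkipAndStack(CounterEnv(), skip=4, stack=3)
obs = env.reset()
obs, reward, done = env.step(action=0)
print(obs, reward, done)  # (0, 0, 4) 4.0 False
```

Stacking gives the policy access to short-term motion that a single frame cannot show, while skipping reduces how often it has to decide.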

Inspect case study · Evidence index

Measured highlight 03 / 03

P@5 = 0.68 · R@5 = 0.68

RedditPulse semantic retrieval quality

The retrieval layer was measured before generation was treated as useful. That matters because grounded insight quality depends on which evidence the system surfaces first.
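As a minimal sketch of what P@5 and R@5 measure for a single query, assuming binary relevance labels: the function names and the sample document IDs below are illustrative and are not taken from the RedditPulse code or data.

```python
def precision_at_k(ranked_ids, relevant_ids, k=5):
    """Fraction of the top-k retrieved items that are relevant."""
    top_k = ranked_ids[:k]
    hits = sum(1 for doc_id in top_k if doc_id in relevant_ids)
    return hits / k


def recall_at_k(ranked_ids, relevant_ids, k=5):
    """Fraction of all relevant items that appear in the top-k results."""
    if not relevant_ids:
        return 0.0
    top_k = ranked_ids[:k]
    hits = sum(1 for doc_id in top_k if doc_id in relevant_ids)
    return hits / len(relevant_ids)


# Hypothetical ranking for one query and its set of relevant documents.
ranked = ["d3", "d7", "d1", "d9", "d4", "d2"]
relevant = {"d1", "d3", "d5", "d9"}

print(precision_at_k(ranked, relevant), recall_at_k(ranked, relevant))  # 0.6 0.75
```

A reported score would normally be the mean of these per-query values over an evaluation set.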

Inspect case study · Evidence index

Inspectable Systems

Six distinct systems. One research-led portfolio.

The benchmark highlights above stay as concise entry points. The six systems below expand the portfolio across production reliability, temporal graph learning, conversational AI, semantic retrieval, machine-learning evaluation, and responsible-AI analysis.

Production AI · Document Intelligence

Public method case study

Production AI Reliability and Document Intelligence

Problem: Document intelligence can fail long before or after OCR. Real reliability depends on the complete path from ingestion to extraction, transformation, validation, and review.

Contribution: Built repeatable evaluation workflows across OCR configurations, mappings, confidence scores, error codes, and reruns while preserving traceability and review boundaries.

OCR → transform → validate → trace

AI/ML Research and Development Intern

Python
pandas
AWS S3
boto3
Azure Document Intelligence
JSON

Inspect case study

Temporal Graph Learning

Temporal graph-learning research build

Temporal GNN for Blockchain Fraud Detection

Problem: Fraud is relational and time-dependent. Static tabular features can miss how transactions evolve across a network.

Contribution: Built a TGAT-style pipeline with temporal encodings, attention-based message passing, dual-task learning, and an XGBoost comparison path.
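To illustrate what a temporal encoding does in a TGAT-style model, here is a minimal sketch of a sinusoidal time encoding. It is an assumption-laden illustration, not the pipeline's code: the frequencies are fixed by hand here, whereas a trained model would typically learn them, and `time_encoding` and `dot` are hypothetical helper names.

```python
import math


def time_encoding(t, frequencies):
    """Map a timestamp to [cos(w1*t), sin(w1*t), ..., cos(wd*t), sin(wd*t)] / sqrt(d)."""
    scale = math.sqrt(1.0 / len(frequencies))
    features = []
    for w in frequencies:
        features.append(scale * math.cos(w * t))
        features.append(scale * math.sin(w * t))
    return features


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


# Hypothetical fixed frequencies spanning several time scales.
frequencies = [1.0, 0.1, 0.01, 0.001]

# Two pairs of events with the same time gap (2.0) at different absolute times.
k_early = dot(time_encoding(3.0, frequencies), time_encoding(5.0, frequencies))
k_late = dot(time_encoding(103.0, frequencies), time_encoding(105.0, frequencies))

print(math.isclose(k_early, k_late, abs_tol=1e-9))  # True
```

The inner product of two encodings depends only on the time gap between the events, which is what lets attention weigh neighbours by how recently they interacted.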

t0 → t1 → t2

Research build

PyTorch
NetworkX
TGAT
Temporal GNNs
XGBoost

Inspect case study

Generative AI · Conversational Systems

Scoped conversational-AI prototype

LLM-Based Financial Assistant Prototype

Problem: Conversational assistants can produce fluent but poorly scoped responses. This prototype explores structured prompting, model comparison, synthetic profiles, and explicit safety boundaries.

Contribution: Built a structured prompt workflow, lightweight GPT-2 and DistilGPT-2 comparison, synthetic profile inputs, and a Gradio interface.

profile → prompt → compare → respond

Individual prototype

Transformers
GPT-2
DistilGPT-2
Prompt Design
Gradio

Inspect case study

NLP · Information Retrieval

Modular retrieval research toolkit

Semantic Search and Information Retrieval Engine

Problem: Keyword matching is transparent but limited when meaning varies across phrasing. The system needed a modular comparison path from classical retrieval to dense semantic search.

Contribution: Built reusable text preprocessing, TF-IDF baselines, embedding-based retrieval, ranking logic, and evaluation hooks for comparing relevance trade-offs.

clean → encode → rank → evaluate
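The classical end of that comparison path can be sketched as a small TF-IDF ranker. This is a simplified standard-library illustration under stated assumptions, not the toolkit's implementation: whitespace tokenisation stands in for real preprocessing, the smoothed IDF formula is one common variant, and the three documents are made up.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace; a stand-in for real preprocessing."""
    return text.lower().split()


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict) per document, plus the IDF table."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(docs)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed IDF so that very common terms keep a small positive weight.
    idf = {term: math.log((1 + n_docs) / (1 + df)) + 1.0 for term, df in doc_freq.items()}
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({t: (c / len(tokens)) * idf[t] for t, c in counts.items()})
    return vectors, idf


def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank(query, docs):
    """Return document indices ordered from most to least similar to the query."""
    vectors, idf = tfidf_vectors(docs)
    q_counts = Counter(t for t in tokenize(query) if t in idf)
    q_len = sum(q_counts.values())
    q_vec = {t: (c / q_len) * idf[t] for t, c in q_counts.items()} if q_len else {}
    scores = [(cosine(q_vec, v), i) for i, v in enumerate(vectors)]
    return [i for score, i in sorted(scores, key=lambda s: (-s[0], s[1]))]


docs = [
    "graph neural networks for fraud detection",
    "semantic search with sentence embeddings",
    "keyword search with tf-idf baselines",
]
print(rank("semantic search", docs))  # [1, 2, 0]
```

A query with no vocabulary overlap, such as "meaning-based lookup", scores zero against every document here, which is the limitation a dense embedding path is meant to address.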

Individual modular research build

TF-IDF
Sentence Transformers
Vector Search
Ranking

Inspect case study

Machine Learning · Data Science

Reusable experimental evaluation pipeline

Machine-Learning Evaluation and Data-Science Pipeline

Problem: A model result is only useful when the path from raw data to evaluation is reproducible, comparable, and explicit about failure cases.

Contribution: Built repeatable workflows for cleaning, exploratory analysis, feature engineering, supervised comparison, clustering, validation, hyperparameter tuning, and error analysis.

data → features → compare → inspect

Individual experimentation toolkit

scikit-learn
pandas
EDA
Cross-validation
Clustering
Error Analysis

Inspect case study

Responsible AI · Governance

Responsible-AI research and analysis portfolio

Responsible AI, Governance and Human-Centred Analysis

Problem: AI systems can be technically capable and still fail users, organisations, or communities when accountability, transparency, risk, and human oversight are treated as afterthoughts.

Contribution: Analysed responsible-AI principles, governance frameworks, human-centred design questions, adoption constraints, and technical-communication requirements across applied AI contexts.

risk → explain → govern → improve

Research analysis and technical communication

Responsible AI
Governance
Risk Analysis
Human-Centred AI
Technical Communication

Inspect case study

Research Evidence

Selected evidence and public scope.

A concise record of published benchmarks, method descriptions, and selected artifacts supporting the work shown above.

Project · Claim · Evidence · Scope

Vision and ROS2 Sim2Real · Robot-image accuracy recovery · verified endpoint comparison · public benchmark
RL Benchmark Suite · RL curves and endpoint evaluation · exported notebook figures · public benchmark
RedditPulse · Semantic retrieval evaluation · verified metrics and evaluation charts · public benchmark
Production AI Reliability · Synthetic invoice reliability trace · sanitised method illustration · public method
Temporal GNN Fraud Detection · TGAT versus XGBoost trade-off benchmark · synthetic-validation table and charts · synthetic validation
LLM Financial Assistant Prototype · Prompt-driven conversational workflow · scoped prototype architecture · exploration
Semantic Search and IR · Classical-to-dense retrieval comparison path · retrieval-method architecture · public method
ML Evaluation and Data-Science Pipeline · Reproducible evaluation workflow · experimental pipeline architecture · public method
Responsible AI and Governance Analysis · Risk-to-governance analysis path · governance-analysis framework · public method

Experience

Research and Technical Work, Ordered by Signal

Applied research, production-oriented AI R&D, technical leadership, and software engineering. Supporting operations experience stays visible without dominating the research narrative.

TRUUTH
AI/ML Research and Development Intern
Feb 2026 — Present
Sydney, NSW, Australia · Hybrid
Production-oriented document intelligence, fraud-detection evaluation, and AI reliability analysis. Built repeatable OCR-evaluation workflows across layouts, configuration choices, confidence scores, field mappings, and error codes while documenting traceability, reproducibility, validation dependencies, latency, and cost considerations.
- Document Intelligence
- OCR Evaluation
- AWS S3
- Azure Document Intelligence
- Reliability

Picpoint Nepal Pvt. Ltd.
Chief Technology Officer
Jun 2021 — Dec 2024
Kathmandu, Nepal · Hybrid
Technical leadership across operational systems, digital workflows, and data-informed decision support. Led the technical roadmap and maintained systems supporting remote workflows, business coordination, web operations, and market-intelligence tooling.
- Technical Leadership
- Operations Systems
- Data Workflows
- Web Systems

Thakur International
Jr. Full Stack Developer
Jun 2019 — May 2020
Kathmandu, Nepal · On-site
Application development, API integration, debugging, and backend-data quality within an agile engineering team. Implemented and maintained web and mobile components while improving maintainability through structured debugging, refactoring, and performance tuning.
- PHP
- Python
- JavaScript
- REST APIs
- Debugging

Additional Australian operations experience

Ingleburn Convenience Store · Operations and Digital Support Assistant · Part-time

Oct 2024 — Jun 2026

Supported transaction accuracy, inventory records, POS troubleshooting, digital administration, and customer-facing operations while completing postgraduate study in Australia.

Capabilities

Capability Atlas

The broader portfolio inventory, grouped by research and systems domain. These capabilities support three measured highlights and six different inspectable systems without turning every subtask into a separate project card.

Generative AI, RAG and Multilingual Interaction

Large Language Models
Retrieval-Augmented Generation
Document chunking
Vector embeddings
Context injection
Prompt engineering
Grounded generation
Language detection
Translation workflows
Cross-lingual retrieval

NLP, Information Retrieval and Social Intelligence

Text preprocessing
Tokenisation and lemmatisation
Named-entity recognition
TF-IDF baselines
Sentence-transformer embeddings
FAISS retrieval
Ranking logic
Sentiment modelling
Topic classification
Trend and community analysis

Machine Learning, Data Science and Evaluation

Supervised learning
Classification
Predictive modelling
Clustering
Exploratory data analysis
Feature engineering
Cross-validation
Hyperparameter tuning
Confusion-matrix analysis
Comparative benchmarking

Computer Vision, Robotics and Sim2Real

Convolutional neural networks
Transfer learning
Fine-tuning
Image augmentation
Fine-grained recognition
Domain-shift analysis
Robot-camera adaptation
ROS2 integration
Confidence thresholds
Vision-to-action pipelines

Reinforcement Learning and Decision Systems

Q-learning
DQN
PPO
Sparse-reward environments
Reward shaping
Frame stacking
Frame skipping
Temporal observations
Training-curve analysis
Policy evaluation

Production AI, Document Intelligence and Cloud Workflows

OCR evaluation
Azure Document Intelligence
Invoice parsing
Structured extraction
AWS S3
boto3
JSON transformation
Confidence-score analytics
Error-code analysis
Adversarial testing
Auditability and deployment risk

Responsible AI, Governance and Research Communication

Responsible-AI analysis
Transparency and accountability
Human-centred AI design
AI-governance evaluation
AI-risk assessment
Literature reviews
Experimental reporting
Technical documentation
Presentations
Stakeholder communication

Foundation

Education and Selected Credentials

Formal study, applied leadership programs, and focused technical learning.

Education

Macquarie University

Master of Information Technology · Artificial Intelligence

2024 — Present · Sydney, NSW, Australia

Relevant work: NLP and LLM systems, graph machine learning, advanced computer vision and action, reinforcement learning, AI governance, and an industry AI/ML R&D internship.

Education

London Metropolitan University · Islington College

BSc Computer Science · First Class Honours

2017 — 2021 · Kathmandu, Nepal

Ranked among the top 10 students in the cohort. Built recommendation, trip-planning, and database-backed systems across Python, PHP/MySQL, Oracle, C#, Java, and GUI development.

Selected Credentials

Global Leadership Program · Macquarie University
Structured co-curricular leadership development focused on global engagement and professional growth.

MQ Incubator × KPMG Design Thinking · MQ Incubator × KPMG
Entrepreneurship, innovation, and human-centred problem solving.

UPG Sustainability Leadership · 2024 · UPG
Selected as one of 500 participants from a global applicant pool.

CS50x Computer Science · Harvard University

Elements of AI · University of Helsinki

Ethics and Governance of AI for Health · World Health Organization

Independent Publishing

Books & Independent Publishing

Explore my authored and collaborative publishing catalogue: technology and well-being, children’s storytelling, illustration, editing, and creative production.

The catalogue below includes the seven distinct works listed on my Goodreads Author profile. Purchase links lead to Amazon Australia where a verified listing is available; Goodreads links provide the public catalogue record.

Shop Amazon Author Store · Goodreads Author Profile · Browse collaborative titles

Featured authored publication

The Digital Equilibrium

Navigating Technological Advancement for Optimal Well-Being

An independent authored work exploring how technological progress can be balanced with human well-being and intentional living.

Buy on Amazon · View on Goodreads

Catalogue

Complete publication catalogue

Authored and collaborative publishing work, presented with direct reading and purchase paths.

Illustrator · Editor

Children Stories

Bal Katha

A children’s-story collection created with Gokul Khadka, with illustration and editorial contribution by Shaurav Khadka.

Buy on Amazon · Goodreads

Illustrator · Creative contributor

Joyful Stories

An illustrated story collection listed in the Goodreads Author catalogue.

Buy on Amazon · Goodreads

Illustrator · Editor

Joyful Stories

Mazzako Katha

A colourful illustrated collection designed for younger readers and created with Gokul Khadka.

Buy on Amazon · Goodreads

Illustrator · Creative contributor

Joyful Stories

Mazzako Katha · Alternate edition

An alternate catalogue edition of the illustrated Mazzako Katha collection.

Buy on Amazon · Goodreads

Illustrator · Editor

Words of Wisdom

Amritvani

A collaborative illustrated publication centred on devotional reflections and words of wisdom.

Buy on Amazon · Goodreads

Illustrator · Creative contributor

2 in 1 Joyful, Children Stories

Combined children’s-story edition

A combined illustrated edition bringing together children’s stories in a single collection.

Buy on Amazon · Goodreads

About

Research-Led Systems Work

I am an applied AI researcher and AI systems engineer interested in scientifically grounded methods for complex, dynamic, and imperfectly observed systems.

My current direction is AI-assisted quantum device characterisation, with a focus on non-Markovian dynamics, temporal reasoning, interpretable models, and mitigation-oriented analysis. I care about the full path from research question to defensible evidence: defining the problem clearly, selecting representations carefully, testing assumptions, evaluating limitations, and translating results into dependable systems.

My background in software development, technical operations, leadership programs, and independent publishing is an execution advantage. It helps me approach research not as an isolated model exercise, but as a disciplined process of experimentation, explanation, implementation, and responsibility.

Toolkit

Research Methods and Applied Toolkit

A research-first view of the methods I am developing and the implementation tools I use to test ideas reproducibly.

Actively developing

Research methods in development

14 methods

Mathematical, computational, and physically informed methods I am actively developing for scientific AI and quantum-device characterisation.

Artificial Intelligence
Scientific Machine Learning
Process-Tensor Reasoning
Tensor Networks
Computational Mathematics
Computational Physics
Numerical Linear Algebra
Probability and Statistical Inference
Inverse Problems
System Identification
Uncertainty Quantification
Temporal Modelling
Signal Processing
Optimisation

Used in applied systems

Applied engineering toolkit

29 tools

Languages, libraries, platforms, and workflow tools used to build, evaluate, and communicate applied AI systems.

Python
SQL
Java
C#
PHP
pandas
NumPy
Matplotlib
scikit-learn
TensorFlow
PyTorch
NetworkX
Hugging Face Transformers
Sentence Transformers
FAISS
Streamlit
Jupyter
Gymnasium
Stable-Baselines3
ROS2
Docker
Linux
AWS S3
boto3
Azure Document Intelligence
JSON
MySQL
Oracle
Git

Research Writing

Research Notes and Engineering Decisions

Short reflections on research questions, evaluation choices, and the engineering decisions that shape dependable scientific work.

Primary Research Direction

Why I am exploring process tensors for non-Markovian quantum-noise characterisation

A research-preparation note on memory effects in quantum hardware, process-tensor reasoning, tensor-network representations, and where AI may help without obscuring the physics.

Read note

Research Note

Evaluating AI reliability beyond headline accuracy

Why confidence calibration, error-code analysis, and traceability often matter more than a single accuracy number when systems leave the demo.

Read note

Research Note

What sparse rewards teach us about system design

Lessons from reward shaping in MountainCar and continuous-control tasks, and what sparse feedback implies for how learning systems are structured.

Read note

Research Note

From keyword matching to semantic retrieval

Comparing TF-IDF baselines with transformer embeddings and vector search, and the practical trade-offs of moving to semantic retrieval.

Read note

Contact

Let’s investigate and build something consequential.

I am open to interdisciplinary research collaborations and selected AI systems roles spanning AI-assisted quantum device characterisation, scientific machine learning, trustworthy AI, and temporal reasoning.

Discuss Research · General Email · LinkedIn · GitHub · Download Résumé
