AI Architect - AI Centre of Excellence
Oracle EMEA | Campanillas, Spain
Senior Architect with 20+ years of experience designing and deploying high-performance AI/ML, HPC, CAE, and GPU-accelerated cloud infrastructure across global industries. Proven success leading complex architecture engagements, cloud migrations, and GenAI platform enablement for research, enterprise, and hybrid environments.
Known for driving scalable innovation through deep technical expertise, cross-functional collaboration, and a customer-first mindset. Demonstrated ability to deliver infrastructure solutions aligned with business goals, technical compliance, and operational excellence across OCI, AWS, GCP and Azure ecosystems.
Currently serving as AI Architect at Oracle's Centre of Excellence in Campanillas, Spain, specializing in enterprise reference architectures, performance baselines, and guardrails for GPU-accelerated GenAI/HPC on OCI.
Production-ready frameworks and benchmarks for distributed AI/ML training, inference, and computer vision on Oracle Cloud Infrastructure
Comprehensive 15-track benchmark suite for evaluating LLM agentic workflow capabilities including planning, tool orchestration, self-healing, and context persistence.
Comprehensive benchmarking framework for evaluating LLM inference across quantization methods (AWQ, GPTQ, GGUF, FP8) with throughput, latency, memory, and quality metrics.
Production-ready observability framework for monitoring LLM inference workloads on Kubernetes with NVIDIA GPUs, combining centralized logging with infrastructure validation testing.
Comprehensive benchmarking framework for reasoning-first LLMs like NVIDIA Nemotron-3-Nano, analyzing hybrid Mamba-Transformer MoE architectures on OCI GPU infrastructure.
Comprehensive framework for evaluating Retrieval-Augmented Generation (RAG) systems, measuring retrieval quality, generation accuracy, and end-to-end performance on OCI.
Comprehensive performance evaluation of Speculative Decoding techniques for LLM inference acceleration, comparing draft-target model configurations on OCI GPU infrastructure.
Comprehensive benchmark framework for Mixture of Experts (MoE) model inference on OCI, analyzing expert routing efficiency, throughput scaling, and memory utilization patterns.
Comprehensive benchmark framework comparing PyTorch DDP, FSDP, and DeepSpeed ZeRO-2/ZeRO-3 for distributed LLM training on Oracle Kubernetes Engine (OKE) with NVIDIA GPUs.
Production-ready implementation of Mistral-7B-Instruct fine-tuning using QLoRA with 4-bit quantization for efficient training on consumer/cloud GPUs.
High-performance YOLOv8 object detection deployment using NVIDIA Triton Inference Server on Oracle Kubernetes Engine with TensorRT optimization.
Deep-dive GPU profiling framework using NVIDIA Nsight Systems to analyze CUDA kernels, NVTX markers, and NCCL communication patterns across distributed training strategies.
Comprehensive framework for benchmarking vLLM vs NVIDIA Triton vs HuggingFace TGI inference servers on Kubernetes with NVIDIA Nsight Systems GPU profiling.
Reusable benchmarking framework for evaluating LLM inference server performance on OpenShift clusters with GPU acceleration using IBM Fusion HCI and NVIDIA A100 MIG GPUs.
Complete framework for deploying NVIDIA cuOpt on OCI for Electric Vehicle fleet optimization with GPU-accelerated route optimization - 10-100x faster than CPU solvers.
Complete Prometheus + Grafana observability stack for monitoring GPU clusters, vLLM inference, and LLM training workloads on Oracle Kubernetes Engine (OKE).
Comprehensive benchmarking framework for evaluating LLM serving performance comparing vLLM, TGI, and NVIDIA NIM on Kubernetes with detailed latency and throughput analysis.
Comprehensive framework for benchmarking distributed Mixture-of-Experts (MoE) training using Expert Parallelism (EP) and hybrid EP+Data Parallelism strategies on Oracle Kubernetes Engine with NVIDIA GPUs.
Practical strategy guide for selecting LLM training parallelism approaches, comparing DDP, Pipeline, Tensor Parallelism, and hybrid strategies with detailed NCCL communication pattern analysis.
All projects include comprehensive documentation, performance benchmarks, and production deployment guides
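To illustrate the kind of throughput and latency metrics these benchmarks report, here is a minimal Python sketch. It is not taken from any of the projects above: the function name `summarize_run` and its inputs are hypothetical, and it assumes per-request latencies and generated-token counts have already been collected from an inference server.

```python
import math


def summarize_run(latencies_s, tokens_out, wall_time_s):
    """Summarize one benchmark run (hypothetical helper, for illustration only).

    latencies_s: end-to-end latency of each request, in seconds
    tokens_out:  number of tokens generated for each request
    wall_time_s: wall-clock duration of the whole run, in seconds
    """
    if not latencies_s or len(latencies_s) != len(tokens_out):
        raise ValueError("need one latency and one token count per request")
    if wall_time_s <= 0:
        raise ValueError("wall_time_s must be positive")

    ordered = sorted(latencies_s)

    def percentile(p):
        # Nearest-rank percentile: smallest value with at least p% of samples at or below it.
        rank = max(1, math.ceil(p / 100 * len(ordered)))
        return ordered[rank - 1]

    return {
        "requests_per_s": len(ordered) / wall_time_s,
        "tokens_per_s": sum(tokens_out) / wall_time_s,
        "p50_s": percentile(50),
        "p95_s": percentile(95),
        "max_s": ordered[-1],
    }


if __name__ == "__main__":
    # Made-up numbers, only to show the call shape.
    print(summarize_run([0.2, 0.4, 0.1, 0.3], [10, 20, 30, 40], 2.0))
```

Tail percentiles (p95 and above) are usually reported next to the median because a few slow requests can dominate user-perceived latency while barely moving the average.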
View All Projects on GitHub
20+ years of specialized expertise in AI/ML infrastructure, HPC systems, and cloud architecture delivering enterprise-scale solutions
Enterprise-Scale Solutions
Leading the design and implementation of large-scale, high-performance computing environments for AI/ML workloads.
Oracle Cloud Infrastructure
Designing and implementing robust, scalable, and cost-effective cloud solutions on Oracle Cloud Infrastructure (OCI).
Industry Specialization
System Architecture
Customer consultation
Technical consulting
Documentation
Leadership
Oracle Iberia
Feb 2021 - Present | Campanillas, Spain
• Define enterprise reference architectures, performance baselines, and guardrails for GPU-accelerated GenAI/HPC on OCI
• Architect and operate GPU-accelerated GenAI and HPC/AI platforms on OCI (Kubernetes/OKE plus Slurm and PBS Pro)
• Lead performance engineering and benchmarking, including CUDA/NCCL micro-benchmarks, to optimize GPU utilization and throughput
• Enable distributed training & inference for LLMs and CV/NLP (DeepSpeed/FSDP/Horovod on Slurm/PBS Pro and OKE)
• Build reusable IaC blueprints (Terraform/Resource Manager, Helm, OCI DevOps/OCIR) for rapid GPU cluster deployment
• Partner with automotive CAE/simulation teams to map CFD/FEA/crash workloads to optimal shapes/schedulers
DXC Technology
Nov 2018 – Jan 2021 | Europe & UK
• HPC and emerging technologies consultant supporting scientific computing workloads across financial services, aerospace, and automotive
• Delivered HPC infrastructure engineering using NVIDIA Bright Cluster Manager, xCAT, LSF, and PBS Pro
• Led automation initiatives with Ansible, Docker, and Python for cluster provisioning and application deployment
• Enabled hybrid cloud integration with AWS and GCP for scalable compute environments
• Conducted extensive application and hardware benchmarking with performance optimization
Citi (Citicorp Services India)
Aug 2016 – Oct 2018 | Financial Engineering
• HPC Engineer supporting Financial Engineering Research Group for real-time financial trading and risk modeling
• Point-of-Contact for emerging HPC technologies, driving innovation in simulation grid architecture
• Conducted hardware and application benchmarking, validating performance for production trading environments
• Designed and tested hybrid HPC architecture PoCs ensuring scalability and reliability
• Customized ELK stack for infrastructure observability, log correlation, and anomaly detection
Tata Motors / Tata Technologies
Jun 2008 – Aug 2016 | Automotive R&D
• HPC operations and infrastructure lead for Computer-Aided Engineering (CAE) Research Group
• Directed daily operations of multi-node, heterogeneous HPC cluster for automotive simulations
• Led integration and performance tuning of LS-DYNA, Abaqus, Ansys Fluent, MSC Nastran, StarCCM+, OptiStruct
• Developed custom CAE job submission portal integrated with PBS Pro
• Enabled centralized CAE access via Altair e-Compute portal across engineering teams
Sankalp Venture
Mar 2007 – May 2008
Led enterprise Linux infrastructure administration, including web and mail servers, and managed the team supporting Indian Express news sites.
Vindhaya Institute
Mar 2004 – Feb 2007
Taught computer science subjects, conducted lab sessions, mentored B.E. students on software development projects.
Strategic partnerships with leading organizations across global markets, delivering transformational AI infrastructure solutions with proven results and measurable impact
Global Enterprises
• Energy & Oil Companies
• Manufacturing Giants
• $2.4T+ combined market cap
Innovation Leaders
• LLM Builders
• Biotech AI
• Research Pioneers
Public Sector
• Smart Cities
• National Initiatives
• Vision 2030 Projects
Financial Innovation
• Payment Leaders
• Cross-Border Platforms
• Digital Banking
Delivering mission-critical AI/ML infrastructure solutions that drive digital transformation across industries
DataRobot AI platform deployment for digital transformation initiatives
Cross-border payments platform with data sovereignty compliance
LLM training infrastructure for next-generation AI companies
5G network optimization with AI-driven analytics
Smart city initiatives and digital governance platforms
Predictive maintenance and quality optimization systems
AI/ML platform design, HPC cluster deployment, cloud migration strategy
GPU utilization, RDMA networking, workload scheduling, cost reduction
Data sovereignty, regulatory compliance, security best practices
50-Node GPU Cluster • xLSTM Technology • Research to Production
Challenge: Austrian LLM builder entering productization phase, seeking European AI leadership position
Solution: Deployed 50+3 node BM.GPU.H100.8 cluster with RDMA networking for xLSTM technology research
Impact: Enabled transition from university research to commercial AI products, competing with leading European AI companies
GPU Infrastructure:
HPC Components:
Recognition Received
Q1FY25 EMEA Technology Engineering Excellence Award for outstanding collaborative work and customer success
Building strategic partnerships with industry leaders across global markets to deliver transformational AI infrastructure solutions
2021, 2023
2023
2021
2021
Data Centers 2023
Certified
Expert Level
Associate Level
Professional Series
8.0 Administration
Oracle Cloud
NVIDIA AI
Multi-Cloud
Leadership
Total of 15 Professional Certifications spanning cloud architecture, AI/ML, data platforms, security, and quality management across Oracle, AWS, GCP, NVIDIA, and industry standards
Delivering transformational AI/ML and HPC infrastructure solutions across diverse industry verticals
Digital Transformation - AI/ML Platform
Fortune 500 Oil & Gas Company
Challenge: Deploy DataRobot AI platform for digital transformation initiatives
Solution: Enhanced performance with specialized OCI features and HPC expertise
Impact: ✅ Delivered on timeline, customer adopted OCI for production workloads
Cross-Border Payments Platform
FTSE 250 Listed Company
Challenge: Expand into new market with data sovereignty compliance
Solution: Oracle Cloud deployment with enhanced GlusterFS integration
Impact: ✅ 25% cost reduction vs on-premises, enabling rapid market entry
5G Network Optimization
Leading Middle East Telecom Provider
Challenge: 5G network optimization with AI-driven analytics platform
Solution: GPU-accelerated HPC cluster with Oracle Linux optimization
Impact: ✅ Full compliance achieved, enhanced performance delivered
Therapeutic AI Research Platform
Healthcare AI Innovator
Challenge: Generative AI for therapeutic antibody design and protein discovery
Solution: High-performance GPU cluster with specialized AI frameworks
Impact: ✅ Accelerated drug discovery, research breakthrough achieved
AI-Powered Urban Management
National Vision 2030 Project
Challenge: AI-powered visual pollution detection processing 100K-200K images daily
Solution: Scalable cloud infrastructure with automated AI pipeline
Impact: ✅ Revolutionized environmental monitoring and urban planning
xLSTM Technology • 50-Node GPU Cluster
Austrian LLM Pioneer
Challenge: European LLM builder entering productization phase, seeking AI leadership position
Solution: Deployed 50+3 Node GPU H100.8 cluster with RDMA networking
Impact: ✅ Enabled transition from university research to commercial AI products
Sharing knowledge and insights through technical blogs, reference architectures, and open-source contributions
View all publications at blogs.oracle.com/authors/deepak-soni
Reference architecture for deploying enterprise-scale generative AI solutions on OCI with comprehensive ERP integration capabilities.
Accelerate and scale the storage of virtual machine images in a KVM environment with enterprise-grade reliability.
Use remote synchronous block replication on Oracle Cloud Infrastructure for enterprise-grade data replication.
Video surveillance and analytics software performance optimization on OCI for enhanced security operations.
Powering protein large language models in antibody discovery on OCI for pharmaceutical innovation.
Accelerating telco innovation by leveraging the power of GPUs on OCI for enhanced customer experiences.
Pioneering collaboration for AI innovation and excellence with One Lexiicon ownGPT AI model on OCI.
Pioneering de novo antibody design with OCI, supporting Silica Corpora's AI mission for precision and efficacy.
Personal collection of AI/ML infrastructure projects, HPC configurations, automation scripts, and technical implementations for enterprise-scale deployments.
Research-focused repository containing academic projects, mathematical computing implementations, and early-stage experimental work in system architecture.
Enterprise-grade implementations for Oracle Developer Relations, featuring DeepSpeed training, GPU clustering, and production-ready AI infrastructure patterns.
Advanced Retrieval-Augmented Generation (RAG) chatbot for medical information and healthcare applications, featuring vector search, semantic understanding, and context-aware responses for medical queries.
Published insights and technical achievements from my tenure as HPC & CAE Systems Engineer at Tata Technologies, focusing on performance optimization and enterprise-scale solutions.
Performance Comparison Chart
Demonstrated a 68% reduction in runtime through NUMA optimization techniques for compute-intensive CAE applications such as LS-DYNA in automotive simulations. Achieved a 45-50% improvement in CPU time through strategic process and memory placement.
MSC Patran - Stress Analysis
Engineered a custom HPC environment that reduced CAE loop time by 25% (from 12 weeks to 9 weeks) with an 8x increase in license utilization. Implemented the GlusterFS distributed file system and Torque/Maui job scheduling for enterprise-scale CAE workflows.
OCI Reference Architectures
Technical Articles
Open Source Projects
Developer Engagements
1999 - 2002
Madhya Pradesh, India • State Technical University
Comprehensive graduate program in computer applications covering advanced software engineering, distributed systems architecture, database management, network programming, and enterprise computing solutions. Specialized coursework in system design, performance optimization, and scalable application development.
1995 - 1998
Madhya Pradesh, India • Affiliated College
Rigorous undergraduate program in pure and applied mathematics covering advanced calculus, linear algebra, differential equations, probability theory, statistics, and numerical analysis. Built strong analytical and problem-solving foundation essential for understanding AI/ML algorithms, performance optimization, and computational complexity in large-scale infrastructure systems.
Systems design, architecture patterns, enterprise software development methodologies, and scalable application frameworks
Statistical analysis, computational mathematics, algorithmic optimization, and mathematical foundations for AI/ML
Distributed systems design, network architecture, high-performance computing principles, and infrastructure scalability
"The combination of computer science expertise and mathematical foundations provides the perfect foundation for understanding both the theoretical principles and practical applications of modern AI/ML infrastructure architecture."
Active participant in global professional communities spanning AI, HPC, cloud computing, and technology domains
Active member in specialized communities and technical forums worldwide
388,963 members
Data Science | Machine Learning | Deep Learning | AI
698,184 members
Best Group for Project Management
545,631 members
AWS, Azure, GCP, IBM, Alibaba, OCI
27,763 members
HPC Infrastructure & Supercomputing
409,895 members
World's Largest Automotive Group
195,565 members
Linux Systems & Open Source
CAE & Engineering
CAD, CAE, FEM, MBD & Optimization
AI & ML
OpenAI, ChatGPT, NLP, AI Agents
Scientific Computing
HPC-AI Advisory Council, CSSC
When not architecting AI infrastructure, I find balance and inspiration through sports and music
The Gentleman's Game
Cricket has been a lifelong passion - from following international matches to understanding the strategic complexities that mirror the analytical thinking required in AI architecture.
"Cricket teaches patience, strategy, and the importance of both individual excellence and team collaboration - principles that directly apply to leading complex infrastructure projects."
Creative Expression
Music provides the perfect counterbalance to technical work - offering creative expression and emotional release that keeps me energized and inspired.
"Music teaches rhythm, timing, and the art of harmonious collaboration - qualities essential for orchestrating complex AI infrastructure deployments."
"The analytical precision required for AI architecture finds perfect balance in the strategic thinking of cricket and the creative flow of music. These passions keep me grounded, inspired, and bring fresh perspectives to solving complex technical challenges."
Cricket strategy enhances architectural planning
Musical creativity drives innovative solutions
Sports and music build collaborative skills
Ready to transform your AI/ML infrastructure? Let's discuss how we can accelerate your journey to production-scale AI solutions.
Connect for strategic AI infrastructure discussions and industry insights
LinkedIn Profile
Explore open-source contributions and technical implementations
GitHub Profile
Open to strategic AI infrastructure consulting and enterprise architecture engagements upon request