Deepak Soni

AI Architect - AI Centre of Excellence

Oracle EMEA | Campanillas, Spain

Senior Architect with more than 20 years of experience designing and deploying high-performance AI/ML, HPC, CAE and GPU-accelerated cloud infrastructure across global industries.

Get In Touch

About Me

Senior Architect with more than 20 years of experience designing and deploying high-performance AI/ML, HPC, CAE and GPU-accelerated cloud infrastructure across global industries. Proven success leading complex architecture engagements, cloud migrations, and GenAI platform enablement for research, enterprise, and hybrid environments.

Known for driving scalable innovation through deep technical expertise, cross-functional collaboration, and a customer-first mindset. Demonstrated ability to deliver infrastructure solutions aligned with business goals, technical compliance, and operational excellence across OCI, AWS, GCP and Azure ecosystems.

Currently serving as AI Architect at Oracle's Centre of Excellence in Campanillas, Spain, specializing in enterprise reference architectures, performance baselines, and guardrails for GPU-accelerated GenAI/HPC on OCI.

Core Expertise

AI/ML Infrastructure • HPC Platforms • GPU Computing • GenAI Platforms • Automotive CAE • NVIDIA Ecosystem

Core Expertise

20+ years of specialized expertise in AI/ML infrastructure, HPC systems, and cloud architecture delivering enterprise-scale solutions

AI/ML & HPC Infrastructure Architecture

Enterprise-Scale Solutions

Leading the design and implementation of large-scale, high-performance computing environments for AI/ML workloads.

  • Large-Scale GPU Cluster Design (NVIDIA A100/H100)
  • Multi-node GPU clusters for GenAI and LLM training
  • Performance optimization & tuning for HPC/AI workloads

Cloud Solutions Architecture

Oracle Cloud Infrastructure

Designing and implementing robust, scalable, and cost-effective cloud solutions on Oracle Cloud Infrastructure (OCI).

  • Hybrid & Multi-Cloud Architecture
  • Kubernetes & Docker Containerization
  • Infrastructure as Code (Terraform, Ansible)

Domain Knowledge

Industry Specialization

  • Automotive HPC: Autonomous Driving (AD/ADAS), Computer-Aided Engineering (CAE), simulation workloads
  • Financial & Defence: Monte Carlo simulation, financial app orchestration
  • Generative AI & NVIDIA Ecosystem: GenAI platforms, LLM training, full NVIDIA AI stack

HPC & Technical Skills

System Architecture

Workload Management

  • Slurm
  • IBM LSF

Storage & Networking

  • Lustre, GPFS
  • RDMA (RoCE v2)

DevOps & Monitoring

  • CI/CD, Git
  • Prometheus, Grafana

Programming

  • Python, Shell
  • Linux Systems

Professional Skills

C-Level & Executive Advisory

Customer consultation

Pre-Sales Engineering

Technical consulting

System Design

Documentation

Technical Mentoring

Leadership

Professional Experience

AI Architect - AI Centre of Excellence

Oracle Iberia

Feb 2021 – Present | Campanillas, Spain

• Define enterprise reference architectures, performance baselines, and guardrails for GPU-accelerated GenAI/HPC on OCI

• Architect and operate GPU-accelerated GenAI and HPC/AI platforms on OCI (Kubernetes/OKE plus Slurm & PBS Pro)

• Lead performance engineering and benchmarking, including CUDA/NCCL micro-benchmarks, to optimize GPU utilization and throughput

• Enable distributed training & inference for LLMs and CV/NLP (DeepSpeed/FSDP/Horovod on Slurm/PBS Pro and OKE)

• Build reusable IaC blueprints (Terraform/Resource Manager, Helm, OCI DevOps/OCIR) for rapid GPU cluster deployment

• Partner with automotive CAE/simulation teams to map CFD/FEA/crash workloads to optimal shapes/schedulers

Senior Professional, Emerging Technologies

DXC Technology

Nov 2018 – Jan 2021 | Europe & UK

• HPC and emerging technologies consultant supporting scientific computing workloads across financial services, aerospace, and automotive

• Delivered HPC infrastructure engineering using NVIDIA Bright Cluster Manager, xCAT, LSF, and PBS Pro

• Led automation initiatives with Ansible, Docker, and Python for cluster provisioning and application deployment

• Enabled hybrid cloud integration with AWS and GCP for scalable compute environments

• Conducted extensive application and hardware benchmarking with performance optimization

HPC Analyst

Citi (Citicorp Services India)

Aug 2016 – Oct 2018 | Financial Engineering

• HPC Engineer supporting Financial Engineering Research Group for real-time financial trading and risk modeling

• Point-of-Contact for emerging HPC technologies, driving innovation in simulation grid architecture

• Conducted hardware and application benchmarking, validating performance for production trading environments

• Designed and tested hybrid HPC architecture PoCs ensuring scalability and reliability

• Customized ELK stack for infrastructure observability, log correlation, and anomaly detection

Lead HPC Solutions Developer

Tata Motors / Tata Technologies

Jun 2008 – Aug 2016 | Automotive R&D

• HPC operations and infrastructure lead for Computer-Aided Engineering (CAE) Research Group

• Directed daily operations of multi-node, heterogeneous HPC cluster for automotive simulations

• Led integration and performance tuning of LS-DYNA, Abaqus, Ansys Fluent, MSC Nastran, StarCCM+, OptiStruct

• Developed custom CAE job submission portal integrated with PBS Pro

• Enabled centralized CAE access via Altair e-Compute portal across engineering teams

Senior Linux System Administrator

Sankalp Venture

Mar 2007 – May 2008

Led enterprise Linux infrastructure administration, web/mail servers, and team management for Indian Express news sites.

Programmer & Academic Mentor

Vindhaya Institute

Mar 2004 – Feb 2007

Taught computer science subjects, conducted lab sessions, and mentored B.E. students on software development projects.

Professional Portfolio

Strategic partnerships with leading organizations across global markets, delivering transformational AI infrastructure solutions with proven results and measurable impact

Fortune 500

Global Enterprises

• Energy & Oil Companies

• Manufacturing Giants

• $2.4T+ combined market cap

AI Unicorns

Innovation Leaders

• LLM Builders

• Biotech AI

• Research Pioneers

Government

Public Sector

• Smart Cities

• National Initiatives

• Vision 2030 Projects

FinTech

Financial Innovation

• Payment Leaders

• Cross-Border Platforms

• Digital Banking

Global Reach

  • EMEA: Europe, Middle East, Africa
  • Americas: North & South America
  • APAC: Asia Pacific
  • Nordic: Scandinavian Region

  • 50+ global organizations across 4 continents
  • 100k+ GPU hours of AI/ML workloads
  • $50M+ infrastructure value in deployed solutions

Professional Impact

Delivering mission-critical AI/ML infrastructure solutions that drive digital transformation across industries

Energy Sector

DataRobot AI platform deployment for digital transformation initiatives

✓ Production deployment success

FinTech

Cross-border payments platform with data sovereignty compliance

✓ 25% cost reduction achieved

AI Innovation

LLM training infrastructure for next-generation AI companies

✓ 50+ node GPU cluster deployed

Telecom

5G network optimization with AI-driven analytics

✓ Full compliance achieved

Public Sector

Smart city initiatives and digital governance platforms

✓ Multi-region deployment

Manufacturing

Predictive maintenance and quality optimization systems

✓ 30% efficiency gain

Service Categories

Infrastructure Architecture

AI/ML platform design, HPC cluster deployment, cloud migration strategy

Performance Optimization

GPU utilization, RDMA networking, workload scheduling, cost reduction

Compliance & Security

Data sovereignty, regulatory compliance, security best practices

Professional Recommendations

Testimonials from Oracle VPs, industry leaders, and satisfied clients who have experienced the value of strategic AI infrastructure expertise

Nitin Satpute

Principal Cloud Architect

HPC/GPU & AI Platform Solutions

"It has been a privilege to collaborate with Deepak on a range of challenging projects, from intricate client deployments to internal R&D initiatives. What stands out is his profound expertise across the entire HPC and AI stack, which he pairs with an exceptional, hands-on command of complex systems. Deepak possesses a rare blend of deep technical knowledge and a strategic intuition that allows him to architect the most effective solution with remarkable efficiency. He is an approachable, detail-oriented professional who consistently elevates the projects he's on. I wholeheartedly recommend Deepak as a senior expert in the AI and HPC field, whose unique problem-solving abilities often feel like 'magic' to those who work with him."
AI & HPC Excellence

Taha Benssiba

VP of AI @ Oracle

Oracle Leadership Team

"Working with Deepak is a privilege and an absolute delight. Deepak is one of few people I know whose thirst for knowledge and self improvement helped him transition from hardware-heavy HPC SME to an accomplished AI SME whose expertise spans from AI GPU clustering architectures, to logical architecture, and Oracle and Open-source AI software. Deepak would be an invaluable asset to any organization."
Oracle AI Leadership

Maria Konijnenberg

Supercomputing & AI

Senior Leadership

"We have been involved in several Big Compute/HPC/GPU implementation projects with Deepak for customers in different industries all over EMEA. Deepak contributes huge value to the work of the Big Compute specialist team because thanks to his work we could prove the feasibility of our proposed architectures and demonstrate that they are working. He goes the extra mile to make customisations if needed, always having customer satisfaction in his focus. Deepak is a top talent and I hope we get to work for him for a long time. Resources like him are in high demand (externally and internally). As a seasoned engineer, he has the intuition (based on 2 decades of expertise) to always choose an approach that would eventually prove the simplest and most elegant."
HPC Excellence

Marta Tolosa

Cloud Native Architect Lead @ Oracle

Oracle Cross-Team Collaboration

"I have the pleasure of working with Deepak as part of a deep technical team, and I can confidently say that his collaborative spirit and open-minded approach have made a significant impact on our cross-team efforts. One of his standout qualities is the willingness to step in and support others, no matter the challenge. Every time I've needed his assistance, he's shown up with enthusiasm and a problem-solving mindset, ensuring that our projects move forward seamlessly. His reliability and dedication are truly commendable. What's equally impressive is his journey from India to Spain, which speaks volumes about his adaptability and curiosity. He's always eager to learn, ask thoughtful questions, and share insights from his diverse experiences. This enriches team dynamics and fosters a culture of continuous growth. Working with Deepak has been a fantastic experience."
Team Collaboration

Alexander Hödicke

Best Practices Leader EMEA

Technology Engineering

"I'd like to recognize Deepak for his outstanding passion for sharing knowledge and promoting best practices within our teams. His commitment is clearly demonstrated through the multiple insightful blog entries he has authored. These contributions have been incredibly valuable, helping to broaden our understanding and foster a culture of continuous learning and sharing. Being a thought leader in his field, he is helping our customers and partners alike to deploy solutions quickly and generate value from them. Deepak's willingness to share his expertise is truly appreciated and makes a positive impact on our work environment."
Knowledge Sharing Excellence

LATEST SUCCESS STORY

Austrian AI Pioneer Breakthrough

50-Node GPU Cluster • xLSTM Technology • Research to Production

Challenge: Austrian LLM builder entering productization phase, seeking European AI leadership position

Solution: Deployed a 50+3 node BM.GPU.H100.8 cluster with RDMA networking for xLSTM technology research

Impact: Enabled transition from university research to commercial AI products, competing with leading European AI companies

Technical Architecture:

GPU Infrastructure:

  • 50 + 3 Node GPU Cluster
  • BM.GPU.H100.8 configuration
  • RDMA cluster networking

HPC Components:

  • 14 + 2 Node CPU Cluster
  • BM.HPC.E5.144 nodes
  • OCI File Storage (FSS)

xLSTM Technology • H100 GPUs • RDMA Networking • Production Ready
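
As a rough sizing sketch for the architecture above: an OCI BM.GPU.H100.8 node carries 8 H100 GPUs, and the BM.HPC.E5.144 shape name indicates 144 cores per node. Reading "50 + 3" and "14 + 2" as primary plus additional nodes is an assumption, since the role of the extra nodes is not stated here. The `NodePool` class and its field names are hypothetical, introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodePool:
    """Hypothetical helper for back-of-the-envelope cluster sizing."""
    shape: str
    primary_nodes: int
    extra_nodes: int     # the "+N" nodes; their role is assumed, not stated
    units_per_node: int  # GPUs per node or cores per node

    def primary_units(self) -> int:
        return self.primary_nodes * self.units_per_node

    def total_units(self) -> int:
        return (self.primary_nodes + self.extra_nodes) * self.units_per_node


gpu_pool = NodePool("BM.GPU.H100.8", primary_nodes=50, extra_nodes=3, units_per_node=8)
cpu_pool = NodePool("BM.HPC.E5.144", primary_nodes=14, extra_nodes=2, units_per_node=144)

print(f"{gpu_pool.shape}: {gpu_pool.primary_units()} GPUs primary, "
      f"{gpu_pool.total_units()} GPUs including extra nodes")
print(f"{cpu_pool.shape}: {cpu_pool.primary_units()} cores primary, "
      f"{cpu_pool.total_units()} cores including extra nodes")
```

Under these assumptions the GPU tier comes to 400 H100 GPUs on the 50 primary nodes (424 counting the 3 additional nodes), and the CPU tier to 2,016 cores (2,304 with the 2 additional nodes).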

Recognition Received

Q1 FY25 EMEA Technology Engineering Excellence Award for outstanding collaborative work and customer success

Professional Network

Building strategic partnerships with industry leaders across global markets to deliver transformational AI infrastructure solutions

Professional Certifications

Oracle Cloud Infrastructure

  • Architect Associate (2021, 2023)
  • Security Professional (2023)
  • Cloud Foundation (2021)
  • Operations Associate (2021)

NVIDIA AI & Data Centers

  • Introduction to AI (Data Centers, 2023)
  • GPU Computing (Certified)
  • AI Infrastructure (Expert Level)

Multi-Cloud Expertise

  • AWS Architect (Associate Level)
  • Google Cloud (Professional Series)
  • Bright Cluster (8.0 Administration)

Certifications by Category

  • 4 Oracle Cloud
  • 3 NVIDIA AI
  • 6 Multi-Cloud
  • 2 Leadership

Total of 15 Professional Certifications spanning cloud architecture, AI/ML, data platforms, security, and quality management across Oracle, AWS, GCP, NVIDIA, and industry standards

Client Success Stories

Delivering transformational AI/ML and HPC infrastructure solutions across diverse industry verticals

Global Energy Corporation

Digital Transformation - AI/ML Platform

Fortune 500 Oil & Gas Company

Challenge: Deploy DataRobot AI platform for digital transformation initiatives

Solution: Enhanced performance with specialized OCI features and HPC expertise

Impact: ✅ Delivered on timeline, customer adopted OCI for production workloads

FinTech Payment Leader

Cross-Border Payments Platform

FTSE 250 Listed Company

Challenge: Expand into new market with data sovereignty compliance

Solution: Oracle Cloud deployment with enhanced GlusterFS integration

Impact: ✅ 25% cost reduction vs on-premises, enabling rapid market entry

Telecommunications Giant

5G Network Optimization

Leading Middle East Telecom Provider

Challenge: 5G network optimization with AI-driven analytics platform

Solution: GPU-accelerated HPC cluster with Oracle Linux optimization

Impact: ✅ Full compliance achieved, enhanced performance delivered

Biotech AI Pioneer

Therapeutic AI Research Platform

Healthcare AI Innovator

Challenge: Generative AI for therapeutic antibody design and protein discovery

Solution: High-performance GPU cluster with specialized AI frameworks

Impact: ✅ Accelerated drug discovery, research breakthrough achieved

Government Smart City Initiative

AI-Powered Urban Management

National Vision 2030 Project

Challenge: AI-powered visual pollution detection processing 100K-200K images daily

Solution: Scalable cloud infrastructure with automated AI pipeline

Impact: ✅ Revolutionized environmental monitoring and urban planning
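
To put the daily image volume in this case study into capacity terms, the short calculation below converts 100K-200K images per day into a sustained per-second rate. It assumes load spread evenly over 24 hours, which the source does not state; real traffic is likely burstier, so this is a lower bound on the throughput a pipeline would need. The function name is illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def images_per_second(images_per_day: int) -> float:
    """Average sustained rate, assuming an even spread across the day."""
    return images_per_day / SECONDS_PER_DAY


low = images_per_second(100_000)
high = images_per_second(200_000)
print(f"sustained rate: {low:.2f} to {high:.2f} images/s")
# prints "sustained rate: 1.16 to 2.31 images/s"
```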

LATEST

European AI Research Unicorn

xLSTM Technology • 50-Node GPU Cluster

Austrian LLM Pioneer

Challenge: European LLM builder entering productization phase, seeking AI leadership position

Solution: Deployed a 50+3 node BM.GPU.H100.8 cluster with RDMA networking

Impact: ✅ Enabled transition from university research to commercial AI products

Technical Publications & Insights

Sharing knowledge and insights through technical blogs, reference architectures, and open-source contributions

OCI Reference Architectures

Deploy Scalable OwnGPT Model on Oracle Cloud

Reference architecture for deploying enterprise-scale generative AI solutions on OCI with comprehensive ERP integration capabilities.

Oracle Cloud • GenAI • ERP Integration
View Architecture

Accelerate VM Image Storage in KVM

Accelerate and scale the storage of virtual machine images in a KVM environment with enterprise-grade reliability.

KVM • Storage • Performance
View Architecture

Remote Synchronous Block Replication

Use remote synchronous block replication on Oracle Cloud Infrastructure for enterprise-grade data replication.

Block Storage • Replication • OCI
View Architecture

Video Surveillance Analytics Performance

Video surveillance and analytics software performance optimization on OCI for enhanced security operations.

Video Analytics • Performance • OCI Blog
Read Blog

Protein Large Language Models

Powering protein large language models in antibody discovery on OCI for pharmaceutical innovation.

Protein LLM • Antibody • OCI Blog
Read Blog

Telco Innovation with GPUs

Accelerating telco innovation by leveraging the power of GPUs on OCI for enhanced customer experiences.

Telco • GPU • OCI Blog
Read Blog

One Lexiicon ownGPT AI Model

Pioneering collaboration for AI innovation and excellence with One Lexiicon ownGPT AI model on OCI.

ownGPT • AI Model • OCI Blog
Read Blog

De Novo Antibody Design

Pioneering de novo antibody design with OCI, supporting Silica Corpora's AI mission for precision and efficacy.

AI Design • Antibody • OCI Blog
Read Blog

Primary GitHub Repository

Personal collection of AI/ML infrastructure projects, HPC configurations, automation scripts, and technical implementations for enterprise-scale deployments.

AI Infrastructure • HPC • Automation
View Repository

Academic & Research Projects

Research-focused repository containing academic projects, mathematical computing implementations, and early-stage experimental work in system architecture.

Research • Academic • Mathematics
View Repository

Oracle DevRel Contributions

Enterprise-grade implementations for Oracle Developer Relations, featuring DeepSpeed training, GPU clustering, and production-ready AI infrastructure patterns.

Oracle • DeepSpeed • Enterprise
View Contributions

Medical RAG Chatbot

Advanced Retrieval-Augmented Generation (RAG) chatbot for medical information and healthcare applications, featuring vector search, semantic understanding, and context-aware responses for medical queries.

RAG • Healthcare • AI/ML • Vector Search
View Repository

Technical Articles from Tata Technologies Experience

Published insights and technical achievements from my tenure as HPC & CAE Systems Engineer at Tata Technologies, focusing on performance optimization and enterprise-scale solutions.

[Figure: NUMA benchmarking results, performance comparison chart. Runtime 17.7 s → 5.59 s, a 68% reduction.]

Unlocking Performance: How NUMA Tuning Can Triple Your CAE Simulation Speed on HPC

Demonstrated 68% reduction in runtime through NUMA optimization techniques for compute-intensive CAE applications like LS-DYNA in automotive simulations. Achieved 45-50% performance improvement in CPU time through strategic process and memory placement.

NUMA • HPC Optimization • CAE Performance • LS-DYNA
Read Article

[Figure: HPC ecosystem architecture, MSC Patran stress analysis. CAE loop time 12 weeks → 9 weeks, a 25% reduction.]

From Bottleneck to Breakthrough: Revolutionizing CAE Workflows with a Tailored HPC Ecosystem

Engineered a custom HPC environment that reduced CAE loop time by 25% (12 weeks to 9 weeks) with an 8x increase in license utilization. Implemented GlusterFS distributed file system and Torque/Maui job scheduling for enterprise-scale CAE workflows.

HPC Architecture • CAE Workflow • MSC Nastran • Performance
Read Article
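
The headline figures in the two articles above mix two different measures: percentage reduction in runtime and speedup factor. A minimal sketch of the arithmetic, using only the before and after values quoted above (the function names are illustrative):

```python
def runtime_reduction(before: float, after: float) -> float:
    """Fraction of the original runtime that was eliminated."""
    return (before - after) / before


def speedup(before: float, after: float) -> float:
    """How many times faster the work completes."""
    return before / after


# NUMA tuning: 17.7 s before, 5.59 s after
numa_reduction = runtime_reduction(17.7, 5.59)
numa_speedup = speedup(17.7, 5.59)
print(f"NUMA: {numa_reduction:.0%} less runtime, {numa_speedup:.2f}x speedup")

# CAE loop time: 12 weeks before, 9 weeks after
loop_reduction = runtime_reduction(12, 9)
loop_speedup = speedup(12, 9)
print(f"CAE loop: {loop_reduction:.0%} less time, {loop_speedup:.2f}x speedup")
```

A 68% runtime reduction corresponds to roughly a 3.17x speedup, which is consistent with the "triple" in the first article's title, while the 25% shorter CAE loop is about a 1.33x speedup.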

Technical Impact

  • 5+ OCI Reference Architectures
  • 10+ Technical Articles
  • 3+ Open Source Projects
  • 1000+ Developer Engagements

Education & Academic Background

Master of Computer Applications (M.C.A)

1999 - 2002

Rajiv Gandhi Proudyogiki Vishwavidyalaya (RGPV)

Madhya Pradesh, India • State Technical University

Comprehensive graduate program in computer applications covering advanced software engineering, distributed systems architecture, database management, network programming, and enterprise computing solutions. Specialized coursework in system design, performance optimization, and scalable application development.

Software Engineering • System Architecture • Database Systems • Network Programming

Bachelor of Science in Mathematics

1995 - 1998

Post Graduate College, Satna

Madhya Pradesh, India • Affiliated College

Rigorous undergraduate program in pure and applied mathematics covering advanced calculus, linear algebra, differential equations, probability theory, statistics, and numerical analysis. Built strong analytical and problem-solving foundation essential for understanding AI/ML algorithms, performance optimization, and computational complexity in large-scale infrastructure systems.

Mathematical Analysis • Statistics • Numerical Methods • Linear Algebra

Academic Foundation

Software Engineering

Systems design, architecture patterns, enterprise software development methodologies, and scalable application frameworks

Mathematical Computing

Statistical analysis, computational mathematics, algorithmic optimization, and mathematical foundations for AI/ML

System Architecture

Distributed systems design, network architecture, high-performance computing principles, and infrastructure scalability

Educational Philosophy

"The combination of computer science expertise and mathematical foundations provides the perfect foundation for understanding both the theoretical principles and practical applications of modern AI/ML infrastructure architecture."

Professional Communities

Active participant in global professional communities spanning AI, HPC, cloud computing, and technology domains

29 Professional Groups

Active member in specialized communities and technical forums worldwide

Key Professional Communities

Big Data & AI

388,963 members

Data Science | Machine Learning | Deep Learning | AI

Project Manager Community

698,184 members

Best Group for Project Management

Cloud Computing

545,631 members

AWS, Azure, GCP, IBM, Alibaba, OCI

High Performance Computing

27,763 members

HPC Infrastructure & Supercomputing

Auto OEM & Dealer Network

409,895 members

World's Largest Automotive Group

Linux Expert

195,565 members

Linux Systems & Open Source

Additional Specialized Communities

CAE & Engineering
CAD, CAE, FEM, MBD & Optimization

AI & ML
OpenAI, ChatGPT, NLP, AI Agents

Scientific Computing
HPC-AI Advisory Council, CSSC

Beyond Technology

When not architecting AI infrastructures, I find balance and inspiration through sports and music

Cricket Enthusiast

The Gentleman's Game

Cricket has been a lifelong passion - from following international matches to understanding the strategic complexities that mirror the analytical thinking required in AI architecture.

Cricket Interests:

  • International cricket analysis and statistics
  • Following World Cup and tournament strategies
  • Player performance analytics and data trends
  • Team dynamics and leadership insights

"Cricket teaches patience, strategy, and the importance of both individual excellence and team collaboration - principles that directly apply to leading complex infrastructure projects."

Music & Singing

Creative Expression

Music provides the perfect counterbalance to technical work - offering creative expression and emotional release that keeps me energized and inspired.

Musical Journey:

  • Vocal performance and singing practice
  • Exploring diverse musical genres and styles
  • Bollywood classics and contemporary hits
  • International music appreciation

"Music teaches rhythm, timing, and the art of harmonious collaboration - qualities essential for orchestrating complex AI infrastructure deployments."

Work-Life Harmony

"The analytical precision required for AI architecture finds perfect balance in the strategic thinking of cricket and the creative flow of music. These passions keep me grounded, inspired, and bring fresh perspectives to solving complex technical challenges."

Strategic Thinking

Cricket strategy enhances architectural planning

Creative Problem-Solving

Musical creativity drives innovative solutions

Team Leadership

Sports and music build collaborative skills

Let's Build the Future Together

Ready to transform your AI/ML infrastructure? Let's discuss how we can accelerate your journey to production-scale AI solutions.

Professional Network

Connect for strategic AI infrastructure discussions and industry insights

LinkedIn Profile

Technical Collaboration

Explore open-source contributions and technical implementations

GitHub Profile

Available on Demand

Open to strategic AI infrastructure consulting and enterprise architecture engagements upon request