<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Projects on Hitesh Pattanayak</title><link>/projects/</link><description>Recent content in Projects on Hitesh Pattanayak</description><generator>Hugo</generator><language>en-us</language><atom:link href="/projects/index.xml" rel="self" type="application/rss+xml"/><item><title>Index and Search Petabytes of Data</title><link>/projects/data-pipeline-architecture/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>/projects/data-pipeline-architecture/</guid><description>&lt;h2 id="project-overview">Project Overview&lt;/h2>
&lt;p>Architected and implemented comprehensive end-to-end data pipeline infrastructure for processing M365 workload events using modern data engineering technologies. Built scalable real-time processing systems handling high-volume streaming data with advanced analytics and search capabilities.&lt;/p>
&lt;h2 id="key-achievements">Key Achievements&lt;/h2>
&lt;h3 id="data-pipeline-infrastructure">Data Pipeline Infrastructure&lt;/h3>
&lt;ul>
&lt;li>Architected end-to-end data pipeline processing M365 workload events using Databricks, Apache Spark, and Delta Lake&lt;/li>
&lt;li>Implemented high-volume streaming data processing with Event Hubs and Azure Container Apps&lt;/li>
&lt;li>Developed complex Databricks jobs for event processing, indexing, and content enrichment&lt;/li>
&lt;li>Implemented SCD2 (Slowly Changing Dimension Type 2) algorithms for historical data tracking and temporal data management&lt;/li>
&lt;/ul>
&lt;h3 id="pipeline-services">Pipeline Services&lt;/h3>
&lt;ul>
&lt;li>&lt;strong>Data Ingestion Service&lt;/strong>: Event processing service publishing to multiple Event Hubs (backup, backfill, retention, recovery points, threats) with tenant resolution and validation&lt;/li>
&lt;li>&lt;strong>Data Discovery Service&lt;/strong>: Go-based REST API for browsing indexed metadata and data discovery with filtering capabilities and AI-enhanced natural language search&lt;/li>
&lt;/ul>
&lt;h3 id="ai-powered-semantic-search">AI-Powered Semantic Search&lt;/h3>
&lt;ul>
&lt;li>Implemented RAG (Retrieval Augmented Generation) pipeline for semantic search over backed-up M365 data&lt;/li>
&lt;li>Stored backup data as vector embeddings in CosmosDB, queried via hybrid vector + keyword search&lt;/li>
&lt;li>Natural language queries translated to structured metadata filters using Azure OpenAI Chat Completions&lt;/li>
&lt;li>Filter-generation prompt combines a system prompt with few-shot examples for reliable, structured output&lt;/li>
&lt;li>Enables users to search petabytes of backup data using plain English queries without knowledge of underlying schema&lt;/li>
&lt;/ul>
&lt;h3 id="ai-developer-tooling">AI Developer Tooling&lt;/h3>
&lt;ul>
&lt;li>&lt;strong>Elastic Dashboard Changelog&lt;/strong>: Python script that diffs &lt;code>.ndjson&lt;/code> Kibana dashboard files (unreadable in standard GitHub diffs) and feeds the structured diff to the Anthropic API to generate a human-readable changelog&lt;/li>
&lt;li>&lt;strong>Security Fix Automation&lt;/strong>: Local AI skill that ingests Cycode security findings and applies targeted fixes using LLM-assisted code correction with full finding context&lt;/li>
&lt;/ul>
&lt;h3 id="observability--monitoring">Observability &amp;amp; Monitoring&lt;/h3>
&lt;ul>
&lt;li>Implemented comprehensive monitoring solutions using Elastic Stack (Elasticsearch, Kibana)&lt;/li>
&lt;li>Created custom dashboards for service metrics, performance monitoring, and operational insights&lt;/li>
&lt;li>Automated alerting and incident management via Incident.io integration&lt;/li>
&lt;li>Designed operational runbooks and failure models for production systems&lt;/li>
&lt;/ul>
&lt;h3 id="infrastructure--operations">Infrastructure &amp;amp; Operations&lt;/h3>
&lt;ul>
&lt;li>Designed detailed troubleshooting procedures and alert response protocols&lt;/li>
&lt;li>Created escalation paths for critical data pipeline components&lt;/li>
&lt;li>Implemented Infrastructure as Code using Pulumi&lt;/li>
&lt;li>Integrated Azure OpenAI for enhanced search capabilities&lt;/li>
&lt;/ul>
&lt;h2 id="technologies-used">Technologies Used&lt;/h2>
&lt;ul>
&lt;li>&lt;strong>Data Processing&lt;/strong>: Databricks, Apache Spark, Delta Lake, PySpark&lt;/li>
&lt;li>&lt;strong>Streaming&lt;/strong>: Azure Event Hubs, real-time processing&lt;/li>
&lt;li>&lt;strong>Backend Development&lt;/strong>: Go, Python, TypeScript/Node.js&lt;/li>
&lt;li>&lt;strong>Databases&lt;/strong>: CosmosDB, Azure SQL Warehouse&lt;/li>
&lt;li>&lt;strong>Infrastructure&lt;/strong>: Azure Container Apps, Pulumi (IaC)&lt;/li>
&lt;li>&lt;strong>Monitoring&lt;/strong>: Elasticsearch, Kibana, Elastic Stack&lt;/li>
&lt;li>&lt;strong>AI/ML&lt;/strong>: Azure OpenAI, RAG pipeline, CosmosDB hybrid vector search, few-shot prompting, semantic metadata filter generation&lt;/li>
&lt;li>&lt;strong>DevOps&lt;/strong>: Incident.io, automated alerting, operational runbooks&lt;/li>
&lt;/ul></description></item><item><title>Enterprise Authentication, Authorization and User Management</title><link>/projects/enterprise-auth-system/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>/projects/enterprise-auth-system/</guid><description>&lt;h2 id="project-overview">Project Overview&lt;/h2>
&lt;p>Developed and maintained a comprehensive user management microservice handling authentication, authorization, and user lifecycle management for an enterprise-scale, multi-tenant SaaS platform. Built to serve thousands of users across multiple organizations with complex authorization requirements.&lt;/p>
&lt;h2 id="key-achievements">Key Achievements&lt;/h2>
&lt;h3 id="authentication--authorization-system">Authentication &amp;amp; Authorization System&lt;/h3>
&lt;ul>
&lt;li>Built robust REST APIs for user management, role-based access control (RBAC), and organization onboarding using Go, Chi router, and OpenAPI specifications&lt;/li>
&lt;li>Integrated Auth0 authentication platform with custom role and permission management&lt;/li>
&lt;li>Supported social connections, machine-to-machine authentication, and enterprise identity providers&lt;/li>
&lt;li>Developed sophisticated authorization system with hierarchical permissions and policy-based access control&lt;/li>
&lt;/ul>
&lt;h3 id="database--data-architecture">Database &amp;amp; Data Architecture&lt;/h3>
&lt;ul>
&lt;li>Designed and implemented CosmosDB data layer with optimized queries for user, role, and organization management&lt;/li>
&lt;li>Implemented proper partitioning strategies across multiple containers for sub-second response times&lt;/li>
&lt;li>Built efficient data models with optimized partition keys and query patterns&lt;/li>
&lt;/ul>
&lt;h3 id="enterprise-features">Enterprise Features&lt;/h3>
&lt;ul>
&lt;li>Implemented fine-grained workload tenant permissions for Azure and Kubernetes services&lt;/li>
&lt;li>Built organization lifecycle management including automated onboarding and user invitations&lt;/li>
&lt;li>Developed group management and complete organization deletion workflows&lt;/li>
&lt;li>Created service account management with client credentials flow for machine-to-machine authentication&lt;/li>
&lt;/ul>
&lt;h3 id="platform-integration">Platform Integration&lt;/h3>
&lt;ul>
&lt;li>Integrated Azure Key Vault for secrets management&lt;/li>
&lt;li>Built automated organization onboarding with Auth0 integration and custom domain validation&lt;/li>
&lt;li>Implemented Microsoft tenant discovery capabilities&lt;/li>
&lt;/ul>
&lt;h3 id="quality--observability">Quality &amp;amp; Observability&lt;/h3>
&lt;ul>
&lt;li>Implemented a comprehensive test suite, including unit tests and integration tests against CosmosDB&lt;/li>
&lt;li>Developed table-driven test patterns achieving high code coverage&lt;/li>
&lt;li>Established observability patterns using OpenTelemetry and structured logging with Clues&lt;/li>
&lt;li>Built comprehensive error handling with context propagation&lt;/li>
&lt;/ul>
&lt;h2 id="technical-highlights">Technical Highlights&lt;/h2>
&lt;ul>
&lt;li>&lt;strong>Authorization Matrix&lt;/strong>: Supporting 100+ permissions across different workload types&lt;/li>
&lt;li>&lt;strong>Multi-tenant Architecture&lt;/strong>: Scalable design serving thousands of users across multiple organizations&lt;/li>
&lt;li>&lt;strong>Testing Excellence&lt;/strong>: Comprehensive unit and integration testing with GoMock&lt;/li>
&lt;/ul>
&lt;h2 id="technologies-used">Technologies Used&lt;/h2>
&lt;ul>
&lt;li>&lt;strong>Backend&lt;/strong>: Go, Chi Router, OpenAPI/Swagger code generation&lt;/li>
&lt;li>&lt;strong>Authentication&lt;/strong>: Auth0, RBAC, policy-based authorization&lt;/li>
&lt;li>&lt;strong>Database&lt;/strong>: Azure CosmosDB&lt;/li>
&lt;li>&lt;strong>Cloud Services&lt;/strong>: Azure Key Vault, Azure Event Hubs&lt;/li>
&lt;li>&lt;strong>Architecture&lt;/strong>: Microservices, multi-tenant SaaS, event-driven architecture&lt;/li>
&lt;li>&lt;strong>Observability&lt;/strong>: OpenTelemetry, structured logging, comprehensive error handling&lt;/li>
&lt;li>&lt;strong>Testing&lt;/strong>: Unit tests, integration tests, table-driven patterns, GoMock&lt;/li>
&lt;li>&lt;strong>DevOps&lt;/strong>: Containerization, auto-generated Dockerfiles&lt;/li>
&lt;/ul></description></item><item><title>Canario (Corso) - Microsoft 365 Backup Solution</title><link>/projects/canario-m365-backup/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>/projects/canario-m365-backup/</guid><description>&lt;h2 id="project-overview">Project Overview&lt;/h2>
&lt;p>Developed an enterprise-grade, open-source data protection engine for Microsoft 365 environments, creating the first comprehensive backup solution addressing critical M365 data protection needs for IT administrators.&lt;/p>
&lt;h2 id="key-technical-achievements">Key Technical Achievements&lt;/h2>
&lt;h3 id="backend-development">Backend Development&lt;/h3>
&lt;ul>
&lt;li>Developed enterprise-grade backup and restore system for Microsoft 365 services (Exchange, OneDrive, SharePoint, Teams)&lt;/li>
&lt;li>Implemented a CLI with a comprehensive command structure supporting backup, restore, export, and debug operations&lt;/li>
&lt;li>Built modular architecture with clear separation between API layer (&lt;code>/pkg&lt;/code>), CLI controller, and internal services&lt;/li>
&lt;li>Designed repository abstraction layer supporting multiple storage backends (S3, filesystem)&lt;/li>
&lt;/ul>
&lt;h3 id="microsoft-365-integration">Microsoft 365 Integration&lt;/h3>
&lt;ul>
&lt;li>Integrated with Microsoft Graph API for seamless access to M365 data&lt;/li>
&lt;li>Implemented service-specific backup handlers for:
&lt;ul>
&lt;li>&lt;strong>Exchange&lt;/strong>: Email backup and restore&lt;/li>
&lt;li>&lt;strong>OneDrive&lt;/strong>: File backup and synchronization&lt;/li>
&lt;li>&lt;strong>SharePoint&lt;/strong>: Site and document library protection&lt;/li>
&lt;li>&lt;strong>Teams&lt;/strong>: Conversation and channel data backup&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Built robust authentication and authorization flows for Microsoft 365 environments&lt;/li>
&lt;/ul>
&lt;h3 id="enterprise-features">Enterprise Features&lt;/h3>
&lt;ul>
&lt;li>Developed comprehensive backup lifecycle management (create, list, delete, restore)&lt;/li>
&lt;li>Implemented data export functionality with multiple format support&lt;/li>
&lt;li>Built advanced debugging tools for troubleshooting backup operations&lt;/li>
&lt;li>Created extensive test coverage including end-to-end testing infrastructure&lt;/li>
&lt;li>Designed for enterprise scalability and security requirements&lt;/li>
&lt;/ul>
&lt;h3 id="production-readiness">Production Readiness&lt;/h3>
&lt;ul>
&lt;li>Currently in Beta with active community engagement&lt;/li>
&lt;li>Built production-ready architecture with enterprise security standards&lt;/li>
&lt;li>Implemented comprehensive error handling and logging&lt;/li>
&lt;li>Created CI/CD pipeline with automated dependency management&lt;/li>
&lt;/ul>
&lt;h2 id="technical-skills-demonstrated">Technical Skills Demonstrated&lt;/h2>
&lt;ul>
&lt;li>&lt;strong>Languages&lt;/strong>: Go (advanced CLI applications, microservices architecture)&lt;/li>
&lt;li>&lt;strong>Cloud Integration&lt;/strong>: Microsoft 365, Azure, AWS S3&lt;/li>
&lt;li>&lt;strong>API Integration&lt;/strong>: Microsoft Graph API, RESTful services&lt;/li>
&lt;li>&lt;strong>Architecture&lt;/strong>: CLI applications, microservices, repository pattern&lt;/li>
&lt;li>&lt;strong>Testing&lt;/strong>: Unit testing, end-to-end testing, test-driven development&lt;/li>
&lt;li>&lt;strong>DevOps&lt;/strong>: Git, Docker, CI/CD, automated dependency management&lt;/li>
&lt;/ul>
&lt;h2 id="project-impact">Project Impact&lt;/h2>
&lt;ul>
&lt;li>Created the &lt;strong>first open-source solution&lt;/strong> addressing critical M365 data protection needs&lt;/li>
&lt;li>Enabled IT administrators to have full control over Microsoft 365 data backup strategies&lt;/li>
&lt;li>Built for enterprise scalability supporting large-scale M365 deployments&lt;/li>
&lt;li>Active community engagement with ongoing development and feature requests&lt;/li>
&lt;/ul>
&lt;h2 id="technologies-used">Technologies Used&lt;/h2>
&lt;ul>
&lt;li>&lt;strong>Core Language&lt;/strong>: Go&lt;/li>
&lt;li>&lt;strong>Microsoft Integration&lt;/strong>: Microsoft Graph API, Microsoft 365 services&lt;/li>
&lt;li>&lt;strong>Storage Backends&lt;/strong>: AWS S3, filesystem abstraction&lt;/li>
&lt;li>&lt;strong>Architecture&lt;/strong>: CLI-first design, modular microservices&lt;/li>
&lt;li>&lt;strong>Testing Framework&lt;/strong>: Comprehensive unit and end-to-end testing&lt;/li>
&lt;li>&lt;strong>DevOps&lt;/strong>: Docker containerization, CI/CD pipelines&lt;/li>
&lt;/ul></description></item><item><title>Kubernetes Infrastructure &amp; Platform Engineering</title><link>/projects/infracloud-kubernetes-platform/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>/projects/infracloud-kubernetes-platform/</guid><description>&lt;h2 id="project-overview">Project Overview&lt;/h2>
&lt;p>Built a comprehensive bare metal Kubernetes cluster provisioning platform, developing custom controllers and reconciliation logic to manage cluster lifecycle on bare metal infrastructure.&lt;/p>
&lt;h2 id="key-achievements">Key Achievements&lt;/h2>
&lt;ul>
&lt;li>&lt;strong>Kubernetes Components&lt;/strong>: Built custom Kubernetes controller, API server, and scheduler components for bare metal cluster management&lt;/li>
&lt;li>&lt;strong>Reconciliation Logic&lt;/strong>: Developed state management system to ensure desired cluster configuration and handle cluster drift&lt;/li>
&lt;li>&lt;strong>Bootstrap Service&lt;/strong>: Created a service, deployed on private bootstrap machines in client networks, that pulled commands from a centralized SaaS platform and reported status back&lt;/li>
&lt;li>&lt;strong>Bare Metal Provisioning&lt;/strong>: Implemented bare metal readiness detection using DHCP and TFTP protocols&lt;/li>
&lt;li>&lt;strong>Agent-based Architecture&lt;/strong>: Developed agents installed on bare metal machines to handle cluster operations&lt;/li>
&lt;li>&lt;strong>Infrastructure Automation&lt;/strong>: Leveraged Tinkerbell framework for automated bare metal provisioning workflows&lt;/li>
&lt;/ul>
&lt;h2 id="technologies-used">Technologies Used&lt;/h2>
&lt;ul>
&lt;li>&lt;strong>Container Orchestration&lt;/strong>: Kubernetes (custom controllers, API server, scheduler)&lt;/li>
&lt;li>&lt;strong>Programming&lt;/strong>: Go, Python&lt;/li>
&lt;li>&lt;strong>Infrastructure&lt;/strong>: Terraform, bare metal provisioning&lt;/li>
&lt;li>&lt;strong>Protocols&lt;/strong>: DHCP, TFTP for network boot and discovery&lt;/li>
&lt;li>&lt;strong>Provisioning&lt;/strong>: Tinkerbell framework&lt;/li>
&lt;li>&lt;strong>Architecture&lt;/strong>: SaaS platform with distributed agent-based management&lt;/li>
&lt;/ul></description></item><item><title>High-Performance Financial Trading Platform</title><link>/projects/thoughtworks-trading-platform/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>/projects/thoughtworks-trading-platform/</guid><description>&lt;h2 id="project-overview">Project Overview&lt;/h2>
&lt;p>Architected and delivered a microservices-based cryptocurrency trading platform for Voyager Inc., building scalable backend infrastructure to handle high-frequency trading operations and real-time market data processing.&lt;/p>
&lt;h2 id="key-achievements">Key Achievements&lt;/h2>
&lt;ul>
&lt;li>Architected cryptocurrency trading platform with microservices&lt;/li>
&lt;li>Engineered time-series data management with TimescaleDB&lt;/li>
&lt;li>Built event-driven architecture with Apache Kafka for real-time market data&lt;/li>
&lt;li>Achieved sub-millisecond query performance for trading analytics&lt;/li>
&lt;li>Deployed production-grade observability supporting 99.9% uptime&lt;/li>
&lt;/ul>
&lt;h2 id="technologies-used">Technologies Used&lt;/h2>
&lt;ul>
&lt;li>&lt;strong>Backend&lt;/strong>: Go, gRPC, REST APIs&lt;/li>
&lt;li>&lt;strong>Database&lt;/strong>: TimescaleDB, PostgreSQL&lt;/li>
&lt;li>&lt;strong>Messaging&lt;/strong>: Apache Kafka&lt;/li>
&lt;li>&lt;strong>Infrastructure&lt;/strong>: Kubernetes, Microservices&lt;/li>
&lt;li>&lt;strong>Monitoring&lt;/strong>: Production observability&lt;/li>
&lt;/ul></description></item><item><title>Visual Platform Development</title><link>/projects/sureify-visual-platform/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>/projects/sureify-visual-platform/</guid><description>&lt;h2 id="project-overview">Project Overview&lt;/h2>
&lt;p>Developed a visual configuration platform for insurance workflows, enabling non-technical users to create and manage complex business processes through an intuitive interface.&lt;/p>
&lt;h2 id="key-achievements">Key Achievements&lt;/h2>
&lt;ul>
&lt;li>Developed visual configuration platform for insurance workflows&lt;/li>
&lt;li>Built responsive frontend interfaces with modern React patterns&lt;/li>
&lt;li>Created robust backend APIs for workflow management&lt;/li>
&lt;li>Implemented microservices architecture for scalability&lt;/li>
&lt;li>Enhanced user experience with drag-and-drop functionality&lt;/li>
&lt;/ul>
&lt;h2 id="technologies-used">Technologies Used&lt;/h2>
&lt;ul>
&lt;li>&lt;strong>Frontend&lt;/strong>: React, modern JavaScript&lt;/li>
&lt;li>&lt;strong>Backend&lt;/strong>: Node.js, REST APIs&lt;/li>
&lt;li>&lt;strong>Architecture&lt;/strong>: Microservices&lt;/li>
&lt;li>&lt;strong>UI/UX&lt;/strong>: Responsive design, visual workflows&lt;/li>
&lt;li>&lt;strong>Integration&lt;/strong>: Insurance domain APIs&lt;/li>
&lt;/ul></description></item><item><title>Process Automation Solutions</title><link>/projects/process-automation-solutions/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>/projects/process-automation-solutions/</guid><description>&lt;h2 id="project-overview">Project Overview&lt;/h2>
&lt;p>Implemented comprehensive process automation solutions across various business domains, focusing on reducing manual effort and improving operational efficiency.&lt;/p>
&lt;h2 id="key-achievements">Key Achievements&lt;/h2>
&lt;ul>
&lt;li>Implemented RPA solutions for business process automation&lt;/li>
&lt;li>Developed web applications for client requirements&lt;/li>
&lt;li>Built automated workflows for business process optimization&lt;/li>
&lt;li>Significantly reduced manual processing time&lt;/li>
&lt;li>Created maintainable and scalable automation frameworks&lt;/li>
&lt;/ul>
&lt;h2 id="technologies-used">Technologies Used&lt;/h2>
&lt;ul>
&lt;li>&lt;strong>Automation&lt;/strong>: RPA tools and frameworks&lt;/li>
&lt;li>&lt;strong>Web Development&lt;/strong>: Full-stack development&lt;/li>
&lt;li>&lt;strong>Process Design&lt;/strong>: Business workflow optimization&lt;/li>
&lt;li>&lt;strong>Integration&lt;/strong>: Legacy system integration&lt;/li>
&lt;li>&lt;strong>Quality Assurance&lt;/strong>: Automated testing&lt;/li>
&lt;/ul></description></item></channel></rss>