Sidekick Technical Design

Version 1.0
1. Introduction
1.1 Purpose
The purpose of this technical design document is to provide a comprehensive and detailed description of the 'Sidekick' AI Assistant platform, a web-based solution that provides secure access to foundation AI models (leveraging OpenAI’s frontier models) and custom AI assistants. This document aims to furnish IT managed service providers, responsible for administering Microsoft Azure services on behalf of our customers, with crucial information regarding the design, deployment, security, integration, and maintenance aspects of the platform.
The goal is to ensure that the platform is understood to be secure, robust, and efficient, thereby facilitating its seamless deployment and management in diverse customer environments. By outlining the technical intricacies and operational guidelines, this document serves as a foundational reference to support the successful implementation and sustained operation of the 'Sidekick' AI Assistant platform.
1.2 Scope
This document covers the technical architecture, deployment strategies, security measures, integration points, and maintenance protocols for the 'Sidekick' AI Assistant platform.
This document focuses on the deployment of the platform within customer Microsoft Azure environments, ensuring that all configurations meet the necessary security and compliance standards. Furthermore, the scope encompasses the integration of the platform with existing customer systems, performance optimization strategies, and the support and maintenance services required to ensure ongoing reliability.
This technical design document does not cover business-related aspects such as pricing models, marketing strategies, or end-user training documentation, focusing solely on the technical implementation and operational aspects.
1.3 Audience
This technical design document is intended for a diverse audience comprising internal and external stakeholders involved in the deployment and management of the 'Sidekick' AI Assistant platform. The primary audience includes:
- Internal Propella Team: Engineers, developers, and technical architects within the Propella team responsible for designing, developing, and deploying the platform. This section of the audience will benefit from understanding the detailed architecture, security protocols, and integration points to ensure the platform's reliability and robustness.
- Customer Stakeholders: This group includes both technical and non-technical stakeholders within the customer organisations. Technical stakeholders, such as IT managers, system architects, and developers, will require an in-depth understanding of the platform's deployment and operational aspects. Non-technical stakeholders, such as business managers and process owners, will benefit from insights into how the platform's functionalities align with business objectives and improve operational efficiency.
- Customer IT Providers: Managed service providers responsible for administering Microsoft Azure environments on behalf of the customers. These IT providers will play a crucial role in provisioning access to Azure resources, configuring environments, and ensuring that the deployment adheres to security standards and operational protocols.
By addressing the needs and expectations of these varied audience groups, this document aims to facilitate a seamless and efficient deployment process, ensuring that all parties are well-informed and can effectively collaborate to support the successful implementation of the 'Sidekick' AI Assistant platform.
1.4 Overview
The platform contains the following functional components:
- The ‘Home Base’ landing page - all Assistants are accessible from this page.
- The Chat History panel, which persists on the left-hand side of the interface – provides the ability to review historical chats, return to the landing page via the Home Base button, and access user-specific functions.
- Assistant pages – dedicated pages where users can initiate chats with defined Assistants. These Assistants can be general AI models, such as OpenAI’s GPT-4o, or custom Assistants that integrate with specific data sources, such as our Tax Assistant (which integrates with the ATO website) and Law Assistant (which integrates with AustLII).
2. System Architecture
2.1 Architecture Overview
The Sidekick AI Assistant platform is a cloud-native solution hosted on Microsoft Azure. It offers specialised AI assistants (e.g. for law, tax) that interact with users. These assistants are centrally managed within a single Assistants microservice, while an Orchestrator service intelligently routes requests to the appropriate assistant or external services, such as Propella Dynamic Dialogue (a service that offers conversational chatbot functionality).
The platform primarily operates in production mode, with a development environment deployed only by specific agreement. This flexible design allows the platform to scale seamlessly and integrate with external OpenAI models, PostgreSQL databases, and knowledgebases.
The following diagram illustrates the overall solution architecture.
2.2 Components Description
Users
- Users interact with the Sidekick platform through a web or mobile front-end interface.
Linux Virtual Machines (Production)
- Hosts key services as Docker containers on Linux VMs, ensuring scalability and reliability:
Front-End Service
- Manages user interactions and sends requests to the Orchestrator service.
Orchestrator Service
- Routes incoming requests to the relevant assistant in the Assistants microservice or to external endpoints like Propella Dynamic Dialogue (a separate AI service not included within the scope of this solution).
Assistants Service
- Central service for all AI assistants, each specialising in different areas (e.g., law, tax, finance). Uses OpenAI models, Bing Search, and knowledge bases for enhanced responses.
External Endpoints (Future Integration)
- Placeholder for future services, such as Propella Dynamic Dialogue, to support dynamic interactions beyond Sidekick’s core assistants.
Conditional Features

Networking Resources (If ENABLE_VNET is true)
- Includes a resource VNet with private endpoints and DNS zones necessary for secure communication between resources within the VNet. The resource VNet is also peered with the VM’s network for seamless integration.
App Registrations
- Manages authentication using Azure Active Directory (AAD) for both user logins and service connections.
Storage Accounts
- Stores user-uploaded files and knowledge base documents.
Cognitive Services (Azure OpenAI API service)
- Provides LLM-based responses for the Assistants microservice.
Bing Search Integration
- Allows assistants (e.g., tax assistant) to query external websites such as the Australian Tax Office (ATO) for additional information and resources.
Azure Database for PostgreSQL
- Stores chat history and knowledge base data. If disabled, users must provide their own PostgreSQL instance.
Azure Container Jobs and Container Job Environment (If ENABLE_KB is true)
- Processes and transforms knowledge base documents so that assistants can query them (a processing sketch is provided at the end of this section).
Application Insights and Log Analytics Workspace
- Provides logging and monitoring of the platform to ensure reliability and track performance metrics.
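To make the knowledge base pipeline referenced above more concrete, the following is a minimal Python sketch of the document processing performed by the Azure Container Jobs component: documents are split into chunks and embedded via Azure OpenAI before being written to the knowledge base store. The environment variable names, chunk size, and embedding deployment name are illustrative assumptions, not the platform’s actual configuration.

    import os
    from openai import AzureOpenAI

    # Illustrative client configuration; endpoint, key, and deployment name are assumptions.
    client = AzureOpenAI(
        azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
        api_key=os.environ["AZURE_OPENAI_API_KEY"],
        api_version="2024-02-01",
    )

    def chunk_text(text: str, size: int = 1000) -> list[str]:
        """Split a document into fixed-size chunks for embedding."""
        return [text[i:i + size] for i in range(0, len(text), size)]

    def embed_document(text: str) -> list[tuple[str, list[float]]]:
        """Return (chunk, embedding) pairs ready to be written to the knowledge base store."""
        pairs = []
        for chunk in chunk_text(text):
            response = client.embeddings.create(
                model="text-embedding-3-small",  # assumed embedding deployment name
                input=chunk,
            )
            pairs.append((chunk, response.data[0].embedding))
        return pairs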
2.3 Interactions and Data Flow
1. User Interaction with Front-End
Users engage with the Front-End Service through a web or mobile interface. They can ask questions or provide inputs (e.g., uploading files).
2. Routing via Orchestrator Service
The Orchestrator Service receives the user’s request and determines the appropriate action (a routing sketch is provided after this data flow):
Route to Assistants Microservice: If the request requires a specific assistant (e.g. legal or tax query).
Route to External Dynamic Dialogue Endpoint: If the request requires Propella’s Dynamic Dialogue system.
3. Assistant Query Handling and Response Generation
The Assistants Microservice processes the request and invokes the necessary OpenAI models (via Sidekick’s Azure OpenAI instance or the customer’s own instance).
If relevant, the assistant queries a knowledge base that has been processed by the knowledge base pipeline.
4. Knowledgebase Processing with Azure Container Jobs (Optional)
If the knowledge base feature is enabled, knowledge base documents are processed and indexed, enabling assistants to query them efficiently.
5. Database Access for Chat History and Knowledge base Data
Chat interactions are logged in the PostgreSQL database. If the database is disabled, the platform integrates with the user’s own PostgreSQL instance.
Knowledgebase data, if used, is stored here as well.
6. Storage of User Files and Logs
User-uploaded files and knowledgebase documents are stored in Storage Accounts.
Interaction logs are saved in the PostgreSQL database.
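The routing decision described in step 2 can be illustrated with a minimal Python sketch. The service URLs, the request shape, and the "target" field used to select Propella Dynamic Dialogue are assumptions for illustration only.

    import requests

    # Illustrative service endpoints; the real addresses are internal to the deployment.
    ASSISTANTS_URL = "http://assistants:8000/chat"
    DYNAMIC_DIALOGUE_URL = "https://dynamic-dialogue.example.com/chat"

    def route_request(payload: dict) -> dict:
        """Route a front-end request to the Assistants service or an external endpoint."""
        if payload.get("target") == "dynamic_dialogue":
            response = requests.post(DYNAMIC_DIALOGUE_URL, json=payload, timeout=30)
        else:
            # Default: send the request to the named assistant (e.g. "law", "tax").
            response = requests.post(ASSISTANTS_URL, json=payload, timeout=30)
        response.raise_for_status()
        return response.json()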
2.4 Technology Stack
| Technology | Description |
| Azure Linux Virtual Machines | Hosts the front-end, orchestrator, and assistant services, each running as Docker containers. |
| Azure Storage Accounts | Manages files uploaded by users and knowledgebase documents. |
| Azure Cognitive Services (OpenAI) | Provides LLM-based AI capabilities for the assistants. |
| Azure Bing Search | Enables assistants to query external sources like the Australian Tax Office (ATO) for dynamic and enriched responses. |
| Azure Database for PostgreSQL | Stores chat history and knowledgebase data. If not used, the platform integrates with the user’s PostgreSQL instance. |
| Azure Container Jobs | Processes and transforms knowledgebase documents. |
| Microsoft Azure Active Directory (AAD) | Secures the platform with authentication and access management. |
| Application Insights and Log Analytics Workspace | Tracks performance metrics and logs platform activity for monitoring and diagnostics. |
3. Deployment Model
3.1 Deployment Topology
The infrastructure resources correspond to those detailed in Section 2. All deployments are authenticated using an Azure Service Principal for secure access and management.
Azure DevOps
Propella uses its own Azure DevOps instance to manage the deployment process. The source code for the platform is organised into separate branches for development and production within the same repository.
Virtual Machine
A single Linux virtual machine hosts all the platform services, with each service deployed as a Docker container. The VM regularly pulls updated container images from Propella’s private container registry, ensuring the platform is always running the latest version of the code.
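As an illustration of the update mechanism described above, the following Python sketch pulls the latest images and restarts the containers. The compose file location and the use of a cron or systemd timer to schedule the job are assumptions; the actual update job may differ.

    import subprocess

    COMPOSE_FILE = "/opt/sidekick/docker-compose.yml"  # assumed location of the compose file

    def update_platform() -> None:
        """Pull the latest images from the private registry and restart the containers."""
        subprocess.run(["docker", "compose", "-f", COMPOSE_FILE, "pull"], check=True)
        subprocess.run(["docker", "compose", "-f", COMPOSE_FILE, "up", "-d"], check=True)

    if __name__ == "__main__":
        update_platform()  # typically invoked by cron or a systemd timer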
Customer Onboarding
During the initial setup, customers provide branding files, and their IT provider configures access to their Azure environment. Propella uses Terraform (Infrastructure as Code) to deploy and manage the platform’s infrastructure.
3.2 Environment Configuration
Production Environment:
- The production environment is deployed by default and forms the primary operational platform for the customer.
- Customers can request access to beta features (if available), but these are not enabled by default.
- Not all features are activated in the production environment; only the default features as outlined in Section 2 are enabled.
Development Environment:
- Optional and used internally by Propella for testing and staging updates before deployment to production.
- Maintained separately with strict access controls.
3.3 High Availability and Scalability
High Availability
- The single VM hosting the platform services can be deployed in a high-availability configuration, such as zone redundancy or additional instances, at an additional cost, to meet specific business demands.
- The current basic configuration is sufficient for most use cases and can be easily recreated or recovered using automation in case of a disaster.
- The Azure Database for PostgreSQL, used to store chat history, is zone redundant and backed up by Azure daily, ensuring data can be recovered in the event of a disaster.
Scalability
- Propella can scale the platform by increasing storage or computational capacity based on application usage or business requirements. This is achieved using Terraform automation.
4. Security
4.1 Security Requirements
The Sidekick platform adheres to the following security requirements:
- Data Protection: All user data and knowledgebase documents must be protected during transit and at rest using industry-standard encryption.
- Authentication: Only authorised users and systems can access the platform through secure authentication mechanisms such as Microsoft Entra ID.
- Infrastructure Integrity: Resources must be provisioned and managed securely, with updates and patches applied regularly to prevent vulnerabilities.
- Network Security: The platform enforces secure communication protocols (e.g., HTTPS) and blocks unauthorised access to sensitive resources.
4.2 Identity and Access Management
Authentication:
The platform uses Microsoft Entra ID for authentication, ensuring secure and seamless single sign-on (SSO) for users.
Role-Based Access Control (RBAC):
RBAC is implemented at both the platform and infrastructure levels to restrict access to authorised personnel.
a) Platform Access:
Users within the organisation (as identified via Microsoft Entra ID) will have access to the platform by default, but no access to the underlying infrastructure resources such as Azure Storage, Azure PostgreSQL, or container registries.
b) Resource Access:
Only Propella’s operational team and authorised customer IT personnel (as configured during deployment) will have specific access to manage or audit underlying infrastructure.
c) Azure Service Principal:
Propella uses an Azure Service Principal to authenticate Terraform deployments, providing secure and automated resource provisioning without exposing sensitive credentials (an authentication sketch follows this list).
d) Customer Access:
During the onboarding process, customer IT providers configure access permissions for their users, ensuring compliance with their internal policies while adhering to platform security standards.
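The following Python sketch illustrates the non-interactive service principal authentication pattern referred to in item c), using the azure-identity library. The environment variable names and the storage account URL are placeholders; Terraform itself consumes the same credentials through its Azure provider rather than through code like this.

    import os
    from azure.identity import ClientSecretCredential
    from azure.storage.blob import BlobServiceClient

    # Service principal credentials are supplied via environment variables, never hard-coded.
    credential = ClientSecretCredential(
        tenant_id=os.environ["AZURE_TENANT_ID"],
        client_id=os.environ["AZURE_CLIENT_ID"],
        client_secret=os.environ["AZURE_CLIENT_SECRET"],
    )

    # The same credential pattern lets platform services reach Azure resources,
    # e.g. the Storage Account that holds uploaded files (URL is a placeholder).
    blob_service = BlobServiceClient(
        account_url="https://<storage-account>.blob.core.windows.net",
        credential=credential,
    )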
4.3 Data Security
Encryption:
- Data in Transit: All data exchanged between users, the platform, and Azure resources is encrypted using TLS 1.2 or higher (a connection example follows this subsection).
- Data at Rest: Knowledgebase documents, uploaded files, and chat history stored in Azure Storage and PostgreSQL are encrypted using Azure-managed encryption keys (AES-256).
Access Control:
- Access to sensitive data is limited based on RBAC and customer-specific configurations.
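As a concrete example of encryption in transit, the sketch below opens a connection to the chat-history database with TLS enforced, assuming psycopg2 as the driver; the host, database, and credential variables are placeholders.

    import os
    import psycopg2

    # sslmode="require" refuses the connection unless TLS is negotiated;
    # Azure Database for PostgreSQL also enforces TLS on the server side.
    conn = psycopg2.connect(
        host=os.environ["PGHOST"],          # e.g. <server>.postgres.database.azure.com
        dbname=os.environ["PGDATABASE"],
        user=os.environ["PGUSER"],
        password=os.environ["PGPASSWORD"],
        sslmode="require",
    )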
4.4 Compliance and Regulatory Considerations
The platform is designed to comply with Australian-specific regulatory standards and requirements:
- Australian Privacy Principles (APPs): Ensures compliance with the Australian Privacy Act 1988 for data governance and privacy, protecting customer data in line with local laws.
- Data Residency: All platform resources are deployed in the Australia East Azure region, and all customer data remains within Australian data centres, ensuring data sovereignty.
- Customer Environment: All data processed and stored by the platform remains within the customer’s Azure environment, without being transferred outside Australian borders.
- Customisable Compliance: The platform can be configured to meet additional compliance requirements as requested by the customer, ensuring alignment with industry-specific standards.
4.5 Threat Model and Mitigation
Potential Threats:
- Unauthorised Access: Risk of unauthorised users gaining access to sensitive data or systems.
- Data Breaches: Exposure of customer data due to weak encryption or misconfigurations.
- Malware and Exploits: Potential exploitation of vulnerabilities in the VM or services.
- Denial of Service (DoS): Attacks that disrupt the availability of the platform.
Mitigation Strategies:
- Strong Authentication: Enforcing secure password policies and SSO through Microsoft Entra ID.
- Regular Updates: Applying security patches to VMs and containerised services to prevent vulnerabilities.
- Encryption: Using TLS for data in transit and AES-256 for data at rest to prevent unauthorised access.
- Disaster Recovery: Automated recovery processes using Terraform to minimise downtime in case of service disruptions.
Monitoring and Alerts:
- Azure App Insights and Log Analytics Workspace are used for real-time monitoring and detection of anomalies.
- Proactive alerts are set up to notify the operational team of potential threats or irregular activities.
Access Logging:
- In addition to fine-grained access control, all platform access is logged to provide transparency and detect any potential unauthorised actions.
- Default Azure security features are also leveraged to track and audit platform access, ensuring that any suspicious activities are detected promptly.
5. Integration and Interoperability
5.1 Integration Points
The Sidekick platform is currently a standalone application that resides within a customer’s Azure environment. Integration with external endpoints includes:
- Integration with OpenAI’s API for chat completions (calls to OpenAI models such as GPT-4o), mediated via the Azure OpenAI service framework (an illustrative call is sketched below).
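The sketch below illustrates such a chat completion call using the openai Python library’s Azure client. The endpoint and API key variable names, the API version, and the "gpt-4o" deployment name are assumptions for illustration.

    import os
    from openai import AzureOpenAI

    client = AzureOpenAI(
        azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
        api_key=os.environ["AZURE_OPENAI_API_KEY"],
        api_version="2024-02-01",
    )

    # "model" is the Azure OpenAI deployment name (assumed here to be "gpt-4o").
    completion = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": "You are the Sidekick Tax Assistant."},
            {"role": "user", "content": "What is the current company tax rate?"},
        ],
    )
    print(completion.choices[0].message.content)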
5.2 API and SDK
Detail the API and SDKs provided for integration.
5.3 Interoperability Standards
Discuss the standards followed to ensure interoperability.
6. Support and Maintenance
6.1 Support Plan & SLAs
The Support Plan for the 'Sidekick' AI Assistant platform and the associated Service Level Agreement are outlined in a separate document, located here:
6.2 Monitoring and Logging
Monitoring Mechanisms:
1. Application Performance Monitoring (APM)
Azure Application Insights
- Real-time monitoring of application performance.
- Tracks response times, failure rates, and application dependencies.
- Detects and diagnoses performance anomalies.
2. Infrastructure Monitoring
Azure Monitor
- Comprehensive monitoring of Azure resources including VMs, Kubernetes clusters, databases, and networking components.
- Collects and analyses metrics and logs from the entire infrastructure.
- Enables creation of custom dashboards to visualise resource health and performance.
Azure Log Analytics
- Aggregates and correlates logs from various Azure resources.
- Enables powerful querying and visualisation of log data.
3. Network Monitoring
Azure Network Watcher
- Provides network diagnostic and visualisation tools.
- Monitors network performance and identifies network-related issues.
Traffic Analytics
- Analyses network traffic and provides insights into the security and efficiency of network usage.
4. Security Monitoring
Azure Security Centre
- Provides a unified view of the security posture across Azure subscriptions.
- Monitors for threats and vulnerabilities in real-time.
- Recommends security best practices and compliance policies.
5. Service Health Monitoring
Azure Service Health
- Notifies about Azure service incidents, planned maintenance, and health advisories.
- Customisable alert rules to receive notifications relevant to specific Azure services.
Logging Mechanisms
1. Application Logging
Structured Logs
- Logs generated by the application components, such as user interactions, API calls, error messages, and transaction details.
- Use of structured logging formats (e.g., JSON) for consistency and easier processing (a formatter sketch follows the Logging Mechanisms list).
Azure Diagnostic Logs
- Detailed logs from Azure resources and applications.
- Collects logs from the platform’s Azure resources, such as the virtual machine, Azure Database for PostgreSQL, and Storage Accounts.
2. Infrastructure and System Logs
Azure VM Logs: System and application logs from virtual machines (Windows event logs, Linux syslog).
Container Logs: Logs from the Docker containers running on the platform VM, including the front-end, orchestrator, and assistants services.
Azure Diagnostic Extension: Collects diagnostic data (logs, metrics) from Azure VMs and sends it to Azure Monitor.
3. Audit Logs
Azure Active Directory (Azure AD) Logs: Logs of user sign-ins, application registrations, and role assignments.
Azure Activity Log: Logs of all administrative operations on Azure resources (create, update, delete).
4. Security Logs
Azure Sentinel
- Collects and analyses security data from across the enterprise.
- Uses AI to detect and respond to incidents automatically.
Azure Monitor Security Logs
- Aggregates security-related logs for monitoring and alerting.
5. Custom Logs
- Logging custom events and metrics specific to the 'Sidekick' AI Assistant platform.
- Use Azure Monitor Custom Logs to collect and query custom log data.
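The structured (JSON) application logging noted under Structured Logs can be implemented with the Python standard library alone, as in the following sketch; the logger name and field set are illustrative.

    import json
    import logging

    class JsonFormatter(logging.Formatter):
        """Emit each log record as a single JSON object for ingestion by Log Analytics."""
        def format(self, record: logging.LogRecord) -> str:
            return json.dumps({
                "timestamp": self.formatTime(record),
                "level": record.levelname,
                "logger": record.name,
                "message": record.getMessage(),
            })

    handler = logging.StreamHandler()
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger("sidekick.assistants")  # illustrative logger name
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)

    logger.info("Assistant response generated")  # -> {"timestamp": "...", "level": "INFO", ...}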
Key Practices
1. Centralised Log Management
- Use of centralised log management platforms (Azure Monitor, Log Analytics) for unified logging across all components.
- Efficient querying and analysis of logs from multiple sources (a query sketch follows these practices).
2. Alerting and Notification
- Define alert rules in Azure Monitor to notify relevant personnel upon detecting anomalies or threshold breaches.
- Use Azure Action Groups to configure notification methods (email, SMS, webhook) for alerts.
3. Retention Policies
- Define log retention policies based on compliance requirements and operational needs.
- Regularly archive or delete outdated logs to manage storage costs and ensure compliance.
4. Continuous Improvement
- Regular review and refinement of monitoring and logging configurations to adapt to evolving requirements and optimise performance.
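As an example of the centralised querying described under Key Practices, the sketch below runs a KQL query against the Log Analytics workspace using the azure-monitor-query library. The workspace ID, the AppRequests table, and the query itself are assumptions for illustration.

    from datetime import timedelta
    from azure.identity import DefaultAzureCredential
    from azure.monitor.query import LogsQueryClient

    client = LogsQueryClient(DefaultAzureCredential())

    # Count failed requests per hour over the last day (workspace ID and table are placeholders).
    response = client.query_workspace(
        workspace_id="<log-analytics-workspace-id>",
        query="AppRequests | where Success == false | summarize failures = count() by bin(TimeGenerated, 1h)",
        timespan=timedelta(days=1),
    )
    for table in response.tables:
        for row in table.rows:
            print(row)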
6.3 Disaster Recovery Plan
Objective: The objective of this Disaster Recovery Plan is to ensure the 'Sidekick' AI Assistant platform can quickly and effectively recover from any disaster scenario, minimising downtime and data loss while ensuring continuity of service.
Scope: This plan covers all components of the 'Sidekick' AI Assistant platform, including infrastructure, applications, data, and integrations, deployed within Microsoft Azure environments. The scope includes disaster recovery strategies, roles and responsibilities, and step-by-step procedures to recover from various disaster scenarios.
6.4 Software Updates and Patch Management
- Discuss the process for software updates and patch management.
7. Performance and Optimisation
7.1 Performance Requirements
- Outline the key performance metrics.
7.2 Load Testing
- Describe the load testing strategy and results.
7.3 Performance Tuning
- Provide strategies for performance tuning.
8. Appendices
8.1 Glossary


8.2 References
8.3 Contact Information
Solution Architect
Alistair Toms
M: 0421 190 338
Data Engineer
Yi Xiang Chee
M: 0478 913 462
Software Engineer
Vincent Taneli
M: 0406 221 916