Deployment Architectures

1. Sidecar Deployment

Overview
Run the Spice Runtime as a separate container or process on the same machine as the main application; for example, as a sidecar container in a Kubernetes pod. This approach minimizes communication overhead, as requests to the Spice Runtime travel over the local loopback interface.
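A minimal sketch of this pattern as a Kubernetes pod spec. The application image, environment variable name, and image tag are illustrative assumptions; the runtime image and HTTP port shown reflect common Spice defaults but should be confirmed against the Spice documentation for your version.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app-with-spice
spec:
  containers:
    - name: app
      image: my-app:latest                # illustrative application image
      env:
        - name: SPICE_ENDPOINT            # hypothetical variable the app reads
          value: "http://localhost:8090"  # loopback: both containers share the pod network
    - name: spice
      image: spiceai/spiceai:latest       # assumed public runtime image
      ports:
        - containerPort: 8090             # assumed default Spice Runtime HTTP port
```

Because both containers share the pod's network namespace, the application reaches the runtime at localhost with no cross-node hop.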

Benefits

  • Low-latency communication between the application and the Spice Runtime.
  • Simplified lifecycle management (same pod).
  • Isolated environment without needing a separate microservice.
  • Helps ensure resiliency and redundancy by replicating data across sidecars.

Considerations

  • Each application pod includes a copy of the Spice Runtime, increasing resource usage.
  • Updating the Spice Runtime independently requires updating each pod.
  • Accelerated data is replicated to each sidecar, adding resiliency and redundancy but increasing resource usage and requests to data sources.
  • May increase overall cost due to resource duplication.

Use This Approach When

  • Fast, low-latency interactions between the application and the Spice Runtime are needed (e.g., real-time decision-making).
  • Scaling needs are small or moderate, making duplication of the Spice Runtime in each pod acceptable.
  • Keeping the architecture simple without additional services or load balancers is preferred.
  • Performance and latency are prioritized over cost and complexity.

Example Use Case
A real-time trading bot or a data-intensive application that relies on immediate feedback, where minimal latency is critical. Running both containers in the same pod ensures very fast data exchange.


2. Microservice Deployment (Single or Multiple Replicas)

Overview
The Spice Runtime operates as an independent microservice. Multiple replicas may be deployed behind a load balancer to achieve high availability and absorb spikes in demand.
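A sketch of this pattern as a Kubernetes Deployment with multiple replicas fronted by a Service, which provides the load balancing and service discovery. Names, replica count, image, and port are illustrative assumptions.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: spice-runtime
spec:
  replicas: 3                             # multiple replicas for availability
  selector:
    matchLabels: { app: spice-runtime }
  template:
    metadata:
      labels: { app: spice-runtime }
    spec:
      containers:
        - name: spice
          image: spiceai/spiceai:latest   # assumed public runtime image
          ports:
            - containerPort: 8090         # assumed default Spice Runtime HTTP port
---
apiVersion: v1
kind: Service
metadata:
  name: spice-runtime
spec:
  selector: { app: spice-runtime }
  ports:
    - port: 80
      targetPort: 8090                    # Service load-balances across replicas
```

Applications address the runtime by the Service name (e.g., http://spice-runtime), and replicas can be scaled independently of the application.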

Benefits

  • Loose coupling between the application and the Spice Runtime.
  • Independent scaling and upgrades.
  • Can serve multiple applications or services within an organization.
  • Helps achieve high availability and redundancy.

Considerations

  • An additional network hop introduces latency compared to the sidecar approach.
  • More complex infrastructure, requiring service discovery and load balancing.
  • Potentially higher cost due to additional infrastructure components.

Use This Approach When

  • A loosely coupled architecture and the ability to independently scale the AI service are desired.
  • Multiple services or teams need to share the same AI engine.
  • Heavy or varying traffic is anticipated, requiring independent scaling of the Spice Runtime.
  • Resiliency and redundancy are prioritized over simplicity.

Example Use Case
A large organization where multiple services (recommendations, analytics, etc.) need to share AI insights. A centralized Spice Runtime microservice cluster lets separate teams consume AI outputs without duplicating effort.


3. Tiered Deployment

Overview
A hybrid approach combining sidecar deployments for performance-critical tasks and a shared microservice for batch processing or less time-sensitive workloads.
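A sketch of the tiered pattern: the application pod carries a sidecar runtime for latency-critical calls and also points at a shared runtime Service for batch work. The environment variable names are hypothetical, the images are illustrative, and a shared spice-runtime Service is assumed to exist in the cluster.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: routing-app
spec:
  replicas: 2
  selector:
    matchLabels: { app: routing-app }
  template:
    metadata:
      labels: { app: routing-app }
    spec:
      containers:
        - name: app
          image: my-app:latest                  # illustrative application image
          env:
            - name: SPICE_REALTIME_ENDPOINT     # hypothetical: latency-critical path
              value: "http://localhost:8090"    # sidecar in the same pod
            - name: SPICE_BATCH_ENDPOINT        # hypothetical: shared, scalable path
              value: "http://spice-runtime"     # assumed shared Service in the cluster
        - name: spice
          image: spiceai/spiceai:latest         # assumed public runtime image
```

The application chooses the endpoint per workload, keeping sidecar instances limited to high-priority operations.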

Benefits

  • Real-time responsiveness where needed (sidecar).
  • Centralized microservice handles broader or shared tasks.
  • Balances resource usage by limiting sidecar instances to high-priority operations.
  • Helps balance performance and latency with cost and complexity.

Considerations

  • More complex deployment structure, mixing two patterns.
  • Must ensure consistent versioning between sidecar and microservice instances.
  • Potentially higher operational complexity and cost.

Use This Approach When

  • Certain application components require ultra-low-latency responses, while others do not.
  • Centralized AI or analytics is needed, but localized real-time decision-making is also required.
  • The system can handle the operational complexity of running multiple deployment patterns.
  • Balancing performance and latency with cost and complexity is the goal.

Example Use Case
A logistics application that calculates routing decisions in real time (sidecar) while a microservice component processes aggregated data for periodic analysis or model retraining.


4. Cloud-Hosted in the Spice Cloud Platform

Overview
The Spice Runtime is deployed on a fully managed service within the Spice Cloud Platform, minimizing the operational burden of managing clusters, upgrades, and infrastructure.

Benefits

  • Reduced overhead for deployment, scaling, and maintenance.
  • Access to specialized hosting features and quick setup.
  • Helps reduce operational complexity and cost.

Considerations

  • Reliance on external hosting and associated terms or limits.
  • Potential compliance or data residency considerations for certain industries.
  • May introduce latency depending on the cloud provider's infrastructure.

Use This Approach When

  • Limited DevOps resources are available, or focus on application logic over infrastructure is preferred.
  • A fully managed environment with minimal setup time is desired.
  • A single, managed solution is prioritized over running self-managed clusters.
  • Minimizing operational complexity and cost is the goal.

Example Use Case
A startup or team with limited DevOps support that needs a reliable, managed environment. Quick deployment and minimal in-house infrastructure responsibilities are priorities.


5. Sharded Deployment

Overview
Spice Runtime instances are sharded based on specific criteria, such as by customer, state, or other logical partitions. Each shard operates independently, with a 1:N application-to-Spice-instance ratio.
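A sketch of one shard in this pattern: a dedicated Deployment and Service per logical partition (here, per customer), selected by a shard label. All names, labels, images, and ports are illustrative assumptions; additional shards would repeat this pair with a different shard value.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: spice-customer-a                  # one Deployment per shard
spec:
  replicas: 1
  selector:
    matchLabels: { app: spice, shard: customer-a }
  template:
    metadata:
      labels: { app: spice, shard: customer-a }
    spec:
      containers:
        - name: spice
          image: spiceai/spiceai:latest   # assumed public runtime image
          ports:
            - containerPort: 8090         # assumed default Spice Runtime HTTP port
---
apiVersion: v1
kind: Service
metadata:
  name: spice-customer-a                  # applications route by shard-specific Service
spec:
  selector: { app: spice, shard: customer-a }
  ports:
    - port: 80
      targetPort: 8090
```

Because each shard has its own Deployment, a failure or heavy load in one shard does not affect the others, and each shard can carry shard-specific configuration.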

Benefits

  • Helps distribute load across multiple instances, improving performance and scalability.
  • Isolates failures to specific shards, enhancing resiliency.
  • Allows tailored configurations and optimizations for different shards.

Considerations

  • More complex deployment and management due to multiple instances.
  • Requires effective sharding strategy to balance load and avoid hotspots.
  • Potentially higher cost due to multiple instances.

Use This Approach When

  • Distributing load across multiple instances for better performance is needed.
  • Isolating failures to specific shards to improve resiliency is desired.
  • The application can benefit from tailored configurations for different logical partitions.
  • The complexity of managing multiple instances can be handled.

Example Use Case
A multi-tenant application where each customer has a dedicated Spice Runtime instance. This helps ensure that heavy usage by one customer does not impact others, and allows for customer-specific optimizations.


6. Additional Considerations

Hybrid/On-Premises
For organizations subject to strict data governance, the Spice Runtime can run entirely on-premises or in a hybrid setup (some services in the cloud, some on-prem). This helps ensure compliance with data residency requirements while balancing performance and cost.

Edge/IoT
Specialized containers or hardware can be used where device resources are limited. Useful for near-sensor AI in manufacturing or industrial monitoring scenarios. This approach helps achieve low latency and high performance in resource-constrained environments.
