Deployment Architectures
Table of Contents
- Table of Contents
- 1. Sidecar Deployment
- 2. Microservice Deployment (Single or Multiple Replicas)
- 3. Tiered Deployment
- 4. Cloud-Hosted in the Spice Cloud Platform
- 5. Sharded Deployment
- 6. Additional Considerations
- References
1. Sidecar Deployment
Run the Spice Runtime in a separate container or process on the same machine as the main application, for example as a Sidecar Container in Kubernetes. This approach minimizes communication overhead, as requests to the Spice Runtime travel over the local loopback interface.
Benefits
- Low-latency communication between the application and the Spice Runtime.
- Simplified lifecycle management (same pod).
- Isolated environment without needing a separate microservice.
- Helps ensure resiliency and redundancy by replicating data across sidecars.
Considerations
- Each application pod includes a copy of the Spice Runtime, increasing resource usage.
- Updating the Spice Runtime independently requires updating each pod.
- Accelerated data is replicated to each sidecar, adding resiliency and redundancy but increasing resource usage and requests to data sources.
- May increase overall cost due to resource duplication.
Use This Approach When
- Fast, low-latency interactions between the application and the Spice Runtime are needed (e.g., real-time decision-making).
- Scaling needs are small or moderate, making duplication of the Spice Runtime in each pod acceptable.
- Keeping the architecture simple without additional services or load balancers is preferred.
- Performance and latency are prioritized over cost and complexity.
Example Use Case
A real-time trading bot or other data-intensive application that relies on immediate feedback, where minimal latency is critical. Running both containers in the same pod ensures very fast data exchange.
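As an illustration, the sketch below shows application code querying a co-located runtime over loopback. It assumes the runtime's HTTP SQL endpoint is reachable at `localhost:8090/v1/sql` (adjust to match your runtime configuration), and the `quotes` dataset is hypothetical.

```python
# Minimal sketch: an application container querying a Spice Runtime sidecar
# over the pod's loopback interface. The /v1/sql path and port 8090 are
# assumptions; adjust to match your runtime configuration.
import requests

SIDECAR_URL = "http://localhost:8090/v1/sql"  # same pod, loopback only

def query_sidecar(sql: str) -> list[dict]:
    """Send a SQL query to the co-located runtime and return JSON rows."""
    response = requests.post(SIDECAR_URL, data=sql, timeout=1.0)  # tight timeout: loopback is fast
    response.raise_for_status()
    return response.json()

if __name__ == "__main__":
    # Hypothetical dataset for a real-time trading scenario.
    rows = query_sidecar("SELECT symbol, price FROM quotes ORDER BY price DESC LIMIT 5")
    print(rows)
```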
2. Microservice Deployment (Single or Multiple Replicas)
The Spice Runtime operates as an independent microservice. Multiple replicas may be deployed behind a load balancer to achieve high availability and handle spikes in demand.
Benefits
- Loose coupling between the application and the Spice Runtime.
- Independent scaling and upgrades.
- Can serve multiple applications or services within an organization.
- Helps achieve high availability and redundancy.
Considerations
- An additional network hop introduces latency compared to the sidecar approach.
- More complex infrastructure, requiring service discovery and load balancing.
- Potentially higher cost due to additional infrastructure components.
Use This Approach When
- A loosely coupled architecture and the ability to independently scale the AI service are desired.
- Multiple services or teams need to share the same AI engine.
- Heavy or varying traffic is anticipated, requiring independent scaling of the Spice Runtime.
- Resiliency and redundancy are prioritized over simplicity.
Example Use Case
A large organization where multiple services (recommendations, analytics, etc.) need to share AI insights. A centralized Spice Runtime microservice cluster lets separate teams consume AI outputs without duplicating effort.
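A minimal client-side sketch, assuming the runtime replicas sit behind a Kubernetes Service exposing an HTTP SQL endpoint; the service name, namespace, port, and retry settings are illustrative.

```python
# Minimal sketch: calling a shared Spice Runtime microservice through a
# Kubernetes Service name. The service name, namespace, port, and /v1/sql
# path are illustrative; substitute your cluster's values.
import requests
from requests.adapters import HTTPAdapter, Retry

SERVICE_URL = "http://spice-runtime.data.svc.cluster.local:8090/v1/sql"

session = requests.Session()
# Retries help ride out rolling upgrades or replica failures behind the load balancer.
session.mount("http://", HTTPAdapter(max_retries=Retry(total=3, backoff_factor=0.2)))

def query_shared_runtime(sql: str) -> list[dict]:
    response = session.post(SERVICE_URL, data=sql, timeout=5.0)
    response.raise_for_status()
    return response.json()
```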
3. Tiered Deployment
Overview
A hybrid approach combining sidecar deployments for performance-critical tasks and a shared microservice for batch processing or less time-sensitive workloads.
Benefits
- Real-time responsiveness where needed (sidecar).
- Centralized microservice handles broader or shared tasks.
- Balances resource usage by limiting sidecar instances to high-priority operations.
- Helps balance performance and latency with cost and complexity.
Considerations
- More complex deployment structure, mixing two patterns.
- Must ensure consistent versioning between sidecar and microservice instances.
- Potentially higher operational complexity and cost.
Use This Approach When
- Certain application components require ultra-low-latency responses, while others do not.
- Centralized AI or analytics is needed, but localized real-time decision-making is also required.
- The system can handle the operational complexity of running multiple deployment patterns.
- Balancing performance and latency with cost and complexity is the goal.
Example Use Case
A logistics application that calculates routing decisions in real time (sidecar) while a shared microservice processes aggregated data for periodic analysis or model retraining.
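A sketch of how an application might route work across the two tiers, sending latency-critical queries to the local sidecar and batch queries to the shared microservice; the endpoints, timeouts, and table names are illustrative assumptions.

```python
# Minimal sketch of tiered routing: latency-critical queries go to the local
# sidecar, batch/analytical queries go to the shared microservice.
# Endpoints, timeouts, and table names are illustrative assumptions.
import requests

SIDECAR_URL = "http://localhost:8090/v1/sql"                             # per-pod sidecar
SHARED_URL = "http://spice-runtime.data.svc.cluster.local:8090/v1/sql"   # shared service

def query(sql: str, latency_critical: bool = False) -> list[dict]:
    url = SIDECAR_URL if latency_critical else SHARED_URL
    timeout = 0.5 if latency_critical else 30.0  # real-time paths fail fast; batch can wait
    response = requests.post(url, data=sql, timeout=timeout)
    response.raise_for_status()
    return response.json()

# Real-time routing decision vs. periodic aggregation for analysis.
next_hop = query("SELECT next_hop FROM routes WHERE shipment_id = '42'", latency_critical=True)
daily = query("SELECT region, count(*) FROM deliveries GROUP BY region", latency_critical=False)
```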
4. Cloud-Hosted in the Spice Cloud Platform
Overview
The Spice Runtime is deployed on a fully managed service within the Spice Cloud Platform, minimizing the operational burden of managing clusters, upgrades, and infrastructure.
Benefits
- Reduced overhead for deployment, scaling, and maintenance.
- Access to specialized hosting features and quick setup.
- Helps reduce operational complexity and cost.
Considerations
- Reliance on external hosting and associated terms or limits.
- Potential compliance or data residency considerations for certain industries.
- May introduce latency depending on the cloud provider's infrastructure.
Use This Approach When
- Limited DevOps resources are available, or focus on application logic over infrastructure is preferred.
- A fully managed environment with minimal setup time is desired.
- A single, managed solution is prioritized over running your own clusters.
- Minimizing operational complexity and cost is the goal.
Example Use Case
A startup or team with limited DevOps support that needs a reliable, managed environment. Quick deployment and minimal in-house infrastructure responsibilities are priorities.
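On the application side, adopting the managed platform is typically a configuration change rather than a code change: point the client at the managed endpoint and supply credentials. The sketch below is illustrative only; the endpoint URL, environment variable names, and auth header are placeholders, so consult the Spice Cloud Platform documentation for the actual connection details.

```python
# Minimal sketch: pointing the application at a managed endpoint instead of a
# self-hosted runtime. The environment variable names and Authorization header
# are placeholders, not the platform's documented API.
import os
import requests

CLOUD_URL = os.environ["SPICE_ENDPOINT"]   # e.g. the managed SQL endpoint URL
API_KEY = os.environ["SPICE_API_KEY"]      # credential issued by the managed platform

def query_cloud(sql: str) -> list[dict]:
    response = requests.post(
        CLOUD_URL,
        data=sql,
        headers={"Authorization": f"Bearer {API_KEY}"},  # auth scheme is an assumption
        timeout=10.0,
    )
    response.raise_for_status()
    return response.json()
```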
5. Sharded Deployment
Overview
Spice Runtime instances are sharded based on specific criteria, such as customer, state, or other logical partitions. Each shard operates independently, with a single application routing requests across N Spice Runtime instances (a 1:N application-to-runtime ratio).
Benefits
- Helps distribute load across multiple instances, improving performance and scalability.
- Isolates failures to specific shards, enhancing resiliency.
- Allows tailored configurations and optimizations for different shards.
Considerations
- More complex deployment and management due to multiple instances.
- Requires effective sharding strategy to balance load and avoid hotspots.
- Potentially higher cost due to multiple instances.
Use This Approach When
- Distributing load across multiple instances for better performance is needed.
- Isolating failures to specific shards to improve resiliency is desired.
- The application can benefit from tailored configurations for different logical partitions.
- The complexity of managing multiple instances can be handled.
Example Use Case
A multi-tenant application where each customer has a dedicated Spice Runtime instance. This helps ensure that heavy usage by one customer does not impact others, and allows for customer-specific optimizations.
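A sketch of one possible routing layer, assuming one runtime endpoint per shard and a stable hash to choose a tenant's shard; the endpoints, hashing scheme, and `/v1/sql` path are illustrative.

```python
# Minimal sketch of shard routing: each request goes to the Spice Runtime
# instance that owns the tenant's data. Endpoints and hashing scheme are
# illustrative; a dedicated-instance-per-customer setup could instead use an
# explicit customer -> endpoint mapping.
import hashlib
import requests

SHARD_URLS = [
    "http://spice-shard-0.data.svc.cluster.local:8090/v1/sql",
    "http://spice-shard-1.data.svc.cluster.local:8090/v1/sql",
    "http://spice-shard-2.data.svc.cluster.local:8090/v1/sql",
]

def shard_for(tenant_id: str) -> str:
    """Stable hash so a given tenant always lands on the same shard."""
    digest = hashlib.sha256(tenant_id.encode()).digest()
    return SHARD_URLS[int.from_bytes(digest[:4], "big") % len(SHARD_URLS)]

def query_tenant(tenant_id: str, sql: str) -> list[dict]:
    response = requests.post(shard_for(tenant_id), data=sql, timeout=5.0)
    response.raise_for_status()
    return response.json()
```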
6. Additional Considerations
Hybrid/On-Premises
For organizations subject to strict data governance, the Spice Runtime can run entirely on-premises or in a hybrid setup (some services in the cloud, some on-prem). This helps ensure compliance with data residency requirements while balancing performance and cost.
Edge/IoT
Specialized containers or hardware can be used where device resources are limited. Useful for near-sensor AI in manufacturing or industrial monitoring scenarios. This approach helps achieve low latency and high performance in resource-constrained environments.
