👋 Welcome to BlindLlama!

Making AI Confidential & Transparent

📜 What is BlindLlama?

Introduction

🛠️ BlindLlama makes it easy to use open-source LLMs through confidential and transparent AI APIs. These APIs abstract away the complexity of model deployment while ensuring users’ data is never exposed to us, thanks to end-to-end protection with secure hardware.

🔐 To guarantee to developers that data sent to our managed infrastructure is not exposed, we have developed a confidential and transparent architecture for serving AI models.

We currently serve Llama 2 and will be making more open-source models available in the near future!

Our backend has two key properties:

Confidentiality: Your data is never accessible to us. We serve AI models inside hardened environments that do not expose data even to our admins. All points of access, such as SSH, logs, networks, etc., are blocked to ensure the isolation of data.
Transparency: We provide you with verifiable cryptographic proof that these controls are in place, thanks to the use of Trusted Platform Modules (TPMs).
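
To give a feel for what TPM-based proof involves, here is a minimal sketch of the standard TPM "extend" operation, in which each measurement is folded into a running hash so that the final register value depends on every measured component and on their order. This is an illustration only, not BlindLlama's actual verification code: the function names and the example measurements are hypothetical, and a real verifier would also check the TPM's signature over the reported value, which is omitted here.

```python
import hashlib
import hmac

PCR_SIZE = 32  # size of a register in the SHA-256 bank


def extend(pcr: bytes, measurement: bytes) -> bytes:
    """TPM-style extend: new_pcr = SHA256(old_pcr || SHA256(measurement))."""
    digest = hashlib.sha256(measurement).digest()
    return hashlib.sha256(pcr + digest).digest()


def replay(measurements) -> bytes:
    """Recompute the register value obtained by extending each measurement in order."""
    pcr = bytes(PCR_SIZE)  # registers start as all zeros
    for measurement in measurements:
        pcr = extend(pcr, measurement)
    return pcr


def matches_expected(reported_pcr: bytes, expected_measurements) -> bool:
    """Compare a reported register value with the one derived from known-good measurements."""
    return hmac.compare_digest(reported_pcr, replay(expected_measurements))


if __name__ == "__main__":
    # Hypothetical measurements standing in for firmware, OS image and server code.
    expected = [b"firmware", b"os-image", b"server-code"]
    reported = replay(expected)
    print("matches expected stack:", matches_expected(reported, expected))
    print("matches tampered stack:",
          matches_expected(reported, [b"firmware", b"os-image", b"tampered"]))
```

Because the register can only be extended and never set directly, a client that knows the expected measurements can recompute the final value and detect any change to the measured software stack.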

Warning

BlindLlama is still under development and does not yet implement its full set of security features.

Do not test our API with confidential information... yet!

You can follow our progress towards the next beta and 1.0 versions of BlindLlama on our roadmap.

We welcome contributions to our project from the community! Don't hesitate to raise issues on GitHub, reach out to us or see our guide on how to audit BlindLlama (coming soon!).

👩🏻‍💻 Use cases

BlindLlama is meant to help developers working with sensitive data to easily get started with LLMs by using managed AI APIs that abstract the hardware and software complexity of model deployment while ensuring their data remains unexposed.

BlindLlama can address several scenarios, such as:

Benchmarking the best open-source LLMs against one’s private data to find out which one is the most relevant without having to do any provisioning
Structuring medical documents
Analysis or auto-completion of a confidential code base

✅ When should you use BlindLlama?

You want to get started with LLMs that are complex to deploy, such as Llama 2 70B
You don’t want to manage that infrastructure as it requires too much time, expertise and/or budget
You don’t want to expose your data to a third-party AI provider that manages the infrastructure for you, because of privacy or compliance constraints

❌ What is not covered by BlindLlama?

BlindLlama is simply a drop-in replacement to query a remotely hosted model instead of having to go through complex local deployment. We do not cover training from scratch, but we will cover fine-tuning soon.
BlindLlama allows you to quickly and securely leverage open-source models, such as Llama 2 and StarCoder. Proprietary models from OpenAI, Anthropic, and Cohere are not supported yet, as these providers would need to modify their backends to offer a confidential and transparent AI infrastructure like ours.
BlindLlama’s trust model implies some level of trust in Cloud providers and hardware providers, since we leverage secure hardware made available and managed by Cloud providers (see our trust model section for more details).

BlindLlama provides virtually the same level of security, privacy, and control as solutions offered by Cloud providers, such as Azure OpenAI Service.

🚀 Getting started

Check out our Quick tour, which will enable you to play with an example using the Llama 2 model while ensuring your data remains private and without the hassle of provisioning!
Find out more about How we protect your data
Discover the architecture and trust model behind BlindLlama.
You can also check out our video, which introduces BlindLlama and walks you through the quick tour.

📚 Advanced security whitepaper

We created the BlindLlama whitepaper to cover the architecture and security features behind BlindLlama in greater detail.

The whitepaper is intended for an audience with security expertise.

You can read or download the whitepaper here!

🎯 Roadmap

There are three key milestones planned for the BlindLlama project.

BlindLlama Alpha launch (not attestable):

The alpha launch of BlindLlama provides an API for the Llama 2 70B model, which you can query with our Python SDK.

Users can test and query our API but should not yet send it any confidential data, as it does not yet fully implement the security features.

The server-side code includes the backbone of our attestation feature (which will enable us to prove to end users that the server is running the expected code), but this feature will only be fully launched in the following beta phase.

Expected launch date: week ending 8 September 2023

BlindLlama Beta launch (attestable):

The beta version adds the full implementation of TPM-based attestation, meaning our APIs can be fully verified remotely. This version will not yet have a fully hardened server-side environment or an audit, and is therefore not yet recommended for production use!

Provisional launch date: week ending 6 October 2023

BlindLlama 1.0 launch (audit-ready):

A fully secure version of BlindLlama ready for audit, with a fully hardened server environment.

Provisional launch date: week ending 8 December 2023

You can check out more details about these stages and our progress towards achieving these milestones on our official roadmap.

📇 Get in touch

We would love to hear your feedback or suggestions. Here are the ways you can reach us:

Found a bug? Open an issue!
Got a suggestion? Join our Discord community and let us know!
Set up a one-on-one meeting with a member of our team

Want to hear more about our work on privacy in the field of AI?

Check out our blog
Subscribe to our newsletter here

Thank you for your support!

🔒 Who made BlindLlama?

BlindLlama is developed by Mithril Security, a startup focused on democratizing privacy-friendly AI using secure hardware solutions.

Our first project, BlindAI, an open-source Rust inference server that deploys ONNX models on Intel SGX secure enclaves, has already been audited by Quarkslab.

BlindLlama builds on the foundations of BlindAI but provides much faster performance and focuses on serving managed models directly to developers, rather than helping AI engineers deploy models themselves.