Design Thinking: SOLAR — An LLM-Powered Cognitive Network for Building Safe, Privacy-First Solutions

Minyang Chen
11 min read · Jan 16, 2024


Background and Motivation

Creating a secure and privacy-focused large language model (LLM) solution is crucial for any enterprise or consumer-facing AI product. This is especially important for the design of the PAVAI solution, because LLM-generated content is dynamic and poses several challenges. These include hallucinations, where the model generates inaccurate or made-up information; prompt hijacking or attacks, where malicious users attempt to manipulate the model into producing harmful or inappropriate content; and uncensored content, which can include offensive or inappropriate language.

Photo by Alina Grubnyak on Unsplash

By prioritizing safety and privacy in the design of the PAVAI solution, we can help ensure that the LLM-generated content is accurate, secure, and appropriate for all users.

There are several excellent frameworks currently available that simplify the process of building AI applications. Popular options include LangChain, LlamaIndex, and Haystack, among others.

LangChain is a versatile framework that covers a wide range of general use cases, such as document analysis and summarization, chatbots, and code analysis. It is particularly well-suited for building conversation flow-type applications. LlamaIndex is a simple and flexible data framework that allows you to connect custom data sources to large language models (LLMs). It is particularly useful for building retrieval-augmented generation (RAG) type integrations. Haystack offers similar functionality for building custom apps with LLMs, making it a strong choice for developers looking to create specialized AI applications.

While the aforementioned frameworks are highly useful for building AI applications, I have found that they lack built-in content safety enforcement and data privacy detection for Personally Identifiable Information (PII), as well as data anonymization, when working with large language models (LLMs). It is possible to work around this limitation by adding a custom component and orchestration layer, but this can become cumbersome to maintain over time.

In order to ensure the safe and privacy-focused use of LLMs, it is important to consider incorporating these features into the framework itself, rather than relying on custom solutions that may be difficult to maintain. This will help to ensure that AI applications are able to effectively and efficiently enforce content safety and protect user data.

In my research for a balanced design for the “system-brain” of my private voice assistant, PAVAI, I have identified a need for a solution that can enforce content safety and data privacy while still allowing for the rapid expansion of knowledge.

To address this need, I have developed a new concept called SOLAR, which is a large language model (LLM)-based cognitive network that is specifically designed to enforce content safety and data privacy. SOLAR is intended to be used in offline or distributed modes, making it an ideal solution for PAVAI and other private voice assistants.

By using LLMs as the foundation for SOLAR, we can ensure that the network is able to quickly and effectively expand its knowledge base, while the cognitive network design allows for the enforcement of content safety and data privacy rules. This will help to ensure that PAVAI and other voice assistants are able to provide safe and private services to their users, even when operating in offline or distributed modes.

A cognitive network (CN) is a type of data network that draws on cutting-edge research from several areas, including machine learning, knowledge representation, computer networking, and network management, to solve problems that conventional networks cannot.

Large language models (LLMs) are built with layers of neural networks and trained on massive datasets, with the intent of emulating the neuron activity of the human brain.

Our solar system is the gravitationally bound system of the Sun and the objects that orbit it.

The Concept Design

Conceptually, the architecture of SOLAR is similar to that of a natural solar system, with the Sun representing the general knowledge large language model (LLM) and planets representing domain-specific knowledge or task-focused models. The rings around each planet serve as the safety and data privacy enforcement layers, helping to ensure that all content generated by the models is safe and private.

The cognitive network acts as the gravitational force that separates the content and rules, allowing for the enforcement of content safety and data privacy while still enabling the rapid expansion of knowledge. I believe that this design and architecture is well-suited for the PAVAI solution, as it allows for the creation of a safe and private voice assistant that is able to quickly and effectively expand its knowledge base.

See the diagram below, which illustrates the concept.

Figure 1. SOLAR — LLM Cognitive Network Concept

The key components in the SOLAR LLM Cognitive Network are:

Solar Client — used by applications to interact with the LLM moderation and query functions.

Data Privacy Engine — for Personally Identifiable Information (PII) analysis and anonymization; the techniques applied here are entity recognition and a de-identification ruleset.

Content Safety Check — for safety detection and classification of user input and LLM output text.

Cognitive Network Router — performs the following tasks:

Task 1: model the user query to a specific topic or user intent.

Task 2: map the query to the subject-expert LLM models configured in SOLAR.

Task 3: forward the query on demand to the selected model. To obtain high-quality responses to user questions, it is often beneficial to forward the query to a model that has been specifically trained in the relevant domain of knowledge. For example, if a user asks a question about medicine, the query should be forwarded to a model trained on medical data, rather than a generalized LLM.

This approach can be applied to any domain of knowledge, such as finance, law, or even a custom personal model. By forwarding user questions to the right model for the task, we can ensure that the responses are accurate and relevant, providing a better user experience. This can be achieved through a routing mechanism that identifies the appropriate model for each user question and forwards the query accordingly (a minimal routing sketch follows this component list).

Functionary Models — for supporting function calling and self-critique.

General Purpose Model — for supporting user interactions, including chat and completion.

Domain Specific Models are specialized LLMs that are trained for specific tasks or domains of knowledge. For example, the LLaVA model is a domain-specific model designed for multimodal interaction with images, while the Stable Diffusion model is a domain-specific model trained for text-to-image generation.

Other examples of domain-specific models include those that are trained in finance, law, medicine, or other subjects. These models are able to provide more accurate and relevant responses than a generalized LLM, as they have been specifically trained on data from the relevant domain.

In addition to these pre-trained domain-specific models, it is also possible to train your own specialized model for a specific task or domain of knowledge. This can be done using a variety of techniques, such as transfer learning or fine-tuning, and can help to ensure that the model is able to provide the most accurate and relevant responses possible.
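To make the routing idea concrete, here is a minimal sketch of how a classified topic could be mapped to a configured model endpoint. The names (ModelEndpoint, ROUTES, route_query) and server URLs are hypothetical illustrations, not the actual SOLAR implementation.

# A minimal routing sketch (hypothetical names and URLs, not the actual
# SOLAR router). It assumes the topic has already been classified and
# simply maps it to a configured model endpoint.
from dataclasses import dataclass

@dataclass
class ModelEndpoint:
    name: str       # model identifier, e.g. "zephyr-7b"
    base_url: str   # llama.cpp server URL serving this model

# Hypothetical routing table: topic -> domain-expert model.
ROUTES = {
    "medicine": ModelEndpoint("medical-llm", "http://localhost:8004/v1"),
    "finance": ModelEndpoint("finance-llm", "http://localhost:8005/v1"),
}
DEFAULT = ModelEndpoint("zephyr-7b", "http://localhost:8000/v1")

def route_query(topic: str) -> ModelEndpoint:
    """Return the domain-expert endpoint for a topic, or the general model."""
    return ROUTES.get(topic, DEFAULT)

The fallback to the general-purpose model keeps the router total: a query on an unrecognized topic still receives an answer.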

How It Works

To use the SOLAR network for AI applications, follow these steps:

  1. Configure the SOLAR network policy to specify what content is considered safe and what is not safe. This will help to ensure that all content generated by the network meets the necessary safety standards.
  2. Specify data protection settings to detect and protect against potential data risks. This should include the ability to automatically anonymize any personal identifiable information (PII) that is detected.
  3. Use the SOLAR network client moderation and query method to interact with the LLM API, just like you would with a regular chat API call.
  4. When a user query text is received, the data privacy engine will perform an analysis of the text to scan for any PII content. If PII is detected, the user will be alerted and asked to revise their question to be safer.
  5. If the query text passes all safety and content checks, the cognitive router will attempt to model the user’s query text to a specific topic or intent. This will allow the router to map the query to a target LLM model with the corresponding domain expert knowledge.
  6. The cognitive router can then perform a self-critique by obtaining a second-opinion response from another model, then asking the same question again using only the responses from the two or more models as input. This will result in a higher-quality response that has been vetted by multiple models (a minimal self-critique sketch follows this list).

By following these steps, you can effectively use the SOLAR network to enforce content safety and data privacy, while still allowing for the rapid expansion of knowledge. This will help to ensure that your AI applications are safe, private, and able to provide high-quality responses to user queries.
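As an illustration of step 6, here is a minimal self-critique sketch against two OpenAI-compatible endpoints such as llama.cpp servers. The URLs, model names, and prompt wording are assumptions for illustration, not the actual SOLAR code.

# A minimal self-critique sketch over two OpenAI-compatible servers
# (hypothetical URLs and model names).
import openai

primary = openai.OpenAI(api_key="EMPTY", base_url="http://localhost:8000/v1")
second = openai.OpenAI(api_key="EMPTY", base_url="http://localhost:8001/v1")

def ask(client: openai.OpenAI, model: str, prompt: str) -> str:
    resp = client.chat.completions.create(
        model=model, messages=[{"role": "user", "content": prompt}])
    return resp.choices[0].message.content

def self_critique(question: str) -> str:
    # Step 1: collect independent answers from two models.
    answer_a = ask(primary, "zephyr-7b", question)
    answer_b = ask(second, "mistral-7b", question)
    # Step 2: ask again, constrained to the two candidate answers.
    critique_prompt = (
        f"Question: {question}\n\n"
        f"Candidate answer A: {answer_a}\n\n"
        f"Candidate answer B: {answer_b}\n\n"
        "Using only the two candidate answers above, produce a single "
        "corrected, higher-quality final answer.")
    return ask(primary, "zephyr-7b", critique_prompt)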

Figure 2. Solar LLM Cognitive Flow

The Implementation Approach

The goal of the implementation is to create a “system-brain” for PAVAI applications, specifically for the Vocei and Talkie applications. These applications must be able to operate in both offline and distributed modes, similar to a solar system.

To achieve this, I have taken a flexible and dynamic approach by utilizing a variety of open source AI models in machine learning and large language models. This allows for the creation of a robust and scalable system that can be easily adapted to meet the needs of the Vocei and Talkie applications. By taking advantage of these existing models, I am able to quickly and efficiently build a system-brain for PAVAI that is able to provide high-quality responses to user queries, even in offline or distributed modes.

The SOLAR cognitive network is implemented as a Python library. The Data Security Engine is also an embedded library, but it can be moved to an API-based service later.

LlamaGuard is used for the content safety check. Please note this is a recent model from Meta that requires access approval. To reduce resource utilization, I downloaded a quantized GGUF version from Hugging Face.
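For illustration, a simplified safety check with llama-cpp-python might look like the sketch below. The model path is a placeholder, and the prompt is heavily abbreviated; the real Llama Guard template enumerates a full safety taxonomy and answers "safe" or "unsafe" plus the violated categories.

# A simplified content-safety check with llama-cpp-python and a quantized
# Llama Guard GGUF (placeholder file path; abbreviated prompt template).
from llama_cpp import Llama

guard = Llama(model_path="./models/llamaguard-7b.Q4_K_M.gguf", n_ctx=4096)

def is_safe(user_text: str) -> bool:
    prompt = (
        "[INST] Task: Check if there is unsafe content in the user message "
        "below according to the safety policy.\n\n"
        f"User: {user_text}\n\n"
        "Provide your safety assessment: answer 'safe' or 'unsafe'. [/INST]")
    out = guard(prompt, max_tokens=16, temperature=0.0)
    return out["choices"][0]["text"].strip().lower().startswith("safe")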

Zephyr is used as the general-purpose LLM, since it is a fine-tuned version of Mistral and strong on multilingual support. I also tested mixtral-8x7b-instruct-v0.1; it works okay but is less responsive due to its larger model size.

For multimodal support, I tried both LLaVA-1.5-7B and BakLLaVA-1; both seem to work reasonably well.

For function calling and self-critique support, I picked a model trained for function calling, served via the llama.cpp Python server.

Finally, I use a selection of machine learning models for topic modelling and intent classification.
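The specific classifiers are not named here; as one possible approach, a zero-shot classifier from the transformers library can map a query to a topic without task-specific training:

# One possible topic classifier: zero-shot classification with transformers
# (an assumption; the specific models used in SOLAR are not named).
from transformers import pipeline

classifier = pipeline("zero-shot-classification",
                      model="facebook/bart-large-mnli")

TOPICS = ["medicine", "finance", "law", "general"]

def classify_topic(query: str) -> str:
    result = classifier(query, candidate_labels=TOPICS)
    return result["labels"][0]  # highest-scoring topic

# e.g. classify_topic("What are the side effects of ibuprofen?") -> "medicine"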

Here is how it looks in the actual implementation.

The deployment can be in offline or distributed mode. The primary criterion for offline-mode support is the ability to download models and save them locally. At runtime, the PAVAI cognitive network can run in two modes:

All-in-One Mode — a single application running all models, for when you have limited hardware.

Solar Mode — one or two llama.cpp servers serving a set of LLM models. Breaking the models up into two server groups tends to achieve better overall performance (see the launch sketch below).
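As a rough sketch of Solar Mode, two llama.cpp server groups could be launched on separate ports with the llama-cpp-python web server; the model paths and ports below are placeholders.

# Hypothetical Solar Mode launch: two llama.cpp server groups on separate
# ports (placeholder model paths and ports).
import subprocess

subprocess.Popen(["python", "-m", "llama_cpp.server",
                  "--model", "./models/zephyr-7b.Q4_K_M.gguf",
                  "--port", "8000"])
subprocess.Popen(["python", "-m", "llama_cpp.server",
                  "--model", "./models/functionary-7b.Q4_K_M.gguf",
                  "--port", "8001"])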

Here is a code snippet of the SOLAR client to illustrate the network implementation, which is based on the Command design pattern.


import openai

# DataSecurityEngine, LLMSolarNetwork, LLMReceiverAPI, and the *Action
# classes below are defined elsewhere in the SOLAR library.

class LLMSolarClient:
    # settings
    _skip_content_safety_check = True
    _skip_data_security_check = False
    _skip_self_critique_check = False
    # objects
    _cn_invoker = None
    _datasecurity = None
    _default_client = None
    _domain_client = None

    def __init__(self,
                 default_url: str = None,
                 default_api_key: str = "EMPTY",
                 domain_url: str = None,
                 domain_api_key: str = "EMPTY",
                 skip_content_safety_check: bool = True,
                 skip_data_security_check: bool = False,
                 skip_self_critique_check: bool = False) -> None:
        # API clients: one for the general-purpose model server,
        # one for the domain-expert model server
        self._default_client = openai.OpenAI(
            api_key=default_api_key, base_url=default_url)
        self._domain_client = openai.OpenAI(
            api_key=domain_api_key, base_url=domain_url)
        # security engine (PII analysis and anonymization)
        self._datasecurity = DataSecurityEngine()
        self._skip_content_safety_check = skip_content_safety_check
        self._skip_data_security_check = skip_data_security_check
        self._skip_self_critique_check = skip_self_critique_check
        # solar network invoker
        self._cn_invoker = LLMSolarNetwork(
            default_client=self._default_client,
            domain_client=self._domain_client,
            data_security=self._datasecurity,
            skip_content_safety_check=self._skip_content_safety_check,
            skip_data_security_check=self._skip_data_security_check,
            skip_self_critique_check=self._skip_self_critique_check)
        # Command-pattern actions for each stage of the cognitive flow
        self._cn_invoker.set_on_start(StartAction(self._cn_invoker))
        self._cn_invoker.set_on_input(InputAction(self._cn_invoker))
        self._cn_invoker.set_on_routing(RoutingAction(self._cn_invoker))
        self._cn_invoker.set_on_thinking(ThinkAction(
            network=self._cn_invoker,
            receiver=LLMReceiverAPI(self._cn_invoker)))
        self._cn_invoker.set_on_output(OutputAction(self._cn_invoker))
        self._cn_invoker.set_on_finish(FinishAction(self._cn_invoker))
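For context, constructing the client might look like the following. The endpoint URLs are placeholders, and since only the constructor appears above, no query method is shown.

# Hypothetical instantiation (placeholder URLs).
client = LLMSolarClient(
    default_url="http://localhost:8000/v1",  # general-purpose model server
    domain_url="http://localhost:8001/v1",   # domain-expert model server
    skip_content_safety_check=False)         # enable the LlamaGuard check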

Performance Considerations

Enabling data privacy checks has a minimal impact on performance, as the query input and output are typically small text blocks.

However, enabling content safety checks can add an additional 15% to the response time, as it requires an API call to the LlamaGuard LLM model. Running LlamaGuard as a default model on a local server can reduce this delay to 5–8%.

Enabling self-critique functionality can significantly increase response times, as it requires multiple LLM calls to various models. This can double the response time, or even more in some cases.

In summary, while safety and data privacy features can add additional response time and compute resources, they are an important investment in the security and reliability of your AI system. Think of them like the cost of installing an antivirus tool on a PC — while they may add some overhead, they provide valuable protection and peace of mind. By carefully balancing the need for safety and performance, you can create an AI system that is both secure and efficient.

In Action

Data Privacy Enforcement Sample Screenshot

The screenshot below shows a sample LLM response message after applying anonymization. According to the policy, all personally identifiable information (PII) must be removed from the text and replaced with appropriate tags. In this example, the actual name and location have been replaced with <PERSON> and <LOCATION> tags, respectively. This helps to ensure that sensitive information is protected and not disclosed in the LLM response.

PAVAI.Vocei — Data Privacy PII Protection Example
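The library behind the Data Privacy Engine is not named here; as one way to reproduce the tagging behaviour shown above, Microsoft Presidio's analyzer and anonymizer replace detected entities with their type tags by default:

# A minimal PII anonymization sketch using Microsoft Presidio (one possible
# implementation; the actual Data Privacy Engine library is not named).
from presidio_analyzer import AnalyzerEngine
from presidio_anonymizer import AnonymizerEngine

analyzer = AnalyzerEngine()
anonymizer = AnonymizerEngine()

def anonymize(text: str) -> str:
    results = analyzer.analyze(text=text, language="en")
    # By default Presidio replaces each detected entity with its type tag,
    # e.g. "John" -> "<PERSON>", "Toronto" -> "<LOCATION>".
    return anonymizer.anonymize(text=text, analyzer_results=results).text

# e.g. anonymize("John lives in Toronto") -> "<PERSON> lives in <LOCATION>"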

The sample screenshot below shows SOLAR content moderation and rejection, as defined in the safety policy.

PAVAI.Vocei — Content Safety Example.

Summary

While the SOLAR Cognitive Network may not be a perfect fit for all use cases and scenarios, it is well-suited for PAVAI solutions that require the enforcement of strict security and privacy standards. The SOLAR network provides a robust and flexible framework for implementing data privacy, content safety, and self-critique functionality, making it an ideal choice for applications that require high levels of security and safety.

In addition to PAVAI solutions, the SOLAR network is also applicable to other solutions that require similar levels of security and safety. This includes applications in industries such as healthcare, finance, and legal, where the protection of sensitive information is of the utmost importance. I am confident that the SOLAR network will be a valuable asset for any organization looking to build a secure and reliable AI system. I hope that you will find it as useful and effective as I have.

@TODO — add code git repository here after documentation.

Have a nice day!

Thanks

Credits and References:

Llama Guard: LLM-based Input-Output Safeguard for Human-AI Conversations

🦙 Python Bindings for llama.cpp

LangChain

LlamaIndex

Haystack
