I haven’t slept much in the past 24 hours, but I want to document my thoughts after the VMware Explore Las Vegas Hackathon! 😎
I am honored to announce that the Tanzu Application Service (TAS) team took 1st place in the hackathon with our concept of expanding the existing Tanzu Application Service, Cloud Foundry, and BOSH ecosystem with Gen AI/LLM-as-a-Service technologies.
First and foremost, I want to thank Vladimir Velikov, Cami Hough, Franky Barragan, and Brian Chang for helping to plan and run the hackathon!! You all made a great hackathon event possible, and it was great to see the VMware community hack away at many avenues of the VMware ecosystem!
Also, a big thank you to all the judges; it was great to talk with all of you and hear your feedback and ideas on how we can work with the rest of the VMware ecosystem.
In particular, it was great to talk to the “Wolf of VMware,” Chris Wolf and Alan Renouf, to hear their take on our project as they lead the cutting-edge work at VMware AI Labs. I was very excited to see the announcement of VMware Private AI today at Explore, and I think our project aligns with that philosophy.
What a privilege it was to work with Long Nguyen, Amelia Downs, and Jonathon Regher to showcase the power of using Tanzu Application Sevie and Cloud Foundry BOSH to deploy applications that make use of OpenAI and LLM-as-a-Service technologies!!
In particular, Long Nguyen, our project’s lead engineer, developed and pioneered this work with his contributions to BOSH, custom stemcell work, and various updates to the Cloud Foundry ecosystem. He was the one who got BOSH to deploy and manage VMs with GPUs – which was the critical first step to open the door for so many more things in the GenAI/LLM space. 😎
I also want to send some special thanks to David Stevenson, who helped mentor our team and assisted with providing access to his awesome TAS-hosted environment for our demo.
Disclaimer- All the work documented in this blog is POC/prototype for experimental purposes only – we are making no guarantees or commitments to this work becoming available in a VMware product or the open-source Cloud Foundry ecosystem. This is all future-facing work that is subject to change.
So, What was our problem statement?
As the slide says, our team was focused on enabling the Tanzu Application Service Customer base and the Cloud Foundry ecosystem that runs millions of application instances or containers globally with GenAI and LLMs.
If you have been in the Cloud Foundry ecosystem for some time, you know that AI/ML workloads are the types of workloads that you would typically NOT find running within a Cloud Foundry environment, as TAS/Cloud Foundry expects applications to decouple state and, generally speaking shoving these large GPU drivers inside a container isn’t a great experience on Cloud Foundry or Kubernetes.
(IMHO- This is a type of workload that’s best suited for a virtual machine, especially one managed by BOSH.)
At a super high level, our team demonstrated the ability to expand the BOSH ecosystem to deploy virtual machines equipped with GPUs to enable the hosting and training of Large Language Models(LLMs) within the Cloud Foundry ecosystem. This served as a cornerstone for running Chat-GPT-like applications on the platform.
If we drill into it a bit deeper, our team produced a custom bosh release and stemcell that was configured with the following:
- Packaged FastChat’s API server with Llama 2 models into a BOSH release.
- Deployed it via BOSH side-by-side with our TAS deployment.
- Built a TAS app that uses our new prototype, “Tanzu AI Service,” via hardcoded credentials.
One key benefit of this proposed architecture is enabling Tanzu Application Service customers to run multi-cloud private Gen AI & LLM services. Yes, that’s right, the customer or end user of the TAS deployment could run this entirely in the walls of their own data center without exposing their sensitive IP and data to a third-party cloud or API.
Let’s see it in action:
The concept of our demo was pretty simple.
- Ask the ChatGPT style Application a question that was outside the knowledge of the foundational LLM ( in this case, lama2)
- Pass it a text file that will answer your question and essentially train the LLM with domain-specific information and store that new knowledge as an embedding in a pgvector compatible Postgres DB.
- Deploy a TAS app that uses our new prototype, “Tanzu AI Service,” via hardcoded credentials to the Open AI endpoint. (emulating a service binding)
After we return from Explore, we will sit down and properly record our demo, but I can share a few screens/steps of what was demonstrated at the Hackathon.
“Ask the app running on TAS a question outside the training range.”
“Upload the domain-specific information as a test file– AKA some information that can provide the answer to your question.”
Watch the GPUs 🔥 up as the new information is ingested
The sweet graphs above are outputs from nvtop that show GPU utilization of training events on the bosh deployment after the app running on TAS is presented with data to expand its knowledge.
Ask the same question again – (we are looking for team 8):
Well, there you go– a fully functional “Chat GPT style” application running all inside a TAS/Cloud Foundry ecosystem talking to our prototype LLM-as-a-Service bosh deployment that can store embeddings (training) in our Postgres DB. 😎
So, How does this work?
You can see the overall technical architecture of the TAS LLM-as-a-Sevice prototype. One key note is that we are not only running an Open AI-compatible API in conjunction with LLMs. But we are then using a Postgres DB to store vector embeddings, which is a critical component of training our LLMs.
From a future perspective, we could potentially envision a new tile or service to be loaded into the Tanzu Application Service or Cloud Foundry marketplace.
We want your feedback!
Thanks again to everyone who attended the hackathon this year! I would love to hear if you are potentially interested in this use case so we can help drive this prototype forward into an actual product.
There are still two more days of great VMware Explore Sessions for Tanzu Application Service — be sure to check out my previous blog to register them for your catalog.