Posts
Speeding Up LLM Inference
Techniques for speeding up LLM inference, increasing token generation speed and reducing memory consumption: mixed precision, bfloat16, quantization, fine-tuning with adapters, and continuous batching
Last updated on Aug 21, 2023
5 min read
LLM, Optimization
Introducing K8sGPT: Giving AI Superpowers to Kubernetes Users
k8sgpt is a tool for scanning your Kubernetes clusters, diagnosing, and triaging issues in simple English. It has SRE experience codified into its analyzers and helps to pull out the most relevant information to enrich it with AI.
Last updated on Aug 20, 2023
2 min read
Intro, Tutorial
Welcome to Wowchemy, the website builder for Hugo
Welcome 👋 We know that first impressions are important, so we’ve populated your new site with some initial content to help you get familiar with everything in no time.
Last updated on Apr 16, 2023
3 min read
Demo, Tutorial
Writing technical content in Markdown
Wowchemy is designed to give technical content creators a seamless experience. You can focus on the content and Wowchemy handles the rest. Highlight your code snippets, take notes on math classes, and draw diagrams from textual representation.
Last updated on Aug 27, 2022
6 min read
Display Jupyter Notebooks with Academic
Learn how to blog in Academic using Jupyter notebooks
Last updated on Apr 16, 2023
2 min read