We are back with another episode of True ML Talks. In this episode, we dive deep into LLMs, LLMOps, and Generative AI with Michael Boufford.
Michael is the CTO of Greenhouse. He joined as the company's first employee about 11 years ago, wrote its first lines of code, and has helped build the company into what it is today.
📌
Our conversation with Mike covers the following aspects:

- Organizational Structure of ML Teams at Greenhouse
- How LLMs and Generative AI Models are Used in Greenhouse
- Navigating Large Language Models
- Understanding Prompt Engineering
- LLMOps and Critical Tooling for LLMs
Greenhouse's data science and machine learning teams have evolved with the company's growth, transitioning from generalists to specialized roles. Key aspects of their organizational structure include:
Additionally, a Business Analyst team addresses business-related questions and provides insights.
Infrastructure management is handled by a separate Infrastructure team, which oversees components like Kubernetes and AWS, while a dedicated team manages the data stores.
Here are several use cases where these models have been employed within Greenhouse's operations:
While ChatGPT, powered by models like GPT-4, offers impressive results, there are still some challenges and concerns associated with its use. Here are a few problems that arise with ChatGPT:
When considering whether to invest in self-hosted models or rely on large commercial language models, leaders should carefully evaluate the following factors:
Prompt engineering has become a topic of debate within the field of large language models (LLMs). It involves crafting effective prompts to elicit desired responses from the model. Here are some key points to understand the concept and its implications, along with a minimal sketch below:
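To make this concrete, here is a minimal sketch contrasting a naive prompt with a more deliberately engineered one for the same summarization task. It assumes the pre-1.0 `openai` Python client with an API key in the environment; the model name, template wording, and `summarize` helper are illustrative, not anything Greenhouse has described using.

```python
# A naive prompt and an engineered prompt for the same summarization task.
# Assumes the pre-1.0 `openai` client with OPENAI_API_KEY set in the environment.
import openai

NAIVE_PROMPT = "Summarize this job description: {text}"

# The engineered version fixes a role, an output format, and a grounding rule.
STRUCTURED_PROMPT = """You are a recruiting assistant.
Summarize the job description below in exactly 3 bullet points.
Use only information present in the text; if a detail is missing, write "not stated".

Job description:
{text}
"""

def summarize(text: str, prompt_template: str) -> str:
    response = openai.ChatCompletion.create(
        model="gpt-4",  # illustrative; any chat model works here
        messages=[{"role": "user", "content": prompt_template.format(text=text)}],
        temperature=0,  # low randomness makes prompt variants easier to compare
    )
    return response["choices"][0]["message"]["content"]

# Comparing variants side by side is the core loop of prompt engineering:
# print(summarize(jd_text, NAIVE_PROMPT))
# print(summarize(jd_text, STRUCTURED_PROMPT))
```

The engineered version pins down a role, an output format, and a grounding rule; iterating on exactly these details is the day-to-day work that "prompt engineering" refers to.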
LLMOps and the tooling landscape around large language models (LLMs) are gaining attention. When it comes to prompt management, quick data handling, labeling feedback, and other essential tasks, certain tools are expected to play a critical role as LLM usage expands. Some key considerations include:
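As one illustration of the prompt-management piece, here is a minimal sketch of a versioned prompt registry, so templates can be audited and rolled back like code. `PromptRegistry` and `PromptVersion` are hypothetical names for this sketch, not a real library:

```python
# A versioned prompt registry: templates tracked with author and timestamp
# so they can be audited and rolled back like code. All names here are
# hypothetical, not a real library. Requires Python 3.10+.
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class PromptVersion:
    template: str
    author: str
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

class PromptRegistry:
    def __init__(self) -> None:
        self._prompts: dict[str, list[PromptVersion]] = {}

    def register(self, name: str, template: str, author: str) -> int:
        """Store a new version of a named prompt and return its version number."""
        versions = self._prompts.setdefault(name, [])
        versions.append(PromptVersion(template, author))
        return len(versions)  # versions are 1-indexed

    def get(self, name: str, version: int | None = None) -> str:
        """Fetch a specific version, or the latest by default."""
        versions = self._prompts[name]
        return (versions[version - 1] if version else versions[-1]).template

# Usage: ship a revision, then pin production back to a known-good version.
registry = PromptRegistry()
registry.register("summarize_jd", "Summarize: {text}", author="mlops")
registry.register("summarize_jd", "Summarize in 3 bullets: {text}", author="mlops")
assert registry.get("summarize_jd", version=1) == "Summarize: {text}"
```

A production version of this idea would persist versions to a database and record evaluation results alongside each template, but the core contract stays the same: named prompts with immutable, numbered versions.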
Evaluating Large Language Models
In serious use cases with LLMs, the "human in the loop" approach is commonly employed: human validation is crucial for assessing the model's performance and validating its output. Even during the fine-tuning of GPT models, human involvement played an essential role.
For less critical use cases, where there is room for some margin of error, a cost-effective approach is to use larger models to evaluate the responses of smaller models. Multiple responses generated by the smaller models can be compared and rated by a larger model, allowing metrics to be established for measuring performance (see the sketch below). While this approach incurs some cost, it is generally more economical than relying solely on human effort.
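Below is a minimal sketch of that larger-model-as-judge pattern, where a stronger model scores candidate answers from a cheaper model. It assumes the pre-1.0 `openai` client; the judge prompt, the 1-5 rubric, and the model names are illustrative assumptions:

```python
# A larger "judge" model rates responses from a smaller model on a 1-5 scale,
# standing in for human review in lower-stakes evaluations. Assumes the
# pre-1.0 `openai` client; the rubric and model names are illustrative.
import re
import openai

JUDGE_PROMPT = """Rate how well the answer addresses the question, from 1 (poor) to 5 (excellent).
Reply with a single integer only.

Question: {question}
Answer: {answer}
"""

def judge(question: str, answer: str) -> int:
    response = openai.ChatCompletion.create(
        model="gpt-4",  # the larger, more capable judge
        messages=[{
            "role": "user",
            "content": JUDGE_PROMPT.format(question=question, answer=answer),
        }],
        temperature=0,
    )
    text = response["choices"][0]["message"]["content"]
    match = re.search(r"[1-5]", text)
    if match is None:
        raise ValueError(f"Judge returned no score: {text!r}")
    return int(match.group())

def best_of(question: str, candidates: list[str]) -> str:
    """Return the highest-rated of several smaller-model responses."""
    return max(candidates, key=lambda answer: judge(question, answer))
```

Aggregating these scores across a test set gives a rough quality metric; spot-checking a sample by hand remains a good guard against the judge's own blind spots.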
Staying updated in the ever-evolving world of LLMs and machine learning can be challenging. Here are some effective approaches to staying informed and gaining knowledge:
Keep watching the TrueML YouTube series and reading the TrueML blog series.
TrueFoundry is an ML deployment PaaS over Kubernetes that speeds up developer workflows while allowing full flexibility in testing and deploying models, and ensures full security and control for the infra team. Through our platform, we enable machine learning teams to deploy and monitor models in 15 minutes with 100% reliability and scalability, and to roll back in seconds, allowing them to save cost, release models to production faster, and realise real business value.