True ML Talks #18 - Generative AI Discussion with Tushar Kant

We are back with another episode of True ML Talks. In this episode, we dive deep into ML platforms and generative AI, and we speak with Tushar Kant.
Tushar is a seasoned MLOps leader with 20+ years of experience at top tech companies and a wide range of skills in product, business, engineering, and investment banking. He is also the co-founder of the worldwide IIT Artificial Intelligence and Machine Learning Forum, and runs a very active Slack community for it as well.
Watch the full episode below:
IIT Artificial Intelligence and Machine Learning Forum
Forum's Vision
The IIT AI/ML Forum was started with the vision of creating a community where IITians working in AI/ML could share their knowledge, collaborate, and help each other. Its founders believed that by working together, IITians could surpass any other engineering institute in the world.
The forum has been a huge success, with over 1800 members from all over the world. The forum has organized events, supported other organizations, and grown into a thriving community in its own right.
Forum's Accomplishments
Tushar is particularly proud of three things that the forum has accomplished:
- Organizing the one-day AI track at Icon: This is a major conference in Silicon Valley, and the fact that the forum was able to organize the AI track is a testament to its reputation and influence.
- Building strong connections among members: The forum has helped to create lifelong friendships and business partnerships among its members.
- Providing a support system during the COVID-19 pandemic: When the world shut down, the forum continued to meet every other week, providing a source of knowledge and support for its members.
Pivotal Moments in the Growth of AI and MLOps
The combination of cloud computing, Transformers, and pre-training will be a major driver of innovation in AI and MLOps in the coming years. Of particular note is the potential of multimodal AI, which combines natural language processing and computer vision to solve complex problems.
Cloud computing has made AI more accessible and affordable for everyone. This has led to a surge of innovation in the field, as startups and individuals are now able to develop and deploy AI applications without having to invest in expensive infrastructure.
Transformers have revolutionized natural language processing and computer vision. Transformers are a type of neural network architecture that is able to learn long-range dependencies in data. This makes them well-suited for tasks such as machine translation and image recognition.
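To make the idea of long-range dependencies a little more concrete, here is a minimal sketch of scaled dot-product self-attention, the core operation inside a Transformer. It is a toy written in plain Python under simplifying assumptions: there are no learned projection matrices, and the two-dimensional `sequence` is made up for illustration.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that sum to 1 (numerically stable)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(vectors):
    """Scaled dot-product self-attention with identity projections.

    Each output vector is a weighted average of ALL input vectors, so the
    first position can draw on the last one in a single step.
    """
    dim = len(vectors[0])
    outputs, all_weights = [], []
    for query in vectors:
        # Similarity of this position to every position, scaled by sqrt(dim).
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in vectors
        ]
        weights = softmax(scores)
        # Mix all input vectors according to the attention weights.
        mixed = [
            sum(w * vec[i] for w, vec in zip(weights, vectors))
            for i in range(dim)
        ]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights

# Toy sequence (made up): positions 0 and 4 are similar, the middle ones differ.
sequence = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [0.0, 1.0], [1.0, 0.0]]
outputs, weights = self_attention(sequence)
print([round(w, 3) for w in weights[0]])
```

In this toy input, position 0 gives the distant position 4 the same weight it gives itself, and more than it gives its immediate neighbor. Distance in the sequence does not limit which positions can influence each other.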
Pre-training is a technique where a large language model is trained on a massive dataset of text and code. This pre-trained model can then be fine-tuned for specific tasks, such as translation or question answering. Pre-training has significantly improved the performance of AI models on a wide range of tasks.
ChatGPT and Generative AI: Potential Applications Across Industries
ChatGPT and generative AI have the potential to revolutionize many industries. Tushar is particularly interested in the potential of these technologies to improve customer service, reduce fraud, personalize products and services, and improve healthcare.
Examples of specific applications of ChatGPT and generative AI in different industries:
- Customer service and experience: ChatGPT and generative AI can be used to automate customer service tasks, such as responding to queries and generating reports. This can free up customer service representatives to focus on more complex tasks.
- Risk assessment and fraud detection: ChatGPT and generative AI can be used to identify and mitigate risks in the banking and finance industry. For example, they can be used to detect fraudulent transactions and assess the risk of borrowers.
- Personalization: ChatGPT and generative AI can be used to personalize products and services for customers in the retail industry. For example, they can be used to recommend products to customers based on their past purchases and browsing history.
- Premium determination and risk assessment: ChatGPT and generative AI can be used to determine insurance premiums and assess the risk of policyholders in the insurance industry.
- Patient advocacy and disease diagnosis: ChatGPT and generative AI can be used to develop patient advocacy tools and diagnose diseases more quickly and accurately in the healthcare industry.
LLMs and Risk Assessment
LLMs are still in their early stages of development, but they have the potential to revolutionize risk assessment in the financial services industry:
- LLMs can process more data, faster: Risk assessment models traditionally rely on a limited amount of data, such as credit scores and income. LLMs can process much more data, such as spending patterns, buying behavior, and online behavior. This allows them to create more accurate risk assessments.
- LLMs can consider associative factors: In addition to individual factors, such as credit score, LLMs can also consider associative factors, such as the company a person works for and the industry they work in. This can help them to create more comprehensive risk assessments.
Future of LLMs
Types of players in the ecosystem
Tushar believes that there will be three types of players in the ecosystem:
- Foundation model builders: Companies like OpenAI, Google, and Meta that develop the large language models themselves.
- LLM Ops platforms: Companies like AWS and Google that provide platforms for developers to build and deploy LLM applications.
- LLM distributors: Companies that develop and sell LLM-powered products and services to end users.
Electrical Power Industry Analogy:
In the electrical power industry, there are generators, transmission lines, and distributors. In the LLM industry, Tushar sees foundation model builders as generators, cloud computing providers as transmission lines, and startup companies as distributors.
Closed vs. open source:
There will be a space for both closed and open source LLMs.
Closed source models will be preferred by large enterprises that need production-ready solutions with support. Open source models will be preferred by smaller companies and researchers who need more flexibility and customization.
Middleware's Role:
There will be a need for middleware to help developers use LLMs more easily and efficiently. Middleware can provide features such as model management, fine-tuning, and monitoring.
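As a rough illustration of what such middleware might look like, the sketch below wraps a small model registry (model management) and a call log (monitoring) behind one entry point. Everything here is hypothetical: `LLMMiddleware`, `register`, `complete`, and the stand-in model functions are invented for this example and do not correspond to any particular product's API.

```python
import time
from typing import Callable, Dict, List

class LLMMiddleware:
    """Hypothetical middleware: a model registry plus basic call monitoring."""

    def __init__(self) -> None:
        self._models: Dict[str, Callable[[str], str]] = {}
        self.call_log: List[dict] = []

    def register(self, name: str, model: Callable[[str], str]) -> None:
        """Model management: make a model available under a stable name."""
        self._models[name] = model

    def complete(self, name: str, prompt: str) -> str:
        """Route the prompt to the named model and record a monitoring entry."""
        if name not in self._models:
            raise KeyError(f"unknown model: {name}")
        start = time.perf_counter()
        response = self._models[name](prompt)
        self.call_log.append({
            "model": name,
            "prompt_chars": len(prompt),
            "latency_s": time.perf_counter() - start,
        })
        return response

# Stand-in "models": plain functions, not real LLM clients.
def echo_model(prompt: str) -> str:
    return f"echo: {prompt}"

def shout_model(prompt: str) -> str:
    return prompt.upper()

gateway = LLMMiddleware()
gateway.register("echo", echo_model)
gateway.register("shout", shout_model)
reply = gateway.complete("shout", "hello")
print(reply, gateway.call_log[0]["model"])
```

Because applications call `complete` with a model name rather than a specific client, a model can be swapped or fine-tuned behind that name without touching application code, and every call is logged in one place.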
Benefits and risks of LLMs:
It is imperative to view LLMs as tools that can either amplify human capabilities or pose risks, depending on their application. Like any tool, the use of LLMs is shaped by human choices and intentions. They hold the potential to advance medical treatments, foster innovative educational programs, and automate tasks currently performed by humans. However, they can also be used to generate deepfakes, spread misinformation, and manipulate individuals.
Human role in the development and use of LLMs:
Even as LLMs grow in sophistication, they will always fall short of fully grasping the nuances of human values. Consequently, humans retain a pivotal role in ensuring that LLMs align with our values. This includes establishing ethical guidelines for LLM development and usage, educating the public about LLM benefits and risks, and recognizing that humans possess the unique capacity to think creatively and find innovative solutions, while LLMs are constrained by their training data.
Building Generic RAG Systems: AWS vs. Startups
When it comes to constructing generic RAG systems, AWS and startups each bring their own distinct advantages and challenges to the table.
AWS Strengths: AWS is well-placed to develop generic RAG systems due to its substantial customer base and a wide range of services that can support RAG. For example, AWS offers SageMaker, a machine learning platform for training and deploying the models behind a RAG system. Additionally, AWS provides various data storage and processing services well suited to RAG workflows.
AWS Weaknesses: AWS might not match the agility of startups in swiftly developing and launching new products. Furthermore, AWS's focus may not be as specific as that of startups, especially for niche use cases such as RAG for healthcare.
Startup Advantages: Startups excel in agility, allowing them to focus on specific use cases and rapidly innovate in the RAG domain. Their niche focus can lead to unique RAG solutions and innovations often overlooked by larger entities.
Startup Challenges: Startups often grapple with resource constraints, lacking the extensive customer base and service portfolio of AWS. Competing with AWS on price can be daunting due to the scale and resources of the tech giant.
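For readers less familiar with the term, the retrieve-then-generate pattern behind the RAG systems compared above can be sketched in a few lines. This is a deliberately simplified illustration: retrieval here is plain word overlap rather than a vector search, the `documents` are made up, and the final call to an LLM is left out, so the sketch stops at the prompt that would be sent to a model.

```python
import re
from typing import List, Set

def tokenize(text: str) -> Set[str]:
    """Lowercase the text and split it into a set of words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question: str, documents: List[str], top_k: int = 1) -> List[str]:
    """Rank documents by how many words they share with the question."""
    question_words = tokenize(question)
    ranked = sorted(
        documents,
        key=lambda doc: len(question_words & tokenize(doc)),
        reverse=True,
    )
    return ranked[:top_k]

def build_prompt(question: str, context: List[str]) -> str:
    """Place the retrieved passages in front of the question."""
    joined = "\n".join(f"- {passage}" for passage in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {question}"

# Made-up knowledge base for illustration only.
documents = [
    "Refunds are processed within five business days.",
    "Support is available by email on weekdays.",
    "Passwords can be reset from the account settings page.",
]
question = "How are refunds processed?"
context = retrieve(question, documents)
prompt = build_prompt(question, context)
print(prompt)
```

The differentiation discussed in this section largely lives in the parts this sketch glosses over: the quality of the document store, the retrieval method, and how well both fit a specific domain.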
Advice for startups that are developing RAG systems:
- Focus on specific use cases: Startups should focus on developing RAG solutions for specific use cases. This will help them to differentiate themselves from AWS and other large companies.
- Move quickly: Startups need to move quickly to develop and launch their RAG solutions. This is because AWS and other large companies can easily copy their products.
- Be an attractive M&A candidate: Startups should focus on developing RAG solutions that are attractive to potential acquirers. This will give them a way to exit their business if they are unable to compete with AWS and other large companies.
Navigating the Fast-Paced World of Generative AI
Advice for Leaders
- Be agile and flexible. The field of generative AI is constantly evolving, so it is important to have a mindset and a team that can adapt quickly to new developments.
- Focus on solving real problems. Don't get caught up in the hype around generative AI. Instead, focus on identifying real business challenges that can be solved with this technology.
- Don't be afraid to be late. It is fine if someone else beats you to market with a new generative AI solution. What matters is learning from their mistakes and building a better product.
- Don't force it. Not every problem needs a generative AI solution. Use your business acumen to identify the right problems to solve with this technology.
Advice for Data Science and Engineering Leaders
- Don't start with the tool. Don't just look for ways to use generative AI. Instead, start by identifying your business challenges and then see whether generative AI is the right tool to solve them.
- Work backwards from the customer. What are the customer's needs? What are their pain points? Once you understand the customer, you can start thinking about how generative AI can be used to help them.
- Don't cave to top-down mandates. If your leadership team is demanding that every team come up with generative AI use cases, don't just check the box. Push back and ask why they think generative AI is the right solution for those problems.
Read our previous blogs in the True ML Talks series:
Keep watching the TrueML YouTube series and reading the TrueML blog series.
TrueFoundry is an ML deployment PaaS built on Kubernetes that speeds up developer workflows while giving developers full flexibility in testing and deploying models, and ensuring full security and control for the infrastructure team. Through our platform, we enable machine learning teams to deploy and monitor models in 15 minutes with 100% reliability, scalability, and the ability to roll back in seconds. This allows them to save costs and release models to production faster, delivering real business value.