AI is good. The entire spectrum of artificial intelligence (AI), from predictive to reactive to prescriptive to generative AI, and the machine learning (ML) capabilities that power it, are commonly considered evolving technical developments that can, as a whole, benefit society if we apply them carefully.
However, there is an if and a but (and maybe even an occasional maybe) to this proposition.
The various fears associated with AI that need to be analyzed do not concern the question of which job roles and workplace functions could soon be fully automated by robots and driven by AI. The general panic on that front is over: most people understand that some menial jobs will disappear, that more high-value jobs can be created and that existing roles can now be augmented and positively accelerated by AI to improve our lives.
All that said, a strengths, weaknesses, opportunities and threats (SWOT) analysis of the current state of AI wouldn’t hurt. For the sake of narrative here, let’s reorganize this analysis into Opportunities, Strengths, Weaknesses and Threats (OSWT), with that last category given some essential care and consideration.
Opportunities
There is so much we can do with AI and large language models (LLMs) if we take the opportunity to really understand how they work. If we ask ChatGPT to describe Einstein’s general theory of relativity, we obtain a fairly precise answer. But at the end of the day, ChatGPT is still “just” a computer program (as all other LLMs are) that blindly executes its instruction set. It doesn’t understand Einstein’s theory of general relativity any better than your favorite pet does.
“Unfortunately, we use ‘human’ words to describe the techniques engineers use to create AI models and functions. For example, we talk about ‘machine learning’ and ‘training’ in the context of how we work with LLMs in AI. This is misleading because an LLM does not have a mind like that of a human,” clarified Keith Pijanowski, senior technologist and AI/ML SME at MinIO, a company known for its work in high-performance open source object storage for cloud-native workloads such as those currently running for AI.
There’s a certain irony here, Pijanowski says: how can a non-thinking chatbot correctly summarize the findings of the smartest man who ever lived? If we can better understand the essentially probabilistic, non-thinking nature of LLMs, we may be able to uncover opportunities to use these new intelligence functions that we haven’t even considered yet.
Strengths
The strength of LLMs is that they are trained to learn the probability distribution of words in the training set used to create them. If the training set is large enough (e.g. a corpus of Wikipedia articles or public code on GitHub), then the models will have a vocabulary and corresponding probability distribution that make their results read as if they come from a real understanding of the text they produce.
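As a rough sketch of that idea (a toy corpus and plain Python, not a real training pipeline; everything below is illustrative), a next-word probability distribution can be estimated simply by counting which words tend to follow which:

```python
from collections import Counter, defaultdict

# Toy stand-in for a large training corpus (e.g. Wikipedia articles).
corpus = "i think therefore i am . i think so . i am here .".split()

# Count bigrams: how often each word follows each preceding word.
follow_counts = defaultdict(Counter)
for prev, word in zip(corpus, corpus[1:]):
    follow_counts[prev][word] += 1

def next_word_distribution(prev):
    """Return P(next word | previous word) estimated from corpus counts."""
    counts = follow_counts[prev]
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}

print(next_word_distribution("i"))
# {'think': 0.5, 'am': 0.5} -- in this toy corpus, "i" is followed by
# "think" and "am" equally often
```

Real models replace these simple counts with neural networks conditioned on long contexts, but the output is still a probability distribution over a vocabulary.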
If we move on to an example from philosophy and ask ChatGPT the question “What does ‘cogito, ergo sum’ mean and who wrote it?”, the result is something similar to the text below:
“Cogito, ergo sum” is a Latin philosophical proposition that translates to “I think, therefore I am” in English. This statement is famous for being associated with René Descartes, a French philosopher, mathematician and scientist. Descartes expressed this idea in his work “Discourse on Method”, published in 1637. The expression reflects Descartes’ attempt to establish a fundamental truth that cannot be doubted: the certainty of one’s own existence as a thinking being.
“So we’re looking at the strengths element here and, as noted earlier, LLMs produce results like this using probability distributions,” Pijanowski explained. “It works something like this: they start by looking at the text of the question and determine that the word ‘cogito’ has the highest probability of being the first word in the answer. From there, they look at the question and the first word of the answer to determine which word has the highest probability of coming next. This continues over and over until a special ‘end of answer’ token is determined to have the highest probability.”
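To make that loop concrete, here is a minimal greedy-decoding sketch; the next-token table and its probabilities are invented for illustration, and a real LLM conditions on the entire context rather than just the previous token:

```python
# Hand-written stand-in for a learned next-token distribution (invented numbers).
next_token_probs = {
    "cogito": {",": 0.9, "ergo": 0.1},
    ",": {"ergo": 0.8, "sum": 0.2},
    "ergo": {"sum": 0.95, "cogito": 0.05},
    "sum": {"<end>": 0.99, "ergo": 0.01},
}

def generate(first_token, max_tokens=10):
    """Greedy decoding: repeatedly append the most probable next token."""
    answer = [first_token]
    for _ in range(max_tokens):
        dist = next_token_probs.get(answer[-1], {})
        if not dist:
            break
        next_token = max(dist, key=dist.get)  # highest-probability token wins
        if next_token == "<end>":  # the special "end of answer" token
            break
        answer.append(next_token)
    return " ".join(answer)

print(generate("cogito"))  # -> "cogito , ergo sum"
```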
Pijanowski explains that this ability to generate a natural language response based on billions of probabilities is not something to fear; rather, it is something that should be leveraged for commercial value. The results are even better when you use modern techniques. For example, using techniques such as retrieval augmented generation (RAG) and fine-tuning, we can teach an LLM about a specific business. Getting these human-like results will require data, and your infrastructure will need a robust data storage solution.
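As a hedged sketch of the RAG pattern (the document snippets, the word-overlap scoring and the call_llm function are all hypothetical; production systems use learned embeddings and a vector database), the flow is retrieve first, then augment the prompt:

```python
# Minimal RAG sketch: retrieve relevant text, then augment the LLM prompt.
documents = [
    "Our object storage cluster replicates data across three regions.",
    "Support tickets are answered within 24 hours on business days.",
    "The free tier includes 10 GB of storage and 100,000 requests per month.",
]

def relevance(question, doc):
    """Crude relevance score: shared lowercase words (real systems use embeddings)."""
    return len(set(question.lower().split()) & set(doc.lower().split()))

def retrieve(question, k=1):
    """Return the k documents most relevant to the question."""
    return sorted(documents, key=lambda d: relevance(question, d), reverse=True)[:k]

def answer(question):
    """Prepend retrieved context so the model grounds its reply in it."""
    context = "\n".join(retrieve(question))
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
    return call_llm(prompt)  # hypothetical: substitute your provider's chat API

print(retrieve("How much storage does the free tier include?"))
```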
Now that we understand what LLMs are good at and why, let’s look at what LLMs can’t do.
Weaknesses
For Pijanowski and his team, the weaknesses are relatively obvious, and it’s a reality learned from experience working with MinIO customers. We know that LLMs cannot think, understand or reason, and this is the fundamental limitation of LLMs.
“Language models do not have the ability to reason about a user’s question. They are probabilistic machines that produce a very good guess at an answer to a user’s question. No matter how good a guess is, it is still a guess, and anything that generates guesses will eventually produce something that isn’t true. In generative AI, this is called a hallucination,” Pijanowski proposed. “When properly trained, hallucinations can be kept to a minimum. Fine-tuning and RAG also significantly reduce hallucinations. Bottom line: to properly train a model, fine-tune it and give it relevant context (RAG), you need data and the infrastructure to store it at scale and serve it efficiently.”
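A toy illustration of why guess-making eventually produces falsehoods, with invented numbers: when decoding samples from the distribution rather than always taking the top token, low-probability wrong continuations do occasionally get emitted.

```python
import random

random.seed(0)  # reproducible demo

# Invented next-token distribution after "Descartes published it in".
candidates = ["1637", "1641", "1650"]  # only "1637" is correct here
probabilities = [0.90, 0.07, 0.03]     # the model is confident, but not certain

# Sample 1,000 completions, as temperature > 0 decoding would.
samples = random.choices(candidates, weights=probabilities, k=1000)
wrong = sum(1 for s in samples if s != "1637")
print(f"{wrong} of 1000 sampled answers were wrong")  # roughly 100 expected
```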
Threats
The most popular use of LLMs is of course generative AI. Generative AI does not produce a specific response comparable to a known outcome. This contrasts with other AI use cases, which make a specific prediction that can be easily tested.
“It’s simple to test models for image detection, categorization and regression. But how can we test LLMs used for generative AI in an impartial, factually accurate and scalable way? How can you be sure that the complex answers generated by LLMs are correct if you are not an expert yourself? And even if you are an expert, human reviewers cannot take part in the automated testing that runs in a CI/CD pipeline,” Pijanowski explained, highlighting what may be one of the most relevant threat factors in this area.
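The contrast is easy to see in code. A sketch like the following (the classify function and its labels are hypothetical) shows the kind of exact-match assertion a CI/CD pipeline can run for a classifier, and for which free-form generated text has no direct equivalent:

```python
# CI-friendly test for a classifier: discrete labels make assertions trivial.
def classify(image_name):
    """Hypothetical model under test; returns a discrete label."""
    return {"cat.png": "cat", "dog.png": "dog"}.get(image_name, "unknown")

test_cases = [("cat.png", "cat"), ("dog.png", "dog")]

def test_classifier_accuracy():
    correct = sum(classify(x) == expected for x, expected in test_cases)
    assert correct / len(test_cases) >= 0.95  # a threshold a pipeline can enforce

test_classifier_accuracy()  # passes or fails deterministically, no human needed
```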
He notes that there are a few benchmarks in the industry that can help. GLUE (General Language Understanding Evaluation) is used to evaluate and measure the performance of LLMs; it is a set of tasks that assess a model’s ability to process human language. SuperGLUE is an extension of the GLUE benchmark that introduces more difficult language tasks, involving coreference resolution, question answering and more complex linguistic phenomena.
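Both benchmarks are published as datasets; for instance, assuming the Hugging Face datasets library is installed (exact dataset identifiers can change between library versions), they can be loaded like this:

```python
# pip install datasets
from datasets import load_dataset

# A GLUE task: SST-2, binary sentiment classification.
sst2 = load_dataset("glue", "sst2", split="validation")
print(sst2[0])  # fields include 'sentence' and 'label'

# A SuperGLUE task: BoolQ, yes/no question answering over a passage.
boolq = load_dataset("super_glue", "boolq", split="validation")
print(boolq[0])  # fields include 'question', 'passage' and 'label'
```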
“While the above criteria are useful, a large part of the solution should lie in the organization’s own data collection procedures. Consider logging all questions and answers and creating your own tests based on these personalized results. That will also require a data infrastructure designed to scale and perform,” concluded Pijanowski. “When we look at the opportunities, strengths, weaknesses and threats of LLMs (now reorganized in this OSWT order), if we want to exploit the first two and mitigate the other two, then we will need data and a storage solution capable of handling a lot of it.”
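A minimal sketch of that log-and-replay idea, where the file name, record schema and ask_llm function are all hypothetical:

```python
import json
from pathlib import Path

LOG_FILE = Path("qa_log.jsonl")  # hypothetical log location

def log_interaction(question, answer):
    """Append one question/answer record to the JSONL log."""
    with LOG_FILE.open("a") as f:
        f.write(json.dumps({"question": question, "answer": answer}) + "\n")

def replay_regression_tests(ask_llm, reviewed_pairs):
    """Re-ask previously reviewed questions and flag any changed answers."""
    failures = []
    for pair in reviewed_pairs:
        if ask_llm(pair["question"]).strip() != pair["answer"].strip():
            failures.append(pair["question"])
    return failures
```

Exact-match replay is deliberately strict; in practice, teams often soften the comparison with fuzzy matching or a second model acting as a judge.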
Although a SWOT analysis (in any order) of AI is arguably somewhat simplistic, prone to generalization and itself worthy of further fact-or-fiction auditing, these technologies are evolving very rapidly, and this is certainly a careful evaluation exercise that we should continually repeat.
Remember, SWOT also stands for Success WithOut Tears.