AI models like ChatGPT and Claude are powerful, but they aren’t perfect. They can sometimes produce inaccurate, biased, or misleading answers due to issues related to data quality, training methods, prompt handling, context management, and system deployment. These problems arise from the complex interaction between model design, user input, and infrastructure. Here are the main factors that explain why incorrect outputs occur:

1. Model Training Limitations. AI relies on the data it is trained on. Gaps, outdated information, or insufficient coverage of niche topics lead to shallow reasoning, overfitting to common patterns, and poor handling of rare scenarios.
2. Bias & Hallucination Issues. Models can reflect social biases or create “hallucinations,” which are confident but false details. This leads to made-up facts, skewed statistics, or misleading narratives.
3. External Integration & Tooling Issues. When AI connects to APIs, tools, or data pipelines, miscommunication, outdated integrations, or parsing errors can result in incorrect outputs or failed workflows.
4. Prompt Engineering Mistakes. Ambiguous, vague, or overloaded prompts confuse the model. Without clear, refined instructions, outputs may drift off-task or omit key details.
5. Context Window Constraints. AI has a limited memory span. Long inputs can cause it to forget earlier details, compress context poorly, or misinterpret references, resulting in incomplete responses.
6. Lack of Domain Adaptation. General-purpose models struggle in specialized fields. Without fine-tuning, they provide generic insights, misuse terminology, or overlook expert-level knowledge.
7. Infrastructure & Deployment Challenges. Performance relies on reliable infrastructure. Problems with GPU allocation, latency, scaling, or compliance can lower accuracy and system stability.

Wrong outputs don’t mean AI is "broken." They show the challenge of balancing data quality, engineering, context management, and infrastructure. Tackling these issues makes AI systems stronger, more dependable, and ready for businesses. #LLM
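As a minimal illustration of point 5 (context window constraints), the sketch below keeps only the most recent conversation turns that fit a fixed token budget. The `trim_history` helper and the words-as-tokens estimate are illustrative assumptions, not any vendor's API.

```python
# Hypothetical mitigation for context window limits: keep only the newest
# messages that fit a token budget, using a crude words-as-tokens estimate
# instead of a real tokenizer.

def trim_history(messages, max_tokens=4000):
    """Return the longest recent suffix of `messages` that fits the budget."""
    kept, used = [], 0
    for msg in reversed(messages):          # walk from newest to oldest
        cost = len(msg["content"].split())  # rough token estimate
        if used + cost > max_tokens:
            break
        kept.append(msg)
        used += cost
    return list(reversed(kept))             # restore chronological order

history = [
    {"role": "user", "content": "Summarize our Q1 report."},
    {"role": "assistant", "content": "Here is a summary..."},
    {"role": "user", "content": "Now compare it with Q2."},
]
# With a tiny budget, only the two most recent turns survive.
print(trim_history(history, max_tokens=10))
```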
AI Limitations Overview
Explore top LinkedIn content from expert professionals.
-
Generative AI continues to generate excitement, but significant challenges are often overlooked. Reports from respected sources such as Harvard Business Review and Goldman Sachs highlight that current expectations may not align with reality. The technology, while promising, has limitations that need to be acknowledged and addressed.

In May, Harvard Business Review discussed "AI's Trust Problem"; in June, Goldman Sachs raised doubts about whether the expected $1 trillion in AI investment will deliver substantial returns. Their concern: aside from developer efficiency, there may not be enough value to justify such massive spending, especially in the near term. Jim Covello, Goldman Sachs' head of global equity research, pointed out that replacing low-wage jobs with costly technology contradicts earlier tech transitions, which focused on improving efficiency and affordability.

A recent analysis from Planet Money echoes this skepticism, listing “10 reasons why AI may be overrated.” Issues like hallucinations (when AI generates false or misleading information) and declining quality in AI-generated outputs raise concerns about its readiness for widespread use. A study by The Washington Post also examined what people ask AI chatbots about, revealing unexpected trends. Along with common academic assistance, some topics raised ethical and personal concerns.

🔍 Reality check: Generative AI can be impressive but often struggles with accuracy, leading to errors or hallucinations.
💸 Investment risks: Financial experts question the value of massive investments in AI and wonder if the technology will offer enough returns in the short term.
📉 Productivity vs. quality: While AI can increase productivity, particularly in coding, research shows that the quality of AI-generated code is often subpar.
📚 Help with homework: Students turn to AI chatbots for homework help, but concerns arise when AI provides direct answers rather than guidance or learning support.
❓ Personal and sensitive queries: Many chatbot users ask about personal topics, including sex and relationships, which raises ethical questions about privacy and appropriate use.

These points serve as a reminder that while generative AI is a powerful tool, it’s important to approach it with realistic expectations and a clear understanding of its current limitations.

#GenerativeAI #AIEthics #AIRealityCheck #AIinEducation #TechInvestments #AIProductivity #AIChallenges #AIHomework #AIandSex #AIinConservation #AIFuture #AIHype
-
OpenAI somewhat apologetically launched GPT-4.5 yesterday with several caveats instead of its usual fanfare: it’s not a frontier model, and it won’t top any benchmarks. So what is the selling point? A slightly more emotionally intelligent model that understands nuance and hallucinates less (a 37.1% hallucination rate, a significant improvement over GPT-4o's 61.8%).

This isn’t an intelligence upgrade. It’s a trust upgrade. That’s progress, sure. But at what cost? GPT-4.5 is 15x more expensive than GPT-4o! Even Sam Altman calls it a "giant expensive model." And that’s because it is. GPT-4.5 is an expensive reminder that we’ve hit the limits of scaling pre-training compute. The easy gains are over. Making AI smarter was cheap. Making it trustworthy is brutally expensive.

A few questions are on my mind after this launch.

1️⃣ Is AI Now a Marginal Game? The era of effortless gains from scaling compute seems to be over. Now, the biggest challenge is alignment: controlling hallucinations and improving trust. That doesn’t come cheap. Small reductions in AI’s tendency to confidently make things up seem to cost orders of magnitude more than past leaps in raw intelligence.

2️⃣ Is OpenAI Repositioning AI for Enterprise? For regular users, GPT-4.5’s pricing is outrageous. But for finance, healthcare, and law, where hallucinations aren’t funny, they’re lawsuits, this might be exactly what they need. This one may be for the compliance teams, not the consumers.

3️⃣ Are We Entering the ‘Usability’ Era of AI? GPT-4.5 doesn’t make AI smarter. It makes it sound more human, communicate more smoothly, and be less wrong. That’s a shift from "Can this AI reason?" to "Can this AI be trusted?"

The real test is whether the market believes a slightly more reliable, slightly more human-like AI is worth a 10x price jump. Will customers buy it, or will they settle for a much cheaper, marginally dumber, sometimes hallucinating sidekick?
-
As a consumer, I can't stand AI content right now. I know it will get better. I know people who understand it will have better job prospects in the future. But right now, all I see is low-effort slop.

I'm not just talking about social content, though this weekend I saw a picture of Elon Musk sleeping on an office floor in an American flag sleeping bag that most of Facebook seemed to believe was real... (*sigh*)

I'm talking about your e-mails, DMs, and texts. Yes, people who I know follow me on LinkedIn. YOUR messages. And whether it's a message or social content, I have the same issue: I don't CARE if it's more efficient for YOU. I'm the consumer. I just want the end product to be GOOD. AI getting you 90% of the way there isn't good enough for me.

I don't care if you can get 20 times as many messages out per day. When it's impersonal mass messages, I'm not responding. When your e-mails are long, well-formatted messages that continually repeat points (as AI is wont to do), I'm going to think you're lazy. If your message has outright obvious errors, as one e-mail I received today did (the sender clearly thought our client Sketch actually DREW PICTURES, like 'sketching', instead of it just being his gamertag), I'm going to forever believe you're wasting my time when you pop up in my DMs.

AI is a shortcut where the end goal isn't precision. AI is a wonderful organizer of thoughts. And, most importantly, a fantastic way to churn out ROUGH DRAFTS. Treat it as such.
-
How well does AI write code? According to medalist judges, AI’s code is not so great. But there were a few surprises buried in this paper. This is the most critical and comprehensive analysis of AI coding agents so far.

I expected Claude 3.7 to be near the top, but OpenAI’s o4 and Gemini 2.5 Pro scored significantly higher. Both can solve most coding problems that the judges ranked as ‘Easy’, and the solutions cost pennies to generate. OpenAI’s o4-mini-high delivered solutions that required human intervention only 17% of the time, at a cost of $0.11. Compare that to the cost of a software engineer implementing the solution, and the benefits are obvious. It generated complete implementations for medium problems 53% of the time, also at significant cost savings. However, its reliability drops to zero for hard problems.

Researchers found that AI coding assistants are exceptionally useful if they are given access to the right tools and focused on simple or medium difficulty problems. With tools and multiple attempts, solution accuracy doubled for some LLMs, and they were able to solve a small number of hard problems.

However, programming skills and software engineers are still required. Users of AI coding tools must be able to identify flawed implementations and know how to fix them. Even with tools and multiple attempts, AI coding assistants still fumble problems at all difficulty levels. Code reviews and validation continue to be critical parts of the workflow, so the hype of vibe-coding and AI replacing software engineers is still just a myth.

At the same time, the software engineering workflow is changing dramatically. Multiple researchers have attempted to determine how much code is written by AI vs. people, but accurate classification methods are proving elusive. Research like this, though, makes the trend undeniable. $0.11 per implementation represents a cost savings that businesses won’t pass up.

The future of software engineering is AI augmented. An increasing amount of code will be written by AI and validated by people. Most code required to implement a feature falls into the medium or easy category. AI coding assistants can’t do the most valuable work, but their impact on the time it takes to deliver a feature will be bigger than the benchmarks indicate.

Now that we’re seeing research into the root causes of implementation failure, like this paper, expect AI coding tools to accelerate their rate of capability development over the next two years. For everyone in a technical role, it’s time to think about how to adapt and best position yourself for the next 5-10 years.
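As a rough illustration of the comparison the post gestures at, here is a back-of-the-envelope expected-cost calculation. The $0.11 generation cost and 17% intervention rate come from the post; the engineer-time dollar figures are hypothetical assumptions, not numbers from the paper.

```python
# Illustrative expected-cost comparison for an 'Easy' problem.
# The two HUMAN_* figures below are assumptions for the sake of the sketch.

AI_GENERATION_COST = 0.11   # $ per attempted solution (from the post)
INTERVENTION_RATE = 0.17    # fraction of solutions needing a human fix (from the post)
HUMAN_FIX_COST = 25.0       # assumed: roughly 15 minutes of engineer time
HUMAN_ONLY_COST = 100.0     # assumed: roughly 1 hour to implement from scratch

expected_ai_assisted = AI_GENERATION_COST + INTERVENTION_RATE * HUMAN_FIX_COST
print(f"AI-assisted: ${expected_ai_assisted:.2f} vs human-only: ${HUMAN_ONLY_COST:.2f}")
# AI-assisted: $4.36 vs human-only: $100.00
```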
-
The report by the BBC on how #AI assistants, specifically OpenAI's ChatGPT, Microsoft's Copilot, Google's Gemini and Perplexity, represent news content from the BBC is very timely and extremely relevant to the ongoing debate about the veracity of information presented by such agents. While the general sense of the findings might come as no surprise to anyone familiar with how foundation models tend to #hallucinate, the scope and scale of such errors had generally been limited to anecdotal observations, and the independent viewpoint of a publisher was missing. This report is an important addition to the growing body of experimentation and hard evidence that points towards a need for more work in this area.

Per BBC, "the findings are concerning, and show:
51% of all AI answers to questions about the news were judged to have significant issues of some form
19% of AI answers which cited BBC content introduced factual errors – incorrect factual statements, numbers and dates
13% of the quotes sourced from BBC articles were either altered or didn’t actually exist in that article."

The BBC is right in noting that the scale and scope of such errors, and the distortion of trusted content, is unknown; it is not just the audiences, media companies and regulators who do not know the extent of this issue, but possibly the AI companies themselves don't know it either. I agree with their perspective that publishers, like themselves, should have control over whether and how their content is used, with the AI companies being able to demonstrate how agents process the information, along with the scale and scope of errors and inaccuracies they produce. With such performance, perhaps it is understandable why publishers want to block these AI agents from crawling their websites, for they believe that when AI assistants cite trusted brands like the BBC as the source, audiences are more likely to trust the answers, even when they are incorrect.

When I read marketing stories of how #GenAI helps improve task #productivity for ordinary things such as conducting market research or perusing reams of documents and summarizing them (peruse... in 5 minutes... really?), and I compare and contrast them with a report like this, I am left wondering what the meaning of such "productivity" is. If fact-checking of content is dropped as a key criterion of performing research, and one instead relies on readily obtained content that merely appears authentic, does saving those precious few hours really count? And what about the lay people who, wrongly or rightly, take the printed word, or the displayed word in this case, as the true gospel? Is misinformation at scale, masquerading as readily-available and quickly-obtainable "intelligence," going to be the new normal?

It is not the limitless power of AI that scares me. The current "performance" is enough!

Cognitive Chasm book https://lnkd.in/gJGQMK4V
-
Article from the NY Times: More than two years after ChatGPT's introduction, organizations and individuals are using AI systems for an increasingly wide range of tasks. However, ensuring these systems provide accurate information remains an unsolved challenge.

Surprisingly, the newest and most powerful "reasoning systems" from companies like OpenAI, Google, and Chinese startup DeepSeek are generating more errors rather than fewer. While their mathematical abilities have improved, their factual reliability has declined, with hallucination rates higher in certain tests. The root of this problem lies in how modern AI systems function. They learn by analyzing enormous amounts of digital data and use mathematical probabilities to predict the best response, rather than following strict human-defined rules about truth. As Amr Awadallah, CEO of Vectara and former Google executive, explained: "Despite our best efforts, they will always hallucinate. That will never go away." This persistent limitation raises concerns about reliability as these systems become increasingly integrated into business operations and everyday tasks.

6 Practical Tips for Ensuring AI Accuracy
1) Always cross-check every key fact, name, number, quote, and date from AI-generated content against multiple reliable sources before accepting it as true.
2) Be skeptical of implausible claims and consider switching tools if an AI consistently produces outlandish or suspicious information.
3) Use specialized fact-checking tools to efficiently verify claims without having to conduct extensive research yourself.
4) Consult subject matter experts for specialized topics where AI may lack nuanced understanding, especially in fields like medicine, law, or engineering.
5) Remember that AI tools cannot really distinguish truth from fiction and rely on training data that may be outdated or contain inaccuracies.
6) Always perform a final human review of AI-generated content to catch spelling errors, confusing wording, and any remaining factual inaccuracies.

https://lnkd.in/gqrXWtQZ
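As a small illustration of tip 1, the hedged sketch below pulls percentages, years, and other numbers out of AI-generated text so a human reviewer has a checklist of claims to verify. The regular expressions and claim categories are illustrative assumptions, not a production fact-checking tool.

```python
import re

# Illustrative patterns for "checkable" values; real fact-checking needs far more.
PATTERNS = {
    "percentage": r"\d+(?:\.\d+)?%",
    "year":       r"\b(?:19|20)\d{2}\b",
    "number":     r"\b\d+(?:,\d{3})*(?:\.\d+)?\b",
}

def extract_checkable_claims(text: str) -> dict[str, list[str]]:
    """Collect values a reviewer should verify against reliable sources."""
    return {label: re.findall(pattern, text) for label, pattern in PATTERNS.items()}

draft = "Revenue grew 42% in 2023, reaching 1,250 units across 3 regions."
print(extract_checkable_claims(draft))
# {'percentage': ['42%'], 'year': ['2023'], 'number': ['42', '2023', '1,250', '3']}
```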
-
Is generative AI the key to unprecedented productivity or a cause of future mass unemployment?

The Oliver Wyman Forum's report indicates that generative AI
✅ could contribute up to $20 trillion to global GDP by 2030
✅ save 300 billion work hours annually.

Yet, while 96% of employees believe AI can help in their current jobs, 60% are afraid it will automate them out of work, and 61% do not find it very trustworthy.

The survey across 16 countries revealed
✅ 55% of employees use generative AI weekly,
✅ but only 36% receive sufficient AI training from their employers.
✅ 40% of users would rely on AI for major financial decisions,
✅ 30% would share more personal data for a better experience, despite their mistrust.

Generative AI's impact is already significant: it could displace millions of jobs globally, with one-third of all entry-level roles at risk of automation. Meanwhile, junior employees armed with AI may potentially replace their first-line managers, creating a vacuum in the job pyramid.

𝐓𝐨 𝐦𝐚𝐱𝐢𝐦𝐢𝐳𝐞 𝐛𝐞𝐧𝐞𝐟𝐢𝐭𝐬, 𝐜𝐨𝐦𝐩𝐚𝐧𝐢𝐞𝐬 𝐦𝐮𝐬𝐭 𝐚𝐝𝐨𝐩𝐭 𝐚 𝐩𝐞𝐨𝐩𝐥𝐞-𝐟𝐢𝐫𝐬𝐭 𝐚𝐩𝐩𝐫𝐨𝐚𝐜𝐡, 𝐢𝐧𝐯𝐞𝐬𝐭𝐢𝐧𝐠 𝐢𝐧 𝐰𝐨𝐫𝐤𝐞𝐫 𝐭𝐫𝐚𝐢𝐧𝐢𝐧𝐠 𝐚𝐧𝐝 𝐬𝐮𝐩𝐩𝐨𝐫𝐭. 𝐓𝐡𝐢𝐬 𝐦𝐞𝐚𝐧𝐬 𝐜𝐫𝐞𝐚𝐭𝐢𝐧𝐠 𝐢𝐧𝐭𝐮𝐢𝐭𝐢𝐯𝐞 𝐩𝐫𝐨𝐜𝐞𝐬𝐬𝐞𝐬 𝐚𝐥𝐨𝐧𝐠𝐬𝐢𝐝𝐞 𝐭𝐞𝐜𝐡𝐧𝐨𝐥𝐨𝐠𝐲 𝐚𝐧𝐝 𝐚𝐝𝐝𝐫𝐞𝐬𝐬𝐢𝐧𝐠 𝐞𝐦𝐩𝐥𝐨𝐲𝐞𝐞 𝐜𝐨𝐧𝐜𝐞𝐫𝐧𝐬 𝐭𝐨 𝐚𝐯𝐨𝐢𝐝 𝐦𝐨𝐫𝐚𝐥𝐞 𝐝𝐞𝐜𝐥𝐢𝐧𝐞 𝐚𝐧𝐝 𝐢𝐧𝐜𝐫𝐞𝐚𝐬𝐞𝐝 𝐭𝐮𝐫𝐧𝐨𝐯𝐞𝐫.

Here are some facts that caught my attention:
✅ In the healthcare sector, generative AI could save doctors three hours a day by 2030, enabling them to serve an additional 500 million patients annually.
✅ AI could democratize access to mental health support, potentially reaching 400 million new patients globally.
➡ Despite its potential, generative AI presents risks, including hallucinations, black-box logic, cyberattacks, and data breaches. Managing these risks requires a dynamic model of test, measure, and learn, with proactive involvement from business leaders, regulators, and consumers.

𝐀𝐧𝐝 𝐰𝐡𝐚𝐭 𝐚𝐛𝐨𝐮𝐭 𝐜𝐫𝐞𝐚𝐭𝐢𝐯𝐢𝐭𝐲? The report highlights significant potential for generative AI to enhance creativity. By automating routine and monotonous tasks, AI frees up time for workers to engage in more thoughtful and creative aspects of their jobs. This new productivity paradigm could redefine the value of work, emphasizing innovation and collaboration between humans and AI.

➡ However, there are concerns about originality and authenticity, as AI-generated content may blur the lines between human and machine creativity.

As we stand at this pivotal juncture, HOW are we prepared to navigate the risks and rewards of generative AI? Or maybe it's a matter of WHEN. Let me know what data points in the report caught your attention and how you think they might evolve. ⬇
-
A word of caution: attributing reasoning and logic to Generative AI (Gen AI) is a mistake. Gen AI is great at producing plausible output; that output may not be accurate all the time. It works great when we are not looking for accuracy but for possibilities. For example, ask it to rewrite some text or give you a recipe, a tour plan, or a poem, and it works fine. At least, it will appear so to us. But if you expect it to provide factually accurate information every time, it's not there (yet 😃). For example, when we turned on Gen AI-based responses on our live chat, it started to spew grammatically accurate but factually inaccurate information to our clients. There are techniques and approaches to improve accuracy, but it will not be 100%.

Don't get me wrong. Gen AI is one of the most transformative technology innovations we will see in our lifetime (the Internet and mobile phones are among the others on my list). As my friend Hemant puts it, at the moment, Gen AI is great at three things:
🔹 Translation (example: text to code, language to language, format to format)
🔹 Summarizing (example: extracting insights from reviews, call transcripts)
🔹 Semantic search

There is a lot that can be done with these three things. However, we should just be clear-eyed about its limitations. Otherwise, it will disappoint or, worse, burn a big hole in your pocket 😃. Your thoughts?

#generativeAI #amazonadvertising #walmartconnect
___________________
Follow Me Here 👉 https://lnkd.in/gp3Q6H8B
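For readers curious what the third item looks like in practice, here is a minimal semantic search sketch: embed the query and the documents, then rank by cosine similarity. The `embed` function is a stand-in that fakes vectors from a hash purely so the example runs end to end; in a real system it would call an actual embedding model, and the rankings below carry no meaning.

```python
import hashlib
import numpy as np

def embed(text: str) -> np.ndarray:
    # Stand-in for a real embedding model (e.g. a sentence-transformer or a
    # hosted embeddings API). Derives a fixed vector from a hash of the text
    # so the sketch is runnable; its similarities are NOT semantically meaningful.
    seed = int.from_bytes(hashlib.sha256(text.encode()).digest()[:4], "big")
    rng = np.random.default_rng(seed)
    return rng.standard_normal(128)

def semantic_search(query: str, documents: list[str], top_k: int = 3):
    """Rank documents by cosine similarity to the query vector."""
    q = embed(query)
    q /= np.linalg.norm(q)
    scored = []
    for doc in documents:
        d = embed(doc)
        scored.append((float(q @ d / np.linalg.norm(d)), doc))
    return sorted(scored, reverse=True)[:top_k]

docs = ["Our return window is 30 days.",
        "Standard shipping takes 3-5 business days.",
        "Refunds are issued to the original payment method."]
print(semantic_search("how do I get my money back", docs, top_k=2))
```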
-
🌠 What to know about OpenAI's GPT-4.5 launch

This is one of the weirder model updates I've written about: it's incredibly important for the industry, though not for the reasons we expected.

-- Overall sentiment: people are underwhelmed. GPT-4.5, codenamed Orion, was highly anticipated as the next-generation foundation model, fueled by significant hype from both OpenAI and the community. However, in practice, it delivers only incremental improvements over GPT-4o at best. Even OpenAI's own published benchmarks highlight only modest gains. While it has been trained to exhibit greater emotional intelligence and reduced hallucination rates, these enhancements could have been achieved through fine-tuning rather than necessitating a full-fledged next-generation upgrade.

-- Sticker shock: the model costs $75 per 1M input tokens and $150 per 1M output tokens. By comparison, Claude 3.7 costs $3 per 1M input tokens and $15 per 1M output tokens. This price tag is outright impractical for the vast majority of use cases, particularly for only incremental improvements. I do suspect that this pricing may be temporary; OpenAI itself noted that it has "run out of GPUs" and therefore wants to limit usage to true aficionados.

So why is this important for the broader industry? Throwing data and GPUs at models has hit its limit. This puts Sam Altman's product roadmap announcement last month into perspective: he likely knew that 4.5 would underwhelm. He was pre-emptively sharing the plans for GPT-5, which will evolve to integrate both conventional AI models and chain-of-thought reasoning models. It looks like reasoning models will lead us to the next level of improvements in GenAI.

PS: this will be the first time ever that I will *not* recommend trying the new model for yourself. It's simply not worth the cost.
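To put the sticker shock in concrete terms, here is a back-of-the-envelope calculation using the per-million-token prices quoted above; the 2,000-input / 500-output request size is an illustrative assumption, not a benchmark figure.

```python
# Per-request cost at the quoted list prices; request shape is an assumption.

PRICES = {                      # (input $/1M tokens, output $/1M tokens)
    "gpt-4.5":    (75.0, 150.0),
    "claude-3.7": (3.0, 15.0),
}

def cost_per_request(model, input_tokens=2_000, output_tokens=500):
    p_in, p_out = PRICES[model]
    return (input_tokens * p_in + output_tokens * p_out) / 1_000_000

for m in PRICES:
    print(f"{m}: ${cost_per_request(m):.4f} per request")
# gpt-4.5:    $0.2250 per request
# claude-3.7: $0.0135 per request  (~17x cheaper at this request shape)
```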