Tagged “blog”
-
Choosing the Right LLM for Your Needs - Key Considerations
(Read More) (Original-Websource)
Consider the key factors when selecting a Large Language Model (LLM) for your project.
TL;DR ⏱️
- Benchmark Performance
- License
- Model Size
- Alignment
-
Stop adding Languages to LLMs! The Potential Drawbacks of Training Multilingual Large Language Models (LLMs) for Performance and Sustainability!
(Read More) (Original-Websource)
Exploring the downsides of creating multilingual LLMs and their impact on performance and resource utilization.
TL;DR ⏱️
- Challenges of building multilingual LLMs
- Inefficiencies in token usage and context length
- Increased hardware costs and reduced token training
- Weighing multilingual models against language-specific models
-
Where Science Meets Innovation - My Personal Highlights & Insights into PG 2024! Do you have answers to the open questions?
(Read More) (Original-Websource)
Highlights and open questions from the Petersberger Gespräche (PG) 2024, covering AI, energy transition, chip technologies, and more.
TL;DR ⏱️
- AI and consciousness discussions
- Energy transition and regulatory challenges
- Distributed chip technologies in Europe
- Generative AI in media
- Metaverse applications beyond gaming
-
Today's Research Proposal - How to achieve "real" thinking and reasoning in GenAI, rather than just relying on a silent Chain of Thought, as seen in ReflectionAI or possibly GPT-o1?
(Read More) (Original-Websource)
Exploring the potential for achieving true reasoning and thinking in Generative AI models beyond the current Chain of Thought methodologies.
TL;DR ⏱️
- Current state of reasoning in models
- Possibilities for transformers to learn to think
- Customization ideas for achieving true reasoning
- Open questions and discussion points
-
For more sustainability transparency in GenAI! Share your knowledge and reduce energy waste!
(Read More) (Original-Websource)
Emphasizing the importance of transparency and shared knowledge to enhance sustainability in GenAI.
TL;DR ⏱️
- GenAI involves very large models and significant training efforts
- Transparency can help share emissions data and reduce energy waste
- Open source models can optimize future development
-
Sustainable Air-Gapped On-Prem LLM Solution! How can we make GenAI available on almost any hardware, and how is it also available as a portable demo on our Alan Notebook?
(Read More) (Original-Websource)
Exploring the development of a full-stack GenAI LLM solution that can run on a variety of hardware configurations, including a portable demo setup.
TL;DR ⏱️
- Developing Alan, a full-stack GenAI LLM solution
- Hosted on German hyperscaler infrastructure
- Offers a smaller version, Alan-S-LLM
- Portable demo available on Alan Notebook
-
Combining the Hugging Face Model Platform and Knowledge Graph trend analysis over time could improve GenAI research and reduce energy waste!
(Read More) (Original-Websource)
Exploring the potential of leveraging knowledge graphs to analyze trends in evolving models for better GenAI research and efficiency.
TL;DR ⏱️
- Leveraging knowledge graphs for GenAI trends
- Identifying high-performing models and best practices
- Potential for a crowd-sourced GenAI cookbook
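As a rough illustration of the idea above: a minimal sketch (mine, not from the original post) that builds a lineage graph of Hugging Face models. It assumes derived models declare their parent through a `base_model:` tag in the Hub metadata, which not every model card provides.

```python
import networkx as nx
from huggingface_hub import list_models

graph = nx.DiGraph()
for model in list_models(sort="downloads", direction=-1, limit=200):
    graph.add_node(model.id, downloads=model.downloads)
    for tag in model.tags or []:
        # Assumption: finetunes declare their parent via a "base_model:" tag.
        if tag.startswith("base_model:"):
            graph.add_edge(tag.removeprefix("base_model:"), model.id)

# Foundation models then surface as the nodes with the most outgoing edges.
top_bases = sorted(graph.nodes, key=graph.out_degree, reverse=True)[:10]
print(top_bases)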
-
What is the perfect approach to adjust an LLM to your GenAI use case?
(Read More) (Original-Websource)
Exploring various methods to customize LLMs for specific GenAI use cases, ranging from simple to complex approaches.
TL;DR ⏱️
- Various ways to customize LLMs for specific use cases
- Approaches vary in difficulty and complexity
- Pros and cons of different methods
- More dimensions to improve GenAI use cases
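To make one point on that spectrum concrete, here is a minimal sketch of one of the heavier options, parameter-efficient finetuning with LoRA via the peft library; the base model and hyperparameters are illustrative placeholders, not recommendations from the post.

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("gpt2")  # placeholder base model
lora = LoraConfig(
    r=8, lora_alpha=16, lora_dropout=0.05,
    target_modules=["c_attn"],  # GPT-2's fused attention projection
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # only a small adapter is trained
```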
-
These results give me hope for sustainable AI 🌱
(Read More) (Original-Websource)
I'm impressed by some of the recent advances in the field of "small" open-weight large language models (LLMs).
TL;DR ⏱️
- Increased documentation supports reproducibility
- Data quality improves model performance
- Model distillation reduces hardware needs
-
LLMs - Big vs Small. Bigger is Better!? OR Let's not waste energy!?
(Read More) (Original-Websource)
The AI community is abuzz with debates over the efficacy of large versus small language models. Both have their own merits and limitations.
TL;DR ⏱️
- AI community debates model sizes
- Massive models vs. smaller, efficient models
- Insights and future predictions
- Links to further reading
-
GenAI, what is plagiarism? Its Impact on Science. How should it be handled? What is your perspective?
(Read More) (Original-Websource)
Discussing the implications of GenAI on scientific work and the thin line between acceptable use and plagiarism.
TL;DR ⏱️
- Use of GenAI in scientific work
- Acceptable vs. debatable vs. critical usage
- Questions and concerns about plagiarism
- The pressure on researchers and students
- Opportunities for better research
-
Adjusting GenAI responses towards more ethical behavior is possible through system prompts!? Do you trust such pamphlets prepended to LLM chats?
(Read More) (Original-Websource)
Exploring the potential and challenges of using system prompts to guide LLM behavior towards ethical outputs.
TL;DR ⏱️
- GenAI chat interactions often include system prompts
- System prompts aim to guide ethical LLM behavior
- Challenges exist in ensuring compliance and formulation
- Questions on designing and revealing system prompts
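For readers unfamiliar with the mechanics: a system prompt is simply a message prepended to the conversation. A minimal sketch with the OpenAI Python client follows; the prompt wording and model name are illustrative assumptions.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model choice
    messages=[
        # The user usually never sees this message, yet it steers every answer.
        {"role": "system",
         "content": "You are a helpful assistant. Politely refuse harmful requests."},
        {"role": "user", "content": "Hello!"},
    ],
)
print(response.choices[0].message.content)
```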
-
Alternative to GenAI creativity? Watch and try out these fun Evolutionary Algorithms. Problem-solving without GenAI or SGD-based approaches, explained!
(Read More) (Original-Websource)
Exploring Evolutionary Algorithms as an alternative to GenAI for problem-solving, using a fun 2D vehicle race example.
TL;DR ⏱️
- There is hype around GenAI and LLMs
- Evolutionary Algorithms (EAs) offer an alternative
- A fun example of EAs using a 2D vehicle race
- Steps involved in EAs explained
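To give a feel for those steps, here is a self-contained toy sketch of the evolutionary loop (the 2D vehicle race itself is omitted; the fitness function is a stand-in):

```python
import random

def fitness(genome):            # evaluation: how "good" is this individual?
    return sum(genome)

def mutate(genome, rate=0.1):   # mutation: small random changes
    return [g + random.gauss(0, 1) if random.random() < rate else g for g in genome]

def crossover(a, b):            # recombination: mix two parents
    cut = random.randrange(1, len(a))
    return a[:cut] + b[cut:]

population = [[random.uniform(-1, 1) for _ in range(8)] for _ in range(50)]
for generation in range(100):
    population.sort(key=fitness, reverse=True)   # selection: keep the fittest
    parents = population[:10]
    population = parents + [
        mutate(crossover(random.choice(parents), random.choice(parents)))
        for _ in range(40)
    ]
print(fitness(max(population, key=fitness)))
```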
-
Do we need another GenAI solution? How & why we developed a full-stack GenAI LLM+RAG tool called Alan. A sneak peek at what I am currently working on.
(Read More) (Original-Websource)
An overview of the motivations and technical aspects behind developing our own GenAI solution, Alan, at Comma Soft AG.
TL;DR ⏱️
- Diverse GenAI solutions exist
- Unique motivations for developing our own tool
- Technical advantages of our solution
- Questions on custom development vs. wrappers
-
Will we reach AGI, and if so, via transformer-based architectures? Share your perspective!
(Read More) (Original-Websource)
Exploring the potential of transformer-based architectures in achieving Artificial General Intelligence (AGI) and the ongoing debate surrounding it.
TL;DR ⏱️
- GenAI's impact on AGI discussions
- Technical challenges with transformer-based architectures
- Optimistic yet cautious approach at Comma Soft AG
-
Do you differentiate AI Ethics principles between AI/ML fields like GenAI/LLMs or Knowledge Graph-based ML? How do we deal with so-called facts on the internet as training data?
(Read More) (Original-Websource)
Exploring the nuances of AI ethics across different AI/ML fields and handling internet-based training data responsibly.
TL;DR ⏱️
- AI ethics principles across different AI/ML fields
- Personal background and perspective on AI ethics
- Recommendations and further reading on AI ethics
- Questions to ponder on AI ethics practices
-
What expectations do you have regarding the values and norms of your GenAI chat assistants? Highly sensitive topic in the LLM space! My take...
(Read More) (Original-Websource)
Exploring the ethical considerations and expectations surrounding the values and norms embedded in GenAI chat assistants.
TL;DR ⏱️
- LLMs generate text based on training
- Alignment and finetuning influence behavior
- Ethical considerations in different languages
- Need for a holistic view on model behavior
-
Be careful when you speak of Open (Source) GenAI. Why OpenAI and Meta (shouldn't) use the word "Open" in their GenAI efforts?
(Read More) (Original-Websource)
Examining the implications of using the term "Open" in the context of GenAI by organizations like OpenAI and Meta.
TL;DR ⏱️
- Open Source is a huge and important field in computer science and AI
- The word "Open" is used widely within the GenAI field: OpenAI, Open Source LLMs
-
Thanks to the Open Source Community for all their efforts! Greetings from PyCon 2024
(Read More) (Original-Websource)
Expressing gratitude to the open-source community and sharing experiences from PyConDE & PyData Berlin 2024.
TL;DR ⏱️
- Trip to PyCon with colleagues
- Attended insightful talks in various AI fields
- Appreciation for open-source community
- Gratitude to all contributors and supporters
-
Who will take care of truly low-resource languages? A good step towards fairer GenAI LLM pricing at OpenAI for Japanese speakers!
(Read More) (Original-Websource)
Exploring the challenges and recent developments in addressing low-resource languages within the GenAI landscape, with a focus on OpenAI's efforts for the Japanese language.
TL;DR ⏱️
- Issues with LLMs for low-resource languages
- Major challenges with different character languages
- OpenAI's new dedicated model for Japanese
- Concerns about AI ethics and inequality
-
What is your preferred LLM family? And do you start with an already finetuned LLM? Why did you choose this LLM? I'd love to hear your perspective!
(Read More) (Original-Websource)
Understanding the preferences and choices behind selecting specific LLM families and their finetuned variants.
TL;DR ⏱️
- GenAI for text implemented by LLMs
- Many open-source models available
- Continuous influx of new models
- Key foundation model families
- LLM-based GenAI pipelines at Comma Soft AG
-
NVIDIA's benchmark might be WRONG because it states you lose money AND LLM inference speed if you add more NVIDIA A100s. This NVIDIA benchmark is NOT reliable.
(Read More) (Original-Websource)
Analyzing the reliability of NVIDIA's benchmark results and the implications for LLM inference speed and hardware investment.
TL;DR ⏱️
- Terms and background on LLMs and inference
- Strange findings in NVIDIA's benchmark results
- Concerns about the reliability of these benchmarks
- Questions and further reading on the topic
-
Too many LLMs?! How to keep track of all the open-source models? Identify finetuned LLMs masked as new models and their position within the GenAI landscape!
(Read More) (Original-Websource)
Navigating the complex landscape of GenAI models can be challenging, but it's crucial to understand the foundational and finetuned models to make informed decisions.
TL;DR ⏱️
- The GenAI landscape is crowded with many models
- Keeping track of innovations and true effects is hard
- Transparency issues with many so-called "open-source" models
- Recommendations for navigating this landscape
-
Be careful when you are using LLAMA-2! Legal risks & sustainability implications because LLAMA-2 is (NOT) open source.
(Read More) (Original-Websource)
Important considerations regarding LLAMA-2's legal and sustainability implications.
TL;DR ⏱️
- LLAMA-2's legal and sustainability challenges
- Not truly open-source according to OSD
- Technical implications of its license
- Meta's restrictions and their broader impact
-
The major players in GenAI are facing challenges with their generative AIs. GenAI capabilities and security issues related to LLM tools • 37C3 Presentation
(Read More) (Original-Websource)
Challenges and security issues in GenAI and LLMs, highlighted at 37C3.
TL;DR ⏱️
- GenAI has immense capabilities
- Ethical and secure GenAI pipelines are crucial
- 37C3 presentation on security issues and exploitations
- Categories of threats and challenges in GenAI
-
Not prompting in English?... You have problems!! LLM Language Barriers • Democratizing GenAI and fair pricing
(Read More) (Original-Websource)
Exploring the challenges of using Generative AI with languages other than English and the implications for cost and performance.
TL;DR ⏱️
- Tokenizers and their role in LLMs
- Challenges of non-English prompts
- Efficiency and fairness in GenAI
- Recommendations for LLM pipelines
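The tokenizer point is easy to verify yourself. A minimal sketch using OpenAI's tiktoken with the cl100k_base encoding (exact counts vary by tokenizer; the sample sentences are mine):

```python
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
samples = {
    "English": "How many tokens does this sentence need?",
    "German": "Wie viele Tokens benötigt dieser Satz?",
    "Japanese": "この文にはいくつのトークンが必要ですか？",
}
for language, text in samples.items():
    # More tokens per sentence means higher cost and less usable context.
    print(f"{language}: {len(enc.encode(text))} tokens")
```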
-
Ever wondered about open-source LLM sizes - 7B, 13B, 70B?
(Read More) (Original-Websource)
Where do those model sizes come from?... My findings!
TL;DR ⏱️
- Open-source LLM alternatives to AIaaS
- Hugging Face as a source for open-source models
- Many models are finetuned variants
- Bigger models imply slower inference & higher costs
- Different use cases require different model capabilities
- Questioning the parameter step sizes of models
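On the parameter question: the headline sizes fall out of the architecture. A back-of-the-envelope sketch (my own arithmetic, not from the post) using LLaMA-2 7B's published configuration:

```python
vocab, d_model, n_layers, d_ff = 32_000, 4_096, 32, 11_008

embeddings = vocab * d_model             # token embedding matrix
lm_head = vocab * d_model                # untied output projection
attention = 4 * d_model * d_model        # Q, K, V and output projections
mlp = 3 * d_model * d_ff                 # gated SwiGLU: gate, up, down
norms = 2 * d_model                      # two RMSNorm weights per layer
per_layer = attention + mlp + norms

total = embeddings + lm_head + n_layers * per_layer + d_model  # + final norm
print(f"{total / 1e9:.2f}B parameters")  # ~6.74B, marketed as "7B"
```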
-
DALLE has surprising guardrails. Your image is not filtered based on your prompt. "Dead cookies" may be generated... sometimes.
(Read More) (Original-Websource)
Interesting findings on DALLE's content filtering mechanisms.
TL;DR ⏱️
- DALLE-3 filters your content AFTER image creation
- With the prompt “dead cookies” you can reproduce inconsistent filtering over the OpenAI API
- In 40% of runs the same “dead cookies” prompt is stopped by the content filter, while 60% reach us over the API (a reproduction sketch follows below)
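A hedged sketch of how such numbers could be reproduced: send the same prompt repeatedly and count content-filter rejections. The error handling below follows the current OpenAI Python client and is an assumption on my part; the sample size is illustrative and every call costs money.

```python
import openai
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
filtered = generated = 0
for _ in range(20):  # illustrative sample size; each call is billed
    try:
        client.images.generate(model="dall-e-3", prompt="dead cookies", n=1)
        generated += 1
    except openai.BadRequestError:  # assumed to signal a content-policy hit
        filtered += 1
print(f"filtered: {filtered}, generated: {generated}")
```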
-
Evil LLMs available! Break GenAI Alignment through finetuning!
(Read More) (Original-Websource)
Need for LLM Alignment transparency?
TL;DR ⏱️
- Powerful LLMs are mostly aligned
- Alignment can be broken through finetuning
- Need for transparency in alignment processes
- Questions about alignment in LLMs
-
LLAMA2 13B is faster than LLAMA2 7B, according to an NVIDIA benchmark!
(Read More) (Original-Websource)
Interesting findings on NVIDIA's LLAMA 2 benchmark results.
TL;DR ⏱️
- NVIDIA LLAMA 2 Benchmark results
- LLAMA 2 13B reported as faster than LLAMA 2 7B
- Questions about the accuracy of these findings
- Seeking community insights
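One way to form your own opinion is to measure throughput directly. A minimal sketch (my own setup, not NVIDIA's) with Hugging Face transformers; the model id is a small placeholder so the example stays runnable:

```python
import time
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "gpt2"  # placeholder; swap in the 7B/13B checkpoints under test
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("Benchmarking inference speed:", return_tensors="pt")
start = time.perf_counter()
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=False)
elapsed = time.perf_counter() - start

new_tokens = outputs.shape[1] - inputs["input_ids"].shape[1]
print(f"{new_tokens / elapsed:.1f} tokens/s")
```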
See all tags.