Module 1: LLM Fundamentals & the Evolving AI Landscape
May 2025
Google Gemini, Google Deep Research, Google AI Studio, Midjourney, Synthesia, Google Flow, Canva
Welcome to Module 1 of your learning guide to Large Language Models (LLMs) for
global health. This foundational module is designed to equip you with a robust and
current understanding of LLM core concepts, the rapidly evolving AI ecosystem,
frontier developments with a focus on healthcare applications, and strategies for
staying informed. This module will deepen your understanding of LLMs, enabling
you to leverage these technologies effectively and responsibly.
1.1: Core LLM Concepts
Understanding the fundamental principles behind LLMs is crucial for appreciating their
capabilities, limitations, and how they can be applied to your work in global health. This section
will break down these core concepts in accessible terms, moving beyond a general background
to an expert-level understanding of their application.
LLMs are a sophisticated type of AI designed to understand, process, and generate human-like
language. Unlike traditional software with explicit programming, LLMs learn by analyzing vast
amounts of text data, identifying intricate patterns, statistical relationships, and contextual
associations between words and phrases. They essentially learn language rules, common
knowledge, and stylistic nuances implicitly from this data. Their core function is predicting the
most likely next word (or token) in a sequence given the preceding context, which allows them
to generate coherent text. This is typically achieved through self-supervised training, in which the text itself provides the prediction targets. It's vital to remember that LLMs are advanced pattern-matching systems, not sentient entities, which helps set realistic expectations.
The remarkable capabilities of modern LLMs are largely due to the Transformer architecture,
introduced by Google researchers in their 2017 paper "Attention Is All You Need". This marked a
significant shift from older models like Recurrent Neural Networks (RNNs) that processed
language sequentially, creating bottlenecks for long texts. The Transformer enables parallel
processing, analyzing all words in a sequence simultaneously, which is faster and allows for a
better grasp of word relationships regardless of their position. Many Transformers use an
encoder-decoder structure: the encoder reads the input and creates a numerical
representation of its meaning, and the decoder uses this representation to generate the output.
This parallel processing capability is key to training LLMs on massive datasets.
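The attention mechanism named in the paper's title can be sketched in a few lines. The following is a minimal, illustrative sketch in plain Python with hand-made query, key, and value vectors; production models use tensor libraries and learned projection matrices:

```python
import math

def softmax(xs):
    """Convert raw scores into a probability distribution."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, keys, values):
    """Weight each value by how well its key matches the query."""
    d = len(query)
    # Dot-product similarity between the query and every key,
    # scaled by sqrt(d) as in "Attention Is All You Need".
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    weights = softmax(scores)
    # Output is the weighted average of the value vectors.
    return [sum(w * v[i] for w, v in zip(weights, values))
            for i in range(len(values[0]))]

# Three "words", each with a tiny 2-dimensional key and value.
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]
out = attention([1.0, 0.0], keys, values)  # query resembles the first key
print(out)
```

Because every word's query can be compared with every other word's key at once, this computation parallelizes naturally, which is what removes the sequential bottleneck of RNNs.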
Positional Encoding
Since Transformers process words in parallel, they don’t inherently know word order.
Positional encodings are mathematical signals added to each word’s representation to
provide information about its position in the sequence, which is vital for meaning (e.g., “dog
bites man” vs. “man bites dog”).
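As a rough sketch, the original sinusoidal scheme from the Transformer paper can be computed directly. The tiny dimension count here is for illustration only:

```python
import math

def positional_encoding(pos, d_model):
    """Sinusoidal positional encoding: even dimensions use sine, odd
    dimensions use cosine, at wavelengths that grow geometrically
    with the dimension index."""
    pe = []
    for i in range(d_model):
        angle = pos / (10000 ** ((i // 2 * 2) / d_model))
        pe.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
    return pe

# Each position gets a distinct vector, so "dog bites man" and
# "man bites dog" produce different model inputs despite identical words.
print(positional_encoding(0, 4))  # [0.0, 1.0, 0.0, 1.0]
print(positional_encoding(1, 4))
```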
LLMs don’t process language word by word or character by character; instead, they break text
into smaller units called tokens. A token can be a whole word, part of a word, a single
character, or punctuation. This process, tokenization, allows models to handle large
vocabularies and new words efficiently.
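To make the idea concrete, here is a toy greedy longest-match subword tokenizer. The vocabulary is invented for this sketch; real tokenizers (e.g., byte-pair encoding) learn vocabularies of tens of thousands of subwords from data:

```python
# Invented mini-vocabulary purely for illustration.
VOCAB = {"anti", "malaria", "resist", "ance", "drug", "s", " "}

def tokenize(text):
    """Greedily match the longest known piece, falling back to
    single characters for anything not in the vocabulary."""
    tokens = []
    while text:
        for length in range(len(text), 0, -1):
            piece = text[:length]
            if piece in VOCAB or length == 1:
                tokens.append(piece)
                text = text[length:]
                break
    return tokens

print(tokenize("antimalarial drugs"))
# → ['anti', 'malaria', 'l', ' ', 'drug', 's']
```

Note how an unfamiliar word is still handled by splitting it into known pieces plus a single-character fallback, which is how real tokenizers cope with novel or rare words.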
Parameters are the internal variables or “settings” that an LLM learns during training. Think of
them as the connections in the model’s neural network that store the knowledge derived from
data, representing patterns and linguistic rules. Models can have millions to trillions of
parameters; generally, more parameters allow for learning more complex patterns and
potentially better performance, but also mean the model is larger, requires more
computational power, and increases costs. It’s not just about quantity; training data quality and
architecture are also crucial. Users might interact with adjustable parameters like
“Temperature” (randomness/creativity) or “Top-p”.
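A short sketch of how the “Temperature” setting reshapes the model's next-token probabilities: lower temperature sharpens the distribution toward the top token (more deterministic), higher temperature flattens it (more varied output). The logit values are illustrative placeholders:

```python
import math

def apply_temperature(logits, temperature):
    """Softmax over logits divided by temperature."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]                 # raw scores for three candidate tokens
cold = apply_temperature(logits, 0.2)    # near-greedy: top token dominates
hot = apply_temperature(logits, 2.0)     # flatter: more diverse choices
print(cold[0], hot[0])
```

“Top-p” (nucleus) sampling works differently: instead of reshaping the whole distribution, it samples only from the smallest set of tokens whose cumulative probability exceeds p.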
The context window (or context length) is the amount of text (in tokens) an LLM can consider at
any single point when processing input and generating output. It’s like the model’s short-term
memory. A larger context window means the model can “remember” more of the preceding
text, which is vital for coherence in long dialogues or documents. Leading models have context
windows from 128,000 up to 1 million tokens or more. However, larger windows need more
computational resources and models might struggle to effectively use information in the middle
of very long contexts.
While traditional LLMs were text-only, a significant frontier is multimodal models. Modality
refers to the type of data an AI can process or generate, including text, image, audio, video, and
others. Multimodal LLMs (MLLMs) can understand, process, and generate information across
multiple modalities simultaneously. They typically use specialized encoders for each modality,
convert data into common numerical formats (embeddings), and then fuse these
representations for the core LLM backbone to process. This allows for a more holistic
understanding, similar to human senses.
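The encode-then-fuse pipeline described above can be caricatured in a few lines. The "encoders" below are crude numeric stand-ins invented for this sketch; real systems use large learned networks (e.g., a vision transformer for images):

```python
def encode_text(text):
    # Stand-in text "encoder": two crude numeric features.
    return [len(text) / 100.0,
            sum(map(ord, text)) / (1000.0 * max(len(text), 1))]

def encode_image(pixels):
    # Stand-in image "encoder": mean brightness and size features.
    return [sum(pixels) / (255.0 * max(len(pixels), 1)),
            len(pixels) / 100.0]

def fuse(embeddings):
    # Simplest possible fusion: concatenate the per-modality embeddings
    # into one vector for the downstream backbone to reason over.
    return [x for emb in embeddings for x in emb]

text_emb = encode_text("itchy rash on forearm")
image_emb = encode_image([200, 180, 190, 210])  # toy 4-pixel "image"
joint = fuse([text_emb, image_emb])
print(len(joint))
```

The key point is only the shape of the pipeline: each modality is mapped into the same numeric space, and the fused representation is what the core LLM backbone actually processes.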
Table 1: Core LLM Concepts Explained

Token — Definition: the basic unit of text (word, part of word, or character) that an LLM processes. Analogy: LEGO brick / building block. Practical implication: affects how text is processed, vocabulary size handling, and context window limits.

Context Window — Definition: the amount of text (in tokens) the LLM can actively consider or “remember” at any given time. Analogy: short-term memory / attention span. Practical implication: impacts coherence in long texts/conversations and the ability to process large documents. Larger windows improve context but increase computational cost.
Practical Implications: Capabilities and Limitations
• Capabilities:
○ Enhanced Contextual Understanding: Transformers and attention lead to
coherent outputs.
○ Versatility: LLMs can perform many language tasks (generation, summarization,
Q&A).
○ Handling Complexity: Large parameter counts enable tackling complex tasks.
○ Multimodal Processing: Integrating text, image, audio, and video opens new
possibilities.
○ Efficiency: Parallel processing allows faster training and inference.
• Limitations:
○ Context Window Constraints: Finite context can cause “forgetting” in long texts.
○ Factual Inaccuracy (Hallucinations): LLMs can produce plausible but incorrect
information; verification is essential.
○ Bias: LLMs can inherit and amplify biases from training data.
○ Computational Cost: Large LLMs are expensive to train and run.
○ Lack of True Understanding: LLMs use pattern matching, not genuine
comprehension or common sense.
○ Catastrophic Forgetting: Fine-tuning can make models lose previous skills.
○ Steerability Issues: Models may not always follow instructions perfectly.
Understanding these trade-offs is critical. The power of the Transformer architecture enables
impressive abilities but also brings challenges like cost and ensuring reliability. Outputs require
verification, especially in high-stakes health applications.
• Tool Evaluation: Your knowledge of ‘context windows’ will help you assess
whether an LLM can handle the length of a client’s project reports or
interview transcripts. For instance, if a client needs to analyze lengthy annual health
reports (e.g., 200 pages), a model with a small context window would be
impractical without significant pre-processing, like breaking documents
into chunks. You can advise on the trade-offs between models with larger
context windows (potentially higher cost, slower) versus pre-processing
strategies.
• Explaining Limitations: You can clearly articulate why an LLM might
‘hallucinate’ (due to its pattern-matching nature, not a knowledge base)
and the necessity of human oversight for outputs, especially when dealing
with patient data or public health recommendations.
• Advising on Customization: Understanding ‘parameters’ and ‘fine-tuning’
(which will be covered in more detail later) allows you to discuss the
potential for tailoring models to specific global health terminologies or
contexts, even if recommending no-code/low-code approaches for
implementation.
• Managing Expectations: You can set realistic expectations regarding what
LLMs can and cannot do, preventing disappointment and ensuring the
technology is applied effectively to solve real problems.
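The pre-processing strategy mentioned under Tool Evaluation (breaking a long report into chunks that fit a model's context window) can be sketched as follows. The ~4-characters-per-token ratio is a common rough heuristic only; a real pipeline would count tokens with the target model's own tokenizer:

```python
def chunk_document(text, max_tokens=2000, overlap_tokens=200):
    """Split text into overlapping chunks that each fit the token budget."""
    chars_per_token = 4                      # rough heuristic, not exact
    max_chars = max_tokens * chars_per_token
    step = (max_tokens - overlap_tokens) * chars_per_token
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + max_chars])
        if start + max_chars >= len(text):   # final chunk reaches the end
            break
    return chunks

report = "x" * 50_000   # stand-in for a long report's raw text
chunks = chunk_document(report)
print(len(chunks), len(chunks[0]))
```

The overlap between consecutive chunks is a simple way to avoid cutting a sentence or argument in half at a chunk boundary; each chunk can then be summarized separately and the summaries combined.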
1.2: The Current LLM Ecosystem
The LLM landscape is diverse and changes rapidly, featuring proprietary systems from major
tech companies and a growing number of powerful open-source alternatives. Understanding
these is crucial for selecting the right tools for global health needs. LLMs can be broadly
categorized into proprietary (closed-source) and open-source models.
The most prominent proprietary LLMs come from OpenAI (GPT series), Google (Gemini family),
and Anthropic (Claude family).
○ Ideal Use Cases: Multimedia analysis, research requiring high accuracy,
coding, translation/localization.
• Claude Series (Anthropic – e.g., Claude 3 Haiku/Sonnet/Opus, Claude 3.5/3.7
Sonnet):
○ Focus: AI safety, reliability, ethical considerations, steerable outputs.
○ Features: Strong vision (image analysis) capabilities. Offers models balancing
speed, cost, and capability (Haiku: fast/affordable, Sonnet: balanced, Opus:
most capable). Claude 3.7 Sonnet features “extended thinking” for improved
accuracy. Context window around 200,000 tokens; supports agentic workflows.
○ Strengths: High reliability, safety focus, excels at structured tasks, analysis,
summarization, strong coding (especially 3.7 Sonnet), good at handling long
context, clear outputs.
○ Weaknesses: Can sometimes be more verbose or less creative than GPT;
proprietary access.
○ Ideal Use Cases: Enterprise applications needing safety/reliability, processing
long documents, analytical reasoning, coding, content structuring.
Feature comparison: GPT Series (e.g., 4.1/4o) vs. Gemini Series (e.g., 2.5 Pro/2.0 Flash) vs. Claude Series (e.g., 3.7/3.5 Sonnet)

Multimodality — GPT: strong (text, image, audio input/output – GPT-4o/4.1). Gemini: native and strong (text, image, audio, video input). Claude: strong vision (text, image input).

Noted Weaknesses — GPT: higher cost (vs. 4o), potential personality issues (4o). Gemini: less creative flair (vs. GPT). Claude: can be verbose, less creative (vs. GPT).

Ideal Use Cases — GPT: complex problem-solving, advanced content/code generation. Gemini: multimedia analysis, research, coding, localization. Claude: enterprise use, long document analysis, coding, summarization.
● Hugging Face: The central hub for accessing, sharing, and collaborating on open-
source models, datasets, and tools. You can often try models directly on their platform
via “Spaces” (interactive demos).
● No-Code/Low-Code Platforms: Several platforms are emerging that provide user-
friendly interfaces to run and fine-tune open-source models without extensive coding.
(Specific examples will be explored in later modules focused on RAG and
customization).
● Local GUIs: Tools like LM Studio allow users to download and run various open-source
models locally on their personal computers (Mac, Windows, Linux) if they have
sufficient processing power (especially GPU) and RAM. This provides a no-code way to
experiment with different models.
Choosing between open-source and proprietary LLMs involves trade-offs crucial for global
health applications.
Data Control / Privacy — Open source: full control; data stays in-house. Proprietary: relies on vendor policies; data is typically sent to the vendor.
For global health, data privacy and control often favor open-source models if the technical
expertise and resources to run them are available. Cost can also favor open source when the
total cost of ownership is considered. If top performance and dedicated support are key, and
data privacy concerns can be addressed, proprietary models may be preferred.
Ethical Consideration: Data Sovereignty in Global Health
The choice between proprietary and open-source LLMs has significant ethical implications for
data sovereignty in global health. Sending health data, even if de-identified, to servers
controlled by commercial entities in other countries can raise concerns about who ultimately
controls that data and under what legal frameworks it is protected. Open-source models, when
deployed within an organization’s own infrastructure or within the country of operation on a
trusted cloud, can offer a way to maintain data sovereignty, aligning better with the health data
governance principles of many nations and communities.
1. Define the Task: What will the LLM do (e.g., summarize research, draft proposals,
analyze patient feedback)? Does it need creativity, factual accuracy, medical
knowledge, coding, or image/audio processing?
2. Assess Accuracy and Reliability Needs: How critical is factual correctness? What are
the consequences of errors?
3. Evaluate Performance Requirements (Speed): Does the application need immediate
responses (e.g., patient chatbot) or is some delay acceptable (e.g., report draft)?
4. Consider Data Sensitivity and Privacy: What data will be processed? Does it include
Protected Health Information (PHI)? Are there regulations like GDPR or HIPAA? Does
data need to stay in-house?
5. Analyze Budget and Resources: What’s the budget for fees, hosting, and
infrastructure? What technical expertise is available?
6. Determine Ease of Use and Support Needs: How easily can it integrate into workflows?
How much technical support is needed?
7. Review Licensing and Usage Rights: Are there restrictions on commercial use or
modifications?
There’s no single “best” LLM; the optimal choice is contextual. The field’s rapid evolution
means focusing on these criteria is key.
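The seven-step checklist above can be turned into a rough weighted scoring sheet. The weights and per-model scores below are illustrative placeholders, not benchmark results; in practice a team would set them from its own task definition and evaluation:

```python
# Weights reflect the hypothetical priorities of a privacy-sensitive
# document-analysis task; they must sum to 1.
CRITERIA_WEIGHTS = {
    "task_fit": 0.20,
    "accuracy": 0.20,
    "speed": 0.10,
    "privacy": 0.25,
    "cost": 0.15,
    "ease_of_use": 0.05,
    "licensing": 0.05,
}

def score_model(scores):
    """Weighted sum of 0-10 scores for each criterion."""
    return sum(CRITERIA_WEIGHTS[c] * s for c, s in scores.items())

# Hypothetical, made-up scores for two candidate approaches.
open_source = {"task_fit": 7, "accuracy": 7, "speed": 6,
               "privacy": 9, "cost": 8, "ease_of_use": 5, "licensing": 9}
proprietary = {"task_fit": 9, "accuracy": 9, "speed": 8,
               "privacy": 5, "cost": 5, "ease_of_use": 9, "licensing": 6}

print(round(score_model(open_source), 2), round(score_model(proprietary), 2))
```

The value of such a sheet is less the final number than the conversation it forces: making the weights explicit surfaces whether privacy, cost, or raw capability is actually driving the decision.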
Table 4: Open Source vs. Proprietary LLMs: Considerations for Selection

Data Sensitivity & Privacy — Open source (pros): data can remain entirely within the organization’s infrastructure. Open source (cons): risk of security vulnerabilities if not properly configured. Proprietary (pros): usually compliant with data regulations (e.g., HIPAA) and offers secure environments. Proprietary (cons): data is often processed externally, raising potential privacy concerns, especially with PHI.

Budget & Resources — Open source (pros): no licensing fees; lower cost for long-term use. Open source (cons): requires in-house technical resources for setup, maintenance, and fine-tuning. Proprietary (pros): support services and infrastructure included in subscription. Proprietary (cons): high licensing and usage fees, particularly for API calls and large-scale applications.
● Learning Objective: To practice applying selection criteria for LLMs in a global health
context.
● Task: Review the data usage policies of ChatGPT (OpenAI’s general policy for consumer
services), Google’s Gemini (through AI Studio or their general terms), and Anthropic’s
Claude (their standard terms). You can search for “[Tool Name] data usage policy” or
“[Tool Name] privacy policy.”
o Identify one key statement from each policy regarding how user data might be
used by the company (e.g., for model improvement, service provision, etc.).
o Based on these statements, what are the top 1-2 data usage concerns a global
health organization should have before its staff uses the free or standard
versions of these tools for drafting internal project documents that might
contain summarized (but not directly identifiable patient) sensitive
programmatic information (e.g., “challenges in TB adherence in District X
relating to social stigma”)?
● Expected Outcomes/Reflection Questions:
o The user should locate and reference specific policy clauses.
o Concerns might include data being used for model training, data retention
periods, and the geographical location of data processing.
o How would these concerns change if the data involved directly identifiable
patient information?
o What steps could the organization take to mitigate these concerns if they still
wish to leverage these tools? (e.g., stricter internal usage guidelines, exploring
enterprise versions with different data policies).
Beyond General Purpose: The Need for Specialization
General-purpose LLMs, trained on diverse internet data, are versatile but can lack depth and
accuracy in specialized fields like medicine, which have specific terminology and require high
factual precision. A general model might struggle with nuanced clinical notes without
significant fine-tuning. This drives the development of domain-specific or purpose-built LLMs,
pre-trained or extensively fine-tuned on field-specific data (e.g., biomedical literature, EHRs).
This focused training leads to more accurate and relevant outputs. Another approach is custom
LLMs, where a general model is fine-tuned on an organization’s specific dataset. General
models offer breadth, specialized models offer depth, and custom models offer tailored
performance. For global health, specialized or custom models often outperform general ones
where accuracy and context are paramount.
[Table: Examples of specialized/domain-specific models — columns: model example, primary focus, training data, key capabilities/tasks (e.g., generation, multimodal analysis), developer/source.]
● Accelerating Drug Discovery: Analyzing literature and datasets to identify drug targets,
predict interactions, and optimize clinical trials. AI agents based on LLMs can
orchestrate discovery workflows.
● Enhancing Diagnostic Support: Assisting healthcare workers in resource-limited
settings by analyzing symptoms, history, and test results (potentially images) to suggest
diagnoses or areas for investigation. This augments, not replaces, clinicians.
● Improving Public Health Surveillance (Infoveillance): Monitoring social media, news,
etc., in multiple languages to detect disease outbreaks, track sentiment on
interventions (e.g., vaccinations), understand health behaviors, and combat
misinformation.
● Synthesizing Research and Evidence: Rapidly processing and summarizing research
papers, reports, and guidelines to help users stay updated and generate evidence-
based recommendations.
● Streamlining Administrative Tasks: Automating tasks like claims processing,
scheduling, drafting correspondence, and generating clinical documentation (e.g.,
ambient scribing from consultations).
● Facilitating Patient Education and Communication: Generating clear explanations of
health conditions/treatments in multiple languages, powering chatbots for health
questions.
Multi-Modal LLMs (MLLMs): AI That Sees and Hears
MLLMs process and integrate various data types (text, images, audio, video). They use
specialized encoders for each modality and fuse the resulting representations into a common
format, allowing the core LLM backbone to reason across the combined information. This enables tasks
like describing images, answering questions about videos, or generating images from text.
Leading examples include GPT-4o, Claude 3 family, and Gemini series. Open-source examples
include LLaVA and Meta’s ImageBind. MLLMs interact with information more as humans do,
leading to more natural interaction and enhanced problem-solving.
Successful implementation requires high-quality, diverse training data, robust reliability and
safety methods, privacy adherence, and careful workflow integration. Specialized models offer
depth but may lack breadth. The multimodality of healthcare data makes MLLMs a relevant
frontier but demands careful validation.
● The Challenge: Generic LLMs might not know the specific national
guidelines or might provide information based on other countries’
protocols.
● The Specialized Solution Idea: You could propose the development of a
custom, lightweight LLM (or a sophisticated RAG system, which we’ll cover
later) fine-tuned specifically on the country’s official malaria treatment
guidelines, related national health policies, and training materials. This
specialized tool could then be embedded into a simple mobile application
for health workers.
● The Value Proposition: Health workers could query the application in
natural language (even offline, if the model is small enough to be on-
device) to get immediate, accurate advice based only on the approved
national guidelines. This improves adherence, standardizes care, and acts
as a persistent job aid. You would emphasize how this goes beyond what a
general-purpose tool like standard ChatGPT could reliably offer for this
specific, high-stakes use case. This leverages the concept of bringing
focused, expert knowledge directly to the point of care.
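The retrieval idea behind the proposed guideline tool can be caricatured as ranking guideline passages by word overlap with a health worker's question. The passages below are invented placeholders; a real system (RAG, covered in a later module) would use embedding-based retrieval over the actual national guidelines:

```python
import re

# Invented placeholder passages standing in for official guideline text.
GUIDELINE_PASSAGES = [
    "Uncomplicated malaria: treat with the first-line ACT per national protocol.",
    "Severe malaria: refer immediately and give pre-referral artesunate.",
    "Malaria in pregnancy: follow the national IPTp schedule.",
]

def words(text):
    """Lowercase alphabetic words, ignoring punctuation."""
    return set(re.findall(r"[a-z]+", text.lower()))

def retrieve(question, passages, top_k=1):
    """Rank passages by simple word overlap with the question."""
    q = words(question)
    ranked = sorted(passages, key=lambda p: len(q & words(p)), reverse=True)
    return ranked[:top_k]

best = retrieve("what should I do for severe malaria", GUIDELINE_PASSAGES)
print(best[0])
```

Even this crude keyword matcher illustrates the core guarantee of the approach: answers are grounded in the approved guideline text rather than in whatever a general-purpose model happens to have memorized.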
● Data Quality and Bias: Visual and audio data can carry significant biases. For example,
skin condition diagnostic models trained predominantly on lighter skin tones may
perform poorly on darker skin. Ensuring diverse and representative multimodal datasets
is critical but challenging.
● Privacy of Multimodal Data: An image of a patient’s home environment during a
telehealth call, or a recording of their voice, can reveal much more sensitive information
than text alone. Secure collection, storage, and processing protocols are paramount.
● Interpretability and Trust: If an MLLM combines an image and text to suggest a
diagnosis, and it’s incorrect, understanding why it made that error can be even more
complex than with text-only models. Building trust with clinicians and patients requires
transparency about these limitations.
1. What additional insights could an MLLM-based analysis provide by combining
these data types, compared to analyzing each one in isolation? For example,
could it help link observed feeding practices (audio/text) with the visual
content of meals (images)?
2. What are three key ethical or practical challenges the client would need to
address before implementing such an MLLM-based analysis? (e.g., consent for
image/audio use, image quality issues, potential biases in interpreting meal
content across different cultural food preparations).
● Expected Outcomes/Reflection Questions:
o The user should be able to suggest integrated analyses, e.g., correlating themes
from audio discussions with visual evidence in meal photos and survey data.
o Challenges might include: ensuring informed consent for multimodal data
collection and AI analysis, technical difficulties with variable image quality,
algorithmic bias in identifying food types or portion sizes across diverse local
cuisines, and privacy concerns if images or audio inadvertently capture
identifiable individuals or sensitive household contexts.
o What kind of expertise would the client need on their team or as partners to
responsibly explore this MLLM application?
AI knowledge quickly becomes outdated. Capabilities that were science fiction a year ago might
be common today. Professionals need awareness of AI developments to understand impacts
on their work and the populations they serve. For global health consultants, this understanding
is vital for identifying opportunities, mitigating risks, and providing informed advice.
o Zvi Mowshowitz’s “Don’t Worry About the Vase” (weekly LLM development
analysis).
● Key Organizations and Research Labs:
o Major AI Labs: OpenAI, Google DeepMind (and Google AI), Meta AI, Anthropic
publish blogs and research.
o Hugging Face: Essential for tracking open-source developments via their blog,
platform updates, and model/dataset repositories.
● Relevant Conferences and Journals (Overview):
o Major AI Conferences (often technical, look for summaries): NeurIPS, ICML,
ICLR.
o Applied Conferences: The AI Summit, World Summit AI.
o Key Journals: Nature Machine Intelligence. For healthcare: NEJM AI, The Lancet
Digital Health.
● Online Communities:
o Hugging Face Hub: Explore models, datasets, “Spaces” (demos).
o Reddit: r/MachineLearning, r/LocalLLaMA, r/huggingface.
o Dedicated Platforms: “God of Prompt” for practical application.
● Key Individuals/Thought Leaders: Follow influential researchers and educators on
LinkedIn or X (formerly Twitter) like Andrew Ng, Yann LeCun, Fei-Fei Li, Cassie Kozyrkov.
[Table 5: Resources for Staying Updated — columns: resource type, specific examples, focus/frequency, accessibility (non-technical); e.g., summits with an applied focus.]
● Bias Awareness: LLMs can perpetuate biases from training data. Be critical of
stereotypical outputs or skewed perspectives.
● Data Privacy: Exercise caution with sensitive information (especially PHI). Understand
provider data policies. Consider solutions with greater data control (e.g., secure open-
source models) for sensitive applications.
● Misinformation and Verification: LLMs can “hallucinate”. Verify critical information
from LLMs against trusted sources before relying on it.
● Transparency and Accountability: LLMs are often “black boxes”. Understand
limitations in tracing outputs and assigning accountability.
● Responsible Use: Avoid generating harmful, deceptive, or abusive content. Consider
real-world impacts. Human oversight (“human-in-the-loop”) is often necessary in high-
stakes situations.
Navigating these dimensions is a fundamental professional competence. For global health
consultants, promoting ethical AI use is a core responsibility.
1. Curate Your Sources (Start Small): Don’t try to drink from the firehose.
Select 2-3 accessible newsletters (e.g., The Neuron, Ben’s Bites) and one
blog (e.g., One Useful Thing) that resonate with you. Dedicate 15-30
minutes, 2-3 times a week, to scan them.
2. Follow Key Organizations & People: Follow major labs (OpenAI, Google
DeepMind, Meta AI, Anthropic) and a few thought leaders in AI for health on
LinkedIn or X for high-level announcements.
3. Hands-on “Playtime”: Schedule 30-60 minutes bi-weekly or monthly to
experiment with a new feature in a familiar tool (ChatGPT, Gemini, Claude)
or try a new, accessible tool mentioned in your curated sources (e.g., a new
no-code AI platform, or exploring a model on Hugging Face Spaces).
4. Focus on Relevance: As you read or experiment, constantly ask: “How
could this be relevant to [a specific global health challenge, a client type, a
project I’m working on]?” This filters information through your professional
lens.
5. Share & Discuss: If you’re part of a team or community of practice, share
interesting findings or tools. Discussing them deepens understanding.

This structured yet manageable approach helps you stay informed about key
trends and identify emerging AI solutions relevant to global health without
requiring you to become a full-time AI researcher.
● Learning Objective: To begin curating resources for continuous learning in AI for global
health.
● Task:
1. From the list of “Accessible Newsletters and Blogs” in the Curated Resource
Report (and Table 5 above), select two newsletters that seem most relevant to
your interests. If possible, visit their websites and review their latest few posts or
sign up if they are free.
2. Identify one AI tool, concept, or research finding discussed in these
newsletters/blogs in the last month that you believe could have a tangible
application or implication (positive or negative) for global health consultancy
work.
3. Briefly explain your reasoning: What specific global health problem or
consultancy task could it relate to, and why is it significant?
● Expected Outcomes/Reflection Questions:
o The user identifies two specific newsletters/blogs and demonstrates
engagement with their content.
o The user selects a relevant AI development and articulates a clear connection to
global health consultancy, considering its potential impact.
o How might you integrate regular review of such resources into your existing
professional development routine?
o What criteria would you use to decide if a newly announced AI tool is worth
dedicated time to explore further for your consultancy practice?
Module 1 Summary
This module has laid the groundwork for your journey to becoming an expert LLM consultant in
global health. You’ve delved into the core technical concepts that power LLMs, from the
revolutionary Transformer architecture to the practicalities of tokens, parameters, and context
windows. You’ve navigated the current LLM ecosystem, comparing leading proprietary models
like GPT, Gemini, and Claude with the burgeoning world of open-source alternatives such as
Llama and Gemma, and considered the critical factors for selecting the right approach in a
global health context. Furthermore, you’ve explored the exciting frontier of specialized and
multimodal LLMs, recognizing their immense potential to address complex challenges in
healthcare research and operations. Finally, you’ve equipped yourself with strategies and
resources for continuous learning and underscored the paramount importance of ethical
considerations in all AI engagements.
This foundational knowledge is crucial as you move towards more advanced topics in
subsequent modules, where you will learn to apply these tools for producing PhD-level
technical documents, customizing LLMs for specific global health needs, and leveraging AI in
advanced research methods.
o JMIR article: “Revolutionizing Health Care: The Transformative Impact of Large
Language Models in Medicine”
o NVIDIA Glossary: “What Are Multimodal Large Language Models?”
● Staying Updated:
o Newsletters: The Rundown, Ben’s Bites, The Neuron
o Blogs: Ethan Mollick’s “One Useful Thing”
Nathan Miller
natemiller33@gmail.com