✦ revision deck · companion volume ✦

AWS AI Practitioner: services & lifecycle

The AWS service zoo, lifecycle terms and FM trap questions, in revision form

20 cards · 6 topics
§ 01

infrastructure

3 cards
01 · infrastructure · Region · AZ ≥ 3

AWS Region

A physical location in the world where AWS clusters its data centres.

  • Each Region has ≥ 3 Availability Zones (the magic number — not 2)
  • AZs are isolated, physically separate, geographically grouped
  • Connected by redundant, ultra-low-latency networks
Exam trap: "Region has minimum 2 AZs" is the wrong-but-tempting option. The minimum is always 3.

Pick a Region for: compliance (data residency), latency (close to users), service availability, price.

02 · infrastructure · AZ · ≥ 1 DC

Availability Zone (AZ)

One or more discrete data centres with redundant power, networking and connectivity, inside a Region.

  • Each AZ has independent power, cooling and physical security
  • AZs in a Region are interconnected via high-bandwidth, low-latency, redundant metro fibre
  • Naming: eu-west-2a, eu-west-2b, etc.
Exam trap: "AZ consists of two or more discrete data centres" is wrong. It's one or more.

Spread workloads across ≥ 2 AZs in the same Region for high availability — single AZ = single point of failure.

03 · infrastructure · Edge · CloudFront

Edge Locations

Separate from Regions and AZs — these are content delivery endpoints close to end users.

  • Used by services like CloudFront (CDN) and Route 53
  • Cache content near users for faster delivery
  • Far more numerous than Regions (400+ globally)
Exam trap: Edge Locations are NOT inside Regions, and Regions are NOT made of Edge Locations. They're a parallel concept.

Hierarchy: Region contains AZs which contain data centres. Edge Locations sit outside this stack.

§ 02

sagemaker family

6 cards
04 · sagemaker · no-code · business users

SageMaker Canvas

Generate ML predictions without writing any code. Visual, drag-and-drop interface for business analysts.

  • Chat with popular LLMs
  • Use ready-to-use models
  • Build custom models on your data — automatically
Exam keyword: "no code" or "business user" or "without writing code" → Canvas. Every time.

Think of it as the friendly front-end to ML for people who don't live in PyCharm.

05 · sagemaker · model hub · one-click

SageMaker JumpStart

An ML hub of foundation models, built-in algorithms and prebuilt solutions you can deploy in a few clicks.

  • Access to popular FMs (Llama, Stable Diffusion, etc.)
  • Pre-built end-to-end solutions for common use cases
  • Fine-tune models on your own data
Exam keyword: "one-click", "end-to-end solutions", "prebuilt", "foundation model hub" → JumpStart.

Canvas vs JumpStart: Canvas = no-code UI for predictions. JumpStart = model + solution hub for builders.

06 · sagemaker · responsible AI · bias

SageMaker Clarify

Two jobs, both critical for Responsible AI:

  • Detects bias in your data and in model predictions across groups
  • Explains predictions — shows how each input feature contributed (SHAP values)
when          | what it checks
pre-training  | bias in your dataset
post-training | bias in model predictions
inference     | explainability per prediction

Exam keyword: "bias", "fairness", "explain predictions", "transparency" → Clarify.
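
To make the pre-training bias check concrete, here is a minimal sketch of one metric of the kind Clarify reports: the difference in proportions of positive labels between two groups. The function name, the records and the field names are hypothetical, and this is plain Python rather than the Clarify API.

```python
def difference_in_proportions(records, group_key, label_key, group_a, group_d):
    """Positive-label rate of group_a minus positive-label rate of group_d."""
    def positive_rate(group):
        labels = [r[label_key] for r in records if r[group_key] == group]
        if not labels:
            raise ValueError(f"no records for group {group!r}")
        return sum(labels) / len(labels)

    return positive_rate(group_a) - positive_rate(group_d)


# Hypothetical loan dataset: label 1 means "approved".
loans = [
    {"group": "a", "approved": 1},
    {"group": "a", "approved": 1},
    {"group": "a", "approved": 1},
    {"group": "a", "approved": 0},
    {"group": "d", "approved": 1},
    {"group": "d", "approved": 0},
    {"group": "d", "approved": 0},
    {"group": "d", "approved": 0},
]

dpl = difference_in_proportions(loans, "group", "approved", "a", "d")
print(dpl)  # 0.75 - 0.25 = 0.5
```

A value near 0 suggests the two groups receive positive labels at similar rates; a large gap is a prompt to look at the dataset before training.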

07 · sagemaker · responsible AI · drift

SageMaker Model Monitor

Continuously watches deployed models in production and alerts when something goes wrong.

  • Detects data drift — when incoming data shifts from training data
  • Detects concept drift — when the underlying relationships change
  • Flags data quality issues and anomalies
  • Alerts you to inaccurate predictions
Exam keyword: "monitor production", "drift", "model degrading over time", "alerts" → Model Monitor.

Clarify catches bias upfront and explains predictions. Model Monitor catches problems after deployment. Both = Responsible AI.
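
The idea of a drift alert can be sketched in a few lines: compare incoming production data with a baseline captured at training time and raise a flag when the shift crosses a threshold. This is only an illustration of the concept, not how Model Monitor is implemented; the function names, the sample values and the threshold of 2.0 are all assumptions.

```python
from statistics import mean, stdev


def drift_score(baseline, incoming):
    """Shift of the incoming mean, measured in baseline standard deviations."""
    return abs(mean(incoming) - mean(baseline)) / stdev(baseline)


def check_drift(baseline, incoming, threshold=2.0):
    """Return True when the shift exceeds the (hypothetical) alert threshold."""
    return drift_score(baseline, incoming) > threshold


training_ages = [30, 32, 34, 36, 38]     # feature values seen at training time
similar_traffic = [31, 33, 35, 34, 37]   # production data that looks like training
shifted_traffic = [60, 62, 65, 61, 63]   # production data that has drifted

print(check_drift(training_ages, similar_traffic))  # False
print(check_drift(training_ages, shifted_traffic))  # True
```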

08 · sagemaker · data prep · no-code

SageMaker Data Wrangler

The fastest, easiest way to prep tabular and image data for ML — with little to no code.

  • Import data from many sources (S3, Athena, Redshift, Snowflake)
  • 300+ built-in transforms (impute, encode, normalise)
  • Visualise distributions and outliers
  • Export as a pipeline or feature set
Exam keyword: "prepare data", "data preparation", "clean data", "tabular and image data for ML" → Data Wrangler.

Different from Ground Truth, which is for labeling data (annotating it), not transforming it.
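
For a feel of what the impute, encode and normalise transforms do, here is a small hand-written sketch. Data Wrangler applies transforms like these through its visual interface; the helper names and sample values below are hypothetical and exist only for illustration.

```python
def impute_mean(values):
    """Replace missing values (None) with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = sum(observed) / len(observed)
    return [fill if v is None else v for v in values]


def min_max_normalise(values):
    """Rescale values linearly into the range 0..1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def one_hot(values):
    """Encode each category as a 0/1 vector, with columns in sorted order."""
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values]


incomes = [20.0, None, 40.0, 60.0]
filled = impute_mean(incomes)
scaled = min_max_normalise(filled)
encoded = one_hot(["red", "blue", "red"])

print(filled)   # [20.0, 40.0, 40.0, 60.0]
print(scaled)   # [0.0, 0.5, 0.5, 1.0]
print(encoded)  # [[0, 1], [1, 0], [0, 1]]
```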

09 · sagemaker · labeling · humans + ML

SageMaker Ground Truth

For labeling and annotating training data — turns raw data into labeled data for supervised learning.

  • Use Mechanical Turk workers, vendors, or your own team
  • Active learning: ML auto-labels easy items, humans handle hard ones
  • Reduces labeling cost vs. doing everything by hand
  • Also: Ground Truth Plus = fully managed labeling service

Exam keyword: "data labeling", "annotate data", "human reviewers add labels" → Ground Truth.

§ 03

responsible AI

2 cards
10 · responsible · core concept

The 8 pillars (AWS Responsible AI)

pillar                | what it means
Fairness              | equitable outcomes across groups
Explainability        | understand how predictions are made
Privacy & security    | protect data & respect rights
Safety                | algorithms work as intended, no harm
Controllability       | humans can oversee & correct AI
Veracity & robustness | accurate & reliable outputs
Governance            | compliance, accountability
Transparency          | openness about capabilities & limits

AWS tools that map here: Clarify (fairness + explainability), Model Monitor (safety + robustness), A2I (controllability).

11 · responsible · human review · A2I

Amazon Augmented AI (A2I)

Implements human review for ML predictions — useful when accuracy matters and the model is uncertain.

  • Built-in workflows for Rekognition, Textract, custom models
  • Trigger human review when prediction confidence is low
  • Or sample a % of predictions for QA review
  • Reviewers can be your team, vendors, or Mechanical Turk

This is the AWS service for human-in-the-loop review, which maps to the Controllability pillar of Responsible AI. It is critical for high-stakes use cases such as medical, legal and content moderation.
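
The two review triggers described on this card (low confidence, or a random QA sample) can be sketched as a simple routing rule. This is a conceptual illustration, not the A2I API; the function name, the 0.8 threshold and the return labels are assumptions.

```python
import random


def route(confidence, threshold=0.8, sample_rate=0.0, rng=random):
    """Decide whether a prediction goes to a human reviewer or is accepted."""
    if confidence < threshold:
        return "human_review"   # the model is uncertain
    if rng.random() < sample_rate:
        return "human_review"   # random QA sample of confident predictions
    return "auto_accept"


print(route(0.55))                    # human_review
print(route(0.95))                    # auto_accept
print(route(0.95, sample_rate=1.0))   # human_review
```

With `sample_rate=0.1`, roughly one in ten confident predictions would still be sent for review, which is how a QA sample keeps an eye on cases the model believes it got right.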

§ 04

lifecycle terms

4 cards
12 · lifecycle · sequence · 4 steps

The ML process sequence

The fundamental 4-step sequence the exam will test:

  1. Data collection — gather from all sources
  2. Data preprocessing — clean, normalise, split train/val/test
  3. Model training — feed prepared data to the algorithm
  4. Model evaluation — assess on the held-out test set
Logic check: you can't preprocess what you don't have, can't train on raw mess, can't evaluate nothing. The sequence is forced by what each step needs.

In the wider ML lifecycle (MLOps), this extends to: framing → data → train → deploy → monitor → iterate.
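
The split mentioned in the preprocessing step can be sketched as follows. The 70/15/15 proportions, the seed and the function name are assumptions chosen for illustration; real projects pick ratios to suit the amount of data.

```python
import random


def split_dataset(records, train=0.7, val=0.15, seed=0):
    """Shuffle the records, then cut them into train / validation / test parts."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)   # fixed seed keeps the split repeatable
    n_train = round(len(shuffled) * train)
    n_val = round(len(shuffled) * val)
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_val],
        shuffled[n_train + n_val:],          # whatever is left becomes the test set
    )


train_set, val_set, test_set = split_dataset(range(100))
print(len(train_set), len(val_set), len(test_set))  # 70 15 15
```

Every record lands in exactly one partition, so the test set stays unseen until the final evaluation step.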

13 · lifecycle · training · learning phase

Training

The phase where the model learns — its internal parameters (weights) are being updated.

  • Uses the training set (~60–80% of data)
  • Algorithm minimises an error function
  • Model "sees" examples and adjusts
  • The output is a trained model
Tell from inference: in training, parameters CHANGE. In inference, parameters are FROZEN. If the question says "adjusting parameters" → training.
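
The "parameters change in training, stay frozen in inference" distinction can be seen in a toy model with a single weight. This is a minimal sketch using plain gradient descent on mean squared error; the data, learning rate and epoch count are illustrative assumptions.

```python
def train(examples, learning_rate=0.05, epochs=200):
    """Fit y ≈ w * x by gradient descent on mean squared error."""
    w = 0.0   # the single parameter, before any learning
    for _ in range(epochs):
        # Gradient of mean((w*x - y)^2) with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in examples) / len(examples)
        w -= learning_rate * grad   # training: the parameter CHANGES
    return w


def predict(w, x):
    """Inference: use the learned parameter as-is, without updating it."""
    return w * x


data = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0)]   # generated by y = 3x
w = train(data)
print(round(w, 3))                 # 3.0
print(round(predict(w, 4.0), 3))   # 12.0
```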
14 · lifecycle · val · test

Validation vs Testing

        | validation                            | testing
when    | during training                       | after training
data    | val set (10–20%)                      | test set (10–20%)
purpose | tune hyperparameters, prevent overfit | unbiased final estimate of performance
repeat? | many times                            | ideally once

Mnemonic: validation = the dress rehearsal (you adjust). Testing = the actual show (you don't tweak after).

Neither involves serving predictions to real users — that's inference.
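
A small sketch of the rehearsal-versus-show idea: the validation set is consulted repeatedly to pick a setting, and the test set is touched once at the end. A decision threshold stands in here for any tunable knob; the scores, labels and candidate values are made up for illustration.

```python
def accuracy(threshold, examples):
    """Share of (score, label) pairs where 'score >= threshold' matches the label."""
    hits = sum((score >= threshold) == bool(label) for score, label in examples)
    return hits / len(examples)


val_set = [(0.2, 0), (0.4, 0), (0.55, 1), (0.7, 1), (0.45, 0), (0.9, 1)]
test_set = [(0.1, 0), (0.48, 0), (0.6, 1), (0.8, 1)]

# Validation: try each candidate many times if needed, and keep the best.
candidates = [0.3, 0.5, 0.7]
best = max(candidates, key=lambda t: accuracy(t, val_set))

# Testing: one final measurement with the chosen setting, no further tweaking.
final = accuracy(best, test_set)

print(best)   # 0.5
print(final)  # 1.0
```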

15 · lifecycle · inference · production

Inference

The deployed model uses its frozen learned parameters to make a prediction on new, real-world input.

  • This is what users actually experience
  • No learning happening — parameters don't change
  • Two modes: real-time (live API, low latency) and batch (process many records at once)
Exam trap: the question describes "user inputs new data → model returns prediction" — this is ALWAYS inference, even though it sounds like it could be validation or testing. Validation & testing happen on data you ALREADY HAVE during the model dev process.

In SageMaker, you deploy a model to an endpoint for real-time inference, or run a batch transform job for bulk.

§ 05

foundation models

3 cards
16 · fm · FM technique · exam trap

Foundation Models use self-supervised learning

The defining technique for FMs. The model is given vast amounts of unlabeled data, then generates labels from the data itself.

  • No humans labeling anything upfront
  • Example: LLM predicts the next word — the "label" is just the next word that already exists in the text
  • Used to pre-train, then fine-tune for downstream tasks
Exam trap: "Unsupervised" is the wrong-but-tempting answer. Don't fall for it.
⨯ Unsupervised = unlabeled data, no labels at all, find patterns
✓ Self-supervised = unlabeled data, model CREATES labels from it

Question keywords: "foundation model" + "generates labels" + "from raw input" → always self-supervised.
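
How "the data labels itself" works can be shown by turning a raw sentence into next-word training pairs. This is a simplified sketch: real LLM pre-training works on tokens at vastly larger scale, and the function name and context size here are assumptions.

```python
def next_word_pairs(text, context_size=3):
    """Turn raw text into (context, next_word) training pairs."""
    words = text.split()
    pairs = []
    for i in range(context_size, len(words)):
        context = tuple(words[i - context_size:i])
        pairs.append((context, words[i]))   # the label is simply the next word
    return pairs


pairs = next_word_pairs("the cat sat on the mat")
for context, label in pairs:
    print(context, "->", label)
# ('the', 'cat', 'sat') -> on
# ('cat', 'sat', 'on') -> the
# ('sat', 'on', 'the') -> mat
```

No human wrote any of these labels; each one already existed in the text, which is the difference from both supervised and unsupervised learning.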

17 · fm · 4 paradigms · quick ref

The 4 learning paradigms

type            | data                            | used for
Supervised      | labeled in/out pairs            | classification, regression
Unsupervised    | unlabeled, no labels created    | clustering, anomaly
Self-supervised | unlabeled, model creates labels | foundation models, LLMs
Reinforcement   | reward signal                   | games, robotics, RLHF

Plus semi-supervised = small labeled + large unlabeled, uses pseudo-labeling.

18 · fm · Bedrock · FM access

Amazon Bedrock

The fully managed service for accessing foundation models via API from multiple providers.

  • Access to Anthropic Claude, Meta Llama, Mistral, Cohere, Amazon Titan, etc.
  • Serverless — no infrastructure to manage
  • Fine-tune FMs on your own data privately
  • Knowledge Bases for RAG, Agents for tool-use
  • Guardrails for safety filtering
Bedrock vs SageMaker JumpStart:
Bedrock = serverless API for FMs, fully managed
JumpStart = deploy FMs to YOUR SageMaker infrastructure, more control
§ 06

fit & challenges

2 cards
19 · fit · bias · variance

Bias & Variance, locked in

           | underfit | overfit
bias       | HIGH     | low
variance   | low      | HIGH
train perf | poor     | great
test perf  | poor     | poor

Memorable framing:

  • Underfit = high bias, model is too simple, biased toward being basic
  • Overfit = high variance, model memorised noise, output swings wildly with new data

Goldilocks zone in the middle. Reduce overfit with: more data, regularisation, early stopping, dropout, ensembling.
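
The two failure modes can be reproduced with two deliberately bad toy models: one that ignores its input (high bias) and one that memorises the training points, noise included (high variance). The data and model names are invented for illustration.

```python
def mse(model, points):
    """Mean squared error of a model over (x, y) points."""
    return sum((model(x) - y) ** 2 for x, y in points) / len(points)


# Training data: y = 2x plus alternating +1 / -1 "noise".
train_points = [(float(i), 2.0 * i + (1.0 if i % 2 == 0 else -1.0)) for i in range(10)]
# Test data: unseen x values following the true, noise-free relationship y = 2x.
test_points = [(i + 0.25, 2.0 * (i + 0.25)) for i in range(10)]

mean_y = sum(y for _, y in train_points) / len(train_points)


def too_simple(x):
    """High bias: ignores x and always predicts the training mean."""
    return mean_y


def memoriser(x):
    """High variance: returns the label of the nearest training point, noise and all."""
    return min(train_points, key=lambda p: abs(p[0] - x))[1]


for name, model in [("too_simple", too_simple), ("memoriser", memoriser)]:
    print(name, mse(model, train_points), mse(model, test_points))
# too_simple 32.0 33.25
# memoriser 0.0 1.25
```

The simple model is poor on both sets, which is the underfit row of the table. The memoriser is perfect on training data but not on unseen data; that gap between training and test error is the overfitting signal.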

20 · fit · challenges · data first

The biggest ML challenge: data

Collecting and preparing high-quality data is THE primary challenge in real-world ML implementation.

  • Algorithms are commodity (open-source, well-documented)
  • Compute is rentable (AWS, GPUs on demand)
  • Data quality, quantity, labeling, bias — all bottlenecks
  • "Garbage in, garbage out" — your model is only as good as its data
Exam rule of thumb: if "biggest challenge" or "primary difficulty" appears in a question, the answer is almost always about data, not algorithms or compute.

Adjacent challenges: model interpretability, deployment costs, monitoring drift, ethical concerns, regulatory compliance.