AI/ML Interview Questions - Easy

Easy-level AI/ML interview questions with LangChain examples and Mermaid diagrams.

Q1: What is the difference between supervised and unsupervised learning?

Answer:

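Supervised: The training data has labels, so the model learns an input → output mapping.
Unsupervised: The data has no labels, so the model has to discover structure (clusters, patterns) on its own.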
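
To make the contrast concrete without an LLM, here is a minimal pure-Python sketch. The toy data, the nearest-centroid classifier, and the 1-D k-means routine are illustrative assumptions rather than part of the original answer. The supervised half learns from labels; the unsupervised half sees only the raw numbers and groups them itself.

```python
from statistics import mean

# Supervised: every training point comes with a label.
labeled = [(1.0, "low"), (1.2, "low"), (0.8, "low"),
           (5.0, "high"), (5.3, "high"), (4.7, "high")]

def fit_centroids(data):
    """Learn one centroid (mean value) per label from labeled data."""
    by_label = {}
    for x, label in data:
        by_label.setdefault(label, []).append(x)
    return {label: mean(xs) for label, xs in by_label.items()}

def predict(centroids, x):
    """Return the label whose centroid is closest to x."""
    return min(centroids, key=lambda label: abs(centroids[label] - x))

# Unsupervised: same points, labels removed; find two groups with 1-D k-means.
# Assumes the points contain at least two distinct values.
def kmeans_1d(points, iterations=10):
    c0, c1 = min(points), max(points)  # simple deterministic initialization
    for _ in range(iterations):
        g0 = [p for p in points if abs(p - c0) <= abs(p - c1)]
        g1 = [p for p in points if abs(p - c0) > abs(p - c1)]
        c0, c1 = mean(g0), mean(g1)
    return sorted(g0), sorted(g1)

centroids = fit_centroids(labeled)
prediction = predict(centroids, 4.5)              # uses what the labels taught
clusters = kmeans_1d([x for x, _ in labeled])     # never sees a label
```

The classifier can name its answer ("high") because it was given names during training; the clustering step can only say which points belong together.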

LangChain Example:

from langchain.prompts import FewShotPromptTemplate, PromptTemplate
from langchain.llms import OpenAI

# Supervised analogy: few-shot prompting with labeled examples
# (the labels guide the model; no weights are updated)
examples = [
    {"text": "I love this product!", "label": "Positive"},
    {"text": "Terrible experience.", "label": "Negative"},
    {"text": "It's okay.", "label": "Neutral"}
]

# How each labeled example is rendered inside the prompt
example_prompt = PromptTemplate(
    input_variables=["text", "label"],
    template="Text: {text}\nSentiment: {label}"
)

few_shot_prompt = FewShotPromptTemplate(
    examples=examples,
    example_prompt=example_prompt,
    prefix="Classify sentiment:",
    suffix="Text: {input}\nSentiment:",
    input_variables=["input"]
)

llm = OpenAI()  # reads the API key from the OPENAI_API_KEY environment variable
result = llm(few_shot_prompt.format(input="This is amazing!"))

Q2: Explain the ML pipeline.

Answer:

LangChain Pipeline:

from langchain.chains import SequentialChain, LLMChain
from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate

# Stage 1: Preprocess
preprocess = LLMChain(
    llm=OpenAI(),
    prompt=PromptTemplate(
        input_variables=["raw_text"],
        template="Clean this text: {raw_text}\nCleaned:"
    ),
    output_key="cleaned"
)

# Stage 2: Extract features
extract = LLMChain(
    llm=OpenAI(),
    prompt=PromptTemplate(
        input_variables=["cleaned"],
        template="Extract key topics: {cleaned}\nTopics:"
    ),
    output_key="topics"
)

# Stage 3: Classify
classify = LLMChain(
    llm=OpenAI(),
    prompt=PromptTemplate(
        input_variables=["topics"],
        template="Classify: {topics}\nCategory:"
    ),
    output_key="category"
)

# Complete pipeline: each stage's output_key feeds the next stage's input
pipeline = SequentialChain(
    chains=[preprocess, extract, classify],
    input_variables=["raw_text"],
    output_variables=["cleaned", "topics", "category"]
)

result = pipeline({"raw_text": "Check out this AI tool!!!"})
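
The same three-stage shape (preprocess, extract, classify) can be sketched with plain functions, which makes the data flow easy to test without an API key. The regex cleaning rule, the topic vocabulary, and the "tech"/"other" rule below are hypothetical stand-ins for real stages, not something the LangChain pipeline above does.

```python
import re

def clean(raw_text: str) -> str:
    """Stage 1: lowercase, strip punctuation, collapse whitespace."""
    text = re.sub(r"[^a-z0-9\s]", "", raw_text.lower())
    return " ".join(text.split())

def extract_topics(cleaned: str) -> list:
    """Stage 2: keep words found in a small, hypothetical topic vocabulary."""
    vocabulary = {"ai", "tool", "price", "refund"}
    return [word for word in cleaned.split() if word in vocabulary]

def classify(topics: list) -> str:
    """Stage 3: toy rule-based classifier over the extracted topics."""
    return "tech" if "ai" in topics else "other"

def run_pipeline(raw_text: str) -> dict:
    """Run the stages in order and keep every intermediate output."""
    cleaned = clean(raw_text)
    topics = extract_topics(cleaned)
    return {"cleaned": cleaned, "topics": topics, "category": classify(topics)}

result = run_pipeline("Check out this AI tool!!!")
```

Keeping the intermediate outputs in the returned dictionary mirrors what `output_variables` does in the chain version: each stage can be inspected on its own when the final answer looks wrong.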

Q3: What is overfitting?

Answer:

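Overfitting: The model fits the noise and quirks of its training data, so it scores well on that data but generalizes poorly to unseen data.
Prevention: More data, regularization, cross-validation, early stopping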
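
Of the prevention strategies listed, early stopping is the easiest to show in a few lines. The sketch below is a minimal illustration under stated assumptions: the validation losses are made-up numbers and `patience` is a hypothetical parameter name. It picks the epoch to keep by watching when validation loss stops improving.

```python
def early_stopping_epoch(val_losses, patience=2):
    """Return the index of the best epoch seen before stopping.

    Training stops once the validation loss has failed to improve
    for `patience` consecutive epochs.
    """
    best_epoch, best_loss, waited = 0, float("inf"), 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_epoch, best_loss, waited = epoch, loss, 0
        else:
            waited += 1
            if waited >= patience:
                break  # validation loss is rising: likely overfitting
    return best_epoch

# Made-up validation losses: improve, then drift upward as the model overfits.
val_losses = [0.90, 0.70, 0.55, 0.50, 0.52, 0.58, 0.49]
best = early_stopping_epoch(val_losses, patience=2)
```

With `patience=2` the loop stops after two non-improving epochs and never sees the last value; a larger patience trades extra training time for a chance of catching late improvements.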

LangChain Example:

# Overfitting analogy in few-shot prompting: many near-identical examples
# Good balance: 3-5 diverse examples

from langchain.prompts import FewShotPromptTemplate

# Good: Balanced examples
good_examples = [
    {"input": "I love this!", "output": "Positive"},
    {"input": "Terrible.", "output": "Negative"},
    {"input": "It's okay.", "output": "Neutral"}
]

# Overfitting: Too specific, won't generalize
overfit_examples = [
    {"input": "I love this product!", "output": "Positive"},
    {"input": "I love this service!", "output": "Positive"},
    {"input": "I love this app!", "output": "Positive"},
    # ... 20 more similar examples
]

Q4: Explain precision vs. recall.

Answer:

Precision: Of the predicted positives, how many are correct? TP / (TP + FP)
Recall: Of the actual positives, how many were found? TP / (TP + FN)

LangChain Evaluation:

from langchain.evaluation import load_evaluator
from langchain.llms import OpenAI

# LLM-graded evaluator; the manual metrics below do not depend on it
evaluator = load_evaluator("labeled_criteria", llm=OpenAI())

# TP/FP/TN/FN are relative to the "Positive" class
predictions = [
    {"input": "I love this!", "output": "Positive", "reference": "Positive"},    # TP
    {"input": "It's okay.", "output": "Positive", "reference": "Neutral"},       # FP
    {"input": "Terrible!", "output": "Negative", "reference": "Negative"},       # TN
    {"input": "Not bad at all.", "output": "Neutral", "reference": "Positive"}   # FN
]

# Calculate metrics
tp = sum(1 for p in predictions if p["output"] == "Positive" and p["reference"] == "Positive")
fp = sum(1 for p in predictions if p["output"] == "Positive" and p["reference"] != "Positive")
fn = sum(1 for p in predictions if p["output"] != "Positive" and p["reference"] == "Positive")

precision = tp / (tp + fp) if (tp + fp) > 0 else 0
recall = tp / (tp + fn) if (tp + fn) > 0 else 0
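
With more than two labels, precision and recall are computed per class by treating one label as "positive" and everything else as "not positive". The helper below is a small sketch of that idea; the function name and the sample labels are illustrative assumptions, not part of the original example.

```python
def precision_recall(references, predictions, positive):
    """Compute precision and recall for one class treated as positive."""
    pairs = list(zip(references, predictions))
    tp = sum(1 for ref, pred in pairs if pred == positive and ref == positive)
    fp = sum(1 for ref, pred in pairs if pred == positive and ref != positive)
    fn = sum(1 for ref, pred in pairs if pred != positive and ref == positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Illustrative labels: reference is the truth, prediction is the model output.
references  = ["Positive", "Neutral",  "Negative", "Positive", "Positive"]
predictions = ["Positive", "Positive", "Negative", "Neutral",  "Positive"]

precision, recall = precision_recall(references, predictions, positive="Positive")
```

Calling the same function with `positive="Negative"` or `positive="Neutral"` gives the per-class numbers for the other labels.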

Q5: What is cross-validation?

Answer:

Purpose: Robust evaluation using all data
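
"Using all data" means every example is tested exactly once and used for training in the other k-1 folds. A minimal index splitter, assuming contiguous folds without shuffling (the function name is hypothetical), could look like this:

```python
def k_fold_splits(n_samples, k):
    """Yield (train_indices, test_indices) for each of k folds.

    The first n_samples % k folds get one extra sample, so every index
    lands in exactly one test fold.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        stop = start + base + (1 if fold < extra else 0)
        yield indices[:start] + indices[stop:], indices[start:stop]
        start = stop

folds = list(k_fold_splits(n_samples=7, k=3))
```

In practice the data is usually shuffled before splitting (and stratified for imbalanced classes); this sketch leaves that out to keep the fold arithmetic visible.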

LangChain Example:

from langchain.evaluation import load_evaluator
from langchain.llms import OpenAI
import numpy as np

# NOTE: create_few_shot_prompt and evaluate_test_set are placeholder helpers
# that this snippet does not define; supply your own implementations.

def cross_validate_prompt(examples, k=5):
    fold_size = len(examples) // k  # leftover examples (len % k) are never tested
    if fold_size == 0:
        raise ValueError("need at least k examples")
    scores = []
    evaluator = load_evaluator("criteria", llm=OpenAI())  # build once, reuse per fold

    for i in range(k):
        # Split data: fold i is the test set, the rest is the training set
        test_start = i * fold_size
        test_end = test_start + fold_size

        test_set = examples[test_start:test_end]
        train_set = examples[:test_start] + examples[test_end:]

        # Create prompt with training examples
        few_shot_prompt = create_few_shot_prompt(train_set)

        # Evaluate on test set
        fold_score = evaluate_test_set(few_shot_prompt, test_set, evaluator)
        scores.append(fold_score)

    return {
        "mean": np.mean(scores),
        "std": np.std(scores)
    }

Q6: Classification vs. Regression?

Answer:

LangChain Examples:

from langchain.chains import LLMChain
from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate

# Classification: Discrete output
classify_chain = LLMChain(
    llm=OpenAI(),
    prompt=PromptTemplate(
        template="Classify this email as Spam, Important, or Normal.\nEmail: {email}\nCategory:",
        input_variables=["email"]
    )
)
result = classify_chain.run(email="Win free iPhone!")
# Example output (may vary between runs): "Spam"

# Regression: Continuous output
score_chain = LLMChain(
    llm=OpenAI(),
    prompt=PromptTemplate(
        template="Rate sentiment 0-100: {review}\nScore:",
        input_variables=["review"]
    )
)
result = score_chain.run(review="Pretty good product")
# Example output (may vary; returned as a string): "72.5"
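
Because an LLM returns free text in both cases, the difference between the two tasks shows up when the output is parsed: a classification result must belong to a fixed label set, while a regression result must be a number in range. The two helpers below are a sketch of that validation step; their names and the clamping behavior are assumptions, not LangChain features.

```python
LABELS = {"Spam", "Important", "Normal"}

def parse_label(raw: str) -> str:
    """Classification: normalize the text and require a known label."""
    label = raw.strip().strip('."\'').capitalize()
    if label not in LABELS:
        raise ValueError(f"unexpected label: {raw!r}")
    return label

def parse_score(raw: str, low: float = 0.0, high: float = 100.0) -> float:
    """Regression: convert to float (raises ValueError on non-numbers)
    and clamp the value into [low, high]."""
    score = float(raw.strip())
    return max(low, min(high, score))

label = parse_label(" spam.\n")
score = parse_score("72.5")
```

Rejecting unknown labels but clamping out-of-range scores is a design choice: a wrong category is unusable, whereas a score of 140 most likely means "very high".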

Q7: What is feature engineering?

Answer:

LangChain Example:

from langchain.chains import LLMChain, SequentialChain, TransformChain
from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate

# Extract features
def extract_features(inputs: dict) -> dict:
    text = inputs["text"]
    return {
        "length": len(text),
        "word_count": len(text.split()),
        "has_exclamation": "!" in text,
        # Guard against empty text to avoid dividing by zero
        "uppercase_ratio": (sum(1 for c in text if c.isupper()) / len(text)) if text else 0.0,
    }

# "text" is not re-emitted here: SequentialChain rejects output keys that
# duplicate an existing variable, and the original input stays available downstream
feature_chain = TransformChain(
    input_variables=["text"],
    output_variables=["length", "word_count", "has_exclamation", "uppercase_ratio"],
    transform=extract_features
)

# Use features for classification
classify_chain = LLMChain(
    llm=OpenAI(),
    prompt=PromptTemplate(
        template="""Features:
- Text: {text}
- Length: {length}
- Words: {word_count}
- Has !: {has_exclamation}
- Uppercase: {uppercase_ratio}

Sentiment:""",
        input_variables=["text", "length", "word_count", "has_exclamation", "uppercase_ratio"]
    )
)

pipeline = SequentialChain(
    chains=[feature_chain, classify_chain],
    input_variables=["text"]
)

Q8: Batch vs. Online Learning?

Answer:

LangChain Example:

from langchain.vectorstores import FAISS
from langchain.embeddings import OpenAIEmbeddings

# NOTE: load_all_documents, initial_docs, and wait_for_new_document are
# placeholders for your own data-loading code; they are not LangChain APIs.

# Batch: Index all at once
documents = load_all_documents()
vectorstore = FAISS.from_documents(documents, OpenAIEmbeddings())

# Online: Start from an initial index, then add incrementally
vectorstore = FAISS.from_documents(initial_docs, OpenAIEmbeddings())

def on_new_document(doc):
    vectorstore.add_documents([doc])  # Incremental update

# Runs forever, updating the index as each document arrives
while True:
    new_doc = wait_for_new_document()
    on_new_document(new_doc)
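
The same contrast holds for model parameters, not just indexes. As a small framework-free sketch (the class name and the sample stream are illustrative assumptions), compare estimating a mean from the full dataset with updating it one observation at a time:

```python
def batch_mean(values):
    """Batch: needs the whole dataset in memory before computing anything."""
    return sum(values) / len(values)

class OnlineMean:
    """Online: keeps a running estimate and updates it per observation."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0

    def update(self, value):
        self.count += 1
        # Move the estimate toward the new value by a shrinking step size
        self.mean += (value - self.mean) / self.count
        return self.mean

stream = [4.0, 8.0, 6.0, 2.0]
online = OnlineMean()
history = [online.update(v) for v in stream]
```

After the last update the online estimate matches the batch result, but it was available (and usable) after every single observation and never required storing the stream.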

Q9: What is transfer learning?

Answer:

LangChain Example:

from langchain.llms import OpenAI
from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate

# Transfer: Use pre-trained LLM for specific task
# (gpt-3.5-turbo is a chat model; the completion-style OpenAI class expects
# an instruct/completion model such as gpt-3.5-turbo-instruct)
llm = OpenAI(model_name="gpt-3.5-turbo-instruct")  # Pre-trained

# Adapt to medical domain with a few-shot prompt (no fine-tuning involved)
medical_chain = LLMChain(
    llm=llm,
    prompt=PromptTemplate(
        template="""Medical assistant. Use general knowledge for medical context.

Examples:
Q: What is hypertension?
A: High blood pressure condition.

Q: {question}
A:""",
        input_variables=["question"]
    )
)

# Transfers general knowledge to medical domain
result = medical_chain.run(question="What causes diabetes?")

Q10: What is data augmentation?

Answer:

LangChain Example:

from langchain.chains import LLMChain
from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate

# Augment by paraphrasing
augment_chain = LLMChain(
    llm=OpenAI(),
    prompt=PromptTemplate(
        template="""Generate 3 paraphrases preserving meaning:

Original: {text}

Paraphrases:
1.""",
        input_variables=["text"]
    )
)

# Original
original = "I love this product!"

# Generate variations (returned as one string; split it into lines before use)
augmented = augment_chain.run(text=original)

# Now have multiple training examples, for instance (actual output will vary):
# - "I love this product!"
# - "This product is amazing!"
# - "I'm really happy with this!"
# - "This product is fantastic!"
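
When calling an LLM for every example is too slow or costly, a rule-based alternative is synonym substitution. The sketch below is a deliberately simple illustration: the `SYNONYMS` table is a hypothetical hand-written dictionary, and the function enumerates every combination rather than sampling, so its output is deterministic.

```python
import itertools
import re

# Hypothetical hand-written synonym table; a real one needs careful curation
# so that substitutions do not change the label.
SYNONYMS = {"love": ["adore", "like"], "product": ["item", "purchase"]}

def augment(text):
    """Return every variant of `text` obtained by swapping in synonyms."""
    # Split into word and non-word tokens so punctuation and spacing survive
    tokens = re.findall(r"\w+|\W+", text)
    options = [[token] + SYNONYMS.get(token.lower(), []) for token in tokens]
    variants = ["".join(combo) for combo in itertools.product(*options)]
    return variants[1:]  # the first combination is the unchanged original

variants = augment("I love this product!")
```

Each variant keeps the original sentiment label, so one labeled sentence becomes several training examples. The number of variants grows multiplicatively with the number of replaceable words, so long texts may need sampling instead of full enumeration.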

Summary

Key ML concepts with LangChain:

  • Learning types: Supervised vs. Unsupervised
  • Pipeline: Data → Model → Deploy
  • Overfitting: Prevention strategies
  • Metrics: Precision, Recall
  • Cross-validation: Robust evaluation
  • Tasks: Classification vs. Regression
  • Features: Engineering and extraction
  • Learning modes: Batch vs. Online
  • Transfer: Leverage pre-trained models
  • Augmentation: Increase training data

All with practical LangChain implementations!
