This repository contains an experimental CrewAI notebook for interview preparation:
The notebook researches a target company and interviewer, generates interview questions for a specific role, and then runs an interactive practice loop where the user answers questions and receives coaching feedback.
The same workflow is now available as a Rich-powered CLI app:
.\interview
To install the console script, run:
.\.venv\Scripts\python.exe -m pip install -e .
interviewer-agent run
You can also pass the interview context up front:
.\interview `
    --company "Maneva" `
    --interviewer "Avery Smith" `
    --position "Forward Deployed Engineer" `
    --job-description "Build AI workflows for industrial customers." `
    --questions 5 `
    --research-depth fast `
    --max-attempts 3
The CLI renders a terminal "operator console" with Rich panels, a callback-driven live dashboard, concise Markdown feedback, and phase markers for BOOT, RESEARCH, QUESTION GEN, PRACTICE, and SESSION SAVED. During research, real CrewAI task callbacks update the company, interviewer, and question progress bars, while tool callbacks update the current activity line.
By default the CLI uses the fast research depth, generates 5 questions, pauses before practice begins, and writes session history to runs/interview-session-YYYYMMDD-HHMMSS.json unless --output is provided. Use --research-depth standard for the original deeper research mode.
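The default session file name follows a timestamp pattern. A minimal sketch of how such a path could be built is shown here; the helper name default_session_path is hypothetical, and the real CLI may construct the path differently:

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def default_session_path(now: datetime, output: Optional[str] = None) -> Path:
    """Return the explicit output path, or a timestamped file under runs/.

    Hypothetical helper: it only illustrates the documented naming pattern.
    """
    if output:
        return Path(output)
    stamp = now.strftime("%Y%m%d-%H%M%S")  # YYYYMMDD-HHMMSS
    return Path("runs") / f"interview-session-{stamp}.json"


example = default_session_path(datetime(2025, 1, 2, 3, 4, 5))
print(example.as_posix())  # runs/interview-session-20250102-030405.json
```

Passing the timestamp in as an argument, rather than calling datetime.now() inside the helper, keeps the function easy to test.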
Interview-ready PNG diagrams are available in diagrams/:
Regenerate them with:
.\.venv\Scripts\python.exe .\scripts\generate_diagrams.py
User inputs
-> Research company
-> Research interviewer
-> Generate interview questions
-> Parse questions from Markdown
-> Ask questions one by one
-> Collect user answer
-> Generate feedback
-> Retry answer or move to next question
The notebook expects these values to be available, usually through a .env file:
ANTHROPIC_API_KEY=...
SERPER_API_KEY=...
ANTHROPIC_API_KEY is used by Claude through CrewAI.
SERPER_API_KEY is used by Serper for web search.
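Because a missing key only surfaces once an agent makes its first call, it can help to check both values up front. The following is a sketch using only the standard library; missing_keys is a hypothetical helper, not part of the notebook, and it assumes the .env file has already been loaded into the process environment:

```python
import os
from typing import Mapping

REQUIRED_KEYS = ("ANTHROPIC_API_KEY", "SERPER_API_KEY")


def missing_keys(env: Mapping[str, str]) -> list[str]:
    """Return the required variable names that are absent or blank in env."""
    return [key for key in REQUIRED_KEYS if not env.get(key, "").strip()]


# In the notebook this would run right after the .env file is loaded.
missing = missing_keys(os.environ)
if missing:
    print("Missing environment variables:", ", ".join(missing))
```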
The notebook uses:
- crewai
- crewai_tools
- python-dotenv
- IPython.display
- re
The key CrewAI imports are:
from crewai import Agent, Crew, LLM, Process, Task
from crewai_tools import ScrapeWebsiteTool, SerperDevTool
The workflow uses Anthropic Claude through CrewAI:
import os

llm = LLM(
    model="anthropic/claude-sonnet-4-6",
    api_key=os.getenv("ANTHROPIC_API_KEY"),
    temperature=0.2,
    max_tokens=1600,
)
claude-sonnet-4-6 is used because the workflow needs tool calling. Earlier Claude Sonnet 4 model IDs caused strict-tool compatibility errors with CrewAI.
temperature is kept low (0.2) on purpose: question generation must produce a stable, parseable format, so reproducibility is preferred over creative variety. The CLI configuration lives in interviewer_agent/workflow.py.
The notebook asks for:
- interviewer name
- company name
- job position
- job description
These values are stored in:
interviewer
company
job_position
job_description
They are then interpolated into the research tasks and question-generation prompt.
The research agent uses two tools:
search_tool = SerperDevTool(n_results=5, max_usage_count=3)
scrape_tool = ScrapeWebsiteTool(max_usage_count=4)
Tool limits are intentional:
- Serper can search the web at most 3 times.
- The scraper can read at most 4 webpages.
- Serper returns 5 results per search.
This prevents the research agent from repeatedly searching and spending unnecessary tokens.
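The limits above imply a fixed worst-case research budget. The arithmetic can be sketched as follows; ResearchBudget is a hypothetical helper used only for illustration, and it assumes each limit applies to the tool instance for the whole run:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResearchBudget:
    """Worst-case tool usage implied by the configured limits."""

    searches: int            # max Serper calls (max_usage_count=3)
    results_per_search: int  # results returned per call (n_results=5)
    scrapes: int             # max scraper calls (max_usage_count=4)

    @property
    def max_search_results(self) -> int:
        # Upper bound on search hits the agent can ever see.
        return self.searches * self.results_per_search


budget = ResearchBudget(searches=3, results_per_search=5, scrapes=4)
print(budget.max_search_results)  # 15
print(budget.scrapes)             # 4
```

Under these assumptions the agent sees at most 15 search hits but can read only 4 of them in full, so it has to be selective about which pages it scrapes.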
The research_agent gathers company and interviewer context.
research_agent = Agent(
    role="Researcher Agent",
    goal="Conduct in depth research on a company.",
    backstory="As a Research specialist your mission is to conduct in depth research on a company.",
    llm=llm,
    tools=[scrape_tool, search_tool],
    max_iter=8,
)
Important details:
- It can search and scrape the web.
- It has a limited number of tool calls.
- max_iter=8 limits its reasoning/tool-use loop.
The coach_agent generates questions and later grades answers.
coach_agent = Agent(
    role="AI Interview Coach",
    goal=f"Coach the user to prepare for an interview for the {job_position} role at {company} by grading the user's answer.",
    backstory=f"You are an expert on technical job interviews in companies like {company}.",
    llm=llm,
)
research_company_task asks the research agent to gather company information relevant to the target role.
research_person_task asks the research agent to gather public information about the interviewer.
define_questions_task asks the coach agent to generate the requested number of
interview questions (default 5, configurable with --questions) using:
- job description
- company research
- interviewer research
It uses the two research tasks as context:
context=[research_company_task, research_person_task]
To keep the output predictable and reliably parseable, the task prompt pins an exact output format instead of asking for free-form Markdown:
- exactly N lines, one question per line;
- each line numbered 1., 2., 3., and so on;
- no title, preamble, closing remark, blank lines, bullets, bold/italic markup, or surrounding quotes.
This canonical numbered shape maps directly onto the parser's numbered pattern, so a well-behaved model run parses cleanly every time.
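The pinned format is strict enough to verify mechanically before parsing. Below is a minimal sketch of such a check; follows_pinned_format is a hypothetical helper rather than code from the repository, and it covers only the line count and numbering rules, not the markup rules:

```python
import re

# A canonical line: an integer, a dot, whitespace, then the question text.
NUMBERED_LINE = re.compile(r"^(\d+)\.\s+(\S.*)$")


def follows_pinned_format(raw: str, expected: int) -> bool:
    """Return True if raw is exactly `expected` lines numbered 1..expected."""
    lines = raw.strip("\n").split("\n")
    if len(lines) != expected:
        return False
    for position, line in enumerate(lines, start=1):
        match = NUMBERED_LINE.match(line)
        # Reject titles, bullets, blank lines, and out-of-order numbering.
        if match is None or int(match.group(1)) != position:
            return False
    return True


print(follows_pinned_format("1. Why this role?\n2. Describe a hard bug.", 2))  # True
print(follows_pinned_format("Questions:\n1. Why this role?", 2))               # False
```

A check like this could decide whether to accept the output as-is or fall back to the tolerant parser.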
The main crew runs the three tasks sequentially:
crew = Crew(
    agents=[research_agent, coach_agent],
    tasks=[research_company_task, research_person_task, define_questions_task],
    verbose=True,
    process=Process.sequential,
    embedder=embedder,
)
result = await crew.kickoff_async({
    "topic": "Write a list of question to prepare for the interview."
})
Task order:
- Research the company.
- Research the interviewer.
- Generate interview questions.
The notebook uses await crew.kickoff_async(...) because Jupyter already runs an event loop. Calling crew.kickoff(...) synchronously inside a notebook can fail.
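The underlying constraint is a general asyncio one: a new event loop cannot be started from code that is already running inside one. The sketch below reproduces it with the standard library only; fake_kickoff is a stand-in coroutine, not a CrewAI call, and CrewAI's own synchronous path may fail in a different way:

```python
import asyncio


async def fake_kickoff() -> str:
    """Stand-in for an async crew run (hypothetical, for illustration)."""
    await asyncio.sleep(0)
    return "questions"


async def notebook_cell() -> tuple[str, str]:
    """Simulates a notebook cell, which already runs inside an event loop."""
    coro = fake_kickoff()
    try:
        asyncio.run(coro)  # tries to start a second loop: not allowed here
        nested = "ok"
    except RuntimeError as exc:
        coro.close()  # avoid a "coroutine was never awaited" warning
        nested = f"failed: {exc}"
    awaited = await fake_kickoff()  # awaiting on the running loop works
    return nested, awaited


nested_result, awaited_result = asyncio.run(notebook_cell())
print(nested_result)
print(awaited_result)
```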
The generated questions are stored as raw Markdown:
questions = result.tasks_output[2].raw
Important: questions is a Markdown string, not a Python list.
The CLI parses it with parse_interview_questions in
interviewer_agent/parsing.py. Even though the prompt
asks for a canonical numbered list, models occasionally drift, so the parser is
tolerant and recognizes several shapes:
- numbered lines ("1.", "2)", "3:"): the canonical, deterministic format;
- plain Markdown bullets ("-", "*", "+");
- notebook-style **Question N:** blocks with quoted text;
- same-line **Question N:** ... entries.
It also trims wrapping markup/quotes and de-duplicates repeated questions. Earlier the parser only handled the **Question N:** notebook format; when a model returned plain bullets, nothing matched and question generation failed with the error "The model returned question text, but no questions could be parsed." The deterministic prompt plus the tolerant parser together prevent that failure mode.
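The tolerant behaviour described above can be sketched as follows. This is an illustrative reimplementation, not the actual parse_interview_questions from interviewer_agent/parsing.py; the function name parse_questions and the regular expressions are assumptions:

```python
import re

NUMBERED = re.compile(r"^\s*\d+[.):]\s+(.+)$")   # "1. text", "2) text", "3: text"
BULLET = re.compile(r"^\s*[-*+]\s+(.+)$")        # "- text", "* text", "+ text"
# "**Question 3:** text", or a bare "**Question 3:**" with text on the next line.
LABELLED = re.compile(r"^\s*\*\*Question\s+\d+:\*\*\s*(.*)$", re.IGNORECASE)


def clean(text: str) -> str:
    """Strip wrapping bold/italic markers and surrounding quotes."""
    return text.strip().strip("*_").strip().strip("\"'").strip()


def parse_questions(raw: str) -> list[str]:
    questions: list[str] = []
    seen: set[str] = set()
    awaiting_text = False  # True right after a bare "**Question N:**" label

    def add(candidate: str) -> None:
        text = clean(candidate)
        if text and text.lower() not in seen:  # case-insensitive de-duplication
            seen.add(text.lower())
            questions.append(text)

    for line in raw.splitlines():
        if not line.strip():
            continue
        labelled = LABELLED.match(line)
        if labelled:
            if labelled.group(1).strip():
                add(labelled.group(1))  # same-line entry
                awaiting_text = False
            else:
                awaiting_text = True    # block entry: text follows
            continue
        match = NUMBERED.match(line) or BULLET.match(line)
        if match:
            add(match.group(1))
        elif awaiting_text:
            add(line)
        awaiting_text = False
    return questions


print(parse_questions("- Why us?\n* Why us?\n+ How do you test?"))
# ['Why us?', 'How do you test?']
```

Lines that match none of the shapes, such as a title or preamble, are skipped rather than treated as questions.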
The notebook contains an older experiment using:
human_input=True
This is not the recommended approach for interview practice.
In CrewAI, human_input=True means the human reviews the agent's final output. It does not behave like a clean interview-answer input mechanism. In this project, it caused the agent to receive the user's answer as feedback on its own generated question, which could make it repeat the question.
Use Python input() instead.
The notebook includes a simple loop that:
- Parses questions.
- Displays one question.
- Collects the user's answer with input().
- Sends the question and answer to the coach agent.
- Displays feedback.
- Moves to the next question.
This is useful for one attempt per question.
The final workflow is the iterative practice loop:
practice_history = await practice_interview_with_iterations(
    questions,
    max_attempts_per_question=3,
)
For each question, it:
- Displays the question.
- Collects an answer attempt.
- Runs a feedback task with the coach agent.
- Displays feedback.
- Lets the user retry the same question.
- Moves to the next question when the user types next.
- Stops cleanly when the user types exit, quit, or stop.
Each attempt is stored in practice_history with:
- question index
- question text
- attempt number
- user answer
- feedback
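The control flow of the iterative loop can be sketched without CrewAI. In this simplified, synchronous version the coach is replaced by a plain callable and input is injected so the loop can be scripted; the function name practice and the dictionary keys are assumptions, not the notebook's actual names:

```python
from typing import Callable

STOP_WORDS = {"exit", "quit", "stop"}


def practice(
    questions: list[str],
    ask: Callable[[str], str],         # e.g. the built-in input
    coach: Callable[[str, str], str],  # stand-in for the coach agent
    max_attempts: int = 3,
) -> list[dict]:
    history: list[dict] = []
    for index, question in enumerate(questions, start=1):
        attempt = 0
        while attempt < max_attempts:
            answer = ask(f"Q{index}: {question}\n> ").strip()
            command = answer.lower()
            if command in STOP_WORDS:
                return history  # stop the whole session cleanly
            if command == "next":
                break           # move on to the next question
            attempt += 1
            history.append({
                "question_index": index,
                "question": question,
                "attempt": attempt,
                "answer": answer,
                "feedback": coach(question, answer),
            })
    return history


# Scripted session: two attempts on Q1, one on Q2, then exit.
scripted = iter(["First try", "Better try", "next", "Only try", "exit"])
history = practice(
    ["Why this role?", "Describe a hard bug.", "Never reached"],
    ask=lambda prompt: next(scripted),
    coach=lambda question, answer: f"Feedback on: {answer}",
)
print([(h["question_index"], h["attempt"]) for h in history])
# [(1, 1), (1, 2), (2, 1)]
```

Injecting ask and coach keeps the loop testable; in an interactive session ask would simply be input.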
To run the notebook end to end, execute the cells in this order:
- Load environment variables.
- Import libraries.
- Create the LLM.
- Enter interviewer, company, role, and job description.
- Create research tools.
- Create the research agent.
- Define company and interviewer research tasks.
- Create the coach agent.
- Define the question-generation task.
- Run the main crew.
- Extract questions.
- Skip the human_input=True experiment.
- Run the iterative feedback loop.
Current limitations:
- Generated questions are still parsed from Markdown with regex (now tolerant of multiple formats), rather than a structured JSON contract.
- The notebook still contains the older human_input=True experiment.
- Research output and feedback are mostly raw Markdown.
- There is no structured scoring rubric yet.
- There is no final performance summary across all attempts.
Possible next steps:
- Make define_questions_task return structured JSON instead of Markdown.
- Remove the deprecated human_input=True cells.
- Add a consistent rubric for feedback, such as clarity, depth, relevance, and confidence.
- Save practice_history to a file after the session.
- Add a final summary task that reviews all attempts and gives an overall improvement plan.