Deep Dive On AI, LLMs and Prompt Engineering For Product Managers
Take a deep breath and think through step by step ...
✋ If you only have 2 minutes, get the highlights here 👇
In this essay we’ll look at how Product Managers can leverage Large Language Models via prompt writing to level up. We’ll start with a brief intro on what AI and machine learning are, touch on how large language models function, and then review how you can use prompt engineering to accelerate delivery and do more with less. This was originally delivered as a presentation @ProductBC.
Data Science Changed My Life, Maybe It Can Change Yours?
Before becoming a data scientist I was a software engineer at Disney working on the virtual world Club Penguin. Club Penguin was a big deal as it had ~300M users and $120M in ARR … but for me it was one of the roughest, bleakest periods in my life. The studio had a very difficult engineering culture; it was a crazy place to work, seldom in a good way and rarely to my advantage. At my lowest point, I was browsing Disney’s internal jobs board and saw a posting for ‘Data Scientist’ … although I didn’t know what a Data Scientist was, I became intrigued and began investigating the role. A few weeks later I applied, ended up getting the job, and it changed my life massively for the better and continues to reverberate positively for me to this very day.
Sometimes Something Comes Along At Just The Right Time
Often when an industry is going through a transition, the dust has to settle on it a little before we can look back, understand what happened and get a sense of what’s next. As product managers we now seem to be settling into a market expectation that we’ll have fewer PMs who are more senior, more ‘super’ individual contributor oriented and focused more on building and less on building organizations of PMs.
This can be taken as tough news, but it’s also worth recognizing the excesses that led us here. A couple years ago I interviewed product managers who worked at companies such as Microsoft, Twitter and Meta. One person at Meta stood out: after talking with them, looking at their resume and asking what they did, I concluded that they hadn’t shipped anything at Meta. How can someone have the title of product manager and not build and ship products? This of course came to be known as the zero percent interest rate product manager, along with the product manager theatre referenced above. Yet in the midst of all this volatility, AI has been a bright spot for tech and for product managers. Maybe AI can change our profession for the better? Before we look at LLMs and prompt writing, this essay is going to spend time quickly building up the foundations of AI to help you understand what you’re doing when writing prompts.
What is AI? Rapid Deep Dive For Product Managers
AI is a broader concept aimed at creating ‘machines’ capable of performing tasks that would typically require human intelligence. Machine Learning (ML) is a way of achieving AI through algorithms that can learn from and make predictions or decisions based on data. Practically speaking AI is ML and ML is a subset of computer science. Machine learning enables computers to learn from data. It uses algorithms to identify patterns in the data. By understanding these patterns, the algorithm can classify new data it’s never seen before and often do so pretty accurately.
AI Is Classification: Red Dots vs Blue Dots
The plot below shows a hypothetical example of a bunch of dots: some are red and some are blue. How can you predict whether a new, hitherto unseen dot is going to be red or blue? Well, you could use basic statistics (y=mx+b) in the form of y=−1.09x+0.31 and do a pretty good job of separating the dots. You could also use a simple machine learning algorithm based on statistical models like regression to predict whether a dot is red or blue. And you could use even more sophisticated models, like a neural net, to do the same thing. Whichever you chose, the training process would try multiple approaches with the algorithm you provided to generate a trained model that can classify the colour of a dot. Once it’s done and you’re happy with it, you can point it at a brand new set of dots the model has never seen before and measure its prediction capabilities. This is the test phase of ML.
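To make this concrete, here’s a minimal sketch (in Python, with made-up dots) of what a ‘trained’ linear classifier boils down to: a rule derived from that line, applied to dots it has never seen. The labeling convention (blue above the line, red below) is an assumption for illustration.

```python
# A minimal sketch of classifying dots with the line y = -1.09x + 0.31.
# Assumption for illustration: dots above the line are "blue", dots
# below it are "red". The unseen dots are invented test data.

def classify_dot(x, y):
    """Predict a dot's colour from which side of the line it falls on."""
    boundary = -1.09 * x + 0.31
    return "blue" if y > boundary else "red"

# The "test phase": run the trained rule on dots it has never seen.
unseen_dots = [(0.0, 1.0), (1.0, -2.0), (-1.0, 2.0)]
print([classify_dot(x, y) for x, y in unseen_dots])  # ['blue', 'red', 'blue']
```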
What Happens When The Dots Get Complicated?
What if your blue and red dots looked like the new plot below? Now ML becomes a ‘must have’ and not a ‘nice to have’. You might need a more sophisticated algorithm to train a model to predict whether dots are red or blue, especially when there are non-linear relationships in the data. In the example below, it’s not so simple to use a linear equation.
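One simple non-linear approach you could reach for here is k-nearest neighbours: predict a new dot’s colour from the colours of the closest labelled dots, no straight line required. A rough sketch with invented training data:

```python
import math

# Sketch of k-nearest neighbours with made-up training dots. Instead of
# fitting a line, we let the nearest labelled dots vote on the colour.
train = [((0.0, 0.0), "red"), ((0.1, 0.2), "red"), ((0.2, 0.1), "red"),
         ((3.0, 3.0), "blue"), ((2.8, 3.1), "blue"), ((3.2, 2.9), "blue")]

def knn_predict(point, k=3):
    """Vote among the k nearest labelled dots."""
    nearest = sorted(train, key=lambda d: math.dist(point, d[0]))[:k]
    votes = [label for _, label in nearest]
    return max(set(votes), key=votes.count)

print(knn_predict((0.05, 0.05)))  # 'red'
print(knn_predict((3.1, 3.0)))    # 'blue'
```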
Running Inference On A Trained Model
This hypothetical example, in a simplified sense, involves calling a function that takes a ‘dot’ and runs inference on it using the model to tell you if the dot is red or blue. Inference means using the model to predict something. That’s the computer science part: at the end of the day, running machine learning inference means that at some level a function is called with parameters and spits out a result. I’m simplifying this (a lot) but essentially this is what is happening when you use an LLM like ChatGPT. You are sending a function your inputs and the function returns a response back to you. When you have lots of data and lots of labels beyond red or blue, things can get complicated, but machine learning can take in millions and billions of parameters and train on millions and billions of rows of these parameters. You can go really wide and deep, and this is exactly what LLMs did to get to our cultural ‘ChatGPT’ moment.
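Here’s what that looks like stripped to the bone: a ‘trained model’ is just numbers (weights), and inference is just a function call that combines them with your inputs. The weights below are hand-written for illustration, not a real trained model.

```python
# Inference stripped to the bone: a function takes your inputs, applies
# the model's (here: hand-written, illustrative) parameters, and returns
# a prediction. Real models do this with billions of parameters.

def run_inference(model, features):
    score = sum(w * f for w, f in zip(model["weights"], features)) + model["bias"]
    return model["labels"][score > 0]

toy_model = {
    "weights": [1.09, 1.0],   # invented weights for a linear model
    "bias": -0.31,
    "labels": {True: "blue", False: "red"},
}
print(run_inference(toy_model, [0.0, 1.0]))   # 'blue'
print(run_inference(toy_model, [1.0, -2.0]))  # 'red'
```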
Precision and Recall
Two concepts to remember with machine learning are precision and recall.
Precision → When the AI makes a prediction, how often is it right?
Recall → Of all the dots that truly belong to a class (say, blue), how many does the model correctly identify? Not the same thing as precision.
You can get really high precision by only making a confident classification when you are quite certain, but then your recall will suffer. In our dots example this means you might pass on making a confident prediction on, say, 80% of the dots but be super confident about classifying the other 20%. Or you can get really high recall by saying every dot (100%) is blue, but have terrible precision: you did correctly identify all the true blue dots, but only because you classified every red and blue dot that way, so you haven’t achieved anything meaningful. Precision and recall are conflicting priorities that the data scientist must balance, and the AI oriented PM is often involved here too, bringing customer and business needs to bear on the tuning decisions. This can be a fun problem to work through, particularly in product!
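If you like seeing the arithmetic, here’s a small sketch of both metrics, including the ‘call everything blue’ strategy from above: recall hits 100% while precision tells the real story. The dot labels are invented.

```python
# Precision and recall for the "blue" class, computed on invented labels.
def precision_recall(actual, predicted, positive="blue"):
    tp = sum(a == positive and p == positive for a, p in zip(actual, predicted))
    fp = sum(a != positive and p == positive for a, p in zip(actual, predicted))
    fn = sum(a == positive and p != positive for a, p in zip(actual, predicted))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

actual = ["blue", "blue", "red", "red", "blue", "red"]
everything_blue = ["blue"] * len(actual)  # the lazy "100% blue" strategy
print(precision_recall(actual, everything_blue))  # (0.5, 1.0)
```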
Large Language Models
Large language models (LLM) like ChatGPT are machine learning algorithms trained with billions of rows of textual data. They are trained to predict the next word in a sentence by analyzing vast amounts of text corpuses. They spot patterns in how words are used and follow these patterns to make educated guesses. When given a phrase, an LLM uses its training to fill in missing words (or tokens), much like completing the next word or two in a familiar saying. This process is repeated countless times, making the model proficient in generating text that flows naturally.
Hallucinations
When you run inference (ask ChatGPT something) you will get a result back that has imperfect precision and recall. Remember, no AI model generally gets either perfect precision or perfect recall; there is always room for error. LLMs can and do generate inaccurate information, known as ‘hallucinations’. These occur when they prioritize generating a wide range of plausible responses (high recall) over ensuring each response is strictly accurate (high precision). This happens because LLMs infer from patterns, not facts, and so can weave convincing but incorrect narratives. When an LLM like ChatGPT predicts the next set of tokens, it bases its predictions on patterns it has learned from the training dataset. However, it doesn’t ‘know’ facts; it infers them based on associations between tokens.
LLMs are trained on extensive data to predict text responses with a balance of precision (relevance) and recall (completeness). They recognize patterns to generate educated guesses for missing words. However, no model achieves perfect precision or recall, sometimes leading to "hallucinations".
Precision and Recall
Precision and recall are two metrics that help us understand an LLM's performance. Precision in this context would be the model's ability to generate correct and relevant tokens. Recall would be its ability to generate all the appropriate tokens that could properly fit in a given context. An LLM can "hallucinate" when striving for high recall, attempting to cover as many plausible tokens as possible, sometimes at the expense of precision, leading to the generation of tokens that make sense in the context of language but aren't factually accurate or relevant to the specific context it was asked about. AI is probabilistic as opposed to deterministic, and this is the biggest difference between machine learning and traditional computer science.
Does That Sound Boring?
Let’s look at examples of precision and recall with some prompts I made in late 2022 with GPT-3, being very naive with my prompting … remember hallucination is a form of recall or precision gone awry … I was playing around with an AI PowerPoint generator tool and just seeing what kind of ridiculous decks I could make for fun at a Christmas party. First up: I created a deck that taught salespeople how to sell cigarettes to Mennonites (btw I’m married to one). Here’s a slide from the deck:
You can see the model is hallucinating: there aren’t really any benefits to smoking cigarettes, but here the model is trying its level best to provide a plausible sounding answer and crafting a sales strategy around it! Yeah, smoking is bad for you.
Next I asked the AI to create an HR deck advising employees what to do if they had an ‘unforeseen accident’ during a meeting:
The recall here is actually pretty good, and for such a ridiculous question the model has neutrally addressed it from all angles. A lot of people stopped here, or they read about the lawyer who got in hot water for relying on ChatGPT after it made up a fictional case. You’ve likely heard of prompt engineering, and when I first heard of it, it sounded somewhat intimidating.
Prompt Engineering + Precision + Recall
LLMs like ChatGPT predict the next token, such as ‘you’ after ‘I love…’ through a process starting with tokenization of the input. They analyze context from extensive text training, calculate potential token probabilities and select the most likely next token based on these probabilities. The prediction may vary with context, randomness, or exploration of less likely options. But it should be able to predict “I love you” as the output amongst other possibilities (“I love dogs”).
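A toy sketch of those mechanics, with a hand-written probability table standing in for billions of parameters (the probabilities are invented for illustration):

```python
import random

# Toy next-token prediction. A real LLM computes these probabilities
# with billions of parameters; this hand-written table just shows the
# mechanics: pick the likeliest token, or sample for variety.
next_token_probs = {
    ("I", "love"): {"you": 0.6, "dogs": 0.25, "pizza": 0.15},
}

def predict_next(context, explore=False):
    probs = next_token_probs[context]
    if explore:  # occasionally surface less likely continuations
        tokens, weights = zip(*probs.items())
        return random.choices(tokens, weights=weights)[0]
    return max(probs, key=probs.get)  # greedy: most likely next token

print(predict_next(("I", "love")))  # 'you'
```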
Prompt engineering for Large Language Models (LLMs) like ChatGPT involves crafting input text (prompts) in a way that guides the model towards generating specific, desired outputs. It's a form of interaction design and a new higher level programming language specific to AI, leveraging the model's training to elicit accurate, relevant responses. This process can significantly enhance both the precision and recall of the model's outputs in the context of machine learning.
I was listening to a podcast by Brian Roemmele on LLMs and prompt engineering, and he likened LLMs to having this amazing set of rich treasures, or tokens, located inside a galaxy of billions of tokens. Prompt engineering was this magical wayfinder you could use to locate all the treasures and find the tokens that weren’t otherwise accessible; to really see what LLMs could do you had to dig deep, and prompt engineering was the best way to access the ‘deeper’ learnings in the galaxy of tokens. This analogy stuck with me and got me really interested in prompt engineering: using the power of language as an almost higher-order language to interface with computers.
Previously, when I was a data scientist trying to extract more precision and recall from models, I’d add more parameters to my model or use principal components analysis, and this meant adding more code and complexity. Explainability was always an issue with non data science folks when sharing the results of my models. Conversely, prompt engineering makes machine learning inference more accessible to people who are good writers. By the power of your writing you can unlock amazing insights and output from LLMs that are simply not available via naive prompting techniques. Prompt engineering will improve the quality of your answers, and that means improving precision and enhancing recall. As Product Managers, writing is a big part of what we do, and so I believe that PMs are well suited to learn this new language and help lead in this new AI era.
Improving Precision
Precision in this context means the model's ability to generate relevant and accurate responses to the prompt. By carefully designing prompts that are clear, specific, and include relevant context or constraints, you can guide the model to produce outputs that closely match the desired information or answer. For instance, instead of asking a vague question, providing detailed context or specifying the format of the desired answer can lead to more accurate and relevant outputs.
Enhancing Recall
Recall refers to the model's capacity to generate all relevant pieces of information related to the prompt. Through prompt engineering, you can structure prompts to encourage the model to explore a broader range of information or ideas that might be relevant to the query. By asking open-ended questions, prompting for lists, or using cues that signal the model to consider multiple perspectives, you can improve the model's recall, ensuring it captures a wider array of relevant responses.
Prompt Engineering’s Application in Machine Learning
Prompt engineering is akin to fine-tuning the interaction between human users and the LLM. It doesn't change the model's internal workings or its training data but optimizes how the model's capabilities are accessed and utilized. This optimization can lead to more effective use of LLMs in various applications, from content creation and question answering to more complex problem-solving tasks. By understanding and applying principles of prompt engineering, users can more effectively harness the power of LLMs, steering them toward producing outputs that are not only more precise and relevant but also comprehensive and informative, thereby enhancing both precision and recall in the model's responses. Yeah prompt engineering is your friend.
Principles Of Prompt Engineering
We’ve noted already that prompt engineering is a strategic method of interacting with Large Language Models (LLMs) like ChatGPT to obtain high-quality, relevant responses. This method involves structuring input (prompts) to guide the model towards generating specific outputs. Understanding and applying the core concepts of prompt engineering—setting the persona, context, situation, task/output, and following guides—can significantly improve the precision and relevance of the answers received from an LLM. We’ll walk through some core principles.
Setting the Persona: By defining a persona, you give the LLM a frame of reference for the type of language, tone, and level of detail expected in the response. This can range from a casual conversational style to a highly technical discourse. Establishing a persona helps tailor the model's output to match the intended audience's expectations and needs, enhancing the answer's relevance and accessibility.
Establishing the Context: Context setting involves providing background information or the broader environment related to the query. It equips the LLM with a comprehensive understanding of the subject matter, allowing for responses that are not only accurate but deeply informed by the specific circumstances or nuances of the topic. This leads to more nuanced and contextually appropriate answers.
Defining the Situation: The situation narrows down the focus from the general context to specific circumstances or events relevant to the prompt. It directs the model's "attention" to the immediate factors or conditions influencing the query, ensuring that the generated responses are directly applicable and tailored to the unique aspects of the situation described.
Specifying the Task/Output: Clearly stating the task or desired output informs the LLM of the exact nature of the response needed, whether it's an explanation, a list, a narrative, or another form of output. This clarity prevents misinterpretation and ensures that the model's computational resources are efficiently directed towards producing the specific type of content requested.
Use of Examples: Including examples within the prompt can guide the model more effectively, especially for complex tasks. Demonstrating the format or structure of the desired response can help the model align its outputs more closely with user expectations. You can steer the output by showing a past example of how it should look, although you will also constrain the model when you do this, so be mindful.
Following Guides: Guides are instructions or principles like "take a deep breath and think through step by step before answering," which encourage thoughtful and deliberate interaction with the LLM. They promote a more reflective approach to prompt engineering, leading to better-structured queries that are more likely to elicit precise, comprehensive, and relevant responses. By considering these guidelines, users can refine their prompts to better leverage the model's capabilities, enhancing the quality of the interaction.
Real World Quick Example
Here is a really quick prompt I wrote that skips some of the principles above for convenience and expediency. I wrote this in less than a couple minutes to generate a paragraph to kick start my writing for this article which originated as a talk with Product BC.
[context] and [persona]
You are an expert Data Scientist who has worked at big tech companies such as Google, Apple, Meta and Tesla. You have a background in statistics, have earned your PhD in statistics and completed a postdoc at Stanford. You are incredibly smart. You have also taken many data science courses from Johns Hopkins University in biostatistics.
[situation]
You are giving an introductory talk on machine learning to a group of product managers who are not engineers or data scientists. You need to show them fundamental concepts of machine learning so that the product managers understand the basics. You are mostly focused on teaching students about large language models and how they work under the hood.
[task]
Write me an easy to understand paragraph that explains how large language models work, what predicting the next token means. It's gotta be really easy to understand for a non technical business audience of product managers. Follow the [guides] when crafting your response.
[guides]
- Take a deep breath and think through step by step before answering.
- Always apply first principles thinking to anything you answer back with.
- Make every word count!
- Double check your work for logical inconsistencies, grammatical errors or just plain nonsense.
- Do not use cringe or hokey language.
- Did I remind you to double check your work?
- I will give you a nice fat $200 tip for an amazing + thoughtful answer.
- You can do it!
You can see that I use square brackets to separate conceptual blocks in my prompts. I’ve seen some people use XML tags or even markdown syntax. It doesn’t matter so much which way you write them as long as you’re consistent and organized. You want to organize the prompt in a way that's both understandable to the human crafting it and usable by the machine interpreting it. By following these prompt engineering principles, iterating and experimenting with them you’ll begin to see the power of this technique. Now we’ll look at some specific optimizations I recommend you incorporate into your prompts.
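If you build prompts programmatically (say, in a script or internal tool), the same ‘consistent and organized’ principle applies. A small sketch, with placeholder block contents:

```python
# Sketch of assembling a prompt from labelled blocks, mirroring the
# [square bracket] convention. Block names and contents are placeholders.
def build_prompt(blocks):
    """Join named blocks into one consistently structured prompt."""
    return "\n\n".join(f"[{name}]\n{text.strip()}" for name, text in blocks.items())

prompt = build_prompt({
    "persona": "You are an expert Data Scientist ...",
    "situation": "You are giving an introductory talk on machine learning ...",
    "task": "Write an easy to understand paragraph that explains ...",
    "guides": "- Take a deep breath and think through step by step.",
})
print(prompt.splitlines()[0])  # '[persona]'
```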
Prompt Engineering Hacks For Product Managers
"Take a deep breath and work on this problem step by step”
If there’s one technique I recommend it’s writing “Take a deep breath and work on this problem step by step," at the end of your prompts, especially for tougher problems. It’s almost like computer science problems: they’re generally easier when they’re broken out into discrete steps. A function should only do one thing, and I wonder if the same principle applies here? More formally though, DeepMind's new technique, Optimization by PROmpting (OPRO), enhances AI math skills through effective, human-like prompts, streamlining problem-solving. This method involves one AI assessing another's solutions, then refining prompts for better outcomes. Key phrases, notably "Take a deep breath and work on this problem step by step," significantly boost AI performance on math tasks.
"Do not use hokey or cringe language in your response”
This is my discovery FWIW and I just can’t recommend it enough, especially if you don’t want to come off like a supercilious embarrassment. I find that the ‘base’ or default state of LLMs is to write effusive, silly, inoffensive yet horrifyingly bland and overtly non-confidence-inspiring drivel. Yeah, it’s that bad. The default prose and thinking is so awful; maybe it’s designed this way so as not to offend the general public, but trust me, you will cause everyone around you to lose confidence if you don’t fix this in your prompt writing or post-response editing. I wonder if this hokeyness has caused reticence amongst product managers to use LLMs. Hokeyness can be bad for your career and mental health!
Unprompted, unrefined LLMs can produce outputs that may come across as cringe or hokey, akin to a mix of sitcom dad puns and dated robot prose. These outputs, often too formal, outdated, or off-topic, resemble a tone-uncertain script filled with clichés and groan-inducing humor. The result is a text that feels more suited for a time capsule than professional settings, charming yet perplexing in its silliness. —> ChatGPT roasting itself
"I’ll give you a nice $20 tip for a good answer, so try your hardest for me.”
If you tell the LLM you’ll give it a tip for a good answer, you often get a better answer. Why is that?
Engagement and Quality Association: Including a mention of tipping can align your prompt with patterns of high-engagement or high-quality exchanges found in the LLM's training data. This could lead to responses that are more detailed or perceived as higher quality, as the model draws from interactions where gratitude or rewards are expressed for helpful information.
Patterns in Data: The LLM's responses are shaped by patterns in its training data. If the data shows that discussions involving tipping are associated with polite, detailed, or constructive exchanges, the model may produce responses that reflect these qualities. The model doesn't understand the concept of a tip but responds based on the characteristics of exchanges where such offers are made. But sometimes, ChatGPT will call you out if you don’t pay up!
ChatGPT has memory now. Be careful about casually using the tipping trick.
You should probably be 100% drug free but LLMs don’t need to be drug free.
I’ve saved the wildest hack for last. You can improve the quality of responses from the LLM when you tell the model to take drugs. 🤯
Take a deep breath, relax, and enter a state of flow as if you've just taken Adderall (mixed amphetamine salts).
https://twitter.com/thatroblennon/status/1750973277290557458
This is a variation on the ‘take a deep breath and think through step by step’ tactic above but it does change the output and depending on the circumstances it can be a marked improvement. Why would this work? I asked AI and it told me that the prompt hack metaphorically boosts the AI's focus, akin to how stimulants enhance human concentration. While not literally enhancing AI capabilities, it frames requests to potentially draw out more concise and focused responses, using creative framing to improve output quality and engagement. These hacks aren’t exhaustive and I’m coming across new ones all the time, testing and adjusting my prompts and updating my custom GPTs with them.
Writing Is Thinking
Decisions are your output as PM
An exceptional product manager stands out not just for making good decisions, but for making informed decisions quickly, and writing them down is often key. Writing is more than communication; it's also a rigorous process of thinking, analyzing, and refining ideas into clear, actionable decisions. Writing to make better decisions, and to get feedback on those decisions, accelerates and improves your product sense. Large Language Models are your friend once again, serving as advanced tools that provide feedback, suggest alternatives, and enhance the decision-making process. As a PM you can harness writing and LLMs for rapid, sound decision-making.
“While product managers may not build the actual product, they do produce something very tangible for a team: decisions.” Blackbox of PM
Writing is an important skill for critical thinking and decision-making. It's more than words; the writing we produce is an internal product designed for readers, whether a specification, presentation, or memo, and it should be clear and interpretable, shaping the experience of both executives and team members. Product Managers always think of the end user experience for the end product … but we should also think about the end user experience of the person reading our writing. Writing effectively bolsters your credibility and influence, yet achieving conciseness and clarity can be a challenging constraint. LLMs are your friend here too, acting as "super editors" to refine writing, ensuring precision and clarity while avoiding unprofessional language (assuming you’re very stern about avoiding hokeyness). By leveraging LLMs for editing, PMs can improve their writing to deliver clear, impactful messages, fostering better decisions and stronger influence. I’m actually underselling how incredible this can be for your effectiveness as a PM.
Great AI prompts are just great writing. Writing is probably the #1 most important skill in the AI era—dare I say even more important than coding.
https://twitter.com/petergyang/status/1766551523600015487?s=12
Tersify Is Your Friend
For product managers, effective writing is crucial for decision-making and influence, ensuring ideas are clear and persuasive. Yet, achieving brevity and clarity requires time and effort. AI can serve as an advanced editor to enhance your writing's impact and clarity.
As a data scientist, I learned the hard way that presenting extensive research to executives can overwhelm them. While reproducible research is vital for credibility, detailing every step from data collection to conclusions can dilute the message. Instead, such details should be summarized or referenced briefly for executive audiences.
Product managers should not only focus on user experience but also on how their writing is perceived by colleagues, especially executives. Business writing is an experience; it should be engaging and clear, not burdensome. Overly wordy and unclear writing can lead to skimming, ignoring, or critical feedback, undermining a PM's influence and credibility. To enhance impact and facilitate better decision-making, it's crucial to edit and refine your writing, aiming for clarity and conciseness.
I’ve created a Custom GPT, Tersify, that you can use in ChatGPT Premium to do this for you. Ask it to ‘tersify’ or ‘use terse’ on a given body of text.
First Principles
First-principles thinking simplifies complex problems by breaking them down into basic elements and reconstructing them from the ground up. It is particularly beneficial for product managers, enabling them to establish clarity and unlock unique insights into the problems they’re solving. Implementing first-principles thinking allows product managers to critically evaluate their decisions and thinking.
Similarly, using AI as a support tool in analysis ensures your logic is sound, making the final documents or presentations coherent, persuasive, and high quality. Before AI, you could read up on first-principles thinking and build it up as muscle memory, and you should still do this! But pre-AI you would double check your work, maybe write your own comments on what you wrote and sit with it for a while. Then you’d get feedback on it, talk it through live and action that feedback. Sometimes the feedback would surface something obvious, sometimes it took a while to arrive, and sometimes you skipped these steps altogether, went straight into the meeting with your customer, executive or manager and got the feedback (usually negative) right there and then! And you learned not to do that.
Wouldn’t it be nice to fast track and iterate on that feedback? Wouldn’t it be nice to be able to self-serve most of this yourself, so that you could almost have a simulation of the meeting before the actual meeting?
Here is a custom GPT I wrote to evaluate from first principles:
Put your arguments, reasons, decision records and analyses into AI and evaluate them via a first principles perspective to find obvious gaps and opportunities to improve.
Anticipating executive feedback allows product managers to proactively refine their content, improving presentations and increasing confidence. This approach supports fast feedback loops, efficient iteration, and optimal use of resources, enabling much of the preliminary work to be done independently. You’re shrinking the time it takes to reach a good decision and get buy in for it, which means you can confidently build sooner and move onto other problems. Time to clarity is an often under-appreciated metric in product orgs and using AI to assist you with first principles thinking can be a big win.
Prompt Fundamentals + Tersify + First Principles In Action
Product Managers often struggle to clearly articulate product ideas and requests to design and engineering teams, a process that is both time-consuming and challenging. Writing detailed one-pagers for every feature request, amid a constant stream of customer suggestions, can easily overwhelm PMs. However, AI's capability to generate templates streamlines this task, enabling the quick creation of one-pagers and efficient management of administrative duties. Acting as an auxiliary PM, AI supports rapid iteration on feature requests, boosts the speed of feature development, and improves collaboration with design and engineering teams. This method not only saves time but also enhances productivity, allowing PMs to provide more value with reduced effort and lower costs. The significant role of AI in speeding up product development, including in the planning and prioritization stages, highlights its deflationary effect, empowering PMs to accomplish more in less time.
When aggregating feature ideas, whether from managers, customers, or stakeholders, these often become part of a 'feedback river', getting logged as ideas, JIRA tickets, or emails, and can easily be lost. While managing them in tools is standard, they may still become overlooked, and sharing them with customers for feedback isn't always straightforward. This can lead to delays in development, as teams may not have clear directions on the feature's requirements or how to proceed.
Product Managers need to sharpen and identify problems very clearly for engineering teams. This takes time, and it means writing up a one pager, whether to research an anchor feature or to establish something for a smaller one. PMs are busy, and you could spend days writing these up over a series of weeks. Actually, you might find that your customers can throw more features and problems at you than you can properly handle. Maybe you think you need to hire an associate PM to work under you to capture all these requirements, validate them and prioritize them against competing priorities, and you just become overwhelmed.
But you can use AI to establish templates that turn around these one pagers in an hour, take care of the administrative parts of your writing and thinking, and act as your associate product manager so you can get going fast. If a customer identifies a problem, you can take their feedback and, without jumping to a solution too early, create a one pager, ask them to validate it and then work with design and engineering to scope it. As a PM I've written up more one pagers than I would have thought humanly possible; doing it by hand would have meant serious overtime, maybe 80-100 hour weeks. Instead AI automates this, and we've been able to iterate on feature requests very quickly, scope them fast and significantly increase feature velocity.
We can establish what we need to build, land on a design quickly and schedule it for engineering. This re-acceleration of feature development has been amazing for me as a PM. It's one example of how AI is deflationary technology for PMs, allowing us to do more with less effort and deliver much more value to customers at lower cost. AI helps accelerate product delivery even at the planning and prioritization level.
Here's the process for walking through a one pager. Instead of creating a custom one pager GPT, here is a suggested outline for a prompt you can use to create a one pager template yourself:
[Persona] Now that you know how to write up a persona, you can do so for the type of product manager you are or want to be!
[Customer] You know your customer and their needs and wants, write that here
[Business Goals] Write your company's vision, strategy and goals here
[Problem] What problem (even briefly) do you think this feature solves?
[Feature] Write up simple bullets of the feature
[Example] Show the AI what a good example of a one pager is at your company
[Guides] Don’t forget the guides!
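The outline above can be scaffolded in code so every feature request gets the same structured prompt. A minimal sketch, where the template wording, function name and all sample section text are my own illustrative assumptions, not part of the original talk:

```python
# Illustrative sketch: assembling the one pager prompt from the
# sections outlined above. All field names and sample text are
# hypothetical placeholders you would replace with your own context.

ONE_PAGER_PROMPT_TEMPLATE = """\
[Persona] {persona}
[Customer] {customer}
[Business Goals] {business_goals}
[Problem] {problem}
[Feature] {feature}
[Example] {example}
[Guides] {guides}

Using the context above, draft a concise one pager for this feature."""

def build_one_pager_prompt(**sections: str) -> str:
    """Fill the template; raises KeyError if a section is missing."""
    return ONE_PAGER_PROMPT_TEMPLATE.format(**sections)

prompt = build_one_pager_prompt(
    persona="Senior PM at a B2B SaaS company",
    customer="Ops managers who need audit trails",
    business_goals="Expand into regulated industries",
    problem="Admins cannot see who changed a configuration",
    feature="- Change log screen\n- Export to CSV",
    example="(paste a strong past one pager from your company here)",
    guides="Be concise; do not invent requirements",
)
```

Keeping the template in one place means each new customer request only needs the `[Problem]` and `[Feature]` sections filled in; the persona, goals and guides stay stable across requests.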
You can use techniques like this to automate the scaffolding and administrative parts of your work as a product manager. By delegating these to AI, you can increasingly focus on higher value activities. This is the optimistic view of AI: it's deflationary technology that will allow us to do more, faster and better.
Summary
This essay illustrates how prompt engineering with Large Language Models like ChatGPT can streamline decision-making and expedite product development by optimizing it upstream. We've also cautioned on the critical balance between precision and recall in AI predictions and on strategically managing AI "hallucinations" for optimal output quality. Data science has significantly improved many lives, mine included. We are now shifting away from the Zero Interest Rate Policy (ZIRP) era, characterized by "product manager theatre" and an over-reliance on frameworks and large teams for execution, and pivoting sharply toward efficiency, speed and delivering value. In this evolving landscape, AI is set to transform all tech professions, particularly product management. Large Language Models (LLMs) present vast potential for product managers, with prompt writing/engineering emerging as a critical skill for leveraging it. Prompt engineering, akin to a higher-level software language, offers a unique advantage to skilled writers. By emphasizing clarity, brevity and first-principles thinking, it can significantly accelerate the impact and efficiency of decision-making and writing, the crucial outputs for product managers.
Parting Objections And Considerations
Be incredibly careful what you pass to LLMs
Not prompting thoughtfully: Don't get called out for over-reliance on ChatGPT to do your work. You shouldn't be doing that anyway: LLMs are a reasoning agent, an accelerant to your thinking, not a replacement for it.
Not carefully reading and editing AI responses: If you aren't taking total ownership of your writing you will likely regret it; something will read oddly and people will start to question you. It's going to happen. You want to be like this person, who starts out by stating that ownership is the #1 factor in distinguishing a PM!
Entering confidential info into public LLMs: I'm a huge proponent of AI + LLMs but I can't caution you enough here. If your organization has a policy on LLM usage, read it thoroughly and respect it. If it doesn't, ask your leader about one. You should be incredibly thoughtful and careful about what you write, copy and paste into these tools.
GDPR violations + breach of trust: Did you paste your customers' PII into a public LLM? Did you enter your company's proprietary and private information? User IDs, which are PII? Internal HR employee data? The upcoming product roadmap for a publicly traded company? You wouldn't post this stuff on social media, so why would you add it to the training cycle of a public LLM? Did you say it's ok because "they don't use the information for training, I checked a box telling them not to"? Ideally your company has a solid, private, SOC 2 compliant LLM where you can enter anything with full confidence. But when using public LLMs such as Claude and ChatGPT you need to exercise caution and prudence as a business leader.
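One practical habit is a quick automated check before anything gets pasted into a public LLM. A minimal illustrative sketch, where the pattern set and function name are my own assumptions; a real deployment would rely on a proper DLP tool, not a handful of regexes:

```python
import re

# Hypothetical pre-paste check for obvious PII. The patterns below are
# deliberately simple examples, not an exhaustive or production list.
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "ssn_like": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone_like": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}

def flag_pii(text: str) -> list[str]:
    """Return the names of PII patterns found in `text`."""
    return [name for name, pat in PII_PATTERNS.items() if pat.search(text)]

# Example: flag a customer email before it leaves your machine.
findings = flag_pii("Contact jane.doe@example.com about the renewal")
```

Even a crude check like this catches the careless cases; the policy questions above still require human judgment.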
I was surprised at the pushback and ambivalence I received when presenting this during my live talk. Is it really that big of a deal? Yes, it is: we're hired as PMs to look after the businesses and products we manage, so take this very seriously.
Wait is this just all a shortcut to oblivion? 🤔
When I was presenting my original talk to Product BC, a question arose about the substantial effort and understanding that prompt writing demands before it produces great answers. I responded that this was pretty valid: there's still a lot of work to be done before products like ChatGPT can provide maximum value to non-power users who just want to type something in and get a great response.
And the natural response is: won't this all just get abstracted away as LLMs get better and better, or as they become customized? We're getting decent circumstantial evidence that better general purpose models won't necessarily deliver a series of step-change improvements, and that accordingly prompt engineering will remain a valuable skill. Expertise in interfacing with the API will likely remain useful.
I freely admit that end users will not write 600 line prompts. Instead they will benefit from prompt engineering done ahead of time, or the prompt will be written dynamically to feed the LLM context not available elsewhere. But the further away from the metal you are, the less effective and savvy you will likely remain.
And an important question arose which we need to address:
"Do you think this logic also applies to the use of ChatGPT in any subject? For example, if you are a product manager and use ChatGPT to help write a PRD, wouldn't you have a less fundamental understanding of that PRD than someone who wrote it from scratch? You might read it and understand it at some level, but would you understand it as well as someone who built it from an understanding of first principles? In other words, would you be further away from the 'metal' of product management? The PRD generated by ChatGPT might actually be better, but the understanding of the PM would surely be worse?" (question from the Product BC Slack)
For the PRD scenario, whether the 'AI' PM would have less understanding than the 'from scratch' PM depends. It's certainly possible, and I concede it's almost a certainty if the AI PM naively wrote a prompt, copied and pasted the output into a PRD on Confluence and called it a day. To compare apples to apples, though, we'd hold the skill sets of both PMs constant.
But I'd go back to first principles as noted and think of this in terms of output: the primary output of a PM, in my mind, is making decisions. And this output can be framed in terms of precision and recall. Precision is what the question is hitting at; the AI PM who quickly wrote a prompt and passed all of their thinking duties to the LLM would likely have worse precision than a thoughtful scratch PM who worked diligently in the problem space, understood the market and customer needs, and re-evaluated their decision from first principles.
Two things emerge though:
One is that a good prompt writer shouldn't view AI as a replacement for their thinking; the LLM is instead a reasoning agent that can actually accelerate and improve their thinking and understanding. Prompt writing is an iterative exercise, just like writing code. Prompts beget answers, which beget more thinking, which leads back to more questions and answers until the problem becomes much sharper than at the outset. The guiding principle is that the PM actually has to be a PM in the first place; they can't walk into the profession without experience and use the LLM to become a PM
ex nihilo
if you will. Yet I believe that prompt writing improves precision, which means PMs make better decisions, grounded even further in first-principles thinking. But the second point is even more important: recall. As PMs we have to not only make great decisions; to accelerate delivery and go fast, we also have to make lots of them. That's recall. LLMs are incredibly useful here because they take care of so much of the administrative parts of a PM's job, letting us focus on the core of the problem. AI is deflationary technology that allows PMs to do more and go faster.
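The precision and recall framing above can be made concrete with the standard definitions. A minimal sketch, where treating a PM's shipped decisions as "predictions" is purely my illustrative analogy and the decision sets are invented:

```python
def precision_recall(made: set[str], good: set[str]) -> tuple[float, float]:
    """
    Analogy only: `made` is the set of decisions a PM shipped,
    `good` is the set of decisions that turned out to be right calls.
    Precision = fraction of shipped decisions that were good;
    recall    = fraction of all good calls the PM actually shipped.
    """
    hits = made & good
    precision = len(hits) / len(made) if made else 0.0
    recall = len(hits) / len(good) if good else 0.0
    return precision, recall

# A 'scratch' PM: few decisions, all right -> high precision, low recall.
scratch = precision_recall({"a", "b"}, {"a", "b", "c", "d"})  # (1.0, 0.5)

# An AI-assisted PM: more decisions shipped, mostly right -> recall rises.
assisted = precision_recall({"a", "b", "c", "e"}, {"a", "b", "c", "d"})  # (0.75, 0.75)
```

The point of the sketch is the trade-off in the essay's argument: using the LLM well should raise recall (more good decisions shipped per unit time) without letting precision collapse.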