Sentiment Analysis Model AI Prompts for NLP Engineers
Basic sentiment analysis is solved. Text is classified as positive, negative, or neutral. Confidence scores range from 0 to 1. Accuracy on clean, straightforward text exceeds 90%. Every major NLP library can do it.
The frontier of sentiment analysis is nuance. Sarcasm. Context-dependence. Mixed sentiments. Implicit opinions. Cultural references. The text whose actual meaning differs from its literal meaning. This is where traditional models fail and where advanced prompting techniques can help.
The question is not how to detect sentiment in “I love this product.” The question is how to detect the skepticism in “Great, another feature nobody asked for.” The question is how to understand that “About what I expected” is a polite way of saying “This was disappointing.”
AI can help you build sentiment analysis that handles nuance. It can help you design prompts that capture context, detect sarcasm, and understand the difference between what is said and what is meant.
AI Unpacker provides prompts designed to help NLP engineers build sentiment analysis models that go beyond basic classification.
TL;DR
- Basic sentiment analysis is solved. Nuance is the frontier.
- Context determines meaning. Prompts must capture context.
- Sarcasm detection requires understanding the gap between literal and intended.
- Mixed sentiments are more common than pure positive or negative.
- Domain-specific training outperforms generic models.
- Human annotation quality determines model quality.
Introduction
Sentiment analysis has three levels of complexity. The first level is polarity: is the text positive, negative, or neutral. The second level is nuance: how positive or negative, with what intensity, toward what specific aspects. The third level is intention: what does the sentiment imply about future actions.
Most practical applications require at least the second level. A product review that says “The battery lasts about 6 hours” is not neutral. It is negative relative to expectations. A support ticket that says “This is getting ridiculous” is not just negative. It is escalating.
Building models that capture this nuance requires thoughtful prompt design, domain-specific training, and careful handling of edge cases.
1. Context-Aware Sentiment Analysis
Sentiment depends on context. The same words can mean different things in different contexts. Prompts must capture this context to produce accurate results.
Prompt for Context-Aware Sentiment Analysis
Design context-aware sentiment analysis for product reviews.
Domain: Consumer electronics reviews
Text example: "The screen resolution is impressive but the battery life is disappointing."
Challenges:
1. Mixed sentiment: Positive about one aspect, negative about another
2. Comparative context: Impressive relative to what?
3. Expectation framing: Disappointing relative to expectations
Traditional approach problems:
- Model outputs single sentiment: "Mixed" or "Neutral"
- Loses aspect-level nuance
- Cannot distinguish which aspects are positive vs negative
Context-aware approach:
Required context:
1. Product category: Consumer electronics
2. Product type: Laptop
3. Reviewer baseline: Typical consumer expectations
4. Competitive context: Similar laptops have 8-10 hour battery
What context enables:
1. "Impressive" is positive about screen resolution
2. "Disappointing" is negative about battery life
3. Intensity of disappointment depends on competitive context
Aspect-level sentiment design:
Task definition:
"Given a product review and a list of product aspects, identify the sentiment toward each aspect."
Aspects for laptops:
- Screen resolution
- Battery life
- Performance
- Build quality
- Price
- Portability
Example analysis:
Input: "The screen resolution is impressive but the battery life is disappointing."
Output:
{
"screen_resolution": {
"sentiment": "positive",
"intensity": 0.8,
"evidence": "impressive"
},
"battery_life": {
"sentiment": "negative",
"intensity": 0.7,
"evidence": "disappointing"
},
"overall_sentiment": "mixed"
}
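The task definition and aspect list above can be assembled into a prompt programmatically, with the model's reply parsed and validated before downstream use. A minimal sketch in Python; the prompt wording follows the task definition above, while the template structure and the validation rule are illustrative assumptions:

```python
import json

ASPECTS = ["screen_resolution", "battery_life", "performance",
           "build_quality", "price", "portability"]

PROMPT_TEMPLATE = (
    "Given a product review and a list of product aspects, identify the "
    "sentiment toward each aspect.\n"
    "Aspects: {aspects}\n"
    "Review: {review}\n"
    "Reply with a JSON object mapping each aspect to sentiment, intensity, "
    "and evidence, plus an overall_sentiment field."
)

def build_prompt(review: str) -> str:
    """Fill the template with the aspect taxonomy and the review text."""
    return PROMPT_TEMPLATE.format(aspects=", ".join(ASPECTS), review=review)

def parse_response(raw: str) -> dict:
    """Parse the model's JSON reply; fail loudly if any aspect is missing."""
    result = json.loads(raw)
    missing = [a for a in ASPECTS if a not in result]
    if missing:
        raise ValueError(f"Model omitted aspects: {missing}")
    return result
```

Validating that every aspect appears in the reply catches the common failure mode where the model silently drops unmentioned aspects instead of marking them neutral.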
Comparative context handling:
Input: "The battery life is about what I expected."
Problem: Neutral on surface, but negative relative to expectations
Context needed: What are typical expectations for this product category?
Resolution:
- If product is budget laptop: "Expected" = "Acceptable" = neutral
- If product is premium laptop: "Expected" = "Below standard" = negative
Prompt design for comparative context:
Task:
"Classify the sentiment of the review, accounting for the stated expectations and whether they were met."
Examples:
- "Better than expected" = positive (exceeded baseline)
- "About what I expected" = neutral or slight negative (depends on expectation type)
- "Worse than expected" = negative (did not meet baseline)
Confidence and uncertainty:
When context is ambiguous, the model should express uncertainty:
Input: "The battery life is acceptable."
Output:
{
"battery_life": {
"sentiment": "neutral",
"intensity": 0.6, # moderate confidence
"uncertainty": "Acceptable is context-dependent; could be positive in budget category or negative in premium category"
}
}
Training data requirements:
1. Aspect-level annotations (not just document-level)
2. Context documentation (product category, competitive landscape)
3. Comparative sentiment examples
4. Intensity ratings
Tasks:
1. Define aspect taxonomy for the domain
2. Design context-gathering approach
3. Create aspect-level annotation guidelines
4. Develop comparative sentiment handling
5. Implement confidence scoring
Generate context-aware sentiment analysis design with aspect-level taxonomy and prompt structure.
2. Sarcasm and Irony Detection
Sarcasm is the gap between what is said and what is meant. It is one of the hardest problems in sentiment analysis. Literal positive can be actual negative.
Prompt for Sarcasm Detection
Develop sarcasm detection for sentiment analysis.
Domain: Social media text (Twitter, Reddit)
Examples of sarcasm:
Example 1: "Oh great, another software update that breaks everything."
- Literal: "Great" suggests positive
- Actual: Negative (sarcasm)
- Signal: "Oh great" is a sarcastic exclamation
Example 2: "Love waiting in line for hours. Best day ever."
- Literal: "Love" and "best" suggest positive
- Actual: Negative (sarcasm)
- Signal: Excessive positive words in negative context
Example 3: "Because nothing says quality like [brand]."
- Literal: Could be positive reference
- Actual: Negative (sarcasm)
- Signal: Implied criticism through irony
Sarcasm detection signals:
Signal 1: Exaggerated positive language
- "Love," "Great," "Best," "Perfect" in negative contexts
- Excessive enthusiasm relative to context
Signal 2: Contextual incongruity
- Positive words describing negative experiences
- Benefits claimed that contradict the overall message
Signal 3: Punctuation and capitalization
- "Oh great" vs "Great"
- "WOW" vs "wow"
- Excessive punctuation !!! ???
Signal 4: Hashtags
- #sarcasm, #not, #irony
- But these are sometimes used sincerely
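The four signal families can be extracted as cheap surface features before any prompt is sent, and fed to the model or a downstream classifier. A rough sketch; the thresholds, word lists, and hashtag set are illustrative:

```python
import re

POSITIVE_WORDS = {"love", "great", "best", "perfect", "amazing"}
SARCASM_TAGS = {"#sarcasm", "#not", "#irony"}

def sarcasm_signals(text: str) -> dict:
    """Extract surface-level sarcasm signals. These are features,
    not a classifier by themselves."""
    lower = text.lower()
    tokens = re.findall(r"#?\w+", lower)
    return {
        # Signal 1: two or more exaggerated positive words
        "exaggerated_positive": sum(t in POSITIVE_WORDS for t in tokens) >= 2,
        # Signal 2 proxy: a classic sarcastic opener
        "sarcastic_opener": lower.startswith(("oh great", "oh wonderful")),
        # Signal 3: punctuation and capitalization
        "excessive_punctuation": bool(re.search(r"[!?]{2,}", text)),
        "shouty_caps": bool(re.search(r"\b[A-Z]{3,}\b", text)),
        # Signal 4: explicit hashtags (sometimes used sincerely)
        "sarcasm_hashtag": any(t in SARCASM_TAGS for t in tokens),
    }
```

Contextual incongruity (Signal 2 in full) genuinely requires a model; the heuristics above only cover the surface cues that correlate with it.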
Prompt design for sarcasm detection:
Task:
"Determine if the text contains sarcasm or irony. If so, identify the true sentiment which differs from the literal sentiment."
Examples:
Input: "Oh great, another Monday. Just what I needed."
Output:
{
"contains_sarcasm": true,
"literal_sentiment": "positive",
"actual_sentiment": "negative",
"sarcasm_signal": "Exaggerated positive ('great', 'just what I needed') used to express frustration about Monday",
"intensity": 0.7
}
Input: "Love how my phone updates at 3am. Very considerate."
Output:
{
"contains_sarcasm": true,
"literal_sentiment": "positive",
"actual_sentiment": "negative",
"sarcasm_signal": "Positive words ('love', 'considerate') describing a frustrating automated behavior",
"intensity": 0.8
}
Input: "This product is amazing. It stopped working after one day."
Output:
{
"contains_sarcasm": true,
"literal_sentiment": "positive",
"actual_sentiment": "negative",
"sarcasm_signal": "Contradiction between 'amazing' and immediate failure description",
"intensity": 0.9
}
Training data requirements:
1. Examples labeled for sarcasm by human annotators
2. Annotator guidelines for what constitutes sarcasm
3. Confidence labels when annotators disagree
4. Context information (previous messages, user history)
Annotation guidelines for sarcasm:
Rule 1: Look for positive words in negative contexts
- Sincere: "I am so happy" on its own
- Sarcastic: "I am so happy" in a context of complaints
Rule 2: Look for contradictions
- Explicit contradiction: "Love problems" while describing problems
- Implicit contradiction: Expectations vs reality
Rule 3: Consider punctuation and capitalization
- Over-the-top punctuation often signals sarcasm
- Excessive caps often signal sarcasm
Rule 4: Consider pragmatics
- "Great" in response to bad news is often sarcastic
- "Sure" as a response to obvious statement can be sarcastic
Challenges:
Challenge 1: Cultural context
- Sarcasm varies by culture and demographic
- What is obviously sarcastic to one group may not be to another
- Solution: Train on diverse data, include demographic context
Challenge 2: Ambiguity
- Some statements are genuinely ambiguous
- Model should express uncertainty
- Do not force classification when ambiguous
Challenge 3: Self-deprecating humor
- "I am so bad at this" can be sincere or humble-brag
- Context matters
- Solution: Look for signals that distinguish self-deprecation from sarcasm
Tasks:
1. Define sarcasm signals for your domain
2. Create annotation guidelines
3. Build diverse training dataset
4. Design ensemble approach (sarcasm detector + sentiment)
5. Implement uncertainty handling
Generate sarcasm detection system with signal taxonomy and training approach.
3. Multi-Aspect Sentiment Analysis
Products and services have multiple aspects. A hotel review can be positive about location but negative about service. A software review can be positive about features but negative about usability.
Prompt for Multi-Aspect Sentiment Analysis
Design multi-aspect sentiment analysis system.
Domain: Hotel reviews
Text example: "The location was perfect and the views were stunning, but the room was small and the service was slow."
Multi-aspect requirements:
Aspect taxonomy for hotels:
1. Location (proximity, convenience, neighborhood)
2. Room (size, cleanliness, amenities, view)
3. Service (staff responsiveness, professionalism, helpfulness)
4. Food (breakfast, restaurant, bar)
5. Value (price vs quality)
6. Facilities (gym, pool, business center)
Task design:
Task: "Identify sentiment toward each aspect mentioned in the review."
Example:
Input: "The location was perfect and the views were stunning, but the room was small and the service was slow."
Output:
{
"location": {
"sentiment": "positive",
"intensity": 0.95,
"evidence": ["perfect", "stunning views"]
},
"room": {
"sentiment": "negative",
"intensity": 0.6,
"evidence": ["small"]
},
"service": {
"sentiment": "negative",
"intensity": 0.6,
"evidence": ["slow"]
},
"food": {
"sentiment": "neutral",
"intensity": null,
"evidence": null
},
"value": {
"sentiment": "neutral",
"intensity": null,
"evidence": null
},
"facilities": {
"sentiment": "neutral",
"intensity": null,
"evidence": null
},
"overall_sentiment": "mixed"
}
Aspect extraction + sentiment classification:
Step 1: Aspect identification
- Which aspects are mentioned?
- Explicit mentions: "the room was..."
- Implicit mentions: "the view was..." (aspect = location or view)
Step 2: Sentiment extraction per aspect
- What is the sentiment toward each mentioned aspect?
- Use aspect-specific sentiment cues
Step 3: Intensity estimation
- How strong is the sentiment?
- "OK" vs "perfect" vs "adequate"
- Either a signed score from -1 (very negative) to +1 (very positive), or a 0-1 magnitude paired with a polarity label, as in the example output above
Step 4: Overall aggregation
- How to combine aspect sentiments?
- Weight by aspect importance (explicit vs implicit mentions)
- Weight by aspect salience (first mentioned vs last)
Aspect importance weighting:
Explicit mentions get higher weight:
- "The room was great but the service was slow"
- Room: positive, Service: negative
- Both explicitly mentioned
Implicit mentions still count:
- "The view was stunning and we loved the location"
- View and location: positive
- Other aspects: neutral (not mentioned)
Aspect co-occurrence patterns:
Pattern 1: Contrastive
- "The room was great but the service was terrible"
- Both aspects explicitly contrasted
- Intensity should reflect the more extreme sentiment
Pattern 2: Additive
- "The room was great and the service was good"
- Both positive
- Overall strongly positive
Pattern 3: Mixed
- "The location was perfect but everything else was mediocre"
- Location: strongly positive
- Other: neutral to negative
- Overall: mixed with location weight
Aspect-level aggregation strategies:
Strategy 1: Average
- Sum all aspect sentiments / number of mentioned aspects
- Simple but loses contrastive information
Strategy 2: Weighted by mention
- More mentions of an aspect = more weight
- More explicit sentiment words = more weight
Strategy 3: Critical aspect override
- If any critical aspect is negative, overall may be negative
- Critical: For hotels, room and service are often critical
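The three strategies can be compared side by side. A sketch assuming each aspect carries a signed score (negative for negative sentiment, None if unmentioned) and a mention count; the critical-aspect list and the thresholds are illustrative:

```python
def aggregate_average(aspects: dict) -> float:
    """Strategy 1: plain mean over mentioned aspects."""
    scores = [a["score"] for a in aspects.values() if a["score"] is not None]
    return sum(scores) / len(scores) if scores else 0.0

def aggregate_weighted(aspects: dict) -> float:
    """Strategy 2: weight each aspect by its mention count."""
    num = den = 0.0
    for a in aspects.values():
        if a["score"] is not None:
            num += a["score"] * a["mentions"]
            den += a["mentions"]
    return num / den if den else 0.0

def aggregate_critical(aspects: dict, critical=("room", "service")) -> str:
    """Strategy 3: a clearly negative critical aspect overrides the mean."""
    for name in critical:
        a = aspects.get(name)
        if a and a["score"] is not None and a["score"] < -0.5:
            return "negative"
    mean = aggregate_average(aspects)
    return "positive" if mean > 0.2 else "negative" if mean < -0.2 else "mixed"
```

Running all three on the same review makes the trade-offs concrete: the average hides a strongly negative room score that the critical-override strategy surfaces.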
Comparative reviews:
Input: "Much better than the last hotel we stayed at."
Output:
{
"comparative_sentiment": "positive",
"comparative_target": "other hotels in general",
"reason": "Relative comparison implies positive experience"
}
Training data requirements:
1. Reviews with aspect-level annotations
2. Aspect mention boundaries marked
3. Sentiment and intensity per aspect
4. Comparative sentence identification
Tasks:
1. Define aspect taxonomy for domain
2. Create aspect extraction guidelines
3. Design intensity scale and examples
4. Build aggregation strategies
5. Develop comparative handling
Generate multi-aspect sentiment analysis design with taxonomy and aggregation approach.
4. Emotion Detection Beyond Sentiment
Sentiment is coarse: positive, negative, or neutral. Emotion is finer-grained: distinct categories, each with its own intensity and arousal. Moving beyond basic sentiment to emotion detection unlocks richer understanding.
Prompt for Emotion Detection
Design emotion detection system for customer feedback.
Domain: Customer support conversations
Emotion categories: anger, frustration, disappointment, confusion, satisfaction, appreciation, urgency, neutral
Why emotion detection matters:
Sentiment: "I am frustrated with the wait time."
Emotion: frustration
Both indicate negative sentiment, but:
- Frustration requires different handling than disappointment
- Anger requires escalation, confusion requires education
- Same sentiment, different actions
Emotion taxonomy design:
Primary emotions for customer feedback:
1. Anger (high negative arousal)
- "This is ridiculous"
- "I cannot believe this"
- "You people are useless"
- Indicators: Caps, exclamation marks, profanity
2. Frustration (medium negative arousal)
- "I have been trying to fix this for days"
- "This keeps happening"
- "Why is this so complicated"
- Indicators: Repeated attempts, "keep happening", complexity focus
3. Disappointment (low negative arousal)
- "I was hoping this would work"
- "This is not what I expected"
- "I expected more"
- Indicators: "Expected," "hoped," unmet assumptions
4. Confusion (uncertainty)
- "I do not understand"
- "How do I..."
- "What does this mean"
- Indicators: Questions, "do not understand," "confused"
5. Satisfaction (positive)
- "This worked perfectly"
- "Exactly what I needed"
- "Thank you for helping"
- Indicators: Positive language, gratitude
6. Appreciation (positive, specific)
- "Thank you so much for..."
- "I really appreciate..."
- "You went above and beyond"
- Indicators: Gratitude language, specific praise
7. Urgency (action needed)
- "I need this fixed now"
- "This is blocking our entire team"
- "We cannot wait"
- Indicators: Time pressure, impact statements, "now," "immediately"
8. Neutral (no strong emotion)
- "I have a question"
- "Can you help me with..."
- "I wanted to check on..."
- Indicators: No strong emotion words
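The indicator phrases in the taxonomy above can be compiled into a first-pass scorer that routes feedback before (or alongside) a model call. A toy sketch; the keyword lists are truncated illustrations of the taxonomy, not production lexicons:

```python
EMOTION_INDICATORS = {
    "anger": ["ridiculous", "cannot believe", "useless"],
    "frustration": ["keeps happening", "trying to", "so complicated"],
    "disappointment": ["expected more", "not what i expected", "was hoping"],
    "confusion": ["do not understand", "what does this mean", "how do i"],
    "appreciation": ["thank you", "appreciate", "above and beyond"],
    "urgency": ["need this fixed now", "blocking", "cannot wait"],
}

def detect_emotion(text: str) -> str:
    """Return the emotion with the most matched indicator phrases;
    fall back to 'neutral' when nothing matches."""
    lower = text.lower()
    scores = {
        emotion: sum(phrase in lower for phrase in phrases)
        for emotion, phrases in EMOTION_INDICATORS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "neutral"
```

A lexicon scorer like this misses paraphrase and sarcasm, which is exactly why it belongs alongside, not instead of, a prompted model.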
Prompt design for emotion detection:
Task: "Identify the primary emotion expressed in this feedback."
Examples:
Input: "I have been trying to reach someone about this issue for three days. Three days. This is completely unacceptable."
Output:
{
"emotion": "frustration",
"confidence": 0.92,
"indicators": ["three days", "trying to reach", "completely unacceptable"],
"arousal": "high",
"suggested_action": "Priority escalation, acknowledge delay, provide timeline"
}
Input: "The product arrived damaged but I guess these things happen."
Output:
{
"emotion": "disappointment",
"confidence": 0.78,
"indicators": ["damaged", "I guess"],
"arousal": "low",
"suggested_action": "Replacement, apology for disappointment"
}
Input: "WHAT DO I DO IF THE BUTTON DOES NOT WORK??"
Output:
{
"emotion": "confusion",
"confidence": 0.85,
"indicators": ["WHAT", "??", "question about action"],
"arousal": "high",
"suggested_action": "Clear instructions, offer call support"
}
Emotion intensity:
Emotion intensity scales from mild to strong:
- "I am a bit frustrated" = mild frustration
- "I am frustrated" = moderate frustration
- "I am extremely frustrated" = strong frustration
- "I am FURIOUS" = anger
Why intensity matters:
- Mild frustration: Standard response OK
- Strong frustration: Needs priority handling
- Anger: Needs immediate escalation
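The intensity ladder above can be approximated with modifier cues. A sketch; the downtoner and intensifier lists, and the shouting-caps rule, are illustrative assumptions:

```python
import re

def emotion_intensity(text: str) -> str:
    """Map downtoners, intensifiers, and shouting to mild/moderate/strong."""
    lower = text.lower()
    if re.search(r"\b[A-Z]{4,}\b", text):  # FURIOUS, UNACCEPTABLE, ...
        return "strong"
    if any(w in lower for w in ("extremely", "completely", "absolutely")):
        return "strong"
    if any(w in lower for w in ("a bit", "slightly", "somewhat")):
        return "mild"
    return "moderate"
```

The output can then gate the response path: mild maps to the standard queue, strong to priority handling or escalation.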
Mixed emotion handling:
Input: "I am frustrated with the wait but grateful for the agent's help."
Output:
{
"emotions": [
{"emotion": "frustration", "intensity": "moderate"},
{"emotion": "appreciation", "intensity": "mild"}
],
"primary_emotion": "frustration",
"note": "Mixed emotion; appreciation may be attempt to soften frustration"
}
Training data requirements:
1. Emotion labels by trained annotators
2. Confidence labels when ambiguous
3. Context information (previous conversation)
4. Outcome information (what happened next)
Annotation guidelines:
1. Primary emotion only (most dominant)
2. Intensity scale: mild, moderate, strong
3. Uncertainty when text is ambiguous
4. Consider cultural and demographic factors
Tasks:
1. Define emotion taxonomy for domain
2. Create annotation guidelines with examples
3. Design intensity scales
4. Build mixed emotion handling
5. Implement confidence scoring
Generate emotion detection system with taxonomy and handling of mixed emotions.
FAQ
What is the best approach for sarcasm detection?
Use an ensemble approach. Train a sarcasm detector as a separate binary classifier. Use it as a feature for sentiment analysis. When sarcasm is detected, flip or modulate the sentiment. This is more robust than trying to build a single model that handles both.
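That ensemble can be wired together in a few lines. A sketch assuming two hypothetical upstream models exposed as functions returning (label, confidence); the flip threshold and the confidence-discounting rule are assumptions:

```python
def ensemble_sentiment(text, sentiment_model, sarcasm_model,
                       flip_threshold=0.7):
    """Flip the literal sentiment when the sarcasm detector is confident;
    otherwise discount confidence in the literal reading."""
    label, confidence = sentiment_model(text)
    is_sarcastic, sarcasm_conf = sarcasm_model(text)
    if is_sarcastic and sarcasm_conf >= flip_threshold:
        flipped = {"positive": "negative", "negative": "positive"}
        return flipped.get(label, label), min(confidence, sarcasm_conf)
    if is_sarcastic:  # detected but low confidence: modulate, do not flip
        return label, confidence * (1 - sarcasm_conf)
    return label, confidence
```

Keeping the two models separate means each can be retrained and evaluated independently, which is the main robustness argument for the ensemble.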
How do I handle mixed sentiments in training data?
Allow multi-label annotations. A text can be both positive and negative. Or use a confidence-weighted approach where the model outputs probability for each sentiment. Accept that mixed sentiments are valid and should not be forced into single categories.
How do I build domain-specific sentiment models?
Start with a general model as a baseline. Collect domain-specific labeled data. Fine-tune the model on domain-specific data. The domain specificity comes from the training data, not from the model architecture.
What accuracy should I expect for nuanced sentiment analysis?
Basic sentiment: 90%+ accuracy. Aspect-level sentiment: 80-85% accuracy. Sarcasm detection: 70-80% accuracy. Emotion detection: 75-85% accuracy. These are rough estimates; your actual accuracy depends on data quality and domain fit.
Conclusion
Nuance is the frontier of sentiment analysis. Basic polarity is solved. Detecting sarcasm, understanding mixed sentiments, and identifying emotions requires thoughtful prompt design, domain-specific training, and careful handling of ambiguity.
AI Unpacker gives you prompts to design context-aware sentiment analysis, sarcasm detection, multi-aspect analysis, and emotion detection. But the domain expertise to define what matters, the data quality to train models, and the judgment to interpret results — those come from you.
The goal is not perfect sentiment analysis. The goal is actionable insight from what customers are saying and how they are saying it.