Hey, everybody. Today, I'd like to share with you my process of creating training data when
you're using ChatGPT, Claude, or other AI models. Training data can be essential to
your task whenever you're using an AI model because it can lead to even more efficient
and accurate responses. You're essentially providing additional context and knowledge
in order for the AI model to act as an even better partner to complete the task. The issue
is that sometimes this training data isn't easy to get or find, so using ChatGPT to
create this for you can lead to even more efficient results. I'm going to walk through
this task that I was looking at and how I created training data for the task. For this,
I was interested in how many aggregator or affiliate sites were in my search landscape.
As an SEO, it's really important to understand what types of sites you're competing against
online, whether those are direct competitors, search competitors, aggregators, and so on.
Within my specific search landscape, I found that there were 150 competitors that I was
competing with often online. I'm interested in the percentage of those that are aggregators,
so sites that are creating content designed to compare, review, and rank products or services.
I provided this information to ChatGPT and gave it the list of 150 competitors that I
AI-generated overview
Seer Interactive demonstrates how generating training data within ChatGPT significantly improves categorization accuracy for SEO competitive analysis. The test case: categorizing 150 competitors as aggregator vs. non-aggregator sites. Without training data, ChatGPT identified 16 aggregators (10.7%). After first asking ChatGPT to generate a list of 100 example aggregator sites with definitions ("includes listicle content, like top 10 payroll companies or best internet providers"), then providing the same 150-competitor list, it identified 32 aggregators (21.3%)—doubling the detection rate. The training data approach creates a "context window" priming the model with relevant examples before the actual task. Validation remains critical: manual spot-checking confirmed Backpacker.com (flagged in the trained version) publishes gear guides and listicles characteristic of aggregators, which the baseline version missed. The method applies to other categorization tasks like thematic keyword analysis, organic ranking term grouping, and qualitative research where providing examples upfront yields "more organized and efficient" outputs.
Without training data, ChatGPT identified 16 aggregator sites from 150 competitors; after generating 100 example aggregators as training data, it identified 32, doubling the share flagged as aggregators from 10.7% to 21.3%.
Training data is created by asking ChatGPT itself: "Create a list of 100 aggregator websites" with a definition ("listicle content, like top 10 payroll companies") in a table format with domains.
The training data approach "creates that context window of aggregator and affiliate websites, so it's already starting to think about some of this information or generate some of this information" before the actual categorization task.
Manual validation revealed the trained model correctly identified Backpacker.com as an aggregator due to its "Winter Gear Guide" and other listicle content, which the baseline model missed entirely.
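The two-step workflow summarized above can be sketched in Python as a pair of prompt builders plus the percentage arithmetic. This is a minimal sketch under stated assumptions, not the exact prompts used in the video: the function names, the wording around the quoted phrases, and the `example.com` placeholder domain are all hypothetical. Only the "Create a list of 100 aggregator websites" request, the listicle definition, the table format, and the 16 and 32 counts out of 150 come from the source. No model API is called; the functions only build the text you would paste into one conversation.

```python
from typing import List

# The definition paraphrases the video; the exact wording is an assumption.
AGGREGATOR_DEFINITION = (
    "An aggregator site creates content designed to compare, review, and rank "
    "products or services, including listicle content like "
    "'top 10 payroll companies' or 'best internet providers'."
)


def build_example_prompt(n_examples: int = 100) -> str:
    """Step 1: ask the model to generate its own example aggregators."""
    return (
        f"Create a list of {n_examples} aggregator websites. "
        f"{AGGREGATOR_DEFINITION} "
        "Return a table with one domain per row."
    )


def build_categorization_prompt(competitors: List[str]) -> str:
    """Step 2: sent in the same conversation, after the examples exist."""
    domains = "\n".join(competitors)
    return (
        "Using the aggregator examples above as context, label each of the "
        "following domains as 'aggregator' or 'non-aggregator':\n" + domains
    )


def aggregator_share(flagged: int, total: int) -> float:
    """Percentage of competitors flagged as aggregators, to one decimal."""
    return round(100 * flagged / total, 1)


if __name__ == "__main__":
    print(build_example_prompt())
    # example.com is a hypothetical placeholder domain.
    print(build_categorization_prompt(["backpacker.com", "example.com"]))
    print(aggregator_share(16, 150))  # baseline run without examples
    print(aggregator_share(32, 150))  # run after generating examples
```

Sending both prompts in the same conversation is what keeps the generated examples in context for the categorization step. The flagged domains still need a manual spot check, as with Backpacker.com.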
🔍 About This Video: Having trouble finding training data to use as your AI knowledge base? In this video, Nick walks through his process of using ChatGPT to create context data leading to more efficient and accurate AI outputs.
📊 Tools & Resources:
ChatGPT Conversation: https://chatgpt.com/share/e/7a813b06-ce32-4a47-b6d4-37e05e2abe72
Timestamps:
00:00 - Introduction
00:40 - Aggregator Categorization Overview
01:45 - Initial Results
03:30 - Using ChatGPT to Create Examples & Context Data
05:55 - Updated Results
06:30 - Validating Results
06:30 - Final Thoughts & Recommendations
💡 Have Suggestions or Questions? Your feedback is invaluable. Share your ideas for updates or any questions in the comments section below!
👍 Like, Comment, and Subscribe! Enjoyed the video? Make sure to like, comment, and subscribe for more helpful guides and insights on SEO and digital marketing strategies.
🔗 Follow Us:
Our Website: https://seerinteractive.com/
Twitter: https://twitter.com/SeerIntera