Hey, everybody. Today, I'd like to share with you my process of creating training data when
you're using ChatGPT, Claude, or other AI models. Training data can be essential to
your task whenever you're using an AI model because it can lead to even more efficient
and accurate responses. You're essentially providing additional context and knowledge
in order for the AI model to act as an even better partner to complete the task. The issue
is that sometimes this training data isn't easy to get or find, so using ChatGPT to
create this for you can lead to even more efficient results. I'm going to walk through
this task that I was looking at and how I created training data for the task. For this,
I was interested in how many aggregator or affiliate sites were in my search landscape.
As an SEO, it's really important to understand what types of sites you're competing against
online, whether those are direct competitors, search competitors, aggregators, and so on.
Within my specific search landscape, I found that there were 150 competitors that I was
competing with often online. I'm interested in the percentage of those that are aggregators,
so sites that are creating content designed to compare, review, and rank products or services.
I provided this information to ChatGPT and gave it the list of 150 competitors that I
was interested in. I'm going to scroll through my chat really quick. I'm asking it to tell
me the percentage of these sites that are aggregator sites and then show me the list
of aggregators versus non-aggregator sites. I want to have two buckets, the aggregators
and non-aggregators. Without any previous context, ChatGPT already has some of this
training data in its backend. It already knows types of sites like Forbes, New York
Times, and so on. It's using all this past information to complete this task. It gives me the list
of aggregator sites here as well and non-aggregator sites here. We have all of this list of 150
sites that I gave it. Sorry for the scrolling. Here's the end data that it gives me. Total
number of sites, 150 that are provided. List of aggregator sites, 16. Non-aggregator sites,
134. Now, it's very important to not just take this chat output at its word. You need
to validate. As I was looking through some of this information, I was thinking, some
of these sites do create some of that aggregator type of content. I'm just not 100% sure that
this is as accurate as it could be. In another case, you would want to provide some examples.
Providing those examples and reference documents is important whenever you're using AI models
to ensure that it is being as accurate as possible. In this case, you could search Google
and look for a list of aggregator websites. The AI overview is going to give you some
lists. There's this tab as well. Then a list of 50 news aggregator websites. This is from
2022. This is also a news site, so I'm not sure if it's going to be completely relevant
to me. Then scrolling down, so you're starting to look at other sites. This could lead to
a lot of research and a lot of time looking into this, but there's got to be a better
way. Within this other task, same task, but I'm starting it out a different way. Now,
I'm asking ChatGPT to create a list of 100 aggregator websites. I'm also giving it a
definition of what aggregators are. An aggregator or affiliate website is categorized as a site
that includes listicle content, like top 10 payroll companies or best internet providers,
and place this list of aggregator websites in a table along with their respective web
domain. I'm asking ChatGPT to start our conversation with this training data,
so one, it already has an idea of what the task could be. We're also creating that context
window of aggregator and affiliate websites, so it's already starting to think about some
of this information or generate some of this information. It creates this list of 100 sites.
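The seeding prompt described above could be assembled programmatically if you want to reuse it for other site types. This is only a minimal sketch: the function name `build_seed_prompt` and its parameters are hypothetical, and the wording is a paraphrase of the prompt read aloud in the video, not the exact text from the shared conversation.

```python
def build_seed_prompt(site_type: str, definition: str,
                      examples: list[str], count: int = 100) -> str:
    """Assemble a prompt asking the model to generate example sites first.

    All names here are illustrative assumptions, not part of any library.
    """
    # Quote each example listicle title and join them with "or".
    example_text = " or ".join(f'"{e}"' for e in examples)
    return (
        f"Create a list of {count} {site_type} websites. "
        f"{definition} Examples of this content include {example_text}. "
        f"Place the list in a table along with each site's web domain."
    )


prompt = build_seed_prompt(
    site_type="aggregator",
    definition=("An aggregator or affiliate website is a site "
                "that includes listicle content."),
    examples=["top 10 payroll companies", "best internet providers"],
)
print(prompt)
```

Keeping the definition and examples as parameters makes it easier to try the same approach on other categorization tasks, such as keyword themes.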
You can take a look at some of this and check and make sure that some of these are more
aggregator sites. There are some that are kind of in the gray area with an aggregator
website, and you could double check some of this and tweak some of these as needed, but
in terms of this case, we're just going to double check that it's accurate and then ask
what type of content the aggregators create. Be specific if we need to identify even more
sites. ChatGPT double checks this information and then it creates this output of what types
of content the aggregators create. Aggregators can create comparison lists, best of lists,
guides, buying advice, deal aggregation, and so on and so forth. It gives us all this
really good information. It gives us some refined lists as well, so like where it got
things right, and maybe where things are more in that gray area as well, and now we're
going to go back to the start of the other task. Again, I'm very interested in how many
aggregator sites are in my search landscape. I have a list of 150 competitors in the search
landscape that I'm going to send, and please help me organize those into two buckets, the
aggregators and non-aggregators. I provide the same list here, and it tells us what its process
is going to be here, and then here's the categorization. We have aggregators, and now
there's a list of 32, and non-aggregators. Scrolling to the bottom here, and it gives
us the percentage here again. In the summary of the 150 sites that I shared, we have 32
sites that are listed as an aggregator, and then within the other task without the training
data, there were only 16 sites, so a pretty big difference in terms of your search landscape:
double the number of aggregator sites. Again, just because we provided this data
does not mean that it's better. Validation is still going to be very important and critical
whenever you're looking this over before you're shipping this to a team member or a client. You
want to make sure that it's right. After looking through some of this, like non-aggregators,
we have Amazon, REI, Reddit, North Face, eBay, so on, and then you have the list of aggregators
here as well. Again, there's 16 more aggregators in this list than there were in the original list.
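For reference, the two runs can be compared as percentages of the 150 sites. The counts (16 and 32) come from the video; the helper name `aggregator_share` is a hypothetical one chosen for this sketch.

```python
def aggregator_share(aggregators: int, total: int) -> float:
    """Return the percentage of sites classified as aggregators."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100 * aggregators / total


TOTAL_SITES = 150  # size of the competitor list in the video

baseline = aggregator_share(16, TOTAL_SITES)      # run without context data
with_context = aggregator_share(32, TOTAL_SITES)  # run with context data

print(f"Without context data: {baseline:.1f}%")
print(f"With context data:    {with_context:.1f}%")
print(f"Additional aggregators found: {32 - 16}")
```

The share roughly goes from about 11% to about 21% of the landscape, which is the doubling mentioned above. Neither number says which run is more accurate, so the manual validation step still applies.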
After I was looking over this a little bit more, one that stood out to me was Backpacker.com.
I went to this just to check whether ChatGPT categorized this as an aggregator,
to double check and validate some of this information. I went to Backpacker.com,
and I found Winter Gear Guide, and there's a bunch of other content with guides, listicles,
best of, and so on. To me, that is an aggregator, which was not identified in the initial task as
well. It's still going to be critical to validate this information to make sure that ChatGPT is
providing accurate information. However, providing this data and just asking ChatGPT for the data and
the context beforehand did lead to a more organized and efficient list that I can trust a little bit
more than I could before I had that training data. I would definitely recommend trying this
with other types of categorization and themes, if you want to analyze themes of keywords that
you're ranking for and categories of your organic ranking terms, as well as different qualitative
research that ChatGPT can help out with. Having that training data can be essential in making
sure that your process is efficient and you're providing as accurate an output as possible.
Hope this is helpful. If you have any other ideas, I'd be definitely down to hear them. Thanks for watching.
🔍 About This Video: Having trouble finding training data to use as your AI knowledge base? In this video, Nick walks through his process of using ChatGPT to create context data leading to more efficient and accurate AI outputs.
📊 Tools & Resources: ChatGPT Conversation: https://chatgpt.com/share/e/7a813b06-ce32-4a47-b6d4-37e05e2abe72
Timestamps:
00:00 - Introduction
00:40 - Aggregator Categorization Overview
01:45 - Initial Results
03:30 - Using ChatGPT to Create Examples & Context Data
05:55 - Updated Results
06:30 - Validating Results
06:30 - Final Thoughts & Recommendations
💡 Have Suggestions or Questions? Your feedback is invaluable. Share your ideas for updates or any questions in the comments section below!
👍 Like, Comment, and Subscribe! Enjoyed the video? Make sure to like, comment, and subscribe for more helpful guides and insights on SEO and digital marketing strategies.
🔗 Follow Us: Our Website: https://seerinteractive.com/ Twitter: https://twitter.com/SeerIntera