Every single day, new AI search rank trackers are popping up, and many of them are making some pretty wild claims. I want to put a few misconceptions to rest, because this matters when you're trying to decide which rank tracker to use. One thing I will say: you should absolutely be tracking your brand's performance inside ChatGPT and the other large language models, and also inside Google's new AI products like AI Mode and AI Overviews. The problem is just that some of the claims being made are simply not true. So let's dive right in. First, there is a ton of money being plowed into this industry right now. We can roughly segment it into two parts: AI tracking, meaning tracking your brand's mentions inside the generated responses on AI platforms, and tracking citations, meaning the actual links that appear when the AI retrieves information and cites a source. The companies I'm showing here aren't being singled out; I pulled these screenshots because I know they received funding. And to be clear, this is not an indictment of every tool out there, just a couple of key patterns I've seen, and I won't be naming names. I just want to make sure that if you're trying to make an educated decision about investing in one of these tools, you know the details. As of right now, the average cost of an AI tracking tool is about $337 per month; we studied over 30 of these tools to get that number. That's not a small investment, so you should be aware of what you're buying. So here's the first claim I've seen pop up on a few different websites, and it's one of the worst: "explore keyword and topic volume across AI platforms."
To be very clear: there is no possible way that any company or any person has access to real search demand data for ChatGPT or any other large language model. These platforms do not share that information with anyone. This isn't like Google, where you can go into Google Search Console and freely see the searches being conducted; nothing like that exists for these platforms. Any tool claiming to have real demand data is not telling the truth. It is literally impossible to have that data; it does not exist outside the platforms themselves. In fact, OpenAI has intentionally withheld it out of fear that it would be abused. Maybe that will change in the future, and I hope it does; I'd love for these LLMs to give marketers more data so we can actually use these platforms better. But right now, no one has access to it. So any tool showing you "AI search volume" is really just pulling traditional search volume, typically via the Google Keyword Planner API. In essence they're mirroring demand, but it is not true demand for what people are actually typing into these platforms, and that's a very important distinction. So be careful if you see any tool claiming to have this kind of demand data. Same thing with claims like "discover what millions of people ask on AI." Once again, no one knows this information. Trust me, if it existed, I would be using it. It does not exist, and no one has access to it, so be careful with that claim. And then the third red flag is simply too much conviction about the accuracy of the data. A lot of these AI trackers claim to have the keys to the kingdom.
But the truth is, as I'm about to show you, getting reliable, accurate tracking data from these platforms is really, really hard. This is not like traditional rank tracking, which has existed for the decade-plus I've been doing SEO: you put a keyword into a rank tracker and watch how a page performs over time. That's much easier because the variance in traditional search results is not that significant; rankings move a couple of spots here and there, and outside of a Google update that shakes everything up, they don't move much. So you could actually track week over week, month over month, quarter over quarter, and see where things were falling. This is radically different, as you're about to see. So why is gathering reliable data so hard? Number one: extreme variance. No two queries are the same. Google consolidates similar queries together to give you broader search volume, but this is next level, because almost everyone using ChatGPT or any other LLM is typing in their own unique query. Most people aren't searching "best blue shoes" on these platforms. Queries are unique, they contain grammatical errors, they're full sentences, sometimes entire dissertations. Some people use voice input, some don't. The variance in queries is enormous. There are ways to handle this, but it's one variable that makes reliable data very difficult to get. The next one is personalization. We're looking at my account here, and when I search in my account rather than in a private window, ChatGPT knows who I am, because I have personalization turned on.
And most people do, because personalization is on by default, which means the large majority of ChatGPT users are seeing a personalized experience. The problem is that ChatGPT in particular is really nice. It's always flattering about your ideas, your business, and your products. As I'll show you in a second, if you search from your own account without a private window, it will often say your product ranks for all kinds of things. Ask it "what are the best SEO content optimization tools?" and it will say, sure, Rankability is right in there, because it thinks that's what I want to hear. It is designed to be positive, and it's not very objective in that regard. The point is that personalization already muddies the waters massively. What you see when you run a query through an API or an incognito window is radically different from what a user actually experiences inside the platform. Until we get real data about this, we're all just guessing. That's why I put personalization so early in this list: it's truly the thing that makes tracking so incredibly difficult, because every person's experience differs. And if you listen to Sam Altman, the co-founder of OpenAI, he wants ChatGPT to be a personal assistant that is highly personalized to you specifically. That is going to keep causing challenges for tracking. Next: model variants. In ChatGPT alone, look at how many different models you can pick from: seven. I don't think this will be the case forever; I expect some consolidation of models, maybe one universal model that handles reasoning and every other use case.
But right now the models are very different, and funnily enough, when you run the same exact query across each individual model, you get different results. It's pretty crazy; let me show you an example. I ran "what are the best SEO content optimization tools right now?" I intentionally included "right now" so it would try to search and retrieve fresh information. Funnily enough, GPT-4o didn't retrieve anything; it gave a straight answer from its training data. Once again it included Rankability, but this is my account, so I can't trust that it's true. It's telling me what I want to hear, right? Still, these are generally the kinds of answers you'll get. Then I ran the same query through o4-mini and got a noticeably different set of results. Somewhat similar, but different, and this one actually did retrieve: it went out and searched to enhance its response for this topic. Same exact seed query, different models, pretty different results. This is also why you want a lot of prompt diversity, and there are other methods we can use, which I'll get to. I even ran this across all the models in a temporary chat, and while we start to see some similarities, this is exactly the kind of thing that makes getting reliable data more challenging, because there is so much variance in the answers. And the responses you get are almost always different: run the same exact query twice and you'll get a different response. Now, are the brands that show up going to be pretty similar? Yeah, typically they'll be in the same ballpark.
But they won't always be in the same exact position or the same exact format; the responses themselves typically differ. Nothing is cookie-cutter, based on what I've seen, and I've been using ChatGPT since it came out in November 2022. In all that time I've very rarely, if ever, seen a response repeated exactly. Then there are the queries the model itself generates. If we go into the o3 model, or even o4, and find a response with significant reasoning, we can inspect the query fan-out: the searches it runs to retrieve information and make its response more reliable, accurate, and timely. Those change all the time. The first query is usually similar to your seed; if you ask something like I did, it will usually search some listicle-style query with a year modifier on it. But when it fans out and starts introducing new queries to enhance the response, those change a lot. There are some general formats and patterns it uses, but they're significantly different on most queries. So again, there's just a lot of variance, and that's what I mean when I say don't put too much conviction in any of this: there are so many possible variations going on. Responses are almost always different, and the fan-out queries used by the reasoning models, while they follow some templates and patterns, tend to differ for the most part. And then citations. You can look at the citations here and see some similarities, a couple of overlapping sources here and there.
But there is significant variance even in the citations used in this process. Sometimes there are a ton of citations, sometimes very few; sometimes Reddit shows up, sometimes it doesn't. Now, keep in mind, and I think this is important to mention: I did not use a different query for any of these. This is the same query run across all of these different models, and we still got different citations. Nothing is static, and that's the most important takeaway here: nothing in this process is static. So be careful with any high-conviction claims about tracking accuracy. Another important thing: most tools, including the one we're building, use synthetic prompts. What is a synthetic prompt? Let's be honest: it's a made-up prompt. You can make them resemble what a logical user prompt might look like, but there is still no quantifiable data to prove any of it. Every AI search tracking tool is guessing right now, every single one, because no one has quantifiable data. Yes, you can look to Google and pull data from there, but take this example phrase: "I need an SEO tool to check my website's ranking for specific keywords. What do you recommend?" No one knows how many times people search that. Literally no one knows how often that exact query is run on ChatGPT, Perplexity, or Gemini. It's still important to track, but I want to be very transparent that these are synthetic prompts. This example here, by the way, is from Claude.
You can go into Claude and have it generate synthetic prompts for you, then run those queries manually to see where your brand is popping up. That is possible. And finally, this is a very important one: a lot of these tools claim they can extract citations from AI-generated responses. That is possible via the API in ChatGPT, but on some of the other platforms you actually can't do it. So many of these tools claim to show you the citations being used, when in reality they are mirroring what an LLM would do. Without getting too technical: when an LLM needs a more current answer than its training data allows (GPT-4o's training data is a couple of years old at this point), it has to go out to the web, and in most cases it retrieves from public search engines. There's some debate about exactly what ChatGPT uses, but the partnership with Microsoft means we know for sure it can use Bing, and there's some evidence it may also be using Google. Claude tends to lean on Brave Search, and Perplexity leans on Bing quite a bit. So the way these tools work is that they model what an LLM would do: they extract the top-ranking results, anywhere between positions 1 and 20, bring those back in to enhance the prompt, and then tell you "here are the citations that were used." But those aren't necessarily the citations used in a real user's session, because no one has access to that information.
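To make the "mirroring" idea concrete, here is a minimal sketch of what such a tool might do: pull the top-ranking results for a query from a public search engine and report their domains as the likely citation pool. Note that `fetch_top_results` is a hypothetical stand-in for whatever search API a tool actually uses (Bing, Brave, etc.); it is stubbed with fixed data here, not a real library call.

```python
from urllib.parse import urlparse


def fetch_top_results(query: str, limit: int = 20) -> list[str]:
    """Hypothetical search call, stubbed with fixed URLs for illustration.
    A real tool would hit a search engine API here."""
    return [
        "https://example.com/best-seo-tools",
        "https://www.reddit.com/r/SEO/comments/abc123",
        "https://example.com/seo-tool-reviews",
    ][:limit]


def mirror_citations(query: str, limit: int = 20) -> list[str]:
    """Deduplicate the top-ranking domains and treat them as proxy citations.
    This approximates what an LLM *might* cite; it is not the real citation
    list from any actual user session."""
    seen, domains = set(), []
    for url in fetch_top_results(query, limit):
        domain = urlparse(url).netloc
        if domain not in seen:
            seen.add(domain)
            domains.append(domain)
    return domains
```

The key point of the sketch is that everything downstream of `fetch_top_results` is a model of retrieval, not a measurement of it.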
No one can see real user conversations, other than that recent leak where shared ChatGPT conversations became visible in Google. Beyond that, the data just doesn't exist at this point. Now, there are sketchier things tools can do, like using proxies and creating thousands of ChatGPT accounts and running queries through them. They can do that, but it's against the terms of service and they're probably going to get banned, so I wouldn't use a tool doing anything like that. So after all of that, after I've thrown the kitchen sink at you, what should you actually do? Number one, be very careful about personalization. But more importantly, remember what the goals of tracking even are. What are we tracking in the first place, and how do we decide what's worth tracking? What you're seeing here is the most important thing to track, and it's also the thing that doesn't tie neatly back to the website and a direct conversion, because in responses like this there are no links. But here's the thing: if we're not showing up in this list, that's a problem. We need our brand, and our clients' brands, to appear in the generated responses for these commercial queries. That is critical, so it should be tracked all the time, and if we're not showing up, we need to take steps to fix that. Now, how do we quantify this? Well, how do you quantify not being there? That's the question worth talking about.
A lot of people ask, "if we're showing up here, how do we track the conversions?" Well, first of all, you mostly can't, because anyone claiming that real attribution is possible at this point is selling you nonsense. True attribution is almost impossible unless you have a really segmented, siloed-out funnel, because the way people bounce between tools and platforms these days is erratic. Someone watches a YouTube video, then goes to ChatGPT, then Reddit, then X, then LinkedIn, and six months later they become a lead. How are you going to track that? You're not. So the point is to focus on whether we have prominence in these responses; that's what matters. And then of course you think about how to improve: how do we become the number one recommended option in these responses? First comes overall visibility: are we even appearing at all? If not, you have a lot of work to do. Once you are appearing, it becomes a different game of figuring out how to get above your competitors, and you start working the variables that can push the AI's recommendation higher. That's what needs to be tracked; it's very, very important. One more thing: I've seen some AI search tracking tools advertise built-in analytics, when you can just go into Google Analytics 4 and see that data for free. I don't know what the reason would be for using analytics inside another tool when GA4 is right there, available to anyone.
So use Google Analytics 4; it's right there and it's free. You can even use Google Looker Studio to build a dashboard on top of it if you want. Now let's talk about a quick formula. This is really important if you want to track this the right way and get within some level of certainty of what would happen in a real-life scenario. Nothing is perfect, but this will get you closer. Number one: prompt diversity. You want many different variants in play, and you have to be careful with variants that would force the AI into an answer that isn't useful. For example, if you prompt "Nike baseball cleats review," it's just going to talk about Nike; if you're a competitor of Nike, that's not a useful query. Keep the queries broad: commercial intent, but with no brands inside them. We've tested this rigorously, and including brands just produces weird data. So stick to the broader level. Let's call it "not brand aware, but with commercial intent." That's what I focus on. So "best SEO tools" would be the seed, and you build out the synthetic prompts from that seed. I recommend having at least 25 variants, probably even more, because you want to blanket every possibility and get a broad picture of your overall visibility across all of these prompts. And that's on one platform.
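The seed-to-variants step can be sketched very simply. The templates below are illustrative assumptions, not a fixed methodology; in practice you would want far more phrasings (at least the 25 suggested here), for example by asking Claude or ChatGPT to generate them for you.

```python
# Expand one commercial-intent, non-branded seed keyword into synthetic
# prompt variants. Templates without a {seed} slot are fine too; they just
# need to stay on-topic and brand-free.
TEMPLATES = [
    "What are the {seed} right now?",
    "I need an SEO tool to check my rankings. What do you recommend?",
    "Can you compare the {seed} for a small agency?",
    "{seed} for tracking keyword positions",
    "Which of the {seed} is actually worth paying for?",
]


def build_variants(seed: str) -> list[str]:
    """Fill each template with the seed to produce natural-language prompts."""
    return [t.format(seed=seed) for t in TEMPLATES]
```

Each variant then becomes one synthetic prompt to run against a platform like ChatGPT.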
So if you were just doing ChatGPT, you would run 25 unique queries there, all related to the seed keyword you've turned into natural-language synthetic prompts, and then see how your brand performs across all 25 to get a broad picture. Don't think "I'm just going to track this one prompt and see where we land"; that's pointless. Look at it broadly, across the board: "based on these 50 unique queries we ran, we have 25% visibility." That's a much better tracking mechanism than individually tracking prompts that may barely be used at all. The goal of tracking here is a broad picture of visibility. And then, and this one is really important, you should probably run the same prompt multiple times. You don't need to go crazy and run it 25 times, but run it at least once more: take the same exact query and run it again. This helps smooth out some of the variance. Three times is probably more than enough. So in practice you'd run somewhere between 75 and 100 individual chats with these unique synthetic natural-language prompts, which will give you a pretty good idea, from one base seed keyword, of how your brand is performing. It would be pretty hard not to get a good picture from that. That's the way we're doing it. Everyone has their own philosophy on this, but we're focused on the brand element: where is the brand showing up in the AI recommendations and the generated responses? I think that matters most.
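The visibility metric described above can be sketched as follows: run N prompt variants, repeat each a few times to smooth out variance, then report the share of runs in which the brand was mentioned at all. The data structure and brand names are made up for illustration.

```python
def visibility(runs: dict[str, list[list[str]]], brand: str) -> float:
    """runs maps each prompt variant to its repeated runs; each run is the
    list of brands extracted from that response. Returns the fraction of
    runs that mention the brand."""
    total = hits = 0
    for repeats in runs.values():
        for brands in repeats:
            total += 1
            if brand.lower() in (b.lower() for b in brands):
                hits += 1
    return hits / total if total else 0.0


# Example: 2 variants x 2 repeats = 4 runs; the brand appears in 1 of them.
sample = {
    "best SEO tools": [["Ahrefs", "Semrush"], ["Ahrefs", "Rankability"]],
    "SEO tool for rank tracking": [["Semrush"], ["Ahrefs"]],
}
print(visibility(sample, "Rankability"))  # 0.25, i.e. 25% visibility
```

Reporting one aggregate percentage like this, rather than a per-prompt "rank," is what makes the number meaningful despite the variance discussed earlier.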
And if you're trying to figure out what kinds of queries people might actually use, there is possibly some data starting to appear that we can work with. Go into Google Search Console and filter pages by /overview or by /search. I cannot prove what I'm about to say, but I do think it's telling. For the /overview filter, I'm making an assumption that these are AI Overview queries. I'm not sure, I can't prove it, but it seems likely given how they're structured: they're long and they mimic what someone would type into a large language model. So I'd at least look at these and investigate. Then on the right-hand side we have /search. In my opinion, and again I cannot prove this, it looks like this could be AI Mode, because it appears to be a segmented-out product so Google can separate it from traditional search. Thinking about what a SaaS company would do, that seems likely, and it also fits the nature of the queries: they're much more natural-language, much longer. So start digging into this data in Google Search Console, and if you find queries that look natural rather than synthetic, add them to your database of prompts so you can learn more about how your brand is showing up. But the key is to do this by cluster: pick one seed for one of your core products and build out, say, 25 to 100 variants around it.
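Once you've exported those Search Console queries, a rough filter can separate the long, natural-language ones worth adding to your prompt database from ordinary keyword searches. The five-word threshold and the question-word list are arbitrary assumptions for this sketch, not an official signal from Google.

```python
def looks_like_llm_query(query: str, min_words: int = 5) -> bool:
    """Heuristic: treat long queries, or queries phrased as questions,
    as candidate natural-language (LLM-style) prompts."""
    q = query.lower()
    question_starts = ("what", "how", "which", "can", "should", "why")
    return len(q.split()) >= min_words or q.startswith(question_starts)


# Example queries as they might appear in a Search Console export.
queries = [
    "seo tools",
    "what is the best seo content optimization tool for agencies",
    "how do i rank in ai overviews",
]
natural = [q for q in queries if looks_like_llm_query(q)]
```

Anything that passes the filter is a candidate to fold into the cluster of variants for the relevant seed.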
Then track those around that one core product, and repeat the same process for the next product. Yes, you're going to be tracking a lot, but you need a lot of surface area to really understand how you're performing. And as I mentioned before, run the same prompt multiple times; it's very, very important. Now, we are building a Rankability AI analyzer, so if you want early access: the cool thing is that it's not going to be an add-on to Rankability, it will be included in your core membership. If you sign up for Rankability, you get what is, in my opinion, the best content optimization tool that exists, and we're adding this in as well. Just go to rankability.com for early access. We're focusing purely on ChatGPT initially, because it has about 60% of the large language model market. So I hope that helps. I know I threw the kitchen sink at you, but it's important to understand how these tools work and the nature of AI search tracking. It is not a perfect science by any means, but I hope this helps, and thank you so much for watching.
Critical assessment of AI tracking tools

Amit Tiwari