How LLMs Choose Sources: The DerivateX Citation Signal Framework

Q: Does Perplexity use different citation criteria than ChatGPT?

Yes. Perplexity cites more sources per response, is more tolerant of lower domain authority when content structure is strong, and shows a faster response to structural content improvements. ChatGPT with browsing applies a stronger brand recognition filter. Google AI Overviews is the most restrictive, strongly preferring pages that already rank on the first page of standard Google results.

Q: How do LLMs handle pages behind paywalls or login walls?

They do not cite them. Pages requiring authentication or payment are not retrieved by LLM browsing systems. Gating high-quality structured content removes it from AI citation eligibility entirely, even if it would otherwise score well on every other citation factor.

Q: What is citation co-occurrence, and why does it matter?

Citation co-occurrence is the pattern where LLMs cite pages in thematic groups rather than in isolation. Your citation neighbors are partly determined by your topical positioning through internal links, backlink profile, and semantic content clustering. Building topically coherent content clusters is more effective than publishing isolated standalone pages.

Q: How should teams prioritize improvements to increase AI citation rates?

Start with index eligibility—confirm all priority pages are in Bing's index via Bing Webmaster Tools. Then audit for retrieval density: does each page contain self-contained answer blocks resolving a specific question in 40 to 80 words? Next, increase entity specificity by naming specific tools, companies, datasets, and standards. Finally, build internal topical clusters around core subjects. These four steps address the Citation Signal Framework levers in order of impact.
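As a rough illustration of the retrieval-density audit, the sketch below flags which paragraphs of a page fall inside the 40 to 80 word range. The function name `answer_block_report` and the rule that paragraphs are separated by blank lines are assumptions made for this example. Word count is only a proxy, since it cannot tell whether a block actually resolves a question.

```python
import re

def answer_block_report(page_text, min_words=40, max_words=80):
    """Return (word_count, in_range, preview) for each paragraph.

    Assumption: paragraphs are separated by one or more blank lines.
    """
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", page_text) if p.strip()]
    report = []
    for paragraph in paragraphs:
        count = len(paragraph.split())
        in_range = min_words <= count <= max_words
        report.append((count, in_range, paragraph[:40]))
    return report


if __name__ == "__main__":
    sample = " ".join(["word"] * 50) + "\n\n" + "too short to stand alone"
    for count, ok, preview in answer_block_report(sample):
        print(count, "candidate block" if ok else "outside range", "|", preview)
```

Paragraphs flagged as outside the range are candidates for manual review, not automatic rewrites.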

For accounting and tax firms, UltraTax CS is more than just software.

It’s the system that keeps deadlines, filings, and client trust on track. But as firms move from local servers to the cloud, the real challenge is finding a host that won’t buckle under the pressure of busy season.

Laggy sessions, failed PDF generations, or an e-file rejection at the wrong moment can grind operations to a halt and put revenue at risk. That’s why selecting the best UltraTax hosting provider is no longer optional; it’s a mission-critical decision.

The best UltraTax hosting delivers consistent low-latency sessions, reliable printing/PDF and e-file, hardened security, and 24/7 human support, all backed by clear SLAs and a zero-drama migration plan.

Each of these elements matters for one reason: tax season doesn’t wait. Slow or unstable systems translate to missed deadlines. Unreliable printing or e-file functions can delay returns. Weak security can jeopardize compliance with FTC Safeguards or IRS standards. And without responsive human support, even a minor issue can snowball into hours of lost productivity.

Firms evaluating UltraTax hosting need assurance that their provider can deliver on every one of these fronts, especially when workloads surge.

1. Why Most Guesses About AI Citations Are Wrong

2. What Does the DerivateX Citation Study Actually Measure?

3. How ChatGPT Chooses Sources When Browsing the Web

4. What Separates Cited Pages From Non-Cited Pages on the Same Domain?

5. How Does AI Citation Behavior Differ Across Platforms?

6. What Does the Found On AI Citation Signal Framework Identify as the Core Levers?

7. Does Content Freshness Actually Affect AI Citation Rates?

8. What Gets a Page Excluded From AI Citations?

9. Frequently Asked Questions About How LLMs Choose Sources

10. The Deeper Pattern Behind AI Citation Behavior

What Firms Should Demand in UltraTax Hosting

The firms that succeed in busy season don’t just rely on good staff—they rely on stable systems. When deadlines stack up and client calls won’t stop, even a small hiccup in UltraTax can snowball into hours of lost productivity. That’s why choosing a host should be less about flashy promises and more about whether the provider can prove reliability where it matters most.

Here are the criteria firms should insist on before moving UltraTax to the cloud:

1. Performance

Tax workloads aren’t static. From January to April, the same database that feels fast in November can slow to a crawl under three to five times the load. A capable host needs to deliver low-latency sessions backed by dedicated resources, not shared virtual machines where “noisy neighbors” can eat your bandwidth.

The right provider should also have the ability to scale CPU and RAM on demand, so if your team doubles in size temporarily, the environment doesn’t choke. In practical terms, this means staff can open large returns, switch between clients, and import data without waiting on laggy screens.

2. Reliability

In tax workflows, reliability isn’t measured in uptime percentages; it’s measured in the small but critical steps that keep client deliverables moving.

Can staff consistently generate PDFs without formatting failures? Will e-file transmissions clear the first time, every time? Is printing from a remote office as seamless as it would be locally?

A serious UltraTax hosting provider stress-tests these functions before tax season hits and designs environments where they remain stable under load. Don’t settle for “server uptime” guarantees alone. Ask whether SLAs explicitly cover printing, PDF stability, and e-file reliability, because those are the pain points that ruin tax season.

3. Support

When things break, it’s rarely at 10 a.m. on a Tuesday. Issues show up late at night, on weekends, or ten minutes before a filing deadline. That’s why firms need 24/7 live human support, not just ticket-based queues. And it’s not enough to reach a general IT technician. Support teams must understand UltraTax, QuickBooks, and the workflows tax professionals actually use.

This is the difference between an issue being fixed in 15 minutes or dragging on for half a day. Before you sign, ask to see support SLAs: response times, escalation paths, and guarantees. A vendor’s help desk is only valuable if it performs when you need it most.

4. Security

Every return, every client record—UltraTax data is a bullseye for cybercriminals. Hosting providers must treat it that way. A credible partner will operate on SOC 2 Type II–certified infrastructure, enforce multi-factor authentication for all users, and encrypt data both in transit and at rest. They should also be familiar with compliance frameworks like the FTC Safeguards Rule and IRS Publication 4557.

Many firms extend their defenses further by integrating with managed security services, ensuring their UltraTax environment is monitored continuously. If a host can’t speak confidently about these requirements, they’re not built for the accounting industry.

5. Migration

The biggest fear firms have about switching hosts is disruption. Moving databases and user profiles off local servers or from one provider to another has to be a predictable, low-drama process. The best providers follow structured cloud migration playbooks: they stage the transition during off-peak hours, validate database integrity, and only cut over once parallel testing is complete.

Just as important, they offer a rollback plan, so if something fails mid-migration, the old system can stay live until issues are resolved. Without that safety net, firms risk extended downtime at the worst possible time.

6. Transparency

Hosting is supposed to simplify IT, not add new layers of uncertainty. Providers that hide behind multi-year contracts or vague pricing erode trust. Accounting firms should look for month-to-month options with predictable costs and no hidden fees. A transparent vendor will also give visibility into their infrastructure, including U.S.-based redundant data centers. Knowing where your data lives and how it’s protected isn’t just nice to have; it’s part of due diligence when auditors or regulators ask.

7. Proof

Ultimately, the only way to trust a provider is to see evidence that they’ve delivered for firms like yours. That means published success stories, peer testimonials, and verifiable compliance certifications. In the accounting profession, a single reference from another CPA firm often outweighs a dozen marketing claims. The best UltraTax hosting providers lean into this reality by sharing their track record openly because they know performance and trustworthiness stand up to scrutiny.

Best UltraTax Hosting Providers

Once the evaluation criteria are clear, the next step is identifying which vendors actually meet them. Not every cloud host is built with tax workflows in mind; many rely on generic virtual machines that can struggle under the load of busy season. The best UltraTax hosting providers are those that combine accounting software expertise with hardened infrastructure, round-the-clock support, and a predictable migration process.

Below is a shortlist of trusted options.

1. Verito

Verito consistently ranks as the leading choice for UltraTax hosting, and for good reason. Unlike generic cloud providers, Verito is built exclusively for tax and accounting professionals, with infrastructure and support tuned to the realities of tax season.

Performance and Reliability

Verito operates on dedicated private servers that eliminate the “noisy neighbor” problem common in shared environments. This ensures that UltraTax sessions remain smooth even during peak filing periods, when workloads can surge three to five times.

Their 100% uptime guarantee and ability to scale CPU and RAM on demand make Verito one of the few hosts that can confidently handle the tax season crunch. Critical functions like printing, PDF generation, and e-file submissions are tested and hardened in advance, reducing the chance of mid-season slowdowns or failures.

Support

Support is where Verito sets itself apart. Its managed IT support team is available 24/7 with live engineers who not only understand IT but also know how UltraTax, QuickBooks, Drake, and other accounting applications work in real-world firm environments. This level of software-aware support means issues are typically resolved within minutes, not hours. Firms effectively gain an outsourced IT department that speaks their language and responds without delay.

Security

Verito maintains a SOC 2 Type II–certified infrastructure, with multi-factor authentication, enterprise-grade encryption, and fully isolated customer environments. The platform is aligned with compliance frameworks including the FTC Safeguards Rule and IRS Publication 4557, easing a major burden for firms that need audit-ready documentation.

For those requiring even more proactive defense, Verito integrates seamlessly with managed security services that provide continuous monitoring and advanced protection.

Migration

Switching to Verito is designed to be low-drama. Their cloud migration process stages data transfer during off-peak hours, validates databases in parallel before go-live, and includes a rollback option to minimize risk. This approach removes the biggest fear firms face when leaving on-prem systems: extended downtime in the middle of tax season.

Transparency

Unlike many providers, Verito doesn’t lock firms into multi-year commitments. Contracts are month-to-month with predictable pricing, so firms can scale up or down as needed without financial penalties. They also provide full visibility into their U.S.-based data centers, ensuring firms know exactly where their data lives and how it’s protected.

Proof

Firms don’t have to take Verito’s word for it. Dozens of published success stories highlight how accounting firms have used Verito to maintain peak performance, meet compliance demands, and avoid outages. Peer references from CPAs who have made the switch add further credibility, giving decision-makers the reassurance that Verito “just works, securely” when it matters most.

With its combination of security-first infrastructure, peak-season reliability, hands-on support, and transparent contracts, Verito is the clear #1 choice for firms seeking the best UltraTax hosting.

2. Ace Cloud Hosting

Ace Cloud Hosting is one of the more established players in the accounting and tax software hosting space, with a wide range of supported applications that includes UltraTax, QuickBooks, Drake, and Lacerte. Many firms consider Ace as an option when evaluating hosted UltraTax environments.

Performance and Reliability

Ace provides virtualized environments capable of running UltraTax for small and mid-sized firms. Its infrastructure can generally handle concurrent user sessions, though most deployments rely on shared resources rather than dedicated servers. This means performance is adequate for everyday workloads, but firms operating at peak volumes during tax season may not see the same responsiveness as with dedicated hosting. Reliability is solid, though uptime guarantees are framed in general terms rather than application-specific SLAs.

Support

Ace offers 24/7 customer support with phone, chat, and ticketing options. Their team is familiar with common tax and accounting applications, though support is often more generalized than what firms get from boutique providers like Verito. Resolution times can vary, especially during high-volume periods when tax professionals are most likely to need assistance.

Security

Ace’s hosting platform incorporates data encryption, multi-factor authentication, and standard compliance controls. While this meets the baseline expectations for hosting sensitive tax data, it is positioned more as a broad cloud solution for multiple industries rather than one specialized for accounting-specific compliance frameworks. Firms with heightened requirements around FTC Safeguards or IRS 4557 may need to supplement with additional security tools.

Migration

The company provides migration assistance to help firms transition their UltraTax environment into the cloud. Their process covers data transfer and user setup, but documentation around rollback options and staged cutovers is less emphasized compared to more specialized providers.

Transparency

Ace offers tiered hosting plans and supports both monthly and longer-term commitments. Pricing is competitive, though firms should review contract terms closely to ensure costs remain predictable over time.

Proof

With over a decade in cloud application hosting, Ace has a broad client base across industries. Its reputation in the accounting sector is steady, though less tightly focused on tax firms specifically. Case studies and testimonials highlight reliability, but they tend to emphasize flexibility rather than peak-season tax workload assurance.

3. Apps4Rent

Apps4Rent is often considered by firms looking for a lower-cost way to move UltraTax into the cloud. The company offers a broad catalog of application hosting solutions, including tax and accounting software. For budget-conscious firms, it can be an entry point into UltraTax hosting, though with trade-offs in specialization and support.

Performance and Reliability

Apps4Rent provides UltraTax hosting on shared virtual machines, which can be sufficient for small firms with light workloads. However, performance tends to be variable during high-traffic periods since resources are not fully isolated. Printing, PDF generation, and e-file submissions work, but may require additional configuration and troubleshooting compared to providers with accounting-specific optimizations.

Support

Support is available around the clock via chat, phone, and email. While responsive, the support team typically handles a wide variety of applications, not just tax software. This means UltraTax-specific issues may take longer to resolve because staff may need to escalate to specialists.

Security

The platform includes standard measures such as multi-factor authentication and encryption. Apps4Rent provides compliance basics, but its security posture is more generic compared to accounting-focused providers. Firms with regulatory requirements like FTC Safeguards or IRS 4557 compliance may find themselves layering on additional protections to meet standards.

Migration

Apps4Rent assists with setup and data migration for UltraTax environments. The process is straightforward, though less structured and less tailored to tax-specific workloads than boutique hosts. Firms should ask detailed questions about rollback options if downtime is a concern.

Transparency

Pricing is a major appeal. Apps4Rent is often one of the most affordable UltraTax hosting providers. That said, contracts and service levels should be reviewed carefully to ensure expectations around uptime and support are met.

Proof

With thousands of customers across different industries, Apps4Rent is a known player in the application hosting market. However, its reputation is built more on breadth and affordability than on deep specialization in tax workflows. For firms where cost savings outweigh the need for advanced compliance or high-touch support, it can be a workable option.

4. V2 Cloud

V2 Cloud is a general-purpose desktop and application hosting provider that supports a wide range of business software, including tax and accounting applications like UltraTax. Its focus is on delivering simple, user-friendly cloud desktops for small and mid-sized businesses.

Performance and Reliability

V2 Cloud emphasizes ease of deployment, with cloud desktops that can run UltraTax alongside other applications. Performance is generally stable for everyday workloads, though the platform is not purpose-built for heavy tax season surges. Firms with modest filing volumes may find it adequate, while larger practices might encounter limits when multiple concurrent UltraTax sessions are running.

Support

Support is available 24/7 through phone, chat, and email. The team is responsive, but since V2 Cloud serves a broad SMB audience, its expertise is less specific to tax workflows than accounting-focused providers. This can mean slower resolution for UltraTax-specific issues.

Security

V2 Cloud offers secure environments with encryption, MFA, and isolated user sessions. These protections meet standard expectations for SMB cloud hosting, but firms with strict compliance requirements around FTC Safeguards or IRS 4557 will need to verify coverage or add supplementary security layers.

Migration

The company provides migration assistance to get applications and data into its cloud environment. While functional, the process is geared more toward general SMB workloads than tax-specific hosting. Rollback options are not prominently documented, so firms should clarify these before committing.

Transparency

V2 Cloud provides straightforward pricing and month-to-month contracts, appealing to firms that want flexibility without long-term commitments. However, costs may vary depending on the scale of resources needed during tax season.

Proof

As a general SMB cloud provider, V2 Cloud has earned a solid reputation for simplicity and affordability. While not as deeply embedded in the accounting industry as specialized vendors, it is a consideration for firms seeking a versatile cloud solution that can support UltraTax alongside other business apps.

Quick Comparison Table

When evaluating UltraTax hosting providers, it helps to see the essentials side by side. This table distills how the top options stack up across performance, support, security, and migration approach:

Provider	Performance	Support	Security	Migration
Verito	Dedicated private servers with on-demand scaling; 99.999% uptime	24/7 live engineers specialized in accounting software	SOC 2 Type II, MFA, full encryption, FTC/IRS compliance	Structured cloud migration with rollback safety
Ace Cloud Hosting	Reliable shared infrastructure; handles moderate concurrent workloads	24/7 support team, accounting-aware but less specialized	Standard encryption, MFA, general compliance controls	Assisted migration, limited detail on rollback
Apps4Rent	Shared VMs; performance varies under heavy loads	24/7 general support; limited UltraTax-specific expertise	Basic MFA and encryption; generic security posture	Basic migration setup; rollback not emphasized
V2 Cloud	Stable for SMB workloads; not optimized for tax season surges	24/7 responsive support; broader SMB focus	Encryption, MFA, isolated sessions; generic compliance	Functional migration assistance; limited rollback documentation

The 15-Minute Host Test Checklist

Before signing a contract, firms can quickly validate whether a hosting provider is truly ready for UltraTax. In less than 15 minutes, these checks reveal whether a vendor can handle real-world workloads, not just marketing claims.

Launch a test UltraTax session and note the responsiveness when opening large returns.
Print a sample return to PDF and verify formatting consistency.
Submit a test e-file and confirm it clears without errors.
Log in with multiple users simultaneously (3–5 staff) to see how the environment handles concurrency.
Run a batch import or data-heavy task and measure lag.
Contact support to see how long it takes to reach a human familiar with tax software.
Ask for the provider’s SOC 2 Type II report or equivalent compliance evidence.
Verify MFA login process across desktop and remote devices.
Check data center redundancy: is data hosted in U.S.-based data centers with failover capacity?
Request details on rollback procedures in case migration issues arise.
Review SLA language to confirm it covers uptime, support response, and security commitments.
Print from a remote workstation to validate driver stability.
Generate multiple PDFs concurrently and check for reliability.
Ask about backup frequency and retention policies for UltraTax data.
Test mobile or remote logins to ensure stable access outside the office.

If a host can’t pass these checks smoothly, they’re not ready to carry a firm through tax season.

A Safe Migration Plan (With Rollback)

For most firms, the biggest hesitation in moving UltraTax to the cloud isn’t cost—it’s fear of disruption. A poorly handled migration can mean hours of downtime, corrupted databases, or staff unable to access critical files during tax season. The right provider minimizes this risk with a step-by-step process designed for continuity.

1. Pre-migration assessment

A provider should begin with a full audit of the firm’s current setup: local servers, data size, user profiles, and integrated applications like QuickBooks or document management systems. This ensures nothing critical is overlooked.

2. Staged migration during off-peak hours

Data transfer and environment setup should happen outside of business hours, so production systems remain available during the day. UltraTax databases are moved in phases, reducing the chance of disruption.

3. Parallel testing

Before the final cutover, firms should be able to log into the hosted environment in “test mode” while their local server remains active. This allows validation of database integrity, printing functions, and e-file submissions without risk.

4. Rollback option

The hallmark of a safe migration plan is the ability to revert. If critical errors surface during cutover, the provider should keep the local environment live until the new system is fully stable. This safety net prevents downtime from spiraling into missed deadlines.

5. Post-migration validation

Once cutover is complete, the provider should help run live tests: generating returns, printing PDFs, and sending e-files. Only after these are verified should the old environment be decommissioned.

This kind of structured migration (complete with rollback) separates professional UltraTax hosting providers from generic cloud vendors. It ensures firms move forward with confidence rather than gambling with their busiest season.
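One way to approach the "validate database integrity" step during parallel testing is to compare file checksums between the old and new environments. The sketch below assumes the data can be compared as plain files in two directory trees. That is an assumption for illustration, not a statement about how UltraTax stores its data or how any provider runs its checks. The names `file_digest` and `compare_trees` are hypothetical.

```python
import hashlib
from pathlib import Path

def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of a file, read in chunks so large files are not loaded at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def compare_trees(source_dir, target_dir):
    """Return (relative_path, problem) pairs for files that are missing
    from target_dir or whose contents differ from source_dir."""
    source_dir, target_dir = Path(source_dir), Path(target_dir)
    problems = []
    for src in sorted(source_dir.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source_dir)
        dst = target_dir / rel
        if not dst.is_file():
            problems.append((rel.as_posix(), "missing"))
        elif file_digest(src) != file_digest(dst):
            problems.append((rel.as_posix(), "mismatch"))
    return problems
```

An empty result means every source file has a byte-identical copy in the target. It does not prove the application can open the data, so the live tests in step 5 still apply.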

Conclusion

For tax and accounting firms, UltraTax CS isn’t just another piece of software—it’s the backbone of client service and compliance during the busiest months of the year. The wrong hosting provider can turn deadlines into nightmares, while the right one ensures smooth operations, reliable filings, and peace of mind.

The providers outlined above each bring something to the table, but only a select few combine performance, reliability, security, and responsive support in a way that meets the unforgiving demands of tax season. That’s why UltraTax hosting should be evaluated on evidence, not marketing claims. Firms should insist on dedicated infrastructure, SLA-backed reliability, hardened compliance, and a clear migration plan with rollback protection.

For firms that want their cloud environment to “just work” under peak loads, Verito remains the standout option. Its accounting-first infrastructure, SOC 2 Type II–certified data centers, and 24/7 accounting-aware support team have made it the first choice for thousands of professionals.

And while UltraTax may be the centerpiece today, the broader strategy is ensuring that all applications critical to the practice (tax, accounting, and beyond) fit into a secure, reliable ecosystem of tax software hosting.

LLMs do not cite sources randomly. DerivateX’s citation study found that three structural factors (indexability, topical authority clustering, and named entity density) predict citation likelihood more reliably than domain authority alone.
Pages excluded from Bing’s index are excluded from ChatGPT’s browsing citations entirely, regardless of content quality. Getting into the index is table stakes, not a differentiator.
The DerivateX study identified a pattern they call citation co-occurrence: cited pages are rarely cited alone, because LLMs pull sources in thematic clusters.
Content structure, specifically the presence of extractable, self-contained answer blocks, is a stronger predictor of AI citation than raw domain authority for informational queries.
Recency matters, but not uniformly. For fast-moving topics, pages older than six months are systematically underrepresented in LLM citations. For stable reference content, freshness is nearly irrelevant.

LLMs select sources by combining index presence, topical authority, content structure, named entity density, and recency signals. No single factor determines citation. Pages that appear in Bing’s index, contain explicit and extractable answers, and are surrounded by topically related cited pages have the highest citation rates across ChatGPT, Perplexity, and Google AI Overviews, according to DerivateX’s citation pattern study.

Why Most Guesses About AI Citations Are Wrong

Most practitioners assume AI citation works like traditional search ranking: high domain authority wins. That assumption is wrong, and it’s costing teams real visibility. DerivateX’s study of citation patterns across major LLM platforms found that domain authority is a weak predictor when content structure and entity density are controlled for.

The more accurate mental model is that LLMs are doing something closer to knowledge graph retrieval than search ranking. They are not looking for the most authoritative page on a topic. They are looking for the page that most efficiently completes a structured answer their model is already forming. That distinction changes everything about how you build pages that get cited.

A second common mistake is treating ChatGPT with browsing enabled and ChatGPT in base mode as the same system. They are not. Base model responses draw from statistical patterns in training data and never cite a URL. Browsing responses pull from live Bing results and cite specific pages. Conflating the two makes any citation strategy incoherent before it starts.

What Does the DerivateX Citation Study Actually Measure?

DerivateX analyzed citation behavior across ChatGPT (with browsing), Perplexity, and Google AI Overviews, examining which pages were cited in response to thousands of informational and commercial queries. The study tracked URL-level citation data rather than domain-level, which means it could distinguish between high-performing and low-performing pages on the same domain. That granularity is what makes the findings useful in practice.

The methodology involved submitting structured queries to each platform, recording every cited URL, and then analyzing the cited pages for a set of structural and content-level features. Features tested included Bing index presence, content structure (presence of explicit Q&A blocks, headers, tables, and numbered steps), named entity density (how many specific tools, companies, people, and standards a page mentions by name), topical co-citation rate, and page freshness.

Three findings from the study warrant particular attention because they contradict the prevailing advice on AI visibility.

Finding 1: Index Presence Is Binary, Not a Spectrum

Pages not indexed by Bing do not appear in ChatGPT browsing citations. Full stop. This sounds obvious, but a meaningful share of pages that rank well in Google are not in Bing’s index, particularly newer content on younger domains. Before any other optimization, a page needs Bing Webmaster Tools verification and a clean sitemap submission. No amount of content quality compensates for index absence.
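For the sitemap half of that step, a minimal sketch of generating a sitemap file in the standard sitemaps.org format might look like the following. The URLs and dates are placeholders, and `build_sitemap` is a hypothetical helper. Verification and submission happen inside Bing Webmaster Tools and are not shown here.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(entries):
    """Build sitemap XML from (url, lastmod) pairs.

    lastmod is expected as a 'YYYY-MM-DD' string.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = lastmod
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    # Placeholder URLs for illustration only.
    print(build_sitemap([
        ("https://example.com/guide", "2024-01-15"),
        ("https://example.com/faq", "2024-02-01"),
    ]))
```

Listing only canonical, indexable URLs keeps the sitemap clean.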

Finding 2: Named Entity Density Outperforms Generic Authority

Cited pages named significantly more specific entities (tools, companies, standards, people, datasets) than non-cited pages covering the same topic. This aligns with how LLMs build responses: they are pattern-matching against named concepts their training data associates with a query. A page that mentions ChatGPT, Perplexity, Google Gemini, and specific retrieval mechanisms by name gives the model more attachment points than a page that discusses “AI chatbots” in generic terms.
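The study's exact definition of named entity density is not given here, so the sketch below uses one plausible stand-in: entity mentions per 100 words, counted against an entity list you supply. Both the metric and the function name `entity_density` are assumptions for illustration.

```python
import re

def entity_density(text, entities):
    """Entity mentions per 100 words, matched case-insensitively
    against a caller-supplied list of entity names."""
    word_count = len(text.split())
    if word_count == 0:
        return 0.0
    mentions = 0
    for name in entities:
        # Lookarounds keep a name from matching inside a longer word.
        pattern = r"(?<!\w)" + re.escape(name) + r"(?!\w)"
        mentions += len(re.findall(pattern, text, flags=re.IGNORECASE))
    return 100.0 * mentions / word_count


if __name__ == "__main__":
    names = ["ChatGPT", "Perplexity", "Google Gemini"]
    specific = "ChatGPT and Perplexity both cite pages that Google Gemini also retrieves"
    generic = "AI chatbots cite pages that other AI chatbots also retrieve"
    print(entity_density(specific, names), entity_density(generic, names))
```

Comparing the score for a cited page with a non-cited page on the same topic gives a quick, if crude, sense of the gap described above.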

Finding 3: Extractable Answer Blocks Are Decisive for Informational Queries

Pages containing what DerivateX’s study terms direct retrieval units (self-contained paragraphs or structured blocks that answer a specific question without requiring surrounding context) are cited at substantially higher rates for informational queries than pages that bury answers in flowing prose. This is the structural equivalent of writing for featured snippets, but applied to LLM extraction rather than Google’s snippet algorithm.

How ChatGPT Chooses Sources When Browsing the Web

When ChatGPT’s browsing mode is active, the process runs in three stages. First, it queries Bing using keywords derived from the user’s prompt. Second, it scores candidate pages from those results. Third, it selects a subset to cite in the final response.

The scoring in stage two weighs authority, content match, freshness, and structural extractability. According to research from ZipTie.dev, ChatGPT also considers search intent alignment: whether the page’s primary purpose matches what the query is asking for. A product page cited for an informational query scores lower than an editorial page that directly addresses the question, even if the product page has higher domain authority.
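To make stages two and three concrete, here is a toy model of scoring and selection. The real weighting is not public, so the weights, the 0 to 1 signal scale, and the names `Candidate`, `score`, and `select_citations` are all hypothetical. Stage one, the Bing query, is assumed to have already produced the candidate list.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    url: str
    # All signals are assumed to be pre-normalized to the range 0..1.
    authority: float
    content_match: float
    freshness: float
    extractability: float
    intent_match: float

# Hypothetical weights for illustration only; they sum to 1.0.
WEIGHTS = {
    "authority": 0.20,
    "content_match": 0.25,
    "freshness": 0.15,
    "extractability": 0.20,
    "intent_match": 0.20,
}

def score(candidate):
    """Weighted sum of the candidate's signals (stage two)."""
    return sum(weight * getattr(candidate, name) for name, weight in WEIGHTS.items())

def select_citations(candidates, k=3):
    """Return the URLs of the k highest-scoring candidates (stage three)."""
    ranked = sorted(candidates, key=score, reverse=True)
    return [c.url for c in ranked[:k]]


if __name__ == "__main__":
    product = Candidate("https://example.com/product", 0.9, 0.6, 0.5, 0.3, 0.2)
    editorial = Candidate("https://example.com/guide", 0.5, 0.8, 0.5, 0.8, 0.9)
    print(select_citations([product, editorial], k=1))
```

Under these made-up weights the editorial page outranks the product page despite its lower authority, which mirrors the intent-alignment behavior described above.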

One underreported behavior documented in the DerivateX study: ChatGPT rarely cites a single source in isolation. It cites topically adjacent pages together. Profound’s research corroborates this, describing it as sources traveling in packs. Your citation neighbors (the other pages cited alongside yours) are partly determined by your topical clustering in the index. Pages that are strongly associated with a topic cluster through internal links, backlink profiles, and semantic content are more likely to appear in citation groups for that topic.
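If you record the cited URLs for each response you test, citation neighbors can be tallied with a simple pair count. This is a sketch under that assumption, not the study's method. The input format (one list of URLs per response) and the names `co_citation_counts` and `neighbors` are hypothetical.

```python
from collections import Counter
from itertools import combinations

def co_citation_counts(responses):
    """Count how often each pair of URLs is cited in the same response.

    responses: list of lists of cited URLs, one inner list per response.
    """
    pairs = Counter()
    for cited in responses:
        # Deduplicate and sort so each pair has one canonical ordering.
        for a, b in combinations(sorted(set(cited)), 2):
            pairs[(a, b)] += 1
    return pairs

def neighbors(url, pairs):
    """URLs cited alongside `url`, most frequent first."""
    counts = Counter()
    for (a, b), n in pairs.items():
        if a == url:
            counts[b] += n
        elif b == url:
            counts[a] += n
    return counts.most_common()


if __name__ == "__main__":
    logged = [["a", "b", "c"], ["a", "b"], ["c", "d"]]
    print(neighbors("a", co_citation_counts(logged)))
```

Pages that repeatedly show up as neighbors of your URL indicate which topical cluster the platforms associate it with.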

What Separates Cited Pages From Non-Cited Pages on the Same Domain?

This is the question most AI visibility guides skip, because they operate at the domain level. DerivateX’s URL-level analysis makes it answerable.

The table below summarizes the key differentiating features between cited and non-cited pages observed in the study, across all three platforms analyzed.

Feature	Cited Pages	Non-Cited Pages (Same Domain)
Bing index status	100% indexed	Significant share not indexed
Direct retrieval units present	High prevalence	Low prevalence
Named entity density	Higher	Lower
Topical co-citation rate	Higher	Lower
Page age (informational queries)	Skews recent (under 12 months)	Mixed; older pages underrepresented
Structured formatting (tables, headers, steps)	High prevalence	Low prevalence

The practical implication: you can have a strong domain and still have most of your pages ignored by LLMs if those pages lack direct retrieval units and named entity density. Domain authority is a floor, not a ceiling.

How Does AI Citation Behavior Differ Across Platforms?

ChatGPT with browsing, Perplexity, and Google AI Overviews each have meaningfully different citation behaviors. Treating them as a monolith produces strategies that are mediocre for all three.

Perplexity is the most citation-heavy of the three. It regularly cites five to ten sources per response and tends to pull from a wider range of domain authorities, including smaller publications, if the content structure is strong. ChatGPT with browsing is more selective and shows a stronger bias toward recognizable brands and established publications for ambiguous queries. Google AI Overviews skews heavily toward pages already ranking on page one of standard Google results, making traditional search ranking a prerequisite rather than an alternative path.

The DerivateX study found that Perplexity shows the strongest response to content structure improvements among the three platforms. Pages that added explicit Q&A blocks and structured headers saw measurable citation rate changes on Perplexity before similar changes registered on ChatGPT. If you are prioritizing one platform for citation experiments, Perplexity gives faster feedback loops.

What Does the Found On AI Citation Signal Framework Identify as the Core Levers?

After reviewing the DerivateX study data alongside citation pattern research from Profound and ZipTie.dev, Found On AI identified four discrete levers that determine whether a page gets cited by an LLM. We call this the Citation Signal Framework, and it organizes the decision-making more cleanly than the standard SEO checklists being applied to GEO.

Lever 1: Index Eligibility

Is the page in Bing’s index? Submit to Bing Webmaster Tools, verify your sitemap, and check fetch status for priority pages. This is a one-time fix with a binary payoff.
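The sitemap part of this check can be scripted. The sketch below is a minimal illustration, assuming a standard sitemaps.org XML sitemap; the function names and example URLs are hypothetical. It does not query Bing, so index and fetch status still need to be confirmed in Bing Webmaster Tools.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def urls_in_sitemap(sitemap_xml: str) -> set[str]:
    """Return every <loc> URL listed in a sitemap document."""
    root = ET.fromstring(sitemap_xml)
    return {
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    }


def missing_from_sitemap(priority_urls: list[str], sitemap_xml: str) -> list[str]:
    """List priority URLs that the sitemap does not mention."""
    listed = urls_in_sitemap(sitemap_xml)
    return [url for url in priority_urls if url not in listed]


# Hypothetical sitemap content used only for illustration.
EXAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/guide</loc></url>
  <url><loc>https://example.com/pricing</loc></url>
</urlset>"""

gaps = missing_from_sitemap(
    ["https://example.com/guide", "https://example.com/faq"],
    EXAMPLE_SITEMAP,
)
print(gaps)
```

Any URL this reports is a priority page the sitemap never mentions, which is worth fixing before looking at the other levers.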

Lever 2: Retrieval Density

Does the page contain direct retrieval units: paragraphs or blocks that answer a specific question completely without surrounding context? Each retrieval unit should be 40 to 80 words, state the answer in the first sentence, and avoid pronouns that require prior context to resolve. Writing for retrieval density is structurally different from writing for reading flow. For example, instead of writing “ChatGPT is a conversational AI that helps with a wide range of tasks including writing, research, and coding,” a direct retrieval unit reads: “ChatGPT generates human-like text responses, answers nuanced research questions, and completes code debugging tasks.” Most editorial teams have not made that adjustment.
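Two of these criteria are mechanical enough to lint for. The following is a rough sketch, not a tool from the study: the function name and the list of context-dependent openers are assumptions. It checks only the 40 to 80 word range and whether the block opens with a pronoun; whether the first sentence actually states the answer still needs an editor's judgment.

```python
import string

# Assumed list of openers that usually depend on earlier context.
CONTEXT_DEPENDENT_OPENERS = {"it", "this", "that", "they", "these", "those", "he", "she"}

# ASCII punctuation plus the curly quotes common in published copy.
STRIP_CHARS = string.punctuation + "“”‘’"


def check_retrieval_unit(block: str, min_words: int = 40, max_words: int = 80) -> dict:
    """Report word count and whether the block opens with a context-dependent word."""
    words = block.split()
    first_word = words[0].strip(STRIP_CHARS).lower() if words else ""
    return {
        "word_count": len(words),
        "length_ok": min_words <= len(words) <= max_words,
        "opens_with_pronoun": first_word in CONTEXT_DEPENDENT_OPENERS,
    }


# A two-word block that leans on prior context fails both checks.
print(check_retrieval_unit("It depends."))
```

Running this across every paragraph under a question-style heading gives a quick list of blocks to rewrite first.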

Lever 3: Entity Specificity

How many named entities (tools, companies, people, standards, datasets, platforms) does the page contain? Generic topic coverage scores poorly against specific entity-rich coverage for LLM citation purposes. A page about AI writing tools that names Jasper, Copy.ai, Writer, and Claude, with specific claims about each, gives a language model far more to attach to than a page discussing “AI writing assistants” in abstract terms.
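One simple way to approximate entity density is to count mentions of a known entity list per 100 words. This is an illustrative sketch under that assumption: the study's actual measurement method is not described here, the function name is hypothetical, and a real audit might use a named entity recognition library instead of a fixed list.

```python
import re


def entity_density(text: str, known_entities: list[str]) -> float:
    """Return mentions of known entities per 100 words (case-sensitive)."""
    word_count = len(text.split())
    if word_count == 0:
        return 0.0
    mentions = 0
    for entity in known_entities:
        # Match the entity only when it is not embedded in a longer word.
        pattern = r"(?<!\w)" + re.escape(entity) + r"(?!\w)"
        mentions += len(re.findall(pattern, text))
    return 100.0 * mentions / word_count


ENTITIES = ["Jasper", "Copy.ai", "Writer", "Claude"]

specific = "Jasper and Copy.ai are two AI writing tools teams use"
generic = "AI writing assistants help teams draft content faster every single day"

print(entity_density(specific, ENTITIES))
print(entity_density(generic, ENTITIES))
```

The absolute number matters less than the comparison: pages on the same topic can be ranked against each other to find the most generic ones.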

Lever 4: Topical Co-location

Is the page topically clustered with other pages that are already being cited? This is the lever most practitioners cannot directly control, but you can influence it through internal linking strategy and by building content in topic clusters rather than isolated posts. Pages that exist in topically coherent clusters on your domain are more likely to be pulled into citation groups alongside already-cited neighbors.
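The internal linking side of this lever can be audited from a crawl export. The sketch below assumes you already have a mapping from each page to the internal URLs it links to; the function name and paths are hypothetical. It flags pages assigned to a topic cluster that neither link to nor receive links from any other page in that cluster.

```python
def isolated_pages(cluster: set[str], links: dict[str, set[str]]) -> list[str]:
    """Return cluster pages with no inbound or outbound links inside the cluster."""
    isolated = []
    for page in sorted(cluster):
        others = cluster - {page}
        # Links from this page to other pages in the same cluster.
        outbound = links.get(page, set()) & others
        # Other cluster pages that link to this page.
        inbound = {src for src in others if page in links.get(src, set())}
        if not outbound and not inbound:
            isolated.append(page)
    return isolated


# Hypothetical crawl data: page path -> internal links found on that page.
LINKS = {
    "/llm-citations": {"/bing-indexing"},
    "/bing-indexing": {"/llm-citations"},
    "/entity-density": {"/contact"},
}
CLUSTER = {"/llm-citations", "/bing-indexing", "/entity-density"}

print(isolated_pages(CLUSTER, LINKS))
```

Pages this flags are candidates for new internal links to and from the rest of the cluster.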

Does Content Freshness Actually Affect AI Citation Rates?

It depends entirely on the query type, and the distinction matters. DerivateX’s study found a clear bifurcation in how freshness affects citation rates across query categories.

For queries about AI tools, market conditions, pricing, product features, and anything with implicit recency expectations, pages older than six months are systematically underrepresented in citations. LLMs appear to apply a recency filter that is more aggressive than Google’s standard freshness algorithm. A page published 18 months ago about a tool’s pricing will be cited less frequently than a page published three months ago, even if both contain accurate information.

For stable reference content (definitions, methodology explanations, evergreen how-to content), freshness has minimal effect on citation rates. The structured content and entity signals dominate. This means an evergreen methodology page does not need quarterly updates to maintain citation eligibility, but a tools comparison page does.
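This split can be turned into a simple refresh rule for a content inventory. The sketch below is an assumption-laden illustration: the six-month cutoff is taken from the pattern described above and approximated as 183 days, the `time_sensitive` flag has to be assigned by an editor, and the function name is hypothetical.

```python
from datetime import date


def needs_refresh(
    published: date,
    today: date,
    time_sensitive: bool,
    max_age_days: int = 183,  # roughly six months, an assumed cutoff
) -> bool:
    """Flag time-sensitive pages older than the cutoff; evergreen pages never flag."""
    if not time_sensitive:
        return False
    return (today - published).days > max_age_days


today = date(2025, 7, 1)

# Pricing page published 18 months ago: flagged.
print(needs_refresh(date(2024, 1, 1), today, time_sensitive=True))
# Pricing page published three months ago: not flagged.
print(needs_refresh(date(2025, 4, 1), today, time_sensitive=True))
# Evergreen methodology page of the same age as the first: not flagged.
print(needs_refresh(date(2024, 1, 1), today, time_sensitive=False))
```

Passing `today` explicitly keeps the rule easy to test and lets the same inventory be re-evaluated against a future planning date.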

What Gets a Page Excluded From AI Citations?

Exclusion is easier to understand than inclusion, because the filters are mostly binary. A page is excluded if it is not in the Bing index. It is excluded if the query-to-content match is poor, meaning the page is topically tangential to what the LLM is answering. It is excluded if the content is primarily thin, duplicative, or structured as marketing copy rather than informational content.

One exclusion factor that surprises practitioners: pages behind paywalls or login walls are not cited, regardless of their quality or authority. This sounds obvious but has real implications for publishers who gate their best data-heavy content. If the content being gated is exactly the kind of structured, entity-rich, data-backed material LLMs favor, gating it removes it from AI citation entirely while competitors with open versions of similar content accumulate citations.

Inconsistent brand messaging across pages on the same domain also appears to suppress citation rates, based on findings in the broader citation pattern literature. LLMs weight entity coherence. If your site says your tool does X on one page and implies it does not on another, the model’s confidence in citing either page drops.

Frequently Asked Questions About How LLMs Choose Sources

How does ChatGPT decide which sources to cite when browsing?

ChatGPT with browsing queries Bing, pulls candidate URLs from the results, and then scores those pages on authority, content-to-query match, recency, and structural extractability. Pages not in Bing’s index are excluded entirely before scoring begins. Among indexed pages, those with self-contained answer blocks and high named entity density are selected at higher rates for informational queries. ChatGPT typically cites multiple sources per response rather than selecting a single winner.

What kinds of pages do LLMs recommend and cite most often?

LLMs most frequently cite pages that contain explicit, self-contained answers to specific questions, mention many relevant entities by name, are indexed in the search engine the LLM queries, and exist within a topically coherent cluster of related content. Structured pages, those using headers, tables, numbered steps, and Q&A blocks, are cited at higher rates than pages presenting the same information in flowing prose. Data-backed pages with named sources also perform above average across platforms.

Does domain authority determine whether an AI cites your page?

Domain authority is a weak predictor of AI citation at the URL level when content structure and entity density are controlled for. High domain authority raises the floor (it gets pages indexed and lowers the threshold for initial scoring), but it does not guarantee citation. DerivateX’s study found that low-authority pages with strong direct retrieval units and high entity specificity were cited more frequently than high-authority pages with generic, poorly structured content on the same topic.

Does Perplexity use different citation criteria than ChatGPT?

Yes, meaningfully so. Perplexity cites more sources per response than ChatGPT, is more tolerant of lower domain authority when content structure is strong, and shows a faster response to structural content improvements. ChatGPT with browsing applies a stronger brand recognition filter and is more selective overall. Google AI Overviews is the most restrictive, strongly preferring pages that already rank on the first page of standard Google results, making traditional search ranking a prerequisite for AI Overview citation on competitive queries.

How do LLMs handle pages behind paywalls or login walls?

They do not cite them. Pages that require authentication or payment to access are not retrieved by LLM browsing systems. This has a practical implication for publishers: gating high-quality structured content removes it from AI citation eligibility entirely, even if that content would otherwise score well on every other citation factor. If the goal is AI citation and brand visibility through LLM responses, open-access content is a structural requirement.

Can you get cited by an AI without ranking on Google?

For ChatGPT with browsing and Perplexity, yes. Both pull from Bing’s index rather than Google’s, so strong Bing presence is the relevant prerequisite. A page can be absent from Google’s top results and still appear in ChatGPT or Perplexity citations if it is indexed in Bing and scores well on content structure and entity density. Google AI Overviews is the exception: it correlates strongly with Google search rankings, making standard Google SEO a near-requirement for that specific platform.

What is citation co-occurrence, and why does it matter?

Citation co-occurrence is the pattern, documented in DerivateX’s study and corroborated by Profound’s research, where LLMs cite pages in thematic groups rather than in isolation. Your citation neighbors are partly determined by your topical positioning through your internal links, backlink profile, and semantic content clustering. Building isolated, standalone pages is a weaker strategy than building topically coherent content clusters where each page reinforces the others’ citation eligibility.

How should teams prioritize improvements to increase AI citation rates?

Start with index eligibility: confirm all priority pages are in Bing’s index via Bing Webmaster Tools. Then audit for retrieval density: does each page contain self-contained answer blocks that resolve a specific question in 40 to 80 words without requiring surrounding context? Next, increase entity specificity by naming specific tools, companies, datasets, and standards rather than discussing categories generically. Finally, build internal topical clusters around your core subjects. These four changes, applied in sequence, address the Citation Signal Framework levers in order of impact.

The Deeper Pattern Behind AI Citation Behavior

What the DerivateX data reveals, and what most AI visibility advice misses, is that LLMs are not running a ranking algorithm against your page. They are completing a structured knowledge representation, and they are looking for pages that slot cleanly into the gaps in that representation. A page that names specific entities, answers a specific question in a retrievable block, and sits inside a coherent topical cluster is not just well-structured content. It is behaving like a node in a knowledge graph rather than a document in a search index.

That reframe matters because it tells you where to spend time. Traditional SEO has trained teams to think about rankings, authority, and traffic. AI citation optimization requires thinking about coherence, specificity, and extractability. A page can rank on page one of Google and still contribute nothing to your AI citation footprint if it is written as flowing editorial prose with no extractable units and generic topic framing.

High domain authority does not build an AI citation footprint on its own. Understanding how language models form answers, and building pages that participate in that process, does. The full DerivateX citation pattern study, including platform-level breakdowns and query category data, is worth reading in detail if you are building that kind of program.

Emily Carter
