Custom GPTs for Company Knowledge: The Honest Limits Nobody Tells You

By the Praxiron team · Last updated July 5, 2026 · 12 min read

Custom GPTs are a good fit for narrow, repeatable tasks built on a small, stable set of documents. As a company knowledge base they hit hard limits: a 20-file cap, unreliable retrieval from complex PDFs, no permission controls, and confident answers that sometimes ignore the files entirely. Before relying on one for decisions, know that a custom GPT gives you no verifiable source references, no confidence level, and no reliable way to abstain when its files do not contain the answer.

What is a custom GPT and when is it the right tool?

A custom GPT is a configured version of ChatGPT: you give it instructions, upload reference files, optionally wire in external actions, and share the result with a link or inside your workspace. Building one takes minutes in the GPT builder and requires no code. It is the right tool when the job is narrow, repeatable, and rests on a small, stable set of documents. It is the wrong tool when you expect it to be the memory of your company.

Give custom GPTs their full due, because within their lane they are genuinely useful. A GPT loaded with your style guide that rewrites drafts in the house voice. A GPT with one onboarding handbook that answers new-hire questions. A GPT carrying your proposal template and three annotated examples that produces consistent first drafts. A GPT with a single product spec sheet that helps support agents phrase responses. These work because they share a shape: one job, a handful of documents that rarely change, and output a human reviews before it goes anywhere that matters. For that shape, a custom GPT is fast to build, cheap to run, easy to share, and better than most of the alternatives.

The trouble starts when the shape changes. Teams see the handbook bot work, conclude the same trick will work for everything, and start uploading contracts, pricing sheets, engineering standards, and project archives, expecting a company brain. That is the move this article is about, because it fails in specific, documented ways, and because the failure is quiet: the GPT keeps producing fluent, confident output long after it has stopped using your files.

If your goal is broader than one narrow task, ChatGPT itself offers stronger routes to organizational knowledge, covered in how to connect ChatGPT to your company’s files. And if your goal is output you can act on, what matters is not which upload feature you pick but what sits between your files and the model, which is where a knowledge and control layer comes in. More on both below.

How to build one on company documents, properly

If a narrow custom GPT is the right call, build it deliberately. Most “my GPT ignores its files” complaints trace back to skipping one of these steps.

1. Define one job, in one sentence. “Answers questions about the 2026 employee handbook” is a job. “Knows our company” is not. If you cannot state the job in a sentence, you are building a knowledge base, and a custom GPT is the wrong container for it.

2. Curate the files, do not dump them. Fewer, cleaner, and more current beats more. Prefer simple, text-first formats: single-column documents, plain headings, tables kept small and simple. Multi-column PDFs, scanned pages, and dense spreadsheet-style tables inside PDFs are widely reported to retrieve poorly or not at all, so convert them to clean single-column text or a structured document before uploading. Delete superseded versions; the GPT has no concept of which version is in force, so if two versions are present it can draw from either.

3. Write instructions that force file use. The instruction block is your only steering wheel, so be explicit: answer only from the uploaded files; if the files do not contain the answer, say so plainly instead of answering from general knowledge; name the file each answer draws on. These instructions raise the floor meaningfully. Keep expectations honest, though: they are requests, not guarantees, for reasons covered in the next section.

4. Match the capabilities to the job. If the GPT should answer only from your documents, consider disabling web browsing so it cannot quietly substitute internet sources for your material. Enable only the capabilities the job needs.

5. Test with an answer key, including trick questions. Before sharing anything, write ten questions you know the files answer and check each response against the source. Then ask three questions the files do not answer and watch what happens. A well-behaved GPT says the files do not cover it. A poorly instructed one improvises. This second test is the one most builders skip and the one that predicts real-world behavior best.

6. Plan for maintenance. Uploaded files are frozen snapshots. Nothing syncs. When the price list changes or the policy is revised, someone must remember to delete the old file and upload the new one, for every GPT that carries it. Put a named owner and a review date on any GPT that people rely on, because a stale GPT does not look stale: it answers from the old file in the same confident voice.

7. Share deliberately. A GPT can be private, shared by link, or published to your workspace or the GPT Store. Anyone who can use the GPT can query everything inside it, so the sharing decision is also a data-exposure decision. Treat every uploaded file as visible to every user of the GPT, because determined users can often extract file contents through targeted prompting regardless of your instructions.
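The answer-key test in step 5 is easy to make repeatable. What follows is a minimal sketch, not a tool OpenAI provides: the names Case, grade, and run_answer_key are invented for illustration, the abstention phrases are assumptions you would replace with the wording your own instructions ask for, and ask stands for however you put a question to the GPT, including pasting by hand and recording the reply.

```python
from dataclasses import dataclass
from typing import Callable, List

# Assumed phrases that count as the GPT declining to answer.
# Replace them with the wording your own instructions request.
ABSTAIN_MARKERS = ("not in the files", "files do not cover", "cannot find")


@dataclass
class Case:
    question: str
    must_contain: List[str]   # facts copied from the source document
    answerable: bool = True   # False marks a trick question


def grade(answer: str, case: Case) -> bool:
    """Pass if an answerable case contains every expected fact,
    or if a trick question is met with an abstention."""
    text = answer.lower()
    if case.answerable:
        return all(fact.lower() in text for fact in case.must_contain)
    return any(marker in text for marker in ABSTAIN_MARKERS)


def run_answer_key(ask: Callable[[str], str], cases: List[Case]) -> dict:
    """Run every case through `ask` and collect the failures by kind."""
    report = {"wrong_answers": [], "improvised": []}
    for case in cases:
        if not grade(ask(case.question), case):
            bucket = "wrong_answers" if case.answerable else "improvised"
            report[bucket].append(case.question)
    return report


if __name__ == "__main__":
    # Stand-in for the real GPT; the replies are invented for the demo.
    def fake_gpt(question: str) -> str:
        if "vacation" in question:
            return "Employees accrue 20 days of vacation per year."
        return "Most companies offer around 12 weeks of parental leave."

    cases = [
        Case("How many vacation days do employees get?", ["20 days"]),
        Case("What is the parental leave policy?", [], answerable=False),
    ]
    print(run_answer_key(fake_gpt, cases))
```

Substring matching is crude, so treat a pass as a prompt to read the answer and a fail as a reason to investigate. The "improvised" bucket is the one to watch, because it counts exactly the trick questions the GPT answered instead of declining.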

Follow all seven steps and you will have a solid narrow-task assistant. What you still will not have is a company knowledge base, and the reasons are structural rather than fixable with effort.

Why does my custom GPT ignore its uploaded files?

Because file retrieval is a tool the model may call, not a pipeline it must pass through. When you upload files to a custom GPT, they are indexed for search. At question time, the model decides whether to search, what to search for, and what to do with the results. Each of those decision points can go wrong quietly.

The most common miss is retrieval itself. Files are split into chunks for indexing, and a question phrased differently from the source text can fail to match the chunk that holds the answer. Long documents, merged documents, and complex formatting all raise the miss rate. When retrieval comes back thin, the model does what language models do: it produces the most plausible answer from its general training. Community reports document the most corrosive version of this, cases where the bot answers from general knowledge while stating that it consulted the uploaded files. The answer arrives with the same fluency and the same apparent grounding as a real one, and nothing in the interface tells you which kind you got.
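A toy example makes the phrasing problem visible. The sketch below is deliberately crude: it scores chunks by shared words, which is an assumption made for illustration and not how OpenAI's index works (production retrieval typically tolerates paraphrase better). The handbook text, the stopword list, and the function names are all invented.

```python
import re
from typing import List, Tuple

STOPWORDS = {"the", "a", "an", "is", "are", "of", "to", "for",
             "what", "how", "our", "in", "on", "do", "we"}


def tokens(text: str) -> set:
    """Lowercased words with common filler removed."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {w for w in words if w not in STOPWORDS}


def split_into_chunks(document: str) -> List[str]:
    """Split on blank lines, one chunk per paragraph."""
    return [p.strip() for p in document.split("\n\n") if p.strip()]


def best_chunk(question: str, chunks: List[str]) -> Tuple[int, str]:
    """Return (shared-word count, chunk) for the best-matching chunk."""
    q = tokens(question)
    scored = [(len(q & tokens(c)), c) for c in chunks]
    return max(scored, key=lambda pair: pair[0])


HANDBOOK = (
    "Remote work: staff may work from home up to three days per week.\n\n"
    "Expenses: meals while travelling are reimbursed up to 60 euros per day."
)

if __name__ == "__main__":
    chunks = split_into_chunks(HANDBOOK)
    # Same wording as the source: a strong match on the right chunk.
    print(best_chunk("How many days per week may staff work from home?", chunks))
    # Same question in different words: nothing matches at all.
    print(best_chunk("What is the telecommuting allowance?", chunks))
```

The second question is about the same policy, yet it shares no words with the chunk that answers it. A real index fails less bluntly than this, but the consequence is the one described above: when the match is weak or missing, the model still has to say something.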

Format is the second offender. Multi-column PDF layouts can be extracted in scrambled reading order, complex tables lose the row-and-column relationships that gave the numbers meaning, and scanned documents may contribute nothing at all. The upload succeeds, the file appears in the list, and the content is effectively invisible. Builders naturally assume that an accepted upload is a readable upload; it is not.

The third factor is instruction drift on easy questions. If the model believes it already knows the answer, it may skip the search entirely, even when your instructions say otherwise. That is tolerable when its general knowledge happens to match your documents and silently wrong when your company’s actual policy differs from the internet’s average answer, which is precisely the case where you needed the files.

The fixes in the previous section (clean formats, forceful instructions, adversarial testing) shrink the failure rate. They cannot take it to zero, because nothing in the design requires the model to ground its output in your files or to tell you when it has not. That missing requirement has a name, abstention, and it is a property of the layer around the model, not of any prompt you can write.

The documented limits: file count, formats, complex PDFs

Beyond behavior, there are hard boundaries. Per OpenAI’s documentation, a custom GPT accepts up to 20 uploaded files. Twenty files is generous for a style guide and a handful of exemplars. It is nowhere near a company’s contracts, standards, procedures, price history, and project archive. Teams that hit the cap usually respond by concatenating many documents into a few giant PDFs, which stays under the count but tends to make retrieval worse: a merged file mixes unrelated topics in one place, so chunks from different documents collide in search, and long files with mixed layouts are more likely to be extracted in scrambled order.

The format limits compound the count limit. The documents most companies care about are exactly the ones that retrieve worst: multi-column contracts, specification sheets built around dense tables, scanned legacy paperwork. Widely reported failures on multi-column PDFs and complex tables mean the practical ceiling is not 20 files but 20 clean, simple, text-first files, a much smaller universe.

Then there is the limit that should worry gatekeepers most: a custom GPT has no permission model at all. One pool of files, one level of access. Anyone who can use the GPT can query everything in it, and instructions telling the model to withhold certain content are advisory, not enforced. There is no per-file access, no distinction by role or department, no notion that the finance folder and the lunch menu deserve different treatment, and no audit trail showing who retrieved what. Whatever access discipline your file storage enforces ends at the moment of upload.

Finally, uploads are snapshots. There is no sync, no version awareness, no recency rule. The GPT will answer from the March price list in November with total composure, because it has no way to know a newer one exists.
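Because the GPT cannot notice a stale file, something outside it has to. One way to do that, sketched below under stated assumptions, is a small register that records a fingerprint of each file at upload time and compares it with the current source. The register, the UploadRecord fields, and the function names are hypothetical; nothing here talks to ChatGPT, and replacing the upload remains a manual step.

```python
import hashlib
from dataclasses import dataclass
from datetime import date
from typing import Dict, List


def fingerprint(content: bytes) -> str:
    """SHA-256 of the file content, recorded when the file is uploaded."""
    return hashlib.sha256(content).hexdigest()


@dataclass
class UploadRecord:
    gpt_name: str
    file_name: str
    sha256_at_upload: str
    owner: str
    review_by: date


def stale_uploads(records: List[UploadRecord],
                  current_files: Dict[str, bytes],
                  today: date) -> List[str]:
    """Return one message per upload that needs a human to look at it."""
    problems = []
    for r in records:
        current = current_files.get(r.file_name)
        label = f"{r.gpt_name} / {r.file_name} (owner: {r.owner})"
        if current is None:
            problems.append(f"{label}: source file is missing")
        elif fingerprint(current) != r.sha256_at_upload:
            problems.append(f"{label}: source changed since upload")
        elif today > r.review_by:
            problems.append(f"{label}: review overdue")
    return problems


if __name__ == "__main__":
    # Invented example data: the price list was edited after upload.
    records = [
        UploadRecord("Pricing GPT", "prices.csv", fingerprint(b"v1"),
                     "Dana", date(2026, 12, 1)),
    ]
    current = {"prices.csv": b"v2"}
    for line in stale_uploads(records, current, date(2026, 7, 5)):
        print(line)
```

The check is only as good as the register, so each upload needs a record at the moment it happens. Run on a schedule, it turns "someone must remember" into a list with a named owner on every line.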

None of this is a flaw in the product; it is the product’s scope. OpenAI built custom GPTs as lightweight, shareable assistants for focused tasks, and at that they succeed. The limits only bite when the tool is asked to be something it was never designed to be: governed organizational knowledge that decisions can rest on. Seeing that gap clearly is useful, because it defines exactly what a company needs to add, and adding it is very possible.

Why confident answers without sources are worse than no answers

A custom GPT that said “I cannot find this in my files” every time it was uncertain would be a far more valuable tool, even though it would answer fewer questions. What it does instead is answer nearly everything in the same assured tone, with no source reference a reader can open, no confidence level that drops when support is thin, and no abstention when support is absent. That combination quietly converts small retrieval failures into business mistakes.

The cost pattern is predictable. Experienced people spot-check a few answers against documents they know, catch a fabrication, and stop trusting the GPT for anything that matters, so their scarce time returns as the bottleneck. Less experienced people lack the background to catch fluent errors, so wrong answers pass through and surface later as a mispriced quote or an out-of-date policy applied to a client. The organization gets senior time still consumed and junior output that is harder to check than before.

The adoption research shows how expensive this pattern is at scale. MIT NANDA 2025 found that 95% of enterprise generative AI pilots showed no measurable P&L impact. S&P Global Market Intelligence 2025 reported that 42% of companies abandoned most of their AI initiatives. And WRITER’s 2026 enterprise AI survey found only 29% of executives report significant organizational ROI from AI. Quick wins that cannot be trusted for consequential work are heavily represented in those numbers, and an ungoverned pile of uploaded files is a common way to produce one.

“A custom GPT built on twenty files will answer the twenty-first question anyway. The output that should concern a company is not the visibly wrong one, it is the plausible one with no source, no confidence level, and no admission that the files never contained the answer.”

The Praxiron team

This is less a custom GPT problem than a limit of retrieval as an approach. Even when retrieval works perfectly, fetching text is not the same as reasoning toward a decision under your company’s rules; that argument is laid out in why RAG isn’t enough for enterprise decisions. The gap between what uploads provide and what decisions require is exactly the space the next two sections describe, and filling it is what turns the model’s real fluency from a liability back into an asset.

From uploaded files to decision DNA: the structural difference

A pile of uploaded files is not structured knowledge, and the difference is not size, it is structure.

Uploaded files carry none of the metadata that makes company knowledge usable for decisions. Nothing marks the 2026 standard as superseding the 2024 one. Nothing records that the CFO’s pricing memo outranks a salesperson’s draft, that one clause applies only to public-sector clients, or that the margin floor in the spreadsheet is a hard rule while the target in the slide deck is an aspiration. Most importantly, nothing captures the judgment of your senior people: which precedents matter, which exceptions were approved and why, what “we do not do this” actually covers. In an uploaded pile, all of that lives between the lines, and a retrieval step reads only the lines.

Structured knowledge inverts this. Documents are enriched with context about what they are, when they apply, and how much authority they carry. They are organized into catalogs and contexts rather than one undifferentiated heap. Authority and recency rules are explicit, so a conflict between two documents resolves by rule instead of by whichever chunk the search happened to return. And the company’s decision logic, the standards and precedents that make an answer right for this company rather than plausible in general, is encoded deliberately. That structured, company-owned asset is what we call decision DNA, and it is the raw material decisions actually need. Files are what your company has; decision DNA is what your company knows.
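To make "resolves by rule" concrete, here is a minimal sketch of what explicit authority and recency metadata could look like. It is an illustration under assumptions, not Praxiron's data model: the Source fields, the numeric authority scale, and the tie-breaking order (authority first, then effective date) are all choices made for the example.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class Source:
    title: str
    authority: int                        # higher outranks lower (assumed scale)
    effective: date
    superseded_by: Optional[str] = None   # title of the replacing document
    applies_to: frozenset = frozenset({"all"})


def governing_source(candidates: List[Source],
                     client_type: str) -> Optional[Source]:
    """Pick the document that governs: drop superseded and out-of-scope
    sources, then prefer higher authority, then the later effective date."""
    in_force = [
        s for s in candidates
        if s.superseded_by is None and ({"all", client_type} & s.applies_to)
    ]
    if not in_force:
        return None
    return max(in_force, key=lambda s: (s.authority, s.effective))


if __name__ == "__main__":
    # Invented example documents.
    candidates = [
        Source("CFO pricing memo", 3, date(2026, 1, 10)),
        Source("Sales draft", 1, date(2026, 3, 1)),
        Source("2024 standard", 3, date(2024, 1, 1),
               superseded_by="2026 standard"),
        Source("Public-sector clause", 5, date(2026, 2, 1),
               applies_to=frozenset({"public"})),
    ]
    print(governing_source(candidates, "private").title)
```

For a private-sector client the memo governs even though the sales draft is newer, the superseded standard never competes, and the public-sector clause stays out of scope. None of that can be inferred from the text of the files alone, which is why it has to be recorded as metadata.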

What a knowledge and control layer does that a custom GPT cannot

A knowledge and control layer is the category of platform that sits between a company’s knowledge and the AI engines, and it supplies exactly the properties the upload approach is missing.

It grounds every output in the company’s structured knowledge and attaches a source reference, so a reader can open the documents an output rests on and see document content separated from generated conclusions. It attaches calibrated confidence, a level that visibly drops when support is thin rather than reading high everywhere. It abstains when sources are insufficient, returning “no sufficient source” instead of a guess, which tells the decision-maker precisely where the company’s knowledge ends. It enforces permission control by file type and role, so who can ask what of which knowledge is a governed rule rather than a hope, and access to the finance catalog is a policy decision instead of an upload decision. And it is engine-agnostic by design: the knowledge is structured once and serves whichever engines the company chooses, today’s and tomorrow’s, instead of being re-uploaded into each tool’s private pile. Praxiron is built as exactly this kind of platform, with decision DNA at the center and those controls on every output.
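The controls in that paragraph can be sketched as a gate that runs before any conclusion is released. This is a simplified illustration with hypothetical names and thresholds, not a description of how Praxiron or any other product implements them: the support scores are assumed to come from an upstream retrieval step, the role-to-catalog map is invented, and a fixed threshold stands in for real confidence calibration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Passage:
    document: str
    catalog: str
    text: str
    support: float   # 0..1 score from an assumed upstream retrieval step


@dataclass
class Output:
    conclusion: Optional[str]
    sources: List[str]
    confidence: str


# Invented permission map: which catalogs each role may draw on.
ROLE_CATALOGS = {
    "finance": {"finance", "general"},
    "support": {"general"},
}


def respond(role: str, passages: List[Passage], draft_conclusion: str,
            min_support: float = 0.6) -> Output:
    """Release a conclusion only if permitted, well-supported sources exist."""
    allowed = ROLE_CATALOGS.get(role, set())
    usable = [p for p in passages
              if p.catalog in allowed and p.support >= min_support]
    if not usable:
        # Abstain instead of guessing.
        return Output(None, [], "no sufficient source")
    top = max(p.support for p in usable)
    confidence = "high" if top >= 0.85 else "medium"
    return Output(draft_conclusion,
                  sorted({p.document for p in usable}), confidence)


if __name__ == "__main__":
    # Invented example passages.
    passages = [
        Passage("Margin policy 2026", "finance", "The margin floor is 18%.", 0.9),
        Passage("General FAQ", "general", "Contact sales for quotes.", 0.3),
    ]
    draft = "The quote must keep margin at or above 18%."
    print(respond("finance", passages, draft))
    print(respond("support", passages, draft))
```

The same question produces a sourced answer for the finance role and an abstention for the support role, because the only well-supported passage sits in a catalog that support may not read. The point of the sketch is where the decision lives: in code around the model, where it is enforced, and not in a prompt, where it is only requested.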

The practical upshot for the custom GPT question: keep the narrow GPTs for the narrow jobs they do well, and put the knowledge your decisions depend on into a governed layer above the engines. To see how source references, confidence, and abstention work on real outputs, read how the platform works.

A custom GPT alone vs. a knowledge and control layer

Capability	Custom GPT alone	With a knowledge and control layer
Source references	File names sometimes mentioned; claimed file use is not verifiable	Every output carries references to the documents it rests on, with content separated from conclusions
Calibrated confidence	None; every answer arrives in the same confident tone	Confidence level attached to each output, dropping visibly when support is thin
Abstention when sources are insufficient	Answers from general training, sometimes while claiming to use files	Returns “no sufficient source” instead of guessing
Permission granularity by file type and role	None; every user of the GPT can query every uploaded file	Access governed by file type, role, and context
Consistency across repeated questions	Varies with retrieval luck between runs	Explicit authority and recency rules make repeated questions resolve the same way
Engine independence	Locked to ChatGPT; files re-uploaded per tool	Knowledge structured once, serving any engine

Frequently asked questions

Why does my custom GPT make up answers instead of using my files?

File retrieval in a custom GPT is a tool the model chooses to call, not a guarantee. When retrieval misses, often because of chunking, complex formatting, or a question phrased differently from the text, the model falls back on its general training and answers anyway, sometimes while claiming it used your files. Stronger instructions reduce this but cannot eliminate it, because the model has no abstention requirement.

How many files can a custom GPT actually handle?

Per OpenAI's documentation, a custom GPT accepts up to 20 uploaded files. In practice the useful limit is lower: retrieval quality drops as files grow longer and more complex, and merging documents to stay under the cap tends to make retrieval worse, not better. A curated handbook fits comfortably. A company's contracts, standards, price lists, and project history do not.

Can I control who sees what inside a custom GPT?

No. A custom GPT has one pool of uploaded files, and every user who can access the GPT can query all of them. There are no permissions by file, file type, role, or department, and users of a shared GPT can often extract file contents through targeted prompting. Access control has to live outside the GPT, which usually means not uploading sensitive material at all.

Are custom GPTs safe for confidential company documents?

Treat every uploaded file as readable by anyone who can use the GPT. Within a Business or Enterprise workspace, sharing can be restricted to specific members, which helps. But there is no per-file permission model, no audit trail of what was retrieved, and instructions cannot reliably stop a determined user from pulling file contents. Confidential material belongs behind permission control by file type and role, which a custom GPT does not offer.

What is the difference between uploading files and building decision DNA?

Uploaded files are a static pile the model searches when it chooses to. Decision DNA is the company's knowledge structured for decisions: documents enriched with context, organized into catalogs, weighted by authority and recency, and connected to the rules the company actually decides by. The first gives you retrieval on a good day. The second gives outputs with sources, calibrated confidence, and abstention when support is insufficient.