Your PDF problem isn't going away. Here's how to solve it.
Thousands of public sector organisations face the same challenge: legacy PDF documents that don't meet accessibility standards. We've built an AI-powered solution that transforms them into accessible HTML at scale.
The challenge every public body faces
If you work in digital for a government department, regulator, or arm's length body, this will sound familiar: thousands of PDF documents, PSBAR 2018 compliance deadlines, and no realistic way to manually convert everything.
PDFs remain the dominant format for publishing official documents. They're easy to produce and familiar to users, but they create significant barriers for people who use assistive technology or mobile devices, or who need to adjust text size and contrast.
The Public Sector Bodies Accessibility Regulations 2018 aren't a suggestion; they're the law. Yet manual conversion of thousands of documents would take years and cost hundreds of thousands of pounds.
When the Financial Reporting Council approached us with this exact challenge, we knew we needed a different approach. Following our successful website replatform, FRC commissioned us to tackle their library of 7,800+ PDF documents, including financial standards and regulatory guidance.
The arm's length body dilemma
This problem is particularly acute for regulatory bodies and standards organisations. Your documents aren't just information; they're the standards and guidance that professionals rely on daily. Accuracy is non-negotiable.
You're also caught in a publishing cycle that won't pause. New standards are issued, existing guidance is updated, and all of it continues to be published as PDFs because that's what your subject matter experts know how to produce and your users expect to receive.
Meanwhile, compliance deadlines loom. You're caught between legal requirements, limited resources, and the reality that your organisation needs to keep functioning.
Our approach
Building a bridge to accessible content
We developed a solution that acknowledges a simple truth: PDFs aren't going away overnight. So we created a way for organisations to maintain familiar publishing workflows whilst meeting accessibility requirements.
Working within the Financial Reporting Council's Wagtail CMS, we built a converter that uses large language models to transform PDF documents into structured, accessible HTML. It extracts text, tables, and images whilst preserving formatting and document structure. Complex tables become semantic HTML. Images get meaningful alt text. Footnotes and cross-references are maintained.
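To make "complex tables become semantic HTML" concrete, here is a minimal sketch of what that output step might look like. It is an illustration rather than the converter's actual code: the table_to_html helper, its parameters, and the sample data are all hypothetical, and it assumes the extraction step has already produced a caption, a header row, and body rows.

```python
from html import escape


def table_to_html(caption, header, rows):
    """Render extracted table data as semantic HTML.

    Hypothetical helper: caption, header and rows stand in for whatever
    an extraction step produces from the PDF.
    """
    parts = [
        "<table>",
        f"  <caption>{escape(caption)}</caption>",
        "  <thead>",
        "    <tr>",
    ]
    # Column headers use <th scope="col"> so screen readers can announce them.
    for cell in header:
        parts.append(f'      <th scope="col">{escape(cell)}</th>')
    parts += ["    </tr>", "  </thead>", "  <tbody>"]
    for row in rows:
        parts.append("    <tr>")
        # Assumption: the first cell of each row acts as the row header.
        parts.append(f'      <th scope="row">{escape(row[0])}</th>')
        for cell in row[1:]:
            parts.append(f"      <td>{escape(cell)}</td>")
        parts.append("    </tr>")
    parts += ["  </tbody>", "</table>"]
    return "\n".join(parts)


# Made-up sample data, purely for illustration.
html_table = table_to_html(
    "Example table",
    ["Category", "Value"],
    [["R&D", "1"], ["Operations", "2"]],
)
print(html_table)
```

The caption and the scope attributes are what let assistive technology describe the table and read out the right header for each cell. Escaping every cell keeps characters such as "&" from breaking the markup.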
The converter produces editable Markdown that staff can review and refine before publishing. The AI handles the heavy lifting, but editors keep full control over accuracy, which is essential for regulatory documents where precision matters.
For FRC, this meant converting their legacy document library whilst also building the converter directly into their publishing workflow. Staff now upload PDFs as they always have, the system converts them automatically, editors review the result, and content is published in both formats.
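The upload, convert, review, and publish steps can be thought of as a small state machine. The sketch below is one way to model that, not a description of how the FRC build works: the Stage, NEXT_STAGE, and Document names are hypothetical, and it assumes each document moves through the stages strictly in order.

```python
from dataclasses import dataclass, field
from enum import Enum


class Stage(Enum):
    UPLOADED = "uploaded"
    CONVERTED = "converted"
    REVIEWED = "reviewed"
    PUBLISHED = "published"


# Each stage may only move on to the next one, so nothing is published unreviewed.
NEXT_STAGE = {
    Stage.UPLOADED: Stage.CONVERTED,
    Stage.CONVERTED: Stage.REVIEWED,
    Stage.REVIEWED: Stage.PUBLISHED,
}


@dataclass
class Document:
    title: str
    stage: Stage = Stage.UPLOADED
    history: list = field(default_factory=list)

    def advance(self, to):
        """Move to the next stage, rejecting any attempt to skip a step."""
        if NEXT_STAGE.get(self.stage) is not to:
            raise ValueError(f"Cannot move from {self.stage.value} to {to.value}")
        self.history.append(self.stage)
        self.stage = to


doc = Document("Example guidance note")
doc.advance(Stage.CONVERTED)  # automatic conversion after upload
doc.advance(Stage.REVIEWED)   # an editor checks the converted Markdown
doc.advance(Stage.PUBLISHED)  # published alongside the original PDF
print(doc.stage.value)
```

Making the review stage impossible to skip is the point of the design: automation does the conversion, but a person signs off before anything goes live.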
Making compliance sustainable
Converting your existing document library is only half the battle. The real win is creating a sustainable approach for new content.
By integrating conversion into your CMS, you make accessibility automatic rather than bolting on extra steps. Your subject matter experts continue working the way they always have, but what they publish now meets WCAG 2.2 AA standards without manual conversion.
Accessibility shouldn't be a special project that happens once. It needs to be built into how your organisation publishes, so that every new document is accessible from day one.
The outcomes
Beyond compliance
Making public sector information genuinely accessible means more people can engage with the guidance, standards, and regulations that affect their lives and work. Users can access content on their phones, navigate with screen readers, or adjust text to suit their needs without losing functionality.
The approach works for any public sector organisation with a large document library. The same technology that's handling FRC's documents can work for regulatory frameworks, clinical guidelines, or safety regulations. We've also implemented similar solutions for Nesta, which shows the approach works across different types of organisations and content.
For organisations worried about ongoing maintenance costs or the risk of falling behind again, this approach makes accessibility part of your standard workflow. It happens automatically with every publication, rather than being a separate task you need to remember.
Where to start
If you're looking at thousands of inaccessible PDFs and wondering how to tackle the problem, here's what we'd recommend:
Start with your highest-traffic documents. These give you the most impact and let you prove the approach works before committing to bulk conversion.
Integrate conversion into your publishing workflow early. The sooner new content is accessible by default, the less retrofit work you'll need in future.
Keep editorial control. AI is powerful, but subject matter expertise matters. Build in review stages that let your teams verify accuracy.
Think beyond compliance. The real goal is making your content work better for everyone who needs it.
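As a rough illustration of the first recommendation, the sketch below picks the most-viewed documents that together account for a target share of traffic. Everything in it is an assumption for the sake of example: the pick_priority_documents function, the 80% default, and the view counts are invented, and real figures would come from your own analytics.

```python
def pick_priority_documents(views_by_document, target_share=0.8):
    """Return the most-viewed documents that together reach target_share of all views."""
    total = sum(views_by_document.values())
    if total == 0:
        return []
    selected, covered = [], 0
    # Work through documents from most to least viewed.
    ranked = sorted(views_by_document.items(), key=lambda item: item[1], reverse=True)
    for name, views in ranked:
        if covered / total >= target_share:
            break
        selected.append(name)
        covered += views
    return selected


# Made-up view counts, purely for illustration.
example_views = {
    "standard-a.pdf": 600,
    "guidance-b.pdf": 250,
    "report-c.pdf": 100,
    "notice-d.pdf": 50,
}
priority = pick_priority_documents(example_views)
print(priority)
```

With these sample numbers, two of the four documents cover 85% of views, which is the kind of short list that makes a sensible first batch.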