Parse Document

Extract text, metadata, and suggested questions from PDFs, Word docs, spreadsheets, and more. Perfect for content analysis pipelines.

📝
Full Text Extraction
Extract all text content with structure preserved
🔍
OCR Support
Recognize text in scanned documents and images
💡
Smart Suggestions
AI-generated questions based on content
example.js
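// Assumes `fileInput` is an <input type="file"> element on the page.
const formData = new FormData();
formData.append('file', fileInput.files[0]);

// Do not set Content-Type by hand; fetch adds it with the multipart boundary.
const response = await fetch(
  'https://api.skimming.ai/source/v1/api/parse/document',
  {
    method: 'POST',
    headers: {
      'Authorization': 'Bearer YOUR_API_KEY',
    },
    body: formData
  }
);
// Surface HTTP errors instead of trying to read a failed response as a result.
if (!response.ok) {
  throw new Error(`Parse request failed: HTTP ${response.status}`);
}
const data = await response.json();
console.log(data.success.textContent);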
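
The snippet above can be wrapped in a small reusable function. What follows is a minimal sketch, not an official client: `buildParseRequest`, `parseDocument`, and the injectable `fetchImpl` parameter are hypothetical names introduced here. The URL, the `file` field name, the Bearer header, and the `success.textContent` response path are taken from the example above; no other response fields are assumed. It expects a runtime with global `fetch`, `FormData`, and `Blob` (modern browsers or Node.js 18+).

```javascript
// URL taken from the example above.
const PARSE_URL = 'https://api.skimming.ai/source/v1/api/parse/document';

// Build the fetch options for a multipart upload.
// Content-Type is deliberately not set: fetch derives it (including the
// multipart boundary) from the FormData body.
function buildParseRequest(file, apiKey) {
  const formData = new FormData();
  formData.append('file', file);
  return {
    method: 'POST',
    headers: { Authorization: `Bearer ${apiKey}` },
    body: formData,
  };
}

// Upload a file and return the extracted text.
// `fetchImpl` is injectable so the function can be exercised without a network.
async function parseDocument(file, apiKey, fetchImpl = fetch) {
  const response = await fetchImpl(PARSE_URL, buildParseRequest(file, apiKey));
  if (!response.ok) {
    throw new Error(`Parse request failed with HTTP ${response.status}`);
  }
  const data = await response.json();
  // Response path as used in the example above.
  const text = data?.success?.textContent;
  if (typeof text !== 'string') {
    throw new Error('Response did not contain success.textContent');
  }
  return text;
}
```

In a browser this might be called as `await parseDocument(fileInput.files[0], 'YOUR_API_KEY')`. Checking `response.ok` before reading the body keeps authentication or quota failures from surfacing later as confusing "undefined" errors.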

What's in the API

The API covers text extraction, OCR, suggested questions, link and reference extraction, and document metadata.

Intelligent Document Parsing

Advanced text extraction that maintains document structure, headings, and formatting across all major formats.

Built-in OCR Engine

Automatically processes scanned documents and extracts text from images embedded within documents.

AI-Generated Questions

Receive smart suggested questions based on document content to kickstart exploration.

URL & Reference Extraction

Automatically identify and extract all hyperlinks, references, and citations from documents.

Structured Metadata

Get rich metadata including document properties, author info, creation date, and page count.

No Storage Overhead

Content is processed and returned directly in the response, with no file_id to track and no database storage.

Common Use Cases

See how developers are using this API to solve real-world problems.

Content Pipeline

Feed documents into your AI/ML processing pipeline.

Search Indexing

Extract text for full-text search engines.

Data Migration

Convert legacy documents to structured data.

Compliance Scanning

Extract text for compliance and audit tools.
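
For the content pipeline and search indexing cases above, extracted text usually has to be split and tokenized before it is useful downstream. The sketch below works on a plain string, such as the `textContent` value from the example. `chunkText` and `buildIndex` are hypothetical helpers and not part of the API; the chunk size and the tokenization rule are arbitrary choices for illustration.

```javascript
// Split text into chunks of at most `maxChars` characters, breaking on
// whitespace so words stay intact. A single word longer than `maxChars`
// becomes its own oversized chunk.
function chunkText(text, maxChars = 1000) {
  const words = text.split(/\s+/).filter(Boolean);
  const chunks = [];
  let current = '';
  for (const word of words) {
    const candidate = current ? `${current} ${word}` : word;
    if (candidate.length > maxChars && current) {
      chunks.push(current);
      current = word;
    } else {
      current = candidate;
    }
  }
  if (current) chunks.push(current);
  return chunks;
}

// Build a tiny inverted index: lowercase term -> Set of chunk positions.
function buildIndex(chunks) {
  const index = new Map();
  chunks.forEach((chunk, i) => {
    // Terms are runs of Unicode letters and digits; punctuation is dropped.
    for (const term of chunk.toLowerCase().match(/[\p{L}\p{N}]+/gu) ?? []) {
      if (!index.has(term)) index.set(term, new Set());
      index.get(term).add(i);
    }
  });
  return index;
}

const chunks = chunkText('Invoices are due in 30 days. Late invoices incur fees.', 30);
const index = buildIndex(chunks);
console.log(chunks.length, [...index.get('invoices')]);
```

A real search engine or embedding pipeline would replace `buildIndex`, but the same chunk boundaries can be reused as document units.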

Technical Specifications

Everything you need to know to integrate this API.

Endpoint (base URL): /v1/source/v1/api/parse/document

HTTP method (request type): POST

Authentication (security method): Bearer Token (API Key)

Rate limit (request limits): Based on subscription tier
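
Because request limits depend on the subscription tier, a client may want to back off when a limit is reached. The sketch below assumes the API signals this with HTTP 429 and, optionally, a `Retry-After` header given in seconds. Both are standard HTTP conventions, but nothing on this page confirms that the API uses them. `fetchWithRetry` and its options are hypothetical names.

```javascript
// Retry a request when the server answers 429 (Too Many Requests).
// ASSUMPTION: the API uses 429 and, optionally, Retry-After in seconds.
async function fetchWithRetry(url, options, {
  maxRetries = 3,
  baseDelayMs = 500,
  fetchImpl = fetch,
  sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms)),
} = {}) {
  for (let attempt = 0; ; attempt++) {
    const response = await fetchImpl(url, options);
    // Return anything that is not a rate-limit response, or give up
    // and return the last 429 once the retry budget is spent.
    if (response.status !== 429 || attempt >= maxRetries) {
      return response;
    }
    // Prefer the server's hint; otherwise fall back to exponential backoff.
    const retryAfter = Number(response.headers.get('Retry-After'));
    const delayMs = Number.isFinite(retryAfter) && retryAfter > 0
      ? retryAfter * 1000
      : baseDelayMs * 2 ** attempt;
    await sleep(delayMs);
  }
}
```

It takes the same `url` and `options` arguments as `fetch`, so it can stand in for the `fetch` call in the example. The caller still needs to check `response.ok`, since the last 429 is returned rather than thrown.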

Frequently Asked Questions

Didn't find your answer? Contact our support team.

What data is returned?

Is OCR supported?

What's the max file size?