PDF to Text API

Accurately Extract Text from Any PDF with Our Advanced OCR-Powered API.

hero banner

Code Examples in Popular Languages

Integrate our PDF to Text API easily into your apps with comprehensive code examples in popular languages to get started quickly.

CURL Request
curl --location 'https://theonlineconverter.com/api/v1/document-converter' \
--header 'Content-Type: application/json' \
--header 'x-api-key: enter_your_api_key' \
--form 'from="pdf"' \
--form 'to="txt"' \
--form 'file=@"/D:/data/Document/pdf/other.pdf"'
JavaScript Fetch
const myHeaders = new Headers();
myHeaders.append("Content-Type", "application/json");
myHeaders.append("x-api-key", "enter_your_api_key");

const formdata = new FormData();
formdata.append("from", "pdf");
formdata.append("to", "txt");
formdata.append("file", fileInput.files[0], "/D:/data/Document/pdf/other.pdf");

const requestOptions = {
  method: "POST",
  headers: myHeaders,
  body: formdata,
  redirect: "follow"
};

fetch("https://theonlineconverter.com/api/v1/document-converter", requestOptions)
  .then((response) => response.text())
  .then((result) => console.log(result))
  .catch((error) => console.error(error));
Ruby Net::HTTP
import requests
import json

url = "https://theonlineconverter.com/api/v1/document-converter"

payload = {'from': 'pdf',
'to': 'txt'}
files=[
  ('file',('other.pdf',open('/D:/data/Document/pdf/other.pdf','rb'),'application/pdf'))
]
headers = {
  'Content-Type': 'application/json',
  'x-api-key': 'enter_your_api_key'
}

response = requests.request("POST", url, headers=headers, data=payload, files=files)

print(response.text)
Python Requests
import requests
import json

url = "https://theonlineconverter.com/api/v1/document-converter"

payload = {'from': 'pdf',
'to': 'txt'}
files=[
  ('file',('other.pdf',open('/D:/data/Document/pdf/other.pdf','rb'),'application/pdf'))
]
headers = {
  'Content-Type': 'application/json',
  'x-api-key': 'enter_your_api_key'
}

response = requests.request("POST", url, headers=headers, data=payload, files=files)

print(response.text)
PHP Guzzle
<?php
$client = new Client();
$headers = [
  'Content-Type' => 'application/json',
  'x-api-key' => 'enter_your_api_key'
];
$options = [
  'multipart' => [
    [
      'name' => 'from',
      'contents' => 'pdf'
    ],
    [
      'name' => 'to',
      'contents' => 'txt'
    ],
    [
      'name' => 'file',
      'contents' => Utils::tryFopen('/D:/data/Document/pdf/other.pdf', 'r'),
      'filename' => '/D:/data/Document/pdf/other.pdf',
      'headers'  => [
        'Content-Type' => '<Content-type header>'
      ]
    ]
]];
$request = new Request('POST', 'https://theonlineconverter.com/api/v1/document-converter', $headers);
$res = $client->sendAsync($request, $options)->wait();
echo $res->getBody();
Java HttpURLConnection
OkHttpClient client = new OkHttpClient().newBuilder()
  .build();
MediaType mediaType = MediaType.parse("application/json");
RequestBody body = new MultipartBody.Builder().setType(MultipartBody.FORM)
  .addFormDataPart("from","pdf")
  .addFormDataPart("to","txt")
  .addFormDataPart("file","/D:/data/Document/pdf/other.pdf",
    RequestBody.create(MediaType.parse("application/octet-stream"),
    new File("/D:/data/Document/pdf/other.pdf")))
  .build();
Request request = new Request.Builder()
  .url("https://theonlineconverter.com/api/v1/document-converter")
  .method("POST", body)
  .addHeader("Content-Type", "application/json")
  .addHeader("x-api-key", "enter_your_api_key")
  .build();
Response response = client.newCall(request).execute();
Go net/http
package main

import (
  "fmt"
  "bytes"
  "mime/multipart"
  "os"
  "path/filepath"
  "net/http"
  "io"
)

func main() {

  url := "https://theonlineconverter.com/api/v1/document-converter"
  method := "POST"

  payload := &bytes.Buffer{}
  writer := multipart.NewWriter(payload)
  _ = writer.WriteField("from", "pdf")
  _ = writer.WriteField("to", "txt")
  file, errFile3 := os.Open("/D:/data/Document/pdf/other.pdf")
  defer file.Close()
  part3,
         errFile3 := writer.CreateFormFile("file",filepath.Base("/D:/data/Document/pdf/other.pdf"))
  _, errFile3 = io.Copy(part3, file)
  if errFile3 != nil {
    fmt.Println(errFile3)
    return
  }
  err := writer.Close()
  if err != nil {
    fmt.Println(err)
    return
  }


  client := &http.Client {
  }
  req, err := http.NewRequest(method, url, payload)

  if err != nil {
    fmt.Println(err)
    return
  }
  req.Header.Add("Content-Type", "application/json")
  req.Header.Add("x-api-key", "enter_your_api_key")

  req.Header.Set("Content-Type", writer.FormDataContentType())
  res, err := client.Do(req)
  if err != nil {
    fmt.Println(err)
    return
  }
  defer res.Body.Close()

  body, err := io.ReadAll(res.Body)
  if err != nil {
    fmt.Println(err)
    return
  }
  fmt.Println(string(body))
}
C# HttpClient
var options = new RestClientOptions("https://theonlineconverter.com")
{
  MaxTimeout = -1,
};
var client = new RestClient(options);
var request = new RestRequest("/api/v1/document-converter", Method.Post);
request.AddHeader("Content-Type", "application/json");
request.AddHeader("x-api-key", "enter_your_api_key");
request.AlwaysMultipartFormData = true;
request.AddParameter("from", "pdf");
request.AddParameter("to", "txt");
request.AddFile("file", "/D:/data/Document/pdf/other.pdf");
RestResponse response = await client.ExecuteAsync(request);
Console.WriteLine(response.Content);

Key Features & Capabilities

Our advanced API provides a comprehensive suite of features for accurate and flexible PDF text extraction.

Simple User Interface

Advanced OCR Engine

Our API integrates a cutting-edge OCR engine, capable of accurately recognizing and extracting text from even scanned or image-based PDF documents, ensuring no content is left behind.

Simple Data Format

Native & Scanned PDF Support

Seamlessly handle both text-selectable (native) PDFs and image-only (scanned) PDFs, thanks to our intelligent content detection and advanced OCR.

Time

High Extraction Accuracy

Benefit from superior accuracy in text extraction, preserving content integrity regardless of the PDF's complexity, fonts, or language.

Secure Data

Table Data Extraction

Intelligently detect and extract tabular data from PDFs, converting it into structured formats that are ready for analysis or database integration.

Universal Access

Multi-Language Recognition

Extract text from PDFs in a multitude of languages, enabling global application development and international document processing.

Freedom

Secure & Privacy Compliant

All PDF documents are processed over secure connections with robust data encryption and strict privacy policies, ensuring your data remains confidential.

Frequently Asked Questions

Find quick answers to common questions about our PDF to Text API to help you get started and optimize your text extraction workflows.

Native PDFs contain selectable text, allowing direct extraction. Scanned PDFs are essentially images, requiring our advanced OCR engine to recognize and convert the image of text into digital characters for extraction. Our API handles both seamlessly.

Yes, you can specify individual pages or a range of pages in your API request, allowing you to extract only the relevant content from a multi-page document.

Our advanced OCR engine is highly trained to handle complex layouts, various fonts, and even mixed content (images, tables, text) within PDFs, resulting in very high accuracy. Performance may vary with extremely poor quality scans.

Absolutely. Our API can intelligently detect tables within PDFs and extract their content into a structured format (e.g., JSON) for easy use.

You can receive the output as clean plain text, structured JSON (which can include metadata like page numbers), or even Markdown.

For privacy and security, uploaded PDFs are processed and then securely deleted from our servers within a short, specified timeframe, typically immediately after successful extraction.