The Extract API lets you extract structured data from web pages. This quickstart guide walks through a few examples that show how to turn unstructured web data into structured formats.

How it works

1

Submit a Request

Start by submitting a POST request with the URL of the page you wish to extract data from, along with a query describing the data you need and, optionally, the columns to return. This initiates the extraction process.

2

Spin Up a Remote Browser

Upon receiving your request, the API spins up a remote browser instance. This browser navigates to the URL you provided, creating a new session for your extraction task.

3

Navigate and Understand the Webpage

Once the requested URL has loaded, the remote browser analyzes the webpage and identifies the data that matches your extraction query.

4

Extract the Information

Using the columns parameter as a schema, the API extracts the relevant information and structures it accordingly. If columns is not defined, the agent automatically determines an appropriate schema for the output.

5

Respond with Desired Output

After the data is extracted and structured, the API responds to your request with the structured output, which you can then use in your applications.
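
The flow above can be sketched in Python using only the standard library. The endpoint and the x-api-key header are taken from the cURL examples in this guide. The function names, the 120-second timeout, and the assumption that the result comes back in the same HTTP response are illustrative choices, not confirmed behavior of the API.

```python
import json
import urllib.request

# Endpoint as it appears in the cURL examples in this guide.
EXTRACT_URL = "https://api.induced.ai/api/v1/extract"


def build_extract_request(api_key, url, query, columns=None):
    """Build (but do not send) a POST request for the Extract API."""
    payload = {"url": url, "query": query}
    # "columns" is optional; when it is omitted, the API decides the schema.
    if columns is not None:
        payload["columns"] = columns
    return urllib.request.Request(
        EXTRACT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def extract(api_key, url, query, columns=None, timeout=120):
    """Send the request and return the decoded JSON body.

    Assumption: the API answers synchronously; the timeout is a guess.
    """
    request = build_extract_request(api_key, url, query, columns)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Hypothetical usage (requires a real API key and network access):
# result = extract("<your-api-key>", "https://github.com/trending",
#                  "list of all trending repositories",
#                  "name, author, stars, repository_url")
```

Keeping request construction separate from sending makes the payload easy to inspect or log before any network call is made.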

Examples

Example 1: Extracting Trending Repositories from GitHub

Goal

Extract a list of trending repositories from GitHub, including their name, author, stars, and repository URL.

Example Input Payload

{
  "url": "https://github.com/trending",
  "query": "list of all trending repositories",
  "columns": "name, author, stars, repository_url"
}

Example Output Payload

{
  "data": [
    {
      "name": "example-repo",
      "author": "example-author",
      "stars": 1234,
      "repository_url": "https://github.com/example-author/example-repo"
    },
    ...
  ]
}

Example cURL

curl --location 'https://api.induced.ai/api/v1/extract' \
--header 'x-api-key: <your-api-key>' \
--header 'Content-Type: application/json' \
--data '{
  "url" : "https://github.com/trending",
  "query" : "list of all trending repositories",
  "columns" : "name, author, stars, repository_url"
}'
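
Once a response shaped like the example output arrives, it can be convenient to load it into typed records. The sketch below assumes the response matches the example output payload for this example. The TrendingRepo class and parse_repos function are hypothetical names, and the string handling for stars is a defensive assumption rather than documented behavior.

```python
from dataclasses import dataclass


@dataclass
class TrendingRepo:
    name: str
    author: str
    stars: int
    repository_url: str


def parse_repos(response):
    """Turn the "data" array of a response into TrendingRepo records."""
    repos = []
    for row in response.get("data", []):
        repos.append(
            TrendingRepo(
                name=row["name"],
                author=row["author"],
                # Assumption: stars may arrive as a string such as "1,234",
                # so strip thousands separators before converting.
                stars=int(str(row["stars"]).replace(",", "")),
                repository_url=row["repository_url"],
            )
        )
    return repos


# Sample data copied from the example output payload above.
sample = {
    "data": [
        {
            "name": "example-repo",
            "author": "example-author",
            "stars": 1234,
            "repository_url": "https://github.com/example-author/example-repo",
        }
    ]
}
repos = parse_repos(sample)
print(repos[0].name, repos[0].stars)  # prints: example-repo 1234
```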

Example 2: Extracting Products Launching Today on Product Hunt

Goal

Extract a list of products launching today on Product Hunt, including their name, tagline, votes, and product URL.

Example Input Payload

{
  "url": "https://www.producthunt.com/",
  "query": "list of all products launching today"
}

Example Output Payload

{
  "data": [
    {
      "name": "example-product",
      "tagline": "example-tagline",
      "votes": 123,
      "product_url": "https://www.producthunt.com/posts/example-product"
    },
    ...
  ]
}

Example cURL

curl --location 'https://api.induced.ai/api/v1/extract' \
--header 'x-api-key: <your-api-key>' \
--header 'Content-Type: application/json' \
--data '{
  "url" : "https://www.producthunt.com/",
  "query" : "list of all products launching today"
}'
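
Because this request omits columns, the field names in the response are chosen by the API. A small helper can report which fields actually came back before downstream code relies on them. This is a minimal sketch: inferred_columns is a hypothetical helper, and the sample row simply mirrors the example output payload above.

```python
def inferred_columns(response):
    """Return the field names found across all rows, in first-seen order."""
    seen = {}
    for row in response.get("data", []):
        for key in row:
            # Dicts preserve insertion order, so this keeps first-seen order.
            seen.setdefault(key, None)
    return list(seen)


# Sample data copied from the example output payload above.
sample = {
    "data": [
        {
            "name": "example-product",
            "tagline": "example-tagline",
            "votes": 123,
            "product_url": "https://www.producthunt.com/posts/example-product",
        }
    ]
}

columns = inferred_columns(sample)
print(columns)  # prints: ['name', 'tagline', 'votes', 'product_url']

# Check for fields the application depends on before using the rows.
required = {"name", "votes"}
missing = required - set(columns)
print(sorted(missing))  # prints: []
```

If the inferred fields vary between runs, passing an explicit columns value, as in Example 1, is the way the guide describes for pinning the schema.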

Example 3: Extracting All Stocks from TradingView

Goal

Extract a list of all the trending US stocks, including their name, ticker, and other information.

Example Input Payload

{
  "url": "https://www.tradingview.com/stock-screener/",
  "query": "list of all stocks"
}

Example Output Payload

{
  "data": [
    {
      "name": "example-stock",
      "symbol": "example-stock-ticker",
      "market_cap": "example-market-cap",
      "price": "example-stock-price"
    },
    ...
  ]
}

Example cURL

curl --location 'https://api.induced.ai/api/v1/extract' \
--header 'x-api-key: <your-api-key>' \
--header 'Content-Type: application/json' \
--data '{
  "url": "https://www.tradingview.com/stock-screener/",
  "query": "list of all stocks"
}'
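
Tabular results like these are often exported for use in a spreadsheet. The sketch below converts a response shaped like the example output payload into CSV text with Python's csv module. The rows_to_csv helper and the chosen field list are illustrative assumptions; with columns omitted, the actual field names returned by the API may differ.

```python
import csv
import io


def rows_to_csv(response, fieldnames):
    """Serialize the "data" rows of a response to CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=fieldnames,
        extrasaction="ignore",  # skip fields that are not in fieldnames
        restval="",             # leave missing fields empty
    )
    writer.writeheader()
    writer.writerows(response.get("data", []))
    return buffer.getvalue()


# Sample row using the placeholder values from the example output payload.
sample = {
    "data": [
        {
            "name": "example-stock",
            "symbol": "example-stock-ticker",
            "market_cap": "example-market-cap",
            "price": "example-stock-price",
        }
    ]
}

csv_text = rows_to_csv(sample, ["name", "symbol", "market_cap", "price"])
print(csv_text)
```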