Quickstart
Get started with the Extract API by exploring these examples
The Extract API extracts structured data from web pages. This quickstart guide walks through a few examples that show how to turn unstructured web content into structured output.
How it works
Submit a Request
Start by submitting a POST request with the URL of the page you wish to extract data from, along with the specific query and columns you need. This will initiate the extraction process.
Spin Up a Remote Browser
Upon receiving your request, the API spins up a remote browser instance. This browser navigates to the URL you provided, creating a new session for your extraction task.
Navigate and Understand the Webpage
Once the requested page has loaded, the remote browser analyzes it and identifies the data that matches your extraction query.
Extract the Information
Using the details specified in your columns parameter as a schema, the API extracts the relevant information and structures it according to your needs. If the columns parameter is not defined, the agent automatically determines what the output structure should be.
Respond with Desired Output
After the data is extracted and structured, the API returns it in the response to your request, ready to use in your application.
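The request step described above can be sketched in Python with only the standard library. The endpoint, the x-api-key header, and the url, query, and columns fields are taken from the examples in this guide. The function names build_extract_request and run_extract and the timeout value are illustrative assumptions, not part of the API.

```python
import json
import urllib.request

# Endpoint as shown in the cURL examples of this guide.
EXTRACT_ENDPOINT = "https://api.induced.ai/api/v1/extract"


def build_extract_request(api_key, url, query, columns=None):
    """Build (but do not send) a POST request for the Extract API."""
    payload = {"url": url, "query": query}
    # `columns` is optional; when omitted, the API decides the output schema.
    if columns:
        payload["columns"] = columns
    return urllib.request.Request(
        EXTRACT_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def run_extract(api_key, url, query, columns=None, timeout=120):
    """Send the request and return the decoded JSON response.

    The timeout is an assumed value, not a documented limit.
    """
    request = build_extract_request(api_key, url, query, columns)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling run_extract("<your-api-key>", "https://github.com/trending", "list of all trending repositories", "name, author, stars, repository_url") would send the same payload as the first example in the Examples section.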
Examples
Example 1: Extracting Trending Repositories
Goal
Extract a list of trending repositories from GitHub, including their name, author, stars, and repository URL.
Example Input Payload
{
"url": "https://github.com/trending",
"query": "list of all trending repositories",
"columns": "name, author, stars, repository_url"
}
Example Output Payload
{
"data": [
{
"name": "example-repo",
"author": "example-author",
"stars": 1234,
"repository_url": "https://github.com/example-author/example-repo"
},
...
]
}
Example cURL
curl --location 'https://api.induced.ai/api/v1/extract' \
--header 'x-api-key: <your-api-key>' \
--header 'Content-Type: application/json' \
--data '{
"url" : "https://github.com/trending",
"query" : "list of all trending repositories",
"columns" : "name, author, stars, repository_url"
}'
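Once a response shaped like the example output payload arrives, the rows under data can be handled as ordinary dictionaries. The following is a minimal sketch, assuming the response matches that shape; the sample values and the helper name top_repositories are invented for illustration.

```python
import json

# Sample response shaped like the example output payload (values are invented).
sample_response = json.loads("""
{
  "data": [
    {"name": "repo-a", "author": "alice", "stars": 1234,
     "repository_url": "https://github.com/alice/repo-a"},
    {"name": "repo-b", "author": "bob", "stars": 5678,
     "repository_url": "https://github.com/bob/repo-b"}
  ]
}
""")


def top_repositories(response, limit=10):
    """Return up to `limit` rows sorted by star count, highest first."""
    rows = response.get("data", [])
    return sorted(rows, key=lambda row: row.get("stars", 0), reverse=True)[:limit]


for repo in top_repositories(sample_response):
    print(f'{repo["author"]}/{repo["name"]}: {repo["stars"]} stars')
```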
Example 2: Extracting Products Launching Today on Product Hunt
Goal
Extract a list of products launching today on Product Hunt, including their name, tagline, votes, and product URL.
Example Input Payload
{
"url": "https://www.producthunt.com/",
"query": "list of all products launching today"
}
Example Output Payload
{
"data": [
{
"name": "example-product",
"tagline": "example-tagline",
"votes": 123,
"product_url": "https://www.producthunt.com/posts/example-product"
},
...
]
}
Example cURL
curl --location 'https://api.induced.ai/api/v1/extract' \
--header 'x-api-key: <your-api-key>' \
--header 'Content-Type: application/json' \
--data '{
"url" : "https://www.producthunt.com/",
"query" : "list of all products launching today"
}'
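Because this request omits columns, the field names in the response are chosen by the API and may not be known in advance. One way to cope is to read the column names from the returned rows themselves. This is a sketch under that assumption; infer_columns is a hypothetical helper and the sample rows are invented.

```python
def infer_columns(rows):
    """Collect column names in first-seen order across all rows."""
    columns = []
    for row in rows:
        for key in row:
            if key not in columns:
                columns.append(key)
    return columns


# Invented rows shaped like the example output payload above.
sample_rows = [
    {"name": "example-product", "tagline": "example-tagline", "votes": 123},
    {"name": "another-product", "votes": 45,
     "product_url": "https://www.producthunt.com/posts/another-product"},
]

print(infer_columns(sample_rows))
```

Scanning every row, instead of only the first, keeps fields that appear in some rows but not others.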
Example 3: Extracting All Stocks from TradingView
Goal
Extract a list of all the trending US stocks with their name, ticker, and other information.
Example Input Payload
{
"url": "https://www.tradingview.com/stock-screener/",
"query": "list of all stocks"
}
Example Output Payload
{
"data": [
{
"name": "example-stock",
"symbol": "example-stock-ticker",
"market_cap": "example-market-cap",
"price": "example-stock-price"
},
...
]
}
Example cURL
curl --location 'https://api.induced.ai/api/v1/extract' \
--header 'x-api-key: <your-api-key>' \
--header 'Content-Type: application/json' \
--data '{
"url": "https://www.tradingview.com/stock-screener/",
"query": "list of all stocks"
}'
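Tabular results such as a stock list are often easier to use as CSV. The sketch below converts rows shaped like the example output payload into CSV text with Python's standard csv module. The helper name rows_to_csv and the sample values are assumptions for illustration; the column names mirror the example output.

```python
import csv
import io


def rows_to_csv(rows, columns):
    """Serialize a list of row dictionaries to CSV text with a header line."""
    buffer = io.StringIO()
    # Missing fields become empty cells; fields not in `columns` are skipped.
    writer = csv.DictWriter(
        buffer, fieldnames=columns, restval="", extrasaction="ignore"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Invented rows using the field names from the example output payload.
sample_rows = [
    {"name": "Example Corp", "symbol": "EXMP", "market_cap": "1.2B", "price": "10.50"},
    {"name": "Sample Inc", "symbol": "SMPL", "price": "3.25"},
]

print(rows_to_csv(sample_rows, ["name", "symbol", "market_cap", "price"]))
```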