Web Browsing
PromptLoop helps you automatically research thousands of companies to find exactly what you need, whether that is qualifying customers by market segment, monitoring their hiring pages, checking what software they use, or analyzing their services. Get accurate data in minutes instead of hours of manual research.
How it works
Web browsing tasks let you build AI systems that extract specific data from websites. You define the items you are looking for and our models handle the rest: you input a website and get back a row of data corresponding to the search items you defined in the task. You can then run these tasks on a CSV or Excel file with thousands of inputs (websites). It's as simple as defining what you are looking for and uploading a file.
Advanced options let you define exactly the format you need your data in, so you can build a proprietary dataset from scratch in minutes.
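As a rough illustration of the input side, the sketch below prepares a one-column CSV of websites ready for upload. The column name `website` and the helper name `build_input_csv` are assumptions made for this example, not PromptLoop requirements; the task only needs a column containing the relevant inputs.

```python
import csv
import io


def build_input_csv(websites, column="website"):
    """Return CSV text: a header row, then one row per unique, non-empty website."""
    seen = set()
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([column])  # hypothetical column name
    for raw in websites:
        site = raw.strip()
        # Skip blanks and repeats so each input is only listed once.
        if site and site not in seen:
            seen.add(site)
            writer.writerow([site])
    return buffer.getvalue()


csv_text = build_input_csv(["example.com", " example.org ", "", "example.com"])
print(csv_text)
```

Removing blanks and duplicates before uploading keeps the file down to one row per website you actually want researched.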
Types of Tasks
PromptLoop has three types of web browsing tasks:
- Crawl Task - This takes a single website or domain and finds new datapoints (new columns) for it.
- Search Task - This takes a search term and returns the matching link based on search criteria. This is a helpful starting point if you do not have a list of websites yet and need to find appropriate web resources.
- List Task (Legacy) - This takes a single website or domain and returns multiple rows with multiple datapoints for it, for example, pulling out all companies from a customer page.
Creating Web Browsing Tasks
Auto Generating a Task (Recommended)
To start, navigate to the tasks tab and click New Task. This opens a popup where you can describe what you are looking for. Enter the data points you are interested in and any special formatting instructions.

PromptLoop will generate columns for your task based on what you asked for and you can preview and accept the columns.

Templates
There are also many templates available for you to copy and edit. You can find them in the templates tab, or in our templates library.
Running Tests
You can enter an input and run a test right in your account. The test page represents one row of data and is useful for quickly testing formatting and instruction edits. When you are satisfied and don't need further edits, you can upload your spreadsheet with a column that has the relevant inputs (whatever the task requires). Results are cached per version and input, but you can always select the three dots to the right of "Edit Task" and choose "Clear Cache" to clear the cache for a specific version.
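The caching behavior described here can be pictured with a toy model: results are stored under a (version, input) key, and clearing the cache affects one version only. This is an illustrative sketch, not PromptLoop's implementation; `ResultCache`, `fake_run`, and the result shape are all hypothetical.

```python
class ResultCache:
    """Toy in-memory cache keyed by (task version, input)."""

    def __init__(self):
        self._store = {}

    def get_or_run(self, version, task_input, run):
        """Return the cached result, or call run(task_input) and cache it."""
        key = (version, task_input)
        if key not in self._store:
            self._store[key] = run(task_input)
        return self._store[key]

    def clear_version(self, version):
        """Drop cached results for one version; other versions are untouched."""
        self._store = {k: v for k, v in self._store.items() if k[0] != version}


calls = []


def fake_run(task_input):
    """Stand-in for a real test run; records each call so reruns are visible."""
    calls.append(task_input)
    return {"input": task_input, "run": len(calls)}


cache = ResultCache()
first = cache.get_or_run(1, "example.com", fake_run)
second = cache.get_or_run(1, "example.com", fake_run)  # same version and input: cached
other = cache.get_or_run(2, "example.com", fake_run)   # new version: runs again
cache.clear_version(1)
third = cache.get_or_run(1, "example.com", fake_run)   # cache cleared: runs again
```

In this model, editing a task (which creates a new version) triggers a fresh run for the same input, while repeating a test on an unchanged version does not.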

Viewing Data
To run your task on a dataset, follow the guide here. Once your dataset has run on the web task, you can view all results in the datasets tab, where you can search, filter, and save versions.

Editing Web Browsing Tasks
When you go to the edit page of a task, you will see the following options:
Browsing Type
There are three depths for crawl tasks: Single Page, Smart Crawl, and Deep Research Smart Crawl. This setting determines how far into the website our models will navigate when looking for answers to your search queries.

- Single Page will only look at the page you provide as an input. For example, if you input www.promptloop.com/about into a Single Page task, PromptLoop will only search the about page for information. This is useful if you don't want to navigate around a website and want to control where the information is extracted from. For example, if you are looking at listing pages on a marketplace or listing website, Single Page ensures you don't get information from any other listings on the website.
- Smart Crawl is the default for all crawling tasks and uses the items you are looking for to intelligently navigate through the site, very similar to how a human would do it. This depth is sufficient for the majority of website crawl tasks and is optimized to allow enough navigation while also returning answers as quickly as possible.
- Deep Research Smart Crawl is a new crawling option we have released to our enterprise customers. This crawl type enables our models to spend more time and explore more pages to find the answers to your queries wherever they may be on the website. Because it does significantly more crawling, this option will cause the task to run slightly slower.
Which should I use? For information that is located deep in a website, Deep Research is the best option, but if the information you are looking for is always on the homepage, then Smart Crawl will be sufficient and run faster. We encourage customers to start with Smart Crawl and, if you notice that some items are not being retrieved, edit the task to use Deep Research. It takes no time to toggle between the two. If you only want information from a specific page, then Single Page is the best option for you.
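The guidance above amounts to a small decision rule, sketched below. The enum and function names are hypothetical and exist only to make the rule explicit; in practice you pick the option in the task editor.

```python
from enum import Enum


class BrowsingType(Enum):
    SINGLE_PAGE = "Single Page"
    SMART_CRAWL = "Smart Crawl"
    DEEP_RESEARCH = "Deep Research Smart Crawl"


def choose_browsing_type(only_given_page, smart_crawl_missed_items=False):
    """Pick a crawl depth following the rule of thumb in the text."""
    if only_given_page:
        # Information must come from exactly the page provided as input.
        return BrowsingType.SINGLE_PAGE
    if smart_crawl_missed_items:
        # Smart Crawl was tried first and some items were not retrieved.
        return BrowsingType.DEEP_RESEARCH
    # Default: start with Smart Crawl.
    return BrowsingType.SMART_CRAWL


print(choose_browsing_type(only_given_page=False).value)
```

Starting at Smart Crawl and escalating only when items are missing keeps runs fast for the common case.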
Search Items
These are the output columns for each input row of data. For example, if the input is a law firm website, the result will be a row of data with the law firm's description, phone number, and whether or not they offer personal injury services. You can add, remove, or adjust each of these output columns.

- Search Item Name - This will be the title of the column in the output dataset. It should be unique but as simple and descriptive as possible.
- Search Query - This is the query that will be used to find the information you are looking for. It should be formatted as a natural language question (e.g., "What is the description of the law firm?" or "What is the main phone number of the law firm?"). While it can be unique to the task, it shouldn't be specific to an individual business (e.g., "What is the description of the law firm?" is good; "What is the description of the Acme Law Firm?" is not).
- Additional Instructions - These are any additional instructions that will be used to find the information you are looking for. They can be used for unique formatting instructions (e.g., "Only return a description of 10 words or less").
- Source - If checked, we will return the link to the page where the information was found. It will be returned in an additional column with the title being "Current Search Item Name" + "Source".
- Answer Format - This is the format of the information you are looking for. You can choose from a variety of common output formats or switch to one of the other output types (categories, lists, etc.). See below for more details.
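To make the relationship between search items and output columns concrete, here is a minimal sketch that models a search item and derives the column titles for the law firm example. The `SearchItem` class and `output_columns` helper are hypothetical, and both the single space before "Source" and the placement of the source column right after its item are assumptions.

```python
from dataclasses import dataclass


@dataclass
class SearchItem:
    name: str                          # Search Item Name: the output column title
    query: str                         # Search Query: a natural language question
    additional_instructions: str = ""  # optional formatting guidance
    include_source: bool = False       # the Source checkbox
    answer_format: str = "Text"        # e.g. "Text", "Number", "True / False"


def output_columns(items):
    """Return the column titles a set of search items would produce."""
    names = [item.name for item in items]
    if len(set(names)) != len(names):
        raise ValueError("Search Item Names should be unique")
    columns = []
    for item in items:
        columns.append(item.name)
        if item.include_source:
            # Assumed title format: "<Search Item Name> Source"
            columns.append(f"{item.name} Source")
    return columns


items = [
    SearchItem("Description", "What is the description of the law firm?",
               additional_instructions="Only return a description of 10 words or less"),
    SearchItem("Phone", "What is the main phone number of the law firm?",
               include_source=True),
    SearchItem("Personal Injury", "Does the law firm offer personal injury services?",
               answer_format="True / False"),
]
print(output_columns(items))
# ['Description', 'Phone', 'Phone Source', 'Personal Injury']
```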
Answer Formats
Formatting allows you to collect data in exactly the output format that you need. This includes pulling out website links, raw text, or numbers. You should also add additional instructions on top of formatting for more specific preferences.

- Text - This will return the raw text of the information you are looking for.
- Number - This will return a numeric value for the information you are looking for.
- True / False - This will return a boolean value (true or false) based on the information you are looking for.
- Single Category / Multiple Categories - This will return only values from the categories you specify. See below for more details.
  - Single Category - This will return a single category.
  - Multiple Categories - This will return multiple categories in a comma-separated list.
- List - This will return multiple rows of data depending on the number of items found. See below for more details.
- Image Link - This will return a link to the image that answers the query (e.g., "What is the profile image of each partner?").
- Link - This will only ever return a properly formatted link answering the query (e.g., "What is the link to the pricing page?").
- Script - This allows you to specify
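When working with exported results in your own code, the answer format tells you how to interpret each cell. The sketch below is one way to convert cell text into Python values. It assumes that booleans are exported as the text "true" or "false" and that numbers may contain thousands separators; neither detail is confirmed by this page, and `parse_cell` is a hypothetical helper.

```python
def parse_cell(value, answer_format):
    """Convert one exported cell to a Python value based on its answer format."""
    text = value.strip()
    if answer_format == "Number":
        try:
            return float(text.replace(",", ""))
        except ValueError:
            return None  # cell did not contain a number
    if answer_format == "True / False":
        return text.lower() == "true"  # assumed text representation
    if answer_format == "Multiple Categories":
        # Multiple categories come back as a comma-separated list.
        return [part.strip() for part in text.split(",") if part.strip()]
    return text  # Text, Link, and other formats are left as strings


print(parse_cell("Tax, Family Law", "Multiple Categories"))
```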
Categories
This is a useful formatting type to ensure the uniformity of the output. When this is set for a search item then the output data for that column will always be one of the options or 'Not Found'. By ensuring output uniformity across the entire dataset, you can more easily analyze, sort, and compare the data.
When you select categories, you must add at least one category. You can add as many as you need, as well as detailed instructions for how the models should categorize the output.
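Because a category column only ever contains one of your options or 'Not Found', downstream analysis can be very simple. The sketch below counts values in a column and flags anything outside the allowed set; the category names, sample values, and helper name are hypothetical.

```python
from collections import Counter

NOT_FOUND = "Not Found"


def summarize_category_column(values, categories):
    """Count each value and list any that fall outside the allowed set."""
    allowed = set(categories) | {NOT_FOUND}
    unexpected = sorted({v for v in values if v not in allowed})
    return Counter(values), unexpected


# Hypothetical categories and column values for a "Customer Type" search item.
categories = ["B2B", "B2C", "Both"]
column = ["B2B", "Not Found", "B2C", "B2B"]
counts, unexpected = summarize_category_column(column, categories)
print(counts.most_common(1))
print(unexpected)
```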

Lists
Sometimes you may want to extract a list of items from a website. This output will allow you to specify what data you want to get for each item as well as instructions about which items to include or exclude. Similarly to categories, you have to specify at least one list column.
Common use cases for this include:
- Extracting a list of portfolio companies from a VC firm website
- Extracting a list of people from a company website
- Extracting a list of pricing plans from a SaaS website

If you want to generate a list from across the internet, you should use dataset generation instead. List extraction is designed to pull a list of items from within a single website.
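To picture the shape of a list result, the sketch below expands the items found on one input website into multiple rows, one per item, each carrying the list columns you defined. The function name, the `website` key, the sample data, and the use of an empty string for a missing value are assumptions for illustration.

```python
def expand_list_result(input_site, found_items, list_columns):
    """Turn the items found on one website into one output row per item."""
    rows = []
    for item in found_items:
        row = {"website": input_site}
        for column in list_columns:
            # Assumed placeholder for a value that was not found for this item.
            row[column] = item.get(column, "")
        rows.append(row)
    return rows


rows = expand_list_result(
    "example-vc.com",
    [{"Company": "Alpha", "Sector": "Fintech"}, {"Company": "Beta"}],
    ["Company", "Sector"],
)
for row in rows:
    print(row)
```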
Search Engine Tricks
You can also use web search engines instead of websites as a starting point for a task. This opens up a variety of powerful options. There are some nuances that can yield better results when selecting one or more links from a search to return.
Common Examples
- Find a company website from its name - the input is the company name, the output is the website
- Find a LinkedIn profile of a specific person or title - the input is the person's name, the output is the LinkedIn profile
- Find or summarize recent news articles about a topic or company - the input is the topic or company, the output is the news articles
- Find the last selling price of a property - the input is the property address, the output is the last selling price or link to the property listing
Query Options
You can use standard search engine techniques within PromptLoop tasks. Here are a few that can work well.
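One common pair of techniques, assumed here rather than taken from PromptLoop's own list, is quoting a term for an exact-phrase match and restricting results to a domain with the `site:` operator, both of which major search engines generally support. The helper below is a hypothetical sketch for composing such queries; confirm with a test run that a given search task handles them as expected.

```python
def build_search_query(term, site=None, exact=False):
    """Compose a search query with an optional exact phrase and site restriction."""
    query = f'"{term}"' if exact else term
    if site:
        query = f"{query} site:{site}"
    return query


print(build_search_query("Acme Law Firm", site="linkedin.com", exact=True))
# "Acme Law Firm" site:linkedin.com
```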
For help setting up advanced search tasks, get in touch for options to customize and optimize your results.
Why Leverage Our Web Browsing Capability
Accuracy, Visibility, and Flexibility
Our web browsing technology is engineered for speed. It allows you to navigate through multiple web pages in a fraction of the time it would take using traditional methods. This means you can quickly gather the data you need, making the most of your time and resources.
We run optimized resources to find the information you need directly from relevant sources. This allows for up-to-date, proprietary data delivered precisely in your required format. Because we leverage language models to navigate pages and identify relevant information, we can handle thousands of formats from millions of company page types, all at speeds unmatched by alternatives, including human-led research.
Unlike generative AI chat applications, PromptLoop uses models and techniques tuned to deliver accurate and formatted results only from trusted and provided sources. Without a source, we will not generate text for information, providing repeatable and reliable answers.
Scalable Solutions
Whether you're looking to scrape data from a handful of web pages or perform Excel web scraping on a massive scale, our service is designed to scale with your needs. Our robust infrastructure ensures stability and performance, even when your data requirements grow exponentially.
We allow you to upload entire datasets as a CSV or Excel file and enrich them using the Datasets tool. This allows non-technical teams to create tasks to find precise information from company or entity websites and return formatted web scraping results for a bespoke dataset.
Precision and Customization
Our web browsing feature is not just about accessing the web; it's about retrieving data with precision. With it, you can specify the data format you need, whether it's Booleans, direct answers to questions, or other specific data types that standard datasets may not provide.
You can learn more about creating and customizing web research and enrichment tasks with this guide to creating custom tasks.
Industry-Leading Web Scraping AI
A cutting-edge web scraping AI is at the core of our web browsing capability. It navigates the web intelligently, understanding the context and semantics of the content to deliver relevant and accurate results.
Secure and Compliant
We take web scraping seriously, adhering to the best practices and compliance standards. Our web browsing capability is optimized for you, allowing you to focus on the information and sources essential to driving your business decisions, not on setting up the specifics of a research pipeline.
For companies requiring a higher level of data security than we offer as standard to all customers, we can customize our infrastructure to meet your team's requirements. You can learn more by scheduling a call with our team today.
Tips and Tricks
Designing an effective task depends on your end goal. This includes your tolerance for errors, the importance of formatting, and how varied your input sites are.
- Focus - It's best to set up tasks for specific purposes and only ask for the data you need.
- Formatting - PromptLoop models will automatically and strictly format data for you, so select the correct format for each output.
- Testing and Versions - It may take a few quick edits to get a task to work well. This is simple when using the editor, and versions are always saved for future edits.
Leveraging Web Browsing in Custom Tasks
Our Custom Tasks utilize the full potential of our web browsing feature to cater to a wide range of business applications, such as:
- Company data web scraping: Extract detailed company profiles, financial data, or employee information.
- Market research: Gather insights on market trends, customer preferences, or competitive landscapes.
- Product analysis: Compare product features, prices, and availability across different e-commerce platforms.
With our web browsing capability, Custom Tasks can transform the web into a treasure trove of actionable data for your business.
Get Started
Experience our web browsing feature's unparalleled efficiency and precision within Custom Tasks. Start harnessing the power of structured data extraction to make informed decisions and drive your business forward.
Ready to unlock the full potential of the web for your business needs? Request a demo here and see our web browsing capability in action.