Cloudflare vs Perplexity: The Battle Over AI Web Scraping Heats Up


Reading through Cloudflare’s detailed exposé and the extensive media coverage, the controversy surrounding Perplexity AI’s web scraping practices is deeper — and more polarizing — than it first appears. Cloudflare accuses Perplexity of systematically ignoring website blocks and masking its identity to scrape data from sites that have opted out, raising serious questions about ethics, transparency, and the future of the Internet’s business model.

What Cloudflare Observed

Cloudflare’s report and independent investigations show that Perplexity, an AI startup, allegedly crawls and scrapes content from websites that explicitly signal (through robots.txt and direct blocks) that AI tools are not welcome. The technical evidence includes changing user agents to impersonate browsers like Google Chrome on macOS and rotating Autonomous System Numbers (ASNs) — sophisticated tactics intended to evade detection and blocks. Cloudflare claims it detected this covert scraping across tens of thousands of domains, generating millions of requests daily, and fingerprinted the crawler using machine learning and other network signals.

Why the Accusations Matter

For decades, websites have used robots.txt as a “gentleman’s agreement” to tell bots what’s allowed. Ignoring it is illegal in very few jurisdictions, but the norm among leaders like OpenAI and Anthropic is to respect these signals. Perplexity’s alleged approach undermines this unwritten contract, suggesting a willingness to bypass website owners’ wishes in pursuit of training data.
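
To make the mechanism concrete, here is a minimal sketch using Python’s standard-library robots.txt parser; the publisher URL is hypothetical, and “PerplexityBot” is the crawler’s publicly documented user-agent token:

from urllib.robotparser import RobotFileParser

# Fetch and parse the target site's robots.txt (hypothetical publisher)
rp = RobotFileParser()
rp.set_url("https://example-publisher.com/robots.txt")
rp.read()

# A compliant crawler runs this check before every request it sends
print(rp.can_fetch("PerplexityBot", "https://example-publisher.com/article"))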

This issue exploded just as Cloudflare launched its new “Pay Per Crawl” marketplace, which lets publishers charge for AI bot access and blocks most crawlers by default. Major outlets — The Atlantic, BuzzFeed, Time Inc., and O’Reilly — have signed up, and over 2.5 million websites now disallow AI training outright.

Perplexity Responds

Perplexity’s spokesperson dismissed Cloudflare’s blog post as little more than a “sales pitch,” claiming the screenshots “show that no content was accessed” and denying ownership of the bot in question. Perplexity later argued that much of what Cloudflare saw was user-driven fetching (an AI agent acting on direct user requests) rather than automated crawling — a key distinction in ongoing debates about what “scraping” really means. This is not Perplexity’s first such controversy: the company previously faced plagiarism accusations from outlets like Wired and has struggled to define its own standards for content use.

Divided Reactions & Broader Implications

  • Cloudflare’s stance: Protect publishers’ business models, enforce block signals, and charge for “AI access” to content.
  • Perplexity’s defense: AI web agents, when acting for users, shouldn’t be distinguished from human browsing.
  • Community Debate: Some argue on social platforms that if a user requests a public site via Perplexity, it’s akin to opening it in Firefox. Others counter that this hurts site owners’ ad-driven revenue and control over their data.

The Big Picture: The Internet’s Business Model Is Changing

  • Content monetization is rapidly shifting. Publishers are moving from ads to access fees, and scraping is becoming a pay-to-play market.
  • Transparency and compliance are no longer optional. AI firms face mounting reputational and legal risks if caught evading blocks or misusing content.
  • Data partnerships will define the future. Major AI players are investing in licensing deals with publishers rather than relying on stealth scraping.

Conclusion

Whether Perplexity is being singled out unfairly or genuinely violating web norms, this is a watershed moment. The era of “free data” for AI is ending. Ethics, economics, and new gatekeeping platforms like Cloudflare are pushing a shift toward paid data, greater accountability, and sustainable content partnerships. Unless AI companies adapt, they’ll face locked gates and a fragmented, paywalled Internet — and that ultimately reshapes the foundation of the digital world.



The post Cloudflare vs Perplexity: The Battle Over AI Web Scraping Heats Up appeared first on MarkTechPost.

A Developer’s Guide to OpenAI’s GPT-5 Model Capabilities


In this tutorial, we’ll explore the new capabilities introduced in OpenAI’s latest model, GPT-5. The update brings several powerful features, including the Verbosity parameter, Free-form Function Calling, Context-Free Grammar (CFG), and Minimal Reasoning. We’ll look at what they do and how to use them in practice.

Installing the libraries

!pip install pandas openai

To get an OpenAI API key, visit https://platform.openai.com/settings/organization/api-keys and generate a new key. If you’re a new user, you may need to add billing details and make a minimum payment of $5 to activate API access.

import os
from getpass import getpass
os.environ['OPENAI_API_KEY'] = getpass('Enter OpenAI API Key: ')

Verbosity Parameter

The Verbosity parameter lets you control how detailed the model’s replies are without changing your prompt.

  • low → Short and concise, minimal extra text.
  • medium (default) → Balanced detail and clarity.
  • high → Very detailed, ideal for explanations, audits, or teaching.
from openai import OpenAI
import pandas as pd
from IPython.display import display

client = OpenAI()

question = "Write a poem about a detective and his first solve"

data = []

for verbosity in ["low", "medium", "high"]:
    response = client.responses.create(
        model="gpt-5-mini",
        input=question,
        text={"verbosity": verbosity}
    )

    # Extract text
    output_text = ""
    for item in response.output:
        if hasattr(item, "content"):
            for content in item.content:
                if hasattr(content, "text"):
                    output_text += content.text

    usage = response.usage
    data.append({
        "Verbosity": verbosity,
        "Sample Output": output_text,
        "Output Tokens": usage.output_tokens
    })
# Create DataFrame
df = pd.DataFrame(data)

# Display nicely with centered headers
pd.set_option('display.max_colwidth', None)
styled_df = df.style.set_table_styles(
    [
        {'selector': 'th', 'props': [('text-align', 'center')]},  # Center column headers
        {'selector': 'td', 'props': [('text-align', 'left')]}     # Left-align table cells
    ]
)

display(styled_df)

The output tokens scale roughly linearly with verbosity: low (731) → medium (1017) → high (1263).

Free-Form Function Calling

Free-form function calling lets GPT-5 send raw text payloads—like Python scripts, SQL queries, or shell commands—directly to your tool, without the JSON formatting used in GPT-4.

This makes it easier to connect GPT-5 to external runtimes such as:

  • Code sandboxes (Python, C++, Java, etc.)
  • SQL databases (outputs raw SQL directly)
  • Shell environments (outputs ready-to-run Bash)
  • Config generators
from openai import OpenAI

client = OpenAI()

response = client.responses.create(
    model="gpt-5-mini",
    input="Please use the code_exec tool to calculate the cube of the number of vowels in the word 'pineapple'",
    text={"format": {"type": "text"}},
    tools=[
        {
            "type": "custom",
            "name": "code_exec",
            "description": "Executes arbitrary python code",
        }
    ]
)
print(response.output[1].input)

This output shows GPT-5 generating raw Python code that counts the vowels in the word pineapple, calculates the cube of that count, and prints both values. Instead of returning a structured JSON object (like GPT-4 typically would for tool calls), GPT-5 delivers plain executable code. This makes it possible to feed the result directly into a Python runtime without extra parsing.
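
As a quick sanity check, the returned payload can be handed straight to an interpreter. Here is a minimal sketch under that assumption, using exec on an isolated namespace as a stand-in for the proper sandboxed runtime you would use in production:

# Raw Python source produced by the custom tool call above
code_payload = response.output[1].input

# Execute in an isolated namespace; a real deployment would use a sandboxed runtime
sandbox_namespace = {}
exec(code_payload, sandbox_namespace)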

Context-Free Grammar (CFG)

A Context-Free Grammar (CFG) is a set of production rules that define valid strings in a language. Each rule rewrites a non-terminal symbol into terminals and/or other non-terminals, without depending on the surrounding context.

CFGs are useful when you want to strictly constrain the model’s output so it always follows the syntax of a programming language, data format, or other structured text — for example, ensuring generated SQL, JSON, or code is always syntactically correct.
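
For intuition, here is a toy grammar written as Lark-style rules inside a Python string, purely for illustration (the example below uses the regex syntax instead; the grammar format for GPT-5 custom tools also accepts Lark-style definitions):

# Toy context-free grammar: two production rules plus a regex terminal
toy_grammar = r"""
start: greeting " " name
greeting: "Hello" | "Hi"
name: /[A-Z][a-z]+/
"""
# Every valid output must derive from `start`:
#   "Hello John" parses; "hey john" does not.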

For comparison, we’ll run the same script using GPT-4 and GPT-5 with an identical CFG to see how both models adhere to the grammar rules and how their outputs differ in accuracy and speed.

from openai import OpenAI
import re

client = OpenAI()

email_regex = r"^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$"

prompt = "Give me a valid email address for John Doe. It can be a dummy email"

# No grammar constraints -- model might give prose or invalid format
response = client.responses.create(
    model="gpt-4o",  # or earlier
    input=prompt
)

output = response.output_text.strip()
print("GPT Output:", output)
print("Valid?", bool(re.match(email_regex, output)))
from openai import OpenAI

client = OpenAI()

email_regex = r"^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$"

prompt = "Give me a valid email address for John Doe. It can be a dummy email"

response = client.responses.create(
    model="gpt-5",  # grammar-constrained model
    input=prompt,
    text={"format": {"type": "text"}},
    tools=[
        {
            "type": "custom",
            "name": "email_grammar",
            "description": "Outputs a valid email address.",
            "format": {
                "type": "grammar",
                "syntax": "regex",
                "definition": email_regex
            }
        }
    ],
    parallel_tool_calls=False
)

print("GPT-5 Output:", response.output[1].input)

This example shows how GPT-5 can adhere more closely to a specified format when using a Context-Free Grammar.

With the same grammar rules, GPT-4 produced extra text around the email address (“Sure, here’s a test email you can use for John Doe: johndoe@example.com”), which makes it invalid according to the strict format requirement.

GPT-5, however, produced exactly john.doe@example.com, matching the grammar and passing validation. This demonstrates GPT-5’s improved ability to follow CFG constraints precisely.

Minimal Reasoning

Minimal reasoning mode runs GPT-5 with very few or no reasoning tokens, reducing latency and delivering a faster time-to-first-token.

It’s ideal for deterministic, lightweight tasks such as:

  • Data extraction
  • Formatting
  • Short rewrites
  • Simple classification

Because the model skips most intermediate reasoning steps, responses are quick and concise. If not specified, the reasoning effort defaults to medium.

import time
from openai import OpenAI

client = OpenAI()

prompt = "Classify the given number as odd or even. Return one word only."

start_time = time.time()  # Start timer

response = client.responses.create(
    model="gpt-5",
    input=[
        { "role": "developer", "content": prompt },
        { "role": "user", "content": "57" }
    ],
    reasoning={
        "effort": "minimal"  # Faster time-to-first-token
    },
)

latency = time.time() - start_time  # End timer

# Extract model's text output
output_text = ""
for item in response.output:
    if hasattr(item, "content"):
        for content in item.content:
            if hasattr(content, "text"):
                output_text += content.text

print("--------------------------------")
print("Output:", output_text)
print(f"Latency: {latency:.3f} seconds")


The post A Developer’s Guide to OpenAI’s GPT-5 Model Capabilities appeared first on MarkTechPost.

10 Best AI Tools for Sales Teams in 2025


By 2025, 75% of sales teams will rely on AI-native tools. Early adopters are already seeing 20% more revenue with 15% higher conversion rates. The market is racing toward a $37 billion valuation, making 2025 the year AI moves from nice-to-have to must-have.

Why now? AI eliminates the busywork that slows you down. Data entry, prospect research, follow-ups—all handled automatically. You get real-time insights that help close deals faster. Sales reps gain back hours every week, outreach becomes personally relevant at scale, and forecasting accuracy jumps dramatically.

This guide covers ten AI tools driving these results. You’ll see exactly how each platform delivers measurable gains, from route optimization that cuts drive time to email coaching that boosts reply rates. Pricing is included, plus a practical tip you can test immediately with each tool.

Selection criteria & categories

Choosing the right AI sales tool comes down to six key factors that separate real results from marketing hype:

  • Core AI capability
  • Sales-specific impact
  • Ease of deployment
  • Scalability
  • Pricing value
  • 2025 product roadmap strength

Every tool earned its spot by excelling across these pillars. The best AI sales platforms automate data entry, surface predictive insights, and enable conversational AI interactions. Teams using these technologies report productivity gains in the 20-25% range and up to 20% revenue increases.

Category tags like “Best for field sales” or “Best for real-time email coaching” help you skip straight to solutions that solve your biggest pain points. Each tag reflects the primary workflow that tool streamlines—whether that’s route planning, email follow-up, or data enrichment.

Every section includes an implementation tip to help you test value quickly. Try rolling out SPOTIO in one territory or mastering Superhuman shortcuts—these pilot approaches let you benchmark time saved and build a business case before scaling company-wide.

Quick-glance comparison table

You’re short on time and need to find the right AI tool fast. This table shows you exactly what each platform does best and what it costs.

| Tool | Best for category | Starting price | Signature AI feature |
|------|-------------------|----------------|----------------------|
| Superhuman | Lightning-fast, AI-native email workflow | $30 per user / month | Instant reply generation |
| Grammarly | Polished, high-impact sales messaging | Free | Real-time tone detection |
| Coda | Building custom sales workflows & dashboards | Free | AI blocks that automate tasks |
| Apollo.io | All-in-one sales intelligence & engagement | Free | Predictive prospect recommendations |
| Clay | Data enrichment & hyper-personalization | Custom / Contact sales | Automated firmographic enrichment |
| HubSpot Sales Hub | CRM for scaling teams | Free | AI predictive forecasting |
| Gong.io | Revenue intelligence | Custom | AI Composer |
| InsightSquared | AI-driven sales forecasting | Custom / Contact sales | Machine-learning pipeline scoring |
| Regie.ai | Autonomous sales prospecting | $29 per user / month | Generative sequence creation |

Superhuman — Best for Lightning-Fast, AI-Native Email Workflow

You spend hours triaging email when you could be closing deals. Superhuman turns that bottleneck into a competitive edge, helping teams save 4 hours every week, reply 12 hours faster, and handle twice as many emails in the same time. Your inbox feels lighter, and every follow-up lands sooner.

Superhuman AI kicks in the moment you start drafting. It studies past messages to that contact, mirrors your tone, and suggests a response that sounds like you. You’ll see Instant Reply, which will draft three replies at the bottom of the latest message. You can hit Tab to cycle and preview each response. 

Then use Ask AI to surface deal details buried in email conversations. Everything happens inside your inbox, so you never lose momentum switching between tools.

Organizing your email becomes effortless with Split Inbox. Priority conversations like active opportunities and VIP customers float to the top, while noise slips into dedicated sections you can ignore until later. Auto Summarize compresses long threads into a single sentence so you can scan context at a glance, and Auto Reminders nudge you when a prospect hasn’t replied, keeping deals from stalling.

Collaboration moves just as fast with Thread Sharing, allowing you to pull RevOps or legal into a conversation with one click, giving stakeholders full context while you stay in flow. Teams using these features report spending less time in their inboxes, replying to more emails, and experiencing faster response times during pilot programs. Leaders also note shorter deal cycles, attributing this to fewer internal back-and-forths, according to user testimonials and Superhuman’s reports.

Superhuman connects with Salesforce, HubSpot, and popular calendars, enabling automatic email logging, streamlined CRM updates, and integrated scheduling. At $30 per seat per month, it fits into most sales tech budgets without heavy implementation work.

Implementation tip: Book a 45-minute shortcut workshop for your team. Mastering keyboard commands is the fastest route to Inbox Zero, and reps often reclaim a full hour the very first day.


Pros

  • Inboxes feel 10× lighter; reps fly through email
  • AI-native writing that matches your voice, no manual training required
  • Split Inbox, Ask AI, and Auto Summarize remove busywork
  • Seamless CRM sync and Thread Sharing improve team collaboration

Cons

  • Premium price may require budget approval for large teams
  • Keyboard-first workflow can feel unfamiliar until shortcut training is complete

Grammarly — Best for Polished, High-Impact Sales Messaging

Your prospects decide in seconds whether to trust your outreach. Grammarly’s AI tools make those seconds count by catching errors, sharpening language, and flagging tone issues before you hit send.

The platform scans each sentence as you type, highlighting wordiness, passive voice, and off-brand phrasing in real time. Tone detection alerts you when messages sound too formal or not formal enough, while delivery suggestions cut filler that slows readers down. The result? Concise, on-brand copy that keeps prospects engaged.

Grammarly integrates directly into Gmail, Outlook, and LinkedIn so reps never leave their selling screen. Business and Enterprise plans unlock shared style guides—set your preferred terms, tone, and formatting once, and every rep gets inline nudges that enforce brand consistency. This simple switch cuts editing cycles and contributes to improved team productivity.

Pricing stays approachable. Start free with core suggestions, then move to Business at $12 per seat monthly for advanced tone checks, analytics, and the crucial style guide. Enterprise tiers add single sign-on and deeper reporting.

Implementation tip: Draft your style guide before rollout. Define your tone (confident, friendly), ban specific phrases, and set preferred calls to action. Activate the guide on day one so Grammarly enforces it automatically, freeing leaders from constant copy reviews.

Pros

  • Real-time grammar, clarity, and tone coaching
  • Shared style guide that guarantees brand voice
  • Integrations with primary email, CRM, and chat platforms

Cons

  • Limited insight into message performance beyond writing quality
  • Free tier omits team analytics and centralized style controls

Grammarly turns polished writing into a competitive edge, giving your team the confidence and consistency to close more deals with fewer edits.

Coda — Best for Building Custom Sales Workflows & Dashboards

Picture a single doc that thinks like a database, talks to your CRM, and automates half the busywork that keeps you from selling. That’s Coda. By mixing flexible pages with relational tables, it turns scattered spreadsheets, notes, and playbooks into one living workspace that adapts to the way you sell.

The magic starts with Packs, which let you pull real-time opportunity data from Salesforce, sync calendars, or post deal updates to Slack with just a few clicks. Need to tweak the flow? Just drag a column, add a formula, and the entire dashboard updates instantly. When Coda’s AI blocks join the party, meeting notes summarize themselves and surface next-step suggestions, part of the workflow automation trend reshaping sales.

Sales teams use Coda to run live quota dashboards that refresh the moment pipeline data changes, build interactive playbooks so every rep follows the same discovery path and objection handling tips, and track pipeline health in one view that rolls up weighted forecasts, renewal dates, and at-risk deals.

Pricing stays straightforward: the Free plan covers smaller teams, while Pro and Team tiers run about $10–$30 per person each month. Enterprise plans include advanced governance and priority support.

Implementation tip: import a sample of your CRM data first, then set up automations that trigger follow-up tasks, send reminder emails, or update opportunity stages. This quick win shows reps how Coda removes manual data entry and builds momentum for broader adoption.

Pros

  • Extreme customization lets you shape workflows instead of bending to a preset template
  • Packs ecosystem connects to email, calendars, analytics, and every major CRM, creating a unified source of truth
  • Built-in AI blocks turn raw notes into action items, saving prep and recap time

Cons

  • The blank-canvas approach can overwhelm new customers who prefer a strict structure
  • Building a fully automated workspace may take several days of tinkering before everything runs smoothly

Apollo.io — Best for All-in-One AI-Powered Sales Intelligence & Engagement

Apollo.io combines prospecting, engagement, and deal intelligence into one AI-native platform. You spend less time hunting for leads and more time closing deals. The platform starts with a 275-million-contact database, then uses algorithms to find companies that match your ideal customer profile. It ranks each prospect by real-time intent signals like website visits, hiring trends, and tech-stack changes, so you know exactly who’s ready to buy.

Once you identify a prospect, Apollo.io tracks every email, call, and meeting. Engagement scores update automatically, giving you live feedback on interest levels without manual data entry. Add automated A/B testing for subject lines and send times, and you get measurable results that contribute to the improved performance many AI-equipped teams report.

Pricing stays reasonable with a free Starter tier covering basic search and outreach. The Growth plan costs $39 per user monthly and unlocks advanced intent filters, automated sequences, and deeper CRM integrations. Enterprise options scale further with custom enrichment and governance controls.

Implementation tip: turn on buyer-intent filters during week one. Sort prospects by “high intent” and route them to your top closers. Most teams see immediate improvements in close rates after implementing this filter.

Pros

  • Comprehensive data with AI-native enrichment and scoring
  • Continuous engagement tracking that updates your CRM automatically
  • Native dialer and email sequencer keep outreach in one workspace

Cons

  • Data coverage can be limited for niche international markets
  • Feature breadth may feel overwhelming until workflows are optimized

Apollo.io’s combination of data depth, intent analytics, and automated outreach makes it ideal when you want one platform to find, qualify, and engage the right buyers without juggling multiple tools.

Clay — Best for Data Enrichment & Hyper-Personalization

Clay turns scattered prospect data into complete profiles so you can talk to every lead like you’ve known them for years. It automatically pulls company details, tech stacks, funding news, and hiring trends from dozens of sources, then syncs everything to your CRM instantly. No more jumping between LinkedIn, Crunchbase, and half-empty contact records.

This rich data fuels truly personal outreach at scale. Teams using similar data enrichment see improved results because every email references specific pain points and growth signals. Your sequences become conversations, not cold pitches.

Clay uses credit-based pricing, allowing you to buy credits monthly and spend them only when enriching records. Costs scale with your volume, not team size.

Implementation tip: Connect Clay to your email sequencer and map enrichment fields like “recent funding” or “tech stack” directly into email variables. Your sequences update automatically as Clay refreshes data, keeping every touchpoint current.

Pros

  • Pulls firmographic and technographic data from dozens of sources automatically
  • AI-native segmentation enables true one-to-one personalization at scale
  • Credit pricing fits both small pilots and enterprise rollouts
  • Direct CRM and sequencer integrations eliminate manual data entry

Cons

  • Initial setup requires mapping fields and workflows
  • Credit costs add up quickly without proper usage guardrails

HubSpot Sales Hub — Best CRM for Scaling Teams with Complex Needs

Scaling a sales org brings sprawling pipelines, extra handoffs, and high-stakes forecasting. HubSpot Sales Hub meets that complexity with an AI-native stack that connects your entire revenue operation in one Smart CRM.

Predictive forecasting crunches historical, behavioral, and intent data to project revenue in real time. Deal health signals flag stalled opportunities while AI Playbooks surface next-best actions inside each record, so reps stay focused on deals that can close this quarter. Smart CRM keeps every interaction synced across marketing, service, and finance, ending the copy-and-paste cycle that slows growing teams.

The impact shows up quickly as teams report significant productivity increases and shorter sales cycles after rollout. With most sales organizations expected to use AI tools by 2025, joining that majority now protects pipeline accuracy and frees leaders to coach rather than chase data.

Pricing starts with a free plan for core contact management, then moves to Starter, Professional, and Enterprise tiers as automation, forecasting, and reporting sophistication expand. Each paid tier includes expanded support options and access to a marketplace of native integrations for dialers, quoting apps, and BI platforms. Actual pricing and support offerings may vary; consult HubSpot’s website for current details.

Implementation works best when you begin with HubSpot’s free CRM. Import a single region or business unit first, switch on predictive forecasting, and compare win rates against teams still living in spreadsheets. Early wins build momentum and make the business case for a full upgrade clear.

Pros

  • Smart CRM unifies data across revenue teams
  • Predictive forecasting and AI Playbooks boost accuracy and focus
  • Tiered pricing grows with headcount

Cons

  • Enterprise analytics and AI features sit behind higher tiers
  • Large migrations can demand extensive data cleaning

HubSpot Sales Hub turns sprawling processes into one connected motion, giving scaling teams the visibility and precision needed to hit aggressive targets.

Gong.io — Best for Revenue Intelligence & Conversation Analytics

Your revenue team makes critical decisions based on incomplete data. Gong changes that entirely by capturing and analyzing every customer interaction (calls, emails, meetings) then surfaces the insights that actually close deals.

The platform’s AI analyzes conversations for buying signals, competitive mentions, and deal risks you’d otherwise miss. It automatically updates your CRM with accurate data from actual conversations, ending the guesswork in pipeline reviews. Sales managers get instant visibility into which deals need attention and why.

Gong’s pricing reflects its enterprise focus: platform fees start at $5,000 annually, plus $1,360-$1,600 per user. Implementation typically takes 3-6 months with professional services ranging from $7,500-$30,000.

Implementation tip: Start with your top-performing team. Use their conversation data to identify winning patterns, then scale those insights across the organization. Most teams see 16% win rate improvements within the first quarter.

Pros

  • 40+ proprietary AI models trained on billions of sales conversations
  • Automatic CRM updates from actual customer interactions
  • 250+ integrations create comprehensive revenue insights

Cons

  • High platform fees challenge smaller team economics
  • Steep learning curve requires dedicated training resources

InsightSquared — Best for AI-Driven Sales Forecasting

If you need clear visibility into next quarter’s numbers, InsightSquared delivers. The platform applies machine-learning scoring to every deal, then rolls those scores into dynamic projections you can trust. You gain the confidence to invest, hire, or pivot before it’s too late.

InsightSquared ingests historical performance, activity data, and intent signals to predict pipeline health and quota attainment. As deals progress, the AI model flags slippage risks, recommends next steps, and recalculates the forecast in real time. You move from spreadsheet guesswork to a living forecast that sharpens every time your team sends an email or logs a call.

Pricing is custom, reflecting the depth of analytics and the level of support required. Most teams begin with an initial assessment that maps data sources, business rules, and reporting needs.

Implementation tip: connect InsightSquared directly to your CRM and communication tools so the model trains on clean, up-to-date information. Continuous data sync tightens projections and prevents the “garbage in, garbage out” problem that plagues legacy reporting.

Pros

  • Pipeline scores update automatically, eliminating manual roll-ups
  • Granular dashboards highlight win/loss patterns and coach reps on next-best actions
  • Historical data replays show how pipeline health changed over time, helping you course-correct earlier

Cons

  • Custom deployment demands solid data hygiene and cross-team alignment
  • Advanced analytics can overwhelm newcomers without dedicated enablement

With an accurate, always-on forecast, you stop managing by instinct and start steering growth with data. InsightSquared turns every deal update into a sharper prediction, freeing you to focus on strategy instead of spreadsheet gymnastics.

Regie.ai — Best for Autonomous Sales Prospecting

Prospecting eats up your entire day before you know it. Regie.ai changes that completely by drafting, scheduling, and fine-tuning entire outbound sequences for you, learning from every email you send.

The platform pulls your buyer personas, product messaging, and past wins to write multi-touch sequences that sound human. It pushes them straight to your email or sales tool and helps optimize subject lines, calls to action, and send times. Users can adjust and improve messaging, but the system does not automatically rewrite content based on built-in tests.

Pricing for Regie.ai varies by plan and team size. For the most accurate rates, teams should check the official pricing page or contact their sales team. One license can replace hours of manual copywriting and spreadsheet testing.

Implementation tip: Start with at least two subject lines and two email bodies per step. Let Regie.ai run those four versions for two weeks, then keep the winners and try new ideas in the next cycle. This keeps your open and reply rates climbing.

Pros

  • Generates complete multi-touch sequences that sound human
  • Guided testing workflow that surfaces winning subject lines and send times
  • Works with your existing CRM and email tools

Cons

  • Needs good input data and brand guidelines to match your voice
  • You should still review emails before important sends
  • Costs grow as you add more team members

How to Choose the Right AI Sales Tool for Your Team

Start by identifying what’s actually slowing down your sales process. Are your reps drowning in data entry? Is your forecast accuracy all over the place? Does your outreach sound like it came from a template factory? Once you know your biggest pain points, you can focus on AI tools built to solve those specific problems.

Next, get clear on what success looks like. Companies using AI sales tools typically see significant revenue increases, higher conversion rates, and productivity gains as reps spend less time on busywork. Set your baseline metrics now so you can measure real impact later.

When evaluating tools, focus on five key factors: core AI capability, ease of setup, how well it scales with your team, pricing that makes sense, and whether the company has a solid roadmap for 2025.

Start small with your highest-impact opportunities. Run a four-week pilot with a few reps and track everything. If your forecast accuracy starts hitting the benchmarks that top-performing teams achieve, you’ve found a winner.

Think about integration from day one. Choose platforms that connect natively to your CRM so you don’t create data quality headaches. If your current data is messy, budget time for cleanup first. Clean data creates better AI recommendations.

Your team will only use tools that actually make their lives easier. Look for intuitive interfaces and solid training programs. Find your early adopters who can help coach others through the transition.

Finally, think beyond the sticker price. That “free” tool might get expensive fast as you add users. Conversely, a pricier solution could replace three separate tools and save money overall. Rank your options by impact per dollar spent.

With most sales teams expected to use AI tools by 2025, choosing the right stack today sets you up to outpace the competition tomorrow. Focus on solving your biggest problems first, then build from there.

Key takeaways & next steps

Even adopting a few of these tools delivers measurable results. Teams using AI report significant improvements in performance, with automated data capture, predictive insights, and personalized content eliminating hours of admin work while accelerating deal cycles. You spend more time on conversations that close deals.

Your next step is straightforward: identify the biggest bottlenecks in your pipeline, match them to the tools designed to solve those specific problems, pilot one or two high-impact solutions, and scale what delivers the fastest results. Save this guide and revisit it each quarter—2025 brings new features, better integrations, and more ways to turn AI into revenue.

Superhuman Blog

A Code Implementation to Build a Multi-Agent Research System with OpenAI Agents, Function Tools, Handoffs, and Session Memory


In this tutorial, we begin by showcasing the power of OpenAI Agents as the driving force behind our multi-agent research system. We set up our Colab environment with the OpenAI API key, install the OpenAI Agents SDK, and then define custom function tools, web_search, analyze_data, and save_research, to harness the agents’ capabilities. We instantiate three specialized OpenAI Agents (Research Specialist, Data Analyst, and Research Coordinator), each with clear, role-specific instructions and tool access. We demonstrate how these agents collaborate asynchronously and synchronously, maintain session memory for continuity, and allow rapid experimentation through helper functions.

!pip install openai-agents python-dotenv


import asyncio
import json
from datetime import datetime
from agents import Agent, Runner, function_tool, SQLiteSession
import os


os.environ['OPENAI_API_KEY'] = 'Use Your Own API Key'

We install openai-agents and python-dotenv, then import asyncio, json, datetime, and the core SDK primitives (Agent, Runner, function_tool, SQLiteSession). We set OPENAI_API_KEY in the environment so we can immediately run our agents in this runtime.

@function_tool
def web_search(query: str, max_results: int = 3) -> str:
   """Simulate web search results for demonstration"""
   results = [
       f"Result 1 for '{query}': Latest findings show significant developments...",
       f"Result 2 for '{query}': Research indicates new approaches in this field...",
       f"Result 3 for '{query}': Expert analysis suggests important implications..."
   ]
   return f"Search results for '{query}':n" + "n".join(results[:max_results])


@function_tool
def analyze_data(data: str, analysis_type: str = "summary") -> str:
   """Analyze provided data with different analysis types"""
   analyses = {
       "summary": f"Summary: The data contains {len(data.split())} key points with main themes around innovation and efficiency.",
       "detailed": f"Detailed Analysis: Breaking down the {len(data)} characters of data reveals patterns in methodology and conclusions.",
       "trends": f"Trend Analysis: Current data suggests upward trajectory with 3 major inflection points identified."
   }
   return analyses.get(analysis_type, "Analysis complete: Standard evaluation performed.")


@function_tool
def save_research(title: str, content: str, category: str = "general") -> str:
   """Save research findings to a structured format"""
   timestamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
   research_entry = {
       "title": title,
       "content": content,
       "category": category,
       "timestamp": timestamp,
       "id": f"research_{len(content) % 1000}"
   }
   return f"✅ Research saved: '{title}' in category '{category}' at {timestamp}"

We define three function tools for our agents: web_search simulates quick results, analyze_data returns summary/detailed/trend insights, and save_research stores findings with a timestamped ID. We use them to gather signals, turn text into insights, and persist outputs for later steps.

research_agent = Agent(
   name="Research Specialist",
   instructions="""You are an expert researcher who:
   - Conducts thorough web searches on any topic
   - Analyzes information critically and objectively
   - Identifies key insights and patterns
   - Always uses tools to gather and analyze data before responding""",
   tools=[web_search, analyze_data]
)


analyst_agent = Agent(
   name="Data Analyst",
   instructions="""You are a senior data analyst who:
   - Takes research findings and performs deep analysis
   - Identifies trends, patterns, and actionable insights
   - Creates structured summaries and recommendations
   - Uses analysis tools to enhance understanding""",
   tools=[analyze_data, save_research]
)


coordinator_agent = Agent(
   name="Research Coordinator",
   instructions="""You are a research coordinator who:
   - Manages multi-step research projects
   - Delegates tasks to appropriate specialists
   - Synthesizes findings from multiple sources
   - Makes final decisions on research direction
   - Handoff to research_agent for initial data gathering
   - Handoff to analyst_agent for detailed analysis""",
   handoffs=[research_agent, analyst_agent],
   tools=[save_research]
)

We define three OpenAI Agents with clear roles: the Research Specialist gathers and synthesizes information, the Data Analyst deep-dives and saves structured outputs, and the Research Coordinator orchestrates handoffs and final decisions. Together, we delegate, analyze with tools, and produce actionable summaries end-to-end.

async def run_advanced_research_workflow():
   """Demonstrates a complete multi-agent research workflow"""
  
   session = SQLiteSession("research_session_001")
  
   print("🚀 Starting Advanced Multi-Agent Research System")
   print("=" * 60)
  
   research_topic = "artificial intelligence in healthcare 2024"
  
   print(f"n📋 PHASE 1: Initiating research on '{research_topic}'")
   result1 = await Runner.run(
       coordinator_agent,
       f"I need comprehensive research on '{research_topic}'. Please coordinate a full research workflow including data gathering, analysis, and final report generation.",
       session=session
   )
   print(f"Coordinator Response: {result1.final_output}")
  
   print(f"n📊 PHASE 2: Requesting detailed trend analysis")
   result2 = await Runner.run(
       coordinator_agent,
       "Based on the previous research, I need a detailed trend analysis focusing on emerging opportunities and potential challenges. Save the final analysis for future reference.",
       session=session
   )
   print(f"Analysis Response: {result2.final_output}")
  
   print(f"n🔬 PHASE 3: Direct specialist analysis")
   result3 = await Runner.run(
       analyst_agent,
       "Perform a detailed analysis of the healthcare AI market, focusing on regulatory challenges and market opportunities. Categorize this as 'market_analysis'.",
       session=session
   )
   print(f"Specialist Response: {result3.final_output}")
  
   print("n✅ Research workflow completed successfully!")
   return result1, result2, result3


async def run_focused_analysis():
   """Shows focused single-agent capabilities"""
  
   print("n🎯 FOCUSED ANALYSIS DEMO")
   print("-" * 40)
  
   result = await Runner.run(
       research_agent,
       "Research in quantum computing and analyze the key breakthroughs from 2024.",
       max_turns=5
   )
  
   print(f"Focused Analysis Result: {result.final_output}")
   return result


def quick_research_sync(topic: str):
   """Synchronous research for quick queries"""
  
   print(f"n⚡ QUICK SYNC RESEARCH: {topic}")
   print("-" * 40)
  
   result = Runner.run_sync(
       research_agent,
       f"Quickly research {topic} and provide 3 key insights."
   )
  
   print(f"Quick Result: {result.final_output}")
   return result

We run a full multi-agent workflow with session memory (three phases coordinated by the coordinator and analyst). We perform a focused single-agent analysis with a turn cap, and finally, we trigger a quick synchronous research helper for fast, three-insight summaries.

async def main():
   """Main function demonstrating all capabilities"""
  
   print("🤖 OpenAI Agents SDK - Advanced Tutorial")
   print("Building a Multi-Agent Research System")
   print("=" * 60)
  
   try:
       await run_advanced_research_workflow()
      
       await run_focused_analysis()
      
       quick_research_sync("blockchain adoption in enterprise")
      
       print("n🎉 Tutorial completed successfully!")
       print("nKey Features Demonstrated:")
       print("✅ Multi-agent coordination with handoffs")
       print("✅ Custom function tools")
       print("✅ Session memory for conversation continuity")
       print("✅ Async and sync execution patterns")
       print("✅ Structured workflows with max_turns control")
       print("✅ Specialized agent roles and capabilities")
      
   except Exception as e:
       print(f"❌ Error: {e}")
       print("nTroubleshooting tips:")
       print("- Ensure OPENAI_API_KEY is set correctly")
       print("- Check internet connection")
       print("- Verify openai-agents package is installed")


if __name__ == "__main__":
   import nest_asyncio
   nest_asyncio.apply()
  
   asyncio.run(main())


def create_custom_agent(name: str, role: str, tools_list: list = None):
   """Helper function to create custom agents quickly"""
   return Agent(
       name=name,
       instructions=f"You are a {role} who provides expert assistance.",
       tools=tools_list or []
   )


custom_agent = create_custom_agent("Code Reviewer", "senior software engineer", [analyze_data])
result = Runner.run_sync(custom_agent, "Review this Python code for best practices")


print("n📚 Tutorial Notes:")
print("- Modify research topics and agent instructions to explore different use cases")
print("- Add your own custom tools using the @function_tool decorator")
print("- Experiment with different agent handoff patterns")
print("- Use sessions for multi-turn conversations")
print("- Perfect for Colab - just add your OpenAI API key and run!")

We orchestrate the end-to-end demo with main(), running the multi-agent workflow, a focused analysis, and a quick sync task, while handling errors and logging key features. We also provide a helper to spin up custom agents and show a synchronous “Code Reviewer” example for immediate feedback.
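
To make the session-memory point concrete, here is a minimal sketch (the topic strings are hypothetical) in which two synchronous turns share one SQLiteSession, so the second prompt can refer back to the first:

# Both turns share the same session, so conversational context carries over
followup_session = SQLiteSession("followup_demo")

Runner.run_sync(
   research_agent,
   "Research edge computing and list 2 key trends.",
   session=followup_session
)

# "them" resolves against the previous turn via session memory
followup = Runner.run_sync(
   research_agent,
   "Compare them and pick the more promising one.",
   session=followup_session
)
print(followup.final_output)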

In conclusion, we wrap up the Advanced OpenAI Agents tutorial by highlighting the core strengths of this framework: coordinated multi-agent collaboration, extensible custom tools, persistent session memory, and flexible execution modes. We encourage you to expand on these foundations by adding new tools, crafting custom agent roles, and experimenting with different handoff strategies. We emphasize that this modular architecture empowers you to build sophisticated AI-driven research pipelines with minimal boilerplate.



The post A Code Implementation to Build a Multi-Agent Research System with OpenAI Agents, Function Tools, Handoffs, and Session Memory appeared first on MarkTechPost.

Meta CLIP 2: The First Contrastive Language-Image Pre-training (CLIP) Trained with Worldwide Image-Text Pairs from Scratch


Contrastive Language-Image Pre-training (CLIP) has become important for modern vision and multimodal models, enabling applications such as zero-shot image classification and serving as vision encoders in MLLMs. However, most CLIP variants, including Meta CLIP, are limited to English-only data curation, ignoring a significant amount of non-English content from the worldwide web. Scaling CLIP to include multilingual data has two challenges: (a) the lack of an efficient method to curate non-English data at scale and (b) the decline of English performance when adding multilingual data, also known as the curse of multilinguality. These issues hinder the development of unified models optimized for both English and non-English tasks.
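
For background, zero-shot classification with any CLIP-style model scores an image against candidate text prompts in a shared embedding space. Here is a minimal sketch using the public OpenAI CLIP weights via Hugging Face transformers (the image path and labels are hypothetical; a Meta CLIP 2 checkpoint would slot in the same way):

from PIL import Image
from transformers import CLIPModel, CLIPProcessor

# Public CLIP weights, shown purely for illustration
model = CLIPModel.from_pretrained("openai/clip-vit-large-patch14")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-large-patch14")

image = Image.open("photo.jpg")  # hypothetical local image
labels = ["a photo of a cat", "a photo of a dog", "a photo of a bird"]

inputs = processor(text=labels, images=image, return_tensors="pt", padding=True)
logits = model(**inputs).logits_per_image  # image-text similarity scores
print(logits.softmax(dim=-1))  # zero-shot label probabilities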

Methods like OpenAI CLIP and Meta CLIP depend on English-centric curation, and distillation-based approaches introduce biases from external teacher models. SigLIP and SigLIP 2 attempt to utilize data from Google Image Search, but their dependency on proprietary sources limits scalability. Multilingual CLIP models, such as M-CLIP and mCLIP, adopt distillation techniques, using English-only CLIP as a vision encoder and training multilingual text encoders with low-quality data. Moreover, hybrid methods such as SLIP and LiT combine language supervision with self-supervised learning (SSL) for balancing semantic alignment and visual representation. Despite these efforts, none of the methods has resolved the core issues.

Researchers from Meta, MIT, Princeton University, and New York University have proposed Meta CLIP 2, the first method to train CLIP models from scratch using native worldwide image-text pairs without relying on external resources like private data, machine translation, or distillation. It removes the performance trade-offs between English and non-English data by designing and jointly scaling metadata, data curation, model capacity, and training. Meta CLIP 2 maximizes compatibility with OpenAI CLIP’s architecture, ensuring generalizability to CLIP and its variants. Moreover, its recipe introduces three innovations for scaling to worldwide data: (a) scalable metadata across 300+ languages, (b) a per-language curation algorithm for balanced concept distribution, and (c) an advanced training framework.

To address the first challenge, researchers used globally curated data, and to tackle the second, they developed a worldwide CLIP training framework. This framework follows OpenAI and Meta CLIP’s training settings and model architecture, including three additions: a multilingual text tokenizer, scaling of seen training pairs, and an analysis of minimal viable model capacity. To ensure generalizability, the training setup uses OpenAI CLIP’s ViT-L/14 and Meta CLIP’s ViT-H/14 models, with modifications for multilingual support. Moreover, studies on the minimal model expressivity reveal that even OpenAI’s ViT-L/14 struggles with the curse due to limited capacity, whereas ViT-H/14 serves as an inflection point, achieving notable gains in both English and non-English tasks.

Meta CLIP 2 outperforms its English-only (1.0×) and non-English (1.3×) counterparts in both English and multilingual tasks when trained on ViT-H/14 with worldwide data and scaled seen pairs. However, the curse persists in non-scaled settings or with smaller models like ViT-L/14. Transitioning from English-centric metadata to worldwide equivalents is essential. For example, removing the English filter on alt-texts leads to a 0.6% drop in ImageNet accuracy, highlighting the role of language isolation. Replacing English metadata with merged worldwide metadata initially lowers English performance but boosts multilingual capabilities. Evaluations on zero-shot classification and few-shot geo-localization benchmarks show that scaling from 13B English to 29B worldwide pairs improves results, except for saturated performance in GeoDE.

In conclusion, researchers introduced Meta CLIP 2, the first CLIP model trained from scratch on worldwide image-text pairs. It shows that scaling metadata, curation, and training capacity can break the “curse of multilinguality”, enabling mutual benefits for English and non-English performance. Meta CLIP 2 (ViT-H/14) outperforms its English-only counterpart on zero-shot ImageNet (80.5% → 81.3%) and excels on multilingual benchmarks such as XM3600, Babel-IN, and CVQA with a single unified model. By open-sourcing its metadata, curation methods, and training code, Meta CLIP 2 enables the research community to move beyond English-centric approaches and embrace the potential of the worldwide multimodal web.


Check out the Paper and GitHub Page.

The post Meta CLIP 2: The First Contrastive Language-Image Pre-training (CLIP) Trained with Worldwide Image-Text Pairs from Scratch appeared first on MarkTechPost.
