AI Agent Trends of 2025: A Transformative Landscape


The year 2025 marks a defining moment in the evolution of artificial intelligence, ushering in an era where agentic systems—autonomous AI agents capable of complex reasoning and coordinated action—are transforming enterprise workflows, research, software development, and day-to-day user experiences. This article focuses on five core AI agent trends for 2025: Agentic RAG, Voice Agents, AI Agent Protocols, DeepResearch Agents, and Coding Agents together with Computer-Using Agents (CUA).

1. Agentic RAG: Reasoning-Driven AI Workflows

Agentic Retrieval-Augmented Generation (RAG) stands as the cornerstone use case in 2025 for real-world AI agents. Building on the standard RAG architecture, Agentic RAG introduces goal-driven autonomy, memory, and planning. Here’s how the agentic approach refines classical RAG:

  • Memory & Context Retention: Agents track user queries across sessions, building short-term and long-term memory for seamless context management.
  • Planning & Tool Use: Agents dynamically select retrieval strategies (vector DBs, APIs) and coordinate the right tool for the task.
  • Multi-Step Reasoning: They orchestrate complex workflows—involving dynamic data fetching, prompt optimization, and leveraging diverse sources—before generating responses via LLMs.
  • Accuracy and Adaptability: Enhanced post-generation verification and learning loops improve output quality and domain adaptability, creating systems that can synthesize and reason over vast data sets, not just retrieve answers.
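
To make these refinements concrete, here is a minimal sketch of one agentic RAG turn in Python. It is illustrative only: call_llm and vector_search are hypothetical stand-ins for whatever LLM client and vector store you actually use.

def call_llm(prompt: str) -> str:
    # Placeholder: swap in a real LLM client (Gemini, GPT, Claude, ...).
    return "yes" if "supported by the context" in prompt else "relevant passage"

def vector_search(query: str, k: int = 3) -> list:
    # Placeholder: swap in a real vector-DB or API retrieval call.
    return [f"doc snippet for '{query}'"] * k

def agentic_rag_turn(question: str, memory: list) -> str:
    # Plan: decompose the question into retrieval sub-queries.
    plan = call_llm("Rewrite as 1-3 search queries, one per line: " + question)
    # Retrieve: run each planned query against the chosen tool.
    evidence = []
    for query in plan.splitlines():
        evidence.extend(vector_search(query))
    # Generate: answer from retrieved evidence plus session memory.
    answer = call_llm("Context: " + " ".join(memory + evidence) +
                      " Question: " + question)
    # Verify: post-generation check before committing to the answer.
    verdict = call_llm("Is this answer supported by the context? " + answer)
    memory.append("Q: " + question + " A: " + answer)  # running memory
    return answer if verdict.startswith("yes") else "needs another retrieval pass"

print(agentic_rag_turn("What changed in Agentic RAG in 2025?", memory=[]))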

Enterprise adoption of Agentic RAG is sweeping across sectors, powering smart assistants, search engines, and collaborative platforms that rely on multi-source data retrieval and reasoning.

2. Voice Agents: Natural Language Interfaces

Voice-controlled agents are reaching new heights, seamlessly blending speech-to-text (STT) and text-to-speech (TTS) technologies with agentic reasoning pipelines. These agents interact conversationally with users, retrieve data from diverse sources, and even execute tasks such as placing calls or managing calendars—all through spoken language.

  • Intelligent Telephony: Agents can participate in live phone conversations, interpret natural queries, and deliver informed responses based on enterprise databases.
  • Context-Aware Interaction: Deep integration with agentic workflows ensures voice agents adapt to context, understand intent, and use planning to fulfill spoken tasks beyond simple command-and-response.
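
Conceptually, a voice turn is a three-stage pipeline: transcribe, reason, synthesize. Below is a minimal sketch with all three stages stubbed out; a real system would plug in an STT model such as Whisper, an agentic reasoning loop with tool access, and a neural TTS engine.

def speech_to_text(audio: bytes) -> str:
    # Stub: a real STT model would transcribe the audio here.
    return "what meetings do I have tomorrow"

def run_agent(utterance: str) -> str:
    # Stub: a real agent would detect intent, call calendar/CRM tools,
    # and plan a response instead of returning a canned reply.
    return "You have two meetings tomorrow: 9:00 standup, 14:00 review."

def text_to_speech(text: str) -> bytes:
    # Stub: a real TTS engine would synthesize audio here.
    return text.encode("utf-8")

def handle_voice_turn(audio_in: bytes) -> bytes:
    transcript = speech_to_text(audio_in)   # STT
    reply = run_agent(transcript)           # agentic reasoning + tool use
    return text_to_speech(reply)            # TTS back to the caller

print(handle_voice_turn(b"raw audio frames")[:40])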

3. AI Agent Protocols: Coordination at Scale

With the proliferation of multi-agent systems, open communication protocols are vital. The most prominent ones include:

  • MCP (Model Context Protocol): Shares workflow states, tools, and memory across agents.
  • ACP (Agent Communication Protocol): Enables reliable message exchange, workflow orchestration, context management, and observability.
  • A2A (Agent-to-Agent Protocol): Facilitates seamless, decentralized collaboration and task delegation among agents—even across platform or vendor boundaries.
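
To illustrate what inter-agent messaging involves, here is a toy task-delegation envelope in Python. The field names are hypothetical and do not reproduce the actual MCP, ACP, or A2A wire formats.

import json
import uuid
from dataclasses import asdict, dataclass, field

@dataclass
class AgentMessage:
    # Illustrative envelope only; each real protocol defines its own schema.
    sender: str
    recipient: str
    task: str
    context: dict = field(default_factory=dict)
    message_id: str = field(default_factory=lambda: str(uuid.uuid4()))

msg = AgentMessage(
    sender="support-router",
    recipient="billing-agent",
    task="resolve_ticket",
    context={"ticket_id": "T-1042", "priority": "high"},
)
print(json.dumps(asdict(msg), indent=2))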

These protocols are rapidly adopted to enable scalable, interoperable, and secure agentic ecosystems in the enterprise—supporting everything from customer support to supply chain automation.

4. DeepResearch Agents: Advanced Collaborative Analysis

A new category of agents, DeepResearch Agents, is architected for tackling multi-step research problems. These AI systems aggregate and analyze vast swathes of structured and unstructured information from the web and databases, synthesizing analytical reports and actionable insights.

  • Long-Horizon Planning: Capable of breaking down research tasks into sub-queries, aggregating results, and iteratively refining outputs with reasoned analysis.
  • Multi-Agent Collaboration: Specialized agents—for citation, aggregation, verification—work together to generate thoroughly researched deliverables.
  • Tool Integration: DeepResearch agents leverage APIs, browsers, code execution tools, and context protocols to drive high-depth reports at a speed impossible for human researchers.

Business, science, and finance sectors are rapidly integrating DeepResearch architecture, reshaping how teams approach knowledge-intensive work.

5. Coding Agents & CUA: Autonomous Software Engineering

Coding Agents are revolutionizing application development, debugging, and testing:

  • Code Generation: Agents propose solutions, architect systems, and write code based on abstract queries or specifications.
  • Autonomous Debugging: They diagnose issues, apply fixes, and even run test suites iteratively.
  • Testing & Continuous Integration: Agents manage testing environments, execute test runners, and ensure code quality at scale.

CUAs (Computer-Using Agents) bridge the gap between human-computer interaction and autonomous interfaces. These agents operate desktop sandboxes, manipulate files and data, and use third-party tools—fully automating tasks as a human would.
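
At their core, CUAs run an observe-decide-act loop against a sandboxed desktop. A minimal sketch with stubbed observation and action layers follows; a real CUA would capture screenshots and drive an OS- or browser-automation layer.

def observe_screen() -> str:
    # Stub: a real CUA captures a screenshot and extracts UI state.
    return "login form visible"

def decide_action(observation: str, goal: str) -> str:
    # Stub: an LLM would map (observation, goal) to a concrete UI action.
    return "type_credentials_and_submit" if "login" in observation else "done"

def perform(action: str) -> None:
    # Stub: a real driver would issue mouse/keyboard events in a sandbox.
    print("executing:", action)

def run_cua(goal: str, max_steps: int = 5) -> None:
    for _ in range(max_steps):                # bounded loop for safety
        action = decide_action(observe_screen(), goal)
        if action == "done":
            return
        perform(action)

run_cua("download last month's invoice from the billing portal")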

The Bigger Picture: Autonomous, Collaborative, and Context-Aware AI

The AI agent revolution of 2025 is defined by several key themes:

  • Autonomy: Agents plan and execute complex tasks with minimal human intervention.
  • Collaboration: Robust protocols unlock federated, large-scale coordination between agents and platforms.
  • Memory & Reasoning: Enhanced long-term memory and advanced reasoning deliver higher-quality, more relevant results.
  • Accessibility: Low-code and no-code tools are democratizing agent development, enabling non-technical users to harness agentic AI.

With ongoing innovations, human oversight remains critical. As agents become more capable, establishing boundaries around agent autonomy—and ensuring transparency and safety—are vital for responsible adoption.

In Summary

The agentic AI trends of 2025 are not about single-purpose bots but about sophisticated, task-oriented systems capable of holistic reasoning, collaboration, and learning. These advances are redefining how we work, research, build, and interact with technology.


From 100,000 to Under 500 Labels: How Google AI Cuts LLM Training Data by Orders of Magnitude

Google Research has unveiled a method for fine-tuning large language models (LLMs) that slashes the amount of required training data by up to 10,000x while maintaining or even improving model quality. The approach centers on active learning: expert labeling effort is focused on the most informative examples—the “boundary cases” where model uncertainty peaks.

The Traditional Bottleneck

Fine-tuning LLMs for tasks demanding deep contextual and cultural understanding—like ad content safety or moderation—has typically required massive, high-quality labeled datasets. Most data is benign, meaning that for policy violation detection, only a small fraction of examples matter, driving up the cost and complexity of data curation. Standard methods also struggle to keep up when policies or problematic patterns shift, necessitating expensive retraining.

Google’s Active Learning Breakthrough

How It Works:

  • LLM-as-Scout: The LLM is used to scan a vast corpus (hundreds of billions of examples) and identify cases it’s least certain about.
  • Targeted Expert Labeling: Instead of labeling thousands of random examples, human experts only annotate those borderline, confusing items.
  • Iterative Curation: This process repeats, with each batch of new “problematic” examples informed by the latest model’s confusion points.
  • Rapid Convergence: Models are fine-tuned in multiple rounds, and the iteration continues until the model’s output aligns closely with expert judgment—measured by Cohen’s Kappa, which compares agreement between annotators beyond chance.
Image source: https://research.google/blog/achieving-10000x-training-data-reduction-with-high-fidelity-labels/
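
Here is a minimal sketch of that curation loop, using a generic scikit-learn classifier and synthetic data as stand-ins for the LLM and the real corpus; the batch size and the 0.8 kappa bar are illustrative choices, not Google's published settings.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import cohen_kappa_score

rng = np.random.default_rng(0)
X_pool = rng.normal(size=(5000, 8))                     # toy unlabeled corpus
expert = (X_pool[:, 0] + X_pool[:, 1] > 0).astype(int)  # stand-in expert labels

labeled = list(rng.choice(len(X_pool), 20, replace=False))
model = LogisticRegression()

for round_num in range(1, 11):
    model.fit(X_pool[labeled], expert[labeled])
    # Measure agreement with the "experts" on a fixed slice (Cohen's kappa).
    kappa = cohen_kappa_score(expert[:500], model.predict(X_pool[:500]))
    if kappa > 0.8:                                     # quality bar reached
        break
    # Active step: request labels only for the most uncertain examples.
    uncertainty = np.abs(model.predict_proba(X_pool)[:, 1] - 0.5)
    seen = set(labeled)
    ranked = [i for i in np.argsort(uncertainty) if i not in seen]
    labeled.extend(ranked[:25])                         # experts label these

print(f"rounds={round_num}, labels={len(labeled)}, kappa={kappa:.2f}")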

Impact:

  • Data Needs Plummet: In experiments with Gemini Nano-1 and Nano-2 models, alignment with human experts reached parity or better using 250–450 well-chosen examples rather than ~100,000 random crowdsourced labels—a reduction of three to four orders of magnitude.
  • Model Quality Rises: For more complex tasks and larger models, performance improvements reached 55–65% over baseline, demonstrating more reliable alignment with policy experts.
  • Label Efficiency: For reliable gains using tiny datasets, high label quality was consistently necessary (Cohen’s Kappa > 0.8).

Why It Matters

This approach flips the traditional paradigm. Rather than drowning models in vast pools of noisy, redundant data, it leverages both LLMs’ ability to identify ambiguous cases and the domain expertise of human annotators where their input is most valuable. The benefits are profound:

  • Cost Reduction: Vastly fewer examples to label, dramatically lowering labor and capital expenditure.
  • Faster Updates: The ability to retrain models on a handful of examples makes adaptation to new abuse patterns, policy changes, or domain shifts rapid and feasible.
  • Societal Impact: Enhanced capacity for contextual and cultural understanding increases the safety and reliability of automated systems handling sensitive content.

In Summary

Google’s new methodology enables LLM fine-tuning on complex, evolving tasks with just hundreds (not hundreds of thousands) of targeted, high-fidelity labels—ushering in far leaner, more agile, and cost-effective model development.


Check out the technical article on the Google Research blog.

Graph-R1: An Agentic GraphRAG Framework for Structured, Multi-Turn Reasoning with Reinforcement Learning

Introduction

Large Language Models (LLMs) have set new benchmarks in natural language processing, but their tendency to hallucinate—generating plausible but inaccurate outputs—remains a critical issue for knowledge-intensive applications. Retrieval-Augmented Generation (RAG) frameworks attempt to solve this by incorporating external knowledge into language generation. However, traditional RAG approaches rely on chunk-based retrieval, which limits their ability to represent complex semantic relationships. Entity-relation graph-based RAG methods (GraphRAG) address some structural limitations, but they still face high construction costs, one-shot retrieval inflexibility, and dependence on long-context reasoning and carefully crafted prompts.

Researchers from Nanyang Technological University, National University of Singapore, Beijing Institute of Computer Technology and Application, and Beijing Anzhen Hospital have introduced Graph-R1, an agentic GraphRAG framework powered by end-to-end reinforcement learning.

Image source: https://arxiv.org/pdf/2507.21892v1

Core Innovations of Graph-R1

1. Lightweight Knowledge Hypergraph Construction

Graph-R1 constructs knowledge as a hypergraph, where each knowledge segment is extracted using LLM-driven n-ary relation extraction. This approach encodes richer and more semantically grounded relationships, boosting agentic reasoning capabilities while maintaining manageable cost and computational requirements.

  • Efficiency: Construction takes only 5.69 s and $2.81 per 1,000 tokens (vs. $3.35 for GraphRAG and $4.14 for HyperGraphRAG), while generating semantically rich graphs with 120,499 nodes and 98,073 edges.
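
As a toy illustration of the underlying data structure (not the paper's actual extraction pipeline), each n-ary fact becomes one hyperedge linking all of its participating entities, something a binary entity-relation graph cannot express in a single edge:

from collections import defaultdict

# Each hyperedge ties together ALL entities of one n-ary fact.
hyperedges = {
    "e1": ("aspirin", "treats", "tension headache", "in adults"),
    "e2": ("aspirin", "interacts_with", "warfarin", "bleeding risk"),
}

# Inverted index: entity -> hyperedges it participates in.
entity_index = defaultdict(set)
for edge_id, members in hyperedges.items():
    for entity in members:
        entity_index[entity].add(edge_id)

# Entity-based retrieval: every fact touching the query entity.
print(sorted(entity_index["aspirin"]))   # ['e1', 'e2']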

2. Multi-Turn Agentic Retrieval Process

Graph-R1 models retrieval as a multi-turn interaction loop (“think-retrieve-rethink-generate”), allowing the agent to adaptively query and refine its knowledge path, unlike previous methods that use one-shot retrieval.

  • Dynamic Reasoning: The agent decides at each step whether to continue exploring or terminate with an answer. Entity-based and direct hyperedge retrieval are fused through reciprocal rank aggregation, improving the chances of retrieving the most relevant knowledge.
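
A minimal sketch of this loop, with the policy model and the hypergraph lookup both stubbed; the QUERY/ANSWER convention is illustrative, not the paper's actual action format:

def policy(scratchpad: str) -> str:
    # Stub for the policy LLM: ask for evidence once, then answer.
    if "evidence:" in scratchpad:
        return "ANSWER: aspirin treats tension headache"
    return "QUERY: aspirin"

def graph_retrieve(query: str) -> str:
    # Stub for fused entity + hyperedge retrieval over the hypergraph.
    return "evidence: aspirin treats tension headache (e1)"

def answer(question: str, max_turns: int = 4) -> str:
    scratchpad = question
    for _ in range(max_turns):
        step = policy(scratchpad)                    # think / rethink
        if step.startswith("ANSWER:"):               # agent decides to stop
            return step[len("ANSWER:"):].strip()     # generate
        query = step[len("QUERY:"):].strip()
        scratchpad += "\n" + graph_retrieve(query)   # retrieve
    return "no answer within turn budget"

print(answer("What does aspirin treat?"))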

3. End-to-End Reinforcement Learning Optimization

Graph-R1 uses Group Relative Policy Optimization (GRPO) for end-to-end RL, integrating rewards for format adherence, relevance, and answer correctness. This unified reward guides agents to develop generalizable reasoning strategies tightly aligned with both the knowledge structure and output quality.

  • Outcome-directed reward mechanism: Combines format rewards (structural coherence) and answer rewards (semantic accuracy) for effective optimization, only rewarding answers embedded in structurally valid reasoning trajectories.
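
The group-relative part of GRPO is easy to show in a few lines: sample several responses per prompt, score each with the combined reward, and normalize within the group, so no separate value network is needed. This sketch covers only the advantage computation, and the 50/50 split between format and answer reward is an illustrative assumption:

import numpy as np

def combined_reward(format_ok: bool, answer_score: float) -> float:
    # Answer reward only counts inside a structurally valid trajectory,
    # mirroring "only rewarding answers embedded in valid reasoning".
    if not format_ok:
        return 0.0
    return 0.5 + 0.5 * answer_score      # assumed 50/50 weighting

def grpo_advantages(rewards):
    # Z-score each trajectory's reward within its sampling group.
    r = np.asarray(rewards, dtype=float)
    return (r - r.mean()) / (r.std() + 1e-8)

group = [combined_reward(True, 0.9),
         combined_reward(True, 0.2),
         combined_reward(False, 0.8),    # invalid format: reward is 0
         combined_reward(True, 0.6)]
print(grpo_advantages(group).round(2))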

Key Findings

Benchmarking on RAG QA Tasks

Graph-R1 was evaluated across six standard QA datasets (2WikiMultiHopQA, HotpotQA, Musique, Natural Questions, PopQA, TriviaQA).

Method            Avg. F1 (Qwen2.5-7B)
NaiveGeneration   13.87
StandardRAG       15.89
GraphRAG          24.87
HyperGraphRAG     29.40
Search-R1         46.19
R1-Searcher       42.29
Graph-R1          57.82

  • Graph-R1 achieves up to 57.82 average F1 with Qwen2.5-7B, surpassing all previous baselines by a wide margin. Larger base models amplify its performance gains.

Ablation Analysis

Component ablation demonstrates that removing hypergraph construction, multi-turn reasoning, or RL optimization dramatically reduces performance, validating the necessity of each module within Graph-R1.

Retrieval and Efficiency

  • Graph-R1 retrieval is more concise and effective. It achieves high F1 scores with moderate average content lengths (~1,200–1,500 tokens per exchange) and supports more interaction turns (2.3–2.5 on average), facilitating stable and accurate knowledge extraction.
  • Generation cost is minimal: despite the richer representation, Graph-R1’s response time per query (7.0 s) and per-query cost ($0) outperform graph-based competitors like HyperGraphRAG (9.6 s, $8.76).

Generation Quality

Graph-R1’s generation quality is evaluated across seven dimensions—comprehensiveness, knowledgeability, correctness, relevance, diversity, logical coherence, factuality—and consistently outperforms all RL-based and graph-based baselines, achieving top scores in correctness (86.9), relevance (95.2), and coherence (88.5).

Generalizability

Cross-validation on out-of-distribution (O.O.D.) settings reveals that Graph-R1 maintains robust performance across datasets, with O.O.D./I.I.D. ratios often above 85%, demonstrating strong domain generalization properties.

Theoretical Guarantees

Graph-R1 is supported by information-theoretic analyses:

  • Graph-structured knowledge provides higher information density per retrieval and faster convergence to correct answers compared to chunk-based retrieval.
  • Multi-turn interaction enables the agent to achieve higher retrieval efficiency by dynamically focusing on high-impact graph regions.
  • End-to-end RL optimization bridges graph-structured evidence and language generation, reducing output entropy and error rates.

Algorithmic Workflow (High-Level)

  1. Knowledge Hypergraph Extraction: LLM extracts n-ary relations to build entity and hyperedge sets.
  2. Multi-turn Agentic Reasoning: The agent alternates between reflective thinking, querying, hypergraph retrieval (entity and hyperedge dual paths), and synthesis.
  3. GRPO Optimization: RL policy is updated using sampled trajectories and reward normalization, enforcing structure and answer correctness.

Conclusion

Graph-R1 demonstrates that integrating hypergraph-based knowledge representation, agentic multi-turn reasoning, and end-to-end RL delivers unprecedented gains in factual QA performance, retrieval efficiency, and generation quality, charting the path for next-generation agentic and knowledge-driven LLM systems.


FAQ 1: What is the key innovation of Graph-R1 compared to earlier GraphRAG and RAG systems?

Graph-R1 introduces an agentic framework where retrieval is modeled as a multi-turn interaction rather than a single one-shot process. Its main innovations are:

  • Hypergraph Knowledge Representation: Instead of simple entity-relation graphs or text chunks, Graph-R1 constructs a semantic hypergraph that enables more expressive, n-ary relationships between entities.
  • Multi-Turn Reasoning Loop: The agent operates in repeated cycles of “think–retrieve–rethink–generate” over the hypergraph, dynamically focusing queries rather than retrieving everything at once.
  • End-to-End Reinforcement Learning (RL): The agent is trained with a reward function that simultaneously optimizes for step-wise logical reasoning and final answer correctness, enabling tighter alignment between structured knowledge and natural language answers.

FAQ 2: How does Graph-R1’s retrieval and generation efficiency compare to previous methods?

Graph-R1 is significantly more efficient and effective in both retrieval and answer generation:

  • Lower Construction & Retrieval Cost: For building the knowledge hypergraph, Graph-R1 takes only 5.69 seconds and costs $2.81 per 1,000 tokens (on the 2Wiki dataset), outperforming similar graph-based methods.
  • Faster and Cheaper Generation: Query response times (average 7 seconds per query) and generation costs ($0 per query) are better than prior graph-RAG systems, such as HyperGraphRAG.
  • Conciseness & Robustness: Graph-R1 answers are both more concise (usually 1,200–1,500 tokens) and more accurate due to the multi-turn interaction, with state-of-the-art F1 scores across six QA datasets.

FAQ 3: In which scenarios or domains is the Graph-R1 framework most applicable?

Graph-R1 is ideal for complex knowledge-intensive applications demanding both factual accuracy and reasoning transparency, such as:

  • Healthcare and Medical AI: Where multi-hop reasoning, traceability, and reliability are essential.
  • Legal and Regulatory Domains: That require precise grounded answers and interpretable multi-step reasoning.
  • Enterprise Knowledge Automation: For tasks needing scalable, dynamic querying and retrieval across large document or data corpora.
The model’s architecture also allows for easy adaptation to other fields that benefit from agentic, multi-turn knowledge search anchored in structured representations.

Check out the paper and the project’s GitHub page.

Building an Advanced PaperQA2 Research Agent with Google Gemini for Scientific Literature Analysis

In this tutorial, we walk through building an advanced PaperQA2 AI Agent powered by Google’s Gemini model, designed specifically for scientific literature analysis. We set up the environment in Google Colab/Notebook, configure the Gemini API, and integrate it seamlessly with PaperQA2 to process and query multiple research papers. By the end of the setup, we have an intelligent agent capable of answering complex questions, performing multi-question analyses, and conducting comparative research across papers, all while providing clear answers with evidence from source documents. Check out the Full Codes here.

!pip install "paper-qa>=5" google-generativeai requests pypdf2 -q


import os
import asyncio
import tempfile
import requests
from pathlib import Path
from paperqa import Settings, ask, agent_query
from paperqa.settings import AgentSettings
import google.generativeai as genai


GEMINI_API_KEY = "Use Your Own API Key Here"
os.environ["GEMINI_API_KEY"] = GEMINI_API_KEY


genai.configure(api_key=GEMINI_API_KEY)
print("✅ Gemini API key configured successfully!")

We begin by installing the required libraries, including PaperQA2 and Google’s Generative AI SDK, and then import the necessary modules for our project. We set our Gemini API key as an environment variable and configure it, ensuring the integration is ready for use. Check out the Full Codes here.

def download_sample_papers():
   """Download sample AI/ML research papers for demonstration"""
   papers = {
       "attention_is_all_you_need.pdf": "https://arxiv.org/pdf/1706.03762.pdf",
       "bert_paper.pdf": "https://arxiv.org/pdf/1810.04805.pdf",
       "gpt3_paper.pdf": "https://arxiv.org/pdf/2005.14165.pdf"
   }
  
   papers_dir = Path("sample_papers")
   papers_dir.mkdir(exist_ok=True)
  
   print("📥 Downloading sample research papers...")
   for filename, url in papers.items():
       filepath = papers_dir / filename
       if not filepath.exists():
           try:
               response = requests.get(url, stream=True, timeout=30)
               response.raise_for_status()
               with open(filepath, 'wb') as f:
                   for chunk in response.iter_content(chunk_size=8192):
                       f.write(chunk)
               print(f"✅ Downloaded: {filename}")
           except Exception as e:
               print(f"❌ Failed to download {filename}: {e}")
       else:
           print(f"📄 Already exists: {filename}")
  
   return str(papers_dir)


papers_directory = download_sample_papers()


def create_gemini_settings(paper_dir: str, temperature: float = 0.1):
   """Create optimized settings for PaperQA2 with Gemini models"""
  
   return Settings(
       llm="gemini/gemini-1.5-flash",
       summary_llm="gemini/gemini-1.5-flash",
      
       agent=AgentSettings(
           agent_llm="gemini/gemini-1.5-flash",
           search_count=6, 
           timeout=300.0, 
       ),
      
       embedding="gemini/text-embedding-004",
      
       temperature=temperature,
       paper_directory=paper_dir,
      
       answer=dict(
           evidence_k=8,            
           answer_max_sources=4,      
           evidence_summary_length="about 80 words",
           answer_length="about 150 words, but can be longer",
           max_concurrent_requests=2,
       ),
      
       parsing=dict(
           chunk_size=4000,
           overlap=200,
       ),
      
       verbosity=1,
   )

We download a set of well-known AI/ML research papers for our analysis and store them in a dedicated folder. We then create optimized PaperQA2 settings configured to use Gemini for all LLM and embedding tasks, fine-tuning parameters like search count, evidence retrieval, and parsing for efficient and accurate literature processing. Check out the Full Codes here.

class PaperQAAgent:
   """Advanced AI Agent for scientific literature analysis using PaperQA2"""
  
   def __init__(self, papers_directory: str, temperature: float = 0.1):
       self.settings = create_gemini_settings(papers_directory, temperature)
       self.papers_dir = papers_directory
       print(f"🤖 PaperQA Agent initialized with papers from: {papers_directory}")
      
   async def ask_question(self, question: str, use_agent: bool = True):
       """Ask a question about the research papers"""
       print(f"n❓ Question: {question}")
       print("🔍 Searching through research papers...")
      
       try:
           if use_agent:
               response = await agent_query(query=question, settings=self.settings)
           else:
               response = ask(question, settings=self.settings)
              
           return response
          
       except Exception as e:
           print(f"❌ Error processing question: {e}")
           return None
  
   def display_answer(self, response):
       """Display the answer with formatting"""
       if response is None:
           print("❌ No response received")
           return
          
       print("n" + "="*60)
       print("📋 ANSWER:")
       print("="*60)
      
       answer_text = getattr(response, 'answer', str(response))
       print(f"n{answer_text}")
      
       contexts = getattr(response, 'contexts', getattr(response, 'context', []))
       if contexts:
           print("n" + "-"*40)
           print("📚 SOURCES USED:")
           print("-"*40)
           for i, context in enumerate(contexts[:3], 1):
               context_name = getattr(context, 'name', getattr(context, 'doc', f'Source {i}'))
               context_text = getattr(context, 'text', getattr(context, 'content', str(context)))
               print(f"n{i}. {context_name}")
               print(f"   Text preview: {context_text[:150]}...")
  
   async def multi_question_analysis(self, questions: list):
       """Analyze multiple questions in sequence"""
       results = {}
       for i, question in enumerate(questions, 1):
           print(f"n🔄 Processing question {i}/{len(questions)}")
           response = await self.ask_question(question)
            results[question] = response
          
           if response:
               print(f"✅ Completed: {question[:50]}...")
           else:
               print(f"❌ Failed: {question[:50]}...")
              
       return results
  
   async def comparative_analysis(self, topic: str):
       """Perform comparative analysis across papers"""
       questions = [
           f"What are the key innovations in {topic}?",
           f"What are the limitations of current {topic} approaches?",
           f"What future research directions are suggested for {topic}?",
       ]
      
       print(f"n🔬 Starting comparative analysis on: {topic}")
       return await self.multi_question_analysis(questions)


async def basic_demo():
   """Demonstrate basic PaperQA functionality"""
   agent = PaperQAAgent(papers_directory)
  
   question = "What is the transformer architecture and why is it important?"
   response = await agent.ask_question(question)
   agent.display_answer(response)


print("🚀 Running basic demonstration...")
await basic_demo()


async def advanced_demo():
   """Demonstrate advanced multi-question analysis"""
   agent = PaperQAAgent(papers_directory, temperature=0.2)
  
   questions = [
       "How do attention mechanisms work in transformers?",
       "What are the computational challenges of large language models?",
       "How has pre-training evolved in natural language processing?"
   ]
  
   print("🧠 Running advanced multi-question analysis...")
   results = await agent.multi_question_analysis(questions)
  
   for question, response in results.items():
       print(f"n{'='*80}")
       print(f"Q: {question}")
       print('='*80)
       if response:
           answer_text = getattr(response, 'answer', str(response))
           display_text = answer_text[:300] + "..." if len(answer_text) > 300 else answer_text
           print(display_text)
       else:
           print("❌ No answer available")


print("n🚀 Running advanced demonstration...")
await advanced_demo()


async def research_comparison_demo():
   """Demonstrate comparative research analysis"""
   agent = PaperQAAgent(papers_directory)
  
   results = await agent.comparative_analysis("attention mechanisms in neural networks")
  
   print("n" + "="*80)
   print("📊 COMPARATIVE ANALYSIS RESULTS")
   print("="*80)
  
   for question, response in results.items():
       print(f"n🔍 {question}")
       print("-" * 50)
       if response:
           answer_text = getattr(response, 'answer', str(response))
           print(answer_text)
       else:
           print("❌ Analysis unavailable")
       print()


print("🚀 Running comparative research analysis...")
await research_comparison_demo()

We define a PaperQAAgent that uses our Gemini-tuned PaperQA2 settings to search papers, answer questions, and cite sources with clean display helpers. We then run basic, advanced multi-question, and comparative demos so we can interrogate literature end-to-end and summarize findings efficiently. Check out the Full Codes here.

def create_interactive_agent():
   """Create an interactive agent for custom queries"""
   agent = PaperQAAgent(papers_directory)
  
   async def query(question: str, show_sources: bool = True):
       """Interactive query function"""
       response = await agent.ask_question(question)
      
       if response:
           answer_text = getattr(response, 'answer', str(response))
           print(f"n🤖 Answer:n{answer_text}")
          
           if show_sources:
               contexts = getattr(response, 'contexts', getattr(response, 'context', []))
               if contexts:
                   print(f"n📚 Based on {len(contexts)} sources:")
                   for i, ctx in enumerate(contexts[:3], 1):
                       ctx_name = getattr(ctx, 'name', getattr(ctx, 'doc', f'Source {i}'))
                       print(f"  {i}. {ctx_name}")
       else:
           print("❌ Sorry, I couldn't find an answer to that question.")
          
       return response
  
   return query


interactive_query = create_interactive_agent()


print("n🎯 Interactive agent ready! You can now ask custom questions:")
print("Example: await interactive_query('How do transformers handle long sequences?')")


def print_usage_tips():
   """Print helpful usage tips"""
   tips = """
   🎯 USAGE TIPS FOR PAPERQA2 WITH GEMINI:
  
   1. 📝 Question Formulation:
      - Be specific about what you want to know
      - Ask about comparisons, mechanisms, or implications
      - Use domain-specific terminology
  
   2. 🔧 Model Configuration:
      - Gemini 1.5 Flash is free and reliable
      - Adjust temperature (0.0-1.0) for creativity vs precision
      - Use smaller chunk_size for better processing
  
   3. 📚 Document Management:
      - Add PDFs to the papers directory
      - Use meaningful filenames
      - Mix different types of papers for better coverage
  
   4. ⚡ Performance Optimization:
      - Limit concurrent requests for free tier
      - Use smaller evidence_k values for faster responses
      - Cache results by saving the agent state
  
   5. 🧠 Advanced Usage:
      - Chain multiple questions for deeper analysis
      - Use comparative analysis for research reviews
      - Combine with other tools for complete workflows
  
   📖 Example Questions to Try:
   - "Compare the attention mechanisms in BERT vs GPT models"
   - "What are the computational bottlenecks in transformer training?"
   - "How has pre-training evolved from word2vec to modern LLMs?"
   - "What are the key innovations that made transformers successful?"
   """
   print(tips)


print_usage_tips()


def save_analysis_results(results: dict, filename: str = "paperqa_analysis.txt"):
   """Save analysis results to a file"""
   with open(filename, 'w', encoding='utf-8') as f:
       f.write("PaperQA2 Analysis Resultsn")
       f.write("=" * 50 + "nn")
      
       for question, response in results.items():
           f.write(f"Question: {question}n")
           f.write("-" * 30 + "n")
           if response:
               answer_text = getattr(response, 'answer', str(response))
               f.write(f"Answer: {answer_text}n")
              
               contexts = getattr(response, 'contexts', getattr(response, 'context', []))
               if contexts:
                   f.write(f"nSources ({len(contexts)}):n")
                   for i, ctx in enumerate(contexts, 1):
                       ctx_name = getattr(ctx, 'name', getattr(ctx, 'doc', f'Source {i}'))
                       f.write(f"  {i}. {ctx_name}n")
           else:
               f.write("Answer: No response availablen")
           f.write("n" + "="*50 + "nn")
  
   print(f"💾 Results saved to: {filename}")


print("✅ Tutorial complete! You now have a fully functional PaperQA2 AI Agent with Gemini.")

We create an interactive query helper that allows us to ask custom questions on demand and optionally view cited sources. We also print practical usage tips and add a saver that writes every Q&A with source names to a results file, wrapping up the tutorial with a ready-to-use workflow.

In conclusion, we successfully created a fully functional AI research assistant that leverages the speed and versatility of Gemini with the robust paper processing capabilities of PaperQA2. We can now interactively explore scientific papers, run targeted queries, and even perform in-depth comparative analyses with minimal effort. This setup enhances our ability to digest complex research and also streamlines the entire literature review process, enabling us to focus on insights rather than manual searching.


9 Agentic AI Workflow Patterns Transforming AI Agents in 2025

AI agents are at a pivotal moment: simply calling a language model is no longer enough for production-ready solutions. In 2025, intelligent automation depends on orchestrated, agentic workflows—modular coordination blueprints that transform isolated AI calls into systems of autonomous, adaptive, and self-improving agents. Here’s how nine workflow patterns can unlock the next generation of scalable, robust AI agents.

Why Classic AI Agent Workflows Fail

Most failed agent implementations rely on “single-step thinking”—expecting one model call to solve complex, multi-part problems. AI agents succeed when their intelligence is orchestrated across multi-step, parallel, routed, and self-improving workflows. According to Gartner, by 2028, at least 33% of enterprise software will depend on agentic AI, but overcoming the 85% failure rate requires these new paradigms.

The 9 Agentic Workflow Patterns for 2025

Sequential Intelligence

(1) Prompt Chaining:

Tasks are decomposed into step-by-step subgoals where each LLM’s output becomes the next step’s input. Ideal for complex customer support agents, assistants, and pipelines that require context preservation throughout multi-turn conversations.
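
A minimal sketch of prompt chaining for the support case, with call_llm as a hypothetical stand-in for any chat-completion client:

def call_llm(prompt: str) -> str:
    # Stub: a real client call (OpenAI, Gemini, Claude, ...) goes here.
    return "[model output for: " + prompt.splitlines()[0][:48] + "]"

def support_pipeline(ticket: str) -> str:
    # Each step's output becomes part of the next step's input.
    summary = call_llm("Summarize this support ticket:\n" + ticket)
    category = call_llm("Classify the issue in one word:\n" + summary)
    return call_llm("Draft a reply for a '" + category + "' issue.\n"
                    "Context:\n" + summary)

print(support_pipeline("My May invoice was charged twice."))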

(2) Plan and Execute:

Agents autonomously plan multi-step workflows, execute each stage sequentially, review outcomes, and adjust as needed. This adaptive “plan–do–check–act” loop is vital for business process automation and data orchestration, providing resilience against failures and offering granular control over progress.

Parallel Processing

(3) Parallelization:

Splitting a large task into independent sub-tasks for concurrent execution by multiple agents or LLMs. Popular for code review, candidate evaluation, A/B testing, and building guardrails, parallelization drastically reduces time to resolution and improves consensus accuracy.
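
A minimal sketch of this fan-out with asyncio, the reviewer agents stubbed as coroutines; in practice each coroutine would await an LLM call:

import asyncio

async def review_agent(aspect: str, code: str) -> str:
    await asyncio.sleep(0.1)              # stands in for an LLM call
    return aspect + ": no issues found"

async def parallel_review(code: str) -> list:
    aspects = ["security", "performance", "style"]
    # Fan out independent sub-tasks, then gather all verdicts at once.
    return await asyncio.gather(*(review_agent(a, code) for a in aspects))

print(asyncio.run(parallel_review("def add(a, b): return a + b")))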

(4) Orchestrator–Worker:

A central “orchestrator” agent breaks tasks down, assigns work to specialized “workers,” then synthesizes results. This pattern powers retrieval-augmented generation (RAG), coding agents, and sophisticated multi-modal research by leveraging specialization.

Intelligent Routing

(5) Routing:

Input classification decides which specialized agent should handle each part of a workflow, achieving separation of concerns and dynamic task assignment. This is the backbone of multi-domain customer support and debate systems, where routing enables scalable expertise.
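
A minimal sketch of routing, with keyword matching standing in for an LLM-based intent classifier:

SPECIALISTS = {
    "billing":   lambda q: "[billing agent handles: " + q + "]",
    "technical": lambda q: "[technical agent handles: " + q + "]",
    "general":   lambda q: "[general agent handles: " + q + "]",
}

def classify(query: str) -> str:
    # Stub classifier; a production router would use an LLM or a trained
    # intent model instead of keyword rules.
    q = query.lower()
    if any(w in q for w in ("invoice", "charge", "refund")):
        return "billing"
    if any(w in q for w in ("error", "crash", "bug")):
        return "technical"
    return "general"

def route(query: str) -> str:
    return SPECIALISTS[classify(query)](query)

print(route("I was charged twice on my invoice"))
print(route("The app crashes on startup"))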

(6) Evaluator–Optimizer:

Agents collaborate in a continuous loop: one generates solutions, the other evaluates and suggests improvements. This enables real-time data monitoring, iterative coding, and feedback-driven design—improving quality with every cycle.

Self-Improving Systems

(7) Reflection:

Agents self-review their performance after each run, learning from errors, feedback, and changing requirements. Reflection elevates agents from static performers to dynamic learners, essential for long-term automation in data-centric environments, such as app building or regulatory compliance.

(8) ReWOO:

Extensions of ReAct allow agents to plan ahead, substitute strategies, and compress workflow logic—reducing computational overhead and aiding fine-tuning, especially in deep search and multi-step Q&A domains.

(9) Autonomous Workflow:

Agents continuously operate in loops, leveraging tool feedback and environmental signals for perpetual self-improvement. This is at the heart of autonomous evaluations and dynamic guardrail systems, allowing agents to operate reliably with minimal intervention.

How These Patterns Revolutionize AI Agents

  • Orchestrated Intelligence: These patterns unite isolated model calls into intelligent, context-aware agentic systems, each optimized for different problem structures (sequential, parallel, routed, and self-improving).
  • Complex Problem Solving: Collaborative agent workflows tackle problems that single LLM agents cannot address, dividing and conquering complexity for reliable business outcomes.
  • Continuous Improvement: By learning from feedback and failures at every step, agentic workflows evolve—offering a path to truly autonomous, adaptive intelligence.
  • Scalability & Flexibility: Agents can be specialized, added, or swapped, yielding modular pipelines that scale from simple automation to enterprise-grade orchestrations.

Real-World Impact & Implementation Best Practices

  • Design for Modularity: Build agents as composable, specialized entities. Orchestration patterns manage timing, data flow, and dependencies.
  • Leverage Tool Integration: Success depends on seamless interplay between agents and external systems (APIs, cloud, RPA), enabling dynamic adaptation to evolving requirements.
  • Focus on Feedback Loops: Reflection and evaluator–optimizer workflows keep agents improving, boosting precision and reliability in dynamic environments like healthcare, finance, and customer service.

Conclusion

Agentic workflows are no longer a future concept—they are the cornerstone of today’s leading AI teams. By mastering these nine patterns, developers and architects can unlock scalable, resilient, and adaptive AI systems that thrive in real-world production. The shift from single-step execution to orchestrated intelligence marks the dawn of enterprise-wide automation, making agentic thinking a required skill for the age of autonomous AI.


Top 50 AI Vibe Coding Tools for Everyone in 2025

Vibe coding in 2025 has completely changed how software gets built: with advanced large language models (LLMs), anyone can now turn plain-English ideas directly into working code. In this article, we’ve listed the top 50 AI vibe coding tools for everyone in 2025 that make software creation easier than ever: perfect for beginners launching new projects, pros updating legacy code, and entrepreneurs and product teams creating minimum viable products (MVPs), even from your favorite café.

What is Vibe Coding?

Coined by AI thought leader Andrej Karpathy, “vibe coding” is an innovative way of programming where artificial intelligence (AI) can create functional code by interpreting natural language prompts. Instead of memorizing complex syntax or spending hours debugging, simply describe what you want and let AI do the heavy lifting.

AI vibe coding platforms analyze your requirements and deliver code, which could be a snippet, a full function, or even an entire production-ready application. This approach lowers the barriers to software development, welcoming non-coders and boosting experienced developers’ productivity by automating repetitive programming tasks.

What makes a great vibe coding tool?

Before diving into our comprehensive list, it’s key to recognize what makes the top vibe coding tools stand out:

  • Intelligent AI: The best tools have a deep understanding of code context, not just individual lines.
  • Seamless Integration: They should easily fit into your existing workflow without causing disruption.
  • Speed and Performance: Quick and responsive suggestions are crucial for a smooth coding experience.
  • Broad Language Support: A wide range of supported languages and frameworks is a major advantage.
  • Customization and Adaptability: The ability to modify the tool to your specific needs is highly valuable.

How to pick your ideal vibe coding tool?

Selecting the perfect vibe coding tool depends on your individual needs and goals. Here’s a simple framework to help you make the right choice:

  • Define Your Primary Objective: Are you building web applications, mobile apps, or working on data science projects?
  • Assess Your Technical Skills: Some tools are designed for beginners, while others offer advanced features for experienced developers.
  • Verify Language and Framework Compatibility: Make sure the tool supports the programming languages and frameworks you use.
  • Explore Integration Capabilities: The ideal tool should integrate seamlessly with your existing technology stack.
  • Consider Your Budget: Many tools offer free versions, but you’ll have to pay for premium features.

Here are the top 50 AI vibe coding tools for everyone in 2025:

Here’s a comprehensive list of the 50 best vibe coding tools available in 2025:

  1. Lovable:* Lovable makes web app development accessible to everyone by turning natural language descriptions into functional applications with appealing designs.
  2. Base44:* An AI-powered platform that lets you build fully-functional custom apps from just a text description, no coding required.
  3. GitHub Copilot: A pioneer in AI-powered coding, GitHub Copilot is a powerful tool that adapts to your personal coding style, suggesting entire functions while supporting popular languages like Python, JavaScript, and more.
  4. Bubble:* A full-stack, AI-powered no-code platform for building, launching, and scaling serious web and native mobile applications with a visual editor.
  5. Memex: A desktop-based “Everything Builder” that lets you vibe code internal tools and other projects locally on your computer using natural language.
  6. Hostinger Horizons:* Hostinger Horizons allows users to build, edit, and publish custom web applications without coding.
  7. Softr: A no-code app builder for creating custom business software, client portals, and internal tools from your existing data sources.
  8. Rork: An AI tool that builds complete, cross-platform native mobile apps using React Native from your descriptions.
  9. Google Opal: An experimental Google tool to build, edit, and share mini-AI applications using natural language.
  10. Cursor: Cursor is an AI-first code editor designed to accelerate development, allowing you to generate code by describing functions in plain English, and it offers AI assistance for debugging.
  11. Devin by Cognition AI: Devin is a high-end AI coding assistant that can autonomously handle complex tasks like setting up repositories, writing code, and performing migrations.
  12. String by Pipedream: An AI agent builder that allows you to prompt, run, edit, and deploy AI agents to automate various tasks in seconds.
  13. Bolt.new by StackBlitz: This web-based AI development agent simplifies the web development workflow by allowing you to prompt, run, edit, and deploy full-stack applications directly from your browser.
  14. v0 by Vercel: For front-end developers using React, v0 is an invaluable tool that generates React code based on text prompts, using Shadcn UI and Tailwind CSS.
  15. Replit: Replit has grown from a simple online IDE to a full-fledged development platform to make apps and sites with powerful AI features.
  16. Windsurf (formerly Codeium): Windsurf combines AI copilots and autonomous agents to provide deep contextual awareness across your codebase, helping you navigate unfamiliar code with ease.
  17. Claude Code by Anthropic: Claude Code is an AI coding agent that can read and search code, edit files, run tests, and even commit and push to GitHub.
  18. Google Jules: Jules is an autonomous AI coding agent by Google that integrates with existing repositories, understands project context, and generates pull requests.
  19. GitHub Spark: An AI-powered platform from GitHub to build and deploy full-stack intelligent apps using natural language, visual tools, or code.
  20. Squarespace AI Website Builder: A tool that uses AI to create a personalized, professional website with custom content and design in minutes, guided by your inputs.
  21. Lazy AI: Lazy AI focuses on simplifying application creation with a no-code platform and a library of pre-configured workflows for common developer tasks.
  22. Devika: Devika is an open-source AI-powered software engineer that can break down high-level instructions into smaller, manageable steps, using LLMs, reasoning algorithms, and web browsing to complete complex coding tasks.
  23. bolt.diy: bolt.diy is an open-source platform for developers who want more control over their AI assistants, allowing you to create, run, edit, and deploy full-stack web apps using a variety of LLMs.
  24. Rocket: An AI-powered platform that generates web and mobile apps from natural language prompts or Figma designs.
  25. Softgen: Softgen is an AI-based web application builder that helps entrepreneurs and product managers to create full-stack web apps by describing their projects.
  26. Databutton: An AI developer that collaborates with you to build and deploy business applications, handling technical decisions along the way.
  27. Wonderish: A “vibe prompting” platform that creates websites, landing pages, and funnels based on your text descriptions.
  28. Mocha: An AI-powered, no-code application builder that turns your plain English ideas into unique, working apps with built-in databases and authentication.
  29. Airtable: An AI-native app-building platform that allows teams to create custom business apps and workflows from their data without code.
  30. WebSparks: WebSparks takes AI application generation a step further by interpreting not just text but also images and sketches to produce complete full-stack applications.
  31. Probz AI: An all-in-one AI platform to build fully-functioning web apps like CRMs and client portals without coding, featuring built-in databases and authentication.
  32. ToolJet: An AI-native, low-code platform for building and deploying internal tools and business applications with a visual app builder and AI agents.
  33. Fine.dev: Fine is an AI assistant designed for startup CTOs and development teams, automating tasks like coding, debugging, testing, and code review.
  34. Google Firebase Studio: Firebase Studio is a cloud-based development tool that allows developers to prototype, build, and deploy full-stack AI apps quickly via a web browser.
  35. Command by Langbase: A tool that turns natural language prompts into production-ready AI agents for a wide variety of tasks.
  36. Magically: An AI-powered builder that creates fully functional native mobile apps, including backend and authentication, from your text descriptions.
  37. Emergent: An agentic vibe-coding platform that helps you build ambitious applications with AI.
  38. Flatlogic: An AI software development agent that builds full-stack business applications like CRMs and ERPs, giving you full ownership of the source code.
  39. Create: Create is an AI-powered vibe coding tool that lets you build websites, apps, and tools by simply describing them in words or uploading an image of a design.
  40. Co.dev: Codev specializes in turning everyday language descriptions into full-stack web applications, using Next.js and Supabase as a foundation.
  41. Aider: Aider lets you pair program with LLMs to edit code in your local git repository and has shown strong performance on benchmarks like SWE-bench.
  42. Zed by Zed Industries: Zed is a high-performance code editor built in Rust that integrates with upcoming LLMs for code generation and analysis.
  43. Cline: Cline is a vibe coding tool that offers AI coding assistance with a focus on transparency and user control, always asking for permission before making changes.
  44. Augment Code: Augment provides your team with quick access to its collective knowledge, including codebase, documentation, and dependencies, through chat, code completions, and suggested edits.
  45. Tempo: Tempo is a designer-developer collaboration platform for React applications that offers a drag-and-drop editor for visual editing of React code.
  46. Cody by Sourcegraph: Cody is an experienced developer’s assistant that can understand your codebase and provide contextually aware suggestions, integrating with popular IDEs like VS Code, Visual Studio, and Eclipse.
  47. Qodo: Qodo is a coding assistant that prioritizes code quality over speed, ensuring that all generated code, reviews, and tests meet high standards.
  48. GoCodeo: GoCodeo focuses on testing and debugging, two of the most time-consuming aspects of development, and can generate production-ready tests in under 30 seconds.
  49. Goose: Goose, or Codename Goose, is an open-source AI agent that runs on your local machine, providing enhanced privacy and control.
  50. HeyBossAI: HeyBoss is a personal AI engineer designed to help non-coders build apps, websites, and games using OpenAI’s technology.

Ethical Considerations of AI-Powered Coding

While vibe coding offers many benefits, it also raises important ethical questions that developers should consider:

  • Code Ownership and Credit: Who owns the code when an AI writes a significant portion of it? Clarify who holds rights when AI writes large code blocks.
  • Over-reliance on AI: Could depending too heavily on AI lead to a decline in fundamental coding skills? Keep manually coding critical paths to maintain problem-solving reflexes.
  • Bias in AI-Generated Code: AI models learn from existing code, which may contain biases or suboptimal practices. Audit generated code for hidden vulnerabilities or biased assumptions.

Maximizing Your Productivity with Vibe Coding Tools

To get the most out of your vibe coding tools, follow these tips:

  • Learn Keyboard Shortcuts: They may seem minor, but they can save a significant amount of time.
  • Customize Your Environment: Adjust the settings to create a setup that aligns with your workflow.
  • Using AI Suggestions as a Starting Point: Don’t blindly accept and use every suggestion given by AI. Use them as a base and improve them with your own knowledge.
  • Keep Your Tools Updated: Regularly update your tools to make sure you have the latest features and security fixes.

In Conclusion:

AI vibe coding is more than a passing trend; it could fundamentally change how we approach software development and actually build apps. By reducing the mental effort of coding, these 50 vibe coding tools let developers focus on what truly matters: solving real-world problems and creating innovative solutions. Pick the vibe coding tool that meshes with your stack, integrate it thoughtfully, and let the “vibe” translate your next big idea into shipping code.



*Affiliate: We do make a small profit from the sales of this AI product through affiliate marketing. This is not an official list; we have tried to mention as many tools as possible.
